Contained in bytes 0x40 to 0x7F.
The first section is always magic: www.sco.com/developers/gabi/2003-12-17/ch4.sheader.html says:
If the number of sections is greater than or equal to SHN_LORESERVE (0xff00), e_shnum has the value SHN_UNDEF (0) and the actual number of section header table entries is contained in the sh_size field of the section header at index 0 (otherwise, the sh_size member of the initial entry contains 0).
There are also other magic sections detailed in Figure 4-7: Special Section Indexes.
In index 0, SHT_NULL is mandatory. Are there any other uses for it: stackoverflow.com/questions/26812142/what-is-the-use-of-the-sht-null-section-in-elf ?
.data is section 1:
00000080  01 00 00 00 01 00 00 00  03 00 00 00 00 00 00 00  |................|
00000090  00 00 00 00 00 00 00 00  00 02 00 00 00 00 00 00  |................|
000000a0  0d 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000b0  04 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
  • 80 0: sh_name = 01 00 00 00: index 1 in the .shstrtab string table
    Here, 1 says the name of this section starts at the first character of that section, and ends at the first NUL character, making up the string .data.
    .data is one of the section names which has a predefined meaning according to www.sco.com/developers/gabi/2003-12-17/ch4.strtab.html:
    These sections hold initialized data that contribute to the program's memory image.
  • 80 4: sh_type = 01 00 00 00: SHT_PROGBITS: the section content is not specified by ELF, only by how the program interprets it. Normal since a .data section.
  • 80 8: sh_flags = 03 7x 00: SHF_WRITE and SHF_ALLOC: www.sco.com/developers/gabi/2003-12-17/ch4.sheader.html#sh_flags, as required from a .data section
  • 90 0: sh_addr = 8x 00: TODO: standard says:
    If the section will appear in the memory image of a process, this member gives the address at which the section's first byte should reside. Otherwise, the member contains 0.
    but I don't understand it very well yet.
  • 90 8: sh_offset = 00 02 00 00 00 00 00 00 = 0x200: number of bytes from the start of the program to the first byte in this section
  • a0 0: sh_size = 0d 00 00 00 00 00 00 00
    If we take 0xD bytes starting at sh_offset 200, we see:
    00000200  48 65 6c 6c 6f 20 77 6f  72 6c 64 21 0a 00        |Hello world!..  |
    AHA! So our "Hello world!" string is in the data section like we told it to be on the NASM.
    Once we graduate from hd, we will look this up like:
    readelf -x .data hello_world.o
    which outputs:
    Hex dump of section '.data':
      0x00000000 48656c6c 6f20776f 726c6421 0a       Hello world!.
    NASM sets decent properties for that section because it treats .data magically: www.nasm.us/doc/nasmdoc7.html#section-7.9.2
    Also note that this was a bad section choice: a good C compiler would put the string in .rodata instead, because it is read-only and it would allow for further OS optimizations.
    • a0 8: sh_link and sh_info = 8x 0: do not apply to this section type. www.sco.com/developers/gabi/2003-12-17/ch4.sheader.html#special_sections
    • b0 0: sh_addralign = 04 = TODO: why is this alignment necessary? Is it only for sh_addr, or also for symbols inside sh_addr?
    • b0 8: sh_entsize = 00 = the section does not contain a table. If != 0, it means that the section contains a table of fixed size entries. In this file, we see from the readelf output that this is the case for the .symtab and .rela.text sections.
Now that we've done one section manually, let's graduate and use the readelf -S of the other sections:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 2] .text             PROGBITS         0000000000000000  00000210
       0000000000000027  0000000000000000  AX       0     0     16
.text is executable but not writable: if we try to write to it Linux segfaults. Let's see if we really have some code there:
objdump -d hello_world.o
gives:
hello_world.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <_start>:
   0:       b8 01 00 00 00          mov    $0x1,%eax
   5:       bf 01 00 00 00          mov    $0x1,%edi
   a:       48 be 00 00 00 00 00    movabs $0x0,%rsi
  11:       00 00 00
  14:       ba 0d 00 00 00          mov    $0xd,%edx
  19:       0f 05                   syscall
  1b:       b8 3c 00 00 00          mov    $0x3c,%eax
  20:       bf 00 00 00 00          mov    $0x0,%edi
  25:       0f 05                   syscall
If we grep b8 01 00 00 on the hd, we see that this only occurs at 00000210, which is what the section says. And the Size is 27, which matches as well. So we must be talking about the right section.
This looks like the right code: a write followed by an exit.
The most interesting part is line a which does:
movabs $0x0,%rsi
to pass the address of the string to the system call. Currently, the 0x0 is just a placeholder. After linking happens, it will be modified to contain:
4000ba: 48 be d8 00 60 00 00    movabs $0x6000d8,%rsi
This modification is possible because of the data of the .rela.text section.
Sections with sh_type == SHT_STRTAB are called string tables.
They hold a null separated array of strings.
Such sections are used by other sections when string names are to be used. The using section says:
  • which string table they are using
  • what is the index on the target string table where the string starts
So for example, we could have a string table containing:
Data: \0 a b c \0 d e f \0
Index: 0 1 2 3  4 5 6 7  8
The first byte must be a 0. TODO rationale?
And if another section wants to use the string d e f, they have to point to index 5 of this section (letter d).
Notable string table sections:
  • .shstrtab
  • .strtab
Section type: sh_type == SHT_STRTAB.
Common name: "section header string table".
The section name .shstrtab is reserved. The standard says:
This section holds section names.
This section gets pointed to by the e_shstrnd field of the ELF header itself.
String indexes of this section are are pointed to by the sh_name field of section headers, which denote strings.
This section does not have SHF_ALLOC marked, so it will not appear on the executing program.
readelf -x .shstrtab hello_world.o
outputs:
Hex dump of section '.shstrtab':
  0x00000000 002e6461 7461002e 74657874 002e7368 ..data..text..sh
  0x00000010 73747274 6162002e 73796d74 6162002e strtab..symtab..
  0x00000020 73747274 6162002e 72656c61 2e746578 strtab..rela.tex
  0x00000030 7400                                t.
The data in this section has a fixed format: www.sco.com/developers/gabi/2003-12-17/ch4.strtab.html
If we look at the names of other sections, we see that they all contain numbers, e.g. the .text section is number 7.
Then each string ends when the first NUL character is found, e.g. character 12 is \0 just after .text\0.
Section type: sh_type == SHT_SYMTAB.
Common name: "symbol table".
First the we note that:
  • sh_link = 5
  • sh_info = 6
For SHT_SYMTAB sections, those numbers mean that:
  • strings that give symbol names are in section 5, .strtab
  • the relocation data is in section 6, .rela.text
A good high level tool to disassemble that section is:
nm hello_world.o
which gives:
0000000000000000 T _start
0000000000000000 d hello_world
000000000000000d a hello_world_len
This is however a high level view that omits some types of symbols and in which the symbol types . A more detailed disassembly can be obtained with:
readelf -s hello_world.o
which gives:
Symbol table '.symtab' contains 7 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS hello_world.asm
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
     4: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT    1 hello_world
     5: 000000000000000d     0 NOTYPE  LOCAL  DEFAULT  ABS hello_world_len
     6: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT    2 _start
The binary format of the table is documented at www.sco.com/developers/gabi/2003-12-17/ch4.symtab.html
The data is:
readelf -x .symtab hello_world.o
which gives:
Hex dump of section '.symtab':
  0x00000000 00000000 00000000 00000000 00000000 ................
  0x00000010 00000000 00000000 01000000 0400f1ff ................
  0x00000020 00000000 00000000 00000000 00000000 ................
  0x00000030 00000000 03000100 00000000 00000000 ................
  0x00000040 00000000 00000000 00000000 03000200 ................
  0x00000050 00000000 00000000 00000000 00000000 ................
  0x00000060 11000000 00000100 00000000 00000000 ................
  0x00000070 00000000 00000000 1d000000 0000f1ff ................
  0x00000080 0d000000 00000000 00000000 00000000 ................
  0x00000090 2d000000 10000200 00000000 00000000 -...............
  0x000000a0 00000000 00000000                   ........
The entries are of type:
typedef struct {
    Elf64_Word  st_name;
    unsigned char   st_info;
    unsigned char   st_other;
    Elf64_Half  st_shndx;
    Elf64_Addr  st_value;
    Elf64_Xword st_size;
} Elf64_Sym;
Like in the section table, the first entry is magical and set to a fixed meaningless values.
Entry 1 has ELF64_R_TYPE == STT_FILE. ELF64_R_TYPE is continued inside of st_info.
Byte analysis:
  • 10 8: st_name = 01000000 = character 1 in the .strtab, which until the following \0 makes hello_world.asm
    This piece of information file may be used by the linker to decide on which segment sections go: e.g. in ld linker script we write:
    segment_name :
    {
        file(section)
    }
    to pick a section from a given file.
    Most of the time however, we will just dump all sections with a given name together with:
    segment_name :
    {
        *(section)
    }
  • 10 12: st_info = 04
    Bits 0-3 = ELF64_R_TYPE = Type = 4 = STT_FILE: the main purpose of this entry is to use st_name to indicate the name of the file which generated this object file.
    Bits 4-7 = ELF64_ST_BIND = Binding = 0 = STB_LOCAL. Required value for STT_FILE.
  • 10 13: st_shndx = Symbol Table Section header Index = f1ff = SHN_ABS. Required for STT_FILE.
  • 20 0: st_value = 8x 00: required for value for STT_FILE
  • 20 8: st_size = 8x 00: no allocated size
Now from the readelf, we interpret the others quickly.
There are two such entries, one pointing to .data and the other to .text (section indexes 1 and 2).
Num:    Value          Size Type    Bind   Vis      Ndx Name
  2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
  3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
TODO what is their purpose?
Then come the most important symbols:
Num:    Value          Size Type    Bind   Vis      Ndx Name
  4: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT    1 hello_world
  5: 000000000000000d     0 NOTYPE  LOCAL  DEFAULT  ABS hello_world_len
  6: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT    2 _start
hello_world string is in the .data section (index 1). It's value is 0: it points to the first byte of that section.
_start is marked with GLOBAL visibility since we wrote:
global _start
in NASM. This is necessary since it must be seen as the entry point. Unlike in C, by default NASM labels are local.
hello_world_len points to the special st_shndx == SHN_ABS == 0xF1FF.
0xF1FF is chosen so as to not conflict with other sections.
st_value == 0xD == 13 which is the value we have stored there on the assembly: the length of the string Hello World!.
This means that relocation will not affect this value: it is a constant.
This is small optimization that our assembler does for us and which has ELF support.
If we had used the address of hello_world_len anywhere, the assembler would not have been able to mark it as SHN_ABS, and the linker would have extra relocation work on it later.
By default, NASM places a .symtab on the executable as well.
This is only used for debugging. Without the symbols, we are completely blind, and must reverse engineer everything.
You can strip it with objcopy, and the executable will still run. Such executables are called "stripped executables".
Holds strings for the symbol table.
This section has sh_type == SHT_STRTAB.
It is pointed to by sh_link == 5 of the .symtab section.
readelf -x .strtab hello_world.o
outputs:
Hex dump of section '.strtab':
  0x00000000 0068656c 6c6f5f77 6f726c64 2e61736d .hello_world.asm
  0x00000010 0068656c 6c6f5f77 6f726c64 0068656c .hello_world.hel
  0x00000020 6c6f5f77 6f726c64 5f6c656e 005f7374 lo_world_len._st
  0x00000030 61727400                            art.
This implies that it is an ELF level limitation that global variables cannot contain NUL characters.
Section type: sh_type == SHT_RELA.
Common name: "relocation section".
.rela.text holds relocation data which says how the address should be modified when the final executable is linked. This points to bytes of the text area that must be modified when linking happens to point to the correct memory locations.
Basically, it translates the object text containing the placeholder 0x0 address:
   a:       48 be 00 00 00 00 00    movabs $0x0,%rsi
  11:       00 00 00
to the actual executable code containing the final 0x6000d8:
4000ba: 48 be d8 00 60 00 00    movabs $0x6000d8,%rsi
4000c1: 00 00 00
It was pointed to by sh_info = 6 of the .symtab section.
readelf -r hello_world.o outputs:
Relocation section '.rela.text' at offset 0x3b0 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
00000000000c  000200000001 R_X86_64_64       0000000000000000 .data + 0
The section does not exist in the executable.
The actual bytes are:
00000370  0c 00 00 00 00 00 00 00  01 00 00 00 02 00 00 00  |................|
00000380  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
The struct represented is:
typedef struct {
    Elf64_Addr  r_offset;
    Elf64_Xword r_info;
    Elf64_Sxword    r_addend;
} Elf64_Rela;
So:
  • 370 0: r_offset = 0xC: address into the .text whose address this relocation will modify
  • 370 8: r_info = 0x200000001. Contains 2 fields:
    • ELF64_R_TYPE = 0x1: meaning depends on the exact architecture.
    • ELF64_R_SYM = 0x2: index of the section to which the address points, so .data which is at index 2.
    The AMD64 ABI says that type 1 is called R_X86_64_64 and that it represents the operation S + A where:
    • S: the value of the symbol on the object file, here 0 because we point to the 00 00 00 00 00 00 00 00 of movabs $0x0,%rsi
    • A: the addend, present in field r_added
    This address is added to the section on which the relocation operates.
    This relocation operation acts on a total 8 bytes.
  • 380 0: r_addend = 0
So in our example we conclude that the new address will be: S + A = .data + 0, and thus the first thing in the data section.
Besides sh_type == SHT_RELA, there also exists SHT_REL, which would have section name .text.rel (not present in this object file).
Those represent the same struct, but without the addend, e.g.:
typedef struct {
    Elf64_Addr  r_offset;
    Elf64_Xword r_info;
} Elf64_Rela;
The ELF standard says that in many cases the both can be used, and it is just a matter of convenience.
This program did not have certain dynamic linking related sections because we linked it minimally with ld.
However, if you compile a C hello world with GCC 8.2:
gcc -o main.out main.c
some other interesting sections would appear.
Contains the path to the dynamic loader, i.e. /lib64/ld-linux-x86-64.so.2 in Ubuntu 18.10. Explained at: stackoverflow.com/questions/8040631/checking-if-a-binary-compiled-with-static/55664341#55664341
Contains a lot of different flag masks.
Seems to be a GNU Binutils extension
Determines if an executable is a position independent executable (PIE).
Seems to be informational only, since not used by Linux kernel 5.0 or glibc 2.29.

Articles by others on the same topic (0)

There are currently no matching articles.