Section type: sh_type == SHT_SYMTAB.
Common name: "symbol table".
First, we note that:
  • sh_link = 5
  • sh_info = 6
For SHT_SYMTAB sections, those numbers mean that:
  • strings that give symbol names are in section 5, .strtab
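  • symbols 0 to 5 are local: for SHT_SYMTAB, sh_info is one greater than the index of the last symbol with STB_LOCAL binding, so symbol 6 is the first non-local one
A good high level tool to inspect that section is: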
nm hello_world.o
which gives:
0000000000000000 T _start
0000000000000000 d hello_world
000000000000000d a hello_world_len
This is however a high level view that omits some types of symbols and in which the symbol types are abbreviated to a single letter. A more detailed view can be obtained with:
readelf -s hello_world.o
which gives:
Symbol table '.symtab' contains 7 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS hello_world.asm
     2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
     3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
     4: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT    1 hello_world
     5: 000000000000000d     0 NOTYPE  LOCAL  DEFAULT  ABS hello_world_len
     6: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT    2 _start
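The sh_info = 6 value noted at the top of this section can be cross-checked against this listing. In the gABI, sh_info of a SHT_SYMTAB section is one greater than the symbol table index of the last symbol with STB_LOCAL binding. The following is a minimal Python sketch of that check; the binds list is simply the Bind column above typed in by hand, and the variable names are made up for illustration.

```python
# Bind column of the readelf -s listing above, in symbol index order.
# (Typed in by hand for illustration, not read from the object file.)
binds = ["LOCAL", "LOCAL", "LOCAL", "LOCAL", "LOCAL", "LOCAL", "GLOBAL"]

# Index of the last symbol whose binding is LOCAL.
last_local = max(i for i, bind in enumerate(binds) if bind == "LOCAL")

# gABI rule for SHT_SYMTAB: sh_info = last local index + 1,
# which is also the index of the first non-local symbol.
expected_sh_info = last_local + 1

print("expected sh_info:", expected_sh_info)
print("first non-local symbol bind:", binds[expected_sh_info])
```

Because all local symbols must come before the non-local ones, a linker can use this single number to skip the locals quickly.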
The binary format of the table is documented at www.sco.com/developers/gabi/2003-12-17/ch4.symtab.html
The data is:
readelf -x .symtab hello_world.o
which gives:
Hex dump of section '.symtab':
  0x00000000 00000000 00000000 00000000 00000000 ................
  0x00000010 00000000 00000000 01000000 0400f1ff ................
  0x00000020 00000000 00000000 00000000 00000000 ................
  0x00000030 00000000 03000100 00000000 00000000 ................
  0x00000040 00000000 00000000 00000000 03000200 ................
  0x00000050 00000000 00000000 00000000 00000000 ................
  0x00000060 11000000 00000100 00000000 00000000 ................
  0x00000070 00000000 00000000 1d000000 0000f1ff ................
  0x00000080 0d000000 00000000 00000000 00000000 ................
  0x00000090 2d000000 10000200 00000000 00000000 -...............
  0x000000a0 00000000 00000000                   ........
The entries are of type:
typedef struct {
    Elf64_Word  st_name;
    unsigned char   st_info;
    unsigned char   st_other;
    Elf64_Half  st_shndx;
    Elf64_Addr  st_value;
    Elf64_Xword st_size;
} Elf64_Sym;
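As a rough illustration of how this struct maps onto the dump, here is a minimal Python sketch that unpacks the hex dump above into 24-byte entries. It assumes a little-endian ELF64 file, which is what the dump suggests; the format string "<IBBHQQ" and the variable names are choices made for this sketch, not something taken from a library.

```python
import struct

# Hex dump of '.symtab' copied from the readelf -x output above.
SYMTAB_HEX = """
00000000 00000000 00000000 00000000
00000000 00000000 01000000 0400f1ff
00000000 00000000 00000000 00000000
00000000 03000100 00000000 00000000
00000000 00000000 00000000 03000200
00000000 00000000 00000000 00000000
11000000 00000100 00000000 00000000
00000000 00000000 1d000000 0000f1ff
0d000000 00000000 00000000 00000000
2d000000 10000200 00000000 00000000
00000000 00000000
"""
symtab = bytes.fromhex("".join(SYMTAB_HEX.split()))

# Elf64_Sym, little-endian, no padding:
# st_name (u32), st_info (u8), st_other (u8), st_shndx (u16),
# st_value (u64), st_size (u64) = 24 bytes.
ELF64_SYM = struct.Struct("<IBBHQQ")

entries = [
    ELF64_SYM.unpack_from(symtab, offset)
    for offset in range(0, len(symtab), ELF64_SYM.size)
]

for num, (st_name, st_info, st_other, st_shndx, st_value, st_size) in enumerate(entries):
    print(f"{num}: name_off={st_name:#x} info={st_info:#04x} other={st_other} "
          f"shndx={st_shndx:#06x} value={st_value:#x} size={st_size}")
```

The 168 bytes of the dump divide into exactly 7 entries, matching the "contains 7 entries" line printed by readelf -s.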
Like in the section table, the first entry is magical and set to fixed meaningless values.
Entry 1 has ELF64_ST_TYPE == STT_FILE. ELF64_ST_TYPE is contained inside of st_info.
Byte analysis:
  • 10 8: st_name = 01000000 = character 1 in the .strtab, which until the following \0 makes hello_world.asm
    This piece of information may be used by the linker to decide which segment each section goes into: e.g. in an ld linker script we write:
    segment_name :
    {
        file(section)
    }
    to pick a section from a given file.
    Most of the time however, we will just dump all sections with a given name together with:
    segment_name :
    {
        *(section)
    }
  • 10 12: st_info = 04
    Bits 0-3 = ELF64_ST_TYPE = Type = 4 = STT_FILE: the main purpose of this entry is to use st_name to indicate the name of the file which generated this object file.
    Bits 4-7 = ELF64_ST_BIND = Binding = 0 = STB_LOCAL. Required value for STT_FILE.
  • 10 13: st_other = 00: symbol visibility = STV_DEFAULT.
  • 10 14: st_shndx = Symbol Table Section header Index = f1ff = 0xFFF1 in little-endian = SHN_ABS. Required for STT_FILE.
  • 20 0: st_value = 8x 00: required value for STT_FILE
  • 20 8: st_size = 8x 00: no allocated size
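The byte analysis above can be replayed with a short Python sketch. It assumes little-endian encoding, and it assumes a .strtab content reconstructed from the st_name offsets seen in the dump (0x01, 0x11, 0x1d, 0x2d) and the names printed by readelf -s; the real section bytes are not shown in this section. The helper names elf64_st_bind, elf64_st_type and symbol_name are made up here to mirror the gABI macros.

```python
import struct

# Entry 1 of .symtab: the 24 bytes starting at offset 0x18 of the dump.
entry1 = bytes.fromhex(
    "01000000"          # st_name
    "04"                # st_info
    "00"                # st_other
    "f1ff"              # st_shndx
    "0000000000000000"  # st_value
    "0000000000000000"  # st_size
)
st_name, st_info, st_other, st_shndx, st_value, st_size = struct.unpack("<IBBHQQ", entry1)

def elf64_st_bind(info):
    """Upper 4 bits of st_info: the binding (0 = STB_LOCAL, 1 = STB_GLOBAL)."""
    return info >> 4

def elf64_st_type(info):
    """Lower 4 bits of st_info: the type (3 = STT_SECTION, 4 = STT_FILE)."""
    return info & 0xF

# ASSUMPTION: .strtab reconstructed from the offsets and names above,
# not copied from the object file.
strtab = b"\0hello_world.asm\0hello_world\0hello_world_len\0_start\0"

def symbol_name(table, offset):
    """Read the NUL-terminated string that starts at the given offset."""
    end = table.index(b"\0", offset)
    return table[offset:end].decode("ascii")

print("name:", symbol_name(strtab, st_name))
print("type:", elf64_st_type(st_info), "bind:", elf64_st_bind(st_info))
print("shndx:", hex(st_shndx))
```

The same two helpers applied to st_info = 0x10 of entry 6 give binding 1 (STB_GLOBAL) and type 0 (STT_NOTYPE), in line with the _start row of the readelf output.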
Now, from the readelf output, we can interpret the other entries quickly.
There are two STT_SECTION entries, one pointing to .data and the other to .text (section indexes 1 and 2).
Num:    Value          Size Type    Bind   Vis      Ndx Name
  2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
  3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
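According to the gABI, symbols of this type exist primarily for relocation: a relocation entry can refer to a section symbol plus an addend, instead of requiring a named symbol for every local address.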
Then come the most important symbols:
Num:    Value          Size Type    Bind   Vis      Ndx Name
  4: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT    1 hello_world
  5: 000000000000000d     0 NOTYPE  LOCAL  DEFAULT  ABS hello_world_len
  6: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT    2 _start
The hello_world string is in the .data section (index 1). Its value is 0: it points to the first byte of that section.
_start is marked with GLOBAL binding since we wrote:
global _start
in NASM. This is necessary since it must be seen as the entry point. Unlike in C, by default NASM labels are local.
hello_world_len points to the special st_shndx == SHN_ABS == 0xFFF1, stored in the file as the little-endian bytes f1 ff.
0xFFF1 lies in the reserved index range 0xFF00 to 0xFFFF, so it cannot conflict with the index of a real section.
st_value == 0xD == 13 which is the value we have stored there in the assembly: the length of the string Hello World!.
This means that relocation will not affect this value: it is a constant.
This is a small optimization that our assembler does for us and which has ELF support.
If we had used the address of hello_world_len anywhere, the assembler would not have been able to mark it as SHN_ABS, and the linker would have extra relocation work on it later.
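To make the special index concrete, here is a small Python sketch that decodes the two on-disk bytes f1 ff and classifies the result. The constants are the gABI values for SHN_UNDEF, SHN_LORESERVE, SHN_ABS, SHN_COMMON and SHN_HIRESERVE; the function describe_shndx is a hypothetical helper written for this sketch, loosely imitating the Ndx column of readelf -s.

```python
import struct

# Special section indexes from the gABI.
SHN_UNDEF = 0x0000
SHN_LORESERVE = 0xFF00
SHN_ABS = 0xFFF1
SHN_COMMON = 0xFFF2
SHN_HIRESERVE = 0xFFFF

# st_shndx of hello_world_len exactly as it appears in the hex dump.
raw = bytes.fromhex("f1ff")
(st_shndx,) = struct.unpack("<H", raw)  # little-endian 16-bit value

def describe_shndx(shndx):
    """Hypothetical helper: label a section index like readelf's Ndx column."""
    if shndx == SHN_UNDEF:
        return "UND"
    if shndx == SHN_ABS:
        return "ABS"
    if shndx == SHN_COMMON:
        return "COM"
    if SHN_LORESERVE <= shndx <= SHN_HIRESERVE:
        return "reserved"
    return str(shndx)  # index of a regular section

print(hex(st_shndx), describe_shndx(st_shndx))
print(describe_shndx(1), describe_shndx(2))
```

Reading the bytes as big-endian would give 0xF1FF instead, which is why the byte order matters when comparing the dump against the constants in the specification.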
By default, the linker keeps a .symtab in the executable as well.
This is only used for debugging. Without the symbols, we are completely blind, and must reverse engineer everything.
You can strip it with objcopy, and the executable will still run. Such executables are called "stripped executables".
