This is especially interesting for user mode emulation.
ELF is specified by the LSB:
The LSB basically links to other standards with minor extensions, in particular:
- Generic (both by SCO):- System V ABI 4.1 (1997) www.sco.com/developers/devspecs/gabi41.pdf, no 64 bit, although a magic number is reserved for it. Same for core files. This is the first document you should look at when searching for information.
- System V ABI Update DRAFT 17 (2003) www.sco.com/developers/gabi/2003-12-17/contents.html, adds 64 bit. Only updates chapters 4 and 5 of the previous document: the others remain valid and are still referenced.
 
- Architecture specific (by the processor vendor):
A handy summary can be found at:
man elfSpin like mad between:
- standards
- high level generators. We use the assembler asand linkerld.
- hexdumps
- file decompilers. We use readelf. It makes it faster to read the ELF file by turning it into human readable output. But you must have seen one byte-by-byte example first, and think howreadelfoutput maps to the standard.
- low-level generators: stand-alone libraries that let you control every field of the ELF files you generated. github.com/BR903/ELFkickers, github.com/sqall01/ZwoELF and many more on GitHub.
- consumer: the execsystem call of the Linux kernel can parse ELF files to starts processes: github.com/torvalds/linux/blob/v4.11/fs/binfmt_elf.c, stackoverflow.com/questions/8352535/how-does-kernel-get-an-executable-binary-file-running-under-linux/31394861#31394861
The ELF standard specifies multiple file formats:
- Object files (.o).Intermediate step to generating executables and other formats:Source code | | Compilation | v Object file | | Linking | v ExecutableObject files exist to make compilation faster: withmake, we only have to recompile the modified source files based on timestamps.
- Executable files (no standard Linux extension).This is what the Linux kernel can actually run.
- Archive files (.a).Libraries meant to be embedded into executables during the Linking step.
- Shared object files (.so).Libraries meant to be loaded when the executable starts running.
- Core dumps.Such files may be generated by the Linux kernel when the program does naughty things, e.g. segfault.They exist to help debugging the program.
In this tutorial, we consider only object and executable files.
- Compiler toolchains generate and read ELF files.
- Operating systems read and run ELF files.
- Specialized libraries. Examples:
It is non-trivial to determine what is the smallest legal ELF file, or the smaller one that will do something trivial in Linux.
Some impressive attempts:
hello_world.asm
section .data
    hello_world db "Hello world!", 10
    hello_world_len  equ $ - hello_world
section .text
    global _start
    _start:
        mov rax, 1
        mov rdi, 1
        mov rsi, hello_world
        mov rdx, hello_world_len
        syscall
        mov rax, 60
        mov rdi, 0
        syscallCompiled with:
nasm -w+all -f elf64 -o 'hello_world.o' 'hello_world.asm'
ld -o 'hello_world.out' 'hello_world.o'Running:gives:
hd hello_world.o00000000  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  01 00 3e 00 01 00 00 00  00 00 00 00 00 00 00 00  |..>.............|
00000020  00 00 00 00 00 00 00 00  40 00 00 00 00 00 00 00  |........@.......|
00000030  00 00 00 00 40 00 00 00  00 00 40 00 07 00 03 00  |....@.....@.....|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000080  01 00 00 00 01 00 00 00  03 00 00 00 00 00 00 00  |................|
00000090  00 00 00 00 00 00 00 00  00 02 00 00 00 00 00 00  |................|
000000a0  0d 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000b0  04 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000c0  07 00 00 00 01 00 00 00  06 00 00 00 00 00 00 00  |................|
000000d0  00 00 00 00 00 00 00 00  10 02 00 00 00 00 00 00  |................|
000000e0  27 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |'...............|
000000f0  10 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000100  0d 00 00 00 03 00 00 00  00 00 00 00 00 00 00 00  |................|
00000110  00 00 00 00 00 00 00 00  40 02 00 00 00 00 00 00  |........@.......|
00000120  32 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |2...............|
00000130  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000140  17 00 00 00 02 00 00 00  00 00 00 00 00 00 00 00  |................|
00000150  00 00 00 00 00 00 00 00  80 02 00 00 00 00 00 00  |................|
00000160  a8 00 00 00 00 00 00 00  05 00 00 00 06 00 00 00  |................|
00000170  04 00 00 00 00 00 00 00  18 00 00 00 00 00 00 00  |................|
00000180  1f 00 00 00 03 00 00 00  00 00 00 00 00 00 00 00  |................|
00000190  00 00 00 00 00 00 00 00  30 03 00 00 00 00 00 00  |........0.......|
000001a0  34 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |4...............|
000001b0  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001c0  27 00 00 00 04 00 00 00  00 00 00 00 00 00 00 00  |'...............|
000001d0  00 00 00 00 00 00 00 00  70 03 00 00 00 00 00 00  |........p.......|
000001e0  18 00 00 00 00 00 00 00  04 00 00 00 02 00 00 00  |................|
000001f0  04 00 00 00 00 00 00 00  18 00 00 00 00 00 00 00  |................|
00000200  48 65 6c 6c 6f 20 77 6f  72 6c 64 21 0a 00 00 00  |Hello world!....|
00000210  b8 01 00 00 00 bf 01 00  00 00 48 be 00 00 00 00  |..........H.....|
00000220  00 00 00 00 ba 0d 00 00  00 0f 05 b8 3c 00 00 00  |............<...|
00000230  bf 00 00 00 00 0f 05 00  00 00 00 00 00 00 00 00  |................|
00000240  00 2e 64 61 74 61 00 2e  74 65 78 74 00 2e 73 68  |..data..text..sh|
00000250  73 74 72 74 61 62 00 2e  73 79 6d 74 61 62 00 2e  |strtab..symtab..|
00000260  73 74 72 74 61 62 00 2e  72 65 6c 61 2e 74 65 78  |strtab..rela.tex|
00000270  74 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |t...............|
00000280  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000290  00 00 00 00 00 00 00 00  01 00 00 00 04 00 f1 ff  |................|
000002a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000002b0  00 00 00 00 03 00 01 00  00 00 00 00 00 00 00 00  |................|
000002c0  00 00 00 00 00 00 00 00  00 00 00 00 03 00 02 00  |................|
000002d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000002e0  11 00 00 00 00 00 01 00  00 00 00 00 00 00 00 00  |................|
000002f0  00 00 00 00 00 00 00 00  1d 00 00 00 00 00 f1 ff  |................|
00000300  0d 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000310  2d 00 00 00 10 00 02 00  00 00 00 00 00 00 00 00  |-...............|
00000320  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000330  00 68 65 6c 6c 6f 5f 77  6f 72 6c 64 2e 61 73 6d  |.hello_world.asm|
00000340  00 68 65 6c 6c 6f 5f 77  6f 72 6c 64 00 68 65 6c  |.hello_world.hel|
00000350  6c 6f 5f 77 6f 72 6c 64  5f 6c 65 6e 00 5f 73 74  |lo_world_len._st|
00000360  61 72 74 00 00 00 00 00  00 00 00 00 00 00 00 00  |art.............|
00000370  0c 00 00 00 00 00 00 00  01 00 00 00 02 00 00 00  |................|
00000380  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000390An ELF file contains the following parts:
- ELF header. Points to the position of the section header table and the program header table.
- Section header table (optional on executable). Each has e_shnumsection headers, each pointing to the position of a section.
- N sections, with N <= e_shnum(optional on executable)
- Program header table (only on executable). Each has e_phnumprogram headers, each pointing to the position of a segment.
- N segments, with N <= e_phnum(only on executable)
The order of those parts is not fixed: the only fixed thing is the ELF header that must be the first thing on the file: Generic docs say:
In pictures: sample object file with three sections:
            +-------------------+
            | ELF header        |---+
+---------> +-------------------+   | e_shoff
|           |                   |<--+
| Section   | Section header 0  |
|           |                   |---+ sh_offset
| Header    +-------------------+   |
|           | Section header 1  |---|--+ sh_offset
| Table     +-------------------+   |  |
|           | Section header 2  |---|--|--+
+---------> +-------------------+   |  |  |
            | Section 0         |<--+  |  |
            +-------------------+      |  | sh_offset
            | Section 1         |<-----+  |
            +-------------------+         |
            | Section 2         |<--------+
            +-------------------+But nothing (except sanity) prevents the following topology:
            +-------------------+
            | ELF header        |---+ e_shoff
            +-------------------+   |
            | Section 1         |<--|--+
+---------> +-------------------+   |  |
|           |                   |<--+  | sh_offset
| Section   | Section header 0  |      |
|           |                   |------|---------+
| Header    +-------------------+      |         |
|           | Section header 1  |------+         |
| Table     +-------------------+                |
|           | Section header 2  |---+            | sh_offset
+---------> +-------------------+   | sh_offset  |
            | Section 2         |<--+            |
            +-------------------+                |
            | Section 0         |<---------------+
            +-------------------+But some newbies may prefer PNGs :-)
- section: exists before linking, in object files.Major information sections contain for the linker: is this section:
- segment: exists after linking, in the executable file.Contains information about how each segment should be loaded into memory by the OS, notably location and permissions.
Running:outputs:
readelf -h hello_world.oMagic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class:                             ELF64
Data:                              2's complement, little endian
Version:                           1 (current)
OS/ABI:                            UNIX - System V
ABI Version:                       0
Type:                              REL (Relocatable file)
Machine:                           Advanced Micro Devices X86-64
Version:                           0x1
Entry point address:               0x0
Start of program headers:          0 (bytes into file)
Start of section headers:          64 (bytes into file)
Flags:                             0x0
Size of this header:               64 (bytes)
Size of program headers:           0 (bytes)
Number of program headers:         0
Size of section headers:           64 (bytes)
Number of section headers:         7
Section header string table index: 3Running:outputs:
readelf -h hello_world.outMagic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class:                             ELF64
Data:                              2's complement, little endian
Version:                           1 (current)
OS/ABI:                            UNIX - System V
ABI Version:                       0
Type:                              EXEC (Executable file)
Machine:                           Advanced Micro Devices X86-64
Version:                           0x1
Entry point address:               0x4000b0
Start of program headers:          64 (bytes into file)
Start of section headers:          272 (bytes into file)
Flags:                             0x0
Size of this header:               64 (bytes)
Size of program headers:           56 (bytes)
Number of program headers:         2
Size of section headers:           64 (bytes)
Number of section headers:         6
Section header string table index: 3Bytes in the object file:
00000000  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  01 00 3e 00 01 00 00 00  00 00 00 00 00 00 00 00  |..>.............|
00000020  00 00 00 00 00 00 00 00  40 00 00 00 00 00 00 00  |........@.......|
00000030  00 00 00 00 40 00 00 00  00 00 40 00 07 00 03 00  |....@.....@.....|Executable:
00000000  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  02 00 3e 00 01 00 00 00  b0 00 40 00 00 00 00 00  |..>.......@.....|
00000020  40 00 00 00 00 00 00 00  10 01 00 00 00 00 00 00  |@...............|
00000030  00 00 00 00 40 00 38 00  02 00 40 00 06 00 03 00  |....@.8...@.....|Structure represented:
# define EI_NIDENT 16
typedef struct {
    unsigned char   e_ident[EI_NIDENT];
    Elf64_Half      e_type;
    Elf64_Half      e_machine;
    Elf64_Word      e_version;
    Elf64_Addr      e_entry;
    Elf64_Off       e_phoff;
    Elf64_Off       e_shoff;
    Elf64_Word      e_flags;
    Elf64_Half      e_ehsize;
    Elf64_Half      e_phentsize;
    Elf64_Half      e_phnum;
    Elf64_Half      e_shentsize;
    Elf64_Half      e_shnum;
    Elf64_Half      e_shstrndx;
} Elf64_Ehdr;Manual breakdown:
- 0 0: EI_MAG=7f 45 4c 46=0x7f 'E', 'L', 'F': ELF magic number
- 0 4: EI_CLASS=02=ELFCLASS64: 64 bit elf
- 0 5: EI_DATA=01=ELFDATA2LSB: little endian data
- 0 6: EI_VERSION=01: format version
- 0 7: EI_OSABI(only in 2003 Update) =00=ELFOSABI_NONE: no extensions.
- 0 8: EI_PAD= 8x00: reserved bytes. Must be set to 0.
- On the executable it is02 00forET_EXEC.Another important possibility for the executable isET_DYNfor PIE executables and shared libraries.ET_DYNtells the Linux kernel that the code is position independent, and can loaded at a random memory location with ASLR.
- 1 2: e_machine=3e 00=62=EM_X86_64: AMD64 architecture
- 1 4: e_version=01 00 00 00: must be 1
- 1 8:e_entry= 8x00: execution address entry point, or 0 if not applicable like for the object file since there is no entry point.On the executable, it isb0 00 40 00 00 00 00 00. The kernel puts the RIP directly on that value when executing. It can be configured by the linker script or-e. But it will segfault if you set it too low: stackoverflow.com/questions/2187484/why-is-the-elf-execution-entry-point-virtual-address-of-the-form-0x80xxxxx-and-n
- 40 00 00 00on the executable, i.e. it starts immediately after the ELF header.
- 2 8: e_shoff=407x00=0x40: section header table file offset, 0 if not present.
- The Intel386 architecture defines no flags; so this member contains zero. 
- 3 4: e_ehsize=40 00: size of this elf header. TODO why this field needed? Isn't the size fixed?
- 38 00on executable: it is 56 bytes long
- 02 00on executable: there are 2 entries.
- 3 A: e_shentsizeande_shnum=40 00 07 00: section header size and number of entries
- 3 E: e_shstrndx(Section Header STRing iNDeX) =03 00: index of the.shstrtabsection.
Array of 
Elf64_Shdr structs.Each entry contains metadata about a given section.
e_shoff of the ELF header gives the starting position, 0x40 here.So the table takes bytes from 0x40 to 
0x40 + 7 + 0x40 - 1 = 0x1FF.Some section names are reserved for certain section types: www.sco.com/developers/gabi/2003-12-17/ch4.sheader.html#special_sections e.g. 
.text requires a SHT_PROGBITS type and SHF_ALLOC + SHF_EXECINSTRRunning:outputs:
readelf -S hello_world.oThere are 7 section headers, starting at offset 0x40:
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .data             PROGBITS         0000000000000000  00000200
       000000000000000d  0000000000000000  WA       0     0     4
  [ 2] .text             PROGBITS         0000000000000000  00000210
       0000000000000027  0000000000000000  AX       0     0     16
  [ 3] .shstrtab         STRTAB           0000000000000000  00000240
       0000000000000032  0000000000000000           0     0     1
  [ 4] .symtab           SYMTAB           0000000000000000  00000280
       00000000000000a8  0000000000000018           5     6     4
  [ 5] .strtab           STRTAB           0000000000000000  00000330
       0000000000000034  0000000000000000           0     0     1
  [ 6] .rela.text        RELA             0000000000000000  00000370
       0000000000000018  0000000000000018           4     2     4
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)The 
struct represented by each entry is:typedef struct {
    Elf64_Word  sh_name;
    Elf64_Word  sh_type;
    Elf64_Xword sh_flags;
    Elf64_Addr  sh_addr;
    Elf64_Off   sh_offset;
    Elf64_Xword sh_size;
    Elf64_Word  sh_link;
    Elf64_Word  sh_info;
    Elf64_Xword sh_addralign;
    Elf64_Xword sh_entsize;
} Elf64_Shdr;Contained in bytes 0x40 to 0x7F.
The first section is always magic: www.sco.com/developers/gabi/2003-12-17/ch4.sheader.html says:
If the number of sections is greater than or equal to SHN_LORESERVE (0xff00), e_shnum has the value SHN_UNDEF (0) and the actual number of section header table entries is contained in the sh_size field of the section header at index 0 (otherwise, the sh_size member of the initial entry contains 0).
There are also other magic sections detailed in 
Figure 4-7: Special Section Indexes.In index 0, 
SHT_NULL is mandatory. Are there any other uses for it: stackoverflow.com/questions/26812142/what-is-the-use-of-the-sht-null-section-in-elf ?.data is section 1:00000080  01 00 00 00 01 00 00 00  03 00 00 00 00 00 00 00  |................|
00000090  00 00 00 00 00 00 00 00  00 02 00 00 00 00 00 00  |................|
000000a0  0d 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000000b0  04 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|- 80 4: sh_type=01 00 00 00:SHT_PROGBITS: the section content is not specified by ELF, only by how the program interprets it. Normal since a.datasection.
- 80 8: sh_flags=037x00:SHF_WRITEandSHF_ALLOC: www.sco.com/developers/gabi/2003-12-17/ch4.sheader.html#sh_flags, as required from a.datasection
- 90 0: sh_addr= 8x00: TODO: standard says:but I don't understand it very well yet.If the section will appear in the memory image of a process, this member gives the address at which the section's first byte should reside. Otherwise, the member contains 0. 
- 90 8: sh_offset=00 02 00 00 00 00 00 00=0x200: number of bytes from the start of the program to the first byte in this section
- 00000200 48 65 6c 6c 6f 20 77 6f 72 6c 64 21 0a 00 |Hello world!.. |- readelf -x .data hello_world.owhich outputs:- Hex dump of section '.data': 0x00000000 48656c6c 6f20776f 726c6421 0a Hello world!.NASM sets decent properties for that section because it treats- .datamagically: www.nasm.us/doc/nasmdoc7.html#section-7.9.2Also note that this was a bad section choice: a good C compiler would put the string in- .rodatainstead, because it is read-only and it would allow for further OS optimizations.- a0 8: sh_linkandsh_info= 8x 0: do not apply to this section type. www.sco.com/developers/gabi/2003-12-17/ch4.sheader.html#special_sections
- b0 0: sh_addralign=04= TODO: why is this alignment necessary? Is it only forsh_addr, or also for symbols insidesh_addr?
- b0 8: sh_entsize=00= the section does not contain a table. If != 0, it means that the section contains a table of fixed size entries. In this file, we see from thereadelfoutput that this is the case for the.symtaband.rela.textsections.
 
- a0 8: 
Now that we've done one section manually, let's graduate and use the 
readelf -S of the other sections:  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 2] .text             PROGBITS         0000000000000000  00000210
       0000000000000027  0000000000000000  AX       0     0     16.text is executable but not writable: if we try to write to it Linux segfaults. Let's see if we really have some code there:objdump -d hello_world.ohello_world.o:     file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_start>:
   0:       b8 01 00 00 00          mov    $0x1,%eax
   5:       bf 01 00 00 00          mov    $0x1,%edi
   a:       48 be 00 00 00 00 00    movabs $0x0,%rsi
  11:       00 00 00
  14:       ba 0d 00 00 00          mov    $0xd,%edx
  19:       0f 05                   syscall
  1b:       b8 3c 00 00 00          mov    $0x3c,%eax
  20:       bf 00 00 00 00          mov    $0x0,%edi
  25:       0f 05                   syscallIf we grep 
b8 01 00 00 on the hd, we see that this only occurs at 00000210, which is what the section says. And the Size is 27, which matches as well. So we must be talking about the right section.The most interesting part is line to pass the address of the string to the system call. Currently, the This modification is possible because of the data of the 
a which does:movabs $0x0,%rsi0x0 is just a placeholder. After linking happens, it will be modified to contain:4000ba: 48 be d8 00 60 00 00    movabs $0x6000d8,%rsi.rela.text section. Pinned article: Introduction to the OurBigBook Project
Welcome to the OurBigBook Project! Our goal is to create the perfect publishing platform for STEM subjects, and get university-level students to write the best free STEM tutorials ever.
Everyone is welcome to create an account and play with the site: ourbigbook.com/go/register. We belive that students themselves can write amazing tutorials, but teachers are welcome too. You can write about anything you want, it doesn't have to be STEM or even educational. Silly test content is very welcome and you won't be penalized in any way. Just keep it legal!
Intro to OurBigBook
. Source. We have two killer features:
- topics: topics group articles by different users with the same title, e.g. here is the topic for the "Fundamental Theorem of Calculus" ourbigbook.com/go/topic/fundamental-theorem-of-calculusArticles of different users are sorted by upvote within each article page. This feature is a bit like:- a Wikipedia where each user can have their own version of each article
- a Q&A website like Stack Overflow, where multiple people can give their views on a given topic, and the best ones are sorted by upvote. Except you don't need to wait for someone to ask first, and any topic goes, no matter how narrow or broad
 This feature makes it possible for readers to find better explanations of any topic created by other writers. And it allows writers to create an explanation in a place that readers might actually find it.Figure 1. Screenshot of the "Derivative" topic page. View it live at: ourbigbook.com/go/topic/derivativeVideo 2. OurBigBook Web topics demo. Source.
- local editing: you can store all your personal knowledge base content locally in a plaintext markup format that can be edited locally and published either:This way you can be sure that even if OurBigBook.com were to go down one day (which we have no plans to do as it is quite cheap to host!), your content will still be perfectly readable as a static site.- to OurBigBook.com to get awesome multi-user features like topics and likes
- as HTML files to a static website, which you can host yourself for free on many external providers like GitHub Pages, and remain in full control
 Figure 3. Visual Studio Code extension installation.Figure 4. Visual Studio Code extension tree navigation.Figure 5. Web editor. You can also edit articles on the Web editor without installing anything locally.Video 3. Edit locally and publish demo. Source. This shows editing OurBigBook Markup and publishing it using the Visual Studio Code extension.Video 4. OurBigBook Visual Studio Code extension editing and navigation demo. Source.
- Infinitely deep tables of contents:
All our software is open source and hosted at: github.com/ourbigbook/ourbigbook
Further documentation can be found at: docs.ourbigbook.com
Feel free to reach our to us for any help or suggestions: docs.ourbigbook.com/#contact






