A computer is a highly layered system, and so you have to decide which layers you are the most interested in studying.
Although the layers are somewhat independent, they also sometimes interact, and when that happens it usually hurts your brain. E.g., if compilers were perfect, no one optimizing software would have to know anything about microarchitecture. But if you want to go hardcore enough, you might have to learn some lower layer.
It must also be said that, like in any industry, certain layers are hidden behind commercial secrecy, which makes them harder to actually learn. In computing, the lower level you go, the more closed source things tend to become.
But as you climb down into the abyss of low level hardcoreness, don't forget that making useful things is more important than being hardcore: Figure 1. "xkcd 378: Real Programmers".
First, the most important thing you should know about this subject: cirosantilli.com/linux-kernel-module-cheat/should-you-waste-your-life-with-systems-programming
Here's a summary from low-level to high-level:
- semiconductor physical implementation: this level is of course the most closed, but it is fun to try and peek into it through any openings given by companies and academia:
- photolithography, and notably photomask design
- register transfer level
- interactive Verilator fun: Is it possible to do interactive user input and output simulation in VHDL or Verilog?
- more importantly, and much harder/maybe impossible with open source, would be to try and set up an open source standard cell library and supporting software to obtain power, performance and area estimates
- Are there good open source standard cell libraries to learn IC synthesis with EDA tools? on Quora
- the most open source ones are some initiatives targeting FPGAs, e.g. symbiflow.github.io/, www.clifford.at/icestorm/
- qflow is an initiative targeting actual integrated circuits
- microarchitecture: a good way to play with this is to try and run some minimal userland examples on gem5 userland simulation with logging, e.g. see the examples on the Linux Kernel Module Cheat. This should be done at the same time as books/websites/courses that explain the microarchitecture basics. This is the level of abstraction that Ciro Santilli finds the most interesting of the hardware stack. Learning it for actual CPUs (which as of 2020 is only partially documented by vendors) could actually be useful in hardcore software optimization use cases. A tiny example of the kind of program to run is sketched just below.
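For instance, here is the flavor of a minimal userland program one might trace this way; this is a hypothetical sketch, not taken from the repository, and the gem5 invocation itself is left out since it varies between versions:

```c
/* Deliberately tiny userland program to run under gem5 syscall
 * emulation with instruction logging enabled. The loop bound is
 * arbitrary: keep it small so the logs stay readable. */
int main(void) {
    unsigned sum = 0;
    for (unsigned i = 0; i < 16; i++)
        sum += i;
    /* Return the result so the compiler cannot optimize the loop away. */
    return (int)(sum & 0xffu);
}
```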
- instruction set architecture: a good approach to learn this is to manually write some userland assembly with assertions as done in the Linux Kernel Module Cheat (a C inline assembly take on this is sketched after this list), e.g. at:
- github.com/cirosantilli/linux-kernel-module-cheat/blob/9b6552ab6c66cb14d531eff903c4e78f3561e9ca/userland/arch/x86_64/add.S
- cirosantilli.com/linux-kernel-module-cheat/x86-userland-assembly
- learn a bit about calling conventions, e.g. by calling C standard library functions from assembly:
- you can also try and understand what some simple C programs compile to. Things can get a bit hard though when `-O3` is used. Some cute examples:
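To give the flavor of the assembly-with-assertions approach without leaving C, here is a hypothetical sketch using GCC extended inline assembly on x86_64; the instruction chosen is illustrative, not taken from the repository:

```c
/* Compute out = in + 1 in assembly, then let assert() check it,
 * mimicking the assertion style of the userland assembly examples
 * in the Linux Kernel Module Cheat. x86_64 + GCC/Clang only. */
#include <assert.h>

int main(void) {
    unsigned long in = 1, out;
    __asm__ (
        "lea 1(%[in]), %[out]"
        : [out] "=r" (out)
        : [in]  "r"  (in)
    );
    assert(out == 2);
    return 0;
}
```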
- executable file format, notably the Executable and Linkable Format (ELF). Particularly important is to understand the basics of:
- address relocation: How do linkers and address relocation work?
- position independent code: What is the -fPIE option for position-independent executables in GCC and ld?
- how to observe which symbols are present in object files (a small example to dissect is sketched after this list), e.g.:
- how C++ uses name mangling, e.g.: What is the effect of extern "C" in C++?
- how C++ template instantiation can help reduce link time and size: Explicit template instantiation - when is it used?
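As a concrete starting point for the symbol observation bullet above, a minimal sketch (the file name foo.c is hypothetical) that produces one defined, one undefined and one local symbol to look at:

```c
/* foo.c: compile with `gcc -c foo.c`, then inspect with:
 *   nm foo.o         # expect lines like: "T foo", "U bar", "b counter"
 *   readelf -s foo.o # the same information with more ELF detail
 */
void bar(void);     /* declared but not defined here: undefined, "U" */
static int counter; /* file-local, zero-initialized: local BSS, "b" */

void foo(void) {    /* defined global function: text symbol, "T" */
    counter++;
    bar();
}
```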
- operating system. There are two ways to approach this:
- learn about the Linux kernel. A good starting point is to learn about its main interfaces. This is well shown at the Linux Kernel Module Cheat:
- system calls
  - write some system calls in pure assembly:
  - write some system calls in C GCC inline assembly (a raw write example follows this sub-list):
- learn about kernel modules and their interfaces. Notably, learn enough to demystify special files such as `/dev/random` and so on (a minimal module sketch also follows this sub-list):
- learn how to do a minimal Linux kernel disk image/boot to userland hello world: What is the smallest possible Linux implementation?
- learn how to GDB Step debug the Linux kernel itself. Once you know this, you will feel that "given enough patience, I could understand anything that I wanted about the kernel", and you can then proceed to not learn almost anything about it and carry on with your life
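For the C GCC inline assembly bullet above, a minimal sketch of making the write system call directly on x86_64 Linux without the libc wrapper; the syscall number and register usage follow the x86_64 Linux ABI:

```c
/* write(1, msg, len) via the raw `syscall` instruction.
 * x86_64 Linux only: rax holds the syscall number (1 = write),
 * arguments go in rdi, rsi, rdx; rcx and r11 get clobbered. */
static const char msg[] = "hello syscall\n";

int main(void) {
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a" (ret)
        : "a" (1),              /* rax: __NR_write */
          "D" (1),              /* rdi: fd = stdout */
          "S" (msg),            /* rsi: buffer */
          "d" (sizeof(msg) - 1) /* rdx: byte count */
        : "rcx", "r11", "memory"
    );
    return ret == sizeof(msg) - 1 ? 0 : 1;
}
```

And for the kernel module bullet, a minimal hello world module sketch; it assumes the usual obj-m kernel build setup (which the Linux Kernel Module Cheat automates), so only the module source is shown:

```c
/* Minimal Linux kernel module: logs a message on load and unload.
 * Load with insmod, watch the output with dmesg, remove with rmmod. */
#include <linux/init.h>
#include <linux/module.h>

MODULE_LICENSE("GPL");

static int __init hello_init(void)
{
    pr_info("hello from a kernel module\n");
    return 0;
}

static void __exit hello_exit(void)
{
    pr_info("goodbye from a kernel module\n");
}

module_init(hello_init);
module_exit(hello_exit);
```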
- write your own (mini-) OS, or study a minimal educational OS, e.g. as in:
- programming language
Whenever Ciro Santilli learns about molecular biology, he can't help but feel that it is a lot like programming, and notably systems programming and computer hardware design.
In some sense, the comparison is obvious: DNA is clearly a programmable medium like any assembly language, but still, systems programming did give Ciro some further feelings.
- The most important analogy perhaps is observability, or more precisely the lack of it. For the computer, this is described at: The lower level you go into a computer, the harder it is to observe things. And then, when Ciro started learning a bit about biology techniques, he started to feel the exact same thing. For example, when he played with the E. Coli Whole Cell Model by Covert Lab, the main thing Ciro felt was: it is going to be hard to verify any of this data, because it is hard/impossible to know the concentration of each element in a cell as a function of time. More generally of course, this is exactly why making any biology discovery is so hard: we can't easily see what's going on inside the cell, and have to resort to indirect ways of doing so. This exact idea was highlighted by I should have loved biology by James Somers:
For a computer scientist, a biologist's methods can seem insane; the trouble comes from the fact that cells are too small, too numerous, too complex to analyze the way a programmer would, say in a step-by-step debugger.
And then just like in software, some of the methods biologists use to overcome the lack of visibility have direct software analogues:
- add instrumentation to cells, e.g. GFP tagging comes to mind
- emulation, e.g. E. Coli Whole Cell Model by Covert Lab
- The boot process is another one. E.g. in x86, the way that you start in 16-bit mode, largely compatible all the way back to the 1970s, then move to 32-bit and finally 64-bit, does feel a lot like the way the earlier stages of embryo development look more and more like more ancient animals.
Ciro likes to think that maybe that is why a hardcore systems programmer like Bert Hubert got into molecular biology.
Some other people who mention similar things:
- I should have loved biology by James Somers highlights the computer abstraction layer analogy between the two:
Paging makes it easier to compile and run two programs or threads at the same time on a single computer.
For example, when you compile two programs, the compiler does not know if they are going to be running at the same time or not.
So nothing prevents it from using the same RAM address, say, `0x1234`, to store a global variable. And thread stacks, that must be contiguous and keep growing down until they overwrite each other, are an even bigger issue!
But if two programs use the same address and run at the same time, this is obviously going to break them!
Paging solves this problem beautifully by adding one degree of indirection:
(logical) --- paging ---> (physical)
Where:
- logical addresses are what userland programs see, e.g. the contents of `rsi` in `mov eax, [rsi]`. They are often called "virtual" addresses as well.
- physical addresses can be thought of as the values that go to the physical RAM index wires. But keep in mind that this is not 100% true because of further indirections such as:
Compilers don't need to worry about other programs: they just use simple logical addresses.
As far as programs are concerned, they think they can use any address between 0 and 4GiB (2^32 bytes, i.e. up to address `0xFFFFFFFF`) on 32-bit systems. The OS then sets up paging so that identical logical addresses will go into different physical addresses and not overwrite each other.
This makes it much simpler to compile programs and run them at the same time.
Paging achieves that goal, and in addition:
- the switch between programs is very fast, because it is implemented by hardware
- the memory of both programs can grow and shrink as needed without too much fragmentation
- one program can never access the memory of another program, even if it wanted to. This is good both for security, and to prevent bugs in one program from crashing other programs.
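To make the indirection concrete, here is a toy C model of the translation step; the page size and the frame numbers are made up for illustration, and real page tables are multi-level structures walked by hardware:

```c
/* Toy model of paging: two "programs" both use logical address
 * 0x1234, but each gets its own page table, so they land in
 * different physical frames and never overwrite each other. */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096u

/* page table: logical page number -> physical frame number */
static const uint32_t page_table_a[16] = { [1] = 7 }; /* page 1 -> frame 7 */
static const uint32_t page_table_b[16] = { [1] = 9 }; /* page 1 -> frame 9 */

static uint32_t translate(const uint32_t *page_table, uint32_t logical) {
    uint32_t page   = logical / PAGE_SIZE;
    uint32_t offset = logical % PAGE_SIZE;
    return page_table[page] * PAGE_SIZE + offset;
}

int main(void) {
    uint32_t logical = 0x1234; /* the same logical address in both */
    printf("program A: 0x%x -> 0x%x\n", logical, translate(page_table_a, logical));
    printf("program B: 0x%x -> 0x%x\n", logical, translate(page_table_b, logical));
    return 0;
}
```

Running it prints 0x1234 -> 0x7234 and 0x1234 -> 0x9234: the same logical address maps to two different physical addresses, which is the whole point.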
Or if you like non-funny jokes: