These are the best articles ever authored by Ciro Santilli, most of them in the format of Stack Overflow answers.
Ciro posts updates about new articles on his Twitter accounts.
A chronological list of all articles is also kept at: Section "Updates".
Some miscellaneous, generally less technical in-tree essays are listed at: Section "Essays by Ciro Santilli".
- Trended on Hacker News:
- CIA 2010 covert communication websites on 2023-06-11. 190 points, a mild success.
- x86 Bare Metal Examples on 2019-03-19. 513 points. The third time something related to that repo trends. Hacker News people really like that repo!
- again 2020-06-27 (archive). 200 points, repository traffic jumped from 25 daily unique visitors to 4.6k unique visitors on the day
- How to run a program without an operating system? on 2018-11-26 (archive). 394 points. Covers x86 and ARM
- ELF Hello World Tutorial on 2017-05-17 (archive). 334 points.
- x86 Paging Tutorial on 2017-03-02. Number 1 Google search result for "x86 Paging" in 2017-08. 142 points.
- x86 assembly
- What does "multicore" assembly language look like?
- What is the function of the push / pop instructions used on registers in x86 assembly? Going down to memory spills, register allocation and graph coloring.
- Linux kernel
- What do the flags in /proc/cpuinfo mean?
- How does kernel get an executable binary file running under linux?
- How to debug the Linux kernel with GDB and QEMU?
- Can the sys_execve() system call in the Linux kernel receive both absolute or relative paths?
- What is the difference between the kernel space and the user space?
- Is there any API for determining the physical address from virtual address in Linux?
- Why do people write the #!/usr/bin/env python shebang on the first line of a Python script?
- How to solve "Kernel Panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)"?
Figure 2. Path from init/main.c until bzImage in the Linux kernel 4.19. Source. From: What is the difference between the following kernel Makefile terms: vmLinux, vmlinuz, vmlinux.bin, zimage & bzimage?
- Single program Linux distro
- QEMU
- gcc and Binutils:
- How do linkers and address relocation work?
- What is incremental linking or partial linking?
- GOLD (-fuse-ld=gold) linker vs the traditional GNU ld and LLVM lld
- What is the -fPIE option for position-independent executables in GCC and ld? Concrete examples by running a program through GDB twice, and an assembly hello world with absolute vs PC-relative load.
- How many GCC optimization levels are there?
- Why does GCC create a shared object instead of an executable binary according to file?
- C/C++: almost all of those fall into the "disassemble all the things" category. Ciro also does "standards dissection" and "a new version of the standard is out" answers, but those are boring:
- What does "static" mean in a C program?
- In C++ source, what is the effect of extern "C"?
- Char array vs Char Pointer in C
- How to compile glibc from source and use it?
- When should static_cast, dynamic_cast, const_cast and reinterpret_cast be used?
- What exactly is std::atomic in C++? This answer was originally more appropriately entitled "Let's disassemble some stuff", and got three downvotes, so Ciro changed it to a more professional title, and it started getting upvotes. People judge books by their covers.
notmain.o 0000000000000000 0000000000000017 W MyTemplate<int>::f(int)
main.o 0000000000000000 0000000000000017 W MyTemplate<int>::f(int)
Code 1. nm outputs showing that objects are redefined multiple times across files if you don't use template instantiation properly. From: What is explicit template instantiation in C++ and when to use it?
- IEEE 754
- What is difference between quiet NaN and signaling NaN?
- In Java, what does NaN mean?
Without subnormals:

          +---+---+-------+---------------+-------------------------------+
exponent  | ? | 0 | 1     | 2             | 3                             |
          +---+---+-------+---------------+-------------------------------+
          |   |   |       |               |                               |
          v   v   v       v               v                               v
          -----------------------------------------------------------------
floats    *    **** * * * *   *   *   *   *       *       *       *       *
          -----------------------------------------------------------------
          ^   ^   ^       ^               ^                               ^
          |   |   |       |               |                               |
          0   |   2^-126  2^-125          2^-124                          2^-123
              |
              2^-127

With subnormals:

          +-------+-------+---------------+-------------------------------+
exponent  | 0     | 1     | 2             | 3                             |
          +-------+-------+---------------+-------------------------------+
          |       |       |               |                               |
          v       v       v               v                               v
          -----------------------------------------------------------------
floats    * * * * * * * * *   *   *   *   *       *       *       *       *
          -----------------------------------------------------------------
          ^   ^   ^       ^               ^                               ^
          |   |   |       |               |                               |
          0   |   2^-126  2^-125          2^-124                          2^-123
              |
              2^-127

Code 2. Visualization of subnormal floating point numbers vs what IEEE 754 would look like without them. From: What is a subnormal floating point number?
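The diagram above uses a toy format with only a few values per exponent. As a rough check against real 32-bit floats, the following Python sketch reinterprets raw bit patterns with the standard struct module. The helper name float32_from_bits is made up for this illustration and does not come from the linked answer.

```python
import struct


def float32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit unsigned integer as an IEEE 754 binary32 value.

    Hypothetical helper for illustration only.
    """
    return struct.unpack(">f", struct.pack(">I", bits))[0]


# Exponent field 0 with a non-zero fraction: subnormal numbers.
smallest_subnormal = float32_from_bits(0x00000001)
largest_subnormal = float32_from_bits(0x007FFFFF)

# Exponent field 1 with a zero fraction: the smallest normal number, 2^-126.
smallest_normal = float32_from_bits(0x00800000)

# Distance between neighbouring floats just below and just above 2^-126.
step_below = float32_from_bits(0x00000002) - smallest_subnormal
step_above = float32_from_bits(0x00800001) - smallest_normal

print("smallest subnormal:", smallest_subnormal)
print("largest subnormal: ", largest_subnormal)
print("smallest normal:   ", smallest_normal)
print("same spacing on both sides of 2^-126:", step_below == step_above)
```

The two step values being equal is what the "with subnormals" half of the diagram shows: the spacing just below 2^-126 matches the spacing just above it, so there is no sudden gap next to zero.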
- Computer science
- Algorithms
- Is it necessary for NP problems to be decision problems?
- Polynomial time and exponential time. Answered focusing on the definition of "exponential time".
- What is the smallest Turing machine where it is unknown if it halts or not? Answer focusing on the "blank tape" initial condition only. Large parts of it summarize the Busy Beaver Challenge, but some additions were made.
- Git
  | 0           | 4            | 8           | C              |
  |-------------|--------------|-------------|----------------|
0 | DIRC        | Version      | File count  | ctime       ...| 0
  | ...         | mtime                      | device         |
2 | inode       | mode         | UID         | GID            | 2
  | File size   | Entry SHA-1                              ...|
4 | ...                        | Flags       | Index SHA-1 ...| 4
  | ...                                                       |

tree {tree_sha}
{parents}
author {author_name} <{author_email}> {author_date_seconds} {author_date_timezone}
committer {committer_name} <{committer_email}> {committer_date_seconds} {committer_date_timezone}

{commit message}

Code 4. Description of the Git commit object binary data structure. From: What is the file format of a git commit object data structure?
- How do I clone a subdirectory only of a Git repository?
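As a sketch of how the commit layout above turns into an object ID: Git hashes the object type, the content length and a NUL byte, followed by the content itself. The function names and the author details below are hypothetical, chosen only for illustration, and SHA-1 object IDs are assumed.

```python
import hashlib


def git_object_id(obj_type: bytes, content: bytes) -> str:
    """Return the SHA-1 object ID for an (uncompressed) Git object."""
    header = obj_type + b" " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def commit_content(tree_sha, parents, author, committer, message) -> bytes:
    """Serialize commit fields following the layout shown above.

    author and committer are expected as
    "{name} <{email}> {date_seconds} {date_timezone}".
    """
    lines = [f"tree {tree_sha}"]
    lines += [f"parent {parent}" for parent in parents]
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    # A blank line separates the header fields from the commit message.
    return ("\n".join(lines) + "\n\n" + message).encode()


# The ID of the empty tree: it has zero bytes of content.
empty_tree = git_object_id(b"tree", b"")

# Hypothetical identity and timestamp, for illustration only.
who = "Example Author <author@example.com> 1500000000 +0000"
content = commit_content(empty_tree, [], who, who, "Initial commit\n")
commit_id = git_object_id(b"commit", content)

print(content.decode())
print("commit id:", commit_id)
```

A root commit has no parent lines, which is why the parents list is empty here; a merge commit would carry two or more.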
- Python
- Web technology
- OpenGL
Figure 8. Example of a texture atlas containing glyphs. Source. Image by Nicolas P. Rougier, author of Freetype GL. Used on Ciro Santilli's answer: How to draw text using only OpenGL methods?
- What are shaders in OpenGL?
- Why do we use 4x4 matrices to transform things in 3D?
Figure 10. Source.
- Image Processing with GLSL shaders? Compared the CPU and GPU for a simple blur algorithm.
- Node.js
- Ruby on Rails
- POSIX
- What is POSIX? Huge classified overview of the most important things that POSIX specifies.
- Systems programming
- What do the terms "CPU bound" and "I/O bound" mean?
Figure 12. Plot of "real", "user" and "sys" mean times of the output of time for CPU-bound workload with 8 threads. Source. From: What do 'real', 'user' and 'sys' mean in the output of time?

+--------+                  +------------+       +------+
| device |>---------------->| function 0 |>----->| BAR0 |
|        |                  |            |       +------+
|        |>------------+    |            |
|        |             |    |            |       +------+
...    ...             |    |            |>----->| BAR1 |
|        |             |    |            |       +------+
|        |>--------+   |    |            |
+--------+         |   |    ...  ...   ...
                   |   |    |            |
                   |   |    |            |       +------+
                   |   |    |            |>----->| BAR5 |
                   |   |    +------------+       +------+
                   |   |
                   |   |
                   |   |    +------------+       +------+
                   |   +--->| function 1 |>----->| BAR0 |
                   |        |            |       +------+
                   |        |            |
                   |        |            |       +------+
                   |        |            |>----->| BAR1 |
                   |        |            |       +------+
                   |        |            |
                   |        ...  ...   ...
                   |        |            |
                   |        |            |       +------+
                   |        |            |>----->| BAR5 |
                   |        +------------+       +------+
                   |
                   |
                   |        ...
                   |
                   |
                   |        +------------+       +------+
                   +------->| function 7 |>----->| BAR0 |
                            |            |       +------+
                            |            |
                            |            |       +------+
                            |            |>----->| BAR1 |
                            |            |       +------+
                            |            |
                            ...  ...   ...
                            |            |
                            |            |       +------+
                            |            |>----->| BAR5 |
                            +------------+       +------+

Code 5. Logical structure of a PCIe device, its functions and BARs. From: What is the Base Address Register (BAR) in PCIe?
- Electronics
- Raspberry Pi
Figure 13. Image from answer to: How to hook up a Raspberry Pi via Ethernet to a laptop without a router?
Figure 14. Image from answer to: How to hook up a Raspberry Pi via Ethernet to a laptop without a router?
Figure 15. Image from answer to: How to emulate the Raspberry Pi 2 on QEMU?
Figure 16. Bare metal LED blinker program running on a Raspberry Pi 2. Image from answer to: How to run a C program with no OS on the Raspberry Pi?
- Computer security
- Media
Video 2. Canon in D in C. Source. The original question was deleted, lol...: How to programmatically synthesize music?
- How to resize a picture using ffmpeg's sws_scale()?
- Is there any decent speech recognition software for Linux? Ran a few examples manually on vosk-api and compared the results to the ground truth.
- Eclipse
- Computer hardware
- Scientific visualization software
Figure 17. VisIt zoom in 10 million straight line plot with some manually marked points. Source. From: Section "Survey of open source interactive plotting software with a 10 million point scatter plot benchmark by Ciro Santilli"
- Numerical analysis
Video 3. Real-time heat equation OpenGL visualization with interactive mouse cursor using relaxation method by Ciro Santilli (2016). Source.
- Computational physics
Figure 18. gnuplot plot of the y position of a sphere bouncing on a plane simulated in Bullet Physics. Source. From: What is the simplest collision example possible in a Bullet Physics simulation?
- Register transfer level languages like Verilog and VHDL
- Verilog:
Figure 19. See also: Section "Verilator interactive example"
- Android
Video 4. Android screen showing live on an Ubuntu laptop through ADB. Source. From: How to see the Android screen live on an Ubuntu desktop through ADB?
- Debugging
- Program optimization
- What is tail call optimization?
Figure 21. gprof2dot image generated from the gprof data of a simple test program. Source. The answer compares gprof, valgrind callgrind, perf and gperftools on a single simple executable.
- Data
Figure 22. Mathematics dump of Wikipedia CatTree. Source.
- Mathematics
Figure 23. Diagram of the fundamental theorem on homomorphisms by Ciro Santilli (2020). Shows the relationship between group homomorphisms and normal subgroups.
- Section "Formalization of mathematics": some early thoughts that could be expanded. Ciro almost had a stroke when he understood this stuff in his teens.
Figure 24. Simple example of the Discrete Fourier transform. Source. This was missing from the Wikipedia page: en.wikipedia.org/wiki/Discrete_Fourier_transform!
- Network programming
- Physics
- What is the difference between plutonium and uranium?
Figure 25. Spacetime diagram illustrating how faster-than-light travel implies time travel. From: Does faster than light travel imply travelling back in time?
- Biology
Figure 27. Mass fractions in a minimal growth medium vs an amino acid cut in a simulation of the E. Coli Whole Cell Model by Covert Lab. Source. From: Section "E. Coli Whole Cell Model by Covert Lab"
- Quantum computing
- Section "Quantum computing is just matrix multiplication"
Figure 28. Visualization of the continuous deformation of states as we walk around the Bloch sphere represented as photon polarization arrows. From: Understanding the Bloch sphere.
- Bitcoin
- GIMP
Figure 29. GIMP screenshot, part of: How to combine two images side-by-side in GIMP?
- Home DIY
Figure 30. Total_Blackout_Cassette_Roller_Blind_With_Curtains. Source. From: Section "How to blackout your window without drilling"
- China
A computer is a highly layered system, and so you have to decide which layers you are the most interested in studying.
Although the layers are somewhat independent, they also sometimes interact, and when that happens it usually hurts your brain. E.g., if compilers were perfect, no one optimizing software would have to know anything about microarchitecture. But if you want to go hardcore enough, you might have to learn some lower layer.
It must also be said that, like in any industry, certain layers are shrouded in commercial secrecy, making it harder to actually learn them. In computing, the lower the level you go, the more closed source things tend to become.
But as you climb down into the abyss of low level hardcoreness, don't forget that being useful is more important than being hardcore: Figure 1. "xkcd 378: Real Programmers".
First, the most important thing you should know about this subject: cirosantilli.com/linux-kernel-module-cheat/should-you-waste-your-life-with-systems-programming
Here's a summary from low-level to high-level:
- semiconductor physical implementation: this level is of course the most closed, but it is fun to try and peek into it through any openings offered by industry and academia:
- photolithography, and notably photomask design
- register transfer level
- interactive Verilator fun: Is it possible to do interactive user input and output simulation in VHDL or Verilog?
- more importantly, and much harder/maybe impossible with open source, would be to try and set up an open source standard cell library and supporting software to obtain power, performance and area estimates
- Are there good open source standard cell libraries to learn IC synthesis with EDA tools? on Quora
- the most open source ones are some initiatives targeting FPGAs, e.g. symbiflow.github.io/, www.clifford.at/icestorm/
- qflow is an initiative targeting actual integrated circuits
- microarchitecture: a good way to play with this is to try and run some minimal userland examples on gem5 userland simulation with logging, e.g. as shown in the Linux Kernel Module Cheat. This should be done alongside books/websites/courses that explain the microarchitecture basics. This is the level of abstraction that Ciro Santilli finds the most interesting of the hardware stack. Learning it for actual CPUs (which as of 2020 is only partially documented by vendors) could actually be useful in hardcore software optimization use cases.
- instruction set architecture: a good approach to learn this is to manually write some userland assembly with assertions as done in the Linux Kernel Module Cheat e.g. at:
- github.com/cirosantilli/linux-kernel-module-cheat/blob/9b6552ab6c66cb14d531eff903c4e78f3561e9ca/userland/arch/x86_64/add.S
- cirosantilli.com/linux-kernel-module-cheat/x86-userland-assembly
- learn a bit about calling conventions, e.g. by calling C standard library functions from assembly:
- you can also try and understand what some simple C programs compile to. Things can get a bit hard though when -O3 is used. Some cute examples:
- executable file format, notably Executable and Linkable Format. Particularly important is to understand the basics of:
- address relocation: How do linkers and address relocation work?
- position independent code: What is the -fPIE option for position-independent executables in GCC and ld?
- how to observe which symbols are present in object files, e.g.:
- how C++ uses name mangling What is the effect of extern "C" in C++?
- how C++ template instantiation can help reduce link time and size: Explicit template instantiation - when is it used?
- operating system. There are two ways to approach this:
- learn about the Linux kernel. A good starting point is to learn about its main interfaces. This is well shown at Linux Kernel Module Cheat:
- system calls
- write some system calls in
- pure assembly:
- C GCC inline assembly:
- learn about kernel modules and their interfaces. Notably, learn to demystify special files such as /dev/random and so on:
- learn how to do a minimal Linux kernel disk image/boot to userland hello world: What is the smallest possible Linux implementation?
- learn how to GDB step debug the Linux kernel itself. Once you know this, you will feel that "given enough patience, I could understand anything that I wanted about the kernel", and you can then proceed to learn almost nothing about it and carry on with your life
- write your own (mini-) OS, or study a minimal educational OS, e.g. as in:
- programming language
How low can you go video by Ciro Santilli (2017). Source. In this infamous video Ciro has summarized the computer hierarchy.
One very good thing about Verilator is that it makes it easy to create test cases directly in C++. You just supply inputs and clock the simulation directly in a C++ loop, then read outputs and assert them with assert(). And you can inspect variables by printing them or with GDB. This is infinitely more convenient than doing these IO-type tasks in Verilog itself.
Some simulation examples can be found under verilog.
First install Verilator. On Ubuntu:
sudo apt install verilator
Tested on Verilator 4.038, Ubuntu 22.04.
Run all examples, which have assertions in them:
cd verilog
make run
File structure is for example:
- verilog/counter.v: Verilog file
- verilog/counter.cpp: C++ loop which clocks the design and runs tests with assertions on the outputs
- verilog/counter.params: gcc compilation flags for this example
- verilog/counter_tb.v: Verilog version of the C++ test. Not used by Verilator. Verilator can't actually run our _tb files, because they do IO things in Verilog that we do better from C++ in Verilator, so Verilator didn't bother implementing them. This is a good thing.
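The clock-and-assert pattern used by the C++ test loops can be sketched without Verilator at all. The Python below is only a behavioural stand-in: the CounterModel class, its signal names, its eval() method and the 4-bit width are all assumptions made for this illustration, not the contents of verilog/counter.v or the API of a Verilator-generated model.

```python
class CounterModel:
    """Hypothetical stand-in for a simulated counter design.

    Assumed behaviour: synchronous active high reset, active high enable,
    and a 4-bit output that wraps around.
    """

    WIDTH = 4  # assumed width, for illustration only

    def __init__(self):
        self.clock = 0
        self.reset = 0
        self.enable = 0
        self.out = 0
        self._previous_clock = 0

    def eval(self):
        """Update outputs; state only changes on a rising clock edge."""
        if self.clock == 1 and self._previous_clock == 0:
            if self.reset:
                self.out = 0
            elif self.enable:
                self.out = (self.out + 1) % (1 << self.WIDTH)
        self._previous_clock = self.clock


def tick(dut):
    """Drive one full clock cycle: low phase, then rising edge."""
    dut.clock = 0
    dut.eval()
    dut.clock = 1
    dut.eval()


dut = CounterModel()

# Reset for one cycle, then check the output.
dut.reset = 1
tick(dut)
assert dut.out == 0

# Enable counting and record the output after every cycle.
dut.reset = 0
dut.enable = 1
observed = []
for _ in range(20):
    tick(dut)
    observed.append(dut.out)

# With enable low, the value must hold.
dut.enable = 0
tick(dut)
held = dut.out
print(observed, held)
```

The test is ordinary host-language code: set inputs, toggle the clock, evaluate, then assert on outputs. That is the convenience the C++ loop gives over writing the same checks in a Verilog testbench.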
Example list:
- verilog/negator.v, verilog/negator.cpp: the simplest non-identity combinatorial circuit!
- verilog/counter.v, verilog/counter.cpp: sequential hello world. Synchronous active high reset with active high enable signal. Adapted from: www.asic-world.com/verilog/first1.html
- verilog/subleq.v, verilog/subleq.cpp: subleq one instruction set computer with separated instruction and data RAMs
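For readers who have not met subleq before, here is a minimal software sketch of the idea, with instructions and data held in separate lists to mirror the separated RAMs. Each instruction (a, b, c) subtracts data[a] from data[b] and jumps to c if the result is less than or equal to zero. The halting rule (stop when the program counter leaves the program), the step limit and the sample program are assumptions for this illustration and are not taken from verilog/subleq.v.

```python
def run_subleq(program, data, max_steps=1000):
    """Run a subleq program with separate instruction and data memories.

    program: list of (a, b, c) tuples.
    data: list of integers, modified in place.
    Assumed halting rule: stop when the program counter leaves the program.
    """
    pc = 0
    steps = 0
    while 0 <= pc < len(program):
        if steps >= max_steps:
            raise RuntimeError("step limit reached")
        a, b, c = program[pc]
        data[b] -= data[a]
        # Branch if the result is less than or equal to zero.
        pc = c if data[b] <= 0 else pc + 1
        steps += 1
    return data


# Data layout: data[0] = x, data[1] = y, data[2] = scratch cell Z (starts at 0).
# The program computes y = y + x using only subtraction.
program = [
    (0, 2, 1),   # Z = Z - x, continue at instruction 1 either way
    (2, 1, 2),   # y = y - Z, that is y + x, continue at instruction 2 either way
    (2, 2, -1),  # Z = Z - Z = 0, then jump to -1 to halt
]
data = run_subleq(program, [3, 4, 0])
print(data)
```

Keeping the program in its own list means an instruction can never overwrite another instruction, which is the main practical difference from the classic single-memory subleq.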
The example under verilog/interactive showcases how to create a simple interactive visual Verilog example using Verilator and SDL.
You could e.g. expand such an example to create a simple (or complex) video game if you were insane enough. But please don't waste your time doing that, Ciro Santilli begs you.
The example is also described at: stackoverflow.com/questions/38108243/is-it-possible-to-do-interactive-user-input-and-output-simulation-in-vhdl-or-ver/38174654#38174654
Usage: install dependencies:
sudo apt install libsdl2-dev verilator
then run as either:
make run RUN=and2
make run RUN=move
Tested on Verilator 4.038, Ubuntu 22.04.
File overview:
In those examples, the more interesting application specific logic is delegated to Verilog (e.g.: move game character on map), while boring timing and display matters can be handled by SDL and C++.