These are the best articles ever authored by Ciro Santilli, most of them in the format of Stack Overflow answers.
Ciro posts update about new articles on his Twitter accounts.
A chronological list of all articles is also kept at: Section "Updates".
Some random generally less technical in-tree essays will be present at: Section "Essays by Ciro Santilli".
- Trended on Hacker News:
- CIA 2010 covert communication websites on 2023-06-11. 190 points, a mild success.
- x86 Bare Metal Examples on 2019-03-19. 513 points. The third time something related to that repo trends. Hacker news people really like that repo!
- again 2020-06-27 (archive). 200 points, repository traffic jumped from 25 daily unique visitors to 4.6k unique visitors on the day
- How to run a program without an operating system? on 2018-11-26 (archive). 394 points. Covers x86 and ARM
- ELF Hello World Tutorial on 2017-05-17 (archive). 334 points.
- x86 Paging Tutorial on 2017-03-02. Number 1 Google search result for "x86 Paging" in 2017-08. 142 points.
- x86 assembly
- What does "multicore" assembly language look like?
- What is the function of the push / pop instructions used on registers in x86 assembly? Going down to memory spills, register allocation and graph coloring.
- Linux kernel
- What do the flags in /proc/cpuinfo mean?
- How does kernel get an executable binary file running under linux?
- How to debug the Linux kernel with GDB and QEMU?
- Can the sys_execve() system call in the Linux kernel receive both absolute or relative paths?
- What is the difference between the kernel space and the user space?
- Is there any API for determining the physical address from virtual address in Linux?
- Why do people write the
#!/usr/bin/env
python shebang on the first line of a Python script? - How to solve "Kernel Panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)"?
- Single program Linux distro
- QEMU
- gcc and Binutils:
- How do linkers and address relocation works?
- What is incremental linking or partial linking?
- GOLD (
-fuse-ld=gold
) linker vs the traditional GNU ld and LLVM ldd - What is the -fPIE option for position-independent executables in GCC and ld? Concrete examples by running program through GDB twice, and an assembly hello world with absolute vs PC relative load.
- How many GCC optimization levels are there?
- Why does GCC create a shared object instead of an executable binary according to file?
- C/C++: almost all of those fall into "disassemble all the things" category. Ciro also does "standards dissection" and "a new version of the standard is out" answers, but those are boring:
- What does "static" mean in a C program?
- In C++ source, what is the effect of
extern "C"
? - Char array vs Char Pointer in C
- How to compile glibc from source and use it?
- When should
static_cast
,dynamic_cast
,const_cast
andreinterpret_cast
be used? - What exactly is
std::atomic
in C++?. This answer was originally more appropriately entitled "Let's disassemble some stuff", and got three downvotes, so Ciro changed it to a more professional title, and it started getting upvotes. People judge books by their covers. notmain.o 0000000000000000 0000000000000017 W MyTemplate<int>::f(int) main.o 0000000000000000 0000000000000017 W MyTemplate<int>::f(int)
- IEEE 754
- What is difference between quiet NaN and signaling NaN?
- In Java, what does NaN mean?
Without subnormals: +---+---+-------+---------------+-------------------------------+ exponent | ? | 0 | 1 | 2 | 3 | +---+---+-------+---------------+-------------------------------+ | | | | | | v v v v v v ----------------------------------------------------------------- floats * **** * * * * * * * * * * * * ----------------------------------------------------------------- ^ ^ ^ ^ ^ ^ | | | | | | 0 | 2^-126 2^-125 2^-124 2^-123 | 2^-127 With subnormals: +-------+-------+---------------+-------------------------------+ exponent | 0 | 1 | 2 | 3 | +-------+-------+---------------+-------------------------------+ | | | | | v v v v v ----------------------------------------------------------------- floats * * * * * * * * * * * * * * * * * ----------------------------------------------------------------- ^ ^ ^ ^ ^ ^ | | | | | | 0 | 2^-126 2^-125 2^-124 2^-123 | 2^-127
- Computer science
- Algorithms
- Is it necessary for NP problems to be decision problems?
- Polynomial time and exponential time. Answered focusing on the definition of "exponential time".
- What is the smallest Turing machine where it is unknown if it halts or not?. Answer focusing on "blank tape" initial condition only. Large parts of it are summarizing the Busy Beaver Challenge, but some additions were made.
- Algorithms
- Git
| 0 | 4 | 8 | C | |-------------|--------------|-------------|----------------| 0 | DIRC | Version | File count | ctime ...| 0 | ... | mtime | device | 2 | inode | mode | UID | GID | 2 | File size | Entry SHA-1 ...| 4 | ... | Flags | Index SHA-1 ...| 4 | ... |
tree {tree_sha} {parents} author {author_name} <{author_email}> {author_date_seconds} {author_date_timezone} committer {committer_name} <{committer_email}> {committer_date_seconds} {committer_date_timezone} {commit message}
- How do I clone a subdirectory only of a Git repository?
- Python
- Web technology
- OpenGL
- What are shaders in OpenGL?
- Why do we use 4x4 matrices to transform things in 3D?
- Image Processing with GLSL shaders? Compared the CPU and GPU for a simple blur algorithm.
- Node.js
- Ruby on Rails
- POSIX
- What is POSIX? Huge classified overview of the most important things that POSIX specifies.
- Systems programming
- What do the terms "CPU bound" and "I/O bound" mean?
+--------+ +------------+ +------+ | device |>---------------->| function 0 |>----->| BAR0 | | | | | +------+ | |>------------+ | | | | | | | +------+ ... ... | | |>----->| BAR1 | | | | | | +------+ | |>--------+ | | | +--------+ | | ... ... ... | | | | | | | | +------+ | | | |>----->| BAR5 | | | +------------+ +------+ | | | | | | +------------+ +------+ | +--->| function 1 |>----->| BAR0 | | | | +------+ | | | | | | +------+ | | |>----->| BAR1 | | | | +------+ | | | | ... ... ... | | | | | | +------+ | | |>----->| BAR5 | | +------------+ +------+ | | | ... | | | +------------+ +------+ +------->| function 7 |>----->| BAR0 | | | +------+ | | | | +------+ | |>----->| BAR1 | | | +------+ | | ... ... ... | | | | +------+ | |>----->| BAR5 | +------------+ +------+
- Electronics
- Computer security
- Media
- How to resize a picture using ffmpeg's sws_scale()?
- Is there any decent speech recognition software for Linux? ran a few examples manually on
vosk-api
and compared to ground truth.
- Eclipse
- Computer hardware
- Scientific visualization software
- Numerical analysis
- Computational physics
- Register transfer level languages like Verilog and VHDL
- Android
- Debugging
- Program optimization
- Data
- Mathematics
- Section "Formalization of mathematics": some early thoughts that could be expanded. Ciro almost had a stroke when he understood this stuff in his teens.
- Network programming
- Physics
- Biology
- Quantum computing
- Bitcoin
- GIMP
- Home DIY
- China
Ciro Santilli has a bad memory for events that happened a medium time ago, for example in order of months/years. Especially if they are one-off things that have no relation to anything else.
For example, Ciro never remembers which places he travelled to just once, and who was in each trip! He has images of several places he travelled to in his head, and would recognize them, but he just doesn't know where they were!
Another example, Ciro was looking at the carpet at their house, and asked where it came from. His wife replied immeidately: from Bercy shopping quarter in Paris about 10 years ago, and you took it on your back for a long walk until we could find the bus back home because we were concerned it wouldn't fit in the train!
The same goes for scenes from movies and passages from music, which explains why Ciro's art consumption focuses on innovative discrete "what happened" and "general gist" ideas, rather than, analog details such as colors and shapes.
Going back even further in time, Ciro starts to forget the less close friends he had, because the events start to fade away.
Paradoxically however, Ciro believes that this bad memory is one of his greatest strengths and key defining characteristics, because it leads Ciro to want to write down every interesting thing he learns, which motivated OurBigBook.com and his Stack Overflow contributions and his related Ciro Santilli's documentation superpowers.
It also somewhat leads Ciro to like physics and mathematics, because in these fields you "can deduce everything" from very few base principles, so if you forget them, it does not matter that much as you can re-deduce stuff over and over. Which is somewhat where the high flying bird attitude comes from. It is hard to go deep when you have to re-prove everything every time. But the upside is that anything that sticks, does so because it has a broad net to stick to, and therefore allows Ciro to make unusual and unexpected connections that others might not.
Ciro believes that there are two types of people, and most notably software engineers, which are basically data wranglers: those with bad memory and those with good memory.
Those with bad memory, tend to focus on automating and improving their processes a lot. They take much longer to do one-off specific deep knowledge tasks however.
The downside of the good memory ones is that sooner or later they will find tasks that no matter how much memory they have, they cannot solve without automation, and they will fail at those.
Also, good memory people don't enable others to join the project efficiently as much.
This dichotomy also explains why Ciro sucks at code reviews, but is rather the person who runs the interesting patches by himself and finds some critical problems that the more theoretical code reviewers missed.
If Ciro had become a scientist, he would without doubt be an experimentalist, just like in this reality he is a GDB/runtime person rather than a "static source analysis" person. Those who have bad memory prefer to just run experiments over and over and observe system state at runtime.
Other effects of having a bad memory include:
- code duplication, or a constant fear of it at least, because Ciro forgets that some functionality exists already
- meeting aversion, because everything that is not recorded will fade away
- passion for backward design, because by the time a piece of knowledge learnt in school might be useful (and 99.99% won't), it will have been long forgotten
Related: jakobschwichtenberg.com/about/ from Jakob Schwichtenberg:
I'm a physicist and I try to write down things during my own learning process.In some sense, one of the biggest benefits I have over other people in physics is that I'm certainly not the smartest guy! I usually can't grasp complex issues very easily. So I have to break down complex ideas into smaller chunks to understand it myself. This means, whenever I describe something to others, everyone understands, because it's broken down into such simple terms.
On C2 wiki, therefore it cannot be wrong wiki.c2.com/?QuasiGreatTeacher:
Some people have learning disabilities, [... bullshit ...]. A lot of classic spiritual texts have been produced this way. Basically, the stupidest but most dogged disciple, if he has a neurotic habit of writing things down, will make the best teacher for the third and subsequent generations.
A branch of mathematics that attempts to prove stuff about computers.
Unfortunately, all software engineers already know the answer to the useful theorems though (except perhaps notably for cryptography), e.g. all programmers obviously know that iehter P != NP or that this is unprovable or some other "for all practical purposes practice P != NP", even though they don't have proof.
And 99% of their time, software engineers are not dealing with mathematically formulatable problems anyways, which is sad.
The only useful "computer science" subset every programmer ever needs to know is:
- for arrays: dynamic array vs linked list
- for associative array: binary search tree vs hash table. See also Heap vs Binary Search Tree (BST). No need to understand the algorithmic details of the hash function, the NSA has already done that for you.
- don't use Bubble sort for sorting
- you can't parse HTML with regular expressions: stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 because of formal language theory
Funnily, due to the formalization of mathematics, mathematics can be seen as a branch of computer science, just like computer science can be seen as a branch of Mathematics!
Models existence in the context of the formalization of mathematics.
A proof in some system for the formalization of mathematics.
The fundamental concept of calculus!
The reason why the epsilon delta definition is so venerated is that it fits directly into well known methods of the formalization of mathematics, making the notion completely precise.
The proper precise definition of mathematics can be found at: Section "Formalization of mathematics".
The most beautiful things in mathematics are described at: Section "The beauty of mathematics".
This is the part of the formalization of mathematics that deals only with the propositions.
In some systems, e.g. including Metamath, modus ponens alone tends to be enough, everything else can be defined based on it.
Intuitively: unordered container where all the values are unique, just like C++
std::set
.More precisely for set theory formalization of mathematics:
- everything is a set, including the elements of sets
- string manipulation wise:
{}
is an empty set. The natural number0
is defined as{}
as well.{{}}
is a set that contains an empty set{{}, {{}}}
is a set that contains two sets:{}
and{{}}
{{}, {}}
is not well formed, because it contains{}
twice
Notably, given the domain name, it is clear that he likes formalization of mathematics-stuff, like Ciro Santilli.
At first glance, looks a bit dry though, not many examples.
Big goals:
- the pursuit of AGI
- physics simulations, including scientific visualization software
- formalization of mathematics