The best articles by Ciro Santilli
These are the best articles ever authored by Ciro Santilli, most of them in the format of Stack Overflow answers.
Ciro posts updates about new articles on his Twitter accounts.
A chronological list of all articles is also kept at: Section "Updates".
Some random generally less technical in-tree essays will be present at: Section "Essays by Ciro Santilli".
Bisection (software engineering)
One of the Holiest age old debugging techniques!
Git has some helpers to help you achieve bisection Nirvana:
Obviously not restricted to software engineering alone, and used in all areas of engineering, e.g. Video "Air-tight vs. Vacuum-tight by AlphaPhoenix (2020)" uses it in vacuum engineering.
The cool thing about bisection is that it is a brainless process: unlike when using a debugger, you don't have to understand anything about the system, and it still narrows down the cause of the problem incredibly well for you. Not having to think is great!
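The "brainless" nature of the process can be sketched in a few lines. This is a minimal illustration of the idea, not how Git implements it: the `first_bad` function, the `is_bad` test and the commit names are all hypothetical, and the sketch assumes the first revision is good, the last one is bad, and every revision after the first bad one is also bad.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad revision by halving the search range each step."""
    lo, hi = 0, len(revisions) - 1  # invariant: revisions[lo] is good, revisions[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # the culprit is at mid or earlier
        else:
            lo = mid  # the culprit is after mid
    return revisions[hi]


# Hypothetical history in which the bug was introduced in commit "c6".
history = [f"c{i}" for i in range(10)]
tested = []


def is_bad(rev: str) -> bool:
    """Stand-in for "build this revision and run the failing test"."""
    tested.append(rev)
    return int(rev[1:]) >= 6


culprit = first_bad(history, is_bad)
print(culprit, "found after testing", tested)
```

The number of revisions that have to be tested grows only logarithmically with the length of the history, which is why the technique stays practical even across thousands of commits.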
Ciro Santilli
Ciro Santilli is a male human software engineer and activist born in Brazil, Earth in 1989 AD.
Quick facts:
Other people with the same name are listed at Section "Ciro Santilli's homonyms".
Figure 1. 19th century illustration of the Journey to the West protagonist Sun Wukong. Source.
Sun Wukong (孙悟空) is a playful and obscenely powerful monkey from Journey to the West. He protects Buddhist monk Tang Sanzang, and likes eating fruit, just like Ciro. Oh, and Goku from Dragon Ball is based on him. His Japanese name is "Sun Wukong" (same Chinese characters with different Japanese pronunciation) for the love. His given name "Wukong" means literally "the one who mastered the void", which is clearly a Dharma name and fucking awesome in multiple ways. This is another sad instance of a Chinese thing better known in the West as Japanese.
It is worth noting however that although Wukong is extremely charming, Ciro's favorite novel of the Four Great Classic Novels is Water Margin. Journey to the West is just a monster of the week for kids, but Water Margin is a fight for justice saga. Sorry Wukong!
Figure 2.
Ciro Santilli playing with a pipette at the University of Cambridge circa 2017
The photo was taken in an open event organized by the awesome Cambridge Synthetic Biology outreach group, more or less the same people who organize: and who helped organize Section "How to use an Oxford Nanopore MinION to extract DNA from river water and determine which bacteria live in it".
Taking part in such activities is what Ciro tries to do to overcome his lifelong regret of not having done more experimental stuff at university. Would he have had the patience to handle all the bullshit of the physical world without going back to the informational sciences? Maybe, maybe not. But now he will probably never know?!
Notice the orange high visibility cycling jersey under the lab coat, from someone who had just ridden in from work as fast as possible as part of his "lunch break". It is more fun when it is hard.
Figure 3.
Scribe Jean Miélot, 15th century
Ciro Santilli fantasizes that he would have made a good scribe in the middle ages, partly due to his self diagnosed graphomania, but also due to his appreciation for foreign languages, and his mild obsession with the natural sciences. is Ciro's view of a modern day scriptorium, except that now the illuminations are YouTube videos.
Chill and eat your bread in peace comes to mind. A scribe, in a library, reading and writing the entire day in peace and quiet. The life!
The job of an Internet-age scribe is basically that of making knowledge more open, legally extracting it from closed copyrighted sources, and explaining your understanding of it to the wider world under Creative Commons licenses on the web. And in the process of greater openness, given a well organized system, we are able to combine the knowledge of many different people, and thus make things more understandable than any single/few creator closed source could ever achieve.
Ciro once saw some cartoon on Wikipedia help pages of a turtle with a book in one hand, and typing into Wikipedia on its computer, TODO find it. That cartoon summarizes well the modern scribe life.
Another analogous version of this fantasy more in touch with Ciro's sinophily is the ideal of the Chinese scholar, notably including their stereotypical attributes such as mastery of the Four arts.
Figure 4. Ciro Santilli piling boxes as a child. A natural born engineer.
Ciro Santilli's Open Source Enlightenment
Firstly, in 2012, while he was at École Polytechnique, Ciro Santilli was introduced to LaTeX (thank God for French mathematical obsession), and his mind was blown:
Ha, so I can write my own books, and so can anyone, for free?
he thought. Why isn't everyone doing that!
One particular event stood out: Ciro made a small change to his teacher's course material, who, blessed be him (dude's a legend, Ciro just noticed he has some Chinese publications with another French dude, e.g.高效算法-应试与提高必修128例-克里斯托弗-Christoph-Durr/dp/B078SJQPVK "High-efficiency algorithm competitions 128 examples", did he write the Chinese himself?? Must be of course to complement the notoriously low French professor salaries), had made it available, and then Ciro gave him back the .tex file. Ciro was just a bit worried about how the teacher would be able to tell what he had changed in the file to validate the change. The teacher just said of course, "no problem, I'll just use diff". Ciro had never heard of diff. Let alone Git of course, though yes, this was a bit early in Git's history, but version control systems had been around since forever of course. This was 2011 or 2012, about 4 or 5 years into a superior education curriculum with various courses involving computers, some requiring quite a lot of "fill these empty functions" style programming. Education is a joke. Anyways, this was a prelude to exactly what Ciro wanted to do in This might have been the one actually:
Not long afterwards, Ciro started playing with Linux. Until then, Ciro had had some contacts with the mysterious operating system at university, and was a bit puzzled what the point of it was! He clearly remembers:
  • at the University of São Paulo that they had some "UNIX" computers in some classes, and at the library
  • at École Polytechnique, he took a course about mathematical analysis and there was a "lab" where students were supposed to use FreeFem, great initiative BTW. And Ciro distinctly remembers being paired with a nice Chilean colleague, and the guy was already super at ease with the shell: "cd", "ls", etc. WTF was all that!
University should be forced to use only open source software and hardware in undergrad teaching courses by law BTW.
Then came an Ubuntu live disk on his own machine, and finally a measly 40GB dual boot partition on a Microsoft Windows laptop. At first, it took a lot of time to learn all the crazy new terminal stuff! Yes, at this point, Ubuntu was already usable enough without the terminal, an accomplishment actually. But as a programmer, Ciro felt obliged to learn. Many hours were spent reading man pages at the library. But it all just felt so right, and sometimes powerful... true wizardry.
And ten years later, Ciro was seriously considering buying a computer without Windows pre-installed. He had not used Windows on a personal machine even once in those ten years!
Finally, to finish things off, Ciro found two websites that changed his life forever, and made him believe that there was an alternative: Stack Overflow and GitHub.
The brutal openness of it all. The raw high quality content. Ugliness and uselessness too no doubt. But definitely a spark in a sea of darkness.
Ciro Santilli's Stack Overflow contributions
Ciro Santilli's Stack Overflow contributions have, unsurprisingly, centered around the subjects he has worked with: systems programming and web development, and necessary tooling to get those done, such as Git, Python, Bash and Ubuntu.
His best answers are listed at: Section "The best articles by Ciro Santilli".
Stack Overflow has been the initial centerpiece of Ciro Santilli's campaign for freedom of speech in China, until Ciro noticed that GitHub might be potentially even more effective for it.
In Stack Overflow Ciro likes to:
  • answer important questions found through Google which he needs to solve an actual problem he has right now, and for which none of the existing answers satisfied him, and close duplicates.
  • monitor less known tags which very few people know a lot about and where knowledge sharing is desperately lacking, but in which Ciro specializes and therefore has some uncommon knowledge to share
In practice it also happens that Ciro:
When he gets an upvote on one of his more obscure answers, Ciro often re-reads it, and often finds improvements to be made and makes them.
He doesn't like to refresh the homepage looking for easy reputation on widely known subjects. See also: online forums that lock threads after some time are evil.
The result is that Ciro ends up getting relatively a lot of reputation without much work! The term passive income, much beloved by fake investment gurus, comes to mind. But now it's "passive reputation"! And it is useless! Yay!
For this reason, Necromancer is Ciro's favorite badge (get 5 upvotes on an answer posted more than 60 days after the question), and as of July 2019, he became the number 1 user with the most of this badge. Announcement on Twitter.
The number two at the time was VonC (see also: Section "Epic Stack Overflow users"), who had about 16 times more answers than Ciro in total! From this query: it can be seen that as of July 2019, 1216 out of his 1329 answers were answered 60 days after the questions and constitute potential necromancers! Compare that to VonC's 1643 potential necromancers out of 21767 answers!
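To make the comparison concrete, the figures quoted above can be turned into ratios. This is just arithmetic on the numbers in the text; the variable names are made up for illustration.

```python
# Figures quoted in the text, as of July 2019.
ciro_potential, ciro_total = 1216, 1329
vonc_potential, vonc_total = 1643, 21767

ciro_share = ciro_potential / ciro_total  # fraction of answers posted 60+ days late
vonc_share = vonc_potential / vonc_total
answer_ratio = vonc_total / ciro_total    # how many times more answers VonC had

print(f"Ciro: {ciro_share:.1%} of answers are potential necromancers")
print(f"VonC: {vonc_share:.1%} of answers are potential necromancers")
print(f"VonC had {answer_ratio:.1f} times more answers in total")
```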
VonC eventually took back the lead in 2022, dude's a machine!!!
Someone at Ciro's work once said something along:
The more patents a research project generates, the less actually working products it produces.
and this does ring true in Stack Overflow as well. When you are answering stuff, it means that you either didn't know, or that the information wasn't well available, and so your specific application is progressing slowly because of that. Once the generic prerequisites are well solved and answered, you will spend much more time on your business specific things rather than anything else that can be factored out across projects, and so you will get more "directly useful work" done, and fewer Stack Overflow answers. Of course, without the prior research in place, you can't get the final product done either.
In terms of per year reputation ranks, Ciro was in the top 100 of the 2018 ranking with 38,710 reputation gained in that year: (archive). He reached top 50 in 2022. Note that daily reputation is mostly capped to 200 per day, leading to a maximum of 73,000 per year. It is possible to overcome this limit either with bounties or accepts, and Ciro finds it amazing that some people actually break the 73k limit by far with accepts, e.g. Gordon Linoff reached 135k in 2018 (archive)! However, this is something that Ciro will never do, because it implies answering thousands and thousands of useless semi duplicate questions as fast as possible to get the accept. Ciro's reputation comes purely from upvotes on important questions, and is therefore sustainable without any extra effort once achieved. Interestingly, Ciro appeared on top of the quarter SE rankings around 2019-11: but it was just a bug ;-)
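The yearly cap follows directly from the daily one. A quick check of the numbers quoted above, with made-up variable names and assuming a 365 day year:

```python
DAILY_CAP = 200      # reputation per day from upvotes, as stated in the text
DAYS_PER_YEAR = 365  # assumption: non-leap year

yearly_cap = DAILY_CAP * DAYS_PER_YEAR
ciro_2018 = 38_710
linoff_2018 = 135_000  # "135k" in the text, so only approximate

print(yearly_cap)                                            # upvote-only ceiling
print(f"Ciro 2018: {ciro_2018 / yearly_cap:.0%} of the cap")
print(f"Gordon Linoff 2018: {linoff_2018 / yearly_cap:.2f} times the cap")
```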
There is no joy like answering an old question, and watching your better answer go up little by little until it dominates all others.
Stack Overflow reputation is of course, in itself, meaningless. People who contribute to popular subjects like web development will always have infinitely more reputation than those that contribute to low level subjects.
What happens on the specialized topics though is that you end up getting to know all the 5 users who contribute 95% of the content pretty soon as you study those subjects.
Like everything that man does, the majority of Ciro's answers are more or less superficial subjects that many people know but few have the patience to explain well, or they are updates to important questions reflecting upstream developments. But as long as they save 15 minutes from someone's life, that's fine.
There is great beauty when you are involved in a programming problem, and you suddenly remember: wait, I answered something related a few years ago! And especially so when you can go back and improve your old answer with new insight. This has great value, because when you were more of a newbie, you would have typed different words into Google Search than you would now. So by updating posts from when you were a newbie, you are helping other newbies more, as they are more likely to be also searching for those keywords. It is also very nice to have some head start on the answer's upvote count and not have to bootstrap yet another answer from 0 upvotes and have to go through all the competition!
For example, Ciro's most upvoted answer as of July 2019 was written when he spent his first week playing with NodeJS (he was having a look at ShareLaTeX, later merged into Overleaf, for education), which he didn't touch again for several years, and still hasn't "mastered" as of 2019! This did teach a concrete life lesson to Ciro however: it is impossible to know very precisely what is the most useful thing you can do right now. The best bet is to follow your instincts and do as much awesome stuff as you can, and then, with some luck, some of those attempts will cover a use case.
Ciro tends to take most pride in his systems programming answers, which is a subject that truly relatively few people know about. He likes it when he goes insanely deep into a subject, way beyond what OP had in mind, exposing full root causes and broader causes, see e.g.:
Ciro also derives great joy from his "media related answers" (3D graphics, audio, video), which are immensely fun to write, and sometimes borderline art, see answers such as those under "OpenGL" and "Media" under the best articles by Ciro Santilli or even simpler answers such as:
There is something of greater value in perfectly presented technical knowledge, that goes beyond simply getting something done. The pleasure of understanding and mastering something, and perhaps of the explanation itself. Sometimes when answering, Ciro feels like a tailor, where ASCII is his cloth. See also: Section "The art of programming", Section "Physics and the illusion of life".
Ciro's deep understanding of Stack Overflow mechanisms and its shortcomings also helped shape his ideas for: So it is a bit funny to think that after all the time Ciro spent on the website, he actually wants to destroy it and replace it with something better. There can be no innovation without some damage. It also led to Ciro's creation of Stack Overflow Vote Fraud Script.
After answering so many questions, he ended up converging to a more or less consistent style, which he formalized at:
Like any other style guide, this answer style guide, once fully incorporated and memorized, allows Ciro to write answers faster, without thinking about formatting issues.
Ciro also made a question title style guide: but for some reason the Stack Overflow community prefers their semi-defined title meta-language to proper English. Go figure.
Ciro started contributing to Stack Overflow in 2012 when he was at École Polytechnique.
Like all things that end up shaping the course of one's life, Ciro started contributing without thinking too much about it.
His first answer was to the LaTeX question: Standalone diagrams with TikZ?, which reflects the fact that this happened while Ciro was reaching his Open Source Enlightenment.
Ciro's first upvote was for his 2012 question: How to run a Python script portably without specifying its full path?
When he started contributing, Ciro was still a newbie. One early event he will never forget was when someone mentioned a "man page", and Ciro commented saying that there was a typo!
When Ciro reached 15 points and gained the ability to upvote, it felt like a major milestone, he even took a screenshot of the browser! 1k, 10k and 100k were also particularly exciting. When the 100k cup (archive) arrived in 2018, Ciro made a show-off Facebook post (archive). At some point though, your brain stops caring, and automatically filters out any upvotes you get except on the answers that you are really proud of and which don't yet have lots of upvotes. The last remaining useless gamed achievement that Ciro looked forward to was legendary (archive), which he achieved on 2021-02-16.
Figure 4.
Ciro Santilli with his Stack Overflow 100k reputation cup
From the start, Ciro's motivations for contributing to Stack Overflow have been a virtuous circle of:
  • save the world through free education
  • It feels especially amazing when people in the real world start taking note of you, and either close friends tell you straight out that you're a Stack Overflow God, or you slowly and indirectly find out that less close people know you or came to you due to your amazing contributions.
It is also amazing when you start having a repertoire of answers, and as you are writing a new answer, you remember: "hey, the knowledge of that answer would be so welcome here", and so you link to the other answer as well at the perfect point. This somewhat achieves what aims to do: for each small section of a tutorial, gather the best answers by multiple people.
Ciro feels that his Stack Overflow alter ego is kenorb.
Another one is Aaron Hall, who is also very high on the necromancer list, answers in Python which is a topic Ciro cares about, and states on his profile:
Follow me on Twitter and tell me what canonical questions you would like me to respond to!
so another necromancer.
Way to go.
Ciro also asks some questions, at a ratio of about 1 question per 10 answers. But Ciro's questions tend to be about extremely niche topics that no one knows/cares about, and a high percentage of them end up getting self answered either at asking time or after later research.
Some fun reactions to Ciro's Stack Overflow activity:
Things that are not nice such as:
Git tips
This is a quick presentation that goes over some of the most common difficulties people find with Git.
Bioinformatics
Because Ciro's a software engineer, and he's done enough staring at computers for a lifetime already, and he believes in the power of Git, he didn't pay much attention to this part ;-)
According to the eLife paper, the code appears to have been uploaded to: TODO at least mention the key algorithms used more precisely.
Ciro can however see that it does present interesting problems!
Because it was necessary to wait for 2 days to get our data, the workshop first reused sample data from previous collections done earlier in the year to illustrate the software.
First there is some signal processing/machine learning required to do the base calling, which is not trivial in the Oxford Nanopore, since neighbouring bases can affect the signal of each other. This is mostly handled by Oxford Nanopore itself, or by hardcore programmers in the field however.
After the base calling was done, the data was analyzed using computer programs that match the sequenced 16S sequences to a database of known sequenced species.
This is of course not just a simple direct string matching problem, since, like in any experiment, the DNA reads have some errors, so the program has to find the best match even though it is not exact.
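The difference between exact and approximate matching can be illustrated with a toy sketch. This is not the software used in the workshop: it uses plain edit distance, which is far too slow for real sequencing data, and the species names and sequences below are invented for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-base edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, base_a in enumerate(a, start=1):
        cur = [i]
        for j, base_b in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,                       # delete base_a
                cur[j - 1] + 1,                    # insert base_b
                prev[j - 1] + (base_a != base_b),  # substitute, or match for free
            ))
        prev = cur
    return prev[-1]


def best_match(read: str, references: dict) -> str:
    """Name of the reference sequence closest to the read."""
    return min(references, key=lambda name: edit_distance(read, references[name]))


# Hypothetical reference database.
references = {
    "species_a": "ACGTACGTAGCTAGCTA",
    "species_b": "GGTTGGCCTTGGTTGGC",
    "species_c": "TTTGCATTTGCATTTGC",
}

# species_a as it might come out of the sequencer: one wrong base and one missing base.
read = "ACGTACCTAGCTAGTA"

print(read in references.values())   # an exact lookup finds nothing
print(best_match(read, references))  # the approximate match still finds the species
```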
The PuntSeq team would later upload the data to well known open databases so that it will be preserved forever! When ready, a link to the data would be uploaded to:
Quantum Field Theory lecture notes by David Tong (2007)
Author: David Tong.
Number of pages circa 2021: 155.
It should also be noted that those notes are still being updated circa 2020, long after original publication. But without Git to track the LaTeX, it is hard to be sure how much. We'll get there one day, one day.
Some quotes self describing the work:
A follow up course in the University of Cambridge seems to be the "Advanced QFT course" (AQFT, Quantum field theory II) by David Skinner:
content uploaded to
Managed to upload the content from the static website (OurBigBook Markup source at to
Although most of the key requirements were already in place since the last update, as usual doing things with the complex reference content stresses the system further and leads to the exposition of several new bugs.
The upload of OurBigBook Markup files to was done with the newly added OurBigBook CLI ourbigbook --web option. Although fully exposed to end users, the setup is not super efficient: a truly decent implementation should only upload changed files, and would basically mean reimplementing/using Git, since version diffing is what Git shines at. But I've decided not to put much emphasis on CLI upload for now, since it is expected that initially the majority of users will use the Web UI only. The functionality was added primarily to upload the reference content.
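One way "only upload changed files" could work without reimplementing Git is to compare content hashes against a manifest saved by the previous upload. The following is a minimal sketch of that idea and not the actual OurBigBook CLI behaviour: the `changed_files` function, the manifest format and the `*.bigb` glob are assumptions made for illustration.

```python
from __future__ import annotations

import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, used as a cheap change detector."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(root: Path, previous: dict[str, str]) -> tuple[list[str], dict[str, str]]:
    """Return (files whose content differs from the previous manifest, new manifest)."""
    current = {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*.bigb"))
        if p.is_file()
    }
    to_upload = [name for name, digest in current.items() if previous.get(name) != digest]
    return to_upload, current


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.bigb").write_text("= A\n")
    (root / "b.bigb").write_text("= B\n")
    first, manifest = changed_files(root, {})        # first run: everything is new
    (root / "b.bigb").write_text("= B\n\nNew paragraph.\n")
    second, manifest = changed_files(root, manifest)  # second run: only b changed

print(first, second)
```

A real implementation would also have to handle deleted and renamed files, which is part of why the text points at Git as the natural tool for the job.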
This is a major milestone, as the new content can start attracting new users, and makes the purpose of the website much clearer. Just having this more realistic content also immediately highlighted what the next development steps need to be.
Once v1.0 is reached, I will actually make all internal links of to point to to try and drive some more traffic.
The new content blows up by far the limit of the free Heroku PostgreSQL database of 10k rows. This meant that I needed to upgrade the Heroku Postgres plugin from the free Hobby Dev to the 9 USD/month Hobby Basic, so now hosting costs will increase from 7 USD/month for the dyno to 7 + 9 = 16 USD/month. After this upgrade and uploading all of to, Heroku dashboard reads:
  • 30,918 rows out of 10,000,000
  • 61.0 MB (out of 10 GB)
so clearly if we are ever forced to upgrade plans again, it means that a bunch of people are using the website and that things are going very very well! Happy how this storage cost turned out so far.
One key limitation found was that Heroku RAM is quite limited at 512MB, and JavaScript is not exactly the most memory economical language out there. Started investigation at: Initially working around that by simply splitting the largest files. Luckily, we were just on the verge of what could be run, so a few dozen splits were enough, and it managed to handle 70 kB OurBigBook Markup inputs. So hopefully if we manage to optimize a bit more we will be able to set a maximum size of 100 kB and still have a good safety margin.
SQLite
The minimalism, serverlessness/lack of temporary caches/lack of permission management, Hipp's religious obsession with efficiency, the use of their own pure Fossil version control[ref]. Wait, scrap that last one. Pure beauty!
Official Git mirror:
Create a table:
sqlite3 db.sqlite3 "
CREATE TABLE 'IntegerNames' (int0 INT, char0 CHAR(16));
INSERT INTO 'IntegerNames' (int0, char0) VALUES (2, 'two'), (3, 'three'), (5, 'five'), (7, 'seven');
"
List tables:
sqlite3 db.sqlite3 '.tables'
Show schema of a table:
sqlite3 db.sqlite3 '.schema IntegerNames'
outputs the query that would generate that table:
CREATE TABLE IF NOT EXISTS 'IntegerNames' (int0 INT, char0 CHAR(16));
Show all data in a table:
sqlite3 db.sqlite3 'SELECT * FROM IntegerNames'
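The same session can be reproduced from Python's standard library `sqlite3` module, which embeds SQLite directly. This is a minimal sketch that uses an in-memory database instead of `db.sqlite3` so that it leaves no file behind; the dot-commands `.tables` and `.schema` belong to the command line shell only, so the sketch queries the `sqlite_master` table instead.

```python
import sqlite3

# ":memory:" is a throwaway database; pass "db.sqlite3" to work on the file instead.
con = sqlite3.connect(":memory:")

# Create the table and insert the rows.
con.executescript("""
CREATE TABLE IntegerNames (int0 INT, char0 CHAR(16));
INSERT INTO IntegerNames (int0, char0) VALUES (2, 'two'), (3, 'three'), (5, 'five'), (7, 'seven');
""")

# List tables, roughly what .tables does.
tables = [row[0] for row in con.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]

# Show the schema of a table, roughly what .schema does.
schema = con.execute("SELECT sql FROM sqlite_master WHERE name = 'IntegerNames'").fetchone()[0]

# Show all data in a table.
rows = con.execute("SELECT * FROM IntegerNames ORDER BY int0").fetchall()

con.close()

print(tables)
print(schema)
print(rows)
```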
The Machiavellian Stack Overflow contributor
  • always upvote questions you care about, to increase the probability that they will get answered
  • never upvote other people's answers unless you might gain from it somehow, otherwise you are just giving other high reputation users more reputation relative to you
  • only mark something to close or as a duplicate if it will bring you some advantage, because closing things creates enemies, especially if the OP has a high profile
    One example advantage is if you have already answered the question (and the duplicate as well in case of duplicates), because this will prevent competitors from adding new better answers to overtake you.
  • protect questions you've answered whenever someone with less than 10 reputation answers it with a bad answer, to prevent other good contributors from coming along and beating you
  • when you find a duplicate pool, answer every question with similar answers.
    Alter each answer slightly to avoid the idiotic duplicate answer detector.
    If one of the questions closes, it is not too bad, as it continues netting you upvotes, and prevents new answers from coming in.
  • follow on Twitter/RSS someone who comments on the top features of new software releases. E.g. for Git, follow GitHub on Twitter, C++ on Reddit. Then run back to any question which has a new answer.
  • always upvote the question when you answer it:
    • the more upvotes, more likely people are to click it.
    • the OP is more likely to see your answer and feel good and upvote you
  • if a niche question only has few answers and you come with a good one, upvote the existing ones by other high profile users.
    This may lead to them upvoting or liking you.
    Even if they don't, other people will still see your answer anyway, and this will lead people to upvote you more just to make your great answer surpass the current ones, especially if the accepted one has fewer upvotes than yours. Being second is often an asset.
  • always upvote comments that favor you:
    • "I like this answer!" on your answers
    • "also look at that question" when you have answered that question
  • don't invest a lot in edits. They don't give you rep, and they can get reverted and waste your time.
    Why are you trying to help other people's answers to get rep anyways? Just make a separate answer instead! :-)
  • if you answer a question by a newbie without 15 reputation, find their other questions if any and upvote them, so that the OP can upvote your answer in addition to just accepting
  • If you haven't answered a question, link to related questions you've answered on question comments, so more people will come to your answers.
    If you have answered the question, only link to other questions at the bottom of your answer, so that people won't go away before they reach your answer, and so as to strengthen your answer.
  • if a question has 50 million answers and you answer it (often due to a new feature), make a comment on the question pointing to your answer
  • if you get a downvote, always leave a comment asking why. It is not because you care about their useless opinion, but because other readers might see the comment, feel sorry for you, and upvote.
  • ask any questions under separate anonymous accounts. Because:
    • intelligent people are born knowing, and don't ever ask any questions, so that would hurt your reputation
    • downvoting questions does not take 1 reputation away from the downvoter, and so it greatly opens the door for your opponents to downvote you without any cost.
Website front-end for a mathematical formal proof system
When Ciro Santilli first learnt the old Zermelo-Fraenkel set theory and the idea of formal proofs, his teenager mind was completely blown.
Finally, there it was: a proper and precise definition of mathematics, including a definition of integers, reals and limits!
Theorems are strings, proofs are string manipulations, and axioms are the initial strings that you can use.
Once proved, press a button on your computer, and the proof is automatically verified. No messy complicated "group of savants" reading it for 4 years and looking for flaws!
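The "theorems are strings" view can be made concrete with a toy formal system. The sketch below uses Hofstadter's MIU puzzle rather than Zermelo-Fraenkel set theory, and it simplifies real proof checking by only letting each line follow from the one immediately before it; the function names are made up for illustration.

```python
from __future__ import annotations

AXIOM = "MI"  # the single initial string of the MIU system


def successors(s: str) -> set[str]:
    """All strings obtainable from s by applying one MIU rule once."""
    out = set()
    if s.endswith("I"):          # rule 1: xI -> xIU
        out.add(s + "U")
    if s.startswith("M"):        # rule 2: Mx -> Mxx
        out.add("M" + s[1:] * 2)
    for i in range(len(s) - 2):  # rule 3: xIIIy -> xUy
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])
    for i in range(len(s) - 1):  # rule 4: xUUy -> xy
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])
    return out


def verify(proof: list[str]) -> bool:
    """A proof is valid if it starts at the axiom and every step applies one rule."""
    if not proof or proof[0] != AXIOM:
        return False
    return all(after in successors(before) for before, after in zip(proof, proof[1:]))


valid_proof = ["MI", "MII", "MIIII", "MUI", "MUIU"]  # proves the theorem "MUIU"
invalid_proof = ["MI", "MU"]                         # no rule turns "MI" into "MU"

print(verify(valid_proof), verify(invalid_proof))
```

Checking a proof is purely mechanical, which is what makes the "press a button" verification possible; finding the proof in the first place is the hard part.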
There are a few proof assistant systems with several theorems in their Git tracked standard library. The hottest ones circa 2020 are:
And here are some more interesting links:
However, as expressed by the QED manifesto, it is unbelievable that there isn't one awesome and dominating website that hosts all those proofs, possibly with an in-browser editor, and which all mathematicians in the world use as the one golden reference of mathematics to rule them all!
Just imagine the impact.
Standard library maintainers don't have to deal with the impossible question of what is "beautiful" or "useful" enough mathematics to deserve being merged: users just push content to the online database, and star what they like!
We then just use GitHub-like namespaces for each person's theorem, e.g. "cirosantilli/fundamental-theorem-of-calculus" or "johndoe/fundamental-theorem-of-calculus" so that each person owns their own preferred definition IDs, which others can reuse.
No more endless bikeshedding over what insane level of generality your analysis theorems need to have (Ciro Santilli attended a talk about Lean where the speaker mentioned this was a problem)!
This would move things more out of the "pull request and Git tracked code" approach, into a more "database with entries" version of things.
Furthermore, it is just a matter of time until the "single standard library" approach starts to break down, as the git clone becomes impossibly large. At this point, people have to start publishing separate packages. And when this happens, you would need to retest every package that you add to your project. This is why a centralized database is just inevitable at some point, it just scales better.
Interested in a conjecture? No problem: just subscribe to its formal statement + all known equivalents, and get an email on your inbox when it gets proved!
Are you a garage mathematician and have managed to prove a hard theorem, but no "real" mathematician will read your proof because you're unknown? Fuck that, just publish it on the system and let it get auto verified. Overnight fame awaits.
Notation incompatibility hell? A thing of the past, just automatically convert to your preferred representation.
Such a system would be the perfect companion to Just like computer code offers the backbone of Linux Kernel Module Cheat Linux kernel tutorials, a formal proof system website would be the backbone of mathematics tutorials! You know what, if becomes insanely successful, Ciro is going to add this to it later on.
Furthermore, it would not be too hard to achieve this system!
All we would need would be something analogous to a package registry like PyPI or NodeJS' registry.
Then, each person can publish packages containing proofs.
Packages can rely on other packages that contain prerequisite definitions or theorems.
Packages are just regular git repos, with some metadata. One notable metadata would be a human readable description of the theorems the package provides.
The package registry would then, unlike most package registries, have a CI server in it, which checks the correctness of all proofs and generates a web page showing each theorem.
All proofs can be conditional: the package registry simply shows clearly what axiom set a theorem is based on.
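The registry described above boils down to a dependency graph plus some per-package metadata. A minimal sketch, assuming a hypothetical in-memory registry: the package names, the `deps` and `axioms` fields and both functions are invented for illustration, and the actual proof checking that the CI server would do is left out.

```python
from graphlib import TopologicalSorter

# Hypothetical registry entries: each package lists its dependencies and the
# axioms that its own proofs assume directly.
packages = {
    "alice/naturals": {"deps": [], "axioms": {"ZF"}},
    "bob/reals": {"deps": ["alice/naturals"], "axioms": set()},
    "carol/well-ordering": {"deps": ["alice/naturals"], "axioms": {"Choice"}},
    "dave/analysis": {"deps": ["bob/reals", "carol/well-ordering"], "axioms": set()},
}


def check_order(packages: dict) -> list:
    """Order in which a CI server could verify packages: prerequisites first."""
    graph = {name: set(meta["deps"]) for name, meta in packages.items()}
    return list(TopologicalSorter(graph).static_order())


def axioms_used(name: str, packages: dict) -> set:
    """Every axiom a package relies on, directly or through its dependencies."""
    result = set(packages[name]["axioms"])
    for dep in packages[name]["deps"]:
        result |= axioms_used(dep, packages)
    return result


order = check_order(packages)
print(order)
for name in order:
    print(name, "is conditional on", sorted(axioms_used(name, packages)))
```

With this metadata the web page for each theorem can state its axiom set automatically, e.g. that everything in the hypothetical "dave/analysis" depends on Choice only because of "carol/well-ordering".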
This is as close as we can get to Erdős' book.
Maybe Ciro will just stuff this into once that takes over the world.
This project could be seen as a more automated/less moderated version of ProofWiki.
Ciro Santilli pinging people: