These are the best articles ever authored by Ciro Santilli, most of them in the format of Stack Overflow answers.
Ciro posts update about new articles on his Twitter accounts.
A chronological list of all articles is also kept at: Section "Updates".
Some random generally less technical in-tree essays will be present at: Section "Essays by Ciro Santilli".
- Trended on Hacker News:
- CIA 2010 covert communication websites on 2023-06-11. 190 points, a mild success.
- x86 Bare Metal Examples on 2019-03-19. 513 points. The third time something related to that repo trends. Hacker news people really like that repo!
- again 2020-06-27 (archive). 200 points, repository traffic jumped from 25 daily unique visitors to 4.6k unique visitors on the day
- How to run a program without an operating system? on 2018-11-26 (archive). 394 points. Covers x86 and ARM
- ELF Hello World Tutorial on 2017-05-17 (archive). 334 points.
- x86 Paging Tutorial on 2017-03-02. Number 1 Google search result for "x86 Paging" in 2017-08. 142 points.
- x86 assembly
- What does "multicore" assembly language look like?
- What is the function of the push / pop instructions used on registers in x86 assembly? Going down to memory spills, register allocation and graph coloring.
- Linux kernel
- What do the flags in /proc/cpuinfo mean?
- How does kernel get an executable binary file running under linux?
- How to debug the Linux kernel with GDB and QEMU?
- Can the sys_execve() system call in the Linux kernel receive both absolute or relative paths?
- What is the difference between the kernel space and the user space?
- Is there any API for determining the physical address from virtual address in Linux?
- Why do people write the
#!/usr/bin/env
python shebang on the first line of a Python script? - How to solve "Kernel Panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)"?
- Single program Linux distro
- QEMU
- gcc and Binutils:
- How do linkers and address relocation works?
- What is incremental linking or partial linking?
- GOLD (
-fuse-ld=gold
) linker vs the traditional GNU ld and LLVM ldd - What is the -fPIE option for position-independent executables in GCC and ld? Concrete examples by running program through GDB twice, and an assembly hello world with absolute vs PC relative load.
- How many GCC optimization levels are there?
- Why does GCC create a shared object instead of an executable binary according to file?
- C/C++: almost all of those fall into "disassemble all the things" category. Ciro also does "standards dissection" and "a new version of the standard is out" answers, but those are boring:
- What does "static" mean in a C program?
- In C++ source, what is the effect of
extern "C"
? - Char array vs Char Pointer in C
- How to compile glibc from source and use it?
- When should
static_cast
,dynamic_cast
,const_cast
andreinterpret_cast
be used? - What exactly is
std::atomic
in C++?. This answer was originally more appropriately entitled "Let's disassemble some stuff", and got three downvotes, so Ciro changed it to a more professional title, and it started getting upvotes. People judge books by their covers. notmain.o 0000000000000000 0000000000000017 W MyTemplate<int>::f(int) main.o 0000000000000000 0000000000000017 W MyTemplate<int>::f(int)
- IEEE 754
- What is difference between quiet NaN and signaling NaN?
- In Java, what does NaN mean?
Without subnormals: +---+---+-------+---------------+-------------------------------+ exponent | ? | 0 | 1 | 2 | 3 | +---+---+-------+---------------+-------------------------------+ | | | | | | v v v v v v ----------------------------------------------------------------- floats * **** * * * * * * * * * * * * ----------------------------------------------------------------- ^ ^ ^ ^ ^ ^ | | | | | | 0 | 2^-126 2^-125 2^-124 2^-123 | 2^-127 With subnormals: +-------+-------+---------------+-------------------------------+ exponent | 0 | 1 | 2 | 3 | +-------+-------+---------------+-------------------------------+ | | | | | v v v v v ----------------------------------------------------------------- floats * * * * * * * * * * * * * * * * * ----------------------------------------------------------------- ^ ^ ^ ^ ^ ^ | | | | | | 0 | 2^-126 2^-125 2^-124 2^-123 | 2^-127
- Computer science
- Algorithms
- Is it necessary for NP problems to be decision problems?
- Polynomial time and exponential time. Answered focusing on the definition of "exponential time".
- What is the smallest Turing machine where it is unknown if it halts or not?. Answer focusing on "blank tape" initial condition only. Large parts of it are summarizing the Busy Beaver Challenge, but some additions were made.
- Algorithms
- Git
| 0 | 4 | 8 | C | |-------------|--------------|-------------|----------------| 0 | DIRC | Version | File count | ctime ...| 0 | ... | mtime | device | 2 | inode | mode | UID | GID | 2 | File size | Entry SHA-1 ...| 4 | ... | Flags | Index SHA-1 ...| 4 | ... |
tree {tree_sha} {parents} author {author_name} <{author_email}> {author_date_seconds} {author_date_timezone} committer {committer_name} <{committer_email}> {committer_date_seconds} {committer_date_timezone} {commit message}
- How do I clone a subdirectory only of a Git repository?
- Python
- Web technology
- OpenGL
- What are shaders in OpenGL?
- Why do we use 4x4 matrices to transform things in 3D?
- Image Processing with GLSL shaders? Compared the CPU and GPU for a simple blur algorithm.
- Node.js
- Ruby on Rails
- POSIX
- What is POSIX? Huge classified overview of the most important things that POSIX specifies.
- Systems programming
- What do the terms "CPU bound" and "I/O bound" mean?
+--------+ +------------+ +------+ | device |>---------------->| function 0 |>----->| BAR0 | | | | | +------+ | |>------------+ | | | | | | | +------+ ... ... | | |>----->| BAR1 | | | | | | +------+ | |>--------+ | | | +--------+ | | ... ... ... | | | | | | | | +------+ | | | |>----->| BAR5 | | | +------------+ +------+ | | | | | | +------------+ +------+ | +--->| function 1 |>----->| BAR0 | | | | +------+ | | | | | | +------+ | | |>----->| BAR1 | | | | +------+ | | | | ... ... ... | | | | | | +------+ | | |>----->| BAR5 | | +------------+ +------+ | | | ... | | | +------------+ +------+ +------->| function 7 |>----->| BAR0 | | | +------+ | | | | +------+ | |>----->| BAR1 | | | +------+ | | ... ... ... | | | | +------+ | |>----->| BAR5 | +------------+ +------+
- Electronics
- Computer security
- Media
- How to resize a picture using ffmpeg's sws_scale()?
- Is there any decent speech recognition software for Linux? ran a few examples manually on
vosk-api
and compared to ground truth.
- Eclipse
- Computer hardware
- Scientific visualization software
- Numerical analysis
- Computational physics
- Register transfer level languages like Verilog and VHDL
- Android
- Debugging
- Program optimization
- Data
- Mathematics
- Section "Formalization of mathematics": some early thoughts that could be expanded. Ciro almost had a stroke when he understood this stuff in his teens.
- Network programming
- Physics
- Biology
- Quantum computing
- Bitcoin
- GIMP
- Home DIY
- China
One of the Holiest age old debugging techniques!
Git has some helpers to help you achieve bisection Nirvana: stackoverflow.com/questions/4713088/how-to-use-git-bisect/22592593#22592593
Obviously not restricted to software engineering alone, and used in all areas of engineering, e.g. Video "Air-tight vs. Vacuum-tight by AlphaPhoenix (2020)" uses it in vacuum engineering.
The cool thing about bisection is that it is a brainless process: unlike when using a debugger, you don't have to understand anything about the system, and it incredibly narrows down the problem cause for you. Not having to think is great!
Quick facts:
- Nationalities: Italian and Brazilian
- Grew up in: Brazil
- Relationship status 2017-: married
- Given name pronunciation: take your pick from Ciro Santilli's given name
- Chinese name: 三西猴, means "three western monkeys". Phonetic approximation to SANtilli CIRO. More info at: Ciro Santilli's Chinese name. Semi-unintentionally reminds Chinese people of Sun Wukong (孙悟空). This association is further slightly strengthened by the phonetic choice of 三 San, which Ciro later noticed matches the middle character of Tang Sanzang (唐三藏), the monk in Journey to the West. The given name 西猴 was given by Ciro Santilli's wife, then recent girlfriend, as a semi-joke, and he took it up because the best way to take a joke is to play along with the joker. 三 was chosen by Ciro himself.
- laptop: high end Lenovo ThinkPad
- distro: latest Ubuntu release
- Vim or Emacs: vi/vim. But for The Love, will someone please make an open source C++ integrated development environment that actually just works?
- tabs or spaces: spaces
- Mailing list or Git(Hub|Lab>): Git(Hub|Lab), with passion, see Section "Mailing list"
- system or unit tests: system
- programming languages: Python and C++. He'll learn Rust and Haskell once he's rich. As of the 2020s, Rust was picking up some serious steam, so Ciro might end up eating his own words there.
- musical instruments to listen: Chinese Guqin and electric Jazz-fusion guitar
- metric or imperial: metric, for The Love. Science? Standardization? 21st century anyone?
- QWERTY or Dvorak: QWERTY, alas
- birth name: Ciro Duran Santilli
Other people with the same name are listed at Section "Ciro Santilli's homonyms".
Firstly, in 2012, while he was at École Polytechnique, Ciro Santilli was introduced to LaTeX (thank God for French mathematical obsession), and his mind was blown:he though. Why isn't everyone doing that!
Ha, so I can write my own books, and so can anyone, for free?
One particular event stood out: Ciro made a small change to his teacher's course material, who blessed be him (dude's a legend, Ciro just noticed he has some Chinese publications with another French dude, e.g. www.amazon.co.uk/高效算法-应试与提高必修128例-克里斯托弗-Christoph-Durr/dp/B078SJQPVK "High-efficiency algorithm competitions 128 examples", did he write it the Chinese himself?? Must be of course to complement the notoriously low French professor salaries), made it available, and then Ciro gave him back the .tex file. Ciro was just a bit worried about how the teacher would be able to tell what he had changed in the file to validate the change. The teacher just said of course, "no problem, I'll just use
diff
". Ciro had never heard of diff. Let alone Git of course, though yes, this was a bit early in Git's history version control systems had been around since forever of course. This was 2011 or 2012, about 4 or 5 years into a superior education curricula with various courses involving computers, some requiring quite a lot of "fill these empty functions" style programming. Education is a joke. Anyways, this was a prelude to exactly what Ciro wanted to do in OurBigBook.com. This might have been the one actually: webia.lip6.fr/~durrc/Iut/Notes580.pdfNot long afterwards, Ciro started playing with Linux. Until then, Ciro had had some contacts with the mysterious operating system at university, and was a bit puzzled what the point of it was! He clearly remembers:University should be forced to use only open source software and hardware in undergrad teaching courses by law BTW.
- at the University of São Paulo that they had some "UNIX" computers in some classes, and at the library
- at École Polytechnique, he took a course about mathematical analysis and there was a "lab" where students were supposed to use FreeFem, great initiative BTW. And Ciro distinctly remembers being paried with a nice Chilian colleague, and the guy was alreay super at ease with the shell: "cd", "ls", etc. WTF was all that!
Then came an Ubuntu live disk on his own machine, and finally a measly 40GB dual book partition in a Microsoft Windows machine on a laptop. At first, it took a lot of time to learn all the crazy new terminal stuff! Yes, at this point, Ubuntu was already usable enough without the terminal, an accomplishment actually. But as a programmer, Ciro felt obliged to learn. Many hours were spent reading man pages at the library. But it all just felt so right, and sometimes powerful... true wizardry.
And ten years later, Ciro was seriously considering buying a computer without Windows pre-installed. He had not used Windows a single tie on a personal machine even once in those ten years!
Finally, to finish things off Ciro found two websites that changed his life forever, and made be believe that there was an alternative: Stack Overflow and GitHub.
The brutal openness of it all. The raw high quality content. Ugliness and uselessness too no doubt. But definitely spark in a sea of darkness.
Ciro Santilli's Stack Overflow contributions have, unsurprisingly, centered around the subjects he has worked with: systems programming and web development, and necessary tooling to get those done, such as Git, Python, Bash and Ubuntu.
His best answers are listed at: Section "The best articles by Ciro Santilli".
Stack Overflow has been the initial centerpiece of Ciro Santilli's campaign for freedom of speech in China, until Ciro noticed that GitHub might be potentially even more effective for it.
In Stack Overflow Ciro likes to:
- answer important questions found through Google which he needs to solve an actual problem he has right now, and for which none of the existing answers satisfied him, and close duplicates.
- monitor less known tags which very few people know a lot about and where the knowledge sharing desperately lacking, but in which Ciro specializes and therefore has some uncommon knowledge to share
In practice it also happens that Ciro:
- Googles for his own answers to remember some detail he wrote down but with slightly different terms that were closer to mind at the time, and find other similar questions for which he has the perfect answer.
- learns something new by chance, e.g. some new flashy feature of a new version of the C++ standard, thinks "this is awesome, there must be a Stack Overflow question for it", and then there is a question and he answers it
When he gets an upvote on one of his more obscure answers, Ciro often re-reads it, and often finds improvements to be made and makes them.
He doesn't like to refresh the homepage looking for easy reputation on widely known subjects. See also: online forums that lock threads after some time are evil.
The result is that Ciro ends up getting relatively a lot of reputation without much work! The term passive income, much beloved by fake investment gurus, comes to mind. But now it's "passive reputation"! And it is useless! Yay!
For this reason, Necromancer is Ciro's favorite badge (get 5 upvotes on a question older than 60 days), and as of July 2019, he became the 1 user with the most of this badge. Announcement on Twitter.
The number two at the time was VonC (see also: Section "Epic Stack Overflow users"), who had about 16 times more answers than Ciro in total! From this query: data.stackexchange.com/stackoverflow/query/1072396?&Date=2019-07-01&UserId=895245 it can be seen that as of July 2019, 1216 out of his 1329 answers were answered 60 days after the questions and constitute potential necromancers! Compare that to VonC's 1643 potential necromancers out of 21767 answers!
VonC eventually took back the lead in 2022, dude's a machine!!! twitter.com/cirosantilli/status/1546389532014247936
Someone at Ciro's work once said something along:and this does ring true in Stack Overflow as well. When you are answering stuff, it means that you either didn't know, or that the information wasn't well available, and so your specific application is progressing slowly because of that. Once the generic prerequisites are well solved and answered, you will spend much more time on your business specific things rather than anything else that can be factored out across projects, and so you will get more "directly useful work" done, and less Stack Overflow answers. Of course, without the prior research in place, you can't get the final product done either.
The more patents a research project generates, the less actually working products it produces.
In terms of per year reputation ranks, Ciro was in the top 100 in of the 2018 ranking with 38,710 reputation gained in that year: stackexchange.com/leagues/1/year/stackoverflow/2018-01-01?sort=reputationchange&page=4 (archive). He reached top 50 in 2022. Note that daily reputation is mostly capped to 200 per day, leading to a maximum 73000 per year. It is possible to overcome this limit either with bounties or accepts, and Ciro finds it amazing that some people actually break the 73k limit by far with accepts, e.g. Gordon Linoff reached 135k in 2018 (archive)! However, this is something that Ciro will never do, because it implies answering thousands and thousands of useless semi duplicate questions as fast as possible to get the accept. Ciro's reputation comes purely from upvotes on important question, and is therefore sustainable without any extra effort once achieved. Interestingly, Ciro appeared on top of the quarter SE rankings around 2019-11: web.archive.org/web/20191112100606/https://stackexchange.com/leagues but it was just a bug ;-)
There is no joy like answering an old question, and watching your better answer go up little by little until it dominates all others.
Stack Overflow reputation is of course, in itself, meaningless. People who contribute to popular subjects like web development will always have infinitely more reputation than those that contribute to low level subjects.
What happens on the specialized topics though is that you end up getting to know all the 5 users who contribute 95% of the content pretty soon as you study those subjects.
Like everything that man does, the majority of Ciro's answers are more or less superficial subjects that many people know but few have the patience to explain well, or they are updates to important questions reflecting upstream developments. But as long as they save 15 minutes from someone's life, that's fine.
There is great beauty when you are involved in a programming problem, and you suddenly remember: wait, I answered something related a few years ago! And especially so when you can go back and improve your old answer with new insight. This has great value, because when you were more newbie, you would have typed different words into Google Search than you would now. So by updating posts from when you were a newbie, you are helping other newbies more, as they are more likely to be also searching for those keywords. It is also very nice to have some head start on the answer's upvote count and not have to bootstrap yet another answer from 0 upvotes and have to go through all the competition!
For example, Ciro's most upvoted answer as of July 2019 is stackoverflow.com/questions/18875674/whats-the-difference-between-dependencies-devdependencies-and-peerdependencies/22004559#22004559 was written when he spent his first week playing with NodeJS (he was having a look at Overleaf, later merged into Overleaf, for education), which he didn't touch again for several years, and still hasn't "mastered" as of 2019! This did teach a concrete life lesson to Ciro however: it is impossible to know what is the most useful thing you can do right now very precisely. The best bet is to follow your instincts and do as much awesome stuff as you can, and then, with some luck, some of those attempts will cover an use case.
Ciro tends to take most pride on his systems programming answers, which is a subject that truly relatively few people know about. He likes it when he goes insanely deep into a subject, way beyond what OP had in mind, exposing full root causes and broader causes, see e.g.:
- stackoverflow.com/questions/1778538/how-many-gcc-optimization-levels-are-there/30308151#30308151
- stackoverflow.com/questions/34519521/why-does-gcc-create-a-shared-object-instead-of-an-executable-binary-according-to/55704865#55704865
- stackoverflow.com/questions/8352535/how-does-kernel-get-an-executable-binary-file-running-under-linux/31394861#31394861
Ciro also derives great joy from his "media related answers" (3D graphics, audio, video), which are immensely fun to write, and sometimes borderline art, see answers such as those under "OpenGL" and "Media" under the best articles by Ciro Santillis or even simpler answers such as:
There is something of greater value in perfectly presented technical knowledge, that goes beyond than simply getting something done. The pleasure of understanding and mastering something, and perhaps of the explanation itself. Sometimes when answering, Ciro feels like a tailor, where ASCII is his cloth. See also: Section "The art of programming", Section "Physics and the illusion of life".
Ciro's deep understanding of Stack Overflow mechanisms and its shortcomings also helped shape his ideas for: OurBigBook.com. So it is a bit funny to think that after all time Ciro spent on the website, he actually wants to destroy it and replace it with something better. There can be no innovation without some damage. It also led to Ciro's creation of Stack Overflow Vote Fraud Script.
After answering so many questions, he ended up converging to a more or less consistent style, which he formalized at:Like any other style guide, this answer style guide, once fully incorporated and memorized, allows Ciro to write answers faster, without thinking about formatting issues.
- meta.stackexchange.com/questions/18614/style-guide-for-questions-and-answers/326746#326746. Key self-quote:Intersperse paragraphs with lists, code blocks and other block elementsBeautiful text is not just text. Beautiful text is half text, and half ASCII art. There is almost a texture, or tempo, to it.
- meta.stackexchange.com/questions/10647/how-do-i-write-a-good-title/311903#311903. Question title style only. After a few years later more people agreeing with that post which now had -12 votes: meta.stackoverflow.com/questions/422082/should-we-add-option-use-complete-sentences-to-first-answers-queue
Ciro also made a question title style guide: meta.stackexchange.com/questions/10647/how-do-i-write-a-good-title/311903#311903 but for some reason the Stack Overflow community prefers their semi-defined title meta-language to proper English. Go figure.
Ciro started contributing to Stack Overflow in 2012 when he was at École Polytechnique.
Like all things that end up shaping the course of one's life, Ciro started contributing without thinking too much about it.
His first answer was to the LaTeX question: Standalone diagrams with TikZ?, which reflects the fact that this happened while Ciro was reaching his Ciro Santilli's Open Source Enlightenment.
Ciro's first upvote was for his 2012 question: How to run a Python script portably without specifying its full path?
When he started contributing, Ciro was still a newbie. One early event he will never forget was when someone mentioned a "man page", and Ciro commented saying that there was a typo!
When Ciro reached 15 points and gained the ability to upvote, it felt like a major milestone, he even took a screenshot of the browser! 1k, 10k and 100k were also particularly exciting. When the 100k cup (archive) arrived in 2018, Ciro made a show-off Facebook post (archive). At some point though, your brain stops caring, and automatically filters out any upvotes you get except on the answers that you are really proud of and which don't yet have lots of upvotes. The last remaining useless gamed achievement that Ciro looked forward to was legendary (archive), and which he achieved on 2021-02-16.
From the start, Ciro's motivations for contributing to Stack Overflow have been a virtuous circle of:
- save the world through free education
- It feels especially amazing when people in the real world start taking note of you, and either close friends tell you straight out that you're a Stack Overflow God, or as you slowly and indirectly find out that less close know or came to you due to your amazing contributions.
It is also amazing when you start having a repertoire of answers, and as you are writing a new answer, you remember: "hey, the knowledge of that answer would be so welcome here", and so you link to the other answer as well at the perfect point. This somewhat achieves does what OurBigBook.com aims to do: for each small section of a tutorial, gather the best answers by multiple people.
Another one is Aaron Hall, who is also very high on the necromancer list, answers in Python which is a topic Ciro cares about, and states on his profile:so another necromancer.
Follow me on Twitter and tell me what canonical questions you would like me to respond to!
Way to go.
Ciro also asks some questions on a ratio of about 1 question per 10 answers. But Ciro's questions tend to be about extremely niche that no one knows/cares about, and a high percentage of them ends up getting self answered either at asking time or after later research.
Some fun reactions to Ciro's Stack Overflow activity:
- Eric B comments[ref] on Ciro's answer to the question "What does multicore assembly language look like?":
Holy shit, Ciro made it his masters degree to write OP an answer. What a long and detailed answer, thanks!
Things that are not nice such as:
- Taboola, Outbrain, and other chumbox
- BLOBs
- Europe cookie law
- adhesive inside mobile phones and more generally, planned obsolescence
- Jupyter Notebook
- typographical characters that look like ASCII ones, but are not the ASCII ones, e.g. typographical quotes, em-dash. The non-breaking hyphen is not even whitespace, and by def Why not stick to ASCII when ASCII is good enough?
- excessive encapsulation
- replacement of master and slave terminology from technology
- mailing lists. And to add insult to injury, HTML on mailing list messages instead of plaintext.
- blank lines in code added by people trying to increase clarity, especially when there is already indentation for that. Every blank line must be preceded by a line comment explaining what the following block is about, or removed.
- messaging software that force you to have a mobile phone
- advertisements by telephone/SMS
- "state" such as global variables and object members, long live functional programming?
- mosquitoes, the only intrinsically bad thing about tropical countries
- projects with slow compilation times
- Microsoft Windows
- the 2019 Chinese government
- e-learning websites that only allows verified teachers to write content. Cowards who can't handle ranking algorithms.
- domain-specific language
- a build system without an out-of-tree option
- non-linear Git history: stackoverflow.com/questions/20348629/what-are-advantages-of-keeping-linear-history-in-git
- visual programming languages like Scratch. Waste of time. Text programming languages are already equally as visual due to indentation:Just make good serious gamedev libraries and integrated development environments for those real languages instead.
if x == 0: x = 1
- software that prevents you from running as root. Let me fucking shoot myself in the foot if I want to. It is better than having to deal with your hand holding bullshit, which is done in a different way for every project. E.g.: stackoverflow.com/questions/17466017/how-to-solve-you-must-not-be-root-to-run-crosstool-ng-when-using-ct-ng/53099177#53099177
- Medium
- luxury goods
- euphemism
- closed access academic journals are evil
- websites without OAuth
- shower room without a window to the exterior (mould!!!)
- single programs with their interface split across multiple windows, e.g. GIMP, ZynAddSubFX
- graphical user interfaces
- logograms
- infinitesimals. Just use limit instead, please
- country
- knowledge olympiads
- programming languages without a decent dominating package system
- closed source offline software used by millions
- exams
- security through obscurities
- dots in Gmail address
- things in websites that look like links, and behave like links, but don't let you middle click to open them on a separate tab
- K-pop
- numerical computing language
- fiscal paradises
- when the front-end of an website changes an important permanent state, but the URL does not change
- splash screens: you should show boot messages so that people will know what to Google for when things fail. Do you think computer newbies will be afraid and have nightmares?
- milk chocolate: why would you eat that instead of dark chocolate if you are older than 10?
- to talk about something without giving the real name to not scare off the audience
- mathematical symbol that looks like a Greek letter but isn't. Or perhaps mathematical notation in general
- when more than two people gather to play a board game or video game, and two or more people start chatting on and on about random subjects rather than concentrating on the game
- watching television while eating. Same for reading, or doing basically anything else but eat. The only acceptable activity is talking relaxedly, not about work.
- noises coming out of your bicycle. It is so hard to find where they come to fix them!!!
- code drop
- private cars as opposed to public transport. As a cyclist, you can just see the effect that large roads have on nearby areas, it just destroys nature.
- closed standards
- double consonants that make no difference to sound. Dilema? Dilemma? Dillema? Dillemma? Please!
- social media websites that show stuff from people you don't follow when you don't explicitly want that, including things which are not ads, just random suggestions. Twitter starting being like that cirac a 2022. Facebook got worse around that time. It is a constant fight against those stupid websites.
- socks with short legs that don't protect your ankle/lower calf from cold/scratches/dirt, e.g. liner socks
- Presta valves. Why would such a flimsy tech have become so popular compared to the infinitely superior Schrader!
This is a quick presentation that goes over some of the most common difficulties people find with Git.
Because Ciro's a software engineer, and he's done enough staring in computers for a lifetime already, and he believes in the power of Git, he didn't pay much attention to this part ;-)
According to the eLife paper, the code appears to have been uploaded to: github.com/d-j-k/puntseq. TODO at least mention the key algorithms used more precisely.
Ciro can however see that it does present interesting problems!
Because it was necessary to wait for 2 days to get our data, the workshop first reused sample data from previous collections done earlier in the year to illustrate the software.
First there is some signal processing/machine learning required to do the base calling, which is not trivial in the Oxford Nanopore, since neighbouring bases can affect the signal of each other. This is mostly handled by Oxford Nanopore itself, or by hardcore programmers in the field however.
After the base calling was done, the data was analyzed using computer programs that match the sequenced 16S sequences to a database of known sequenced species.
This is of course not just a simple direct string matching problem, since like any in experiment, the DNA reads have some errors, so the program has to find the best match even though it is not exact.
The PuntSeq team would later upload the data to well known open databases so that it will be preserved forever! When ready, a link to the data would be uploaded to: www.puntseq.co.uk/data
Author: David Tong.
Number of pages circa 2021: 155.
It should also be noted that those notes are still being updated circa 2020 much after original publication. But without Git to track the LaTeX, it is hard to be sure how much. We'll get there one day, one day.
Some quotes self describing the work:
- Perhaps for this reason Ciro Santilli was not able to get as much as he'd out of those notes either. This is not to say that the notes are bad, just not what Ciro needed, much like P&S:This is a very clear and comprehensive book, covering everything in this course at the right level. To a large extent, our course will follow the first section of this book.
In this course we will not discuss path integral methods, and focus instead on canonical quantization.
A follow up course in the University of Cambridge seems to be the "Advanced QFT course" (AQFT, Quantum field theory II) by David Skinner: www.damtp.cam.ac.uk/user/dbs26/AQFT.html
cirosantilli.com content uploaded to ourbigbook.com/cirosantilli Updated 2025-01-06 +Created 1970-01-01
Managed to upload the content from the static website cirosantilli.com (OurBigBook Markup source at github.com/cirosantilli/cirosantilli.github.io) to ourbigbook.com/cirosantilli.
Although most of the key requirements were already in place since the last update, as usual doing things with the complex reference content stresses the system further and leads to the exposition of several new bugs.
The upload of OurBigBook Markup files to ourbigbook.com was done with the newly added OurBigBook CLI
ourbigbook --web
option. Although fully exposed to end users, the setup is not super efficient: a trully decent implementation should only upload changed files, and would basically mean reimplementing/using Git, since version diffing is what Git shines at. But I've decided not to put much emphasis on CLI upload for now, since it is expected that initially the majority of users will use the Web UI only. The functionality was added primarily to upload the reference content.This is a major milestone, as the new content can start attracting new users, and makes the purpose of the website much clearer. Just having this more realistic content also immediately highlighted what the next development steps need to be.
Once v1.0 is reached, I will actually make all internal links of cirosantilli.com to point to ourbigbook.com/cirosantilli to try and drive some more traffic.
The new content blows up by far the limit of the free Heroku PostgreSQL database of 10k lines. This meant that I needed to upgrade the Heroku Postgres plugin from the free Hobby Dev to the 9 USD/month Hobby Basic: elements.heroku.com/addons/heroku-postgresql, so now hosting costs will increase from 7 USD/month for the dyno to 7 + 9 = 16 UDS/month. After this upgrade and uploading all of cirosantilli.com to ourbigbook.com, Heroku dashboard reads reads:so clearly if we are ever forced to upgrade plans again, it means that a bunch of people are using the website and that things are going very very well! Happy how this storage cost turned out so far.
- 30,918 rows out of 10,000,000
- 61.0 MB (out of 10 GB)
One key limitation found was that Heroku RAM memory is quite limited at 512MB, and JavaScript is not exactly the most memory economical language out there. Started investigation at: github.com/ourbigbook/ourbigbook/issues/230 Initially working around that by simply splitting the largest files. We were just on the verge of what could be ran however luckily, so a few dozen splits was enough, it managed to handle 70 kB OurBigBook Markup inputs. So hopefully if we manage to optimize a bit more we will be able to set a maximum size of 100 kB and still have a good safety margin.
The minimalism, serverlessness/lack of temporary caches/lack of permission management, Hipp's religious obsession with efficiency, the use of their own pure Fossil version control[ref]. Wait, scrap that last one. Pure beauty!
Official Git mirror: github.com/sqlite/sqlite
Create a table
sqlite3 db.sqlite3 "
CREATE TABLE 'IntegerNames' (int0 INT, char0 CHAR(16));
INSERT INTO 'IntegerNames' (int0, char0) VALUES (2, 'two'), (3, 'three'), (5, 'five'), (7, 'seven');
"
List tables:output:
sqlite3 db.sqlite3 '.tables'
IntegerNames
Show schema of a table:outputs the query that would generate that table:
sqlite3 db.sqlite3 '.schema IntegerNames'
CREATE TABLE IF NOT EXISTS 'IntegerNames' (int0 INT, char0 CHAR(16));
Show all data in a table:output:
sqlite3 db.sqlite3 'SELECT * FROM IntegerNames'
2|two
3|three
5|five
7|seven
So we can track the music in Git!
- always upvote questions you care about, to increase the probability that they will get answered
- never upvote other people's answers unless you might gain from it somehow, otherwise you are just giving other high reputation users more reputation relative to you
- only mark something to close or as a duplicate if it will bring you some advantage, because closing things creates enemies, especially if the OP has a high profileOne example advantage is if you have already answered the question (and the duplicate as well in case of duplicates), because this will prevent competitors from adding new better answers to overtake you.
- protect questions you've answered whenever someone with less than 10 reputation answers it with a bad answer, to prevent other good contributors from coming along and beating you
- when you find a duplicate pool answer every question with similar answers.Alter each answer slightly to avoid the idiotic duplicate answer detector.If one of the question closes, it is not too bad, as it continues netting you to upvotes, and prevents new answers from coming in.
- follow on Twitter/RSS someone who comments on the top features of new software releases. E.g. for Git, follow GitHub on Twitter, C++ on Reddit. Then run back to any question which has a new answer.
- always upvote the question when you answer it:
- the more upvotes, more likely people are to click it.
- the OP is more likely to see your answer and feel good and upvote you
- if a niche question only has few answers and you come with a good one, upvote the existing ones by other high profile users.This may lead to them upvoting or liking you.Even if they don't, other people will still see your answer anyway, and this will lead to people to upvoting you more just to make your great answer surpass the current ones, especially if the accepted one has less upvotes than yours. Being second is often an asset.
- always upvote comments that favor you:
- "I like this answer!" on your answers
- "also look at that question" when you have answered that question
- don't invest a lot in edits. They don't give you rep, and they can get reverted and waste your time.Why are you trying to help other people's answers to get rep anyways? Just make a separate answer instead! :-)
- if you answer a question by newbie without 15 reputation, find their other questions if any and upvote them, so that the OP can upvote your answer in addition to just accepting
- If you haven't answered a question, link to related questions you've answered on question comments, so more people will come to your answers.If you have answered the question, only link to other questions at the bottom of your answer, so that people won't go away before they reach your answer, and so as to strengthen your answer.
- if a question has 50 million answers and you answer it (often due to a new feature), make a comment on the question pointing to your answer
- if you get a downvote, always leave a comment asking why. It is not because you care about their useless opinion, but because other readers might see the comment, feel sorry for you, and upvote.
- ask any questions under a separate anonymous accounts. Because:
- intelligent people are born knowing, and don't ever ask any questions, so that would hurt your reputation
- downvoting questions does not take 1 reputation away from the downvoter, and so it greatly opens the door for your opponents to downvote you without any cost.
Website front-end for a mathematical formal proof system Updated 2025-01-06 +Created 1970-01-01
When Ciro Santilli first learnt the old Zermelo-Fraenkel set theory and the idea of formal proofs, his teenager mind was completely blown.
Finally, there it was: a proper and precise definition of mathematics, including a definition of integers, reals and limits!
Theorems are strings, proofs are string manipulations, and axioms are the initial strings that you can use.
Once proved, press a button on your computer, and the proof is automatically verified. No messy complicated "group of savants" reading it for 4 years and looking for flaws!
There are a few proof assistant systems with several theorems in their Git tracked standard library. The hottest ones circa 2020 are:
- github.com/HOL-Theorem-Prover/HOL
- github.com/seL4/isabelle. Rumours have it that this is "uncompilable" from source without blobs. It does however offer a very rich IDE.
- github.com/coq/coq
- Metamath this one is likely an older and less powerful system, but the web presentation and tutorial are very good! Source: github.com/metamath/metamath-exe Here is a proof that 2 + 2 equals 4: us.metamath.org/mpeuni/2p2e4.html
- Lean
- www.bookofproofs.org/branches/fpl-formal-proving-language/ from BookofProofs
And here are some more interesting links:
- github.com/awesomo4000/awesome-provable an awesome list of formal stuff
- devel.isa-afp.org/ Isabelle Archive of Formal Proofs. A curated list of Isabelle proofs, with minimal web UI. This is almost what we need, but without the manual curation, and with a better web UI.
- www.cs.ru.nl/~freek/100/ list of how many of the "arbitrarily" selected the Hundred Greatest Theorems by Paul and Jack Abad (1999) had been proved in several formal systems, serving therefore as a benchmark of sorts
However, as expressed by the QED manifesto, is unbelievable that there isn't one awesome and dominating website, that hosts all those proofs, possibly an on the browser editor, and which all mathematicians in the world use as the one golden reference of mathematics to rule them all!
Just imagine the impact.
Standard library maintainers don't have to deal with the impossible question of what is "beautiful" or "useful" enough mathematics to deserve merged: users just push content to the online database, and star what they like!
We then just use GitHub-like namespaces for each person's theorem, e.g. "cirosantilli/fundmaental-theorem-of-calculus" or "johndoe/fundmaental-theorem-of-calculus" so that each person owns their own preferred definition IDs, which others can reuse.
No more endless bikeshedding over what insane level of generality do your analysis theorems need to be (Ciro Santilli attended at talk about Lean where the speaker mentioned this was a problem)!
This would move things more out of the "pull request and Git tracked code" approach, into a more "database with entries" version of things.
Furthermore, it is just a matter of time until the "single standard library" approach starts to break down, as the git clone becomes impossibly large. At this point, people have to start publishing separate packages. And when this happens, you would need to retest every package that you add to your project. This is why a centralized database is just inevitable at some point, it just scales better.
Interested in a conjecture? No problem: just subscribe to its formal statement + all known equivalents, and get an email on your inbox when it gets proved!
Are you a garage mathematician and have managed to prove a hard theorem, but no "real" mathematician will read your proof because your unknown? Fuck that, just publish it on the system and let it get auto verified. Overnight fame awaits.
Notation incompatibility hell? A thing of the past, just automatically convert to your preferred representation.
Such a system would be the perfect companion to OurBigBook.com. Just like computer code offers the backbone of Linux Kernel Module Cheat Linux kernel tutorials, a formal proof system website would be the backbone of mathematics tutorials! You know what, if OurBigBook.com becomes insanely successful, Ciro is going to add this to it later on.
Furthermore, it would not be too hard to achieve this system!
All we would need would be something analogous to a package registry like PyPI or NodeJS' registry.
Then, each person can publish packages containing proofs.
Packages can rely on other packages that contain pre-requisites definition or theorem.
Packages are just regular git repos, with some metadata. One notable metadata would be a human readable description of the theorems the package provides.
The package registry would then in addition to most package registries have a CI server in it, that checks the correctness of all proofs, generates a web-page showing each theorem.
All proofs can be conditional: the package registry simply shows clearly what axiom set a theorem is based on.
This is a close as we can get to Erdős' book.
Maybe Ciro will just stuff this into OurBigBook.com once that takes over the world.
This project could be seen as a more automated/less moderated version of ProofWiki.
Bibliography:
- The Math Genome Project has very similar end goals. Apparently it will run proofs on server against the stdlib, but not allow one proof to depend on another, so in the end you still have to pull request everything back. Also there may be moderation forever, unclear. Ciro tried to create a dummy lolol theorem without any correct syntax and it just became private. Also apparently every single proof needs corresponding LaTeX manually written to be accepted. Cowards!
- math.stackexchange.com/questions/1767070/what-is-the-current-state-of-formalized-mathematics/3297536#3297536
- math.stackexchange.com/questions/2747661/why-is-there-not-a-system-for-computer-checking-mathematical-proofs-yet-2018
- stackoverflow.com/questions/19421234/how-do-i-generate-latex-from-isabelle-hol
- stackoverflow.com/questions/30152139/what-are-the-strengths-and-weaknesses-of-the-isabelle-proof-assistant-compared-t
- arxiv.org/abs/2102.03044 SPIRG, a decentralized version of this