Comparison of proof assistants by Ciro Santilli 35 Updated +Created
Conclusion and feelings by Ciro Santilli 35 Updated +Created
It is a bit sad to work on a project that no one cares about. You're not sure if you're crazy or a visionary. And it is kind of lonely.
I sometimes wonder if I would be happy doing this for the rest of my life if I could. And if it would have any impact at all no matter for how long I do it. My feelings in that area go from slightly depressed to slightly excited about the potential a few times every week.
As we all know, living and making life choices means sacrificing other things that could have been. When I was in France in 2015, I started a masters course in AI/robotics with the idea of doing a PhD and AGI research later on but quit half way because I felt university was such a waste of time.
But come now the AI boom, and although I still believe education is broken, I might have been much better off financially/reputationally if I had withstood the bullshit followed that path. Instead I sacrificed that for nerding about low level programming and open educational content.
It is hard to get such ideas off one's mind. But the fact is, for better or worse, I've started walking the path of educational reform and sacrificed others along the way, and this is the path that I'm further ahead than other people, and perhaps I should pursue it further to a possible conclusion. Also this path has the advantage that it is not fully exclusive from other academic endeavors as we will always need content about the new flashy things that keep coming up.
So yeah, it's hard, but here I am, and I'll go as far as I can without going into Charles Bukowski levels of personal sacrifice.
What might be next by Ciro Santilli 35 Updated +Created
OK, I need to do content. I know :-) At the university I'm at, the only department that is open is the mathematics one. Both:
  • physically, I'm sitting next to some students right now, though they don't yet know that their saviour is just next to them.
  • in terms of publishing the course materials online. Many of them even have solution
All other courses extremely closed, notably Physics, which is the other course I'd consider. There are upsides and downsides for going for Mathematics:
  • upside:
    • maths doesn't change with time
    • maths doesn't require experiments
  • downside: most of it is useless compared to Physics
If I were free to choose, I might go for Physics instead. But maths isn't hard, and I think I'll just go with the hand I'm dealt this time to start with.
Tech wise, the big things are the following ones to which I have given different levels of architectural consideration (i.e. read: I'm afraid they'll be fucking hard and that I'll spend a month on yet another useless feature that won't help get a single user). I don't think I'll do those before at least a little bit of content, we'll see:
  • WYSIWYG: this is not a question of if, but when and how. Even I miss it when dealing with images. I was particularly impressed by Trillium Notes, and might consider forking it or reusing some of its components
  • perfect two way sync from web to local: github.com/ourbigbook/ourbigbook/issues/326
    Currently, after much effort, publishing from local to web is extremely good.
    But pulling back changes that you make on web UI locally is not really possible. A basic version can be made easily, but a great version requires some thought.
    In particular, preventing accidental rewrite on simultaneous local + web edits require edit history to be in place.
    The rationale here is that users would start editing on Web with a low entry barrier. And as they become more committed to the project, they would eventually transition to having editing most of their content locally from a desktop, with the exception of a few minor edits on the go when they are on a cell phone, and which we want to very easily and automatically be pulled back to local as soon as they open an editor on their laptop.
    I.e. we want to add a downwards arrow to the following diagram:
    https://raw.githubusercontent.com/ourbigbook/ourbigbook-media/master/feature/local-editing/bigb-publish-to-web-or-static-editor-logos.svg
Smaller cute tech that I might do before content "real quick" include:
Another thing I really want to do before time is up is to create a video summarizing my philosophy of education. I want it to be as fun and funny and sad as possible, with silly moving animated images and slides, not just me talking to the camera. Although all of the points I intend to talk about have undoubtedly been covered by others, it is something that I feel so strongly about that I would like to tell others about it more personally. If I start it it will likely take a few days to get done, and I'm not sure wha the final quality would be. It is a bit sad to not do "project work", but I think I'll end up doing it regardless. Class it under "fundraising" if you will, as it may help to find other like minded but rich people.
How the tech improved by Ciro Santilli 35 Updated +Created
In any case, the outcome of that is that the tech has improved. And I have done a relatively good job of clearly publishing any "more user visible" improvements to docs.ourbigbook.com/news and social media such as though it is important to note that there have been more than one "fix a hard bug" weeks that were not published because they would just bore readers.
During this period the main focus has been on improving OurBigBook Web, i.e. the dynamic website that powers OurBigBook.com. There are two reasons for that:
  • Web is what has the OurBigBook topics feature for mind-melding, which is the killer feature of OurBigBook compared to other note taking apps and therefore deserves the highest levels of priority
    Static website generation is an indispensable escape valve that ensures that your content can be published forever even if OurBigBook.com goes down one day, which it won't as long as I live. But the innovation is Web.
  • static website generation was closer to good enough, but web was much further and is fundamentally harder.
    I'm extremely satisfied with OurBigBook static website generation and haven't touched it as much. It wasn't easy to reach this state, but I'm there.
    But Web is a different and much more complex beast.
    Making CLI software that will run on a person's local computer under full trust and building a bunch of HTML from lightweight markup in bulk is one thing.
    But making a public dynamic website that has to continuously maintain a coherent database state on granular updates, while giving users some trust but not enough for them to blow everything up is on a totally different level. See e.g. the recent SPAM attack we've had to fend off.
    Figure 1.
    Screenshot showing voting manipulated SPAM as the most highly upvoted article on OurBigBook.com
    . Source.
    And then there's also the issue of front-end being mega-hard to get right.
As a result, Web is now way less buggy and much more usable.
If you look through the list of Web updates, there is nothing specifically mind blowing. The core ideas have largely crystallized, and we are just trying to making them click. I have a few more punches up my sleeve, but the core is decided.
Figure 3.
OurBigBook Web article announcement
. Source. Another cute new feature, you can send an email to your followers about a new amazing article you created.
Web process has been somewhat slower than what I'd like. Of course, it is the case of any project that things are easily said than done. But there are two other main structural factors that have played into it:
  • I have my first baby now, and we're learning how to deal with that on the fly.
    For example, we could have put him on childcare a bit earlier, but due to inexperience we've kept him a bit longer than we maybe should have.
    Things are well sorted out now, but not matter how good your support system is, at the end of the day, and more often night, it is you the parents that have to deal with a lot of inevitable baby issues. Unless you want them to turn into psychopaths and drug addicts that is, which I don't. I've reached the point of semi failure middle age that the baby feels like my best moonshot.
    All of this sets a fundamental limit on how many hours you can work per week.
    But at least with the donations I was able to work on OurBigBook at all. Because if it weren't for that, I would have to focus entirely on the generic job instead and OurBigBook would have been put on hold.
  • the choice of Web stack. I was allured by Next.js. I can see the beauty and usefulness of a Node.js render front-end that also runs on backend and hydration. That is awesome.
    But:
    • React is insanely hard to learn and understand. Furthermore, it is also hard to understand the performance problem that it solves, and actually have a benchmark where this problem is solved faster than just delivering some HTML files with ad-hoc Js on top.
    • the lack (or perhaps excess of shitty) actual web framework like Ruby on Rails and Django means that I have to rediscover the wheel many times over for all the essential support activities like testing, login and so one
    At this point a rewrite is out of the question. I've managed to master things well enough to get a decent result, and given up on the few things that I couldn't for the life of me achieve, after documenting them very well for posterity of course.
Aside from Web, there was only one thing that received a significant improvement, and that was the OurBigBook VS Code extension. The extension is not perfect, and it is not the "final UI", which has to be some WYSIWYG implementation, and there are some fundamental limitations that cannot be overcome without patching VS Code itself. However, the extension is already extremely usable, and I'm writing this on it right now. Basics like syntax highlighting, jump to definition and autocomplete are very useful and usable.
Figure 4.
Tree navigation in the OurBigBook Visual Studio Code extension
.
I ended up doing tech rather than content as usual by Ciro Santilli 35 Updated +Created
At first I had intended to create a lot more content for the world class university located where I lived, but I ended up not doing that and just improving the project tech instead.
There are a few reasons for this, good or bad:
  • as a tech nerd, my natural tendency is to first sit down by myself and code to solve big general problems rather than go out and try to solve specific people's specific problems to obtain money and users
  • at one point I got the feeling that helping students with a bunch of small courses might be useful, and that instead I might get more impact by instead by focusing on creating content for a next big thing area such as: because many of the courses are fundamentally useless by design due to misalignment between university and reality.
    I'm still not sure what to do about that, but I do think I'll try to do a bit of course solving at least and see how it goes.
    One thing I've learned first hand through Ciro Santilli's Stack Overflow contributions and Linux Kernel Module Cheat is that the barrier to make money from a useful open source learning project that benefits a large number of people a little bit is huge, perhaps infinite, and that it might be better to instead focus more intensely on fewer users. This insight pushes me more towards going for solving local courses.
    Another consideration that supports going for courses is that being close to students is perhaps my only unfair advantage. There is likely no one else in the world in the same position that I'm at, with some "free time" to chill with undergrads and help them with 100% of my undivided attention and passion.
    A point that pulls me towards the big tutorials however is that my time is almost up, and focusing on them would increase the chances that I will be work in those fields afterwards. This feeling may go against the best interests of the project, but it is perhaps an inevitable self preservation consideration unless someone decides to free me from that forever with the 2M :-)
  • the entry barrier to help students of a top university is rather high. The students are already extremely busy and pressured (this is pe), and if it is in the slightest hard to explain their problems to you because you are not fluent enough in their subject, they will find a faster way to obtain the knowledge and never come to you.
  • I also did a bit of procrastinating with a few quick few exploration into cute programming projects. Nothing too crazy long however, just the usual. It's in my nature to have broad interests, and perhaps only such a person can make a OurBigBook.com. I'm not a fast worker. But I never stop. Once something is in my "this must be done or learnt list", I just keep coming back to it again and again until it happens.
The downsides of going for tech first are severe:
  • you risk being misaligned with what users want and spend enormous amounts of time on useless features
  • it is also rather demotivating that you are working hard on a really cool feature but you know that there are no users yet so no one will benefit from it, and that this feature alone is not enough to attract the users anyways
There are however counterpoints to these as for anything else:
  • I'm a user and I'm always improving it for myself. If there are other people like me out there, they will love it. If there aren't, perhaps I'll never be able to do anything that caters for them well enough anyways.
  • as the two users made me understand, once someone touches your thing, they expect it to be perfect, and their standards are extremely high. This is understandable in part given the large number of note taking apps in existence, and notably WYSIWYG ones. As such, there is some rationale for improving tech.
Metrics and rationales by Ciro Santilli 35 Updated +Created
Long story short, the project is so far a complete failure on the most important metric: number of regular users, which current sits at exactly one: myself.
There were notable users who found the project online and who actually tried to use the website for some content and provided extremely valuable feedback:Unfortunately after the period of a few weeks they stopped using it to follow their other priorities instead. Which is of course totally fine, however sad.
I still believe that the OurBigBook Web feature is a significant tech innovation that could make the website go big.
I also believe that the project gets many fundamentals of braindumping right, notably the infinitely deep table of contents without forced scoping, e.g.:
- Mathematics
  - Calculus
does not make Calculus have an ID orr URL of mathematics/calculus, rather it's just calculus.
But there is a fundamental difficulty in reaching critical mass to that self-sustaining point, as people don't seem to be convinced by these logical "my system is better" argument alone, as opposed to having them Google into stuff they need now and then understand that the project is awesome.
A closely related critical mass issue is that existing big multiuser knowledge base websites such as Stack Overflow and Wikipedia have a tremendous advantage on PageRank. No matter how useless a Wikipedia article about something is, it will always be on top of Google within a week of creation for title hits. And since the main goal of publishing your stuff is to get it seen, it makes much more sense for writers to publish on such existing websites whenever possible, because anywhere else it is way way less likely to be seen by anybody.
Even I end up writing way more on Stack Overflow than on OurBigBook as a programmer. But I still believe that there is a value to OurBigBook, for the usual reasons of:
  • it allows you to organize a more global view of a subject, i.e. a book. Even I write answers on Stack Overflow, I also tend to organize links to these answers in a structured ways here, see e.g. big topics such as SQL
  • deletionism and overly narrowness of allowed topics/style
Perhaps what saddens me the most is that even on GitHub stars/Twitter/Hacker news terms there is almost no interest in the project despite the fact that I consider that it has innovations, while many other note taking apps as well in the thousands of stars. Maybe I'm just delusional and all the tech that I'm doing is completely useless?
Part of the issue is probably linked to the fact that most other note taking apps focus on "help me organize my ideas so I can make more money" and often completely ignore "I want to publish my knowledge", and stuff that helps you make money is always easier to sell and promote.
OurBigBook on the other hand a huge focus on "I want to publish me knowledge". It aims almost single mindedly in being the best tool ever for that. However this doesn't make money for people, and therefore there are going to be way less potential users.
I do believe strongly that all it takes is a few users for the project to snowball. For some people, once you start braindumping, it is very addictive, and you never want to stop basically. So with only a few of those we can open large parts of undergrad knowledge to the world. But these people are few, and so far I haven't been able to find even a single one like me, and on top of that convince them that I have created the ultimate system for their knowledge publishing desires.
Another general lesson is that I should perhaps aimed for greater compatibility with existing systems such as Obsidian. Taking something that many people already know and use can have a huge impact on acceptance. E.g. anything that touches Obsidian can reach thousands of stars: github.com/KosmosisDire/obsidian-webpage-export. Note taking apps that aim for "markdown" compatibility also tend to fare better, even if in the end you inevitably have to extend the Markdown for some of your features. And WYSIWYG, which I want but don't have, is perhaps the ultimate familiarity.
Another issue compared to other platforms is that OurBigBook just came out late. Obsidian launched in 2020. Roam Research and Trillium Notes also came earlier. And it is hard to fight the advantage already gained by those on the "I'm going to take some personal notes" area. I do believe however that there a strong separation between "these are my personal notes" and "I want to publish these". Once you decide to publish your knowledge, you immediately start to write in a different way, and it is very hard to convert pre-existing "private" notes into ones suitable for public consumption.
Alan Partridge by Ciro Santilli 35 Updated +Created
Video 1.
Partridge works at Tesco for a day
. Source. Clip from Alan Partridge's Scissored Isle. This is a good one.
OurBigBook Project Update March 2024 by Ciro Santilli 35 Updated +Created 2025-03-28
This is a summary of the status of the OurBigBook Project, focusing notably on the past 9 months that I've been able to devote fully to it starting June 2024 notably due to the anonymous 1000 Monero donation and other supporters.
I have 3 months left and after unless some crazy person gives more money, I'll go back to some generic programming job that could be done by many other people so that my wife won't kill me. Hopefully I'll find something in quantum computing or AGI research this time that is not too boring, but we'll see.
I should also note that I have raised my requirement for a second year full time from 100k USD to 200k USD, such that there are about only 144k USD missing as of writing, a bargain. See also Section "Sponsor Ciro Santilli's work on OurBigBook.com". I have also set a 2M USD retirement goal in case someone wants to free me to lurk after university students for the rest of my life. Creepy.
The reason for this increase is partly because I'm jealous watching my university peers getting relatively richer and richer than me. More seriously though, as I'm likely going to be looking for a job soon, I don't want to scare employers off too much thinking that it is likely that I'll be leaving in a few months too easily. Plus inflation and the natural lack of security that such endeavour brings.
Rhetoric by Ciro Santilli 35 Updated +Created
Wikipedia internal PageRank by Ciro Santilli 35 Updated +Created
One day, once WebGraph exposes a PageRank CLI, we will be able to do it fully from the CLI, it will be beautiful.
Common Crawl WWW Ranking by Ciro Santilli 35 Updated +Created
This appears to be the direct precursor project of the Common Crawl web graph official PageRank
This section is about: wwwranking.webdatacommons.org/
Did not contain either of cirosantilli.com or OurBigBook.com as of 2025!
Based on Common Crawl 2012, and they don't seem to be updating it regularly...
Joyce Chen Lab by Ciro Santilli 35 Updated +Created
Open PageRank by Ciro Santilli 35 Updated +Created
This section is about: www.domcop.com/openpagerank/
TODO is their source code open source?
Top 10 million websites: www.domcop.com/top-10-million-websites Can be downloaded as CSV. Contained both cirosantilli.com and OurBigBook.com as of 2025!
Get values for some websites: www.domcop.com/openpagerank/
Common Crawl web graph official PageRank by Ciro Santilli 35 Updated +Created
As of 2025 Common Crawl web graph also dumps its own PageRank for each release. See e.g. the file cc-main-2024-25-dec-jan-feb-host-ranks.txt.gz from at: data.commoncrawl.org/projects/hyperlinkgraph/cc-main-2024-25-dec-jan-feb/index.html The first 20 rows are:
#harmonicc_pos  #harmonicc_val  #pr_pos #pr_val #host_rev
1       3.4626736E7     3       0.005384977821460953    com.facebook
2       3.42356E7       2       0.007010813553170503    com.googleapis.fonts
3       3.007577E7      1       0.008634952900502719    com.google
4       3.0036014E7     4       0.004411782034463272    com.googletagmanager
5       2.9900088E7     5       0.0036940035989790525   com.youtube
6       2.9537252E7     6       0.0032959808223701      com.instagram
7       2.9092556E7     9       0.0027616338842143423   com.twitter
8       2.7346152E7     7       0.0032101332824200743   com.gstatic.fonts
9       2.6818654E7     11      0.0017699438634060259   com.linkedin
10      2.5383126E7     8       0.0027849243241515574   org.gmpg
11      2.3747762E7     12      0.0016577826631867043   com.google.maps
12      2.3514198E7     15      0.0013399414238881337   com.googleapis.ajax
13      2.3504832E7     16      0.0012791339750445332   com.google.play
14      2.337092E7      47      3.794876113587071E-4    be.youtu
15      2.2925148E7     14      0.0013857916784687163   com.cloudflare.cdnjs
16      2.2851038E7     18      0.0012066313543285154   com.google.plus
17      2.2833728E7     13      0.0015745738381307273   org.wordpress
18      2.2830926E7     36      6.02400471665468E-4     com.pinterest
19      2.27056E7       45      4.001342924757244E-4    com.google.support
20      2.2687704E7     24      9.381217848819624E-4    net.jsdelivr.cdn
so quite plausible, except for org.gmpg. What the fuck is that and why is it ranked so high? Is it a quirk with the hosts inside subdomains?
Perhaps a more relevant dump might be the domain-only one cc-main-2024-25-dec-jan-feb-domain-ranks.txt.gz:
#harmonicc_pos  #harmonicc_val  #pr_pos #pr_val #host_rev       #n_hosts
1       3.1238044E7     3       0.01110707704411023     com.facebook    3632
2       3.0950192E7     2       0.016650558868491434    com.googleapis  3470
3       3.000803E7      1       0.01749148008448444     com.google      14053
4       2.7319046E7     5       0.00670112168785935     com.instagram   789
5       2.7020862E7     7       0.005464885844102939    com.youtube     1628
6       2.6954494E7     4       0.007740808154448889    com.googletagmanager    42
7       2.6344278E7     8       0.0052073382920908295   com.twitter     712
8       2.5414934E7     6       0.0058790483755603844   com.gstatic     171
9       2.4803688E7     11      0.0038589161241338816   com.linkedin    690
10      2.4683842E7     10      0.004929923081722034    org.gmpg        2
11      2.3575146E7     9       0.005111453489231459    com.cloudflare  951
12      2.2735678E7     14      0.002131882799792225    com.gravatar    98
13      2.2356142E7     12      0.002513741654851857    org.wordpress   1250
14      2.2132868E7     15      0.0019991529719988496   com.apple       3261
15      2.2095914E7     31      0.0010706467268355303   org.wikipedia   2099
16      2.2057972E7     21      0.0015644264715267535   com.pinterest   360
17      2.1941062E7     40      8.52391305373285E-4     be.youtu        15
18      2.1826452E7     16      0.0018442726685905964   net.jsdelivr    40
19      2.1764224E7     34      9.747994384099485E-4    gl.goo  951
20      2.1690982E7     35      9.740295347556525E-4    com.vimeo 
But nope, org.gmpg is still there!
vigna.di.unimi.it/ftp/papers/GraphStructure.pdf comments on it:
for instance, gmpg.org is the reference for a vocabulary that describes relationships
so it appears to be a computer-readable ontology mechanism in the lines of Resource Description Framework which interlinks many websites. The article also mentions another interesting noise in miibeian.gov.cn which every Chinese website is required to link to for their ICP license.
The source code for it seem to be at: github.com/commoncrawl/cc-webgraph and seems to use the Java version of the WebGraph quite directly on their BVGraph dump. There is apparently no CLI for PageRank however unfortunately, they have to use a bit of Java code. That would be so awesome!
Intel fellow by Ciro Santilli 35 Updated +Created
Katz centrality by Ciro Santilli 35 Updated +Created
Just image being famous only for being 44 years too early to a party.
The downside of "Katz centrality" compared to PageRank appears to be that if if a big node links to many many nodes, all of those earn a lot of reputation, regardless of how outgoing links there are:
University of Chicago research group by Ciro Santilli 35 Updated +Created
Eigenvector centrality by Ciro Santilli 35 Updated +Created
This is the family of algorithms to which PageRank

There are unlisted articles, also show them or only show them.