Long story short, the project is so far a complete failure on the most important metric: number of regular users, which currently sits at exactly one: myself.
There were notable users who found the project online, actually tried to use the website for some content, and provided extremely valuable feedback. Unfortunately, after a few weeks they stopped using it to follow their other priorities instead, which is of course totally fine, however sad.
I still believe that the OurBigBook Web feature is a significant tech innovation that could make the website go big.
I also believe that the project gets many fundamentals of braindumping right, notably the infinitely deep table of contents without forced scoping, e.g.:
- Mathematics
  - Calculus
does not make Calculus have an ID or URL of mathematics/calculus, rather it's just calculus.
But there is a fundamental difficulty in reaching critical mass to that self-sustaining point, as people don't seem to be convinced by this logical "my system is better" argument alone, as opposed to having them Google into stuff they need now and then understand that the project is awesome.
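The unscoped ID scheme mentioned above can be illustrated with a minimal sketch. This is not OurBigBook's actual implementation: the helper names title_to_id and walk and the nested-tuple tree format are hypothetical, made up for this illustration. It only shows that the ID of a header depends on its own title, while a scoped scheme would also depend on its ancestors.

```python
import re


def title_to_id(title):
    """Hypothetical ID rule: lowercase, non-alphanumeric runs become '-'."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def walk(node, parents=()):
    """Yield (title, unscoped_id, scoped_id) for a (title, children) tree."""
    title, children = node
    own = title_to_id(title)
    # Unscoped: the ID ignores where the header sits in the tree.
    # Scoped: the ID is prefixed by every ancestor, for contrast.
    yield title, own, "/".join(parents + (own,))
    for child in children:
        yield from walk(child, parents + (own,))


toc = ("Mathematics", [("Calculus", [("Derivative", [])])])
rows = list(walk(toc))
for title, unscoped, scoped in rows:
    print(f"{title}: unscoped={unscoped} scoped={scoped}")
```

With unscoped IDs, moving Calculus under a different parent does not change its ID or URL, which is what makes an arbitrarily deep table of contents cheap to reorganize.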
A closely related critical mass issue is that existing big multiuser knowledge base websites such as Stack Overflow and Wikipedia have a tremendous advantage on PageRank. No matter how useless a Wikipedia article about something is, it will always be on top of Google within a week of creation for title hits. And since the main goal of publishing your stuff is to get it seen, it makes much more sense for writers to publish on such existing websites whenever possible, because anywhere else it is way way less likely to be seen by anybody.
Even I end up writing way more on Stack Overflow than on OurBigBook as a programmer. But I still believe that there is a value to OurBigBook, for the usual reasons of:
Perhaps what saddens me the most is that even in GitHub stars/Twitter/Hacker News terms there is almost no interest in the project despite the fact that I consider that it has innovations, while many other note taking apps are well into the thousands of stars. Maybe I'm just delusional and all the tech that I'm doing is completely useless?
Part of the issue is probably linked to the fact that most other note taking apps focus on "help me organize my ideas so I can make more money" and often completely ignore "I want to publish my knowledge", and stuff that helps you make money is always easier to sell and promote.
OurBigBook on the other hand has a huge focus on "I want to publish my knowledge". It aims almost single-mindedly at being the best tool ever for that. However this doesn't make money for people, and therefore there are going to be way fewer potential users.
I do believe strongly that all it takes is a few users for the project to snowball. For some people, once you start braindumping, it is very addictive, and you never want to stop basically. So with only a few of those we can open large parts of undergrad knowledge to the world. But these people are few, and so far I haven't been able to find even a single one like me, and on top of that convince them that I have created the ultimate system for their knowledge publishing desires.
Another general lesson is that I should perhaps have aimed for greater compatibility with existing systems such as Obsidian. Taking something that many people already know and use can have a huge impact on acceptance. E.g. anything that touches Obsidian can reach thousands of stars: github.com/KosmosisDire/obsidian-webpage-export. Note taking apps that aim for "markdown" compatibility also tend to fare better, even if in the end you inevitably have to extend the Markdown for some of your features. And WYSIWYG, which I want but don't have, is perhaps the ultimate familiarity.
Another issue compared to other platforms is that OurBigBook just came out late. Obsidian launched in 2020. Roam Research and Trilium Notes also came earlier. And it is hard to fight the advantage already gained by those in the "I'm going to take some personal notes" area. I do believe however that there is a strong separation between "these are my personal notes" and "I want to publish these". Once you decide to publish your knowledge, you immediately start to write in a different way, and it is very hard to convert pre-existing "private" notes into ones suitable for public consumption.
OurBigBook Project Update March 2024 by
Ciro Santilli 37 Updated 2025-07-14 +Created 2025-03-08
This is a summary of the status of the OurBigBook Project, focusing on the past 9 months, which I've been able to devote fully to it starting June 2024, notably thanks to the anonymous 1000 Monero donation and other supporters.
I have 3 months left, and after that, unless some crazy person gives more money, I'll go back to some generic programming job that could be done by many other people so that my wife won't kill me. Hopefully I'll find something in quantum computing or AGI research this time that is not too boring, but we'll see.
I should also note that I have raised my requirement for a second year full time from 100k USD to 200k USD, such that there are only about 144k USD missing as of writing, a bargain. See also Section "Sponsor Ciro Santilli's work on OurBigBook.com". I have also set a 2M USD retirement goal in case someone wants to free me to lurk after university students for the rest of my life. Creepy.
The reason for this increase is partly because I'm jealous watching my university peers getting relatively richer and richer than me. More seriously though, as I'm likely going to be looking for a job soon, I don't want to scare employers off by making them think that I'm likely to leave after just a few months. Plus inflation and the natural lack of security that such an endeavour brings.
This appears to be the direct precursor project of the Common Crawl web graph official PageRank.
This section is about: wwwranking.webdatacommons.org/
Based on Common Crawl 2012, and they don't seem to be updating it regularly...
Created by the Università degli Studi di Milano.
Homepage: chen.uchicago.edu/
This section is about: www.domcop.com/openpagerank/
TODO is their source code open source?
Top 10 million websites: www.domcop.com/top-10-million-websites Can be downloaded as CSV. Contained both cirosantilli.com and OurBigBook.com as of 2025!
Get values for some websites: www.domcop.com/openpagerank/
Common Crawl web graph official PageRank by
Ciro Santilli 37 Updated 2025-07-14 +Created 2025-02-26
As of 2025 Common Crawl web graph also dumps its own PageRank for each release. See e.g. the file
cc-main-2024-25-dec-jan-feb-host-ranks.txt.gz
at: data.commoncrawl.org/projects/hyperlinkgraph/cc-main-2024-25-dec-jan-feb/index.html The first 20 rows are:
#harmonicc_pos #harmonicc_val #pr_pos #pr_val #host_rev
1 3.4626736E7 3 0.005384977821460953 com.facebook
2 3.42356E7 2 0.007010813553170503 com.googleapis.fonts
3 3.007577E7 1 0.008634952900502719 com.google
4 3.0036014E7 4 0.004411782034463272 com.googletagmanager
5 2.9900088E7 5 0.0036940035989790525 com.youtube
6 2.9537252E7 6 0.0032959808223701 com.instagram
7 2.9092556E7 9 0.0027616338842143423 com.twitter
8 2.7346152E7 7 0.0032101332824200743 com.gstatic.fonts
9 2.6818654E7 11 0.0017699438634060259 com.linkedin
10 2.5383126E7 8 0.0027849243241515574 org.gmpg
11 2.3747762E7 12 0.0016577826631867043 com.google.maps
12 2.3514198E7 15 0.0013399414238881337 com.googleapis.ajax
13 2.3504832E7 16 0.0012791339750445332 com.google.play
14 2.337092E7 47 3.794876113587071E-4 be.youtu
15 2.2925148E7 14 0.0013857916784687163 com.cloudflare.cdnjs
16 2.2851038E7 18 0.0012066313543285154 com.google.plus
17 2.2833728E7 13 0.0015745738381307273 org.wordpress
18 2.2830926E7 36 6.02400471665468E-4 com.pinterest
19 2.27056E7 45 4.001342924757244E-4 com.google.support
20 2.2687704E7 24 9.381217848819624E-4 net.jsdelivr.cdn
so quite plausible, except for org.gmpg. What the fuck is that and why is it ranked so high? Is it a quirk with the hosts inside subdomains?
Perhaps a more relevant dump might be the domain-only one, cc-main-2024-25-dec-jan-feb-domain-ranks.txt.gz:
#harmonicc_pos #harmonicc_val #pr_pos #pr_val #host_rev #n_hosts
1 3.1238044E7 3 0.01110707704411023 com.facebook 3632
2 3.0950192E7 2 0.016650558868491434 com.googleapis 3470
3 3.000803E7 1 0.01749148008448444 com.google 14053
4 2.7319046E7 5 0.00670112168785935 com.instagram 789
5 2.7020862E7 7 0.005464885844102939 com.youtube 1628
6 2.6954494E7 4 0.007740808154448889 com.googletagmanager 42
7 2.6344278E7 8 0.0052073382920908295 com.twitter 712
8 2.5414934E7 6 0.0058790483755603844 com.gstatic 171
9 2.4803688E7 11 0.0038589161241338816 com.linkedin 690
10 2.4683842E7 10 0.004929923081722034 org.gmpg 2
11 2.3575146E7 9 0.005111453489231459 com.cloudflare 951
12 2.2735678E7 14 0.002131882799792225 com.gravatar 98
13 2.2356142E7 12 0.002513741654851857 org.wordpress 1250
14 2.2132868E7 15 0.0019991529719988496 com.apple 3261
15 2.2095914E7 31 0.0010706467268355303 org.wikipedia 2099
16 2.2057972E7 21 0.0015644264715267535 com.pinterest 360
17 2.1941062E7 40 8.52391305373285E-4 be.youtu 15
18 2.1826452E7 16 0.0018442726685905964 net.jsdelivr 40
19 2.1764224E7 34 9.747994384099485E-4 gl.goo 951
20 2.1690982E7 35 9.740295347556525E-4 com.vimeo
But nope, org.gmpg is still there!
vigna.di.unimi.it/ftp/papers/GraphStructure.pdf comments on it, so it appears to be a computer-readable ontology mechanism along the lines of Resource Description Framework which interlinks many websites. The article also mentions another interesting source of noise in miibeian.gov.cn, which every Chinese website is required to link to for their ICP license.
The downside of "Katz centrality" compared to PageRank appears to be that if a big node links to many many nodes, all of those earn a lot of reputation, regardless of how many outgoing links there are.
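The difference between the two measures can be checked on a toy graph. The sketch below uses the usual textbook formulations (Katz: x_i = beta + alpha * sum of x_j over in-neighbours; PageRank: each node splits its score evenly among its outgoing links, with damping d), which are an assumption here and not necessarily what any given crawler computes. The graph, the parameter values alpha=0.1, beta=1.0, d=0.85, and the function names are all made up for illustration: two hubs A and B are equally "big" (three incoming links each), but A links to one page while B links to ten.

```python
def katz(nodes, edges, alpha=0.1, beta=1.0, iters=100):
    """Katz centrality by fixed-point iteration (no out-degree division)."""
    incoming = {v: [] for v in nodes}
    for src, dst in edges:
        incoming[dst].append(src)
    x = {v: beta for v in nodes}
    for _ in range(iters):
        x = {v: beta + alpha * sum(x[s] for s in incoming[v]) for v in nodes}
    return x


def pagerank(nodes, edges, d=0.85, iters=100):
    """PageRank by power iteration; dangling mass is spread uniformly."""
    n = len(nodes)
    out = {v: [] for v in nodes}
    for src, dst in edges:
        out[src].append(dst)
    pr = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        dangling = sum(pr[v] for v in nodes if not out[v])
        new = {v: (1 - d) / n + d * dangling / n for v in nodes}
        for src in nodes:
            for dst in out[src]:
                # Each outgoing link only carries a share of the score.
                new[dst] += d * pr[src] / len(out[src])
        pr = new
    return pr


feeders_a = [f"fa{i}" for i in range(3)]
feeders_b = [f"fb{i}" for i in range(3)]
targets_b = [f"b{i}" for i in range(10)]
nodes = feeders_a + feeders_b + ["A", "B", "a0"] + targets_b
edges = (
    [(f, "A") for f in feeders_a]
    + [(f, "B") for f in feeders_b]
    + [("A", "a0")]
    + [("B", t) for t in targets_b]
)

katz_scores = katz(nodes, edges)
pr_scores = pagerank(nodes, edges)
print("Katz     a0 vs b0:", katz_scores["a0"], katz_scores["b0"])
print("PageRank a0 vs b0:", pr_scores["a0"], pr_scores["b0"])
```

Under Katz, each of B's ten targets scores the same as A's single target, while under PageRank B's targets each receive only a tenth of what B passes on.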
University of Chicago research group by
Ciro Santilli 37 Updated 2025-07-14 +Created 2025-02-26
This is the family of algorithms to which PageRank belongs.
Open PageRank implementation and data by
Ciro Santilli 37 Updated 2025-07-14 +Created 2025-02-26
This section is about more "open" PageRank implementations, notably using either or both of:
As of 2025, the most open and reproducible implementation appears to be whatever Common Crawl web graph official PageRank does, which is to use WebGraph. It's quite beautiful.
École Polytechnique alumnus of 2009 by
Ciro Santilli 37 Updated 2025-07-14 +Created 2025-02-26
École Polytechnique alumnus of 1983 by
Ciro Santilli 37 Updated 2025-07-14 +Created 2025-02-26
Apparently in 2017 they started making their own web graphs, i.e. they parse the HTML and extract the graph of what links to what.
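To make "extract the graph of what links to what" concrete, here is a minimal sketch using only the Python standard library. It is not Common Crawl's pipeline: the LinkExtractor class, the host_edges function, and the example.com-style pages are hypothetical, and a real crawl would also have to deal with redirects, nofollow, malformed URLs, and scale.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page URL.
                    self.links.append(urljoin(self.base_url, value))


def host_edges(pages):
    """Return the set of (source_host, target_host) edges across pages."""
    edges = set()
    for url, html in pages.items():
        parser = LinkExtractor(url)
        parser.feed(html)
        src = urlparse(url).hostname
        for link in parser.links:
            dst = urlparse(link).hostname
            if dst and dst != src:  # ignore links within the same host
                edges.add((src, dst))
    return edges


# Made-up pages standing in for crawled HTML.
PAGES = {
    "https://a.example/index.html":
        '<a href="https://b.example/x">b</a> <a href="/local">self</a>',
    "https://b.example/x":
        '<a href="https://a.example/">a</a> <a href="https://c.example/">c</a>',
}

edges = host_edges(PAGES)
print(sorted(edges))
```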
Edit: actually, they already calculate PageRank for us!!! Fantastic!!! Main section: Section "Common Crawl web graph official PageRank".
A quick exploration of the graph can be seen at: github.com/cirosantilli/cirosantilli.github.io/issues/198
Their source code is at: github.com/commoncrawl/cc-webgraph
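For poking at the published rank dumps, a small parser is enough. The sketch below assumes the layout seen in the host-level ranks file (a header line starting with #, then whitespace-separated columns harmonicc_pos, harmonicc_val, pr_pos, pr_val, host_rev, with host names in reversed form such as com.google). The HostRank type and the parse_ranks and unreverse helpers are hypothetical names, and the sample rows are copied from that file rather than downloaded.

```python
from typing import NamedTuple


class HostRank(NamedTuple):
    harmonicc_pos: int
    harmonicc_val: float
    pr_pos: int
    pr_val: float
    host: str


def unreverse(host_rev):
    """Turn a reversed host such as 'com.google' back into 'google.com'."""
    return ".".join(reversed(host_rev.split(".")))


def parse_ranks(lines):
    """Yield HostRank rows, skipping blank lines and '#' header lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Only the first five columns are used; the domain-level file
        # appears to add a sixth (#n_hosts), which is ignored here.
        hc_pos, hc_val, pr_pos, pr_val, host_rev = line.split()[:5]
        yield HostRank(
            int(hc_pos), float(hc_val), int(pr_pos), float(pr_val),
            unreverse(host_rev),
        )


SAMPLE = """\
#harmonicc_pos #harmonicc_val #pr_pos #pr_val #host_rev
1 3.4626736E7 3 0.005384977821460953 com.facebook
2 3.42356E7 2 0.007010813553170503 com.googleapis.fonts
3 3.007577E7 1 0.008634952900502719 com.google
14 2.337092E7 47 3.794876113587071E-4 be.youtu
"""

ranks = list(parse_ranks(SAMPLE.splitlines()))
by_pagerank = sorted(ranks, key=lambda r: r.pr_pos)
for r in by_pagerank:
    print(r.pr_pos, r.host, r.pr_val)
```

For the real file, the same parse_ranks generator could be fed from gzip.open(path, "rt") instead of the in-memory sample, since it only needs an iterable of text lines.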