University is broken Updated +Created
You just have to spend a few minutes with students until they complain about the courses or teachers. And you just have to spend a few hours with teachers until they complain about the students or broader system.
University is broken, and everyone knows it. The only question now is finding a viable, "political cash flow positive" path, into something better.
Bibliography:
Generating test data for full text search tests Updated +Created
I've been thinking lightly about adding full text search to OurBigBook.
For example, at docs.ourbigbook.com/news/article-and-topic-id-prefix-search article search was added, but it only finds if you search something that appears right at the start of a title, e.g. for:
Fundamental theorem of calculus
you'd get a hit for:
fundamental
but not for
calculus
To do this efficiently, we need full text search, which PostgreSQL implements.
But finding a clean way to generate test data for testing out the speedup was not so easy and exploration into this led me to publishing a few new slightly improved methods where Googlers can now find them:
Finally I did a writeup summarizing PostgreSQL full text search: Section "PostgreSQL full-text search" and also dumped it at: www.reddit.com/r/PostgreSQL/comments/12yld1o/is_it_worth_using_postgres_builtin_fulltext/ for good measure.
How the tech improved Updated +Created
In any case, the outcome of that is that the tech has improved. And I have done a relatively good job of clearly publishing any "more user visible" improvements to docs.ourbigbook.com/news and social media such as though it is important to note that there have been more than one "fix a hard bug" weeks that were not published because they would just bore readers.
During this period the main focus has been on improving OurBigBook Web, i.e. the dynamic website that powers OurBigBook.com. There are two reasons for that:
  • Web is what has the OurBigBook topics feature for mind-melding, which is the killer feature of OurBigBook compared to other note taking apps and therefore deserves the highest levels of priority
    Static website generation is an indispensable escape valve that ensures that your content can be published forever even if OurBigBook.com goes down one day, which it won't as long as I live. But the innovation is Web.
  • static website generation was closer to good enough, but web was much further and is fundamentally harder.
    I'm extremely satisfied with OurBigBook static website generation and haven't touched it as much. It wasn't easy to reach this state, but I'm there.
    But Web is a different and much more complex beast.
    Making CLI software that will run on a person's local computer under full trust and building a bunch of HTML from lightweight markup in bulk is one thing.
    But making a public dynamic website that has to continuously maintain a coherent database state on granular updates, while giving users some trust but not enough for them to blow everything up is on a totally different level. See e.g. the recent SPAM attack we've had to fend off.
    Figure 1.
    Screenshot showing voting manipulated SPAM as the most highly upvoted article on OurBigBook.com
    . Source.
    And then there's also the issue of front-end being mega-hard to get right.
As a result, Web is now way less buggy and much more usable.
If you look through the list of Web updates, there is nothing specifically mind blowing. The core ideas have largely crystallized, and we are just trying to making them click. I have a few more punches up my sleeve, but the core is decided.
Figure 3.
OurBigBook Web article announcement
. Source. Another cute new feature, you can send an email to your followers about a new amazing article you created.
Web process has been somewhat slower than what I'd like. Of course, it is the case of any project that things are easily said than done. But there are two other main structural factors that have played into it:
  • I have my first baby now, and we're learning how to deal with that on the fly.
    For example, we could have put him on childcare a bit earlier, but due to inexperience we've kept him a bit longer than we maybe should have.
    Things are well sorted out now, but not matter how good your support system is, at the end of the day, and more often night, it is you the parents that have to deal with a lot of inevitable baby issues. Unless you want them to turn into psychopaths and drug addicts that is, which I don't. I've reached the point of semi failure middle age that the baby feels like my best moonshot.
    All of this sets a fundamental limit on how many hours you can work per week.
    But at least with the donations I was able to work on OurBigBook at all. Because if it weren't for that, I would have to focus entirely on the generic job instead and OurBigBook would have been put on hold.
  • the choice of Web stack. I was allured by Next.js. I can see the beauty and usefulness of a Node.js render front-end that also runs on backend and hydration. That is awesome.
    But:
    • React is insanely hard to learn and understand. Furthermore, it is also hard to understand the performance problem that it solves, and actually have a benchmark where this problem is solved faster than just delivering some HTML files with ad-hoc Js on top.
    • the lack (or perhaps excess of shitty) actual web framework like Ruby on Rails and Django means that I have to rediscover the wheel many times over for all the essential support activities like testing, login and so one
    At this point a rewrite is out of the question. I've managed to master things well enough to get a decent result, and given up on the few things that I couldn't for the life of me achieve, after documenting them very well for posterity of course.
Aside from Web, there was only one thing that received a significant improvement, and that was the OurBigBook VS Code extension. The extension is not perfect, and it is not the "final UI", which has to be some WYSIWYG implementation, and there are some fundamental limitations that cannot be overcome without patching VS Code itself. However, the extension is already extremely usable, and I'm writing this on it right now. Basics like syntax highlighting, jump to definition and autocomplete are very useful and usable.
Figure 4.
Tree navigation in the OurBigBook Visual Studio Code extension
.
Metrics and rationales Updated +Created
Long story short, the project is so far a complete failure on the most important metric: number of regular users, which current sits at exactly one: myself.
There were notable users who found the project online and who actually tried to use the website for some content and provided extremely valuable feedback:Unfortunately after the period of a few weeks they stopped using it to follow their other priorities instead. Which is of course totally fine, however sad.
I still believe that the OurBigBook Web feature is a significant tech innovation that could make the website go big.
I also believe that the project gets many fundamentals of braindumping right, notably the infinitely deep table of contents without forced scoping, e.g.:
- Mathematics
  - Calculus
does not make Calculus have an ID orr URL of mathematics/calculus, rather it's just calculus.
But there is a fundamental difficulty in reaching critical mass to that self-sustaining point, as people don't seem to be convinced by these logical "my system is better" argument alone, as opposed to having them Google into stuff they need now and then understand that the project is awesome.
A closely related critical mass issue is that existing big multiuser knowledge base websites such as Stack Overflow and Wikipedia have a tremendous advantage on PageRank. No matter how useless a Wikipedia article about something is, it will always be on top of Google within a week of creation for title hits. And since the main goal of publishing your stuff is to get it seen, it makes much more sense for writers to publish on such existing websites whenever possible, because anywhere else it is way way less likely to be seen by anybody.
Even I end up writing way more on Stack Overflow than on OurBigBook as a programmer. But I still believe that there is a value to OurBigBook, for the usual reasons of:
  • it allows you to organize a more global view of a subject, i.e. a book. Even I write answers on Stack Overflow, I also tend to organize links to these answers in a structured ways here, see e.g. big topics such as SQL
  • deletionism and overly narrowness of allowed topics/style
Perhaps what saddens me the most is that even on GitHub stars/Twitter/Hacker news terms there is almost no interest in the project despite the fact that I consider that it has innovations, while many other note taking apps as well in the thousands of stars. Maybe I'm just delusional and all the tech that I'm doing is completely useless?
Part of the issue is probably linked to the fact that most other note taking apps focus on "help me organize my ideas so I can make more money" and often completely ignore "I want to publish my knowledge", and stuff that helps you make money is always easier to sell and promote.
OurBigBook on the other hand a huge focus on "I want to publish me knowledge". It aims almost single mindedly in being the best tool ever for that. However this doesn't make money for people, and therefore there are going to be way less potential users.
I do believe strongly that all it takes is a few users for the project to snowball. For some people, once you start braindumping, it is very addictive, and you never want to stop basically. So with only a few of those we can open large parts of undergrad knowledge to the world. But these people are few, and so far I haven't been able to find even a single one like me, and on top of that convince them that I have created the ultimate system for their knowledge publishing desires.
Another general lesson is that I should perhaps aimed for greater compatibility with existing systems such as Obsidian. Taking something that many people already know and use can have a huge impact on acceptance. E.g. anything that touches Obsidian can reach thousands of stars: github.com/KosmosisDire/obsidian-webpage-export. Note taking apps that aim for "markdown" compatibility also tend to fare better, even if in the end you inevitably have to extend the Markdown for some of your features. And WYSIWYG, which I want but don't have, is perhaps the ultimate familiarity.
Another issue compared to other platforms is that OurBigBook just came out late. Obsidian launched in 2020. Roam Research and Trillium Notes also came earlier. And it is hard to fight the advantage already gained by those on the "I'm going to take some personal notes" area. I do believe however that there a strong separation between "these are my personal notes" and "I want to publish these". Once you decide to publish your knowledge, you immediately start to write in a different way, and it is very hard to convert pre-existing "private" notes into ones suitable for public consumption.
Quick fun with the Common Crawl web graph Updated +Created
I wanted to do a quick exploration of open PageRank implementation and data.
My general motivation for this is that a PageRank-like algorithm could be useful for more accurate user and article ranking on OurBigBook, see: Section "PageRank-like ranking"
But it could also be just generally cool to apply it to other graph datasets, e.g. for computing an Wikipedia internal PageRank.
A quick Google reveals only Open PageRank, but their methods are apparently closed source.
Then I had a look at the Common Crawl web graph data to see if I could easily calculate it myself, and... they already have it! See: Section "Common Crawl web graph official PageRank"
Their graph dumps are in BVGraph graph file format, which is the native format of the WebGraph framework, which implements the format and algorithms such as PageRank.
The only thing I miss is a command line interface to calculate the PageRank. That would be so awesome.
The more I look at it the more I love Common Crawl.
In cc-main-2024-25-dec-jan-feb-domain-ranks.txt:
  • cirosantilli.com was ranked ~453k
  • ourbigbook.com was at ~606k