You only have to spend a few minutes with students before they start complaining about the courses or the teachers. And you only have to spend a few hours with teachers before they start complaining about the students or the broader system.
University is broken, and everyone knows it. The only question now is finding a viable, "political cash flow positive" path into something better.
Bibliography:
- academeblog.org/2022/10/05/american-universities-are-going-to-implode/ American Universities Are Going to Implode (2022) by www.linkedin.com/in/jsgabin/
- theimaginativeconservative.org/2020/04/higher-education-implode-alexander-zubatov.html Higher Education Is About to Implode by Alexander Zubatov (2020)
- 2024-05-28 www.theguardian.com/education/article/2024/may/28/i-see-little-point-uk-university-students-on-why-attendance-has-plummeted ‘I see little point’: UK university students on why attendance has plummeted. Give it to me, baby. OurBigBook incoming.
"There’s a bit of a feeling that there is just box-ticking going on [among students], and getting a degree at the end of it."
For example, at docs.ourbigbook.com/news/article-and-topic-id-prefix-search article search was added, but it only finds matches when your query appears right at the start of a title. E.g. for the article:
Fundamental theorem of calculus
you'd get a hit for:
fundamental
but not for:
calculus
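To make that limitation concrete, here is a minimal sketch of what a title-prefix-only lookup boils down to in SQL. The article table and column names are hypothetical, for illustration only, and not the actual OurBigBook schema:
# Prefix-only matching: only queries that match the start of the title hit.
psql mydb -c "SELECT title FROM article WHERE title ILIKE 'fundamental%'"   # hit
psql mydb -c "SELECT title FROM article WHERE title ILIKE 'calculus%'"      # no hit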
But finding a clean way to generate test data for benchmarking the speedup was not so easy, and exploring this led me to publish a few new, slightly improved methods where Googlers can now find them:
- unix.stackexchange.com/questions/97160/is-there-something-like-a-lorem-ipsum-generator/787733#787733 I propose a neat random "sentence" generator using common CLI tools like grep and sed and the pre-installed Ubuntu dictionary /usr/share/dict/american-english:
grep -v "'" /usr/share/dict/american-english | shuf -r | paste -d ' ' $(printf "%4s" | sed 's/ /- /g') | sed -e 's/^\(.\)/\U\1/;s/$/./' | head -n10000000 > lorem.txt
- to achieve that, I also proposed two superior "join every N lines" methods for the CLI: stackoverflow.com/questions/25973140/joining-every-group-of-n-lines-into-one-with-bash/79257780#79257780, notably this awk poem:
seq 10 | awk '{ printf("%s%s", NR == 1 ? "" : NR % 3 == 1 ? "\n" : " ", $0 ) } END { printf("\n") }'
- stackoverflow.com/questions/3371503/sql-populate-table-with-random-data/79255281#79255281 I propose:
- a clean PostgreSQL random string stored procedure that picks random characters from an allowed character list
CREATE OR REPLACE FUNCTION random_string(int) RETURNS TEXT as $$
  select string_agg(substr(characters, (random() * length(characters) + 1)::integer, 1), '') as random_word
  from (values('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789- ')) as symbols(characters)
  join generate_series(1, $1) on 1 = 1
$$ language sql;
- first generating the data as CSV, and then importing the CSV into PostgreSQL, as a more flexible method. This can also be done in a streaming fashion from stdin, which is neat.
python generate_data.py 10 | psql mydb -c '\copy "mytable" FROM STDIN'
- stackoverflow.com/questions/16020164/psqlexception-error-syntax-error-in-tsquery/79437030#79437030 regarding the safe generation of prefix search tsquery from user inputs without query errors, I've learned about websearch_to_tsquery and further highlighted a possible tsquery -> text -> tsquery approach that might be correct for prefix searches (see the sketch right after this list)
- stackoverflow.com/questions/67438575/fulltext-search-using-sequelize-postgres/79439253#79439253 I put everything together into a minimal Sequelize example, ready for usage in OurBigBook
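Here is a minimal sketch of that tsquery -> text -> tsquery prefix idea, runnable through psql. The 'english' configuration and the single-word query are assumptions for illustration, not necessarily what the final OurBigBook implementation does:
# websearch_to_tsquery accepts arbitrary user input without throwing syntax errors, unlike to_tsquery.
psql mydb -c "SELECT websearch_to_tsquery('english', 'fundamental theorem')"
# Round-trip through text to append the :* prefix marker (assumes a single lexeme in the query):
psql mydb -c "SELECT (websearch_to_tsquery('english', 'calc')::text || ':*')::tsquery"
# The resulting prefix query matches lexemes that start with 'calc':
psql mydb -c "SELECT to_tsvector('english', 'Fundamental theorem of calculus') @@ (websearch_to_tsquery('english', 'calc')::text || ':*')::tsquery"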
Finally I did a writeup summarizing PostgreSQL full text search: Section "PostgreSQL full-text search" and also dumped it at: www.reddit.com/r/PostgreSQL/comments/12yld1o/is_it_worth_using_postgres_builtin_fulltext/ for good measure.
In any case, the outcome of that is that the tech has improved. And I have done a relatively good job of clearly publishing the "more user visible" improvements to docs.ourbigbook.com/news and social media, though it is important to note that there has been more than one "fix a hard bug" week that was not published because it would just bore readers.
During this period the main focus has been on improving OurBigBook Web, i.e. the dynamic website that powers OurBigBook.com. As a result, Web is now way less buggy and much more usable. There are two reasons for that focus:
- Web is what has the OurBigBook topics feature for mind-melding, which is the killer feature of OurBigBook compared to other note taking apps and therefore deserves the highest levels of priority. Static website generation is an indispensable escape valve that ensures that your content can be published forever even if OurBigBook.com goes down one day, which it won't as long as I live. But the innovation is Web.
- static website generation was closer to good enough, but Web was much further from it and is fundamentally harder. I'm extremely satisfied with OurBigBook static website generation and haven't touched it as much. It wasn't easy to reach this state, but I'm there. But Web is a different and much more complex beast. Making CLI software that runs on a person's local computer under full trust and builds a bunch of HTML from lightweight markup in bulk is one thing. But making a public dynamic website that has to continuously maintain a coherent database state under granular updates, while giving users some trust, but not enough for them to blow everything up, is on a totally different level. See e.g. the recent spam attack we've had to fend off. And then there's also the issue of the front-end being mega-hard to get right.
If you look through the list of Web updates, there is nothing specifically mind blowing. The core ideas have largely crystallized, and we are just trying to make them click. I have a few more punches up my sleeve, but the core is decided.
OurBigBook Web search. Source. This is one of the many basic quality of life improvements that have been done on OurBigBook Web.
OurBigBook Web article announcement. Source. Another cute new feature: you can send an email to your followers about an amazing new article you created.
Progress on Web has been somewhat slower than I'd like. Of course, it is the case with any project that things are more easily said than done. But there are two other main structural factors that have played into it:
- the baby. For example, we could have put him in childcare a bit earlier, but due to inexperience we've kept him at home a bit longer than we maybe should have. Things are well sorted out now, but no matter how good your support system is, at the end of the day, and more often the night, it is you, the parents, who have to deal with a lot of inevitable baby issues. Unless you want them to turn into psychopaths and drug addicts that is, which I don't. I've reached the point of semi-failure middle age where the baby feels like my best moonshot. But at least with the donations I was able to work on OurBigBook at all. Because if it weren't for that, I would have had to focus entirely on the generic job instead and OurBigBook would have been put on hold.
- the choice of Web stack. I was lured in by Next.js. I can see the beauty and usefulness of a Node.js front-end that also renders on the backend, plus hydration. That is awesome. But:
- React is insanely hard to learn and understand. Furthermore, it is also hard to understand the performance problem that it solves, and to actually produce a benchmark where React solves that problem faster than just delivering some HTML files with ad hoc JS on top.
- the lack of (or perhaps the excess of shitty) actual web frameworks like Ruby on Rails and Django means that I have to reinvent the wheel many times over for all the essential support activities like testing, login and so on
At this point a rewrite is out of the question. I've managed to master things well enough to get a decent result, and given up on the few things that I couldn't for the life of me achieve, after documenting them very well for posterity of course.
Aside from Web, there was only one thing that received a significant improvement, and that was the OurBigBook VS Code extension. The extension is not perfect, and it is not the "final UI", which has to be some WYSIWYG implementation, and there are some fundamental limitations that cannot be overcome without patching VS Code itself. However, the extension is already extremely usable, and I'm writing this on it right now. Basics like syntax highlighting, jump to definition and autocomplete are very useful and usable.
Long story short, the project is so far a complete failure on the most important metric: number of regular users, which currently sits at exactly one: myself.
There were a few notable users who found the project online, actually tried to use the website for some content, and provided extremely valuable feedback. Unfortunately, after a period of a few weeks they stopped using it to follow their other priorities instead. Which is of course totally fine, however sad.
I still believe that the OurBigBook Web feature is a significant tech innovation that could make the website go big.
I also believe that the project gets many fundamentals of braindumping right, notably the infinitely deep table of contents without forced scoping, e.g. having:
- Mathematics
  - Calculus
does not make Calculus have an ID or URL of mathematics/calculus, rather it's just calculus.
But there is a fundamental difficulty in reaching critical mass to that self-sustaining point, as people don't seem to be convinced by logical "my system is better" arguments alone, as opposed to Googling into stuff they need right now and then realizing that the project is awesome.
A closely related critical mass issue is that existing big multiuser knowledge base websites such as Stack Overflow and Wikipedia have a tremendous advantage on PageRank. No matter how useless a Wikipedia article about something is, it will always be on top of Google within a week of creation for title hits. And since the main goal of publishing your stuff is to get it seen, it makes much more sense for writers to publish on such existing websites whenever possible, because anywhere else it is way way less likely to be seen by anybody.
Even I end up writing way more on Stack Overflow than on OurBigBook as a programmer. But I still believe that there is value in OurBigBook, for the usual reasons of:
Perhaps what saddens me the most is that even in GitHub stars/Twitter/Hacker News terms there is almost no interest in the project, despite the fact that I consider that it has innovations, while many other note taking apps are well into the thousands of stars. Maybe I'm just delusional and all the tech that I'm doing is completely useless?
Part of the issue is probably linked to the fact that most other note taking apps focus on "help me organize my ideas so I can make more money" and often completely ignore "I want to publish my knowledge", and stuff that helps you make money is always easier to sell and promote.
OurBigBook, on the other hand, has a huge focus on "I want to publish my knowledge". It aims almost single-mindedly at being the best tool ever for that. However, this doesn't make money for people, and therefore there are going to be way fewer potential users.
I do strongly believe that all it takes is a few users for the project to snowball. For some people, once you start braindumping, it is very addictive, and you basically never want to stop. So with only a few of those we can open large parts of undergrad knowledge to the world. But these people are few, and so far I haven't been able to find even a single one like me, let alone convince them that I have created the ultimate system for their knowledge publishing desires.
Another general lesson is that I should perhaps have aimed for greater compatibility with existing systems such as Obsidian. Taking something that many people already know and use can have a huge impact on acceptance. E.g. anything that touches Obsidian can reach thousands of stars: github.com/KosmosisDire/obsidian-webpage-export. Note taking apps that aim for "markdown" compatibility also tend to fare better, even if in the end you inevitably have to extend Markdown for some of your features. And WYSIWYG, which I want but don't have, is perhaps the ultimate familiarity.
Another issue compared to other platforms is that OurBigBook just came out late. Obsidian launched in 2020. Roam Research and Trilium Notes also came earlier. And it is hard to fight the advantage already gained by those in the "I'm going to take some personal notes" area. I do believe however that there is a strong separation between "these are my personal notes" and "I want to publish these". Once you decide to publish your knowledge, you immediately start to write in a different way, and it is very hard to convert pre-existing "private" notes into ones suitable for public consumption.
OurBigBook job search round 2025. Created 2025-05-07, updated 2025-05-09.
I shouldn't be doing this on funded OurBigBook time which is until the end of May, but I was getting too nervous and decided to start a casual job search to test the waters.
In particular I want to see if I can get past the HR lady step without toning down my online profiles. Another interesting point is to see if French companies are more likely to reply, given that Ciro Santilli studied at École Polytechnique, which the French worship. If nothing works out, for the next round I'll be hiding anything too spicy like:
- prominently seeking funding for OurBigBook on my LinkedIn profile
- CIA 2010 covert communication websites references. This will be my first job hunt since I have published that article. Wish me luck.
- gay Putin profile picture on Stack Overflow
Gay Putin, currently used in Ciro Santilli's Stack Overflow profile. Ciro's profiles may be a bit too much for the HR ladies who reject his job applications on the spot. To be fair, perhaps not enough years of experience for certain applications and job hopping may have something to do with it too. But since they never tell you anything, so as not to get sued, we'll never know.
I'm looking in particular for:
- machine learning-adjacent jobs in companies that seem to be doing something that could further AGI, e.g. automatic code generation or robotics would be ideal
- quantum computing
- systems programming, which is what I actually have work experience with
I spent the last two weeks doing that:
- one week browsing everything of interest in London and Paris and sending applications to anything that seemed both relevant and interesting. Maintaining an application list at: Section "Job application by Ciro Santilli".
- one week on a very laborious but somewhat interesting take-home exercise for a Linux kernel engineer position at Canonical, makers of Ubuntu. I had a week to finish 5 practical coding and packaging questions, and I tried to do everything as perfectly as possible, but I somewhat underestimated the amount of work and waiting needed to do everything, and didn't manage to finish question 4 and missed 5. Oops, let's see how that goes. At least this had a few good outcomes for the Internet, as I tried to document things as nicely as I could where they were missing from Google, as usual:
- I re-tested Linux Kernel Module Cheat and made some small improvements. Things still worked from an Ubuntu 24.10 host (using Docker to run Ubuntu 22.04), and I also checked that kernel 6.8 builds and GDB step debugs after adding the newly required config CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT, which I also mentioned at: Why are there no debug symbols in my vmlinux when using gdb with /proc/kcore?
- I contributed some simple updates to github.com/martinezjavier/ldd3 getting it closer to working on Linux kernel v6.8. That repository aims to keep the venerable examples from the Linux kernel module book LDD3 alive on newer kernels, and is a very good source for kernel module developers.
- How to compile a Linux kernel module?: wrote a quick Ciro-approved tutorial (a minimal build sketch follows right after this list)
- Dynamic array in Linux kernel module: I gave an educational example of a dynamic byte array (like std::string) using the kvmalloc family of allocators
- quickemu: this is a good emulator manager and I think I'll be using it for Ubuntu images when needed from now on. I wrote:
- How to run Ubuntu desktop on QEMU?: an introductory tutorial to the software, as their README is not that good, as is often the case. It's hard for project authors to predict what new users do and don't want. This is my second answer to this question, the previous one focusing on a more manual approach without third party helpers.
- How to share folder between guest/host? (Quickemu): I explained how to set up a 9p mount to share a directory between guest and host
- Error :: You must put some 'source' URIs in your sources.list: updated this answer for Ubuntu 24.04. This issue comes up when you want to do either of:
sudo apt build-dep
sudo apt source
which don't work by default, and my answer explains how to do it from the GUI and the CLI (a CLI sketch also follows after this list). The CLI method is especially important for Docker images. Since Ubuntu doesn't offer a stable CLI method for this, the method breaks from time to time and we have to find the new config file to edit.
- What is hardware enablement (HWE)?: I learned a bit better how Ubuntu structures its kernel releases for each Ubuntu release
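As a taste of what the module compilation tutorial covers, here is a minimal out-of-tree build sketch against the running kernel's headers. The hello.c / hello.ko names are placeholders, and a one-line kbuild Makefile is assumed:
# Assumes a directory with hello.c and a Makefile containing just: obj-m += hello.o
make -C /lib/modules/$(uname -r)/build M=$PWD modules   # build hello.ko
sudo insmod hello.ko                                    # load the module
sudo dmesg | tail                                       # check its printk output
sudo rmmod hello                                        # unload it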
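And here is a sketch of the CLI fix for the missing source URIs as of Ubuntu 24.04, where apt sources moved to the Deb822 file /etc/apt/sources.list.d/ubuntu.sources. The exact file and field layout are precisely the part that keeps changing between releases, so treat this as illustrative:
# Enable deb-src entries in the Deb822-style sources file and refresh the package lists.
sudo sed -i 's/^Types: deb$/Types: deb deb-src/' /etc/apt/sources.list.d/ubuntu.sources
sudo apt update
# Now source-related operations work, e.g.:
apt source coreutils
sudo apt build-dep coreutils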
Some of the main issues I had were:
- compiling the Linux kernel for Ubuntu is extremely slow. I was used to compiling for embedded systems with Buildroot, which finishes in minutes, but for Ubuntu it takes hours, presumably because they enable as many drivers as possible to make a single ISO work on as many different computers as possible, which makes sense, but also makes development harder
- my QEMU setup for Ubuntu was not quite as streamlined, so I relearned a few things and set up quickemu. By chance I had recently come across quickemu for testing OurBigBook on macOS, but I still had to learn a bit about how to set it up reasonably
github.com/cirosantilli/cirosantilli.github.io/issues/198. Previously at: stackoverflow.com/questions/31321009/best-more-standard-graph-representation-file-format-graphson-gexf-graphml/79467334#79467334 but Stack Overflow fucking deleted the question.
My general motivation for this is that a PageRank-like algorithm could be useful for more accurate user and article ranking on OurBigBook, see: Section "PageRank-like ranking"
But it could also be just generally cool to apply it to other graph datasets, e.g. for computing a Wikipedia-internal PageRank.
Then I had a look at the Common Crawl web graph data to see if I could easily calculate it myself, and... they already have it! See: Section "Common Crawl web graph official PageRank"
Their graph dumps are in the BVGraph file format, the native format of the WebGraph framework, which implements both the format and algorithms such as PageRank.
The only thing I miss is a command line interface to calculate the PageRank. That would be so awesome.
Announcements:
In cc-main-2024-25-dec-jan-feb-domain-ranks.txt:
- cirosantilli.com was ranked ~453k
- ourbigbook.com was at ~606k