Video 1.
10k GitHub Stars by BookStack (2022)
Source. Answering an AMA unfortunately :-) But some OK small bits of information trickled through.
The dominant meme database as of 2020.
As of 2022 visible at: www.nature.com/scitable
Apparently they had a separate URL at just scitable.com, so they were somewhat serious about it before shutting it down.
As of 2022 marked:
This page has been archived and is no longer updated
RIP.
www.nature.com/scitable/blog/student-voices/ has last entry 2015, so presumably that's the shutdown year.
Self description:
Using our platform, you can customize your own eBooks for your students. Create an online classroom. Contribute and share content and connect with networks of colleagues.
so quite related to OurBigBook.com.
But the non-paid plan currently disables "Search engine indexing" of that sharing, so it's useless. There's an "Allow duplicate as template" button though, which is nice.
URLs are horrendous however, e.g.: lofty-flower-be4.notion.site/aa-2274c59a06124d5b974b781a67340670 Only the aa in that came from us. They don't even have the guts for a fixed subdomain.
Also it does not work without JavaScript, no SSR, everything is dynamic.
They don't show multiple input pages on the same render, e.g.: lofty-flower-be4.notion.site/aa-2274c59a06124d5b974b781a67340670 does not contain the child lofty-flower-be4.notion.site/bb-45df7212a2e14e04b3f9604035c7acf4 as already implemented on OurBigBook Web Dynamic Article Tree.
Cross page links work fine. But you don't link to explicit IDs, only internal hidden IDs. This can be even slightly confusing to users, as multiple identical options can show up when you start creating a link. They do try to disambiguate with the parent page however.
So this is a reasonable single-person publishing platform for your notes.
Someone made and sold a helper for it:
Tree based organization at last. Infinitely deep.
Amazing WYSIWYG, including maths and tables, plus insane plugins like canvas mode, and specific file formats like code/mermaid diagrams/drawing mode.
Version history with automatic snapshots at intervals. TODO how is it implemented? Do they just ZIP multiple versions?
No multiuser features. Except for that, it could have been a good starting point for an online multiuser thing such as OurBigBook.com!
With Book Notes it is possible to see more than one page at a time on the output, which is a major feature of OurBigBook. But does it show on HTML export as well?
You can static HTML export any subtree by right clicking on it in the navigation tree.
Is there a CLI to export to HTML? github.com/zadam/trilium/issues/3012
HTML export keeps all data, as HTML is their native format. This may be inherited from CKEditor. The files are mostly visible, but there is some CSS missing, so it is not 100% like the editor; notably math is broken. There is also a hosted way of exposing: github.com/zadam/trilium/wiki/Sharing.
trilium.rocks however has a very good export, it is just a question of how much they had to hack things, source at: github.com/zerebos/trilium.rocks
The default HTML export uses frame navigation, with a ToC fixed on the left frame. Efficient, but not of this century.
There is no concept of user created unique text IDs: you can have the same headers in the same folders in the UI. It's not even a matter of scopes. On exports they are differentiated as 1_name, 2_name, etc.
./Trilium Demo/Books/To read/1_HR.md
./Trilium Demo/Books/To read/2_HR.md
./Trilium Demo/Books/To read/HR.md
Markdown export warns:
this preserves most of the formatting
Architecture: runs on a local SQLite database via better-sqlite3. Data is apparently stored in an SQLite database under ~/.local/share/trilium-data, no raw files.
Markup is stored as HTML as seen from: sqlite3 document.db 'SELECT * from note_contents'. HTML is their native storage format, quite interesting. But this means it is not source centric, so any source editing would have to go via import/export. It can be done apparently: github.com/zadam/trilium/wiki/Markdown but involves shoving a ZIP around.
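A minimal sketch of poking at that storage, assuming the default data directory (the schema may of course change between releases):
cd ~/.local/share/trilium-data
# List every table in Trilium's single SQLite database.
sqlite3 document.db '.tables'
# Peek at a few notes: their markup is raw HTML.
sqlite3 document.db 'SELECT * FROM note_contents LIMIT 3'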
WYSIWYG based on ckeditor.com/ which is a dependency. It is kind of cool that the view in which you view the output is exactly the same as the one you edit in, and there is no intermediate format, just the HTML.
Math is KaTeX based.
It also runs on the browser via a server: github.com/zadam/trilium/wiki/Server-installation. And they have a paid service for it at: trilium.cc/. Quite impressive.
They have server to/from desktop sync: github.com/zadam/trilium/wiki/synchronization. There is no conflict resolution, one of them wins randomly. But they have revision history, so anything lost will be in there. They have so many features it is mind blowing.
In January 2024 the maintainer announced he would be slowing down development: github.com/zadam/trilium/issues/4620?ref=selfh.st
Why Wikipedia sucks: Section "Wikipedia".
The most important page of Wikipedia is undoubtedly: en.wikipedia.org/wiki/Wikipedia:Reliable_sources/Perennial_sources, which lists the accepted and non-accepted sources. Basically, it is the decision of what is true in this world.
Wikipedia is incredibly picky about copyright. E.g.: en.wikipedia.org/wiki/Wikipedia:Deletion_of_all_fair_use_images_of_living_people because "such portrait could be created". Yes, with a time machine, no problem! This does more harm than good... excessive!
Citing in Wikipedia is painful, partly because they have a billion different templates that you have to navigate. They should really have a system where you can easily reuse existing sources across articles! Section "How to use a single source multiple times in a Wikipedia article?"
Video 1.
What Happened To Wikipedia's Founders?
Source.
Video 2.
Inside the Wikimedia Foundation offices by Wikimedia Foundation (2008)
Source.
WikiFauna refers to a classification of different Wiki contributor stereotypes. Some of them originate from the venerable C2 wiki.
Video 1.
What Mental Breakdown Of a Wikipedia Moderator looks like by Vince Vintage
Source.
Some examples by Ciro Santilli follow.
Of the tutorial-subjectivity type:
Notability constraints, which are way too strict:
  • even information about important companies can be disputed. E.g. once Ciro Santilli tried to create a page for PsiQuantum, a startup with $650m in funding, and there was a deletion proposal because it did not contain verifiable sources independent of the company itself: en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/PsiQuantum Although this argument is correct, it is also true of about 90% of everything that is on Wikipedia about any company. Where else can you get any information about a B2B company? Their clients are not going to say anything. Lawsuits and scandals are kind of the only possible source... In that case, the page was deleted with 2 votes against vs 3 votes for deletion.
    The question "should we delete this extremely likely useful/correct content or not according to this extremely complex system of guidelines?" is very similar to Stack Exchange's own Stack Overflow content deletion issues. Ain't Nobody Got Time For That. "Ain't Nobody Got Time for That" actually has a Wiki page: en.wikipedia.org/wiki/Ain%27t_Nobody_Got_Time_for_That. That's notable. Unlike a $600M+ company of course.
    In December 2023 the page was re-created and seemed to stick: en.wikipedia.org/wiki/Talk:PsiQuantum#Secondary_sources It's just random back and forth. Author Ctjk has an interesting background:
    I am a legal official at a major government antitrust agency. The only plausible connection is we regulate tech firms
There are even Wikis that were created to remove notability constraints: Wiki without notability requirements.
These are the reasons why Ciro basically only contributes images to Wikipedia: they are either all in or all out, and you can determine which one it is. And this allows images to be more attributable, so people can actually see that it was Ciro that created a given amazing image, thus overcoming Wikipedia's lack of a reputation system a little bit as well.
Wikipedia is perfect for things like biographies, geography, or history, which have a much more defined and objective expository order. But when it comes to "tutorials of how to actually do stuff", which is what mathematics and physics are basically about, Wikipedia has a very hard time going beyond dry definitions which are only useful for people who already half know the stuff. But to learn from zero, newbies need tutorials with intuition and examples.
Bibliography:
Per-table dumps are created with mysqldump and listed at: dumps.wikimedia.org/. Most notably, for the English Wikipedia: dumps.wikimedia.org/enwiki/latest/
A few of the files are not actual tables but derived data, notably dumps.wikimedia.org/enwiki/latest/enwiki-latest-all-titles-in-ns0.gz from Download titles of all Wikipedia articles
The tables are "documented" under: www.mediawiki.org/wiki/Manual:Database_layout, e.g. the central "page" table: www.mediawiki.org/wiki/Manual:Page_table. But in many cases it is impossible to deduce what the fields are from those docs.
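A minimal sketch of fetching the dumps used below, assuming the current file naming scheme:
# Download a few of the per-table dumps of the English Wikipedia.
for t in category page categorylinks; do
  wget "https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-$t.sql.gz"
done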
dumps.wikimedia.org/enwiki/latest/enwiki-latest-category.sql.gz contains a list of categories. It only contains the categories and some counts, but it doesn't contain the subcategories and pages under each category, so it is a bit pointless.
The SQL first defines the table:
CREATE TABLE `category` (
  `cat_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `cat_title` varbinary(255) NOT NULL DEFAULT '',
  `cat_pages` int(11) NOT NULL DEFAULT 0,
  `cat_subcats` int(11) NOT NULL DEFAULT 0,
  `cat_files` int(11) NOT NULL DEFAULT 0,
  PRIMARY KEY (`cat_id`),
  UNIQUE KEY `cat_title` (`cat_title`),
  KEY `cat_pages` (`cat_pages`)
) ENGINE=InnoDB AUTO_INCREMENT=249228235 DEFAULT CHARSET=binary ROW_FORMAT=COMPRESSED;
followed by a few humongous inserts:
INSERT INTO `category` VALUES (2,'Unprintworthy_redirects',1597224,20,0),(3,'Computer_storage_devices',88,11,0)
which we can see at: en.wikipedia.org/wiki/Category:Computer_storage_devices
We also see that en.wikipedia.org/wiki/Category:Computer_storage_devices_by_company has its own row, so the table really contains only the categories themselves.
We can check this with:
sed -s 's/),/\n/g' enwiki-latest-category.sql | grep Computer_storage_devices
and it shows:
(3,'Computer_storage_devices',88,11,0
(521773,'Computer_storage_devices_by_company',6,6,0
Therefore, there doesn't seem to be any interlinking between the categories in this table, only page and subcategory counts.
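To run the queries below, the dumps first have to be imported into a local database; a minimal sketch, assuming a database named enwiki and the files downloaded as above (a full English Wikipedia import takes a long time and a lot of disk):
mysql -e 'CREATE DATABASE enwiki'
# Each dump is a regular mysqldump file, so it can be piped straight in.
for t in page categorylinks; do
  zcat "enwiki-latest-$t.sql.gz" | mysql enwiki
done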
Let's observe the relevant page table rows in MySQL:
mysql enwiki -e "select page_id, page_namespace, page_title, page_is_redirect from page where page_namespace in (0, 14) and page_title in ('Computer_storage_devices', 'Computer_data_storage')"
outputs:
+----------+----------------+--------------------------+------------------+
| page_id  | page_namespace | page_title               | page_is_redirect |
+----------+----------------+--------------------------+------------------+
|     5300 |              0 | Computer_data_storage    |                0 |
| 42371130 |              0 | Computer_storage_devices |                1 |
|   711721 |             14 | Computer_data_storage    |                0 |
|   895945 |             14 | Computer_storage_devices |                0 |
+----------+----------------+--------------------------+------------------+
mysql enwiki -e "select cl_from, cl_to from categorylinks where cl_from in (5300, 711721, 895945, 42371130)"
gives:
+----------+-----------------------------------------------------------------------+
| cl_from  | cl_to                                                                 |
+----------+-----------------------------------------------------------------------+
|     5300 | All_articles_containing_potentially_dated_statements                  |
|     5300 | Articles_containing_potentially_dated_statements_from_2009            |
|     5300 | Articles_containing_potentially_dated_statements_from_2011            |
|     5300 | Articles_with_GND_identifiers                                         |
|     5300 | Articles_with_NKC_identifiers                                         |
|     5300 | Articles_with_short_description                                       |
|     5300 | Computer_architecture                                                 |
|     5300 | Computer_data_storage                                                 |
|     5300 | Short_description_matches_Wikidata                                    |
|     5300 | Use_dmy_dates_from_June_2020                                          |
|     5300 | Wikipedia_articles_incorporating_text_from_the_Federal_Standard_1037C |
|   711721 | Computer_architecture                                                 |
|   711721 | Computer_data                                                         |
|   711721 | Computer_hardware_by_type                                             |
|   711721 | Data_storage                                                          |
|   895945 | Computer_data_storage                                                 |
|   895945 | Computer_peripherals                                                  |
|   895945 | Recording_devices                                                     |
| 42371130 | Redirects_from_alternative_names                                      |
+----------+-----------------------------------------------------------------------+
So we see that cl_from encodes the parent categories:
  • parent categories of categories:
    • en.wikipedia.org/wiki/Category:Computer_data_storage, which has ID 711721, has parent categories: "Computer hardware by type", "Computer data", "Data storage", "Computer architecture". This matches exactly on the database. These are all encoded on the source code of the page:
      {{DEFAULTSORT:Storage}}
      [[Category:Computer hardware by type]]
      [[Category:Computer data|Storage]]
      [[Category:Data storage|Computer]]
      [[Category:Computer architecture]]
    • en.wikipedia.org/wiki/Category:Computer_storage_devices has parent categories: "Computer data storage", "Recording devices", "Computer peripherals". This matches exactly on the database.
  • parent categories of pages:
    • en.wikipedia.org/wiki/Computer_storage_devices, which is a redirect, gets the magic category "Redirects_from_alternative_names", a humongous placeholder with many thousands of pages: en.wikipedia.org/wiki/Category:Redirects_from_alternative_names
    • en.wikipedia.org/wiki/Computer_data_storage shows only two categories on the web UI: "Computer data storage" and "Computer architecture". Both of these are present in the database and at the end of the source code:
      {{DEFAULTSORT:Computer Data Storage}}
      [[Category:Computer data storage| ]]
      [[Category:Computer architecture]]
      The others appear to be more magic. Two of them we can guess from the templates:
      {{short description|Storage of digital data readable by computers}}
      {{Use dmy dates|date=June 2020}}
      are likely Use_dmy_dates_from_June_2020 and Articles_with_short_description but the rest is more magic and not necessarily present in-source.
So, to find all articles and categories under a given category title, say en.wikipedia.org/wiki/Category:Mathematics, we can run:
mariadb enwiki -e "select cl_from, cl_to, page_namespace, page_title from categorylinks inner join page on page_namespace in (0, 14) and cl_from = page_id and cl_to = 'Mathematics'"
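That query only returns the direct children however. A hedged sketch of a breadth-first walk over categorylinks to enumerate the entire subtree (illustrative only, e.g. it does not escape quotes in category titles, and articles reachable via multiple subcategories are printed multiple times):
#!/usr/bin/env bash
# Breadth-first walk of the category tree under "Mathematics" using only
# the page and categorylinks tables imported from the dumps.
todo=(Mathematics)
declare -A seen
while [ "${#todo[@]}" -gt 0 ]; do
  category="${todo[0]}"
  todo=("${todo[@]:1}")
  [ -n "${seen[$category]:-}" ] && continue
  seen["$category"]=1
  # Namespace 14 rows are subcategories to recurse into,
  # namespace 0 rows are the articles we print.
  while IFS=$'\t' read -r ns title; do
    if [ "$ns" = 14 ]; then
      todo+=("$title")
    else
      echo "$title"
    fi
  done < <(mysql -N -B enwiki -e "
    select page_namespace, page_title from categorylinks
    inner join page on cl_from = page_id
    where page_namespace in (0, 14) and cl_to = '$category'")
done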
Definition, anywhere in the article, ideally at the first usage:
<ref name="myname">{{cite web ...}}</ref>
And then you can use it later on as:
<ref name="myname" />
which automatically expands to the exact same thing, or using the shortcut:
{{r|myname}}
To cite multiple pages of a book: en.wikipedia.org/wiki/Wikipedia:Citing_sources#Citing_multiple_pages_of_the_same_source, the best method is to define and use the reference without adding the page or location to the cite, as:
<ref name="googleStory">{{cite book |title=The Google Story}}</ref>{{rp|p=123}}
Do not set the page in cite, otherwise it shows up on the references. Instead we use the {{rp}} template. And then use the reference with the {{r}} template as:
{{r|googleStory|p=456}}
or for multiple pages:
{{r|googleStory|pp=123, 156-158}}
A good big sample definition:
<ref name="googleStory">{{cite book |last1=Vise |first1=David |author-link1=David A. Vise |last2=Malseed |first2=Mark |author-link2=Mark Malseed |title=The Google Story |date=2008 |publisher=Delacorte Press |url=https://archive.org/details/isbn_9780385342728}}</ref>
There is also title-link to link to a wiki page. But it is incompatible with url= for Internet Archive Open Library links, which is a shame.
So, it turns out that Wikipedia does have an (ultra obscure, as usual) mechanism for pull requests. You learn a new one every day.
OMG they have that. Slight overlap with OurBigBook.com.
A 2022 clone of phabricator.wikimedia.org/source/mediawiki.git gives first commits from 2003 (see the sketch after this list) by:
  • Lee Daniel Crocker: en.wikipedia.org/wiki/Lee_Daniel_Crocker
    He is best known for rewriting the software upon which Wikipedia runs, to address scalability problems.
    so that gives a good notion of the last major rewrite.
  • Brion Vibber
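A minimal sketch of reproducing that check from a local clone:
git clone https://phabricator.wikimedia.org/source/mediawiki.git
cd mediawiki
# Oldest commits first: author name and author date of the first ten.
git log --reverse --format='%an %ad' | head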
TODO: when was Wikipedia open sourced from Nupedia? The early days of Wikipedia are quite obscure due to its transition from Nupedia.
Cool tool that allows you to graphically visualize page view counts of specific pages. It offers somewhat similar insights to Google Trends.
The homepage shows views of selected pages, e.g. when Google had their 25th birthday: pageviews.wmcloud.org/?project=en.wikipedia.org&platform=all-access&agent=user&redirects=0&start=2023-09-11&end=2023-10-01&pages=Cat|Dog|Larry_Page Larry Page briefly beat "Cat" and "Dog".
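The underlying numbers can also be fetched programmatically through the Wikimedia pageviews REST API that backs such tools; a minimal sketch (endpoint shape from memory, double check it against the official API docs):
# Daily views of the "Larry_Page" article around Google's 25th birthday.
curl -s 'https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/user/Larry_Page/daily/20230911/20231001' |
  python3 -m json.tool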
/topviews shows the most viewed pages for a given month: pageviews.wmcloud.org/topviews/?project=en.wikipedia.org&platform=all-access&date=2023-08&excludes= It is extremely epic that XXX: Return of Xander Cage, a 2017 film, was in the top ten for August 2023. The page was around 8th place on a Google search for "xxx" at the time: archive.ph/wip/giRY8 XXXX (beer) was also in the top 20, followed by Sex at 21.
Because of edit wars and encyclopedic tone requirements. See also: Wikipedia.
One thing to note is that Jimmy was a finance worker before starting Wikipedia, e.g. he had capital to hire Larry Sanger.
Maybe that's the way to go about it, make money first, and later on change the world.
Starting just after the beginning of the Internet can't hurt either. Though tooling must have been insane back then.
Video 1.
Meet the man behind a third of what's on Wikipedia
Source.
Open source software engine created for and used by Wikipedia.
Their reference markup is incredibly overengineered, convoluted, and underdocumented; it is unbelievable!
Use the reference:
This is a fact.{{sfn|Schweber|1994|p=487}}
Define the reference:
===Sources===
{{refbegin|2|indent=yes}}
*{{Cite book|author-link=Silvan S. Schweber |title=QED and the Men Who Made It: Dyson, Feynman, Schwinger, and Tomonaga|last=Schweber|first=Silvan S.|location=Princeton|publisher=University Press|year=1994 |isbn=978-0-691-03327-3 |url=https://archive.org/details/qedmenwhomadeitd0000schw/page/492 |url-access=registration}}
{{refend}}
sfn is magic: it matches the author's last name and date from the Cite. It is documented at: en.wikipedia.org/wiki/Template:Sfn
Unfortunately, if there are multiple duplicate Cites inline in the article, it will complain that there are multiple definitions, and you have to first factor out the article by replacing all those existing Cites with sfn, keeping just one Cite at the bottom. What a pain...
You can also link to a specific page of the book, e.g. if the book is on Internet Archive Open Library, with:
{{sfn|Murray|1997|p=[https://archive.org/details/supermenstory00murr/page/86 86]}}
For multiple pages you should use pp= instead of p=. It does not seem to make much difference on the rendered output besides showing p. vs pp., but so be it:
{{sfn|Murray|1997|pp=[https://archive.org/details/supermenstory00murr/page/86 86-87]}}
A really good option to store educational media such as images and video!
Shame that like the rest of Wikimedia, their interface is so clunky and lacking obvious features.
This is basically what Jimmy Wales had originally set out to make Wikipedia: a peer reviewed thing.
But then he noticed the entry barrier was too high while inviting an economist to review an article he wrote, and just made the more open thing instead.
The venerable first wiki.
The pre-Eternal September feeling is palpable.
People could freely comment their thoughts and sign below, making it much closer to what Ciro Santilli wants OurBigBook.com to be. But with upvotes ;-)
Nothing can better encapsulate the nostalgia of early day Internet. Genius at times, banal at others, you will be forever in our hearts!
