Zhang Ziqian by Ciro Santilli 35 Updated +Created
Maser by Ciro Santilli 35 Updated +Created
Video 1.
Principles of the Optical Maser by Bell Labs
. Source. Date: 1963.
Yann LeCun by Ciro Santilli 35 Updated +Created
1914 Nobel Prize in Physics by Ciro Santilli 35 Updated +Created
Not only did this open the way for X-ray crystallography, it more fundamentally clarified the nature of X-rays as being electromagnetic radiation, and helped further establish the atomic theory.
Quantization as an Eigenvalue Problem by Ciro Santilli 35 Updated +Created
This paper appears to calculate the Schrödinger equation solution for the hydrogen atom.
TODO is this the original paper on the Schrödinger equation?
Published on Annalen der Physik in 1926.
Open access in German at: onlinelibrary.wiley.com/doi/10.1002/andp.19263840404 which gives volume 384, Issue 4, Pages 361-376. Kudos to Wiley for that. E.g. Nature did not have similar policies as of 2023.
This paper may have fallen into the public domain in the US in 2022! On the Internet Archive we can see scans of the journal that contains it at: ia903403.us.archive.org/29/items/sim_annalen-der-physik_1926_79_contents/sim_annalen-der-physik_1926_79_contents.pdf. Ciro Santilli extracted just the paper to: commons.wikimedia.org/w/index.php?title=File%3AQuantisierung_als_Eigenwertproblem.pdf. It is not as well processed as the Wiley one, but it is of 100% guaranteed clean public domain provenance! TODO: hmmm, it may be public domain in the USA but not Germany, where 70 years after author deaths rules, and Schrodinger died in 1961, so it may be up to 2031 in that country... messy stuff. There's also the question of wether copyright is was tranferred to AdP at publication or not.
Contains formulas such as the Schrödinger equation solution for the hydrogen atom (1''):
where:
  • In order for there to be numerical agreement, must have the value
  • , are the charge and mass of the electron
Wikipedia dumps by Ciro Santilli 35 Updated +Created
Per-table dumps created with mysqldump and listed at: dumps.wikimedia.org/. Most notably, for the English Wikipedia: dumps.wikimedia.org/enwiki/latest/
A few of the files are not actual tables but derived data, notably dumps.wikimedia.org/enwiki/latest/enwiki-latest-all-titles-in-ns0.gz from Download titles of all Wikipedia articles
The tables are "documented" under: www.mediawiki.org/wiki/Manual:Database_layout, e.g. the central "page" table: www.mediawiki.org/wiki/Manual:Page_table. But in many cases it is impossible to deduce what fields are from those docs.
Wiley (publisher) by Ciro Santilli 35 Updated +Created
Linux CLI HOWTO by Ciro Santilli 35 Updated +Created
PDF tool by Ciro Santilli 35 Updated +Created
Project Gutenberg by Ciro Santilli 35 Updated +Created
enwiki-latest-category.sql by Ciro Santilli 35 Updated +Created
dumps.wikimedia.org/enwiki/latest/enwiki-latest-category.sql.gz contains a list of categories. It only contains the categories and some counts, but it doesn't contain the subcategories and pages under each category, so it is a bit pointless.
The SQL first defines the table:
CREATE TABLE `category` (
  `cat_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `cat_title` varbinary(255) NOT NULL DEFAULT '',
  `cat_pages` int(11) NOT NULL DEFAULT 0,
  `cat_subcats` int(11) NOT NULL DEFAULT 0,
  `cat_files` int(11) NOT NULL DEFAULT 0,
  PRIMARY KEY (`cat_id`),
  UNIQUE KEY `cat_title` (`cat_title`),
  KEY `cat_pages` (`cat_pages`)
) ENGINE=InnoDB AUTO_INCREMENT=249228235 DEFAULT CHARSET=binary ROW_FORMAT=COMPRESSED;
followed by a few humongous inserts:
INSERT INTO `category` VALUES (2,'Unprintworthy_redirects',1597224,20,0),(3,'Computer_storage_devices',88,11,0)
which we can see at: en.wikipedia.org/wiki/Category:Computer_storage_devices
Se see that en.wikipedia.org/wiki/Category:Computer_storage_devices_by_company
so it contains only categories.
We can check this with:
sed -s 's/),/\n/g' enwiki-latest-category.sql | grep Computer_storage_devices
and it shows:
(3,'Computer_storage_devices',88,11,0
(521773,'Computer_storage_devices_by_company',6,6,0
There doesn't seem to be any interlink between the categories, only page and subcategory counts therefore.
enwiki-latest-categorylinks.sql by Ciro Santilli 35 Updated +Created
Download all Wikipedia categories by Ciro Santilli 35 Updated +Created
Jewish_physicists
Let's observe them in MySQL:
mysql enwiki -e "select page_id, page_namespace, page_title, page_is_redirect from page where page_namespace in (0, 14) and page_title in ('Computer_storage_devices', 'Computer_data_storage')"
outputs:
+----------+----------------+--------------------------+------------------+
| page_id  | page_namespace | page_title               | page_is_redirect |
+----------+----------------+--------------------------+------------------+
|     5300 |              0 | Computer_data_storage    |                0 |
| 42371130 |              0 | Computer_storage_devices |                1 |
|   711721 |             14 | Computer_data_storage    |                0 |
|   895945 |             14 | Computer_storage_devices |                0 |
+----------+----------------+--------------------------+------------------+
mysql enwiki -e "select cl_from, cl_to from categorylinks where cl_from in (5300, 711721, 895945, 42371130)"
gives:
+----------+-----------------------------------------------------------------------+
| cl_from  | cl_to                                                                 |
+----------+-----------------------------------------------------------------------+
|     5300 | All_articles_containing_potentially_dated_statements                  |
|     5300 | Articles_containing_potentially_dated_statements_from_2009            |
|     5300 | Articles_containing_potentially_dated_statements_from_2011            |
|     5300 | Articles_with_GND_identifiers                                         |
|     5300 | Articles_with_NKC_identifiers                                         |
|     5300 | Articles_with_short_description                                       |
|     5300 | Computer_architecture                                                 |
|     5300 | Computer_data_storage                                                 |
|     5300 | Short_description_matches_Wikidata                                    |
|     5300 | Use_dmy_dates_from_June_2020                                          |
|     5300 | Wikipedia_articles_incorporating_text_from_the_Federal_Standard_1037C |
|   711721 | Computer_architecture                                                 |
|   711721 | Computer_data                                                         |
|   711721 | Computer_hardware_by_type                                             |
|   711721 | Data_storage                                                          |
|   895945 | Computer_data_storage                                                 |
|   895945 | Computer_peripherals                                                  |
|   895945 | Recording_devices                                                     |
| 42371130 | Redirects_from_alternative_names                                      |
+----------+-----------------------------------------------------------------------+
So we see that cl_from encodes the parent categories:
  • parent categories of categories:
    • en.wikipedia.org/wiki/Category:Computer_data_storage, which has ID 711721, has parent categories: "Computer hardware by type", "Computer data", "Data storage", "Computer architecture". This matches exactly on the database. These are all encoded on the source code of the page:
      {{DEFAULTSORT:Storage}}
      [[Category:Computer hardware by type]]
      [[Category:Computer data|Storage]]
      [[Category:Data storage|Computer]]
      [[Category:Computer architecture]]
    • en.wikipedia.org/wiki/Category:Computer_storage_devices has parent categories: "Computer data storage", "Recording devices", "Computer peripherals". This matches exactly on the database.
  • parent categories of pages:
    • en.wikipedia.org/wiki/Computer_storage_devices whish is a redirect gets the magic category "Redirects_from_alternative_names", a humongous placeholder with many thousands of pages: en.wikipedia.org/wiki/Category:Redirects_from_alternative_names
    • en.wikipedia.org/wiki/Computer_data_storage shows only two categories onthe web UI: "Computer data storage" and "Computer architecture". Both of these are present on the database and at the end of the source code:
      {{DEFAULTSORT:Computer Data Storage}}
      [[Category:Computer data storage| ]]
      [[Category:Computer architecture]]
      The others appear to be more magic. Two of them we can guess from the templates:
      {{short description|Storage of digital data readable by computers}}
      {{Use dmy dates|date=June 2020}}
      are likely Use_dmy_dates_from_June_2020 and Articles_with_short_description but the rest is more magic and not necessarily present in-source.
So to find all articls and categories under a given category title, say en.wikipedia.org/wiki/Category:Mathematics we can run:
mariadb enwiki -e "select cl_from, cl_to, page_namespace, page_title from categorylinks inner join page on page_namespace in (0, 14) and cl_from = page_id and cl_to = 'Mathematics'"
csvkit by Ciro Santilli 35 Updated +Created
Lots of features, but slow because written in Python. A faster version may be csvtools. Also some annoyances like obtuse header handing and missing features like grep + cut in one go: csvgrep and select column in csvkit.
csvtool by Ciro Santilli 35 Updated +Created
A compiled executable under /usr/bin/csvtool, has an Ubuntu 23.04 package: manpages.ubuntu.com/manpages/lunar/en/man1/csvtool.1.html
There seems to be no sane filtering mechanism however: stackoverflow.com/questions/46540752/using-csvtool-call-to-filter-csv-in-bash
csvtools by Ciro Santilli 35 Updated +Created
A fast version of a somewhat subset of csvkit, written in C.
Build failed with undefined reference to pcre_config on Ubuntu 23.04: github.com/DavyLandman/csvtools/issues/18
Unfortunately it is lacking some basic options, like optional header + selecting column by index on csvgrep (though csvcut has it). The project seems kind of dead.
Also unclear if it allows to filter + print only selected columns.

There are unlisted articles, also show them or only show them.