Principal component analysis Updated 2025-07-16
The number of dimensions of the output space is a hyperparameter, and 1 and 2 are common choices when doing dataset exploration, as they can be easily visualized on a planar plot.
The mapping is done by projecting all points to a lower-dimensional hyperplane. PCA is an algorithm for choosing this hyperplane and the coordinate system within this hyperplane.
The hyperplane choice is done as follows:
- the hyperplane will have origin at the mean point
- the first axis is picked along the direction of greatest variance, i.e. where points are the most spread out. Intuitively, if we pick an axis of small variation, that would be bad, because all the points are very close to one another on that axis, so it doesn't contain as much information that helps us differentiate the points.
- then we pick a second axis, orthogonal to the first one, and along the direction of second largest variance
- and so on until the chosen number of orthogonal axes are taken
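The steps above can be sketched with NumPy as follows. This is a minimal illustration, not a reference implementation: it assumes the axes are obtained as eigenvectors of the covariance matrix (one standard way to find the directions of greatest variance), and the function name `pca` and the sample points are made up for the example.

```python
import numpy as np


def pca(points, n_components):
    """Project points (n_samples x n_features) onto their top principal axes."""
    points = np.asarray(points, dtype=float)
    # The hyperplane has its origin at the mean point.
    mean = points.mean(axis=0)
    centered = points - mean
    # Covariance matrix of the features (atleast_2d covers the 1-feature case).
    cov = np.atleast_2d(np.cov(centered, rowvar=False))
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # Keep the directions of largest variance first.
    order = np.argsort(eigenvalues)[::-1][:n_components]
    axes = eigenvectors[:, order].T  # one orthonormal axis per row
    variances = eigenvalues[order]
    # Coordinates of each point in the new coordinate system.
    projected = centered @ axes.T
    return mean, axes, variances, projected


# Hypothetical 2D points spread mostly along the first coordinate.
points = np.array([[2.0, 0.1], [4.0, -0.1], [6.0, 0.2], [8.0, -0.2]])
mean, axes, variances, projected = pca(points, n_components=1)
print("mean:", mean)
print("first axis:", axes[0])
```

Because the points vary much more along the first coordinate than along the second, the first axis comes out almost parallel to it. The sign of each axis is arbitrary.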
www.sartorius.com/en/knowledge/science-snippets/what-is-principal-component-analysis-pca-and-how-it-is-used-507186 provides an OK-ish example with a concrete context. In there, each point is a country, and the input data is the consumption of different kinds of foods per year, e.g.:
- flour
- dry codfish
- olive oil
- sausage
so in this example, we would have input points in 4D.
Suppose that every country consumes the same amount of flour every year. Then flour has the least variance: that number doesn't tell us much about which country each point represents, and the first PCA axes would basically never point anywhere near that direction.
Another cool thing is that PCA seems to automatically account for linear dependencies in the data, so it skips selecting highly correlated axes multiple times. For example, suppose that dry codfish and olive oil consumption are very high in Portugal and Spain, but very low in Germany and Poland. Therefore, the variation is very high in those two parameters, and contains a lot of information.
However, suppose that dry codfish consumption is also directly proportional to olive oil consumption. Because of this, it would be kind of wasteful if we selected dry codfish as one axis and olive oil as another, since the information about codfish already tells us the olive oil. PCA apparently recognizes this, and instead picks the first axis at a 45 degree angle to both dry codfish and olive oil, and then moves on to something else for the second axis.
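The food example can be checked numerically with a small sketch. All consumption numbers below are invented for illustration, and the axis lands at exactly 45 degrees only because codfish and olive oil are given identical values here (a proportionality constant of 1); with another constant the angle would differ.

```python
import numpy as np

# Rows: Portugal, Spain, Germany, Poland (made-up numbers).
# Columns: flour, dry codfish, olive oil, sausage.
data = np.array([
    [5.0, 10.0, 10.0, 3.0],
    [5.0, 9.0, 9.0, 2.0],
    [5.0, 1.0, 1.0, 3.0],
    [5.0, 2.0, 2.0, 2.0],
])

# Center on the mean point and diagonalize the covariance matrix.
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)  # ascending eigenvalues

# Reorder so that the largest-variance direction comes first.
order = np.argsort(eigenvalues)[::-1]
first_axis = eigenvectors[:, order[0]]
second_axis = eigenvectors[:, order[1]]

# Angle between the first axis and the dry codfish coordinate axis.
angle_deg = np.degrees(np.arccos(abs(first_axis[1])))
print("first axis:", np.round(first_axis, 3))
print("second axis:", np.round(second_axis, 3))
print("angle to codfish axis (degrees):", round(float(angle_deg), 1))
```

In this toy dataset the first axis has no flour component, and equal codfish and olive oil components. The second axis points along sausage instead of reusing the codfish and olive oil information.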
Project Zomboid Updated 2025-07-16
This game is quite detailed: www.youtube.com/watch?v=w4Jmqp8a_bU
The orthogonal group is the group of all matrices with orthonormal rows and orthonormal columns Updated 2025-07-16
Or equivalently, the set of rows is orthonormal, and so is the set of columns. TODO proof that it is equivalent to the orthogonal group is the group of all matrices that preserve the dot product.
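A quick numerical check, not a proof, can illustrate both characterizations on one example. The rotation angle, the test vectors and the shear matrix used as a counterexample are arbitrary choices made for this sketch.

```python
import numpy as np

# A 2D rotation matrix is a standard example of an orthogonal matrix.
theta = 0.7
Q = np.array([
    [np.cos(theta), -np.sin(theta)],
    [np.sin(theta), np.cos(theta)],
])

# Orthonormal rows: Q Q^T = I. Orthonormal columns: Q^T Q = I.
rows_orthonormal = np.allclose(Q @ Q.T, np.eye(2))
cols_orthonormal = np.allclose(Q.T @ Q, np.eye(2))

# The dot product of two vectors is unchanged after applying Q to both.
x = np.array([1.0, 2.0])
y = np.array([-3.0, 0.5])
dot_preserved = np.isclose((Q @ x) @ (Q @ y), x @ y)

# A shear matrix is not orthogonal and changes the dot product.
S = np.array([[1.0, 1.0], [0.0, 1.0]])
shear_preserves = np.isclose((S @ x) @ (S @ y), x @ y)

print(rows_orthonormal, cols_orthonormal, dot_preserved, shear_preserves)
```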
Tinker Tailor Soldier Spy (TV series) Updated 2025-07-16
- 1: Jim Prideaux captured. Some ex-colleague invites Smiley to dinner and keeps asking how incompetent people like Alleline climbed to the top of the Circus. Smiley recalled to service to meet Ricki Tarr.
- 2: Ricki Tarr tells his story to Smiley. Peter Guillam starts stealing material from the Circus, and finds a missing page in the communication officer list. Smiley sets up his investigation operation.
- 3: Smiley meets Connie who tells that she was fired for suspecting Poliakov. Flashbacks show the ousting of Control and Smiley.
- 4: Guillam steals more material from the Circus. While doing that, he is called by the top officers to inquire about Ricki Tarr being in England, which they suspect because they discovered that his family has come.
- 5: Jim Prideaux tells his story to Smiley, who cannot easily access the Circus reports about it. When he was returned to England, there was basically no debriefing, and Esterhase already knew about the Tinker Tailor codenames, presumably through Merlin.
- 6: Smiley hears the story of yet another ousted man, who heard the Russians knew in advance about Jim Prideaux' coming. Toby Esterhase dismissed him for alcoholism.
Vector graphics Updated 2025-07-16
Wikipedia analytics Updated 2025-07-16
Higgs boson Updated 2025-07-16
Mark 17 nuclear bomb Updated 2025-07-16
The Principles of Quantum Mechanics by Paul Dirac (1930) Updated 2025-07-16
Wikipedia CatTree Updated 2025-07-16
This mini-project walks the category hierarchy from the Wikipedia dumps and dumps it in various simple formats, HTML being the most interesting!
Mathematics dump of Wikipedia CatTree.
Box2D Updated 2025-07-16
Wikipedia dumps Updated 2025-07-16
Per-table dumps created with mysqldump and listed at: dumps.wikimedia.org/. Most notably, for the English Wikipedia: dumps.wikimedia.org/enwiki/latest/
A few of the files are not actual tables but derived data, notably dumps.wikimedia.org/enwiki/latest/enwiki-latest-all-titles-in-ns0.gz from Download titles of all Wikipedia articles
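A sketch of how such a titles file could be read once downloaded. It assumes a gzipped text file with one title per line, underscores in place of spaces, and a first header line; check the real file before relying on any of that. The function name `iter_titles` and the tiny sample file are made up so that the example runs without downloading anything.

```python
import gzip
import os
import tempfile


def iter_titles(path, skip_header=True):
    """Yield titles from a gzipped file with one title per line."""
    with gzip.open(path, mode="rt", encoding="utf-8") as f:
        for i, line in enumerate(f):
            # Assumption: the first line is a column header, not a title.
            if skip_header and i == 0:
                continue
            title = line.rstrip("\n")
            if title:
                # Assumption: underscores stand for spaces in titles.
                yield title.replace("_", " ")


# Stand-in for the real dump: a small file in the assumed format.
sample_path = os.path.join(tempfile.mkdtemp(), "sample-titles.gz")
with gzip.open(sample_path, mode="wt", encoding="utf-8") as f:
    f.write("page_title\nPrincipal_component_analysis\nHiggs_boson\n")

titles = list(iter_titles(sample_path))
print(titles)
```

Reading through a generator keeps memory usage flat, which matters because the real file lists every article title.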
The tables are "documented" under: www.mediawiki.org/wiki/Manual:Database_layout, e.g. the central "page" table: www.mediawiki.org/wiki/Manual:Page_table. But in many cases it is impossible to deduce what fields are from those docs.
Wiley (publisher) Updated 2025-07-16
William Shockley Updated 2025-07-16
Willis Lamb Updated 2025-07-16
Quantum compiler benchmark Updated 2025-07-16
These appear to be benchmarks that don't involve running anything concretely, just compiling and likely then counting gates:
The Spiders' Web: Britain's Second Empire Updated 2025-07-16
2017. Directed by Michael Oswald. Adam Curtis vibes.
Some notable points:
- the role of the British Overseas Territories as tax havens
- the role of the City of London in setting economic policy
- the role of trusts
XPath Updated 2025-07-16
E. Coli K-12 MG1655 gene thrB Updated 2025-07-16
Immediately follows E. Coli K-12 MG1655 gene thrA.
Part of E. Coli K-12 MG1655 operon thrLABC.
House mouse Updated 2025-07-16


