Mutation by Ciro Santilli
Principal component analysis by Ciro Santilli
Given a bunch of points in n dimensions, PCA maps those points to a new p-dimensional space with p ≤ n.
p is a hyperparameter; p = 1 and p = 2 are common choices when doing dataset exploration, as they can be easily visualized on a planar plot.
The mapping is done by projecting all points to a p-dimensional hyperplane. PCA is an algorithm for choosing this hyperplane and the coordinate system within this hyperplane.
The hyperplane choice is done as follows:
  • the hyperplane will have origin at the mean point
  • the first axis is picked along the direction of greatest variance, i.e. where points are the most spread out.
    Intuitively, if we pick an axis of small variation, that would be bad, because all the points are very close to one another on that axis, so it doesn't contain as much information that helps us differentiate the points.
  • then we pick a second axis, orthogonal to the first one, along the direction of second largest variance
  • and so on, until we have as many orthogonal axes as the target space has dimensions
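The axis selection above can be sketched in a few lines of Python. This is only a minimal sketch under some assumptions: it uses NumPy, it finds the axes through an eigendecomposition of the covariance matrix (one common way to get the directions of greatest variance, not necessarily what any given library does), and the function name pca and the sample points are made up for illustration.

```python
import numpy as np

def pca(points, p):
    """Project points (one per row) onto their first p principal axes.

    Returns (mean, axes, projected):
      mean:      origin of the hyperplane, shape (n,)
      axes:      p orthonormal axis directions as rows, shape (p, n)
      projected: coordinates of each point in the new system, shape (m, p)
    """
    X = np.asarray(points, dtype=float)
    mean = X.mean(axis=0)      # the hyperplane has its origin at the mean point
    centered = X - mean
    cov = np.atleast_2d(np.cov(centered, rowvar=False))  # n x n covariance
    # eigh is meant for symmetric matrices: it returns eigenvalues in
    # ascending order and orthonormal eigenvectors as columns.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1][:p]  # the p largest variances first
    axes = eigenvectors[:, order].T
    projected = centered @ axes.T
    return mean, axes, projected

# Made-up 3D points, spread out mostly along the first coordinate.
points = [
    [0.0, 0.0, 0.0],
    [2.0, 0.1, 0.0],
    [4.0, -0.1, 0.1],
    [6.0, 0.2, -0.1],
    [8.0, 0.0, 0.0],
]
mean, axes, projected = pca(points, p=2)
print(axes)       # first row is close to the first coordinate axis, up to sign
print(projected)  # the same five points, now described with two numbers each
```

The sign of each axis is arbitrary: flipping an axis gives an equally valid coordinate system.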
www.sartorius.com/en/knowledge/science-snippets/what-is-principal-component-analysis-pca-and-how-it-is-used-507186 provides an OK-ish example with a concrete context. In there, each point is a country, and the input data is the consumption of different kinds of foods per year, e.g.:
  • flour
  • dry codfish
  • olive oil
  • sausage
so in this example, we would have input points in 4D.
The goal is then to be able to identify each country by what it eats.
Suppose that every country consumes the same amount of flour every year. Then flour has zero variance: that number tells us nothing about which country each point represents, and the first PCA axes would basically never point anywhere near that direction.
Another cool thing is that PCA seems to automatically account for linear dependencies in the data, so it skips selecting highly correlated axes multiple times. For example, suppose that dry codfish and olive oil consumption are very high in Portugal and Spain, but very low in Germany and Poland. Therefore, the variation is very high in those two parameters, and contains a lot of information.
However, suppose that dry codfish consumption is also directly proportional to olive oil consumption. Because of this, it would be kind of wasteful if we selected:
  • dry codfish as the first axis
  • olive oil as the second axis
since the information about codfish already tells us the olive oil consumption. PCA apparently recognizes this: assuming both quantities are measured on the same scale, it instead picks the first axis at a 45 degree angle to both dry codfish and olive oil, and then moves on to something else for the second axis.
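To make the codfish and olive oil case concrete, here is a small sketch restricted to those two variables. The consumption numbers are invented, and olive oil is set exactly equal to dry codfish so that both share the same scale. In 2D the direction of the first axis has a closed form in terms of the two variances and the covariance, which is what the hypothetical helper first_axis_angle_2d uses.

```python
import math

# Hypothetical (dry codfish, olive oil) consumption per country, in
# arbitrary units. Olive oil is exactly equal to codfish here.
countries = {
    "Portugal": (9.0, 9.0),
    "Spain": (8.0, 8.0),
    "Germany": (1.0, 1.0),
    "Poland": (2.0, 2.0),
}

def first_axis_angle_2d(points):
    """Angle in degrees between the first principal axis and the x axis (2D only)."""
    m = len(points)
    mean_x = sum(x for x, _ in points) / m
    mean_y = sum(y for _, y in points) / m
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / m
    var_y = sum((y - mean_y) ** 2 for _, y in points) / m
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / m
    # Direction of the leading eigenvector of the 2x2 covariance matrix
    # [[var_x, cov_xy], [cov_xy, var_y]].
    return math.degrees(0.5 * math.atan2(2 * cov_xy, var_x - var_y))

angle = first_axis_angle_2d(list(countries.values()))
print(angle)  # halfway between the codfish axis and the olive oil axis
```

The 45 degree picture depends on the two quantities having the same scale. If olive oil were instead always twice the codfish figure, the same function would return about 63.4 degrees, which is one reason why data is often standardized before running PCA.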
Much like the rest of machine learning, PCA can be seen as a form of compression.
Scalar matrix by Ciro Santilli
MongoDB by Ciro Santilli
List databases:
echo 'show dbs' | mongo
Delete database:
use mydb
db.dropDatabase()
or:
echo 'db.dropDatabase()' | mongo mydb
View collections within a database:
echo 'db.getCollectionNames()' | mongo mydb
Show all data from one of the collections: stackoverflow.com/questions/24985684/mongodb-show-all-contents-from-all-collections
echo 'db.collectionName.find()' | mongo mydb
Hierarchical Data Format by Ciro Santilli
Reclaim the Streets by Ciro Santilli
Crystal system by Ciro Santilli
Random number generation by Ciro Santilli
Internet privacy by Ciro Santilli
List of cryptocurrency exchanges by Ciro Santilli
Contravariant vector by Ciro Santilli
Factorio by Ciro Santilli
Application programming interface by Ciro Santilli
Computer security researcher by Ciro Santilli
Ciro Santilli found out that he likes computer security researchers and vice versa.
It's a bit the same reason why he likes physicists: you can't bullshit with security.
You can't just talk nice and hope for people to believe you.
You can't not try to break things and just keep everyone happy in their false illusion of safety.
You can't do a half job.
If you do any of that, you will get your ass handed to you in a little gift bag.
Password by Ciro Santilli
Malware by Ciro Santilli
Scott Aaronson by Ciro Santilli
Mathematics illustration software by Ciro Santilli
Many plotting programs can be used to create mathematics illustrations. They just tend to produce data-oriented rather than explanation-oriented output.
Mainframe computer by Ciro Santilli
Tape drive by Ciro Santilli
One of the most enduring forms of storage! Started in the 1950s, but still used in the 2020s as the cheapest (and slowest to access) archival method. Robot arms are needed to load and read the tapes nowadays.
Video 1. Web camera mounted inside an IBM TS4500 tape library by lkaptoor (2020). Footage dated 2018.