Intel department Created 2025-02-26 Updated 2025-07-16
Intel hardware Created 2025-02-26 Updated 2025-07-16
Intel employee Created 2025-02-26 Updated 2025-07-16
École Polytechnique alumnus by year Created 2025-02-26 Updated 2025-07-16
École Polytechnique students identify their academic year, or "promotion" in French, by its start year.
For example, Ciro Santilli's year started in 2009, though as a foreign student he only arrived at the start of 2010, and Ciro's promotion is usually known simply as X09. Once promotions cross the century barrier, we'll need to write the full year, e.g. X2009.
List of notable alumni:
Rob Pike Created 2025-02-26 Updated 2025-07-16
University of Milan Created 2025-02-26 Updated 2025-07-16
PhD thesis Created 2025-02-26 Updated 2025-07-16
WebGraph (software) Created 2025-02-26 Updated 2025-07-16
A quick hands-on introduction to the software by Ciro Santilli can be found at: github.com/cirosantilli/cirosantilli.github.io/issues/198
BVGraph Created 2025-02-26 Updated 2025-07-16
The native file format of WebGraph.
It is a binary format and highly storage efficient.
It is, for example, the format that the Common Crawl web graph is dumped to as of 2025, see e.g.: data.commoncrawl.org/projects/hyperlinkgraph/cc-main-2024-25-dec-jan-feb/index.html
TODO meaning of "BV"?
A quick hands-on introduction to the format by Ciro Santilli can be found at: github.com/cirosantilli/cirosantilli.github.io/issues/198
Cancer research Created 2025-02-26 Updated 2025-07-16
Graph file format Created 2025-02-26 Updated 2025-07-16
Updates Quick fun with the Common Crawl web graph Created 2025-02-26 Updated 2025-07-16
github.com/cirosantilli/cirosantilli.github.io/issues/198. Previously at: stackoverflow.com/questions/31321009/best-more-standard-graph-representation-file-format-graphson-gexf-graphml/79467334#79467334 but Stack Overflow fucking deleted the question.
My general motivation for this is that a PageRank-like algorithm could be useful for more accurate user and article ranking on OurBigBook, see: Section "PageRank-like ranking"
But it could also be just generally cool to apply it to other graph datasets, e.g. for computing a Wikipedia internal PageRank.
Then I had a look at the Common Crawl web graph data to see if I could easily calculate it myself, and... they already have it! See: Section "Common Crawl web graph official PageRank"
Their graph dumps are in BVGraph graph file format, which is the native format of the WebGraph framework, which implements the format and algorithms such as PageRank.
The only thing I miss is a command line interface to calculate the PageRank. That would be so awesome.
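To make it concrete what such a command would compute, here is a minimal power iteration sketch in plain Python. It is not how WebGraph does it and it would not scale to the Common Crawl graph. The function name `pagerank`, the 0.85 damping factor and the toy edge list are all assumptions made up for illustration.

```python
def pagerank(edges, num_nodes, damping=0.85, tol=1e-10, max_iter=200):
    """Toy PageRank by power iteration (illustrative sketch only).

    edges: iterable of (source, destination) node index pairs.
    Returns a list of ranks that sums to 1.
    """
    # Build an adjacency list of outgoing links.
    out_links = [[] for _ in range(num_nodes)]
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from a uniform distribution.
    rank = [1.0 / num_nodes] * num_nodes
    for _ in range(max_iter):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[n] for n in range(num_nodes) if not out_links[n])
        base = (1.0 - damping) / num_nodes + damping * dangling / num_nodes
        new_rank = [base] * num_nodes
        # Each node splits its damped rank equally among its targets.
        for src, targets in enumerate(out_links):
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        # Stop once the L1 change between iterations is small enough.
        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    # Hypothetical 4-node graph: a 3-cycle plus one extra page linking in.
    toy_edges = [(0, 1), (1, 2), (2, 0), (3, 0)]
    for node, score in enumerate(pagerank(toy_edges, 4)):
        print(node, round(score, 4))
```

In this toy graph node 0 ends up with the highest rank because it receives links from two pages, while node 3, which nothing links to, only gets the baseline share.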
Announcements:
In cc-main-2024-25-dec-jan-feb-domain-ranks.txt:
cirosantilli.com was ranked ~453k
ourbigbook.com was at ~606k
White-East asian mixed Created 2025-02-23 Updated 2025-07-16
Multiracial Created 2025-02-23 Updated 2025-07-16
White people Created 2025-02-23 Updated 2025-07-16
Race (human categorization) Created 2025-02-23 Updated 2025-07-16
Atherton, California Created 2025-02-23 Updated 2025-07-16
Municipality in San Mateo County Created 2025-02-23 Updated 2025-07-16
San Mateo County Created 2025-02-23 Updated 2025-07-16
Municipality in Illinois Created 2025-02-23 Updated 2025-07-16