Apparently in 2017 Common Crawl started making their own web graphs, i.e. they parse the HTML and extract the graph of what links to what.
Edit: actually, they already calculate PageRank for us!!! Fantastic!!! Main section: Section "Common Crawl web graph official PageRank".
A quick exploration of the graph can be seen at: github.com/cirosantilli/cirosantilli.github.io/issues/198
Their source code is at: github.com/commoncrawl/cc-webgraph
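To make "parse the HTML and extract the graph of what links to what" concrete, here is a minimal sketch using only the Python standard library. It is not how cc-webgraph does it. The `pages` dict and the `host_edges` helper are hypothetical names made up for this illustration, and real crawl data would come from the Common Crawl archives rather than inline strings.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href targets of all <a> tags in one HTML document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def host_edges(pages):
    """Return the set of (source_host, target_host) edges.

    `pages` is a hypothetical dict mapping page URL -> HTML string.
    """
    edges = set()
    for url, html in pages.items():
        parser = LinkExtractor(url)
        parser.feed(html)
        src = urlparse(url).hostname
        for link in parser.links:
            dst = urlparse(link).hostname
            # Skip links without a host (mailto:, javascript:, ...) and self-links.
            if dst and dst != src:
                edges.add((src, dst))
    return edges


# Made-up sample pages, for illustration only.
pages = {
    "https://a.example/index.html": '<a href="https://b.example/x">b</a> <a href="/local">self</a>',
    "https://b.example/x": '<a href="https://a.example/">a</a> <a href="mailto:me@b.example">mail</a>',
}
print(sorted(host_edges(pages)))
```

The sketch collapses pages to hosts, so it yields a host-level graph; grouping by registered domain instead would need a public suffix list, which is left out here.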
École Polytechnique students identify their academic year, or "promotion" in French, by the year in which it started.
For example, Ciro Santilli's year started in 2009, though as a foreign student he arrived only at the start of 2010, and Ciro's promotion is usually known just as X09. Once the century boundary is crossed, we'll need to start writing it in full as X2009.
List of notable alumni:
A quick hands-on introduction to the software by Ciro Santilli can be found at: github.com/cirosantilli/cirosantilli.github.io/issues/198
The native file format of WebGraph.
It is a binary format and highly storage efficient.
It is for example what Common Crawl web graph currently dumps to as of 2025, see e.g.: data.commoncrawl.org/projects/hyperlinkgraph/cc-main-2024-25-dec-jan-feb/index.html
TODO meaning of "BV"?
A quick hands-on introduction to the format by Ciro Santilli can be found at: github.com/cirosantilli/cirosantilli.github.io/issues/198
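To give a feeling for why a binary graph format can be so storage efficient, here is a toy sketch of gap encoding: store each sorted successor list as differences between consecutive node IDs, then write the small differences in few bytes. This is not the actual BVGraph layout. The byte-oriented varint below is a stand-in chosen for simplicity, and all function names are made up for this illustration.

```python
def encode_varint(n):
    """Encode a non-negative int in 7-bit groups, low bits first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_successors(successors):
    """Gap-encode a sorted list of distinct successor node IDs."""
    out = bytearray()
    prev = 0
    for node in successors:
        out += encode_varint(node - prev)
        prev = node
    return bytes(out)


def decode_successors(data):
    """Invert encode_successors: rebuild node IDs from the gaps."""
    successors = []
    prev = 0
    shift = 0
    value = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            prev += value
            successors.append(prev)
            value = 0
            shift = 0
    return successors


# Made-up successor list: links tend to point at nearby node IDs.
successors = [1_000_000, 1_000_003, 1_000_004, 1_000_100]
encoded = encode_successors(successors)
assert decode_successors(encoded) == successors
print(len(encoded), "bytes, versus", 4 * len(successors), "as fixed 32-bit integers")
```

The gain comes from locality: when nodes are numbered so that linked pages get nearby IDs, most gaps are tiny.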
Quick fun with the Common Crawl web graph
By Ciro Santilli. Created 2025-02-26. Updated 2025-11-05.
github.com/cirosantilli/cirosantilli.github.io/issues/198. Previously at: stackoverflow.com/questions/31321009/best-more-standard-graph-representation-file-format-graphson-gexf-graphml/79467334#79467334 but Stack Overflow fucking deleted the question.
My general motivation for this is that a PageRank-like algorithm could be useful for more accurate user and article ranking on OurBigBook, see: Section "PageRank-like ranking"
But it could also be just generally cool to apply it to other graph datasets, e.g. for computing a Wikipedia internal PageRank.
Then I had a look at the Common Crawl web graph data to see if I could easily calculate it myself, and... they already have it! See: Section "Common Crawl web graph official PageRank"
Their graph dumps are in BVGraph graph file format, which is the native format of the WebGraph framework, which implements the format and algorithms such as PageRank.
The only thing I miss is a command line interface to calculate the PageRank. That would be so awesome.
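For small graphs, the computation itself is short enough to do by hand. Below is a minimal power-iteration sketch of PageRank in plain Python. It is not the WebGraph implementation and would not scale to the Common Crawl graph. The damping factor of 0.85 and the fixed iteration count are assumptions picked for illustration, and the edge list is made up.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from the uniform distribution.
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Made-up toy graph: a -> b -> c -> a, plus d -> a.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]
ranks = pagerank(edges)
for node, value in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(node, round(value, 4))
```

A command line wrapper would only need to read an edge list from a file and call `pagerank` on it, but for web-scale graphs the compressed representation is what makes the computation feasible.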
Announcements:
In cc-main-2024-25-dec-jan-feb-domain-ranks.txt:
- cirosantilli.com was ranked ~453k
- ourbigbook.com was at ~606k
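A lookup like the one above can be scripted. The sketch below assumes, without this page confirming it, that the ranks file is tab-separated, starts with a header line whose column names are prefixed with `#`, and stores each domain in reversed form (e.g. `com.example`) in a column named `host_rev`. The sample rows and the `find_domain` helper are made up for illustration; check the header of the real file before relying on any column name.

```python
import io


def find_domain(lines, domain):
    """Return the row for `domain` as a dict keyed by header column, or None."""
    # Assumed: domains are stored reversed, e.g. example.org -> org.example.
    reversed_name = ".".join(reversed(domain.split(".")))
    header = None
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if header is None:
            # Assumed: the first line names the columns, each prefixed with '#'.
            header = [f.lstrip("#") for f in fields]
            continue
        row = dict(zip(header, fields))
        if row.get("host_rev") == reversed_name:
            return row
    return None


# Made-up sample rows, not real ranks.
sample = io.StringIO(
    "#harmonicc_pos\t#harmonicc_val\t#pr_pos\t#pr_val\t#host_rev\t#n_hosts\n"
    "1\t3.0E7\t2\t0.01\tcom.example\t10\n"
    "2\t2.9E7\t1\t0.02\torg.example\t3\n"
)
row = find_domain(sample, "example.org")
print(row["harmonicc_pos"], row["pr_pos"])
```

For a real, gzip-compressed download, `gzip.open(path, "rt")` from the standard library can be passed as `lines` so the file is streamed rather than loaded into memory.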
Pinned article: Introduction to the OurBigBook Project
Welcome to the OurBigBook Project! Our goal is to create the perfect publishing platform for STEM subjects, and get university-level students to write the best free STEM tutorials ever.
Everyone is welcome to create an account and play with the site: ourbigbook.com/go/register. We believe that students themselves can write amazing tutorials, but teachers are welcome too. You can write about anything you want, it doesn't have to be STEM or even educational. Silly test content is very welcome and you won't be penalized in any way. Just keep it legal!
Intro to OurBigBook. Source.
We have two killer features:
- topics: topics group articles by different users with the same title, e.g. here is the topic for the "Fundamental Theorem of Calculus": ourbigbook.com/go/topic/fundamental-theorem-of-calculus
  Articles of different users are sorted by upvote within each article page. This feature is a bit like:
  - a Wikipedia where each user can have their own version of each article
  - a Q&A website like Stack Overflow, where multiple people can give their views on a given topic, and the best ones are sorted by upvote. Except you don't need to wait for someone to ask first, and any topic goes, no matter how narrow or broad
  This feature makes it possible for readers to find better explanations of any topic created by other writers. And it allows writers to create an explanation in a place that readers might actually find it.
  Figure 1. Screenshot of the "Derivative" topic page. View it live at: ourbigbook.com/go/topic/derivative
  Video 2. OurBigBook Web topics demo. Source.
- local editing: you can store all your personal knowledge base content locally in a plaintext markup format that can be edited locally and published either:
  - to OurBigBook.com to get awesome multi-user features like topics and likes
  - as HTML files to a static website, which you can host yourself for free on many external providers like GitHub Pages, and remain in full control
  This way you can be sure that even if OurBigBook.com were to go down one day (which we have no plans to do as it is quite cheap to host!), your content will still be perfectly readable as a static site.
  Figure 3. Visual Studio Code extension installation.
  Figure 4. Visual Studio Code extension tree navigation.
  Figure 5. Web editor. You can also edit articles on the Web editor without installing anything locally.
  Video 3. Edit locally and publish demo. Source. This shows editing OurBigBook Markup and publishing it using the Visual Studio Code extension.
  Video 4. OurBigBook Visual Studio Code extension editing and navigation demo. Source.
- Infinitely deep tables of contents:
All our software is open source and hosted at: github.com/ourbigbook/ourbigbook
Further documentation can be found at: docs.ourbigbook.com
Feel free to reach out to us for any help or suggestions: docs.ourbigbook.com/#contact







