GenBank by Ciro Santilli 35 Updated +Created
Astrophysics by Ciro Santilli 35 Updated +Created
A fancy name for astronomy ;-)
CIFAR-10 by Ciro Santilli 35 Updated +Created
60,000 32x32 color images in 10 different classes: airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks.
TODO release date.
This dataset can be thought of as an intermediate between the simplicity of MNIST, and a more full blown ImageNet.
ImageNet subset by Ciro Santilli 35 Updated +Created
Subset generators:
Unfortunately, since ImageNet is a closed standard no one can upload such pre-made subsets, forcing everybody to download the full dataset, in ImageNet1k, which is huge!
ImageNet 2015 by Ciro Santilli 35 Updated +Created
COCO dataset by Ciro Santilli 35 Updated +Created
From cocodataset.org/:
  • 330K images (>200K labeled)
  • 1.5 million object instances
  • 80 object categories
  • 91 stuff categories
  • 5 captions per image. A caption is a short textual description of the image.
So they have relatively few object labels, but their focus seems to be putting a bunch of objects on the same image. E.g. they have 13 cat plus pizza photos. Searching for such weird combinations is kind of fun.
Their official dataset explorer is actually good: cocodataset.org/#explore
And the objects don't just have bounding boxes, but detailed polygons.
Also, images have captions describing the relation between objects:
a black and white cat standing on a table next to a pizza.
Epic.
This dataset is kind of cool.
COCO 2017 by Ciro Santilli 35 Updated +Created
This is the one used on MLperf v2.1 ResNet, likely one of the most popular choices out there.
2017 challenge subset:
  • train: 118k images, 18GB
  • validation: 5k images, 1GB
  • test: 41k images, 6GB
Open Images dataset by Ciro Santilli 35 Updated +Created
TODO vs COCO dataset.
As of v7:
The images and annotations are both under CC BY, with Google as the copyright holder.
Optical character recognition by Ciro Santilli 35 Updated +Created
Machine learning company by Ciro Santilli 35 Updated +Created
Hugging Face by Ciro Santilli 35 Updated +Created
Interesting website, hosts mostly:
Ontology by Ciro Santilli 35 Updated +Created
Knowledge graph by Ciro Santilli 35 Updated +Created
Many people believe that knowledge graphs are a key element of AGI: Knowledge graph as a component of AGI.
Bibligraphy:
Knowledge graph as a component of AGI by Ciro Santilli 35 Updated +Created
Related:

Pinned article: ourbigbook/introduction-to-the-ourbigbook-project

Welcome to the OurBigBook Project! Our goal is to create the perfect publishing platform for STEM subjects, and get university-level students to write the best free STEM tutorials ever.
Everyone is welcome to create an account and play with the site: ourbigbook.com/go/register. We belive that students themselves can write amazing tutorials, but teachers are welcome too. You can write about anything you want, it doesn't have to be STEM or even educational. Silly test content is very welcome and you won't be penalized in any way. Just keep it legal!
Video 1.
Intro to OurBigBook
. Source.
We have two killer features:
  1. topics: topics group articles by different users with the same title, e.g. here is the topic for the "Fundamental Theorem of Calculus" ourbigbook.com/go/topic/fundamental-theorem-of-calculus
    Articles of different users are sorted by upvote within each article page. This feature is a bit like:
    • a Wikipedia where each user can have their own version of each article
    • a Q&A website like Stack Overflow, where multiple people can give their views on a given topic, and the best ones are sorted by upvote. Except you don't need to wait for someone to ask first, and any topic goes, no matter how narrow or broad
    This feature makes it possible for readers to find better explanations of any topic created by other writers. And it allows writers to create an explanation in a place that readers might actually find it.
    Figure 1.
    Screenshot of the "Derivative" topic page
    . View it live at: ourbigbook.com/go/topic/derivative
    Video 2.
    OurBigBook Web topics demo
    . Source.
  2. local editing: you can store all your personal knowledge base content locally in a plaintext markup format that can be edited locally and published either:
    • to OurBigBook.com to get awesome multi-user features like topics and likes
    • as HTML files to a static website, which you can host yourself for free on many external providers like GitHub Pages, and remain in full control
    This way you can be sure that even if OurBigBook.com were to go down one day (which we have no plans to do as it is quite cheap to host!), your content will still be perfectly readable as a static site.
    Figure 5. . You can also edit articles on the Web editor without installing anything locally.
    Video 3.
    Edit locally and publish demo
    . Source. This shows editing OurBigBook Markup and publishing it using the Visual Studio Code extension.
    Video 4.
    OurBigBook Visual Studio Code extension editing and navigation demo
    . Source.
  3. https://raw.githubusercontent.com/ourbigbook/ourbigbook-media/master/feature/x/hilbert-space-arrow.png
  4. Infinitely deep tables of contents:
    Figure 6.
    Dynamic article tree with infinitely deep table of contents
    .
    Descendant pages can also show up as toplevel e.g.: ourbigbook.com/cirosantilli/chordate-subclade
All our software is open source and hosted at: github.com/ourbigbook/ourbigbook
Further documentation can be found at: docs.ourbigbook.com
Feel free to reach our to us for any help or suggestions: docs.ourbigbook.com/#contact