- 2023 vimalabs.github.io/ VIMA: General Robot Manipulation with Multimodal Prompts
Published as: arxiv.org/pdf/2304.03442.pdf Generative Agents: Interactive Simulacra of Human Behavior by Park et al.
This was the Holy Grail as of 2023, when text-to-image started to really take off, but text-to-video was miles behind.
The most classic thing he did perhaps was creating the LeNet neural network and using it on the MNIST dataset to recognize hand-written digits circ 1998.
He does lots of little experiments which is cool.
As highlighted e.g. at Human Compatible by Stuart J. Russell (2019), this AI alignment intrinsically linked to the idea of utility in economy.
There are two main ways to try and reach AGI:Which one of them to take is of of the most important technological questions of humanity according to Ciro Santilli
- AI training robot: expensive, slow, but realistic world
- AI training game: faster, less expensive, but possibly non-realistic-enough realistic
There is also an intermediate area of research/engineering where people try to first simulate the robot and its world realistically, use the simulation for training, and then transfer the simulated training to real robots, see e.g.: realistic robotics simulation.
It would have to understand that it must keep its bank account high to buy power.
In that scenario, Internet bandwidth would likely be its most precious resources, as that is how it would interact with the world to learn from it and make money.
Compute power and storage would come next as resources.
And of course, once it got to cloud computing, which might be immediately and thus invalidate this experiment, things would just go nuts more and more.
Terrible name, but very interesting dataset:
GitHub describes the input quite well:
The model takes as input a RGB image from the robot workspace camera and a task string describing the task that the robot is supposed to perform.What task the model should perform is communicated to the model purely through the task string. The image communicates to the model the current state of the world, i.e. assuming the model runs at three hertz, every 333 milliseconds, we feed the latest RGB image from a robot workspace camera into the model to obtain the next action to take.
TODO: how is the scenario specified?
TODO: any simulation integration to it?
 Pinned article: Introduction to the OurBigBook Project
Welcome to the OurBigBook Project! Our goal is to create the perfect publishing platform for STEM subjects, and get university-level students to write the best free STEM tutorials ever.
Everyone is welcome to create an account and play with the site: ourbigbook.com/go/register. We belive that students themselves can write amazing tutorials, but teachers are welcome too. You can write about anything you want, it doesn't have to be STEM or even educational. Silly test content is very welcome and you won't be penalized in any way. Just keep it legal!
Intro to OurBigBook
. Source. We have two killer features:
- topics: topics group articles by different users with the same title, e.g. here is the topic for the "Fundamental Theorem of Calculus" ourbigbook.com/go/topic/fundamental-theorem-of-calculusArticles of different users are sorted by upvote within each article page. This feature is a bit like:- a Wikipedia where each user can have their own version of each article
- a Q&A website like Stack Overflow, where multiple people can give their views on a given topic, and the best ones are sorted by upvote. Except you don't need to wait for someone to ask first, and any topic goes, no matter how narrow or broad
 This feature makes it possible for readers to find better explanations of any topic created by other writers. And it allows writers to create an explanation in a place that readers might actually find it.Figure 1. Screenshot of the "Derivative" topic page. View it live at: ourbigbook.com/go/topic/derivativeVideo 2. OurBigBook Web topics demo. Source.
- local editing: you can store all your personal knowledge base content locally in a plaintext markup format that can be edited locally and published either:This way you can be sure that even if OurBigBook.com were to go down one day (which we have no plans to do as it is quite cheap to host!), your content will still be perfectly readable as a static site.- to OurBigBook.com to get awesome multi-user features like topics and likes
- as HTML files to a static website, which you can host yourself for free on many external providers like GitHub Pages, and remain in full control
 Figure 3. Visual Studio Code extension installation.Figure 4. Visual Studio Code extension tree navigation.Figure 5. Web editor. You can also edit articles on the Web editor without installing anything locally.Video 3. Edit locally and publish demo. Source. This shows editing OurBigBook Markup and publishing it using the Visual Studio Code extension.Video 4. OurBigBook Visual Studio Code extension editing and navigation demo. Source.
- Infinitely deep tables of contents:
All our software is open source and hosted at: github.com/ourbigbook/ourbigbook
Further documentation can be found at: docs.ourbigbook.com
Feel free to reach our to us for any help or suggestions: docs.ourbigbook.com/#contact







