Source: cirosantilli/open-x-embodiment

= Open X-Embodiment
{c}

Terrible name, but very interesting dataset:
* https://robotics-transformer-x.github.io/
* https://github.com/google-deepmind/open_x_embodiment
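The component datasets are distributed in RLDS (episodic TFDS) format on Google Cloud Storage. A minimal loading sketch, assuming the GCS path and feature names from the repository's colab for the RT-1 "fractal" data; both vary per component dataset:

``
import tensorflow_datasets as tfds

# Each component dataset is stored in RLDS (episodic TFDS) format on GCS.
# The path and feature names below are from the RT-1 "fractal" dataset;
# treat them as illustrative, other component datasets differ.
builder = tfds.builder_from_directory(
    builder_dir='gs://gresearch/robotics/fractal20220817_data/0.1.0')
dataset = builder.as_dataset(split='train[:1]')
for episode in dataset:
    for step in episode['steps']:
        image = step['observation']['image']
        instruction = step['observation']['natural_language_instruction']
        action = step['action']
``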

The GitHub README describes the model input quite well:
> The model takes as input a RGB image from the robot workspace camera and a task string describing the task that the robot is supposed to perform.
>
> What task the model should perform is communicated to the model purely through the task string. The image communicates to the model the current state of the world, i.e. assuming the model runs at three hertz, every 333 milliseconds, we feed the latest RGB image from a robot workspace camera into the model to obtain the next action to take.
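A minimal sketch of that 3 Hz control loop. The camera, policy and robot interfaces here are hypothetical stand-ins, not the actual `open_x_embodiment` API:

``
import time

import numpy as np

CONTROL_HZ = 3  # one action every ~333 ms, as in the README's example


def get_rgb_frame() -> np.ndarray:
    """Hypothetical: grab the latest RGB image from the workspace camera."""
    return np.zeros((256, 256, 3), dtype=np.uint8)


def infer_action(image: np.ndarray, task: str) -> np.ndarray:
    """Hypothetical: one forward pass of the policy, returning the next action."""
    return np.zeros(7, dtype=np.float32)  # e.g. end effector deltas + gripper


def execute(action: np.ndarray) -> None:
    """Hypothetical: send the action to the robot controller."""


# The task string is the only way the task is communicated to the model.
task = 'pick up the coke can'
while True:
    start = time.monotonic()
    execute(infer_action(get_rgb_frame(), task))
    time.sleep(max(0.0, 1 / CONTROL_HZ - (time.monotonic() - start)))
``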

TODO: how is the scenario specified?

TODO: any <AI training robot simulation>[simulation] integration to it?

\Image[https://web.archive.org/web/20250209172539if_/https://raw.githubusercontent.com/google-deepmind/open_x_embodiment/main/imgs/teaser.png]
{height=600}