Source: cirosantilli/ciro-s-2d-reinforcement-learning-games

= Ciro's 2D reinforcement learning games
{c}
{tag=AI training game}

= Large cohesive game world for robotic-like artificial intelligence development
{synonym}

Prototype: https://github.com/cirosantilli/Urho3D-cheat

Prior art research: https://github.com/cirosantilli/awesome-reinforcement-learning-games

\Video[https://youtube.com/watch?v=j_fl4xoGTKU]
{title=Top Down 2D Continuous Game with <Urho3D> <C++> <Simple DirectMedia Layer>[SDL] and <Box2D> for <Reinforcement learning> by <Ciro Santilli> (2018)}
{description=Source code at: https://github.com/cirosantilli/Urho3D-cheat[].}

\Image[https://raw.githubusercontent.com/cirosantilli/media/master/Basketball_stage_of_Ciro_Santilli's_2D_continuous_AI_game.png]
{title=Screenshot of the basketball stage of Ciro's 2D continuous game}
{description=Source code at: https://github.com/cirosantilli/Urho3D-cheat[]. Big kudos to <game-icons.net> for the sprites.}

Less good <discrete> prototype: https://github.com/cirosantilli/rl-game-2d-grid[]. <YouTube> demo: <video Top Down 2D Discrete Tile Based Game with C++ SDL and Boost R-Tree for Reinforcement Learning by Ciro Santilli (2017)>.

\Video[https://youtube.com/watch?v=TQ5k2u25eI8]
{title=Top Down 2D <Discrete> <tile-based video game>[Tile Based Game] with <C++> <Simple DirectMedia Layer>[SDL] and Boost R-Tree for Reinforcement Learning by <Ciro Santilli> (2017)}

The goal of this project is to reach <artificial general intelligence>.

A few initiatives have created reasonable sets of robotics-like games for the purposes of AI development, most notably: <OpenAI> and <DeepMind>.

However, all projects so far have only created sets of unrelated games, or worse: focused on closed games designed for humans!

What is really needed is to create a single cohesive game world, designed specifically for this purpose, and with a very large number of game mechanics.

Notably, by "game mechanic" we mean "a magic aspect of the game world, which cannot be explained by objects' locations and inertia alone", in order to test <the missing link between continuous and discrete AI>.
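
To make the distinction concrete, here is a minimal hand-rolled sketch (all names hypothetical, not from any of the prototypes above): plain physics where the next state follows from location and inertia alone, versus a "magic" mechanic that no amount of fitting continuous trajectories can explain:

```cpp
#include <cassert>

// Hypothetical minimal world: state is position plus inertia (velocity).
struct Body { double x, vx; };

// Plain physics: the next state follows from location and inertia alone.
Body stepPhysics(Body b, double dt) {
    return {b.x + b.vx * dt, b.vx};
}

// A "game mechanic" in the above sense: a magic teleporter at x >= 10
// that warps the body back to the origin. No fit of position/inertia
// data explains this rule; an agent has to learn it as a discrete fact.
Body stepWithMechanic(Body b, double dt) {
    Body n = stepPhysics(b, dt);
    if (n.x >= 10.0) n.x = 0.0;  // the magic, non-physical rule
    return n;
}
```

An agent that has only ever modeled the continuous dynamics will be systematically surprised at the teleporter, which is exactly the kind of gap such mechanics are meant to probe.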

Much in the spirit of <gvgai>, we have to do the following loop:
* create an initial game that a human can solve
* find an AI that beats it well
* study the AI, and add a new mechanic that breaks the AI, but does not break a human!
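
The loop above can be sketched in code. Everything here is a hypothetical stand-in: in reality the scores would come from actual gameplay and training runs, not closed-form functions:

```cpp
#include <cassert>

// Hypothetical stand-ins: scores for a human baseline and for a freshly
// trained AI on a game of a given difficulty (number of mechanics).
struct Game { int difficulty = 0; };

double humanScore(const Game& g) { return 100.0 - 1.0 * g.difficulty; }
double aiScore(const Game& g)    { return 120.0 - 30.0 * g.difficulty; }

// Add one mechanic intended to hurt the AI much more than the human.
Game addMechanic(Game g) { ++g.difficulty; return g; }

// The loop: keep adding mechanics until the AI falls behind the human.
Game escalate(Game g) {
    while (aiScore(g) >= humanScore(g)) g = addMechanic(g);
    return g;
}
```

The interesting research output is not `escalate` itself but each `addMechanic` step: every mechanic that breaks the AI but not the human is evidence about what current algorithms are missing.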

The question then becomes: do we have enough computational power to simulate a game world that is analogous enough to the real world, so that our AI algorithms will also apply to the real world?

To reduce computational requirements, it is better to focus on a 2D world at first. Such a world, given the right mechanics, can break any AI, while still being much faster to simulate than a 3D world.
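
The cost argument is easy to see in a hand-rolled sketch (this is not Box2D, just an illustration): the inner loop of a 2D point-mass world touches only 4 state scalars per body, versus 6 or more (plus orientation and a far more expensive collision phase) in 3D:

```cpp
#include <cassert>
#include <vector>

// Minimal 2D point-mass world: 4 scalars of state per body.
struct Body2D { double x, y, vx, vy; };

// One fixed timestep of semi-implicit Euler under constant gravity gy.
void step(std::vector<Body2D>& bodies, double dt, double gy) {
    for (auto& b : bodies) {
        b.vy += gy * dt;   // update velocity first (semi-implicit)
        b.x  += b.vx * dt; // then integrate position
        b.y  += b.vy * dt;
    }
}
```

A real engine like Box2D adds collision detection and constraint solving on top, but the dimensionality argument is the same: every phase gets cheaper when you drop from 3D to 2D.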

The initial prototype uses the Urho3D open source <game engine>, which is a reasonable project. However, a raw <Simple DirectMedia Layer> + Box2D + <OpenGL> solution written from scratch would be faster to develop for this use case, both because Urho3D has many human-gaming features that are not needed here, and because in 2019 the Urho3D lead developers https://github.com/cirosantilli/china-dictatorship/blob/23c5bd936361f78a8dd6bd1f412286808714d2da/communities-that-censor-politics.md[disagreed with the China censored keyword attack].

Simulations such as these can be viewed as a form of https://en.wikipedia.org/wiki/Synthetic_data#Synthetic_data_in_machine_learning[synthetic data generation procedure], where the goal is to use computer worlds to reduce the costs of experiments and to improve reproducibility.
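
As a toy illustration of that idea (all names hypothetical), synthetic data generation amounts to rolling a policy through a simulated environment and recording the transitions, with a fixed seed making the "experiment" both cheap and exactly reproducible:

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// One recorded (state, action, reward) transition.
struct Transition { int state, action; double reward; };

// Roll a toy random policy through a toy environment, entirely in-silico.
std::vector<Transition> rollout(int steps, unsigned seed) {
    std::srand(seed);                  // fixed seed: reproducible dataset
    std::vector<Transition> data;
    int state = 0;
    for (int t = 0; t < steps; ++t) {
        int action = std::rand() % 2;  // toy random policy
        double reward = action ? 1.0 : 0.0;
        data.push_back({state, action, reward});
        state += action;               // toy dynamics
    }
    return data;
}
```

Two runs with the same seed produce the same dataset, which is the reproducibility property that real-world experiments struggle to offer.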

Ciro has always had a feeling that AI research in the 2020's is too unambitious. How many teams are actually aiming for <AGI>? When he then read <Superintelligence by Nick Bostrom (2014)> it said the same. <AGI research has become a taboo in the early 21st century>.

Related projects:
* https://github.com/deepmind/lab2d[]: 2D <gridworld> games, <C++> with Lua bindings

Related ideas:
* https://www.youtube.com/watch?v=MHFrhIAj0ME&t=4183 <Can't get you out of my head by Adam Curtis (2021)> Part 1: Bloodshed on Wolf Mountain :)
* https://www.youtube.com/watch?v=EUjc1WuyPT8 <AI alignment>: Why It's Hard, and Where to Start by <Eliezer Yudkowsky> (2016)

Bibliography:
* https://agents.inf.ed.ac.uk/blog/multiagent-learning-environments/ Multi-Agent Learning Environments (2021) by Lukas Schäfer from the <Autonomous agents research group of the University of Edinburgh>. One of their games actually uses apples as the visual representation of rewards, exactly like Ciro's game. So funny. They also have a 2D continuous game: https://agents.inf.ed.ac.uk/blog/multiagent-learning-environments/#mpe
* humanoid robot simulation
  * 2022 MoCapAct by <Microsoft Research>: https://www.microsoft.com/en-us/research/blog/mocapact-training-humanoid-robots-to-move-like-jagger
* <AI training game>{full}
* <software-based artificial life>{full}

\Video[https://youtube.com/watch?v=MvFABFWPBrw]
{title=<DeepMind> Has A Superhuman Level Quake 3 AI Team by Two Minute Papers (2018)}
{description=Commentary of <DeepMind>'s 2019 https://deepmind.com/blog/article/capture-the-flag-science[Capture the Flag paper]. DeepMind does some similar simulations to what Ciro wants, but TODO do they publish source code for all of them? If not Ciro calls <bullshit> on non-reproducible research. Does https://github.com/deepmind/lab[this repo] contain everything?}

\Video[https://youtube.com/watch?v=Lu56xVlZ40M]
{title=<OpenAI> Plays Hide and Seek... and Breaks The Game! by <Two Minute Papers> (2019)}
{description=Commentary of <OpenAI>'s 2019 https://openai.com/blog/emergent-tool-use/[hide and seek] paper. OpenAI does some similar simulations to what Ciro wants, but TODO do they publish source code for all of them? If not, Ciro calls bullshit on non-reproducible research, and even worse due to the fake "Open" in the name. Does https://github.com/openai/multi-agent-emergence-environments[this repo] contain everything?}

\Video[https://www.youtube.com/watch?v=tVNoetVLuQg]
{title=Much bigger simulation, AIs learn Phalanx by Pezzza's Work (2022)}
{description=2D agents with vision in a simple prey/predator scenario.}