AI safety
Basically ensuring that good AI alignment allows us to survive the singularity.
Ciro's 2D reinforcement learning games
Video 1.
Top Down 2D Continuous Game with Urho3D C++ SDL and Box2D for Reinforcement learning by Ciro Santilli (2018)
Source. Source code at: github.com/cirosantilli/Urho3D-cheat.
Figure 1.
Screenshot of the basketball stage of Ciro's 2D continuous game. Source code at: github.com/cirosantilli/rl-game-2d-grid. Big kudos to game-icons.net for the sprites.
Video 2.
Top Down 2D Discrete Tile Based Game with C++ SDL and Boost R-Tree for Reinforcement Learning by Ciro Santilli (2017)
Source.
The goal of this project is to reach artificial general intelligence.
A few initiatives have created reasonable sets of robotics-like games for the purposes of AI development, most notably: OpenAI and DeepMind.
However, all projects so far have only created sets of unrelated games, or worse: focused on closed games designed for humans!
What is really needed is to create a single cohesive game world, designed specifically for this purpose, and with a very large number of game mechanics.
Notably, by "game mechanic" is meant "a magic aspect of the game world, which cannot be explained by objects' location and inertia alone", in order to test the missing link between continuous and discrete AI.
Much in the spirit of gvgai, we have to do the following loop:
  • create an initial game that a human can solve
  • find an AI that beats it well
  • study the AI, and add a new mechanic that breaks the AI, but does not break a human!
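The loop above can be sketched on a toy example. Everything in this sketch is hypothetical and made up for illustration, not taken from the project's source code: the one-row tile world, the `locked_door` and `decorative_tiles` mechanics, and the two scripted policies. `greedy_ai` stands in for "an AI that beats the initial game", and `human` stands in for a player who understands that a key opens a door.

```python
# Hypothetical one-row tile world: positions 0..7.
START, KEY, DOOR, GOAL = 2, 0, 5, 7


def solves(policy, mechanics, max_steps=40):
    """Return True if the policy reaches GOAL within max_steps."""
    x, has_key = START, False
    for _ in range(max_steps):
        if x == GOAL:
            return True
        obs = {"x": x, "has_key": has_key, "mechanics": mechanics}
        nxt = max(0, min(GOAL, x + policy(obs)))  # policy returns -1 or +1
        # The "magic" mechanic: the door only opens for an agent holding the key.
        if "locked_door" in mechanics and nxt == DOOR and not has_key:
            nxt = x
        x = nxt
        if x == KEY:
            has_key = True
    return x == GOAL


def greedy_ai(obs):
    """Stand-in AI: always walk towards the goal."""
    return 1


def human(obs):
    """Stand-in human: fetch the key first if there is a locked door."""
    if "locked_door" in obs["mechanics"] and not obs["has_key"]:
        return -1
    return 1


def find_breaking_mechanic(candidates):
    """Return the first mechanic that breaks the AI but not the human."""
    mechanics = frozenset()
    for m in candidates:
        trial = mechanics | {m}
        if solves(human, trial) and not solves(greedy_ai, trial):
            return m
        if solves(human, trial):
            mechanics = trial  # harmless for both: keep it in the world
    return None


print(find_breaking_mechanic(["decorative_tiles", "locked_door"]))
```

In this toy setup the returned mechanic is the next research target: the AI has to be improved until it handles it, and then the loop starts again with a new candidate.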
The question then becomes: do we have enough computational power to simulate a game world that is analogous enough to the real world, so that our AI algorithms will also apply to the real world?
To reduce computation requirements, it is better to focus on a 2D world at first. Such a world with the right mechanics can break any AI, while still being faster to simulate than a 3D world.
The initial prototype uses the Urho3D open source game engine, which is a reasonable project. However, a raw Simple DirectMedia Layer + Box2D + OpenGL solution from scratch would be faster to develop for this use case, since Urho3D has a lot of human-gaming features that are not needed, and because as of 2019 the Urho3D lead developers disagree with the China censored keyword attack.
Simulations such as these can be viewed as a form of synthetic data generation procedure, where the goal is to use computer worlds to reduce the costs of experiments and to improve reproducibility.
Ciro has always had a feeling that AI research in the 2020's is too unambitious. How many teams are actually aiming for AGI? When he then read Superintelligence by Nick Bostrom (2014) it said the same. AGI research has become a taboo in the early 21st century.
Related projects:
Bibliography:
Video 3.
DeepMind Has A Superhuman Level Quake 3 AI Team by Two Minute Papers (2018)
Source. Commentary of DeepMind's 2019 Capture the Flag paper. DeepMind does some similar simulations to what Ciro wants, but TODO do they publish source code for all of them? If not Ciro calls bullshit on non-reproducible research. Does this repo contain everything?
Video 4.
OpenAI Plays Hide and Seek... and Breaks The Game! by Two Minute Papers (2019)
Source. Commentary of OpenAI's 2019 hide and seek paper. OpenAI does some similar simulations to what Ciro wants, but TODO do they publish source code for all of them? If not Ciro calls bullshit on non-reproducible research, and even worse due to the fake "Open" in the name. Does this repo contain everything?
Video 5.
Much bigger simulation, AIs learn Phalanx by Pezzza's Work (2022)
Source. 2d agents with vision. Simple prey/predator scenario.
Human Compatible
The key takeaway is that giving an AGI entity an explicit value function is a good way to destroy the world due to poor AI alignment. We are more likely to avoid destruction by creating an AI whose goal is to "do what humans want it to do", but in such a way that it does not know beforehand what it is that humans want, and has to learn it from them. This approach appears to be known as reward modeling.
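As a rough illustration of learning what humans want instead of hard-coding it, here is a minimal sketch that fits a reward value per outcome from pairwise human preferences, using a Bradley-Terry style logistic model. This is one common way of doing preference-based reward learning and is an assumption of this sketch, not something described in the book summary above. The three outcomes and the preference list are made up.

```python
import math

# Hypothetical outcomes of asking a robot to make coffee.
OUTCOMES = ["world turned into paperclips", "coffee made", "coffee made, kitchen wrecked"]

# Hypothetical human feedback: (preferred outcome index, rejected outcome index).
PREFERENCES = [(1, 2), (1, 0), (2, 0)]


def train_reward_model(preferences, n_outcomes, lr=0.5, epochs=100):
    """Fit one scalar reward per outcome by gradient ascent on the
    log-likelihood of the observed preferences."""
    r = [0.0] * n_outcomes
    for _ in range(epochs):
        for a, b in preferences:
            # Model's current probability that a is preferred over b.
            p = 1.0 / (1.0 + math.exp(-(r[a] - r[b])))
            g = 1.0 - p  # gradient of log(p) with respect to r[a]
            r[a] += lr * g
            r[b] -= lr * g
    return r


rewards = train_reward_model(PREFERENCES, len(OUTCOMES))
ranking = sorted(range(len(OUTCOMES)), key=lambda i: rewards[i], reverse=True)
for i in ranking:
    print(f"{rewards[i]:+.2f}  {OUTCOMES[i]}")
```

The designer never writes down the reward numbers here; they are inferred from the comparisons, so more feedback can keep correcting them.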
Some other cool ideas:
  • a big thing that is missing for AGI in the 2010's is some kind of more hierarchical representation of the continuous input data of the world, e.g.:
    • intelligence is hierarchical
    • we can group continuous things into higher objects, e.g. all these pixels I'm seeing in front of me are a computer. So I treat all of them as a single object in my mind.
  • game theory can be seen as part of artificial intelligence that deals with scenarios where multiple intelligent agents are involved
  • probability plays a crucial role in our everyday living, even though we don't think too much about it very explicitly. He gives a very good example of the cost/risk tradeoffs of planning a trip to the airport to catch a plane. E.g.:
    • should you leave 2 days in advance to be sure you'll get there?
    • should you pay an armed escort to make sure you are not attacked on the way?
  • economics, and notably the study of utility, is intrinsically linked to AI alignment
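The airport example can be made concrete with a small expected cost calculation. All numbers below are hypothetical and chosen only for illustration: the value of an hour of your time, the cost of missing the flight, the price of the escort, and the probability of missing the flight under each plan.

```python
HOUR_COST = 20.0   # hypothetical value of one hour spent waiting
MISS_COST = 500.0  # hypothetical cost of missing the flight

# plan name -> (hours left early, extra money spent, probability of missing the flight)
PLANS = {
    "leave 1 hour early": (1, 0.0, 0.30),
    "leave 3 hours early": (3, 0.0, 0.02),
    "leave 3 hours early with armed escort": (3, 1000.0, 0.019),
    "leave 2 days early": (48, 0.0, 0.001),
}


def expected_cost(hours_early, extra_cost, p_miss):
    """Certain costs plus the probability-weighted cost of missing the flight."""
    return hours_early * HOUR_COST + extra_cost + p_miss * MISS_COST


costs = {name: expected_cost(*plan) for name, plan in PLANS.items()}
best = min(costs, key=costs.get)

for name, cost in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{cost:8.1f}  {name}")
print("best plan:", best)
```

With these made-up numbers, leaving two days early or hiring an escort removes very little risk for a large certain cost, which is the kind of implicit tradeoff the bullet above describes.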
The Matrix (1999)
Ciro Santilli just kept watching it a gazillion times whenever it was shown on TV.
All action scenes are useless crap, but the premise is great, touching on Ciro's precious simulation hypothesis subject, related physics and the illusion of life.
It is a shame that the key premise of using human bodies to produce energy is completely and impossibly stupid. You would obviously get more energy by directly burning the food you feed into humans.
If the film had been made later, maybe the much more plausible concept of AI alignment would have been used instead. What a shame.
Video 1.
Blue Pill or Red Pill scene from The Matrix (1999). Source.