OpenAI Gym

OpenAI ceased development in 2021, and the project was taken up by a not-for-profit as Farama Gymnasium.

Farama Gymnasium

OpenAI ceased development of OpenAI Gym in 2021, and the not-for-profit Farama Foundation took over its maintenance.

gymnasium==1.1.1 just worked on Ubuntu 24.10 when tested with the hello world gym/random_control.py:

sudo apt install swig
cd gym
virtualenv -p python3 .venv
. .venv/bin/activate
pip install -r requirements-python-3-12.txt
./random_control.py

This opens a game window on the desktop.

Figure 1. Lunar Lander environment of Farama Gymnasium with random controls.

This example just passes random commands to the ship, so don't expect wonders. The cool thing about it, though, is that you can open any environment with it, e.g.:

./random_control.py CarRacing-v3
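
As a rough idea of what such a random-control script boils down to, here is a minimal sketch. It is not the actual contents of gym/random_control.py: the function names `run_random` and `main`, the step limit, and the default environment ID `LunarLander-v3` are assumptions made for illustration. Only the standard Gymnasium calls (`gym.make`, `reset`, `action_space.sample`, `step`, `close`) are relied upon.

```python
import sys


def run_random(env, max_steps=1000):
    """Drive `env` with randomly sampled actions and return the summed reward.

    `env` only needs the usual Gymnasium interface: reset(), step(action)
    returning a 5-tuple, and an action_space with a sample() method.
    """
    env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        # Pick any valid action, ignoring the observation entirely.
        action = env.action_space.sample()
        _obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += float(reward)
        if terminated or truncated:
            # Episode over (crash, landing or time limit): start a new one.
            env.reset()
    return total_reward


def main(argv):
    # Imported here so that run_random() stays usable without gymnasium.
    import gymnasium as gym

    # Assumption: Lunar Lander is the default, any other ID can be passed.
    env_id = argv[1] if len(argv) > 1 else "LunarLander-v3"
    env = gym.make(env_id, render_mode="human")
    try:
        print(run_random(env))
    finally:
        env.close()
```

Calling `main(sys.argv)` at the bottom of the file turns this into a command-line script. Because nothing in the loop depends on the observation or action types, the same code works for any environment ID, which is why swapping in CarRacing-v3 is just a matter of changing the argument.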

To manually control it we can use gym/moon_play.py:

cd gym
./moon_play.py

Manual control is extremely useful to get an intuition about the problem. You will notice immediately that controlling the ship is extremely difficult.

Figure 2. Lunar Lander environment of Farama Gymnasium with manual control.

We slow it down to 10 FPS to give ourselves a fighting chance.
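
One generic way to cap a loop at a fixed frame rate is to sleep until the next frame deadline. The sketch below shows that idea. It is not taken from gym/moon_play.py, and the `FrameLimiter` class and its parameters are hypothetical names chosen for illustration.

```python
import time


class FrameLimiter:
    """Block in wait() so that a loop runs at most `fps` times per second."""

    def __init__(self, fps, clock=time.monotonic, sleep=time.sleep):
        self.period = 1.0 / fps
        self.clock = clock
        self.sleep = sleep
        self.deadline = clock() + self.period

    def wait(self):
        remaining = self.deadline - self.clock()
        if remaining > 0:
            # We finished the frame early: sleep off the rest of the period.
            self.sleep(remaining)
        self.deadline += self.period
        if self.deadline < self.clock():
            # We fell behind: restart from now instead of rushing to catch up.
            self.deadline = self.clock() + self.period


# Usage inside a control loop (the loop body is a placeholder):
#
#     limiter = FrameLimiter(fps=10)
#     while True:
#         ...  # read the keyboard, call env.step(action)
#         limiter.wait()
```

The clock and sleep functions are injectable only so that the timing logic can be checked without actually waiting.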

We don't know if it is realistic, but what is certain is that this is definitely not designed to be a fun video game:

- the legs of the lander are short and soft, and you're not supposed to hit the body on the ground, so you have to go very slowly
- the thrusters are quite weak, and inertia management is super important
- the ground is very slippery

A good strategy is to land anywhere very slowly and then inch yourself towards the landing pad.

The documentation for it is available at: gymnasium.farama.org/environments/box2d/lunar_lander/

The agent input is described as:

The state is an 8-dimensional vector: the coordinates of the lander in x & y, its linear velocities in x & y, its angle, its angular velocity, and two booleans that represent whether each leg is in contact with the ground or not.

So it is a fundamentally flawed robot training example, as the global x and y coordinates are precisely known.
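
To make the layout of that vector concrete, here is a small sketch that unpacks it into named fields, following the order given in the quoted description. The names `LanderState` and `unpack_state` and all the field names are hypothetical. In particular, `leg1_contact` and `leg2_contact` are deliberately neutral names, because the quote does not say which leg comes first.

```python
from typing import NamedTuple, Sequence


class LanderState(NamedTuple):
    x: float                 # global horizontal position
    y: float                 # global vertical position
    vx: float                # horizontal linear velocity
    vy: float                # vertical linear velocity
    angle: float             # orientation of the lander
    angular_velocity: float
    leg1_contact: bool       # first leg touching the ground
    leg2_contact: bool       # second leg touching the ground


def unpack_state(obs: Sequence[float]) -> LanderState:
    """Convert the raw 8-element observation into a LanderState."""
    if len(obs) != 8:
        raise ValueError(f"expected 8 values, got {len(obs)}")
    x, y, vx, vy, angle, omega, leg1, leg2 = (float(v) for v in obs)
    # The leg flags arrive as numbers, so convert them to booleans.
    return LanderState(x, y, vx, vy, angle, omega, bool(leg1), bool(leg2))
```

Written this way, it is easy to see that `x` and `y` are handed to the agent directly, which is exactly the objection raised above.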

Variation in the scenario comes from:

- initial speed of the vehicle
- shape of the lunar surface. TODO: can the ship observe the lunar surface shape in any way? If not, once again, this is a deeply flawed example.

The actions are documented at: