Maintaining OpenAI Gym for the ultimate goal of safe AGI

Hello, who are you and what are you working on?

Hi, my name is Peter Zhokhov. I'm a former physicist and now a Member of the Technical Staff at OpenAI. Currently, I'm working as a DRI (designated responsible individual) for OpenAI Gym, but that is not to say that I have written all or even most of Gym myself. In fact, it was largely ready when I joined OpenAI. Moreover, I split the maintenance load with my OpenAI colleague Chris Hesse.

OpenAI Gym is a collection of reinforcement learning environments with a single interface; or in other words, a place for AI to play games. These games range from very simple ones designed to train a specific reasoning skill to the games that people play (such as Atari videogames) to simulations of robotic movement and manipulation.

Why did you start working on this?

OpenAI's mission is to build safe AGI (artificial general intelligence), and AI that learns to play games in Gym (or, more specifically, learns to achieve high reward) is an essential stepping stone towards general intelligence. At the same time, the unified interface of all the games allows us at OpenAI, and the research community as a whole, to focus on the AI research itself and not worry too much about game design or compatibility. For example, imagine a researcher comes up with a powerful algorithm that quickly learns to walk in an ant simulator and would like to see how well this algorithm would do when learning to walk like a human. With Gym, the only code change required is the name of the environment.
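To make the point about the unified interface concrete, here is a minimal, self-contained sketch. It does not use Gym itself: the two toy environments, the `REGISTRY` dictionary, the `make` helper and the `sample_action` method are all hypothetical names invented for this illustration. They are only loosely modeled on the reset/step pattern that Gym environments follow. The idea it tries to show is that the agent loop is written once and the environment is selected purely by name.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: guess the outcome of a coin flip, 5 steps per episode."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.steps_left = 0

    def reset(self):
        self.steps_left = 5
        return 0  # initial observation

    def step(self, action):
        coin = self.rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0
        self.steps_left -= 1
        done = self.steps_left == 0
        return coin, reward, done, {}  # observation, reward, done flag, extra info

    def sample_action(self):
        return self.rng.randint(0, 1)


class CountdownEnv:
    """Hypothetical toy environment: action 1 earns reward, 3 steps per episode."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.counter = 0

    def reset(self):
        self.counter = 3
        return self.counter

    def step(self, action):
        reward = 1.0 if action == 1 else 0.0
        self.counter -= 1
        done = self.counter == 0
        return self.counter, reward, done, {}

    def sample_action(self):
        return self.rng.randint(0, 1)


# Hypothetical registry mapping environment names to classes (an assumption for this sketch).
REGISTRY = {"CoinFlip-toy": CoinFlipEnv, "Countdown-toy": CountdownEnv}


def make(name):
    """Look up an environment by name and construct it."""
    return REGISTRY[name]()


def run_episode(env_name):
    """Run one episode with a random agent; nothing here depends on which environment is used."""
    env = make(env_name)
    env.reset()
    total_reward, steps, done = 0.0, 0, False
    while not done:
        action = env.sample_action()
        _obs, reward, done, _info = env.step(action)
        total_reward += reward
        steps += 1
    return steps, total_reward


if __name__ == "__main__":
    # Switching environments means changing only the name passed in.
    for name in ("CoinFlip-toy", "Countdown-toy"):
        print(name, run_episode(name))
```

Because both toy environments expose the same `reset` and `step` methods, `run_episode` never needs to know which one it is driving. In this sketch, that is the whole benefit: the algorithm and the environment stay decoupled.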

How have you spread the word and got more people to contribute to OpenAI Gym?

OpenAI Gym was in the lucky position of not having to compete with any other project. Before it, there was not really any project providing a unified interface for all of the environments, and in existing open-source reinforcement learning projects the environment and the algorithm would be tied together, making it hard to experiment with new algorithms. So when we released this big step forward, the research community was very receptive and positive.

In addition to writing a blog post for the release, we also partnered with NVIDIA and Intel to train AI agents on OpenAI Gym games, and gave out AWS credits to people using Gym in their research.

One thing that didn't work so well was creating a leaderboard that would allow people to submit their agents and compete against the previous best agent in the game; it turned out to be a huge maintenance burden on the team on top of maintaining the Gym codebase itself, fixing bugs and so on.

What are the biggest obstacles you've had to overcome?

I think the biggest challenge was to get the abstractions right (which I suppose is rather universal for software). And while (in my humble opinion :)) Gym got pretty close to getting them right, there are still things to improve.

Also, important as it sounds, maintaining open-source repos is not the most glamorous job, and sidelining the maintenance in favor of new cool internal projects was a real issue for me. Big shoutout to Chris for making sure we regularly set time aside to deal with issues and pull requests.

What are your hopes for the future of OpenAI Gym?

It would be great if one day we could express the entire complexity of the world as a Gym environment. Speaking seriously though, reinforcement learning is developing at an amazing pace and solving harder and harder problems. While it is clear that, for instance, the CartPole environment is not a hard enough task to be a benchmark of any kind anymore, my hope is that future environments and benchmarks will be united using the Gym interface.

What advice do you have for other open source projects and maintainers?

Maintenance is just as important as new features (if not more so), so my advice is to prioritize it appropriately. Another piece of advice is to be assertive: people will always file issues that are not really relevant to the project itself (maybe a bug in a third-party package, maybe a system misconfiguration), and it is much more productive to ask for clarification than to spend hours trying to replicate a problem with insufficient information.

OpenAI Gym is a great way of accelerating contributions in the field of reinforcement learning, in service of OpenAI's larger goal of creating safe AGI (which, if it works, will likely be the most important invention in human history). To find out more you can see the website here, or contribute on GitHub here.

To follow Peter's work, you can follow him on GitHub or find him on LinkedIn.
