Post
by **TomCat96** » Fri Oct 18, 2019 1:25 pm

It's important to distinguish among the disparate AI techniques out there.

As a hobby of mine, I like to tinker with OpenAI gym and manipulate some of the algorithms for these game bots.
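That tinkering is built around gym's reset/step interaction loop. Here's a minimal sketch of the pattern, using a hypothetical toy environment as a stand-in for a real gym env (with the real thing you'd swap in `gym.make(...)`, but the loop is the same):

```python
import random

class ToyEnv:
    """Stand-in for a gym environment: walk right from cell 0 to cell 4."""
    def reset(self):
        self.pos = 0
        return self.pos  # initial observation

    def step(self, action):
        # action: 0 = left, 1 = right (reflecting wall at 0)
        self.pos = max(0, self.pos + (1 if action == 1 else -1))
        done = self.pos == 4
        reward = 1.0 if done else 0.0
        return self.pos, reward, done, {}  # obs, reward, done, info

env = ToyEnv()
obs = env.reset()
total_reward, done = 0.0, False
while not done:
    action = random.choice([0, 1])            # a "random policy"
    obs, reward, done, info = env.step(action)
    total_reward += reward
print(total_reward)
```

A real bot replaces `random.choice` with a learned policy, but everything else about the loop stays the same.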

This particular field of AI is called reinforcement learning, as opposed to, say, your typical linear classifiers, LSTMs, autoencoders, and convolutional neural networks.

Of all these, linear classifiers are probably the best understood. And they do a decent job classifying things.

The mathematics of a linear classifier essentially comes down to solving for a separating vector in an N-dimensional space, or minimizing a cross-entropy loss in steps so that the parameters of the model gradually fit the data.
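To make that concrete, here's a toy sketch of that stepwise minimization: a logistic-regression classifier (about the simplest linear classifier) fit by gradient descent on the cross-entropy loss. The 2-D data and learning rate are made up for illustration:

```python
import numpy as np

# Tiny made-up dataset: two linearly separable clusters in 2-D.
X = np.array([[0.0, 0.5], [0.5, 0.0], [0.2, 0.3],
              [2.0, 2.5], [2.5, 2.0], [2.2, 2.3]])
y = np.array([0, 0, 0, 1, 1, 1])

w = np.zeros(2)
b = 0.0
lr = 0.5

for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid predictions
    # Gradient of the mean cross-entropy loss w.r.t. w and b:
    grad_w = X.T @ (p - y) / len(y)
    grad_b = np.mean(p - y)
    w -= lr * grad_w                          # one "step" of minimization
    b -= lr * grad_b

preds = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)
print(preds)  # should match y on this separable toy set
```

The learned `w` is exactly the "vector in an N-dimensional space" being solved for: it defines the hyperplane separating the two classes.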

The mathematics is quite different for reinforcement learning, where the target is to search for an optimal policy.

There exists a concept in reinforcement learning that does not exist in your typical linear classifiers and convolutional neural networks--that of the Agent.

The Agent is an actor that interacts with a state space and produces actions. There also exists a set of rewards (which, in the tabular case, is really a giant matrix over states and actions).

The agent acts within this state space in discrete time steps. The objective is to find a policy--a mapping from states to actions--that maximizes the cumulative reward over the course of an episode.
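To make the agent/state/action/reward/policy picture concrete, here's a minimal tabular Q-learning sketch on a made-up 5-cell corridor. Every number here (learning rate, discount, episode count) is illustrative:

```python
import random

N_STATES, GOAL = 5, 4          # states 0..4, reward only at the goal
ACTIONS = [0, 1]               # 0 = left, 1 = right
Q = [[0.0, 0.0] for _ in range(N_STATES)]   # the reward-estimate "matrix": state x action
alpha, gamma, eps = 0.5, 0.9, 0.3

def step(s, a):
    s2 = max(0, min(GOAL, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

random.seed(0)
for _ in range(300):                         # episodes
    s, done = 0, False
    while not done:
        # epsilon-greedy: mostly exploit the current Q estimates, sometimes explore
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a: Q[s][a])
        s2, r, done = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES)]
print(policy)  # the greedy policy should point right (1) at every state before the goal
```

The learned `policy` list is exactly the state-to-action mapping described above: for each state, the action with the highest estimated long-run reward.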

In my opinion, reinforcement learning, the AI of bots, is an order of magnitude more complex than linear classifiers. Papers on reinforcement learning get posted regularly. But by and large, it's not the coding that is difficult. Code, and even libraries implementing the new mathematical techniques, are available within days of the papers being posted. These libraries are available for use with TensorFlow.

Thus, tinkering with a new bot to interact with the OpenAI gym world can be just a matter of a few calls to some libraries. Truly amazing, really.

I have a Sonic bot running around interacting with his world, trying to run loops and collect gold rings. Believe it or not, it's actually beaten the first stage once or twice.

Here's the problem though. How do I make the bot better?

I don't have a single clue. And I'm not sure anyone else does either. Sure, you can tinker with a few things, combine AI techniques, tweak reward functions, throw more computing power at it. But we don't actually know what will make it better. If I throw 10 GPUs at it instead of 5, how much "smarter" will Sonic be? There are no metrics to determine this. Even for certain RL algorithms which mathematically guarantee convergence, the state spaces of these games are so ridiculously intractable to brute force that it's just not going to make much of a difference.

I won't speak for the other AI techs, but for DOTA botting it's not so much a matter of engineering at this point as it is of going back to the whiteboard. We can't engineer our way to better bot AI yet--at least not fundamentally. I would argue that at this moment, it's really in the hands of the PhDs.