My First Impressions of AlphaStar


Today DeepMind showed off their StarCraft 2 playing AI agent: AlphaStar. To sum things up right away: what they have done is really incredible. I've talked to just about everyone involved in StarCraft AI over the years, and the most optimistic estimates were that this level of performance was at least 2 years away. That estimate, however, was based on the previous rate of publicly shown progress in StarCraft AI, which increased dramatically once large companies devoted significant resources to the topic. Huge congratulations to the DeepMind team for what I consider a huge success in the field of StarCraft AI, and in AI research in general.

There will be a lot of nit-picking by the StarCraft / AI communities about whether or not the setup of these show matches was actually that impressive, and I want to briefly address the points I've seen so far:

1. The games were all played on a single (two-player) map

AlphaStar was trained to play on a single map, Catalyst LE: a very small two-player map in which each player knows the exact location of the enemy's base at the start of the game. This does indeed bias things toward AlphaStar, because it (a) makes the learning task much easier (a single environment), (b) allows the bot not to scout (it knows where the enemy is), and (c) is a small map, so rush strategies are much more powerful (and AIs have traditionally been better at them due to their lower complexity).

From an AI research perspective, I am curious to see whether the models trained by AlphaStar are capable of scaling to larger maps, generalizing to the different topologies of those maps, and handling the varied strategies that come with that challenge.

2. The pro player was not at a world-champion level

MaNa is a great player, especially at PvP. In my opinion, being able to beat him represents an incredible achievement. It is perhaps just a matter of additional training time to beat a stronger opponent. I do not see this as a valid criticism in any way.

3. AlphaStar only played one matchup: Protoss vs Protoss

This is probably a valid initial criticism; however, what they have done is an amazing start, and far better than any learning-based approach that has been done in the past. I am interested to see how well the networks scale to playing different matchups. It is possible, however, that they could simply train a separate network for each matchup, making this a non-issue.

4. Protoss is the 'easiest' race for AI to exploit due to Stalkers/Blink

I do believe this is true, but it is not necessarily a criticism of AlphaStar itself. Stalker micro has been shown to be incredibly powerful, even in rule-based SC2 agents. That the AI learned to exploit this fact just shows that it is learning properly, not that it is 'cheesing' the opponent. Most of AlphaStar's wins seemed to come from simply overwhelming the opponent with Stalkers and Blink micro, which was able to beat varied strategies from the opponents. I think this should be more of a criticism of how strong the Stalker unit is when combined with perfect AI precision in clicking, rather than a reason to blame the agent for going with that strategy.

5. It played 200 years of StarCraft 2 in one week. Is this fair?

Fair is not really a concept that matters when it comes to AI or computation. In the end, what matters is only whether or not the AI system won. Obviously, computers do not consist of human brains, so a massive amount of computation is involved in learning, and the ability to parallelize and play games at superhuman speeds is one of the things researchers will take advantage of when creating an AI system like AlphaStar, just as they do for many other engineering projects.

6. It got destroyed in the live match. What happened?

This was actually quite interesting to see, and it can lead to obvious speculation about whether or not the bot performed as well as claimed during the recorded matches. However, I have no doubt that the claims made by DeepMind are valid, that the games were all above board, and that the pro players are telling the 100% truth when they say the bot is as strong as it looked in the recorded games.

I think what probably happened is that the human player tried some strategies the bot had not seen before (the Immortal drop), and this led to some poor decision making. This is of course the bane of any learning algorithm: you can only learn about things you have encountered before, and so coming up against new tactics will usually be very difficult. This is perhaps a serious problem in a game with a strategy space as large as StarCraft, and hopefully one that can be solved with additional training time, or by training against a wider variety of opponents. What it also showed is that the types of mistakes AIs make are often very different from the ones humans make. There's no way any human would have allowed their base to be attacked for that long without retaliation. Perhaps it was a missing semicolon. Only DeepMind knows :)

In conclusion, I'm not entirely sure yet that AlphaStar is a large theoretical AI leap over AlphaGo, since I don't know the details behind it. However, the results they have achieved so far are amazing, and they show real promise that deep learning / reinforcement learning can perform well in games with large state and action spaces. I would claim that AlphaStar is currently the strongest StarCraft AI agent that exists in any form, and it was built in a way (learned) that allows it to get stronger just by devoting more time to training. In a way, these results are so good so quickly that it's a little depressing for an academic researcher like myself to see just how fast the field can progress when you're willing to throw tens of millions of dollars behind it (my estimate), a scale we could never match as academic researchers.

I am very excited to see how their approach scales to larger, more varied maps and different matchups, and whether the bot can eventually stand up to all the cheese that humans will inevitably throw at it. GG!
