Monday, November 29, 2010

Google AI Challenge (Planet Wars) Entry

UPDATE: I seem to have gotten second place.  Not bad, that's definitely higher than I expected.

At last, I am free of it!  Google AI Challenge (Planet Wars) is over.  No more spending all my free time thinking about how to eek out extra few wins, no more rushing home after work to start testing new bots on the TCP server.  Or, for that matter, no more postponing some pretty important things just because I wanted to implement a new idea.  However, it was fun, and quite worth it.

Without further ado, here's the description of what I did.  The code is on github.

1. Laungage, Tools

The first important choice was the language.  Of those that had starter packs, I would have been most comfortable with C++ and Python; remembering that most top bots in the previous Tron challenge were written in C++, I decided to use that for my bot.  In retrospect it was a pretty flimsy argument, as it could have easily been that the top contestants were just more comfortable with C++, or thought it would be better than other languages, creating a self-fulfilling prophecy.  I regretted the choice of language a few times as I find Python to be a bit more developer-friendly, but in either case, C++ worked out well enough and pretty quickly it was too late to back out.

It would be difficult to build the bot without some helpful tools for testing and debugging.  The key problem-specific tools were:
  • PlanetWarrior. The starter pack came with a game engine which ran the games pretty well, however it didn't quite have the features I was looking for, so the first order of business was to make a new one.  This way I could view the games as they were played, could easily follow the messages between the game engine and the bot, could see the planet IDs and growth rates, and I could do all of this with just a few mouse clicks.  Also, it was a good excuse to play around with Qt SDK.  It took about a week to write, and helped immensely.  A few other people seemed to come to the same conclusion, and there were actually quite a few contestant-made game engines in the end.
  • dhartmei's TCP Server.  This proved to be quite invaluable.  On the official contest server, the contestants could upload only one version of the game, and it got to play maybe every hour or two.  On dhartmei's server I could run as many bots concurrently as my computer could handle, and it was easily possible get two to four hundred games per day for each bot.  This allowed me to leave some bots running in the morning before work, and have a more or less clear winning strategy by the time I got home.  It also allowed for more opportunities to discover bugs and tactical errors.
  • Game State Extractor Bookmarklet.  Back in the days of Tron challenge, one of the contestants wrote a browser bookmarklet that allowed one to extract the current state of the game in the format that could be fed directly to the bot.  No one wrote one for Planet Wars by the time I needed it, so I took the task upon myself.  Later, when my bots' behaviour started to depend on the preceding turns, I made a few enhancements that allowed to extract a series of game states from the start to the turn of interest in a format that could be pasted directly to the bot's stdin to make it replay the whole game up to the point.
  • spz's Visualizer Site. The visualizer.js versions used to display the games on the official server and dhartmei's server were usually a few steps behind the latest and greatest experimental versions, and for a while this was the best way to watch the games that took place on either server.  This way I had easier access to the planet IDs and growth rates, and could scroll through the game.

2. The Basics: Game Timeline, Picking Moves

After making a few simple bots just to see whether I could, I got down to business.  The first thing to be written were GameTimeline and PlanetTimeline classes that were responsible for keeping track of the future game states for the next horizon turns, where horizon was a value that was arbitrarily set to {current map radius + 5}.  PlanetTimeline kept track of such things as the number of ships on the planet at each turn, the owner, the departures and arrivals, the number of ships reserved or not reserved for any purpose.  GameTimeline contained a vector of PlanetTimeline objects, plus another vector of PlanetTimeline objects that represented the "base case" when evaluating effects of moves, plus a few other minor things.  I suspect majority of contestants had a similar setup.

The GameTimeline and PlanetTimeline objects were kept between the turns, though were completely recalculated at the start of each turn as I could not find a reliable way of updating them incrementally.  The PlanetWars, Fleet and Planet classes from the original starter pack were kept with a few changes, though only PlanetWars (renamed to GameMap) was used regularly, mostly to keep track of distances between planets and, for each planet, lists of other planets sorted by distance from it.  The latter were calculated on the first turn and kept throughout the game.  It took a while to make sure that odd but important cases like "opponent attacks planet A, that reduces number of ships there and causes it to not send ships to B, which leaves A with more ships to spare but causes B to be lost a few turns later" were handled properly.

At each turn, the moves were selected by 
  1. iterating through every planet and every possible arrival time t = 1, ..., horizon, coming up with a combination of fleets to invade (or defend) the planet at t, and calculating the return on the invasion by taking the likely number of ships gained divided by the ships spent.
  2. picking the invasion plan that provided the best return, adding it to the list of moves to make, and applying it to the GameTimeline and reserving ships on the planets where they will depart from
  3. repeating steps 1 and 2 until there are no more ships to send or there are no more attack plans with positive returns.
This process remained largely the same throughout the entire development, though there were substantial changes to the details.  Again, I suspect that most contestants in the top half used a similar approach.

3. Moving Ships To Front

A quick observation of the top bots showed that they seemed to shuttle their ships from the rear planets to the front planets.  I shamelessly stole that idea, and that, combined with the above timelines, landed me pretty quickly at the third place on the leader board at the end of September.

To start, this was accomplished by iterating over each planet (source) at the beginning of the turn, and
  1. finding the enemy planet closest to the source (closest enemy)
  2. finding another planet of mine, target, that was closest to the closest enemy, subject to the constraint that the distance from target had to source had to be less the distance from closest enemy to source.
  3. If target did not exist, then the source was a front planet, and didn't have to send its ships anywhere.
  4. If the target did exist, then the source had to send all of its free ships to the target.
After some consideration, in addition to immediately sending all the ships to the front planets, I've decided to add all future front-supplying actions to the timeline, to reflect the fact that the rear planets will not have any ships available and that the front planets will have more ships available in the future.

This approach was a good starting point, however it had its drawbacks.
  • The rear planets could not take nearby neutral planets.  This worked well if there were more good neutral targets closer to the center, but was quite bad if the best neutrals were in the starting neighborhood.
  • For the same reason I started to notice that there were times when a planet would send a fleet to capture a nearby neutral with an expectation that another nearby planet would send an additional fleet that would join it later; however, before that could happen the second planet would become a rear planet and the original neutral target would not be captured. 
  • Rear planets did not help defend nearby planets.  The rear planet under attack would always keep at least enough ships to defend itself from the incoming ships, but any additional help had to arrive from the front planets.
  • The ships started to flow to the front planet only after it became mine; ideally this would start happening as soon as it was known that the planet would be mine, to provide better support and exploit more opportunities for attack.
To get around these problems, about a month later I reworked the movement of ships to the front by splitting it into a two step process:
  1. At the beginning of the turn, each planet was checked for being a rear planet.  If it was, it was marked as such.  Then during the move selection process it was prohibited from sending ships to the enemy planets, but not to my or neutral planets.
  2. After the moves were selected, the rear planets would send unallocated ships to the front planets.  The front planets were now selected from the list of planets that would be mine at any point now and horizon, with an additional check to make sure that they'd actually be mine when the support ships arrive there.

4. Dead End: Opponent's Counter-Moves And Counter-Counter-Moves

At the end of September I had to go on a trip for work, and most of my free time was occupied by other personal projects that I had been putting off for quite a while.  Thus for most of October I didn't touch the bot, and it quietly slid to somewhere in the 50s.

I did have time though to take a look at how the bot played, and noticed that often a move would expose it to an easy attack by the enemy.  For example, the bot would sometimes go for a neutral planet without realizing that the opponent can easily steal the planet on the next turn.  The image below illustrates the problem: red player's fleets are moving in to conquer the neutral planet without realizing that it cannot be held against the blue's forces.  Red will be down 50 ships, and blue will have all the spoils, possibly winning the game.



Alternatively, in close starting positions the bot would sometimes send most of the ships to conquer nearby planets without realizing that the opponent can now easily conquer the starting planet.

When I got home I tried the first obvious solution: when evaluating the number of ships my bot would gain if it makes a move, try to find the opponent's best response to my action and add that to the game timeline.  Then evaluate the difference in the ships gained after my and my opponent's moves.

This partially solved the problem, but now made the bot reluctant to invade planets that could be lost but easily recaptured.  The solution to that, of course, was to find the best response to opponent's response.  This ended up improving my bot, but it still kept making many mistakes trying to capture planets that it inherently could not hold, or incorrectly guessing the opponent's best response to my move.  I decided that this was really not the best solution and went shopping for something better.

5. Potentials

The solution came, again, from observing the top bots.  While thinking about a way of making the bot understand whether it would be inherently unable to hold the planet, I saw bocsimacko's bot do something interesting in one of the games: it sent ships from one of its planets to another, without an immediately apparent reason.  It was as if it "rebalanced" its ship distribution, because the planet on the other side of the map was facing more enemies. This lead me to develop what I at first called "balances" but later renamed "potentials".  

Once fully implemented, at the beginning of each turn the bot would calculate the following values for each planet P, for each future turn t = 1, ..., horizon and for each radius d = 1, ..., t:
  • Defense potential:  this, roughly speaking, answered the question "if I were to do nothing at this turn, and the opponent sends an all-out attack on P that will depart at t - d and arrive at t, what is the difference between the my ships that could arrive at t and the opponent's ships that would arrive at t?"  More formally, it was the difference between
    • My ships on or within d - 1 steps of the planet that could arrive there at turn t, and
    • Enemy ships on or within d steps of the planet that could arrive there at turn t
    Negative defense potentials indicated that the planet would be vulnerable to attack t turns from now and would needed to be supported. The distribution of ships to be sent would be then dictated by the distribution of defense potentials at turn t. For example, there would be little use in sending supporting ships from planet two steps away when the defense potential at d = 3 was negative. For each t then a minimum defense potential was calculated; it helped to determine quickly whether the planet was at risk without going into more details.
  • Support potential: was pretty similar to the defense potential, except instead of taking all my ships within d - 1 of the planet it took all ships within d of the planet that could arrive there at t.  Individual support potentials weren't very useful, however they were used to calculate:
    • Maximum support potential at t, which was the highest support potential at t for all d. If this number was positive for an enemy planet then the planet was potentially vulnerable and it was a good idea to start planning an attack.
    • Full potential at t, which was the support potential at t for all d = t, i.e. my total ships that could arrive at the planet at turn t minus all enemy ships that could get there at turn t, irrespective of the departure time. This value indicated whether it was possible to hold or conquer the planet in theory.
These values were calculated for my, enemy and neutral planets.  In case of neutral planets some additional adjustments had to be made that represented the fact that my or opponent's bot would have to overcome neutral ships as well as mine; usually the adjustment was made by just subtracting the neutral ships from the potentials to represent that my bot would have to fight the neutrals and then the enemy.  In special cases when there were my or enemy arrivals on the neutral planet the adjustments were more complicated.

Only ships not reserved for other purposes were used for calculating the potentials to represent the fact that a single ship cannot go to two different planets.  Additionally, to be consistent with the front reinforcement strategy described a few screens above, ships on the rear planets did not count towards the potentials for the enemy's planets.

The potentials were used to calculate the potential ships gained for each planet.  This was largely done by examining the fate of the planet at each turn, checking who owned the planet and what the full or minimum defense potentials were, and assigning a gain for each turn I'd likely own the planet and a loss for each turn when the opponent would hold it.

After adding implementing the potentials, the move picking algorithm started to look something like this:
  1. Update the game and planet timelines.
  2. Use the ship projections from 1. to calculate the potentials.
  3. For each planet and each arrival time t = 1, ..., horizon, check whether it's a good candidate for invasion by examining minimum defense potential and/or maximum support potential.  If it is, compose an invasion/defense plan and calculate the number of ships potentially gained by
    1. applying the action plan to the timeline
    2. recalculating the potentials for my planets
    3. calculating the in potential ships gained in this new scenario minus potential ships gained if this action is not taken
  4. Pick the invasion plan that returns the highest number of ships per ship spent, add it to the timeline.
  5. Repeat steps 1 - 4 until there are no more plans with positive returns left.
After squashing an incredible number of bugs due to a rather involved incremental potential calculation in step 3.a above, the performance improvement was dramatic.  However, it came at the cost of speed, as calculating the potentials proved to be very expensive.  At this point I added a simple time keeping mechanism to make sure that my bot stops picking moves when it's out of time.

6. More on Supporting My Planets

One of side effects of using potentials was that now the bot could detect and fix weaknesses in its defenses before the enemy could send ships.  Unfortunately that lead to the situations where the bot would send ships from planet A to support planet B, causing planet A to become dangerously weak, causing the bot to send ships from planet B to planet A, which would make B require support again, etc.  

This produced too many ships flying around, which meant that there weren't enough ships on the planets available for other important tasks.  Also it took up too much calculation time, leaving less time to find moves that were actually useful. 

To combat that I added constraints on the ways different planets could be supported.  If a move to support planet A  produced the highest return per ship but caused planet B to be at risk of takeover at some turn t, then it was deemed that it was for the greater good of the nation, and no planet was allowed to support B at t if it was within distance(A, B) of A.  These constraints were kept between the game turns to ensure that they were followed after the bot would receive the next game state from the engine (though, of course, the timing of constraint was updated to reflect that what was t turns in the future would now be t - 1 turns ahead).

To further reduce the number of supporting actions that were selected during the main move picking cycle, I disallowed making any small supporting moves far ahead in the future (e.g. sending more than five ships more than six turns in the future), as well as added an additional step in the beginning where "obvious" supporting moves that didn't weaken any planets were made without going through the whole move picking loop.

7. Adjustments to Return on Move

Previously I mentioned that the return on a move was calculated by taking the potential ships gained and dividing it by the total ships sent.  From the start I considered it to be a somewhat crude approximation, meant to be replaced by a better metric.  Internal Rate of Return was quickly discarded as there were departures and arrivals at various points in time, and IRR is not suited very well for that.  And there was no way of implementing Net Present Value without an arbitrarily chosen discount factor. So I kept the original return on move as it was for a while.  Until a couple problems emerged.


Consider the image above.  Blue can take over the neutral planet two turns from now; the invasion will require 51 ships.  Blue will definitely be able to hold the planet.  If the neutral planet produces four ships per turn, over the next 35 turns the return on invasion would be (35-2) * 4 / 51 = 2.59 ships per ship spent.  But that's not the smartest thing to do.  A better plan would be to send a fleet to arrive five turns from now, when the planet will belong to the red, after the red loses 50 ships to the neutral.  But that will require 55 ships (50 of red's remaining ships, plus 4 gained over the next turn, and an extra ship to take over the planet), and will will result in return of (-4 + (35 - 5) * 4) / 55 = 2.11 ships gained per ship spent.  That's a lower return even though the plan is better!

The difference here is that losing the ships to the neutral is different from losing the ships to the enemy: when fighting the enemy, both sides lose equal number of ships, but only one of the two players loses ships when fighting neutral.

Two solutions were tried: adding future enemy arrivals that will have to be fought to the number of ships spent, or subtracting the ships lost to neutrals from the ships gained.  The former choice resulted in more tactical problems; the latter worked, but produced a side effect that when the planets with high cost and low growth rate were never captured because they never produced enough returns over the horizon.  The simple answer to that was to count the returns all the way to the 200th turn, but that produced even more problems, and was left out of the final version of the bot.

The other problem was that similarly the bot considered that in the cases like the one illustrated below it was better to let the enemy capture the planet and then recapture it back at the same time as the fleet with 30 ships will be arriving there.



The solution there was to subtract the total number of my ships arriving onto my planet after it became enemy's from the total ships spent, representing that we'd be implicitly spending them.

8. Attack is the Best Defense

For much of the competition I made sure to keep at least as many ships on the planets as was necessary to hold it off against any incoming enemy ships.  A post by Astek showed that it was unnecessarily limiting; experiments on dhartmei's server proved that he was right, so I removed that.  I also added some adjustments to the potential calculations to reflect the risk of lower growth planets taking over higher growth planets.

Other than that, aside from the "easy" support moves that were made outside the main move picking loop for speedup purposes, defense was never an extra step and never took precedence over offense.  Whether or not to defend a planet against an incoming fleets was decided based on the return on saving that planet, and was compared to returns on attacking the enemy or taking neutrals.

This was the last version of the bot, and was submitted for the final competition.

8. What Was Missing

  • Unit tests.  I skipped largely because I figured that I would not have too much complicated code, and I'd be completely rewriting most of the bot pretty often, and having to maintain unit tests would only slow me down.  By the time I realized that it was a poor choice it was too late; in the limited time I had left I did add various internal consistency checks to the bot that helped me to catch a number of bugs, but unit tests would  have definitely let me spend more time on developing logic and instead of catching bugs in the functions that calculated potentials.
  • Better strategic decisions.  At the end, most of the losses were due to my bot invading planet A not realizing that this would allow the opponent to invade planet B without fear of losing it.  I spent about a week and a half trying to implement it, but in the end it was nowhere near ready to be submitted.
  • Determining where to move ships.  The algorithm for shuttling the ships from the back planets to the front had a number of deficiencies in the end, the main one being that the front planet where the ships were moved was not necessarily the planet where the ships would have been most useful.  I suspect that if I would have completed "better strategic decisions" part above, this part would have been straightforward as well.
On the whole, however, I am fairly satisfied with my bot.  While there were definitely deficiencies, it turned out to be much better than it could have.



5 comments:

  1. Thank you for the detailed writeup of how you built your bot and the things you considered. I like that the potentials approach worked for you. It's something I have considered (calling it Areas of Influence) and started on an implementation, but I did not get a working implementation up before the contest was over.

    On the last bit, 'determining where to move ships', you mention that simply sending ships to the nearest front lines may be sending them to the wrong planets. Perhaps a good approach would be to look at the defense potential of the enemy planets and sending support ships to the frontline that is nearest to the lowest enemy defense potential. That way, it becomes easier to invade.

    ReplyDelete
  2. Congrats! I was really glad to see a Canadian flag in the top 10!

    ReplyDelete
  3. Impressive!! Thanks for the time you spent on the explanations

    ReplyDelete
  4. I'm impressed that your straight-forward approach did so well. Another example that good results often require more perseverance than truly revolutionary work (this is meant as a complement!)

    ReplyDelete
  5. hi iouri, good to see you on the blogosphere. am utterly impressed by this. happy new year and see you again next year!

    ReplyDelete