Gamasutra: The Art & Business of Making Games
Postmortem: Intelligence Engine Design Systems' City Conquest
February 6, 2013 | Page 2 of 4

What Went Right

1. AI-Assisted Design Process

Our AI-based approach to design paid off in spades. There is no question in our minds that it exceeded our expectations and improved both product quality and time-to-market. This was not high-minded academic fluff but a practical, boots-on-the-ground competitive advantage.

The full theory behind our design process is much too complex to do it justice within the scope of a postmortem article. We hope to explain it more fully in the future if time permits. But we can touch on the two main elements of our design approach that impacted City Conquest.

The first was our optimization-based approach to the core design. The defensive towers and unit types in City Conquest were not based on arbitrary creative decisions: nearly all of the decisions around their functional aspects were guided by an explicit decision modeling process.

We clearly specified the design goals and constraints for all of our towers and units and then constructed a decision model to optimize the features of all the units and defensive towers together as a whole.

We then used an evolutionary optimizer to select the set of nine towers and nine units that would best work together in concert to satisfy our design goals while remaining within the boundaries of our design constraints.
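This roster-selection step can be sketched with a simple evolutionary optimizer. To be clear, everything concrete below is invented for illustration -- the candidate pool, the "roles", the cost budget, and the fitness function -- since the actual decision model behind City Conquest is not described here; only the shape of the technique (evolve candidate rosters against explicit goals and constraints) follows the text.

```python
import random

# Hypothetical sketch of roster selection with an evolutionary optimizer.
# The candidate pool, "roles", cost budget, and fitness function are all
# invented stand-ins for the real (unpublished) decision model.

random.seed(0)

ROLES = ["anti-air", "anti-ground", "splash", "tank", "economy", "scout"]
CANDIDATES = [
    {"name": f"unit_{i}",
     "roles": random.sample(ROLES, k=random.randint(1, 3)),
     "cost": random.randint(1, 5)}
    for i in range(40)
]
ROSTER_SIZE = 9   # the article selects nine units (and, separately, nine towers)
BUDGET = 25       # stand-in for an explicit design constraint

def fitness(roster):
    covered = {role for unit in roster for role in unit["roles"]}
    cost = sum(unit["cost"] for unit in roster)
    penalty = max(0, cost - BUDGET)       # soft-constraint violation
    return len(covered) - 0.5 * penalty   # goal: cover roles within budget

def mutate(roster):
    child = roster[:]
    child[random.randrange(ROSTER_SIZE)] = random.choice(CANDIDATES)
    return child

def evolve(generations=200, pop_size=30):
    pop = [random.sample(CANDIDATES, ROSTER_SIZE) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]       # keep the better half
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return max(pop, key=fitness)

best_roster = evolve()
```

The key property is that the nine-unit set is optimized as a whole, rather than unit by unit, so complementary role coverage emerges from the search instead of from ad hoc creative decisions.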

This approach is broadly similar to the one described in the book Decision-Based Design, although our approach is much simpler and customized for game design decisions.

We firmly believe that this type of optimization-based approach can pay major dividends for game developers. It can help us provide more value to our customers by reducing the complexity of design problems, allowing us to make more optimal decisions more quickly, and in some cases, allowing us to solve problems that are otherwise unsolvable.

The second advantage was Evolver, which we discussed in an earlier interview. Evolver was an automated balancing tool based on coevolutionary genetic algorithms. Every night, it would run a huge number of simulated games between red and blue opponents, with each opponent evolving a "population" of scripts (each script being essentially a fixed build order of buildings within the game).

Evolver would generate random scripts, play them against each other, and assign them a fitness score based on which player won and by how much. It then applied standard evolutionary operators such as crossover and mutation to genetically optimize each population.
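The loop described above can be sketched as follows. A "script" is a fixed build order, as in the article, and the building names are borrowed from the text; the toy `simulate()` function, population sizes, and evolutionary parameters are all invented stand-ins for the real overnight game simulation.

```python
import random

# Minimal sketch of an Evolver-style coevolutionary loop: two populations of
# build-order scripts evolve against each other, with fitness assigned by
# margin of victory in simulated games.

random.seed(0)

BUILDINGS = ["rocket_launcher", "skyscraper", "gunship_pad", "barracks"]
SCRIPT_LEN = 12

def random_script():
    return [random.choice(BUILDINGS) for _ in range(SCRIPT_LEN)]

def simulate(red, blue):
    # Stand-in for one simulated game: positive margin => red wins.
    score = lambda s: s.count("skyscraper") + 2 * s.count("rocket_launcher")
    return score(red) - score(blue)

def crossover(a, b):
    cut = random.randrange(1, SCRIPT_LEN)
    return a[:cut] + b[cut:]

def mutate(script, rate=0.1):
    return [random.choice(BUILDINGS) if random.random() < rate else b
            for b in script]

def coevolve(generations=50, pop_size=20):
    red = [random_script() for _ in range(pop_size)]
    blue = [random_script() for _ in range(pop_size)]
    for _ in range(generations):
        # Each script's fitness is its total margin against the rival population.
        red_fit = [sum(simulate(r, b) for b in blue) for r in red]
        blue_fit = [sum(-simulate(r, b) for r in red) for b in blue]

        def next_gen(pop, fit):
            ranked = [s for _, s in sorted(zip(fit, pop), key=lambda p: -p[0])]
            elite = ranked[: pop_size // 2]
            return elite + [mutate(crossover(*random.sample(elite, 2)))
                            for _ in range(pop_size - len(elite))]

        red, blue = next_gen(red, red_fit), next_gen(blue, blue_fit)
    return red[0], blue[0]  # highest-ranked script in each population

best_red, best_blue = coevolve()
```

Because each population's fitness is measured against the other population rather than against a fixed benchmark, the two sides continually present each other with harder opponents -- which is what makes the top-ranked scripts interesting to watch the next morning.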

This meant that we could wake up every morning, open Evolver, determine which scripts the red and blue players ranked the most highly, and then plug those into the game and watch them play against each other. This instantly told us how our game balancing was working. Were players building too many Rocket Launchers? Not enough Skyscrapers? Were Commandos not useful enough, or were players consistently preferring Crusaders over Gunships?

We could then use this output to tune and refine a few units and buildings every day, tweaking their resource costs, health, speed, damage, rate of fire, and other parameters. It was very much like having an external outsourced testing team that would play the game overnight -- except that it was cheaper and more scalable than a human playtesting team, in addition to being capable of absolute objectivity.

We optimized Evolver by disabling the rendering, adding task-based parallelism, and hand-optimizing the gameplay logic, which allowed us to simulate roughly one million games in each 12-hour overnight run. We later upgraded the genetic algorithm to use an island model and hand-tuned the fitness function to achieve certain outcomes (such as helping the script populations learn to upgrade their Skyscrapers quickly enough to reach optimal long-term income levels).
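The island-model upgrade mentioned above can be sketched as several isolated populations that evolve independently and periodically exchange their best individuals. The island count, ring-migration scheme, and toy bit-vector fitness problem below are all assumptions for illustration; they are not Evolver's actual parameters.

```python
import random

# Sketch of an island-model genetic algorithm: isolated populations evolve
# independently, with periodic migration of each island's best individual
# to a neighboring island (ring topology).

def step(pop, fitness, mutate):
    # One elitist generation: keep the better half, refill with mutants.
    pop = sorted(pop, key=fitness, reverse=True)
    elite = pop[: len(pop) // 2]
    return elite + [mutate(random.choice(elite))
                    for _ in range(len(pop) - len(elite))]

def island_model(init, fitness, mutate, islands=4, pop_size=10,
                 generations=40, migrate_every=10):
    archipelago = [[init() for _ in range(pop_size)] for _ in range(islands)]
    for gen in range(1, generations + 1):
        archipelago = [step(pop, fitness, mutate) for pop in archipelago]
        if gen % migrate_every == 0:
            # Ring migration: each island's best replaces a neighbor's worst.
            bests = [max(pop, key=fitness) for pop in archipelago]
            for i, pop in enumerate(archipelago):
                pop.sort(key=fitness)
                pop[0] = bests[(i - 1) % islands]
    return max((ind for pop in archipelago for ind in pop), key=fitness)

# Toy usage: evolve a 16-bit vector toward all ones.
random.seed(0)
best = island_model(
    init=lambda: [random.randint(0, 1) for _ in range(16)],
    fitness=sum,
    mutate=lambda v: [b ^ 1 if random.random() < 0.1 else b for b in v],
)
```

The usual motivation for the island structure is diversity: separate populations explore different regions of the search space, and occasional migration spreads good building blocks without letting one early winner take over everything.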

This might seem like a lot of work. It wasn't: the work to create, tune, and optimize Evolver was roughly two weeks' worth of development time in total. Considering all the valuable feedback that Evolver gave us, the fact that it gave us better results than we could have achieved by hand, and the fact that doing this initial hand-tuning would have taken far longer than the two weeks we spent on Evolver, we consider this an obvious net win.

It also left us with a system that we could quickly run to test any proposed change to the gameplay parameters and see its ramifications. In one case, we considered reducing the Roller's cost from 2 crystals to 1 crystal (scaling the unit's stats down accordingly); Evolver immediately surfaced the problems this would cause, and we abandoned the idea before it ever caused gameplay problems.

We also benefited enormously from having a large number of playtesters on our TestFlight team for the eight months leading up to release, giving us the invaluable aesthetic, usability, and other subjective feedback that Evolver could not. We eventually invited all of our Kickstarter backers to join us as playtesters.

As a result of all of these factors, the game was fun almost from day one. Every design concept worked.

2. The FBI: Fix Bugs Immediately

I've worked on several projects with 5K+ bug counts in the past. Once the team moves on to the final "bug fixing" phase, there are inevitably dozens of awful discoveries: "Wow, if we'd known that bug was there, we could have saved so much time!" "That bug caused ten other bugs!" "If we'd gotten that bug fixed six months ago, we would have totally changed the way we designed our levels!"

On one memorable project, a producer told his team: "Don't fix bugs -- we don't have time right now! Save them all up until the end!" The project failed catastrophically months later, destroying the studio and nearly pulling the entire franchise down along with it, in no small part due to the overwhelming bug count at launch.

That should never happen. Letting any software defects linger is a risk to the schedule and to the integrity of the code base.

Our approach is to fix bugs immediately. We don't work on any new features or continue any other development until all known bugs are fixed. Design flaws and performance problems count as "bugs." Playtesters' suggestions also usually count as "bugs," especially if more than one playtester reports the same problem or suggests the same change.

Our running bug count remained under 10 at all times throughout development.

Now that we've done it, we can't imagine developing any other way. What's the point of postponing fixes just for the sake of the "schedule"? It's essentially a form of accounting fraud: you're pushing off a problem onto one side of the ledger to pretend that the other side of the ledger is doing fine. You're introducing hidden costs by sacrificing product integrity.

Our playtesters frequently mentioned the relative lack of bugs, and this codebase integrity ensured our testers would be focused on gameplay rather than technology issues.

In our experience, the practice of pushing bugs into a database and waiting to fix them until a later date is a major cause of product delays, developer stress, work-life imbalance, and studio failures. It's a primitive and barbaric practice and it needs to end.


Paul Tozour
I just wanted to note that Gamasutra's tagline on this article, "how automated testing saves time, and how Kickstarter wastes it," is Gamasutra's characterization, not my own, and is not the way I would have characterized this article. I'm working to get this fixed, but in the meantime, Evolver is not exactly an automated testing system, and Kickstarter isn't a waste of time.

Paul Tozour
EDIT: The Gamasutra editors have tweaked the tagline to avoid confusion. Thanks, Gamasutra!

Don Hogan
Excellent write-up, Paul. It's always good to hear your take on game development, there's never a shortage of food for thought. Glad to hear the project went well!

Michael DeFazio
Fantastic article--

Wish the Kickstarter video could have been attached -- it was also awesome. (Seeing how you created algorithms to find optimal strategies and had them play against each other was fabulous.)

Love your philosophy about games (problem spaces) and AI... And the number of times you mentioned "decisions" in the article put a smile on my face (I'm sorta a gameplay-first kinda guy, and great games to me always find a way of presenting "interesting decisions").

You completely sold me on this game (one android copy Sold!)... and I will continue to watch for other revelations/advancements you and your company find about making compelling (and balanced) gameplay in the future.


Paul Tozour
Thanks, Michael! For anyone who's interested, the Kickstarter video is here:

I'm also going to be giving a talk on Evolver and some other aspects of my approach at the GDC AI Summit in March (along with Damian Isla and Christian Baekkelund, who will be discussing the role of AI in the design of Moonshot Games' terrific new game Third Eye Crime).

GameViewPoint Developer
I think the AI approach to game testing is definitely interesting, but it would be a lot of work for a small indie team to implement; perhaps if there were 3rd-party tools available it would be a usable solution.

Paul Tozour
To be clear, the point of Evolver was not as a game testing system. It was designed to help explore the design space and guide the game balancing. It was fundamentally a design tool, NOT a testing tool.

It did help find some bugs, of course, but that was just a nice side-effect. The real point was to help us optimize the balancing between all the different units and towers in the game.

The difficulty of implementing something like this really depends on the game. It's not a magic bullet and it's not an approach that will work for every game. And it's the type of system that has to be carefully designed for the particular game in question, so it's not the kind of thing where you can really create a tool that will work for any game.

In the case of City Conquest, it cost us about 2 weeks' worth of coding and other work, and easily saved us a good 3-4 weeks' worth of design time -- while also giving us better results than we would likely have been able to get by hand, AND giving us a system with instant visibility into the ramifications of any given design decision. So it was clearly a net win just in terms of the time savings alone, even before you consider all of the other benefits.

In any event, as I mentioned in one of the previous comments, I'm going to be speaking about this in more detail at the GDC AI Summit in March. So be sure to stop by if you'll be at GDC.

Denis Timofeev
Hi! That's a great article, thanks -- a lot of inspiration. I'm just wondering how many users you had on TestFlight, how you got them, and how helpful their feedback was.

Paul Tozour
We had around 50 or so at first, but once we opened the testing up to *all* the Kickstarter backers, we ultimately hit Apple's developer restriction on the maximum number of devices per app (100). The actual number of testers was below 100, since some of them had multiple devices (iPhone + iPad) and so took up multiple device ID slots.

We started with personal friends and industry contacts, then added backers at the appropriate backing level in Kickstarter, and then, a few months later, opened it up to absolutely all the backers who had iOS devices.

Their feedback was extremely helpful overall. We got a very wide range of feedback from a lot of people at a lot of different skill levels. We had a few industry veterans in there (see the game's Credits for the full list), who provided terrific and detailed feedback, along with a few non-gamers. We also did some diagnostics, such as adding buttons so the testers could send back valuable data on mission completion and achievements earned. It was nice to see that about 90% of the testers who responded were able to finish the single-player campaign, and every mission was completed by at least one tester at every available difficulty level. They also got me invaluable feedback on devices I didn't own or couldn't get my hands on at the time, such as the iPhone 5 and the "new" iPad (aka iPad 3).

Louis Gascoigne
Great article Paul, worth the read just for the tech section.

Bram Stolk
Impressive stuff, and great article.
Amazing that you could get Computer Aided Game Balancing working in just 2 weeks.
Writing a system like Evolver sounds like more fun than tediously going through manual iterations of game balance.

Jeremy Tate
@Paul How did you approach actual code testing? It seems like it would be crucial under this model to have technical integrated testers on the team if you were going to maintain a running total of 10 bugs. Otherwise, you are kind of pushing those undiscovered bugs to later in the project.

Paul Tozour
I definitely agree in principle with having as many testers as possible doing testing from day one. This is a great idea whenever you can afford it, and it worked very nicely for Retro Studios when I worked with them on Metroid Prime 2 and 3 -- the internal elite ninja testing team was super effective.

I think we were able to get away with not having dedicated testers on the City Conquest team thanks to a combination of factors: the relatively limited scope of the project, the involvement of our external playtesting team (friends and Kickstarter backers on TestFlight), our defensive programming practices, and our habit of playing through the entire game on a regular basis. The Evolver tool also found a few of the most difficult and subtle bugs on its own, simply because we were running a million simulations overnight and could pretty quickly find anything in the game logic that caused a crash or a hang.

But again, I do agree with you that having dedicated testers throughout the process is the better way to go if at all possible. The earlier you can find bugs, the better.

Paul Tozour
And when I say "10 bugs," of course I mean 10 *known* bugs. Naturally there's no way to count bugs that you don't know about.

But, yeah -- the earlier you find them, the earlier you can fix them.

Jeremy Tate
In addition, a game by its very nature is almost entirely focused on usability rather than functionality, so if something has to give, it has to be the latter. It's not like a piece of medical software where lives depend on the functionality. By focusing on playtesting, that's the choice you made.

I'm thinking that a person who would handle the upkeep of the Evolver scripts -- managing the data, disseminating it, and making sure issues are tracked and acted on -- would be a logical next step, though. You could easily scale the concept out and get even more detail.

Paul Tozour
Absolutely. Evolver is really only scratching the surface of what's possible.

There's also been some very interesting academic work related to this, as well as interesting procedural content generation work, being done by folks like Alexander Jaffe, Adam Smith, and Gillian Smith ... and several others whose names escape me at the moment.

James Yee
Good to see your game came out Paul.

As a writer in the Kickstarter community, I recall seeing your project at the time and not being overly impressed by it (hence the reset you did). Do you think the Kickstarter would have gone better if it hadn't been up to YOU to create it? Basically, would it have been more cost-effective for you to keep working on the game while letting someone else do the PR/campaign management of the Kickstarter?

Also for future Kickstarter creators how do you avoid the Apple Promo code problem you ran into? Just make sure your full game is a separate app if you plan on a free version?

Paul Tozour
Hi James -- I'd love to get your thoughts on the Kickstarter and what could have been done better there. Please feel free to send us a direct message via Twitter or e-mail us directly; I always appreciate honest, direct feedback.

Yes, the Kickstarter probably would have been more successful if I'd hired separate PR for it, but I'm not sure it would have been cost-effective, or that it would ever have brought in more funding than the cost of hiring the PR firm in the first place.

I didn't bring on a PR firm until the game was ready to launch on iOS; I did consider doing it for the Kickstarter campaign but it seemed like overkill.

As for avoiding the Apple Promo code problem: I don't have a good way to recommend that developers make their app available to backers for free outside of the limited promo codes Apple gives you.

Other developers I've spoken with have recommended briefly making your IAPs free for a very short window and telling your audience exactly when to download them, or putting hidden features / secret codes in your app (players tap a certain pattern of invisible hotspots). But the latter approach especially is risky: it could incur the wrath of Apple (and potentially get you banned from the App Store) or become open knowledge (allowing people to pirate your game easily once they know the secret code).

Damian Connolly
Thanks for the excellent post-mortem, and congrats on the game!

Can you elaborate a bit on how you used discovery driven planning with your game? Did it change the game design, and if so, to what extent? Did you use it mainly in the prototyping stage, or throughout the entire process?

Also, your Evolver tech sounds pretty amazing. Is it specific to the game, or do you see yourself eventually spinning it off as middleware? It seems like an ideal candidate, especially for smaller studios.

Paul Tozour
Hi Damian -- Thanks for the kind words!

Although I'd love to build middleware someday to help with this aspect of game design, I don't think it will be a case of extending Evolver to do that. The genetic algorithm component of Evolver isn't really anything unique or protectable, and the aspects relating to its integration with City Conquest are too game-specific to put into middleware.

Regarding Discovery-Driven Planning: As luck would have it, I'm working on an article (or two) on DDP for Gamasutra right now. I hope to have them up by the end of the month or maybe early in March.

DDP is really a project planning methodology, so you want to use it throughout the project to make sure you have a good handle on the risks of the project, and ensure that you're prioritizing your efforts to learn as much as you can to reduce uncertainty as cheaply and as quickly as you can.

So on City Conquest, it drove every milestone. It didn't really change the game design directly, but it did ensure that we worked on the riskiest aspects of the project first, reducing our risks and ensuring the smoothest possible path to a completed, profitable game.

Paul Tozour
FYI, Gamasutra has now published my article on applying discovery-driven planning to games:

Josh D
Hi Paul,

Thanks so much for writing this postmortem. I was hoping you could elaborate a little bit on your monetization decision. You said that in retrospect, you didn't think making a free app with one large IAP was the best choice. I'm wondering why that is and what you think would be more effective? I'm currently involved with an app considering a similar pricing structure, so I'm very curious as to your thoughts and experiences with this. Any feedback would be greatly appreciated, thanks again!

Paul Tozour
Hi Josh,

What it comes down to is that to really maximize your revenues, you need to be able to do proper price discrimination. By that, I mean you need to be able to serve customers across the whole curve of willingness to pay -- the 5% of "whales" who will pay $20+ per month in your app without hesitation, the 10% of "dolphins" who might pay $5 per month, and the "minnows" who will pay maybe $1 a month on average ... in addition to all the non-paying users who just won't pay anything, who will likely number 3-10x your paying users.

So, I've come around to the realization that the full-unlock-through-a-single-IAP model is suboptimal because it can only do price discrimination at a single point: the decision of whether or not to pay for that one IAP. So you are getting no revenue at all from the many users who consider your price point too high, and you're getting a lot less revenue than you could from the minority of very wealthy "whales" who would be willing to pay much more to enhance their game experience.
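A toy revenue calculation makes this point concrete. The segment sizes and willingness-to-pay figures below are purely illustrative, loosely following the whale/dolphin/minnow proportions mentioned above; they are not City Conquest's actual numbers.

```python
# Toy comparison: one full-unlock IAP vs. a menu of IAPs that lets each
# segment spend up to its willingness to pay. All figures are illustrative.

segments = {            # (users, monthly willingness to pay in USD)
    "whales":   (50, 20.0),
    "dolphins": (100, 5.0),
    "minnows":  (850, 1.0),
}

# Model A: a single full-unlock IAP at $5. Only segments willing to pay at
# least $5 convert, and each pays exactly $5 no matter how much more they
# would have been willing to spend.
single_price = 5.0
revenue_single = sum(users * single_price
                     for users, wtp in segments.values() if wtp >= single_price)

# Model B: a menu of IAPs captures each segment at its willingness to pay.
revenue_tiered = sum(users * wtp for users, wtp in segments.values())

print(revenue_single)  # 750.0  -- only whales and dolphins convert
print(revenue_tiered)  # 2350.0 -- every segment contributes
```

In this toy model the single-IAP app leaves both the minnows' small payments and the whales' extra spending on the table, which is exactly the "single point of price discrimination" problem described above.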

Paul Tozour
I'd also recommend checking out GamesBrief -- they have some interesting papers on this.