Cambridge AI tool could cure headache for games developers
Cambridge startup PROWLER.io has crafted a specialised AI tool that solves a crucial problem for game developers and could open up industry-specific and general software markets estimated to be worth around $150 billion in the next five years.
Using Minecraft as a reference game, PROWLER.io developers devised a foolproof automatic tester for games. The technology zips through games, recording problem areas in the content and feeding them automatically back to the developers. This accelerates content upgrades and debugging and frees up human testers to get on with their core strength – designing great gameplay.
The technology also cuts critical costs for games developers, of whom there is an abundance in Cambridge, let alone globally.
PROWLER.io says this breakthrough is just the start for its games testing platform. It plans to keep expanding its gaming toolbox with core technologies for finding hard-to-track bugs, load-testing servers in MMORPGs (massively multiplayer online role-playing games) and making user experiences ever more personalised – and fun.
CEO Vishal Chatrath told Business Weekly: “This is indeed groundbreaking. We have given the example of testing games here; however this technology can be applied to any form of software testing which will be worth over $50 billion by 2022. It is also estimated that the global test automation market will reach $54bn-$98bn by 2022.
“We expect not only a dramatic fall in the cost of testing (over 50 per cent) but also huge benefits in the quality of software. The goal here is to augment human testers, not to replace them.
“As software is getting exponentially more complex, it is impossible for totally human-led testing to keep up. In terms of industry impact it equates to the move from human calculators to electronic calculators.”
PROWLER.io’s Matthew Bedder (machine learning researcher), David Beattie (senior software engineer) and Haitham Bou Ammar (reinforcement learning team leader) take you under the hood of the breakthrough in a blog on the company’s website.
They write: “Our specialised AI tool solves a crucial problem for game developers: how can they ensure that levels are playable and completable after a design change?
“Like Tuneable AI, it's one way our agents can autonomously help developers save time and resources by tackling the core issues in gaming: Is the user experience everything the designers intended? How can user-difficulty be personalised? How can one be sure the whole thing just works?
“Game development's mix of technical and non-technical people – designers, programmers, artists, sound engineers – can make it hard to identify when someone makes a change that ‘breaks’ a game. Developers commonly try to solve this by using either hard-coded scripts or human testers to run through their games looking for issues. Neither approach is ideal.
“Simple scripts can break as a result of even the smallest change and more complex ones are costly to implement and maintain. User testing certainly can bring to light both technical and experiential issues, but at even greater expense.
“It’s not feasible to expect users to test after every change so developers must often do the job themselves – further wasting valuable development time. When added to existing continuous integration tooling, our agents allow developers to identify the impacts of changes made to a game, replacing existing test scripts and supplementing user testing.
“To demonstrate this approach, we’ve created a highly simplified game development scenario. We start by hooking up our testing platform to Minecraft’s open source Project Malmo interface, which has a realistic, complex codebase and faces common testing challenges.
“Once that’s done, we start training an agent to run through a level and reach a goal within the game. For simplicity’s sake, we’ve set up a very simple navigation task that consists of one level containing a few rooms with doors between them.
“The agent needs to find its way to a block that represents the goal in one of the rooms. When it starts training over the level, things can look a bit chaotic. But our agents are smart and it doesn’t take them long to learn to achieve mission goals with high accuracy: the agent soon figures out an optimal route. Once it becomes proficient, we can report statistics on what it did and how long it took, performance metrics for the game engine and so on. If we’re happy with the numbers, we have a baseline for comparison.
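For readers who want a feel for the train-then-baseline loop described above, here is a minimal sketch in Python. It is not PROWLER.io's code and does not touch Project Malmo: the level layout, the reward values and the function names (`train`, `run_trained_agent`) are all assumptions made for illustration, and plain tabular Q-learning stands in for whatever learning method the company actually uses.

```python
import random

# Hypothetical level: '#' wall, '.' floor, 'S' start, 'G' goal block.
# Two rooms joined by a single doorway in the fourth row.
LEVEL = [
    "#######",
    "#S.#..#",
    "#..#..#",
    "#.....#",
    "#..#.G#",
    "#######",
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def find(symbol):
    """Return the (row, col) of the first cell holding `symbol`."""
    for r, row in enumerate(LEVEL):
        if symbol in row:
            return (r, row.index(symbol))
    raise ValueError(symbol)


def step(state, action):
    """Move one cell; walking into a wall leaves the agent in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    nxt = state if LEVEL[r][c] == "#" else (r, c)
    done = LEVEL[nxt[0]][nxt[1]] == "G"
    return nxt, (10.0 if done else -1.0), done


def train(episodes=2000, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {}  # (state, action index) -> estimated return
    start = find("S")
    for _ in range(episodes):
        state = start
        for _ in range(200):  # cap the length of each training episode
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
            nxt, reward, done = step(state, ACTIONS[a])
            best_next = 0.0 if done else max(
                q.get((nxt, i), 0.0) for i in range(len(ACTIONS))
            )
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q


def run_trained_agent(q, max_steps=50):
    """Follow the greedy policy once and report what happened."""
    state = find("S")
    path = [state]
    for _ in range(max_steps):
        a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            return {"reached_goal": True, "steps": len(path) - 1, "path": path}
    return {"reached_goal": False, "steps": max_steps, "path": path}


# The report from the trained agent becomes the baseline for later comparisons.
baseline = run_trained_agent(train())
print(baseline["reached_goal"], baseline["steps"])
```

A real setup would add engine metrics such as frame rate and CPU load to the report; the sketch records only whether the goal was reached, the step count and the route taken.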
Time to try and break things
“Our game artists come in and replace all the temporary level textures with new artwork. We rerun the previously trained agent to confirm that it’s still able to solve the mission.
“When we compare the generated report to our baseline, we can see that the changes don’t have a negative impact on our metrics. We can accept the changes and adopt them as a new performance baseline.
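The comparison against a baseline could look something like the following Python sketch. The metric names, the sample numbers and the tolerance values are hypothetical; the blog does not say which metrics or thresholds PROWLER.io's reports use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunReport:
    """Metrics from one agent run; the field names are illustrative."""
    reached_goal: bool
    steps_to_goal: int
    mean_fps: float
    cpu_percent: float


# Hypothetical relative tolerances, as a fraction of the baseline value.
TOLERANCE = {"steps_to_goal": 0.10, "mean_fps": 0.05, "cpu_percent": 0.10}
HIGHER_IS_BETTER = {"mean_fps"}


def regressions(baseline, current):
    """Return readable findings; an empty list means nothing got worse.

    Assumes every baseline metric is non-zero.
    """
    if not current.reached_goal:
        return ["agent no longer reaches the goal"]
    findings = []
    for name, tol in TOLERANCE.items():
        old, new = getattr(baseline, name), getattr(current, name)
        change = (new - old) / old
        worse = -change if name in HIGHER_IS_BETTER else change
        if worse > tol:
            findings.append(f"{name}: {old} -> {new} ({change:+.1%})")
    return findings


# Made-up numbers for a run before and after a texture swap.
baseline = RunReport(True, 42, 60.0, 35.0)
retextured = RunReport(True, 42, 59.5, 35.5)

issues = regressions(baseline, retextured)
if not issues:
    baseline = retextured  # accept the change as the new baseline
print(issues)
```

Using relative rather than absolute tolerances keeps one table of thresholds usable across metrics with very different scales, at the cost of needing a non-zero baseline for each.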
“If we follow the same process after a more significant change, such as moving some of the elements of the level around, we can identify and report on this as well. Here we’ve made a relatively innocuous change: replacing the easily avoided lava pit with part of a wall and keeping everything else the same. We just re-run the agent we’d previously trained, and the automatically generated report highlights areas that might be of concern.
“There’s a small impact on performance – it turns out that the new wall adds more light sources, slightly increasing our CPU usage and locally decreasing the frame rate. We can then use this data to make an informed choice about whether this change was worth the cost.
“Finally, even larger changes that force the agent to completely alter its trajectory are detected and reported on – alerting us to potential issues. By adding levers to the doors, the designers have created a much more challenging problem that can effectively force the agent to take much longer to find a solution.
“The agent eventually manages to reach the goal but reports what could be a serious issue to the testing or design team. And so it goes.
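One simple way to flag a run in which the agent had to change its route, sketched here in Python under stated assumptions: routes are lists of grid cells, the overlap measure is the Jaccard similarity of visited cells, and the thresholds are invented for illustration. The blog does not describe how PROWLER.io measures trajectory change.

```python
def trajectory_change(baseline_path, new_path):
    """Summarise how far a rerun strays from the baseline route.

    Both paths are lists of (row, col) cells with at least two entries.
    """
    old_cells, new_cells = set(baseline_path), set(new_path)
    overlap = len(old_cells & new_cells) / len(old_cells | new_cells)
    slowdown = (len(new_path) - 1) / (len(baseline_path) - 1)
    return {"overlap": overlap, "slowdown": slowdown}


def severity(change, min_overlap=0.5, max_slowdown=2.0):
    """Classify a change with hypothetical thresholds."""
    if change["slowdown"] > max_slowdown or change["overlap"] < min_overlap:
        return "serious"
    if change["overlap"] < 1.0 or change["slowdown"] != 1.0:
        return "minor"
    return "none"


# Made-up routes: the rerun detours to pull a lever before a door opens.
direct = [(0, 0), (0, 1), (0, 2), (0, 3)]
with_lever = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 0), (1, 0),
              (0, 0), (0, 1), (0, 2), (0, 3)]

change = trajectory_change(direct, with_lever)
print(change, severity(change))
```

Here the agent still reaches the goal, but it needs three times as many steps, which is the kind of result a testing or design team would want to see flagged.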
“The agent, and thousands like it, can run through levels at great speed detecting and reporting on the impacts of changes big and small, freeing up human testers for more interesting tasks and letting designers and developers get on with what they do best – making great games.”
• Pictured above: PROWLER.io CEO, Vishal Chatrath