If the constant talk about AI these days only leaves you confused, then this visually satisfying video may be just the balm you need. Using a neural network and reinforcement learning, YouTuber Yosh set out on a three-year journey to train an AI to surpass his own 17 years of Trackmania experience.
The premise is a simple one: train an AI to improve at the game and, as Yosh himself puts it, “the more it trains the better it gets”. This isn’t Yosh’s first rodeo either: he’s made previous videos experimenting with the tech and trying to create a Trackmania AI capable of beating him. His YouTube channel has accumulated over 18 million views worldwide and sits at just under 100,000 subscribers.
The neural network is described in the video as a “mathematical tool which roughly models how a brain works”. It takes in parameter data such as turning rate and speed, and in response tells the car what to do. The more it plays, the more data is gathered to optimise performance. Any action the AI took that had been predetermined as beneficial earned it a reward. This reinforcement learning pushed its decision making towards faster times and more efficient choices.
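For readers who like to see the idea in code, here is a minimal Python sketch of a network that turns a couple of readings into a driving decision. It is not Yosh's implementation: the action names, the layer sizes, the random weights, and the helper names `make_network`, `forward` and `choose_action` are all assumptions made for illustration, and no training step is shown.

```python
import math
import random

# Hypothetical action set; the video does not list the exact outputs.
ACTIONS = ["left", "straight", "right"]


def make_network(n_inputs, n_hidden, n_outputs, seed=0):
    """Create two layers of random weights (an untrained network)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_outputs)]
    return w1, w2


def forward(network, observation):
    """Return one score per action for the given observation."""
    w1, w2 = network
    # Hidden layer: weighted sum of the inputs squashed by tanh.
    hidden = [math.tanh(sum(w * x for w, x in zip(row, observation))) for row in w1]
    # Output layer: weighted sum of the hidden values, one per action.
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def choose_action(network, observation):
    """Pick the action with the highest score."""
    scores = forward(network, observation)
    return ACTIONS[scores.index(max(scores))]


# Observation is assumed to be [speed, turning rate], both scaled to roughly -1..1.
network = make_network(n_inputs=2, n_hidden=4, n_outputs=len(ACTIONS))
action = choose_action(network, [0.8, -0.1])
```

In a real setup, training would adjust the weights so that actions leading to rewards score higher over time; here they stay random, so the chosen action is arbitrary.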
The venerable Trackmania is almost the perfect focus for this kind of approach: simple and clear rules on tracks and movements, combined with a trial-and-error style of play that is itself visualised by replays which can be layered atop one another. The shots of hundreds of cars attempting, failing, and learning to progress make the whole learning process easy to understand. It is also extremely satisfying to watch.
Yosh starts the AI on a simple track and, as it develops, introduces more complex ones along with the option to brake, which was initially left out. Braking was added to try to encourage drifting and therefore quicker times. To do this, any kind of drifting was initially rewarded, which was a mistake: the AI managed to outsmart its creator and found a way to drift constantly, resulting in plentiful positive feedback for the model but a low top speed. This unintended behaviour was fixed with a simple speed requirement, so the AI would only be rewarded for drifting over a certain speed.
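The difference between the two reward rules can be sketched in a few lines of Python. This is an illustration only: the function names, the bonus value and the `min_speed` threshold of 100 are assumptions, since the video's actual numbers are not given here.

```python
def naive_drift_reward(is_drifting, bonus=1.0):
    """First attempt: reward any drift, regardless of speed."""
    return bonus if is_drifting else 0.0


def drift_reward(speed, is_drifting, min_speed=100.0, bonus=1.0):
    """Fixed version: reward a drift only above a speed threshold.

    min_speed is a hypothetical value chosen for this example.
    """
    if is_drifting and speed > min_speed:
        return bonus
    return 0.0


# A slow, constant drift pays out under the naive rule but not the fixed one.
slow_naive = naive_drift_reward(is_drifting=True)
slow_fixed = drift_reward(speed=50.0, is_drifting=True)
fast_fixed = drift_reward(speed=150.0, is_drifting=True)
```

With the threshold in place, crawling around the track sideways no longer earns anything, so the only way to collect the drift bonus is to keep the speed up.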
The AI’s progress is significant throughout the video and I quickly became invested in how far it could be pushed. If you want to find out whether it was able to become truly unbeatable, then join the millions of us who have watched it and see for yourself. And if you just want to see the man versus machine showdown, here’s the timestamp.