Jacques Cloete and his team of DPhil students created this pirate-themed treasure hunt, which teaches the Exploration/Exploitation Dilemma in machine learning through interactive play. Here, Jacques explains the game and provides everything you need to use it yourself in your outreach and engagement activities.

This blog post was written by DPhil student Jacques Cloete. He and his team of fellow DPhil students were challenged to develop and deliver a new public engagement with research activity as part of their training in the AIMS and StatML Centre for Doctoral Training (CDT) programmes.

Ready to hunt for treasure and knowledge? Read more to find out how!


Reinforcement Learning as a Treasure Hunt is a pirate-themed educational children’s game that tasks the player with collecting as many coins as they can within a limited number of turns. The coins are buried on a desert island, and their locations are initially unknown to the player. However, the coins are buried according to a pattern related to the landmarks scattered around the island. By figuring out the pattern, the player can work out where the remaining coins are and use this to maximise their score.

A photo of five individuals stood in front of their game design on a large screen

This game seeks to teach children about the Exploration/Exploitation Dilemma, a key concept in reinforcement learning. The idea is that an agent (in this case, the treasure hunter) should try to explore the unknown environment to learn which actions will give it the highest expected reward (in this case, the number of coins it collects), but it must also take the opportunity to extract this reward; since the agent has a limited number of actions, it must strike a balance between these goals.
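
To make the trade-off concrete, here is a minimal sketch in Python of one standard way an agent can balance the two goals, known as epsilon-greedy. It is not the game's MATLAB code and does not reproduce the game's coin-burying pattern. The function name `play`, the list `dig_sites` (the chance of finding a coin at each site) and the parameter `epsilon` (how often the hunter digs at a random site rather than the best one found so far) are all assumptions made for illustration.

```python
import random


def play(dig_sites, turns, epsilon, seed=0):
    """Simulate an epsilon-greedy treasure hunter (illustrative only).

    dig_sites: hypothetical list giving the probability that one dig
               at each site uncovers a coin.
    turns:     the limited number of digs the hunter may make.
    epsilon:   probability of exploring (digging at a random site).
    Returns (total_coins, estimates, counts).
    """
    rng = random.Random(seed)
    counts = [0] * len(dig_sites)        # digs made at each site so far
    estimates = [0.0] * len(dig_sites)   # average coins per dig at each site
    total_coins = 0

    for _ in range(turns):
        if rng.random() < epsilon:
            # Explore: try a site at random to learn more about it.
            site = rng.randrange(len(dig_sites))
        else:
            # Exploit: dig where the average so far is highest
            # (ties go to the first such site).
            site = max(range(len(dig_sites)), key=lambda i: estimates[i])

        reward = 1 if rng.random() < dig_sites[site] else 0
        counts[site] += 1
        # Update the running average for the chosen site.
        estimates[site] += (reward - estimates[site]) / counts[site]
        total_coins += reward

    return total_coins, estimates, counts


if __name__ == "__main__":
    island = [0.0, 0.9]  # assumed values: site 0 is empty, site 1 is rich
    for eps in (0.0, 0.1, 1.0):
        coins, _, digs = play(island, turns=50, epsilon=eps)
        print(f"epsilon={eps}: coins={coins}, digs per site={digs}")
```

With `epsilon=0.0` the hunter never explores, so in this example it keeps digging at the first site and never discovers the richer one. With `epsilon=1.0` it only explores and never settles on the best site. Values in between trade a few exploratory digs for the chance to find and then exploit a better site.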

The game was originally designed to be played by children aged 5-13, but we found that it was enjoyed by players of all ages. It can be run as an interactive activity perfect for family-friendly events; we ran the game as an activity at the Oxford Maths Festival 2024 to great success. Since the placement of the coins and landmarks is randomly generated for each run, the game offers plenty of replayability, and we had children coming back many times to try and beat their high scores.

The game is written entirely in MATLAB and can be run by anyone with MATLAB installed on their computer. The game itself, along with instructions on how to play, credits, an example pre/post-game script and more, can be found in the game’s GitHub repository:

Click here to access the game and its supporting materials


Lead Game Designers: Jacques Cloete, Harry Mead

Software Developers: Jacques Cloete, Harry Mead

Maintainer: Jacques Cloete

Documentation Author: Jacques Cloete

Pre/Post-Game Script Writer: Darius Muglich

We thank Luisa Kurth, Shozen Dan, Paula Cordero Encinar, Marcel Hedman and Rafael Brutti for their suggestions that helped to shape the design of the game.