TicTacToe RL

April 20, 2020 · 2 min read

Machine Learning Engineer & Data Scientist

This project is an implementation of the classic Tic Tac Toe game, powered by Reinforcement Learning (RL). It demonstrates how an RL agent can learn to play Tic Tac Toe optimally through self-play and Temporal Difference (TD) learning.

Project Overview

Purpose: Train an RL agent to play Tic Tac Toe using self-play and TD(0) learning, and provide both a command-line and graphical interface for users to play against the trained agent.
Key Features:
- RL agent learns state values for all possible board configurations (3^9 = 19,683 ways to fill the nine cells, a count that includes positions never reached in real play)
- Epsilon-greedy policy for balancing exploration and exploitation during training
- Pygame-based graphical UI for interactive play
- Command-line interface for quick testing
- Well-documented code and modular structure
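
The state count in the feature list matches 3^9 = 19,683, the number of ways to mark nine cells as empty, X, or O. As a minimal sketch of how a board could be mapped to an index in a value table, assuming a base-3 encoding: the names `EMPTY`, `X`, `O`, `encode_board`, and `decode_board` are illustrative and are not taken from tic_tac_toe.py, whose own encoding may differ.

```python
# Hypothetical cell codes; the project's own representation may differ.
EMPTY, X, O = 0, 1, 2

NUM_STATES = 3 ** 9  # 19683 possible cell assignments


def encode_board(cells):
    """Map a 9-cell board (row-major order) to an integer in [0, 3**9)."""
    if len(cells) != 9:
        raise ValueError("expected exactly 9 cells")
    index = 0
    for cell in cells:
        index = index * 3 + cell  # treat the board as a 9-digit base-3 number
    return index


def decode_board(index):
    """Inverse of encode_board: recover the 9 cells from the integer index."""
    cells = []
    for _ in range(9):
        index, cell = divmod(index, 3)  # peel off the least significant digit
        cells.append(cell)
    return cells[::-1]
```

With this kind of encoding, a flat list of `NUM_STATES` floats is enough to hold one value per configuration.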

How It Works

The RL agent is trained using a simple TD(0) update rule:

v(s) ← v(s) + α (v(s') - v(s))

v(s): Value of the current state
v(s'): Value of the next state
α: Learning rate
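
A minimal sketch of this update in Python, assuming state values are kept in a dictionary keyed by a hashable state. The function name `td0_update`, the default value of 0.5 for unseen states, and the default `alpha` are assumptions for illustration, not details taken from training_self_play.py.

```python
def td0_update(values, state, next_state, alpha=0.1, default=0.5):
    """Apply v(s) <- v(s) + alpha * (v(s') - v(s)) in place; return the new v(s)."""
    v_s = values.get(state, default)          # current estimate for s
    v_next = values.get(next_state, default)  # current estimate for s'
    values[state] = v_s + alpha * (v_next - v_s)
    return values[state]


# Usage: move the value of "s" halfway toward a winning successor state.
values = {"s": 0.5, "win": 1.0}
td0_update(values, "s", "win", alpha=0.5)  # values["s"] is now 0.75
```

The update moves v(s) a fraction α of the way toward v(s'), so repeated play gradually propagates the known values of finished games back to earlier positions.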

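During training, the agent plays games against itself, updating state values based on the outcome and gradually improving its strategy. The agent follows an epsilon-greedy policy: most of the time it picks the move leading to the highest-valued state, but with probability epsilon it plays a random move, so it keeps visiting positions that purely greedy play would never reach.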
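
The training loop described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in training_self_play.py: it keeps one value table per player, stores the board after each of that player's moves, scores a win as 1, a draw as 0.5, and a loss as 0, and applies the TD(0) rule backward over each player's states once the game ends. All function names and default parameters are hypothetical.

```python
import random

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if a line is complete, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def is_terminal(board):
    return winner(board) is not None or " " not in board


def state_value(table, board, player):
    """Value of `board` for `player`; finished games have fixed values."""
    if is_terminal(board):
        w = winner(board)
        return 0.5 if w is None else (1.0 if w == player else 0.0)
    return table.setdefault(board, 0.5)  # unseen states start at 0.5


def choose_move(table, board, player, epsilon, rng):
    """Epsilon-greedy: random move with probability epsilon, else best value."""
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if rng.random() < epsilon:
        return rng.choice(moves)
    return max(moves, key=lambda i: state_value(
        table, board[:i] + (player,) + board[i + 1:], player))


def play_episode(values, alpha, epsilon, rng):
    """Play one self-play game and back up values for both players."""
    board, player = (" ",) * 9, "X"
    history = {"X": [], "O": []}  # boards seen right after each player's move
    while not is_terminal(board):
        move = choose_move(values[player], board, player, epsilon, rng)
        board = board[:move] + (player,) + board[move + 1:]
        history[player].append(board)
        player = "O" if player == "X" else "X"
    for p in ("X", "O"):
        target = state_value(values[p], board, p)  # outcome seen by player p
        for state in reversed(history[p]):
            if is_terminal(state):
                continue  # terminal values stay fixed
            v = values[p].setdefault(state, 0.5)
            values[p][state] = v + alpha * (target - v)  # TD(0) update
            target = values[p][state]
    return winner(board)


def train(episodes=20000, alpha=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    values = {"X": {}, "O": {}}
    for _ in range(episodes):
        play_episode(values, alpha, epsilon, rng)
    return values
```

Calling `train()` returns the two value tables; setting `epsilon=0` in `choose_move` afterwards gives purely greedy play from the learned values. This sketch updates after exploratory moves as well, which is a simplification that some formulations avoid.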

Directory Structure

game_app.py: Pygame-based UI for playing against the RL agent
test_game.py: Command-line interface for testing
training_self_play.py: RL training logic
tic_tac_toe.py: Core game logic and state management
requirements: Python dependencies

Getting Started

Install Dependencies:
- Python 3.5+
- Install required packages:
```
pip install -r requirements
```
Train the RL Agent:
- Run training_self_play.py to train the agent (optional, pre-trained values included)
Play the Game:
- Graphical UI:
```
python game_app.py
```
- Command-line:
```
python test_game.py
```

Gameplay

The RL agent can play as either X or O.
In the Pygame UI, the agent and user take turns; click on a square to make your move.
After each game, click anywhere to restart.
