This project implements an autonomous reinforcement learning agent that learns to play Cities Skylines 2 through pure visual observation and keyboard/mouse inputs. The agent operates with no access to the game's internal state or API, using only what it can "see" on screen to make decisions.
The project uses a deep reinforcement learning approach with the following components:
- Environment: Captures game state through screen capture, processes observations, and manages interactions with the game.
  - `environment/core`: Core environment interfaces and infrastructure
  - `environment/input`: Keyboard and mouse input simulation
  - `environment/menu`: Menu detection and navigation
  - `environment/rewards`: Reward computation based on visual changes
  - `environment/mock_environment.py`: Simulated environment for testing without the game
  - `environment/optimized_capture.py`: Optimized screen capture implementation
  - `environment/visual_metrics.py`: Visual metrics and measurements
- Agent: Implements the PPO (Proximal Policy Optimization) reinforcement learning algorithm.
  - `agent/core`: Core agent components (policy, value, memory, updater)
  - `agent/memory_agent.py`: Memory-augmented agent implementation
  - `agent/hierarchical_agent.py`: Hierarchical agent architecture
- Memory: Implements memory-augmented architectures for enhanced agent capabilities.
  - `memory/memory_augmented_network.py`: Neural memory architecture
  - `memory/episodic_memory.py`: Episodic memory functionality
- Model: Neural network architectures for the policy and value functions.
  - `model/optimized_network.py`: Optimized CNN network for visual processing
  - `model/visual_understanding_network.py`: Visual scene understanding
  - `model/world_model.py`: World modeling for predictions
  - `model/error_detection_network.py`: Error detection for the game
- Training: Manages the training process, checkpoints, and signal handling.
  - `training/trainer.py`: Training loop and management
  - `training/checkpointing.py`: Checkpoint saving and loading
  - `training/signal_handlers.py`: Handles interrupts and signals
  - `training/memory_trainer.py`: Trainer for the memory-augmented agent
  - `training/hierarchical_trainer.py`: Trainer for the hierarchical agent
- Utils: Utility functions and services, including monitoring capabilities.
  - `utils/image_utils.py`: Image processing utilities
  - `utils/hardware_monitor.py`: System resource monitoring
  - `utils/performance_safeguards.py`: Ensures stable performance
  - `utils/visualization.py`: Visualization tools
  - `utils/path_utils.py`: Path management for consistent file locations
- Config: Configuration for hardware and the action space.
  - `config/hardware_config.py`: Hardware configuration
  - `config/action_space.py`: Action space definition
  - `config/training_config.py`: Training parameters and settings
  - `config/config_loader.py`: Configuration loading utilities
- Benchmarks: Tools for performance analysis and optimization.
  - `benchmarks/benchmark_agent.py`: Measures agent performance metrics
- Tests: Automated testing infrastructure.
  - `tests/test_mock_environment.py`: Tests for the mock environment
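The PPO algorithm used by the agent centers on a clipped surrogate objective. As a minimal illustrative sketch (not the project's actual implementation, which operates on batched tensors), the per-transition loss can be written as:

```python
import math

def ppo_clip_loss(log_prob_new, log_prob_old, advantage, clip_eps=0.2):
    """Clipped PPO surrogate loss for one transition (value to be minimized)."""
    ratio = math.exp(log_prob_new - log_prob_old)   # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantage
    # Clamp the ratio to [1 - eps, 1 + eps] so a single update cannot move
    # the policy too far from the one that collected the data.
    clipped = max(min(ratio, 1.0 + clip_eps), 1.0 - clip_eps) * advantage
    return -min(unclipped, clipped)                 # pessimistic (lower) bound
```

For example, with a positive advantage and a probability ratio above `1 + clip_eps`, the clipped term caps the objective, removing the incentive to push the ratio further.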
```
cities-skylines-2/
├── src/                                # Source code directory
│   ├── agent/                          # Agent modules
│   │   ├── core/                       # Core agent components
│   │   │   ├── memory.py               # Memory buffer implementation
│   │   │   ├── policy.py               # Policy network and action selection
│   │   │   ├── ppo_agent.py            # PPO algorithm implementation
│   │   │   ├── updater.py              # Network update logic
│   │   │   └── value.py                # Value function implementation
│   │   ├── hierarchical_agent.py       # Hierarchical agent architecture
│   │   └── memory_agent.py             # Memory-augmented agent implementation
│   ├── benchmarks/                     # Benchmarking tools
│   │   └── benchmark_agent.py          # Agent performance evaluation
│   ├── config/                         # Configuration
│   │   ├── defaults/                   # Default configuration files
│   │   ├── action_space.py             # Action space definition
│   │   ├── config_loader.py            # Configuration loading utilities
│   │   ├── example_config.json         # Example configuration file
│   │   ├── hardware_config.py          # Hardware-specific configuration
│   │   └── training_config.py          # Training parameters and settings
│   ├── environment/                    # Environment modules
│   │   ├── core/                       # Core environment components
│   │   ├── input/                      # Keyboard and mouse input simulation
│   │   ├── menu/                       # Menu detection and navigation
│   │   ├── rewards/                    # Reward computation
│   │   ├── mock_environment.py         # Simulated environment
│   │   ├── optimized_capture.py        # Screen capture optimization
│   │   └── visual_metrics.py           # Visual-based metrics calculation
│   ├── memory/                         # Memory-augmented architectures
│   │   ├── episodic_memory.py          # Episodic memory implementation
│   │   └── memory_augmented_network.py # Neural network with memory capabilities
│   ├── model/                          # Neural network models
│   │   ├── error_detection_network.py  # Error detection for the game
│   │   ├── optimized_network.py        # Optimized CNN architecture
│   │   ├── visual_understanding_network.py # Visual scene understanding
│   │   └── world_model.py              # World modeling for predictions
│   ├── training/                       # Training infrastructure
│   │   ├── checkpointing.py            # Model checkpoint management
│   │   ├── hierarchical_trainer.py     # Trainer for hierarchical agent
│   │   ├── memory_trainer.py           # Trainer for memory-augmented agent
│   │   ├── signal_handlers.py          # Handles system signals during training
│   │   ├── trainer.py                  # Base trainer implementation
│   │   └── utils.py                    # Training utility functions
│   ├── tests/                          # Test scripts
│   │   └── test_mock_environment.py    # Tests for the mock environment
│   ├── utils/                          # Utility functions and monitoring
│   │   ├── hardware_monitor.py         # System resource monitoring
│   │   ├── image_utils.py              # Image processing utilities
│   │   ├── path_utils.py               # Path management for consistent file locations
│   │   ├── performance_safeguards.py   # Ensures stable performance
│   │   └── visualization.py            # Data visualization tools
│   ├── __init__.py                     # Package initialization
│   └── train.py                        # Main training entry point
├── docs/                               # Documentation
│   ├── agent.md                        # Agent documentation
│   ├── architecture.md                 # System architecture overview
│   ├── environment.md                  # Environment documentation
│   ├── improvements.md                 # Future improvements
│   ├── model.md                        # Model documentation
│   ├── README.md                       # Documentation overview
│   └── training.md                     # Training process documentation
├── scripts/                            # Utility scripts
│   ├── benchmark.py                    # Performance benchmarking
│   ├── dashboard.py                    # Real-time monitoring dashboard
│   ├── hyperparameter_tuning.py        # Hyperparameter optimization
│   ├── run_environment.py              # Run the environment standalone
│   ├── run_mock_training.py            # Training with mock environment
│   └── visualize_training.py           # Training visualization tools
├── .github/                            # GitHub configuration
├── .gitignore                          # Git ignore rules
├── CONTRIBUTING.md                     # Contribution guidelines
├── LICENSE                             # Project license
├── README.md                           # Main README (this file)
└── requirements.txt                    # Python dependencies
```
Note: The following directories are created at runtime:

- `logs/`: Generated log files and tensorboard data
- `output/`: Generated outputs, visualizations, and benchmark results
- `checkpoints/`: Model checkpoints during and after training

These directories are automatically created in the project root when needed by the application. All paths are managed by the path utilities in `src/utils/path_utils.py`, ensuring consistent file locations regardless of which directory you run the scripts from.
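The exact API of `path_utils.py` is not shown here, but the pattern it describes can be sketched as follows (function names `find_project_root` and `ensure_dir` are hypothetical): walk upward from any starting directory until a known marker file is found, then resolve runtime directories relative to that root.

```python
from pathlib import Path

def find_project_root(start: Path, marker: str = "requirements.txt") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found."""
    for candidate in [start, *start.parents]:
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"no {marker} found above {start}")

def ensure_dir(root: Path, name: str) -> Path:
    """Create a runtime directory (e.g. logs/, output/, checkpoints/) under the root."""
    path = root / name
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Because the root is discovered rather than assumed, `ensure_dir(root, "checkpoints")` yields the same location whether a script is launched from the project root or a subdirectory.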
- Windows 10/11
- NVIDIA GPU (RTX 3080 Ti recommended)
- Cities: Skylines 2
- Python 3.10+
- Clone the repository:

  ```bash
  git clone https://github.com/sfatkhutdinov/cities-skylines-2.git
  cd cities-skylines-2
  ```

- Create a virtual environment:

  ```bash
  python -m venv venv
  ```

- Activate the virtual environment:

  ```bash
  # On Windows
  venv\Scripts\activate
  # On Linux/Mac
  source venv/bin/activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
To start training the agent:

```bash
python src/train.py
```

Common options:

```
--num_episodes=1000                     # Number of episodes to train
--max_steps=1000                        # Maximum steps per episode
--checkpoint_dir=checkpoints            # Directory to save checkpoints
--render                                # Display visualization during training
--mock_env                              # Use mock environment for testing
--mixed_precision                       # Enable mixed precision training
--hardware_config=path/to/config.json   # Custom hardware configuration
```
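The flags above could be declared with a standard `argparse` parser; the following is a sketch of how `train.py` might define them, not the project's actual parser:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI definition mirroring the documented training flags."""
    parser = argparse.ArgumentParser(description="Train the Cities: Skylines 2 agent")
    parser.add_argument("--num_episodes", type=int, default=1000)
    parser.add_argument("--max_steps", type=int, default=1000)
    parser.add_argument("--checkpoint_dir", default="checkpoints")
    parser.add_argument("--render", action="store_true")
    parser.add_argument("--mock_env", action="store_true")
    parser.add_argument("--mixed_precision", action="store_true")
    parser.add_argument("--hardware_config", default=None)
    return parser

# Flags parse the same whether written as "--flag value" or "--flag=value".
args = build_parser().parse_args(["--mock_env", "--num_episodes=50"])
```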
The project includes a mock environment for testing without requiring the actual game:

```bash
python src/tests/test_mock_environment.py
```
This will run a series of tests on the mock environment, including:
- Basic functionality tests
- Complete episode simulation
- Error condition handling (crashes, freezes, menus)
- Visualization generation
The mock environment simulates:
- City building mechanics
- Population and budget dynamics
- Game crashes and freezes
- Menu interactions
- Reward computation
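To make the idea concrete, here is a toy stand-in for such a mock environment (class name, dynamics, and reward shape are all illustrative assumptions, far simpler than `mock_environment.py` itself): it tracks population and budget, occasionally "crashes", and exposes a gym-style `reset`/`step` interface.

```python
import random

class MockCityEnv:
    """Toy city simulator: population/budget dynamics plus simulated crashes."""

    def __init__(self, crash_prob=0.01, seed=0):
        self.rng = random.Random(seed)
        self.crash_prob = crash_prob
        self.reset()

    def reset(self):
        self.population, self.budget = 0, 10_000
        return self._observe()

    def step(self, action):
        # Simulate a rare game crash so recovery logic can be exercised.
        if self.rng.random() < self.crash_prob:
            return self._observe(), -10.0, True, {"crash": True}
        cost = 100 * (action + 1)            # building costs money...
        self.budget -= cost
        self.population += 10 * (action + 1) # ...but grows the city
        reward = self.population / 1000 - max(0, -self.budget) / 1000
        done = self.budget < -5_000          # bankruptcy ends the episode
        return self._observe(), reward, done, {}

    def _observe(self):
        return (self.population, self.budget)
```

A test harness can then run full episodes deterministically by fixing the seed and crash probability.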
Run comprehensive benchmarks on the agent and environment:

```bash
python src/benchmarks/benchmark_agent.py --episodes=10 --steps=500 --output=benchmark_results
```

Common options:

```
--episodes=10                  # Number of episodes to run
--steps=500                    # Maximum steps per episode
--config=path/to/config.json   # Configuration file
--output=benchmark_results     # Output directory name
--gpu                          # Force GPU usage
--cpu                          # Force CPU usage
--mixed_precision              # Use mixed precision
```
The benchmark will generate:
- Performance metrics (rewards, episode lengths, success rates)
- Hardware utilization statistics (CPU, GPU, memory)
- Visualizations of agent performance
- Detailed JSON and text reports
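The report-generation step can be sketched as follows; the field names and `summarize_episodes` helper are assumptions for illustration, not the benchmark's actual schema:

```python
import json
import statistics

def summarize_episodes(rewards, lengths):
    """Aggregate per-episode results into a JSON-serializable report."""
    return {
        "episodes": len(rewards),
        "mean_reward": statistics.mean(rewards),
        "std_reward": statistics.pstdev(rewards),
        "mean_length": statistics.mean(lengths),
        "best_reward": max(rewards),
    }

report = summarize_episodes(rewards=[1.0, 2.0, 3.0], lengths=[100, 120, 110])
print(json.dumps(report, indent=2))   # the JSON half of a "JSON and text" report
```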
The project includes several utility scripts to help with development, analysis, and monitoring:
Optimize agent hyperparameters:

```bash
python scripts/hyperparameter_tuning.py --method=random --trials=10 --visualize
```

Common options:

```
--output=hyperparameter_results   # Output directory for results
--method=[grid|random]            # Search method (grid or random search)
--trials=10                       # Number of trials for random search
--mock                            # Use mock environment
--epochs=5                        # Training epochs per trial
--episodes=5                      # Episodes per epoch
--parallel=4                      # Number of parallel trials to run
--visualize                       # Generate visualizations of results
```
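Random search, the `--method=random` mode above, boils down to sampling configurations from a search space and keeping the best-scoring one. A minimal sketch (the parameter names and ranges are hypothetical, not the script's actual search space):

```python
import math
import random

SEARCH_SPACE = {                      # illustrative ranges only
    "learning_rate": (1e-5, 1e-3),
    "clip_eps": (0.1, 0.3),
    "entropy_coef": (0.0, 0.05),
}

def sample_trial(rng):
    """Draw one configuration; the learning rate is sampled log-uniformly."""
    lo, hi = SEARCH_SPACE["learning_rate"]
    return {
        "learning_rate": 10 ** rng.uniform(math.log10(lo), math.log10(hi)),
        "clip_eps": rng.uniform(*SEARCH_SPACE["clip_eps"]),
        "entropy_coef": rng.uniform(*SEARCH_SPACE["entropy_coef"]),
    }

def random_search(score_fn, trials=10, seed=0):
    """Evaluate `trials` random configs and return the highest-scoring one."""
    rng = random.Random(seed)
    return max((sample_trial(rng) for _ in range(trials)), key=score_fn)
```

In the real script, `score_fn` would train the agent for a few epochs and return a metric such as mean episode reward.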
Run a real-time dashboard to monitor agent performance:

```bash
streamlit run scripts/dashboard.py -- --log_dir=logs
```

Common options:

```
--log_dir=logs          # Directory containing training logs
--port=8501             # Port to run the dashboard on
--host=localhost        # Host to run the dashboard on
--refresh_interval=30   # Dashboard refresh interval in seconds
```
See the `docs/` directory for detailed documentation on each component.

Run the tests with:

```bash
python -m unittest discover src/tests
```
The agent includes several performance optimization features:
- Adaptive Resource Management: Automatically adjusts resource usage based on system capabilities
- Mixed Precision Training: Reduces memory usage and increases training speed on compatible GPUs
- Error Recovery: Handles game crashes and freezes gracefully to continue training
- Hardware Monitoring: Tracks system resource usage to identify bottlenecks
- Performance Benchmarking: Tools to measure and optimize agent performance
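The error-recovery idea can be illustrated with a generic retry wrapper (the `with_recovery` helper and the use of `RuntimeError` as the crash signal are assumptions, not the project's actual recovery code): when a step fails, restart the game and retry with exponential backoff.

```python
import time

def with_recovery(step_fn, restart_fn, max_retries=3, backoff=1.0):
    """Run `step_fn`; on a detected crash, call `restart_fn` and retry."""
    for attempt in range(max_retries + 1):
        try:
            return step_fn()
        except RuntimeError:                      # stand-in for a crash/freeze
            if attempt == max_retries:
                raise                             # give up after the last retry
            restart_fn()                          # e.g. relaunch the game
            time.sleep(backoff * (2 ** attempt))  # exponential backoff
```

This keeps a long training run alive across transient failures while still surfacing persistent ones.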
Training progress is logged to the `logs/` directory. You can also use Weights & Biases for more detailed monitoring:

```bash
python src/train.py --use_wandb
```
This project is licensed under the MIT License - see the LICENSE file for details.