LeetCode Scraper is a Python-based tool that fetches LeetCode problems and study plans and stores them in a PostgreSQL database. It uses Docker for easy setup and environment management.
- Fetches LeetCode problems and study plans
- Stores data in a PostgreSQL/Supabase database
- Provides caching to reduce redundant requests
- Handles rate limiting with retry mechanisms
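The caching and retry behavior listed above could look roughly like the following. This is a minimal sketch, not the project's actual implementation: `RateLimitedError`, `fetch_with_retry`, and the exponential backoff schedule are all hypothetical names and choices made for illustration.

```python
import time
from typing import Any, Callable, Dict


class RateLimitedError(Exception):
    """Hypothetical error a fetcher raises when the server rate-limits it."""


def fetch_with_retry(
    fetch: Callable[[str], Any],
    key: str,
    cache: Dict[str, Any],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Return the cached value for `key`, or fetch it with backoff on rate limits."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    # Serve from the cache first so repeated requests cost nothing.
    if key in cache:
        return cache[key]

    for attempt in range(max_attempts):
        try:
            result = fetch(key)
        except RateLimitedError:
            # Give up after the last attempt; otherwise wait 1x, 2x, 4x, ...
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
        else:
            cache[key] = result
            return result
```

Passing `sleep` as a parameter keeps the sketch easy to test without real delays.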
Before you begin, ensure you have met the following requirements:
- Docker and Docker Compose installed on your machine.
- Python 3.9 or higher.
- PostgreSQL database.
Clone the Repository
git clone --recurse-submodules https://github.com/daily-coding-problem/leetcode-scraper.git
cd leetcode-scraper
Set Up Python Environment
Use the following commands to set up the Python environment if you do not want to use Docker:
python -m venv .venv
source .venv/bin/activate
pip install poetry
poetry install --no-root
Set Up Docker
If you would like to use Docker, ensure Docker and Docker Compose are installed on your machine. If not, follow the installation guides for Docker and Docker Compose.
Build Docker Images
docker compose build
Create the Network
docker network create dcp
Environment Variables
Create a .env file in the project root with the following content:
# LeetCode credentials
CSRF_TOKEN=your_csrf_token
LEETCODE_SESSION=your_leetcode_session
# PostgreSQL credentials
POSTGRES_USER=your_db_user
POSTGRES_PASSWORD=your_db_password
POSTGRES_DB=your_db_name
POSTGRES_PORT=5432
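As an illustration of how these variables might be consumed, the sketch below assembles a PostgreSQL connection URL from them. The function name `build_postgres_dsn` and the `host` parameter are assumptions, since the .env file above does not define a host; the project may read its configuration differently.

```python
import os
from typing import Mapping
from urllib.parse import quote


def build_postgres_dsn(env: Mapping[str, str] = os.environ, host: str = "localhost") -> str:
    """Build a postgresql:// URL from the POSTGRES_* variables (hypothetical helper)."""
    # Percent-encode credentials so characters such as '@' or ':' stay valid in a URL.
    user = quote(env["POSTGRES_USER"], safe="")
    password = quote(env["POSTGRES_PASSWORD"], safe="")
    database = quote(env["POSTGRES_DB"], safe="")
    # Fall back to the standard PostgreSQL port when POSTGRES_PORT is unset.
    port = int(env.get("POSTGRES_PORT", "5432"))
    return f"postgresql://{user}:{password}@{host}:{port}/{database}"
```

A missing required variable raises `KeyError`, which surfaces configuration mistakes early instead of failing later at connection time.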
Run the scraper with the specified plans:
docker compose run leetcode-scraper --plans leetcode-75 top-interview-150
Or without Docker:
poetry run python main.py --plans leetcode-75 top-interview-150
Run the scraper with the specified company and timeframe:
docker compose run leetcode-scraper --company google --timeframe 3m
Or without Docker:
poetry run python main.py --company google --timeframe 3m
This will fetch the most asked questions at Google in the last 3 months.
The options for --timeframe are: 30d, 3m, or 6m.
- If no timeframe is specified, the default is 6m.
- If the timeframe is invalid, the default will be used.
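The fallback rules for --timeframe can be sketched with `argparse` as follows. This is an assumption about how such parsing might be written, not the code in main.py; `normalize_timeframe` and `parse_args` are hypothetical names.

```python
import argparse
from typing import List, Optional

VALID_TIMEFRAMES = ("30d", "3m", "6m")
DEFAULT_TIMEFRAME = "6m"


def normalize_timeframe(value: Optional[str]) -> str:
    """Return `value` if it is a supported timeframe, otherwise the default."""
    if value in VALID_TIMEFRAMES:
        return value
    return DEFAULT_TIMEFRAME


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the flags shown in the usage examples (hypothetical parser)."""
    parser = argparse.ArgumentParser(description="Fetch LeetCode plans or company questions.")
    parser.add_argument("--plans", nargs="+", default=[], help="study plan slugs")
    parser.add_argument("--company", help="company slug, e.g. google")
    parser.add_argument("--timeframe", default=DEFAULT_TIMEFRAME, help="30d, 3m, or 6m")
    args = parser.parse_args(argv)
    # Invalid values silently fall back to the default rather than raising an error.
    args.timeframe = normalize_timeframe(args.timeframe)
    return args
```

The sketch deliberately avoids argparse's `choices=` option, because that would reject an invalid timeframe with an error instead of falling back to the default as described above.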
Run the tests with the following command:
poetry run pytest
This project is licensed under the MIT License - see the LICENSE file for details.