This project predicts whether a hotel booking will be canceled, using a Machine Learning Operations (MLOps) pipeline. The pipeline covers data ingestion, preprocessing, model training, and deployment, and it serves real-time predictions through a web application. The model is trained on the Hotel Reservations Dataset, which records the booking attributes that influence cancellations.
├── artifacts/ # Stored models & processed data
├── config/ # Configuration files (YAML, parameter settings)
├── custom_jenkins/ # CI/CD pipeline setup (Jenkins)
├── notebook/ # Exploratory Data Analysis (EDA) & experimentation
├── pipeline/ # ML pipeline for data preprocessing & training
├── src/ # Core scripts (ingestion, preprocessing, model training)
├── utils/ # Utility functions for data transformation & logging
├── static/ # Web UI stylesheets
├── templates/ # HTML templates for the web UI
├── logs/ # Application & model logging
├── venv/ # Virtual environment setup
├── Dockerfile # Containerization setup
├── Jenkinsfile # CI/CD automation pipeline
├── requirements.txt # Required dependencies
└── setup.py # Package installation setup
- Database & Project Setup: Configure and initialize the required infrastructure.
- Data Ingestion & Processing: Load, clean, and preprocess hotel booking data.
- Model Training & Experimentation: Train a LightGBM model with hyperparameter tuning.
- Versioning & Tracking: Maintain data and code versioning for reproducibility.
- CI/CD Pipeline: Automate training, validation, and deployment using Jenkins.
- Web Application: Flask-based UI for real-time cancellation prediction.
- Deployment: Containerized with Docker and deployed on Google Cloud Run.
Ensure Python is installed (Python 3.8 or later is recommended).
# Clone the repository
git clone https://github.com/SM0311/MLOPS-HOTEL-BOOKING-CANCELLATION.git
cd MLOPS-HOTEL-BOOKING-CANCELLATION
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Run Flask Application
python application.py
- Train Model:
python training_pipeline.py
- Run Web App Locally:
python application.py
- Docker Deployment:
docker build -t hotel-predict .
docker run -p 5000:5000 hotel-predict
This project follows a CI/CD workflow using Jenkins:
- Automated data ingestion and preprocessing.
- Continuous model training and validation.
- Containerized deployment via Docker & Google Cloud Run.
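One way the validation stage could block a deployment is a small gate script that compares the trained model's metrics against minimum values and reports a non-zero exit code on failure; a Jenkins `sh` step fails when the command it runs exits non-zero. The sketch below is an assumption about how such a gate might look, not the project's Jenkinsfile logic. The function names, the thresholds, and the example metrics are all hypothetical.

```python
# Hypothetical minimums; 0.90 mirrors the accuracy figure quoted in this README.
THRESHOLDS = {"accuracy": 0.90, "precision": 0.80, "recall": 0.80}


def failed_checks(metrics, thresholds=THRESHOLDS):
    """Return a message for every metric that is missing or below its minimum."""
    failures = []
    for name, minimum in thresholds.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: missing")
        elif value < minimum:
            failures.append(f"{name}: {value:.3f} < {minimum:.3f}")
    return failures


def gate(metrics):
    """Print each failure and return a process-style exit code (0 = pass)."""
    failures = failed_checks(metrics)
    for message in failures:
        print("FAILED", message)
    return 1 if failures else 0


# Made-up metrics for illustration; a real run would load them from the
# training output. In a CI script, pass the result to sys.exit().
example_metrics = {"accuracy": 0.91, "precision": 0.85, "recall": 0.78}
exit_code = gate(example_metrics)
print("exit code:", exit_code)
```

With these example values the recall check fails, so the gate returns 1 and the deployment stage would not run.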
The model leverages LightGBM for high accuracy and efficiency. Performance metrics include:
- Accuracy: Above 90%.
- Precision & Recall: Balanced, so the model neither over-flags bookings nor misses likely cancellations.
- Feature Importance: Identifies key booking attributes influencing cancellations.
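For reference, accuracy, precision, and recall for the binary cancellation label can be computed from the four confusion-matrix counts. The sketch below uses only the standard library; the function name and the sample labels are made up for illustration and are not results from this project.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, and recall for labels where 1 = canceled."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # canceled, predicted canceled
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # kept, predicted kept
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # kept, predicted canceled
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # canceled, predicted kept

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Of the bookings flagged as cancellations, how many really canceled.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Of the bookings that really canceled, how many were flagged.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Precision falls when the model flags bookings that are kept, and recall falls when it misses real cancellations, which is why the two are reported together.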
Contributions are welcome! Feel free to fork this repository, create pull requests, and report issues.
For inquiries, email msuraj20@yahoo.com.