This project is part of the DSA4264 module at the National University of Singapore (NUS). It was completed by Year 4 students from Data Science and Analytics, as well as Data Science and Economics majors. The primary objective is to identify three bus routes for potential removal based on data-driven insights.
This project leverages data analysis to evaluate bus route efficiency in Singapore and aims to identify routes that can be optimised or removed to improve public transportation efficiency.
To set up the project on your local machine, follow these steps:
- Clone the repository to your local machine:
  git clone https://github.com/your-username/dsa4264_project.git
- Navigate to the project directory:
  cd dsa4264_project
- Install the required dependencies:
  pip install -r requirements.txt
- Create a .env file in the root directory of the project.
- Obtain two API keys: one for LTA DataMall and one for OneMap.
- Add the API keys to the .env file as shown below:
  API_KEY = 'your_lta_api_key_here' # LTA DataMall API Key
  ACCESS_TOKEN = 'your_onemap_api_key_here' # OneMap API Key
- Fetch the data files by running the data_pulling.py script, which downloads and prepares the necessary datasets:
  python data_pulling.py
- Process the data by opening data_processing.ipynb and clicking Run all. This cleans and processes the data. Note that some steps, such as the OSRM functions, take approximately 5 hours to run.
- Analyse and view the results by opening the main.ipynb Jupyter notebook. Run all cells to review the analytics, code logic, and decision-making process behind identifying bus routes for removal.
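Before running the scripts, it can help to confirm that both keys are actually present in the .env file. The following is a minimal sketch, assuming the file uses the KEY = 'value' # comment layout shown above. The helper names parse_env and missing_keys are hypothetical and are not taken from the project, which may well load the file differently (for example with a library such as python-dotenv).

```python
from pathlib import Path

# Key names as they appear in the .env example above.
REQUIRED_KEYS = ("API_KEY", "ACCESS_TOKEN")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY = 'value' # comment lines into a dict (hypothetical helper)."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, full-line comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, rest = line.partition("=")
        rest = rest.strip()
        if rest[:1] in ("'", '"'):
            # Quoted value: take everything up to the matching closing quote.
            quote = rest[0]
            end = rest.find(quote, 1)
            value = rest[1:end] if end != -1 else rest[1:]
        else:
            # Unquoted value: drop any trailing inline comment.
            value = rest.split("#", 1)[0].strip()
        values[key.strip()] = value
    return values


def missing_keys(values: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return the required key names that are absent or empty."""
    return [name for name in required if not values.get(name)]


if __name__ == "__main__":
    env_path = Path(".env")
    values = parse_env(env_path.read_text()) if env_path.exists() else {}
    print("Missing keys:", missing_keys(values) or "none")
```

Running this from the project root prints the names of any keys that still need to be filled in.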
After setting up the .env file and running data_pulling.py, data_processing.ipynb, and main.ipynb in that order, you can launch the Flask application to visualise the data:
python app.py
The web app will allow users to interact with the data and view analyses performed in the project.
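The run order described above can also be scripted. This is a rough sketch rather than part of the project: it assumes jupyter nbconvert is installed and that the notebooks run correctly outside the Jupyter interface, neither of which the project confirms. The function names build_pipeline and run_pipeline are hypothetical. The execution timeout is disabled because the processing notebook is expected to run for hours.

```python
import subprocess
import sys


def build_pipeline(python: str = sys.executable) -> list[list[str]]:
    """Return the commands for each stage, in the order the README gives."""
    # Assumption: notebooks are executed headlessly with nbconvert instead of
    # clicking Run all. --inplace writes the outputs back into the notebook.
    run_notebook = [
        "jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace",
        "--ExecutePreprocessor.timeout=-1",  # no per-cell timeout
    ]
    return [
        [python, "data_pulling.py"],
        run_notebook + ["data_processing.ipynb"],
        run_notebook + ["main.ipynb"],
        [python, "app.py"],
    ]


def run_pipeline(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, run it and stop on failure."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run by default: only shows what would be executed.
    run_pipeline(build_pipeline())
```

Keeping dry_run=True as the default avoids accidentally starting the multi-hour processing step. Pass dry_run=False once the printed commands look right.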
- Data Collection: Automates the retrieval of data from LTA DataMall and OneMap APIs.
- Analysis: Includes comprehensive data analysis to evaluate the viability of bus routes.
- Visualisation: Provides visual representations of key insights for improved understanding and decision-making.
- Data-Driven Decision-Making: Supports recommendations for optimising the bus route network by suggesting potential route removals.
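To illustrate the Data Collection feature, here is a hedged sketch of paged retrieval from LTA DataMall. It is not the code in data_pulling.py. The base URL, the AccountKey header, the "value" field in the response, the $skip parameter, and the page size of 500 are assumptions based on how the DataMall service is commonly documented, so check them against the current LTA DataMall user guide. The names fetch_page and fetch_all are hypothetical.

```python
import json
from typing import Callable
from urllib.request import Request, urlopen

# Assumed values; verify against the current LTA DataMall documentation.
BASE_URL = "https://datamall2.mytransport.sg/ltaodataservice"
PAGE_SIZE = 500


def fetch_page(url: str, api_key: str) -> list[dict]:
    """Fetch one page of records (assumes the key is sent as an AccountKey header)."""
    request = Request(url, headers={"AccountKey": api_key, "accept": "application/json"})
    with urlopen(request, timeout=30) as response:
        # Assumption: records are returned under the "value" field.
        return json.load(response).get("value", [])


def fetch_all(
    endpoint: str,
    api_key: str,
    fetch: Callable[[str, str], list[dict]] = fetch_page,
) -> list[dict]:
    """Collect every record by increasing $skip until a short page is returned."""
    records: list[dict] = []
    skip = 0
    while True:
        page = fetch(f"{BASE_URL}/{endpoint}?$skip={skip}", api_key)
        records.extend(page)
        if len(page) < PAGE_SIZE:
            return records
        skip += PAGE_SIZE
```

The fetch argument is injectable so the paging logic can be exercised with a stand-in function instead of a live API call. A real call might look like fetch_all("BusRoutes", api_key), where BusRoutes is likewise an assumed endpoint name.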
You can read our technical documentation in technical_report.html.
This project uses Python 3.11.5. All other dependencies are listed in requirements.txt.
This project was completed in five weeks by the following team members (listed in no particular order of contribution):
- Brandon NEO (NUS Y4 DSE)
- CHOW Xin Tian (NUS Y4 DSE)
- LIM Choon Hao (NUS Y4 DSA)
- YOUNG Zhan Heng (NUS Y4 DSE)
We would also like to thank our Adjunct Lecturer, Shaun Khoo, for guiding us throughout this project.