Project Summary: Chronic Disease Risk Prediction
Overview:
This project uses machine learning to identify the lifestyle and health factors that most influence chronic disease risk, focusing on diabetes and heart disease. Using the BRFSS dataset, it trains classification models to flag high-risk populations and inform preventive healthcare strategies.
Key Objectives:
- Analyze Chronic Disease Risk Factors: Identify key contributors to diabetes and heart disease.
- Develop Predictive Models: Use machine learning (Logistic Regression, Decision Trees, Random Forest) to forecast risk levels.
- Segment High-Risk Populations: Define patient groups needing targeted interventions.
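The segmentation objective can be illustrated with a minimal sketch that buckets predicted risk probabilities into tiers. The function name `segment_by_risk`, the tier labels, and the 0.3 / 0.6 cutoffs are hypothetical choices made for illustration; the summary does not state how the project actually defined its patient groups.

```python
import numpy as np
import pandas as pd

def segment_by_risk(probabilities, low_cutoff=0.3, high_cutoff=0.6):
    """Assign each predicted probability to a risk tier.

    The cutoffs are illustrative assumptions, not values from the project.
    Bins are right-closed: (0, low], (low, high], (high, 1].
    """
    tiers = pd.cut(
        probabilities,
        bins=[0.0, low_cutoff, high_cutoff, 1.0],
        labels=["low", "moderate", "high"],
        include_lowest=True,  # keep a probability of exactly 0.0 in "low"
    )
    return pd.Series(tiers, name="risk_tier")

# Made-up model outputs, e.g. from a classifier's predict_proba
scores = np.array([0.05, 0.30, 0.45, 0.61, 0.92])
tiers = segment_by_risk(scores)
print(tiers.value_counts().sort_index())
```

In practice the cutoffs would likely be tuned against recall targets or intervention capacity rather than fixed in advance.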
Data & Methodology:
Dataset: BRFSS 2022 (Behavioral Risk Factor Surveillance System)
Tech Stack: Python (Pandas, NumPy, Scikit-learn), Power BI, Tableau
Models Used:
- Diabetes Prediction: Random Forest (85% accuracy, F1-score 0.80), Decision Tree (85% accuracy)
- Heart Disease Prediction: Random Forest (93% accuracy, F1-score 0.91), Decision Tree (94% accuracy)
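As a rough sketch of how such a comparison might be set up in Scikit-learn, the example below trains a Random Forest and a Decision Tree on the same split and reports accuracy and F1-score. It runs on synthetic data from `make_classification`, not on BRFSS, and the hyperparameters and class balance are assumptions, so its numbers will not match the results listed above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for survey features (NOT BRFSS data).
# weights=[0.85, 0.15] mimics an imbalanced outcome; the ratio is an assumption.
X, y = make_classification(
    n_samples=2000, n_features=12, n_informative=6,
    weights=[0.85, 0.15], random_state=0,
)

# Stratify so both splits keep the same share of positive cases.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameters are illustrative, not the project's settings.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "Decision Tree": DecisionTreeClassifier(max_depth=6, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred),  # F1 for the positive (at-risk) class
    }
    print(f"{name}: accuracy={results[name]['accuracy']:.2f}, "
          f"f1={results[name]['f1']:.2f}")
```

Reporting F1 alongside accuracy matters when the positive class is rare, because a model that mostly predicts "no disease" can still score high on accuracy.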
Key Findings & Insights:
Diabetes Risk Factors: BMI > 30, arthritis, pulmonary disease, smoking, mental health, cancer history, COVID symptoms.
Heart Disease Risk Factors: Age > 45, BMI > 30, stroke history, difficulty walking, smoking, poor sleep.
Feature engineering and model optimization improved both prediction accuracy and recall.
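The threshold-style findings above (BMI > 30, Age > 45) suggest one simple form of feature engineering: deriving binary flags from continuous survey values. The sketch below shows that idea with Pandas. The column names (`bmi`, `age`, `sleep_hours`), the helper `add_risk_flags`, and the 6-hour sleep cutoff are hypothetical; real BRFSS variables are coded differently, and the summary does not say which engineered features the project used.

```python
import pandas as pd

# Made-up respondents with hypothetical column names (not BRFSS variables).
survey = pd.DataFrame({
    "bmi": [22.5, 31.2, 30.0, 35.8],
    "age": [38, 52, 45, 67],
    "sleep_hours": [7, 5, 8, 6],
})

def add_risk_flags(df, min_sleep_hours=6):
    """Return a copy of df with binary risk-factor flags added.

    The BMI and age thresholds follow the findings above; the sleep
    cutoff is an assumption made for this example.
    """
    out = df.copy()
    out["obese"] = (out["bmi"] > 30).astype(int)            # strictly above 30
    out["age_over_45"] = (out["age"] > 45).astype(int)      # strictly above 45
    out["poor_sleep"] = (out["sleep_hours"] < min_sleep_hours).astype(int)
    return out

flagged = add_risk_flags(survey)
print(flagged)
```

Tree-based models can learn such splits on their own, so flags like these tend to help most with linear models such as Logistic Regression and with reporting in dashboards.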
Impact & Future Scope:
This project gives healthcare providers and policymakers actionable insights for designing targeted prevention strategies. Future work includes integrating real-time health monitoring data and extending the models to other chronic diseases.