-
Notifications
You must be signed in to change notification settings - Fork 6
This repo contains a data science project to identify patients at high-risk of Alzheimer's disease.
License
chuktuk/Alzheimers_Disease_Analysis
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
This repo contains the entire Alzheimer's disease capstone project. There are in depth ipynb files for each section of the project that culminate in a milestone report and a final report. Presentation of the results are included in the AD_Final_Presentation.pptx file. Custom modules (.py files) were created and used for this analysis. Classical statistical techniques were used to identify threshold values for predicting high-risk patients for Alzheimer's disease. Supervised machine learning algorithms were used to generate predictive models capable of identifying high-risk patients for Alzheimer's disease. Word Documents: Capstone1_Ideas Initial ideas for the capstone project Capstone1_Proposal Proposal for capstone project IPython Notebooks: 1-Data_Import_and_Clean.ipynb data wrangling and cleaning steps 2-Data_Storytelling.ipynb exploratory data analysis 3-Statistical_Data_Analysis.ipynb statistical analysis 4-Capstone_Milestone_Report.ipynb summary milestone report 5-Machine_Learning.ipynb in depth analysis, machine learning, and predictive modeling 6-Alzheimers_Final_Report.ipynb final report Additional IPython Notebook: zz2-Data_Storytelling_Raw-Copy1.ipynb additional exploratory analyses omitted from final product Custom modules: adnidatawrangling.py code to wrangle and clean the data (derived from 1-Data_Import_and_Clean.ipynb) eda.py code to produce exploratory data analysis results and visualizations (derived from 2-Data_Storytelling.ipynb) ml.py code to process and summarize machine learning algorithms (derived from 5-Machine_Learning.ipynb) sda.py code for statistical analysis (derived from 3-Statistical_Data_Analysis.ipynb) Borrowed modules: feature_selector.py feature selection tool obtained from https://github.com/WillKoehrsen/feature-selector Data files: ADNIMERGE.csv comma separated values file containing the data used for this analysis ADNIMERGE_DICT.csv comma separated values file containing a data dictionary for ADNIMERGE.csv Additional files from exploratory data analysis: pairplot.png image of pairplot to explore possible correlations pairplot_clin.png image of pairplot to explore possible correlations pairplot_scans.png image of pairplot to explore possible correlations
About
This repo contains a data science project to identify patients at high-risk of Alzheimer's disease.
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published