Heart Failure Death Prediction Project in Data Challenge 2020
1. What I did
In the fall of my junior year (2020), I collaborated with Shaonan Wang, an ISYE major at UW-Madison, on the Data Challenge hosted by the Data Science Club. The task was to preprocess the given dataset and build the best possible statistical or machine learning prediction model, where “best” meant highest accuracy.
The dataset, from Kaggle.com, covers almost 300 patients and is used to predict death due to heart failure from twelve health characteristics (diabetes, high blood pressure, etc.). We applied Logistic Regression and Random Forest, with the latter reaching 87% accuracy in the end. For model evaluation, we used a confusion matrix and an ROC curve. Finally, we wrote an analysis report and presented our findings to professors and other teams.
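To make the evaluation step concrete, here is a minimal sketch of how a confusion matrix, accuracy, and the area under the ROC curve can be computed for a binary death/survival classifier. It is not the code from the project: the function names, the 0.5 threshold, and the toy labels and scores are all made up for illustration, and in practice a library such as scikit-learn would likely be used instead.

```python
from typing import List, Tuple


def confusion_matrix(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary labels, where 1 = death event."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of patients whose outcome was predicted correctly."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tn + tp) / (tn + fp + fn + tp)


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case gets a higher score than a randomly chosen negative
    case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: true outcomes and predicted death probabilities
# for ten patients (illustrative values only, not from the Kaggle dataset).
y_true = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
scores = [0.1, 0.3, 0.2, 0.6, 0.8, 0.4, 0.9, 0.2, 0.7, 0.5]

# Assumed decision threshold of 0.5 to turn probabilities into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("confusion matrix (tn, fp, fn, tp):", confusion_matrix(y_true, y_pred))
print("accuracy:", accuracy(y_true, y_pred))
print("ROC AUC:", roc_auc(y_true, scores))
```

Accuracy depends on the chosen threshold, while the ROC AUC summarizes performance across all thresholds, which is one reason to report both when comparing models such as Logistic Regression and Random Forest.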