Authors
Audrey Han¹ and Howard Lee², ¹USA, ²California State University, USA
Abstract
Parkinson's disease is a prevalent neurodegenerative disease that primarily affects the elderly. Although it currently has no cure, early detection is essential for treating and reducing symptoms. Our project aims to make prediction of Parkinson's disease more accessible and convenient for patients by using keystroke data, which can be gathered easily from any keyboard. To achieve this, we used predictive machine learning. We tested several types of models, including Random Forest Classifier, Logistic Regression, Gradient Boosting, and Support Vector Machines. Of these, our logistic regression model achieved the highest accuracy. We then tested the models on simulated data, once with a simulated sample of all positive cases and once with a sample containing a mix of positive and negative cases. These experiments supported our conclusion that logistic regression was our most reliable model. Finally, we created a website that allows users to predict whether they have Parkinson's disease from their own keystroke data. This makes our model accessible to the general public and easy to use, encouraging proactive testing and preventative measures.
Keywords
Parkinson's, Machine Learning, Python, Scikit-learn