Deep Learning Based Data Governance for Chinese Electronic Health Record Analysis

Authors

Junmei Zhong¹, Xiu Yi², Jian Wang², Zhuquan Shao², Panpan Wang², and Sen Lin²; ¹Inspur USA Inc., USA; ²Inspur Software Group, China

Abstract

Electronic health record (EHR) analysis can yield valuable insights for improving the quality of health care. However, data quality problems such as missing values, inconsistencies, and errors in the data columns hinder the construction of robust machine learning models for data analysis. In this paper, we develop an artificial intelligence (AI)-based data governance methodology that predicts missing values, verifies whether existing values are correct, and proposes the correct value when an existing one is wrong. We demonstrate the methodology through a case study of patient gender prediction and verification. Experimental results show that the deep learning algorithm achieves strong test performance as measured by F1-score, and that it outperforms support vector machine (SVM) models trained on several different vector representations of the documents.
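The predict-or-verify idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function `govern_field`, the status labels, the confidence `threshold`, and the keyword-based `toy_predict` stand-in for a trained classifier are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Optional, Tuple

# A predictor maps a record's free text to (predicted label, confidence in [0, 1]).
Predictor = Callable[[str], Tuple[str, float]]


def govern_field(
    text: str,
    recorded: Optional[str],
    predict: Predictor,
    threshold: float = 0.9,  # assumed cut-off, not taken from the paper
) -> Tuple[Optional[str], str]:
    """Fill a missing value, or verify/correct an existing one.

    Returns (value to store, status), where status is one of
    "filled", "verified", "corrected", or "unverified".
    """
    label, confidence = predict(text)
    if confidence < threshold:
        # The model is not confident enough to act; keep what is recorded.
        return recorded, "unverified"
    if not recorded:
        # Missing value: use the model's prediction.
        return label, "filled"
    if recorded == label:
        return recorded, "verified"
    # Recorded value disagrees with a confident prediction.
    return label, "corrected"


def toy_predict(text: str) -> Tuple[str, float]:
    """Hypothetical keyword rule standing in for a trained text classifier."""
    lowered = text.lower()
    if "pregnan" in lowered:
        return "female", 0.99
    if "prostate" in lowered:
        return "male", 0.99
    return "female", 0.5  # no evidence either way: low confidence


if __name__ == "__main__":
    print(govern_field("Prostate exam normal", None, toy_predict))
    print(govern_field("Pregnancy, week 12", "male", toy_predict))
```

In practice the `predict` argument would wrap whatever trained model is available, and a record marked "corrected" would more plausibly be routed to human review than overwritten automatically.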

Keywords

EHR Analysis, Data Governance, Vector Space Model, Word Embeddings, Machine Learning, Convolutional Neural Networks.

Volume 8, Number 5