Kenneth Griffin PhD, PE - Identifying Abnormal Rhythms on Electrocardiograms with Keras Tensorflow

Methods: Python

Data Analysis: Pandas, NumPy

Data Visualization: Matplotlib, Pydot, Graphviz

Data Modeling: Keras, Tensorflow, Neural Network

Web Development: HTML, CSS

Identifying Abnormal Rhythms on Electrocardiograms

This example trains a neural network to identify abnormal rhythms in the ECG5000 electrocardiogram dataset with Keras and TensorFlow. The coding exercise is provided in the Python Jupyter notebook below.
The full dataset contains 5,000 electrocardiograms with 140 points each. This example uses a simplified version of the dataset in which each electrocardiogram is labeled either 1 (normal rhythm) or 0 (abnormal rhythm). The neural network is built in Keras with an input layer, one hidden layer, and one output layer.
The neural network reached an accuracy of 98.9% after 100 epochs. The number of epochs can be adjusted to further optimize the model. For comparison, a baseline that always predicts 0 would be correct 41% of the time, and one that always predicts 1 would be correct 58% of the time. The trained model can be used to inspect future electrocardiograms.
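The constant-prediction baselines mentioned above can be computed directly from the label column. The sketch below is only an illustration: the function name `constant_baseline_accuracy` and the small label array are hypothetical and are not taken from the notebook or from ECG5000.

```python
import numpy as np


def constant_baseline_accuracy(labels, constant):
    """Accuracy obtained by predicting `constant` for every sample."""
    labels = np.asarray(labels)
    return float(np.mean(labels == constant))


# Hypothetical labels for illustration only (not the ECG5000 data):
# 1 = normal rhythm, 0 = abnormal rhythm.
example_labels = np.array([1, 1, 1, 0, 0, 1, 0, 1, 1, 0])

acc_always_0 = constant_baseline_accuracy(example_labels, 0)
acc_always_1 = constant_baseline_accuracy(example_labels, 1)
print(f"always 0: {acc_always_0:.2f}, always 1: {acc_always_1:.2f}")
```

A trained model is only informative if its accuracy clearly exceeds the larger of these two values.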

Tables and graphs can be found here: Identifying Abnormal Rhythms on Electrocardiograms with Keras Tensorflow Python Jupyter Notebook.

Results

-The full dataset contains 5,000 Electrocardiograms with 140 points each and was published in: Goldberger, A. L., Amaral, L. A. N., Glass, L., Hausdorff, J. M., Ivanov, P. Ch., Mark, R. G., Mietus, J. E., Moody, G. B., Peng, C.-K., & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation, 101(23), e215-e220.

-This example uses a simplified version of the dataset in which each electrocardiogram is labeled either 1 (normal rhythm) or 0 (abnormal rhythm).

-The total dataset df is split into 80% training and 20% test dataframes. The split is done before normalization so that the normalization statistics can be computed without using the test data.

-The training and test dataframes are normalized with the calculated means and standard deviations. The X_train and X_test dataframes are then converted to NumPy arrays, which are a convenient input format for Keras/TensorFlow.

-The neural network contains one hidden layer with ReLU activation functions. A sigmoid activation function is used for the output layer because this is a binary classification problem.

-The neural network training settings were: Optimizer - Adam, a variant of stochastic gradient descent (SGD), is selected as the default. Loss function - the model's output is binary, so binary_crossentropy is selected. Metrics - this is a classification problem, so accuracy is selected.

-20% of the data is used as a validation dataset. Validation data is useful for detecting overfitting and for regularization by early stopping. Early stopping can be done by identifying the number of epochs at which accuracy levels off.

-The loss and accuracy plots are inspected for signs of overfitting. If overfitting is suspected beyond a specific epoch N, the model can be reinitialized and trained for N epochs. Choosing the number of epochs so that training stops before overfitting occurs is called early stopping. The test accuracy below can be compared to determine which model is better.

-The neural network reached an accuracy of 98.9% after 100 epochs. The number of epochs can be adjusted to further optimize the model. For comparison, a baseline that always predicts 0 would be correct 41% of the time, and one that always predicts 1 would be correct 58% of the time. The trained model can be used to inspect future electrocardiograms.
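The split and normalization steps in the list above can be sketched as follows. This is a minimal illustration, not the notebook's code: the helper `split_and_normalize`, the column name `label`, and the random stand-in data are all hypothetical, and the sketch assumes the means and standard deviations are taken from the training rows only.

```python
import numpy as np
import pandas as pd


def split_and_normalize(df, label_col, test_frac=0.2, seed=0):
    """Shuffle-split df 80/20, then z-score the features."""
    shuffled = df.sample(frac=1.0, random_state=seed)
    n_test = int(round(len(shuffled) * test_frac))
    test_df, train_df = shuffled.iloc[:n_test], shuffled.iloc[n_test:]

    X_train = train_df.drop(columns=[label_col])
    X_test = test_df.drop(columns=[label_col])
    y_train = train_df[label_col].to_numpy()
    y_test = test_df[label_col].to_numpy()

    # Assumption: statistics are computed from the training rows only.
    mean = X_train.mean()
    std = X_train.std().replace(0, 1.0)  # guard against constant columns

    # Convert to NumPy arrays for Keras/TensorFlow.
    X_train = ((X_train - mean) / std).to_numpy()
    X_test = ((X_test - mean) / std).to_numpy()
    return X_train, X_test, y_train, y_test


# Synthetic stand-in data (random numbers, not ECG5000): 50 rows, 140 points.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(loc=5.0, scale=2.0, size=(50, 140)))
df["label"] = rng.integers(0, 2, size=50)

X_train, X_test, y_train, y_test = split_and_normalize(df, "label")
print(X_train.shape, X_test.shape)
```

Reusing the training statistics for the test rows keeps the test set from influencing preprocessing; whether the notebook does exactly this is not stated in the text.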
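The model, compile settings, and validation split described in the list above might look like the sketch below. Several details are assumptions: the hidden layer width (32 units) is not given in the text, the data here is random stand-in data rather than ECG5000, and only 20 epochs are run to keep the example quick. The text describes early stopping as choosing the epoch count by hand from the plots; the `EarlyStopping` callback used here is a Keras alternative that automates the same idea, not necessarily what the notebook does.

```python
import numpy as np
from tensorflow import keras


def build_model(n_features, hidden_units=32):
    """One hidden ReLU layer and a sigmoid output for binary classification.

    hidden_units=32 is an assumed value; the text does not state the width.
    """
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(hidden_units, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


# Random stand-in data (not ECG5000): 200 samples with 140 points each.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 140)).astype("float32")
y_train = (X_train[:, 0] > 0).astype("float32")

model = build_model(n_features=140)

# Stop when validation loss has not improved for 5 consecutive epochs.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

history = model.fit(X_train, y_train,
                    epochs=20,
                    validation_split=0.2,  # hold out 20% for validation
                    callbacks=[early_stop],
                    verbose=0)
print("epochs run:", len(history.history["loss"]))
```

The `history.history` dictionary holds the per-epoch training and validation loss and accuracy, which are the values plotted when checking for overfitting.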
