  • Methods: Python
  • Data Analysis: Pandas, NumPy
  • Data Visualization: Matplotlib, Pydot, Graphviz
  • Data Modeling: Keras, TensorFlow, Neural Networks
  • Web Development: HTML, CSS

Identifying Abnormal Rhythms on Electrocardiograms

  • This example trains a neural network classifier to identify abnormal rhythms in the ECG5000 electrocardiogram dataset using Keras and TensorFlow. The code is provided in the Python Jupyter notebook below.
  • The full dataset contains 5,000 electrocardiograms with 140 points each. This example uses a simplified version of the dataset in which each electrocardiogram is labeled either 1 (normal rhythm) or 0 (abnormal rhythm). The neural network is built in Keras/TensorFlow with an input layer, one hidden layer, and one output layer.
  • The neural network reached an accuracy of 98.9% after 100 epochs. The number of epochs can be adjusted to further optimize the model. For comparison, a baseline that always predicts 0 would score 41%, and one that always predicts 1 would score 58%. The model can be used to inspect future electrocardiograms.
  

Results

-The full dataset contains 5,000 electrocardiograms with 140 points each and was published in: Goldberger, A. L., Amaral, L. A. N., Glass, L., Hausdorff, J. M., Ivanov, P. Ch., Mark, R. G., Mietus, J. E., Moody, G. B., Peng, C.-K., & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation, 101(23), e215-e220.

-This example uses the simplified version of the dataset, in which each electrocardiogram is labeled either 1 (normal rhythm) or 0 (abnormal rhythm).

-The full dataset df is split into 80% training and 20% test dataframes. The dataset is split before being normalized so that the normalization statistics can be computed from the training data alone, which keeps information about the test set from leaking into training.
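
The 80/20 split described above could be sketched as follows. This is only an illustration: the random stand-in data, the column name "label", and the random_state value are assumptions, not taken from the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the ECG table: 140 signal columns plus a 0/1 label.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 140)))
df["label"] = rng.integers(0, 2, size=100)

# Randomly pick 80% of the rows for training; the remaining 20% form the test set.
train_df = df.sample(frac=0.8, random_state=0)
test_df = df.drop(train_df.index)

print(train_df.shape, test_df.shape)
```

Dropping the sampled index from df guarantees that no row ends up in both dataframes.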

-The training and test dataframes are normalized with the calculated means and standard deviations. X_train and X_test are then converted to NumPy arrays, which Keras/TensorFlow accepts directly as input.
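
A minimal sketch of the normalization step, assuming the means and standard deviations are calculated from the training features only (the text does not say which rows they come from). The random data below is a hypothetical stand-in for the real X_train and X_test.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the feature dataframes (140 points per ECG).
rng = np.random.default_rng(1)
X_train = pd.DataFrame(rng.normal(5.0, 2.0, size=(80, 140)))
X_test = pd.DataFrame(rng.normal(5.0, 2.0, size=(20, 140)))

# Assumption: per-column statistics come from the training features only.
mean = X_train.mean()
std = X_train.std()

# Apply the same statistics to both sets, then convert to NumPy arrays.
X_train = ((X_train - mean) / std).to_numpy()
X_test = ((X_test - mean) / std).to_numpy()

print(type(X_train), X_train.shape, X_test.shape)
```

Reusing the training statistics on the test set means the test features will not have exactly zero mean and unit variance, which is expected.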

-The neural network contains one hidden layer with ReLU activation functions. A sigmoid activation function is used for the output layer because this is a binary classification problem.

-The neural network is compiled with the following settings. Optimizer: Adam, a variant of stochastic gradient descent (SGD), is selected as the default choice. Loss function: the model's output is binary, so binary_crossentropy is selected. Metrics: this is a classification problem, so accuracy is selected.
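
The architecture and compile settings described above might look roughly like this in Keras. The hidden layer width (32 units) is a hypothetical value, since the original width is not stated here; the 140 inputs follow from the 140 points per electrocardiogram.

```python
import tensorflow as tf

n_features = 140   # points per ECG, as stated above
hidden_units = 32  # hypothetical width; not given in the text

# One ReLU hidden layer followed by a single sigmoid output unit.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_features,)),
    tf.keras.layers.Dense(hidden_units, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Adam optimizer, binary cross-entropy loss, accuracy metric.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```

The single sigmoid unit outputs a value between 0 and 1 that can be read as the predicted probability of label 1 (normal rhythm).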

-20% of the data is held out as a validation dataset. Validation data is useful for detecting overfitting and for regularizing the model through early stopping. The stopping point can be chosen by identifying the epoch at which validation accuracy levels off.

-The training and validation loss curves, as well as the accuracy plots, are inspected for signs of overfitting. If overfitting appears to begin at a specific epoch N, the model can be reinitialized and trained for only N epochs. Limiting the number of epochs so that training stops before overfitting sets in is called early stopping. The test accuracy below can be compared across models to determine which is better.
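
One way to pick the epoch N from a recorded validation curve, instead of reading it off the plot, is sketched below. The function name pick_stopping_epoch, the patience value, and the loss values are all hypothetical. In Keras, the per-epoch values would typically come from the history.history dictionary returned by model.fit.

```python
def pick_stopping_epoch(val_loss, patience=5):
    """Return the 1-based epoch with the best validation loss, ending the
    search once `patience` consecutive epochs bring no improvement."""
    best_epoch, best_loss, waited = 0, float("inf"), 0
    for epoch, loss in enumerate(val_loss, start=1):
        if loss < best_loss:
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch

# Hypothetical validation losses: improving at first, then drifting upward.
val_loss = [0.60, 0.41, 0.30, 0.25, 0.24, 0.26, 0.27, 0.29, 0.31, 0.33]
n_epochs = pick_stopping_epoch(val_loss, patience=3)
print(n_epochs)
```

Keras also provides an EarlyStopping callback that applies the same idea during training, which avoids retraining from scratch.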

-The neural network reached an accuracy of 98.9% after 100 epochs. The number of epochs can be adjusted to further optimize the model. For comparison, a baseline that always predicts 0 would score 41%, and one that always predicts 1 would score 58%. The model can be used to inspect future electrocardiograms.
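
The constant-prediction baselines mentioned above can be computed as sketched here. The label array is a small hypothetical example, not the ECG5000 labels, so the numbers it prints differ from the 41% and 58% reported.

```python
import numpy as np

def constant_baseline_accuracy(y_true, constant):
    """Accuracy obtained by predicting the same label for every example."""
    y_true = np.asarray(y_true)
    return float(np.mean(y_true == constant))

# Hypothetical labels: 1 = normal rhythm, 0 = abnormal rhythm.
y = np.array([1, 1, 0, 1, 0, 1, 1, 0, 1, 0])

acc_always_0 = constant_baseline_accuracy(y, 0)
acc_always_1 = constant_baseline_accuracy(y, 1)
print(acc_always_0, acc_always_1)
```

A trained model is only useful if its accuracy clearly exceeds the better of these two baselines.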


Contact

Please feel free to reach out through the following platforms: