-The full dataset contains 5,000 electrocardiograms with 140 data points each and was published in: Goldberger, A. L., Amaral, L. A. N., Glass, L., Hausdorff, J. M., Ivanov, P. Ch., Mark, R. G., Mietus, J. E., Moody, G. B., Peng, C.-K., & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation, 101(23), e215-e220.
-A simplified version of the dataset is used for this example. In it, each electrocardiogram is labeled either 1 (normal rhythm) or 0 (abnormal rhythm).
-The full dataset df is split into 80% training and 20% test dataframes. The split is done before normalization so that the normalization statistics are not influenced by the test data, which keeps the test set an independent check of the model.
-The training and test dataframes are normalized with the calculated means and standard deviations. The X_train and X_test dataframes are then converted to NumPy arrays, which are a convenient input format for Keras/TensorFlow.
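The split and normalization steps can be sketched as follows. This is a minimal illustration, not the notebook's actual code: it assumes the statistics are computed on the training split only, and the function name `split_and_normalize`, the column name `label`, and the synthetic dataframe are hypothetical stand-ins.

```python
import numpy as np
import pandas as pd

def split_and_normalize(df, label_col, train_frac=0.8, seed=0):
    """Split df into train/test, then z-score the features.

    Assumption: means and standard deviations come from the training split only.
    """
    train = df.sample(frac=train_frac, random_state=seed)
    test = df.drop(train.index)

    feature_cols = [c for c in df.columns if c != label_col]
    mean = train[feature_cols].mean()
    std = train[feature_cols].std().replace(0, 1.0)  # guard against constant columns

    # Apply the same training statistics to both splits, then convert to NumPy.
    X_train = ((train[feature_cols] - mean) / std).to_numpy()
    X_test = ((test[feature_cols] - mean) / std).to_numpy()
    y_train = train[label_col].to_numpy()
    y_test = test[label_col].to_numpy()
    return X_train, X_test, y_train, y_test

# Synthetic stand-in for the ECG dataframe: 100 rows, 140 points, binary label.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(5.0, 2.0, size=(100, 140)))
df["label"] = rng.integers(0, 2, size=100)

X_train, X_test, y_train, y_test = split_and_normalize(df, "label")
```

After this step the training features have zero mean and unit standard deviation per column, while the test features are only approximately standardized because they reuse the training statistics.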
-The neural network contains one hidden layer with ReLU activation functions. A sigmoid activation function is used for the output layer because this is a binary classification problem.
-The neural network training parameters were set as follows. Optimizer: Adam, a variant of stochastic gradient descent (SGD), is selected as the default. Loss function: the model's output is binary, so binary_crossentropy is selected. Metrics: this is a classification problem, so accuracy is selected.
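A model matching this description might be defined and compiled roughly as below. The hidden layer width is not stated in the text, so `hidden_units=32` is an assumption, as is the helper name `build_model`; the input size of 140 follows from the 140 points per electrocardiogram.

```python
from tensorflow import keras

def build_model(n_features=140, hidden_units=32):
    """One ReLU hidden layer and a sigmoid output for binary classification.

    hidden_units is an assumed value; the source does not state the layer width.
    """
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(hidden_units, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model

model = build_model()
```

The single sigmoid unit outputs a value between 0 and 1 that can be read as the predicted probability of label 1.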
-20% of the data will be used as a validation dataset. Validation data is useful for detecting overfitting and for regularization by early stopping. Early stopping can be applied by identifying the epoch at which validation accuracy levels off.
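One way to hold out validation data in Keras is the `validation_split` argument of `fit`, which takes the validation samples from the end of the arrays passed in, before any shuffling. The sketch below uses synthetic data, an assumed hidden width of 32, and only 5 epochs to keep it quick; none of these values come from the original notebook.

```python
import numpy as np
from tensorflow import keras

# Synthetic stand-in for the normalized training arrays.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 140)).astype("float32")
y_train = (X_train[:, 0] > 0).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(140,)),
    keras.layers.Dense(32, activation="relu"),  # assumed width
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Hold out the last 20% of the rows as validation data.
history = model.fit(
    X_train, y_train,
    epochs=5,
    batch_size=32,
    validation_split=0.2,
    verbose=0,
)

# Per-epoch curves used to judge where validation accuracy levels off.
val_accuracy = history.history["val_accuracy"]
val_loss = history.history["val_loss"]
```

The `history.history` dictionary holds one value per epoch for each tracked quantity, which is what the loss and accuracy plots are drawn from.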
-The training and validation loss and accuracy plots are inspected for signs of overfitting. If overfitting is suspected at a specific epoch N, the model can be reinitialized and retrained for N epochs. Stopping training at the number of epochs before overfitting occurs is called early stopping. The test accuracy below can be compared to determine which model is better.
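Choosing N by eye from the plots can be complemented by a simple rule. The sketch below picks the epoch with the lowest validation loss, which is a swapped-in criterion rather than the visual inspection described above; the helper name `pick_stopping_epoch` and the loss values are hypothetical and for illustration only.

```python
import numpy as np

def pick_stopping_epoch(val_loss):
    """Return the 1-based epoch with the lowest validation loss."""
    return int(np.argmin(val_loss)) + 1

# Hypothetical validation-loss curve: it falls, then rises as overfitting sets in.
val_loss = [0.60, 0.41, 0.30, 0.27, 0.29, 0.33]

n_epochs = pick_stopping_epoch(val_loss)
```

A freshly initialized model would then be trained for `n_epochs` epochs and its test accuracy compared with that of the longer run.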
-The neural network achieved an accuracy of 98.9% with 100 epochs. The number of epochs can be adjusted to further optimize the model. For comparison, a baseline model that always predicts 0 would score 41%, and one that always predicts 1 would score 58%. This model can be used to classify future electrocardiograms.
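The constant-prediction baselines can be computed directly from the labels. The following is a small sketch with a made-up label array, not the real dataset; `constant_baseline_accuracy` is a hypothetical helper name.

```python
import numpy as np

def constant_baseline_accuracy(y_true, constant):
    """Accuracy of a model that predicts the same label for every sample."""
    y_true = np.asarray(y_true)
    return float(np.mean(y_true == constant))

# Made-up labels for illustration only (three normal, two abnormal).
y_demo = np.array([1, 1, 1, 0, 0])

acc_all_ones = constant_baseline_accuracy(y_demo, 1)
acc_all_zeros = constant_baseline_accuracy(y_demo, 0)
```

A trained model is only useful if its accuracy clearly exceeds the better of these two baselines.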