Most existing electrocardiogram (ECG) feature extraction methods rely on rule-based approaches, but it is difficult to define all ECG features manually. We propose an unsupervised feature learning method, based on an autoencoder and a variational autoencoder (VAE), that extracts ECG features from unlabeled data.
The autoencoder was trained on over 2,000,000 ECG samples from 26,481 patients, and the VAE was trained on 596,000 ECG samples from 1,278 patients.
Two external datasets, the Shaoxing and MIT-BIH datasets, were used to validate the features in two ways. First, we explored the features without any additional training, through clustering, latent space exploration, and anomaly detection, and confirmed that the features from the models reflected the various types of ECG rhythms. Second, as transfer learning for the classification of 12 types of arrhythmias, we used the ECG features as input data for new tasks and used the models' encoder weights to initialize the weights of different models. The autoencoder was evaluated with transfer learning and clustering. The VAE was evaluated with transfer learning and clustering plus two further methods, anomaly detection and latent space exploration.
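To make the first validation approach concrete, the following is a minimal sketch of extracting latent features and scoring anomalies without any further training. It is not the models used in this work: a linear autoencoder fitted in closed form (equivalent to PCA) stands in for the trained encoder, the beats are synthetic single-peak signals rather than real ECGs, and scoring anomalies by reconstruction error with a 99th-percentile threshold is an assumption, since the scoring rule is not specified here. All function names are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(x, latent_dim):
    """Fit a linear autoencoder in closed form (PCA); a stand-in for a trained neural encoder."""
    mean = x.mean(axis=0)
    # Rows of vt are principal directions; the top latent_dim rows act as encoder weights.
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:latent_dim]

def encode(x, mean, components):
    """Project signals onto the latent space; the result plays the role of the learned features."""
    return (x - mean) @ components.T

def reconstruction_error(x, mean, components):
    """Mean squared error between each signal and its reconstruction."""
    z = encode(x, mean, components)
    recon = z @ components + mean
    return np.mean((x - recon) ** 2, axis=1)

def synthetic_beats(rng, n, center, width, n_samples=200):
    """Toy single-peak 'beats' (hypothetical data, not real ECG)."""
    t = np.linspace(0.0, 1.0, n_samples)
    amplitude = rng.uniform(0.8, 1.2, size=(n, 1))
    peak = np.exp(-((t - center) ** 2) / (2.0 * width ** 2))
    return amplitude * peak + rng.normal(0.0, 0.01, size=(n, n_samples))

rng = np.random.default_rng(0)
train = synthetic_beats(rng, 500, center=0.5, width=0.02)     # unlabeled training data
normal = synthetic_beats(rng, 100, center=0.5, width=0.02)    # held-out, same morphology
abnormal = synthetic_beats(rng, 100, center=0.3, width=0.05)  # shifted, widened peak

mean, components = fit_linear_autoencoder(train, latent_dim=2)
features = encode(normal, mean, components)  # latent features, one row per beat

# Assumed rule: flag a beat when its error exceeds the 99th percentile of training errors.
threshold = np.percentile(reconstruction_error(train, mean, components), 99)
normal_flags = reconstruction_error(normal, mean, components) > threshold
abnormal_flags = reconstruction_error(abnormal, mean, components) > threshold
```

The same `features` array could be passed to a clustering algorithm or used as input to a downstream classifier, which corresponds to the clustering and feature-based transfer learning steps described above.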
In the transfer learning experiments using features from the unsupervised models, arrhythmia classification performance improved when weight initialization was applied. The F1-scores for arrhythmia classification with XGBoost were 0.85 and 0.86 using only autoencoder features and only VAE features, respectively. We confirmed that the features from the models can be clustered in a way that reflects the characteristics of the ECGs. Moreover, in the additional experiments for the VAE, we found that its features indicated ECG abnormality and that its feature space carried clinical meaning for ECGs.
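For readers less familiar with the metric, the sketch below shows how an F1-score for a multi-class task can be computed. The averaging scheme behind the reported values is not stated here, so macro-averaging (the unweighted mean of per-class F1) is an assumption, and the labels are made-up toy values, not results from this study. The function names are hypothetical.

```python
import numpy as np

def per_class_f1(y_true, y_pred, n_classes):
    """F1 per class: 2*TP / (2*TP + FP + FN), defined as 0 when the class never appears."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = np.zeros(n_classes)
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom > 0 else 0.0
    return scores

def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    return per_class_f1(y_true, y_pred, n_classes).mean()

# Toy labels for three rhythm classes (illustrative only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
class_scores = per_class_f1(y_true, y_pred, n_classes=3)
overall = macro_f1(y_true, y_pred, n_classes=3)
```

With imbalanced arrhythmia classes, macro-averaging weights rare rhythms equally with common ones, whereas a support-weighted average would be dominated by the most frequent class.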
We confirmed that unsupervised feature learning can extract the characteristics of various types of ECGs and can serve as an alternative to existing ECG feature extraction methods.