Recognizing the Strongest ECG Activations


Machine learning case in medicine

Understanding the task


The customer is Biosense Webster, part of the Johnson & Johnson family of companies, Israel.


The goal was to build a model that detects cardiac excitation (hereinafter, activation) using neural networks.


Biosense Webster catheters produce heterogeneous information, including electrocardiogram (ECG) signals.

ECG signals help cardiologists observe specific (reference) points that correspond to physiological events, for example, the moment of heart muscle contraction.

Source data

The customer provided a labelled dataset of unipolar and bipolar ECG signals:

9 RAR files, each containing ~400,000 ECG fragments in .csv format

+ 1 secret RAR file with ~400,000 ECG fragments

ECG fragments were collected from different devices at different stages of technology development. Signals came from different patients, both healthy and with pathologies.

The task is to detect the activation (the moment of heart muscle contraction).

Each ECG fragment is 2.5 seconds (2500 milliseconds) long.

The windows of interest and the activation are labelled in each ECG fragment.

Defining the problem in 2 stages

Success criteria for stage 1

Loss function

We framed activation detection in an ECG fragment as a classification task with 2500 classes, one per millisecond of the 2500-millisecond fragment. In the target, 0 means no activation and 1 means activation. The model predicts the probability of activation at each millisecond, and the point with the maximum probability is taken as the activation. The loss function is categorical cross-entropy.
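The per-millisecond classification described above can be illustrated with a minimal sketch in plain Python. It assumes a one-hot target (exactly one labelled activation per fragment) and a softmax output layer; the source states neither explicitly, and all function names here are hypothetical.

```python
import math

FRAGMENT_LEN = 2500  # one class per millisecond, as stated in the text


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probs, true_index, eps=1e-12):
    """Loss for a one-hot target: -log of the probability at the labelled ms.

    Assumption: a single labelled activation per fragment.
    """
    return -math.log(max(probs[true_index], eps))


def predict_activation(probs):
    """Return the millisecond index with the highest predicted probability."""
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical example: the scores peak at 1200 ms and the label is 1200 ms.
logits = [0.0] * FRAGMENT_LEN
logits[1200] = 5.0
probs = softmax(logits)
loss = categorical_cross_entropy(probs, true_index=1200)
predicted_ms = predict_activation(probs)
```

The loss shrinks as the probability mass concentrates on the labelled millisecond, which is what pushes the argmax toward the true activation point.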

How the model was validated

At the first stage of the project, several validation strategies were applied iteratively, depending on the objectives. The strategy presented below makes it possible to evaluate how the size of the training dataset influences model accuracy.


1: File 1
2: File 1, File 2
...
7: File 1, File 2, ..., File 7


A separate dataset was used to test the model during the development process.


Another dataset was used to validate the model after development, to guard against overfitting.
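The growing-training-set strategy above can be sketched as follows. This is an illustration only: `train_fn` and `eval_fn` are hypothetical placeholders for the real training and evaluation steps, which the source does not describe.

```python
def cumulative_training_sets(files, max_runs):
    """Run k trains on the first k files: [f1], [f1, f2], and so on."""
    return [files[:k] for k in range(1, max_runs + 1)]


def learning_curve(files, max_runs, train_fn, eval_fn):
    """Train once per cumulative subset and score each model on fixed data.

    train_fn(subset) -> model and eval_fn(model) -> score are hypothetical
    callables supplied by the caller.
    """
    results = []
    for subset in cumulative_training_sets(files, max_runs):
        model = train_fn(subset)
        results.append((len(subset), eval_fn(model)))
    return results


files = [f"File {i}" for i in range(1, 10)]  # the 9 labelled files
runs = cumulative_training_sets(files, max_runs=7)
```

Because the evaluation data stays fixed across runs, any change in the score can be attributed to the size of the training set.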

Model accuracy and dataset size


The best results were achieved when training the model on 2.8 million ECG fragments
Train size = 2.8 M
Accuracy 1 = 97%
Accuracy 5 =

The acceptance criteria were met when training the model on 0.8 million ECG fragments
Train size = 0.8 M
Accuracy 1 =
Accuracy 5 =

The validation dataset size is constant: 400,000 ECG fragments.
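The source does not define "Accuracy 1" and "Accuracy 5". One plausible reading, assumed here and not confirmed by the source, is the share of fragments whose predicted activation falls within 1 ms or 5 ms of the labelled one. Under that assumption, the metric could be computed like this (the function name and sample values are hypothetical):

```python
def accuracy_within(predicted_ms, labelled_ms, tolerance_ms):
    """Fraction of fragments predicted within tolerance_ms of the label.

    Assumption: "Accuracy N" means a hit within +/- N milliseconds.
    """
    if len(predicted_ms) != len(labelled_ms) or not labelled_ms:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        abs(p - t) <= tolerance_ms for p, t in zip(predicted_ms, labelled_ms)
    )
    return hits / len(labelled_ms)


# Made-up positions in milliseconds, for illustration only.
predicted = [1200, 1503, 800, 2010]
labelled = [1200, 1500, 810, 2011]
acc_1 = accuracy_within(predicted, labelled, tolerance_ms=1)
acc_5 = accuracy_within(predicted, labelled, tolerance_ms=5)
```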

Is the model stable?

We trained the model on datasets 1-2 (where we reached the acceptance criteria) and detected activations in all ECG fragments from datasets 3-9, as well as in the secret dataset.

Accuracy > 95% in all datasets, including the secret one

The model's predictive performance is the same across datasets

Switching to stage 2


The model execution time (predicting the activation point for a single signal) should not exceed 20 ms. The production environment that the code will be integrated with uses the C language.

Average prediction speed

It was calculated over 2000 signals.

Acceptable prediction time: 20 ms
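An average over 2000 signals, compared against the 20 ms budget, could be measured with a harness along these lines. This is a sketch in Python for illustration; the production code is in C, and `argmax_stub` is a hypothetical stand-in for the real model call.

```python
import time

BUDGET_MS = 20.0   # acceptable time per signal, from the text
NUM_SIGNALS = 2000  # number of signals averaged over, from the text


def average_latency_ms(predict_fn, signals):
    """Average wall-clock time per call of predict_fn, in milliseconds."""
    if not signals:
        raise ValueError("need at least one signal")
    start = time.perf_counter()
    for signal in signals:
        predict_fn(signal)
    elapsed_s = time.perf_counter() - start
    return elapsed_s * 1000.0 / len(signals)


def argmax_stub(signal):
    """Hypothetical stand-in for the model: index of the largest sample."""
    return signal.index(max(signal))


signals = [[0.0] * 2500 for _ in range(NUM_SIGNALS)]
avg_ms = average_latency_ms(argmax_stub, signals)
within_budget = avg_ms <= BUDGET_MS
```

Averaging over many signals smooths out timer resolution and scheduling noise that would dominate a single measurement at the millisecond scale.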

Project results

400,000 ECG fragments are enough for the deep learning model to detect an activation with an accuracy of 95%.

The deep learning model implemented in C produces a prediction in 7 ms on the CPU and in 2 ms on the GPU.

Deep learning solved the problem in less than 3 months, which is more than 5 times faster than developing an algorithm based on business heuristics and filters.