Churn Prediction for Telco Provider

Client: Large EU-based telco provider

Industry: Telco

Project Duration: 2 months

Goal: Improve the accuracy of an existing model that predicts which companies will churn, i.e. stop using the telco provider’s services

Tech: R

The Challenge

A telco provider approached SmartCat to improve an existing churn model that the telco’s internal team had developed. The task was to detect companies (group contracts) that are likely to stop using the provider’s services. The overall monthly churn rate is very low (less than 2%), with no obvious or easy-to-detect pattern. Because of this, the client’s internal model had modest results, and our goal was to increase its accuracy by 5-10%, a target the client believed was achievable.

The Approach

Our approach to this project consisted of three phases:

  • Phase 1: Data cleaning and validation. Exploratory data analysis.
  • Phase 2: Feature extraction.
  • Phase 3: Implementation and evaluation of predictive models for churn one and two months in advance.
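
To make "one and two months in advance" concrete, here is a minimal sketch of how training labels for a given prediction horizon could be derived from a company's monthly activity history. It is an illustration only: the project itself was implemented in R, and the function name churn_labels, the input format, and the rule "churn month = first inactive month" are assumptions, not details from the project.

```python
def churn_labels(active, horizon):
    """Build per-month labels for one company (hypothetical helper).

    active  -- list of booleans, True if the company was a customer that month
    horizon -- how many months ahead the model should predict (e.g. 1 or 2)

    Assumption: the churn month is the first month in which the company
    is no longer active. The label at month t is 1 if churn happens
    exactly `horizon` months later, 0 if not, and None if the outcome
    cannot be used (already churned, or not yet observed).
    """
    churn_month = next((i for i, a in enumerate(active) if not a), None)
    labels = []
    for t in range(len(active)):
        if churn_month is not None and t >= churn_month:
            labels.append(None)  # company has already churned
        elif t + horizon >= len(active):
            labels.append(None)  # target month lies beyond the observed data
        else:
            labels.append(1 if churn_month == t + horizon else 0)
    return labels


# Synthetic history: active for four months, then churned.
history = [True, True, True, True, False, False]
one_month_ahead = churn_labels(history, horizon=1)
two_months_ahead = churn_labels(history, horizon=2)
```

With this labeling, each horizon gets its own target column, so a separate model can be trained and evaluated per horizon.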

The Solution

During the first phase, we used historical data to analyze typical patterns, trends and potential seasonality, and to detect anomalies. The statistics and visualizations for this analysis were implemented in R. We also validated and cleaned the data, since some related columns contained inconsistent values. These steps were carried out in continuous communication with a dedicated contact on the client side. Before the modeling phase, we extracted many features that served as input for training machine learning models. Because the labels in the dataset are highly imbalanced, model performance was measured using precision and recall for churners and compared with the client’s model.
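
The choice of precision and recall over plain accuracy can be illustrated with a small sketch. It is written in Python for illustration only (the project used R), and all numbers are synthetic: 1,000 companies with a 1.5% churn rate, chosen to match the "less than 2%" figure, not taken from the client's data.

```python
def accuracy(y_true, y_pred):
    """Share of all predictions that are correct."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the positive class (here: churners)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Synthetic data: 15 churners (label 1) among 1,000 companies.
y_true = [1] * 15 + [0] * 985

# A useless model that never predicts churn still looks accurate.
never_churn = [0] * 1000

# A model that flags 30 companies, 10 of them real churners.
flagging = [1] * 10 + [0] * 5 + [1] * 20 + [0] * 965

acc_never = accuracy(y_true, never_churn)
pr_never = precision_recall(y_true, never_churn)
acc_flag = accuracy(y_true, flagging)
pr_flag = precision_recall(y_true, flagging)
```

In this toy setup the model that never predicts churn reaches 98.5% accuracy while finding no churners at all, whereas the flagging model has slightly lower accuracy (97.5%) but catches two thirds of the churners. This is why accuracy alone is a poor guide when churners are so rare.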

The Results

A predictive algorithm was trained on historical data and optimized toward the agreed prediction accuracy goal. Many of the features we designed from the provided data significantly improved the final accuracy. Compared to the client’s baseline model, at the same recall values our final model achieved 5-10% higher precision, which satisfied the client’s benchmark.
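
Comparing two models "for the same recall values" can be sketched as follows: rank companies by predicted churn score, lower the decision threshold until the target recall is reached, and read off precision at that point. The sketch is in Python for illustration only (the project used R). The function name precision_at_recall and the toy scores are assumptions and do not reproduce the project's data or results.

```python
def precision_at_recall(y_true, scores, target_recall):
    """Precision at the highest threshold that reaches `target_recall`.

    y_true -- list of 0/1 labels (1 = churner)
    scores -- predicted churn scores, higher means more likely to churn
    Ties between scores are not treated specially in this sketch.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("no positive examples in y_true")
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    tp = 0
    for k, (_, label) in enumerate(ranked, start=1):
        tp += label
        if tp / total_pos >= target_recall:
            return tp / k  # precision among the top-k flagged companies
    return tp / len(ranked)


# Toy example: 3 churners among 10 companies, two hypothetical models.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
baseline_scores = [0.9, 0.4, 0.3, 0.8, 0.7, 0.5, 0.2, 0.15, 0.1, 0.05]
new_scores = [0.9, 0.6, 0.3, 0.8, 0.4, 0.5, 0.2, 0.15, 0.1, 0.05]

baseline_precision = precision_at_recall(y_true, baseline_scores, 0.6)
new_precision = precision_at_recall(y_true, new_scores, 0.6)
```

Fixing recall first keeps the comparison fair: both models must find the same share of churners, so the difference in precision reflects how many non-churning companies each one flags unnecessarily.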