Churn Analysis for a Scientific Publisher

Churn Analysis Result in a table

The Problem

“How can we identify accounts at risk of cancelling their subscription?” That was the question a large scientific publishing company asked me. I was given 10,000 records of data across 3 tables and left to figure it out. The deliverable: a deck of 8 slides with analysis, recommendations, and my methodology.

However…

I knew I would get lost if I tried to gather the insights using my usual methods. And the turnaround time was short: 3 days, with only my evenings to spare for the project.

I needed a new approach.

So, I decided to use a new tool in my data toolkit.

Enter Market Basket Analysis

It was time to deploy machine learning and predictive analytics. I wanted to find the group of customers who were most likely to lapse in their subscriptions.

After overcoming encoding and data wrangling challenges, I conducted a Market Basket Analysis on the data, using the mlxtend library in Python.

(If you want more information on the business case, how a Market Basket Analysis works, and the challenges it involves, I wrote a data story about it: How I Used a Market Basket Analysis to Get a Job Offer.)

The analysis looked at over a dozen variables for each account, including its country, its size, its NPS score, and more.
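
To make the mechanics concrete, here is a minimal sketch of the metrics a Market Basket Analysis reports for a rule such as "joined in 2012, small account, low NPS, therefore churned". It is not the mlxtend code from the project: it computes support, confidence, and lift by hand in plain Python. The eight toy accounts, the column names, and the `rule_metrics` helper are all hypothetical and exist only for illustration.

```python
# Hypothetical toy accounts; column names and values are illustrative only,
# not the publisher's data.
accounts = [
    {"join_year": "2012", "size": "small", "nps": "low",  "churned": True},
    {"join_year": "2012", "size": "small", "nps": "low",  "churned": True},
    {"join_year": "2012", "size": "small", "nps": "high", "churned": False},
    {"join_year": "2012", "size": "large", "nps": "low",  "churned": False},
    {"join_year": "2015", "size": "small", "nps": "low",  "churned": False},
    {"join_year": "2015", "size": "large", "nps": "high", "churned": False},
    {"join_year": "2018", "size": "small", "nps": "high", "churned": True},
    {"join_year": "2018", "size": "large", "nps": "high", "churned": False},
]


def to_items(account):
    """Encode one account as a set of 'column=value' items (the 'basket')."""
    return frozenset(f"{k}={v}" for k, v in account.items() if k != "churned")


def rule_metrics(accounts, antecedent):
    """Support, confidence, and lift for the rule: antecedent -> churned."""
    n = len(accounts)
    # Accounts whose basket contains every item in the antecedent.
    matches = [a for a in accounts if antecedent <= to_items(a)]
    churn_rate = sum(a["churned"] for a in accounts) / n
    if not matches or churn_rate == 0:
        return None  # rule cannot be evaluated on this data
    churned_matches = sum(a["churned"] for a in matches)
    confidence = churned_matches / len(matches)
    return {
        "support": churned_matches / n,     # share of all accounts that match AND churned
        "confidence": confidence,           # churn rate inside the segment
        "lift": confidence / churn_rate,    # segment churn rate vs. overall churn rate
    }


segment = frozenset({"join_year=2012", "size=small", "nps=low"})
metrics = rule_metrics(accounts, segment)
print(metrics)
```

A lift above 1 means the segment churns more often than accounts overall. mlxtend's `association_rules` reports the same three metrics for each rule it generates, so a library run saves you from enumerating the combinations by hand.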

It gave me a result.

The Result

I found that customers who joined in 2012 (the earliest year in the dataset), held small accounts, and had a missing or low NPS score were 3.12x more likely to churn. These customers represented about 5.3% of the publisher's accounts, with an estimated revenue of $1.7 million.

I created a presentation with a data story that built the case for my recommendations. If the company retained 50% of the subscribers who would otherwise lapse, it would save over $800,000.
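
One way to read that savings figure is sketched below. It assumes the estimate is simply half of the at-risk segment's revenue, and it treats the $1.7 million as a rounded number; the variable names are illustrative.

```python
# Figures taken from the text; the derivation itself is an assumption.
segment_revenue = 1_700_000  # estimated revenue of the at-risk segment (rounded)
save_rate = 0.50             # share of lapsing subscribers retained

retained_revenue = segment_revenue * save_rate
print(f"${retained_revenue:,.0f}")  # $850,000
```

Under that assumption the result is about $850,000, which is consistent with "over $800,000".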