Part 1 and part 2 of my first predictive analytics project can be found here. Read them? Cool. Let’s get into the origins of a market basket analysis, and learn a few terms…
Market Basket Analysis originates from the supermarket industry.
Some questions grocery stores may ask are:
- If the customer buys milk and potatoes, what else are they buying?
- What do customers who spend over £200 and buy once a month buy?
- What are customers who buy once and never come back buying?
With razor-thin margins and a Just In Time supply chain, these can be crucial questions. It can be the difference between having too much stock and not enough. Neither is good.
If I start writing about statistics:
- I’ll lose 3/4 of you because that’s just how writing works.
- These posts will quadruple in length.
- The risk of me saying something stupid goes up. I can drive the car. I can’t build the engine. There are now fancy libraries and scripts out there to do that. I know the fundamentals of the theory. I know its limitations. But deriving it from first principles? Not happening in this life or the next.
But here is one mathematical term that’s important to know: Lift.
A Market Basket Analysis is also called Association Rule Mining (or Association Rule Learning). It looks at a bunch of things, or “variables”, and sees how much they “associate” with each other.
A set of variables that associate together is called a “rule”.
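At the end, it will tell you the “lift”, which is how much more often the things in that rule turn up together than you’d expect if they had nothing to do with each other. A lift above 1 means they show up together more often than chance would suggest. A lift below 1 means less often.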
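To make lift concrete, here's a minimal sketch in Python. It assumes the standard textbook definition: lift of "A then B" is the share of baskets containing both A and B, divided by the share containing A times the share containing B. The baskets are invented for illustration, and the function names `support` and `lift` are my own, not taken from any particular library.

```python
# Toy baskets, made up purely for illustration.
baskets = [
    {"milk", "potatoes", "bread"},
    {"milk", "potatoes"},
    {"milk", "bread"},
    {"potatoes", "eggs"},
    {"bread", "eggs"},
]


def support(items, baskets):
    """Share of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def lift(antecedent, consequent, baskets):
    """How much more often both sides appear together than if they were independent."""
    together = support(set(antecedent) | set(consequent), baskets)
    expected = support(antecedent, baskets) * support(consequent, baskets)
    return together / expected


rule_lift = lift({"milk"}, {"potatoes"}, baskets)
print(f"lift(milk -> potatoes) = {rule_lift:.2f}")
```

Here milk and potatoes each appear in 3 of 5 baskets and together in 2 of 5, so the lift is 0.4 / (0.6 × 0.6), a little above 1. In this toy data they turn up together slightly more often than chance alone would predict.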
It’s a form of machine learning, because, I kid you not, what your computer does is it just brute-force calculates each individual rule in the dataset. This can be thousands upon thousands of rules…
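To get a feel for how quickly "thousands upon thousands" arrives, here's a rough sketch that enumerates every possible rule for a handful of items, then counts them for larger catalogues. It only illustrates the size of the search space. It isn't how any particular library works, and practical algorithms such as Apriori typically discard rare item combinations early rather than scoring everything. The item names and the helpers `candidate_rules` and `count_rules` are hypothetical.

```python
from itertools import combinations


def candidate_rules(items):
    """Yield every (antecedent, consequent) pair of non-empty, disjoint item groups."""
    items = sorted(items)
    for size in range(2, len(items) + 1):
        for itemset in combinations(items, size):
            # Split each itemset into a non-empty "if" side and "then" side.
            for k in range(1, size):
                for antecedent in combinations(itemset, k):
                    consequent = tuple(i for i in itemset if i not in antecedent)
                    yield antecedent, consequent


def count_rules(n_items):
    """Closed-form count of possible rules over n distinct items."""
    return 3 ** n_items - 2 ** (n_items + 1) + 1


small = list(candidate_rules(["milk", "potatoes", "bread"]))
print(len(small), count_rules(3))  # both give 12

for n in (5, 10, 20):
    print(n, "items ->", count_rules(n), "possible rules")
```

Three items give 12 possible rules and ten items give 57,002. The count roughly triples with every extra item, which is why the choice of what goes into the analysis matters so much for run time.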
This, I believe, introduces the first problem you can encounter with a Market Basket Analysis: Choosing your data carefully.
The first time I ran my Market Basket Analysis it took two and a half hours to run on my MacBook Pro, only to give me a bunch of outputs I couldn’t use.
The next time I ran it? 15 minutes, and a clear series of actionable insights.
You can see my breakdown of how a Market Basket Analysis works in my write-up here.
I warn you, the entire post takes 29 minutes to read. Just read that little excerpt.
Next time I’ll discuss the big challenge with a Market Basket Analysis—ensuring your data is encoded properly.
Stay tuned.
This was first published on my LinkedIn.
You can read part 4 of this series here.