I bet you know this feeling: an item you need is on sale, so you gleefully add it to your cart and start thinking, “What else can I buy with all this money I just saved?” A few clicks (or turns around the store) later, and you’ve got a lot more in your cart than you came for.
This is such a common phenomenon that some retailers openly exploit it. Amazon has those “add-on” items that are cheaper (or only available) if you add them to an order of a certain size. I’m pretty sure I’ve heard Target ads that crack jokes about the experience of coming to the store for something essential and leaving with a cartful of things you didn’t need but wanted once you saw what a great deal you were getting.
For a project in my data science bootcamp, I was asked to form and test some hypotheses using a database containing product and sales data from a fictional-but-realistic dealer in fine food products. It’s the Northwind database, which Microsoft created as a sample for learning how to use some of their database products. The data isn’t real, but it is realistic, so most of the time it behaves the way you would expect real sales data to behave. It’s also exceptionally clean, which is unusual in data science.
In this post I’ll walk you through a hypothesis test using Welch’s t-test to determine whether customers spend more once they have been offered a discount (spoiler alert: they do!), and if so, how much more they spend.
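If you haven’t met Welch’s t-test before, here is a minimal sketch of the mechanics in Python, assuming SciPy and NumPy are installed. The order totals below are synthetic numbers generated just for illustration (they are not Northwind data), and the variable names are my own placeholders rather than anything from the project.

```python
import numpy as np
from scipy import stats

# Synthetic, made-up order totals purely for illustration (NOT Northwind data).
rng = np.random.default_rng(42)
no_discount = rng.normal(loc=500, scale=150, size=200)  # hypothetical orders without a discount
discount = rng.normal(loc=600, scale=250, size=120)     # hypothetical orders with a discount

# equal_var=False makes ttest_ind run Welch's t-test, which does not
# assume the two groups share the same variance or the same size.
t_stat, p_two_sided = stats.ttest_ind(discount, no_discount, equal_var=False)

# The question "do discounted orders spend MORE?" is one-sided, so convert
# the two-sided p-value according to the sign of the t statistic.
p_one_sided = p_two_sided / 2 if t_stat > 0 else 1 - p_two_sided / 2

# The raw difference in sample means answers "how much more?"
mean_diff = discount.mean() - no_discount.mean()

print(f"t = {t_stat:.3f}, one-sided p = {p_one_sided:.4f}")
print(f"difference in mean order total = {mean_diff:.2f}")
```

Welch’s version is a reasonable default when the two groups differ in size or spread, which is often the case when comparing discounted and full-price orders.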