Cohort analyses for digital businesses: an overview

April 27th, 2012 by Yali

The cohort analysis blog post series

Cohort analysis is a conceptually simple but very powerful analysis approach. Many people in the startup community extol the importance of cohort analysis, most recently Eric Ries in his excellent book The Lean Startup. The reason is simple: cohort analysis provides a straightforward view on the progress made against a specific business goal, be it around product development (product-market fit in lean startup parlance), or equally around improvements in marketing or CRM campaigns.

But cohort analysis is a technique that can be usefully applied in many cases other than startups. Indeed, the first uses of cohort analysis were in longitudinal studies performed in science and the social sciences; for more information on this, see Wikipedia.

For companies working in online (be they startup or mature businesses), performing cohort analyses on their web analytics data is a common requirement. To give some examples of the kinds of business questions which a cohort analysis on web analytics data can answer:

  1. To what extent have changes to the checkout funnel driven improved conversion rates?
  2. Does the introduction of feature X increase the likelihood that new customers will sign up for the service?
  3. Are customers acquired via email marketing more likely to repeat purchase or be upsold, compared to those acquired e.g. via AdWords marketing?

Unfortunately, performing cohort analyses on web analytics data can be tricky – especially for those analysts who rely on Google Analytics. In this series of blog posts, we outline:

  1. What is cohort analysis?
  2. Why is cohort analysis valuable?
  3. The methodology for performing cohort analysis
  4. How to perform cohort analyses using SnowPlow
  5. How to “fake it” (get as close as you can) using Google Analytics

What is cohort analysis?

A cohort analysis is simply an analysis that compares the behaviour of two or more groups of customers, where each group of customers is called a “cohort” and shares particular characteristics.

How we define the different cohorts to compare, and what we compare about their behaviour, will vary depending on the type of business question we want to answer. In the case of a lean startup, cohort analysis is used to determine whether a new version of the product is “better” than the old version. In this case, we compare the cohort of users who were introduced to the old version of the product with those who were introduced to the new version of the product. The specific behaviour to compare might be: what percentage of each cohort goes on to subscribe to the product? (There may be a number of different stages in a customer journey funnel, in which case the percentage of people who reach each stage can be compared between the two cohorts.) The cohort analysis can then be used to check if the new version of the product really is superior, by demonstrating that the percentage of newly introduced users who go on to subscribe is significantly higher for the second group than the first.
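To make the old-version versus new-version comparison concrete, here is a minimal sketch in Python. The user counts are invented for illustration, and the two-proportion z-test is just one common way to judge whether a difference in subscription rates is "significantly higher"; it is our choice for this example, not something prescribed by the lean startup method.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a is the baseline cohort, conv_b / n_b the comparison cohort.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both cohorts convert equally
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical figures: users introduced to each version, and how many subscribed
old_introduced, old_subscribed = 2000, 100   # 5% subscribe on the old version
new_introduced, new_subscribed = 2000, 140   # 7% subscribe on the new version

z, p_value = two_proportion_z_test(
    old_subscribed, old_introduced, new_subscribed, new_introduced
)
print(f"old: {old_subscribed / old_introduced:.1%}, "
      f"new: {new_subscribed / new_introduced:.1%}, "
      f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference between the two cohorts is unlikely to be down to chance alone, provided the cohorts are otherwise comparable.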

Let’s look at a totally different example. For a number of services, we expect the way that users engage with the service to change over time, as they come to understand the service better. This is especially true for services like Twitter and eBay which require a bit of upfront investment and “getting used to” before users can get the most out of them. The managers of businesses like eBay and Twitter will want to analyse to what extent they are getting better at encouraging users to engage more effectively, earlier in their individual customer journeys, over time. To perform this analysis, we would define each cohort by the date at which they initially sign up to the service (e.g. January 2011, February 2011 etc…) and then compare the level of engagement for each cohort after one month of use, then two months of use, then three months of use and so on.
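The sign-up month approach can be sketched in a few lines of Python. Everything here is an assumption for illustration: the user IDs, the dates, and the choice of "was active at least once in the month" as the engagement measure. We also count months by calendar month rather than by 30-day windows, which is a simplification.

```python
from collections import defaultdict
from datetime import date

# Hypothetical data: when each user signed up, and the dates they were active
signups = {
    "u1": date(2011, 1, 5),
    "u2": date(2011, 1, 20),
    "u3": date(2011, 2, 3),
}
activity = [
    ("u1", date(2011, 1, 6)),
    ("u1", date(2011, 2, 10)),
    ("u2", date(2011, 1, 21)),
    ("u3", date(2011, 2, 4)),
    ("u3", date(2011, 3, 1)),
]

def months_between(start, end):
    """Number of calendar months from start to end (0 = same month)."""
    return (end.year - start.year) * 12 + (end.month - start.month)

def cohort_engagement(signups, activity):
    """Share of each sign-up cohort active N months after signing up.

    Returns a dict keyed by ((signup_year, signup_month), months_since_signup).
    """
    cohort_sizes = defaultdict(int)
    for signed_up in signups.values():
        cohort_sizes[(signed_up.year, signed_up.month)] += 1

    active_users = defaultdict(set)
    for user, when in activity:
        signed_up = signups[user]
        cohort = (signed_up.year, signed_up.month)
        active_users[(cohort, months_between(signed_up, when))].add(user)

    return {
        key: len(users) / cohort_sizes[key[0]]
        for key, users in active_users.items()
    }

table = cohort_engagement(signups, activity)
for (cohort, offset), share in sorted(table.items()):
    print(f"cohort {cohort[0]}-{cohort[1]:02d}, month {offset}: {share:.0%} active")
```

Reading across the rows for one cohort shows how its engagement develops with experience; reading down a column compares cohorts at the same point in their customer journey.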

The specific example of Twitter is described in this excellent (if rather old) blog post by Joshua Porter. The data from the Twitter analysis is reproduced below:

The important thing to note is that splitting users into different cohorts by when they signed up is the only reliable way to assess if Twitter is getting better at encouraging users to engage over time. Just looking at the overall engagement metrics for the complete userbase by month will reflect changes in the composition of the userbase as well as the ways in which site improvements are driving improved engagement. To illustrate this point: if in March the number of Twitter users grew enormously (i.e. many new users were added), then even if the site improved significantly that month, it would still be possible that average engagement measures would be dragged downwards by the new users to the service, who will only start using the service effectively in their second or third month.
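A small worked example shows how a change in the mix of users can hide a real improvement. All of the numbers below are invented: we assume engagement is measured as average posts per user per month, and we split users crudely into "established" and "new".

```python
def overall_average(segments):
    """Weighted average engagement across (user_count, avg_posts) segments."""
    total_posts = sum(users * avg for users, avg in segments)
    total_users = sum(users for users, _ in segments)
    return total_posts / total_users

# Hypothetical figures: (number of users, average posts per user that month)
february = {"established": (1000, 10.0), "new": (200, 2.0)}
# In March the site improves (both groups post more), but many new users join
march = {"established": (1200, 12.0), "new": (4000, 3.0)}

feb_overall = overall_average(february.values())
mar_overall = overall_average(march.values())

print(f"February overall: {feb_overall:.2f} posts per user")
print(f"March overall:    {mar_overall:.2f} posts per user")
for group in ("established", "new"):
    print(f"{group}: {february[group][1]} -> {march[group][1]} posts per user")
```

Both groups are more engaged in March than in February, yet the overall average falls, because new users (who post less) now make up most of the userbase. Comparing like with like, cohort by cohort, avoids this trap.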

Joshua Porter’s Twitter case study is often front of mind when digital folk talk about cohort analyses. However, there are many other ways in which we can define cohorts to answer diverse and challenging business questions. To return to an earlier example, an online retailer might want to understand how their customer lifetime value differs between customers who have been acquired through different marketing channels. If, looking over the complete time period that a customer is “active” (i.e. over their entire customer lifetime), it turns out that the average profit earned from a customer acquired on channel A is higher than from one acquired on channel B, then it might make sense to spend more to acquire new customers on channel A than on channel B. In this case, each of our cohorts would be defined by the original acquiring channel, and we would then average customer lifetime value within each cohort.
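As a rough sketch of the channel comparison, the Python below averages lifetime profit per customer within each acquisition channel. The customers, channels and profit figures are made up, and we assume the profit on every order can be attributed to a single customer whose acquiring channel is known.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: the channel each customer was originally acquired through
acquisition_channel = {
    "c1": "email",
    "c2": "email",
    "c3": "adwords",
    "c4": "adwords",
}
# Hypothetical profit (in pounds) on each order over the customers' lifetimes
orders = [
    ("c1", 20.0), ("c1", 10.0),
    ("c2", 20.0),
    ("c3", 15.0),
    ("c4", 10.0), ("c4", 5.0),
]

def average_lifetime_value(acquisition_channel, orders):
    """Average lifetime profit per customer, grouped by acquiring channel."""
    # Start every customer at zero so those who never order still count
    lifetime_value = {customer: 0.0 for customer in acquisition_channel}
    for customer, profit in orders:
        lifetime_value[customer] += profit

    by_channel = defaultdict(list)
    for customer, value in lifetime_value.items():
        by_channel[acquisition_channel[customer]].append(value)

    return {channel: mean(values) for channel, values in by_channel.items()}

clv = average_lifetime_value(acquisition_channel, orders)
for channel, value in clv.items():
    print(f"{channel}: average lifetime value £{value:.2f}")
```

Note that customers are counted even if they never place an order; leaving them out would flatter channels that acquire many customers who never buy.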

What makes cohort analyses so good?

Cohort analyses have two particular strengths:

  1. They typically deal with real customers (who make up each cohort), rather than the “visits” which web analysts typically think in terms of
  2. By comparing one cohort of real customers against another cohort of real customers, it is easy to get an understanding of the size and potential impact of any difference observed

Talking in terms of customers rather than visits is preferable, because customers are the real entities that businesses are built around, whereas “visits” are artificial constructions based on a technical notion of a web browsing session. Interpreting data related to customers is much easier in a cohort analysis, because we can easily imagine the impact in terms of the behaviour of real people.

To understand how impactful it is to make the difference in results between cohorts easily comprehensible, consider the case of the retailer looking at different average customer lifetime values between two different cohorts. If customers acquired from marketing channel A are worth on average £25 and those on channel B are only worth £15, it not only means the retailer should be happy to spend significantly more acquiring users on channel A, but also that the retailer should try to understand why those users are so much more valuable (what is the underlying driver of that value) and look to leverage it elsewhere. It is easy to see how a tool which can easily surface these sorts of insights can be so transformative in an industry such as e-commerce.

Methodology for developing cohort analyses

There are four steps in the Keplar methodology for developing cohort analyses:

  1. Start by defining your business question
  2. Work out what is the most appropriate metric (or set of metrics) to measure, given your business question
  3. Define your cohorts, given your business question
  4. Perform the analysis

Start with your business question

People have a bad habit of talking about different types of analyses and analytics tools in the abstract, without tying them to a specific business question. Since the cohort analysis for Twitter was published a couple of years ago, it has become popular to talk about the need to perform “cohort analysis”, as if it were obvious what analysis this implies. As we hope we have made clear above, the term cohort analysis can encompass many different types of analysis, only one of which is exemplified by the Twitter case.

Analyses only ever make sense in the context of a business question. So before a cohort analysis is even prescribed, a clear business question needs to be formulated. To recap the three examples given:

  1. Is this version of the digital product better than the last version? How much better is it?
  2. Are we getting better at getting our users to engage deeply with our service, over time?
  3. Do the customers acquired from one marketing channel differ in behaviour from those acquired from other marketing channels? Is there a difference in their value?

Cohort analysis can be used to answer a great many different types of business question, because the range of cohorts we can define and the effects we can compare between different cohorts is so enormous.

Work out what is the most appropriate metric (or set of metrics), given the business question

Each of the three business questions above asks us to assess something. In the case of the first question, we are asked to assess the effectiveness of a digital product. In the second, we are asked to assess user engagement levels. And in the third, we are asked to assess customer behaviour in general, and value specifically.

Choosing an appropriate set of metrics to reflect the effect you wish to assess is critical to making the cohort analysis meaningful. It is not always straightforward. To take the second (Twitter) example: how do we measure engagement on Twitter? The number of times a user logs in per month? The number of days per month they are logged in? The amount of time they are logged in? Or perhaps the number of times they post?

We will write a complete blog post on metrics for measuring engagement in the future. Suffice to say for this blog post, there is no single answer that is correct for all cases. For Twitter, what was important, for the sake of the cohort analysis, was that the pattern of engagement (however that was measured) was known to change for each user over time, as each user became familiar with the service, and so started to use it more effectively. As soon as the data scientists at Twitter understood that, they realised that they had to map engagement levels after one month, two months, three months and so on, for each cohort, and then see how that overall curve changed over time. Any analysis that looked at just the first month, say, might miss important step changes in behaviour later on in that user’s lifetime.

Define your cohorts

Once you know what you are measuring, you need to pick two or more different cohorts to compare.

In the initial example, where we want to measure the difference between two versions of a digital product, it makes sense to compare the behaviour of people introduced to each version. Because we want the results to reflect differences in the product each group is using, rather than differences in the groups themselves, the two cohorts being compared should be as similar as possible – for example they should:

  • Know as much or as little about the product as each other
  • Have had just as much opportunity to learn about the product as each other
  • Be just as technically literate as each other
  • Be acquired in just the same way as each other

Ensuring that both groups are as similar as possible is not easy: it is one reason why split testing is a good approach – because customers are acquired through the same channel, and randomly assigned to try one of the two versions of the product.
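One simple way to implement the assignment step of a split test is sketched below. This is an assumption on our part rather than a prescribed method: instead of drawing a random number, it hashes the user ID, which behaves like a random split but always gives the same user the same version. The experiment name and user IDs are hypothetical.

```python
import hashlib

def assign_variant(user_id, experiment="new-product-version"):
    """Deterministically assign a user to the "old" or "new" product version.

    Hashing the experiment name together with the user ID gives a split that
    is effectively random across users, but stable for any individual user.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return "new" if int(digest, 16) % 2 == 0 else "old"

# Hypothetical user IDs, all acquired through the same channel
users = [f"user-{i}" for i in range(1000)]
cohorts = {"old": [], "new": []}
for user in users:
    cohorts[assign_variant(user)].append(user)

print({variant: len(members) for variant, members in cohorts.items()})
```

Because the assignment does not depend on anything about the user other than their ID, the two cohorts should be similar in every respect except the version of the product they see.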

In the Twitter example, the idea was to measure the progress that the Twitter team had made in encouraging users to engage more deeply with the product over time. It therefore made sense to define the different cohorts in terms of the point in time at which they signed up and started using the service. The analysis compared cohorts by month, but this is not necessarily the best unit to use. If the product was being updated weekly, it might make more sense to track progress weekly, in other words to define cohorts by week. Of course, to ensure that the results are robust, it is important that there are enough users in each cohort for any differences to be statistically significant, which could lead to choosing a wider time period for each cohort. (For a service as popular as Twitter’s, that is probably not a problem, however.)
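The trade-off between weekly and monthly cohorts can be explored with a short Python sketch. The sign-up dates are invented, and the minimum cohort size is only a crude, hypothetical stand-in for a proper statistical power calculation.

```python
from collections import Counter
from datetime import date

def week_key(d):
    """Cohort key by ISO year and ISO week number."""
    iso = d.isocalendar()
    return (iso[0], iso[1])

def month_key(d):
    """Cohort key by calendar year and month."""
    return (d.year, d.month)

def cohort_sizes(signup_dates, key):
    """Count how many users fall into each cohort under the given key."""
    return Counter(key(d) for d in signup_dates)

def choose_granularity(signup_dates, min_size):
    """Prefer weekly cohorts, but fall back to monthly if any week is too small.

    A rough heuristic only: it does not check that monthly cohorts are
    themselves large enough.
    """
    weekly = cohort_sizes(signup_dates, week_key)
    return "week" if min(weekly.values()) >= min_size else "month"

# Hypothetical sign-up dates spread over three weeks in January 2011
signup_dates = [
    date(2011, 1, 3), date(2011, 1, 4), date(2011, 1, 6),
    date(2011, 1, 10), date(2011, 1, 12),
    date(2011, 1, 17), date(2011, 1, 18), date(2011, 1, 21),
]

print(dict(cohort_sizes(signup_dates, week_key)))
print(dict(cohort_sizes(signup_dates, month_key)))
print(choose_granularity(signup_dates, min_size=3))
```

In practice the threshold would depend on the size of the effect you hope to detect and on how noisy your chosen metric is.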

Performing the analyses

As we have hopefully made clear in this post, the methodology behind using cohort analyses is pretty straightforward. What is more difficult, particularly when it comes to looking at web analytics data, is the mechanics of actually performing the analysis. In the next blog post in this series, we look at how to perform cohort analysis using our own SnowPlow tool, which has been specifically designed to enable cohort analyses.
