How we use data: To learn from a metric, break it apart
February 9, 2012
This is the second in a series of posts from Typekit’s resident data analyst, Mike Sall. Read the first.
One of the trickiest questions we can ask of data is how to compare a bunch of smaller items all at once. It’s also one of the most common questions, whether we want to analyze online advertisements, or understand different products, or compare customer segments by region. And to find an answer, we usually end up doing the same thing every time: we’ll take that list of items and rank them by some important value. For example, which online advertisements have the highest click rate? Or which products are purchased the most often? Or what are the top regions by revenue?
It’s an easy way to quickly see what’s important, which is why we do it. But rankings also obscure all the details that can help us figure out what to actually do about those important items. If we know which products are purchased most often, how can we tell whether it’s due to pricing, or features, or marketing? What we really need is to be able to discover trends — not just single numbers — that in turn reveal how we should act upon them.
In our last post on data we discussed one technique for analyzing trends, by visualizing one metric across other dimensions, such as the cancellation rate across time and period of use. This is great when examining the customer base as a whole or considering just a few segments, but it starts to break down when we try to compare dozens or even hundreds of segments at the same time; many graphs all together can be as opaque as the original data. So we need a new technique when we want to explore many segments at once.
To explain our approach, let’s use an example applicable to anyone with a website: traffic sources. We care about traffic sources because if we can better understand who’s sending us new users, we can better serve those users. Most tools like Google Analytics offer lots of data about traffic sources. For each source we can consider a range of metrics, from the number of visitors, to how long they spend on the site, how many of them sign up, and more.
To get to the trends we care about, though, we need to take a step back. First, we should focus our attention on the single most important metric; then, to find the trends, we can take that metric and break it apart — that is, we can look at its components. While the overall metric can help us understand which items are most important, its components will tell us why they are important.
For many online businesses like us, that key metric for web traffic is straightforward enough: revenue. Of course all of the other metrics around engagement are important too, but ultimately we hope traffic will drive people to sign up and purchase a plan.
With this metric in hand, we can now break it apart. To do that, let’s walk through the thought process for visitors purchasing a plan. First, they’ll decide to visit our site. Then, after browsing around, they’ll determine we fit their needs and decide to sign up for a plan. And lastly, they’ll look at the plans we offer and decide which plan they want. So, to understand how traffic sources are driving revenue, we can think about this metric in terms of the sub-decisions that comprise it: visits from the traffic source, then sign-ups from those visits, and then revenue from those sign-ups. Viewing the components in this way produces a nifty equation:
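Written out from the three sub-decisions just described (the labels are our own shorthand, and the layout of the original figure is an assumption), the decomposition for a single traffic source is:

$$
\text{Revenue} = \text{Visits} \times \frac{\text{Sign-ups}}{\text{Visits}} \times \frac{\text{Revenue}}{\text{Sign-ups}}
$$

The visit and sign-up terms cancel, so the product is simply revenue. The point of writing it this way is that each factor can be examined on its own.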

Now we can start to plot our segments (that is, the traffic sources) against these component values. In this case, we went with the following bubble chart:

In this chart, we’re plotting the 200 traffic sources that drive the most revenue for us; together, they represent over 95% of our online revenue. The x-axis represents the percent of visitors for each traffic source who signed up for a plan. The y-axis represents the average revenue we received per sign-up. And the size of the bubble represents the total number of visitors. We also graphed both the y-axis and x-axis along a logarithmic scale, equally spacing 1, 10, 100, and so forth. Depending on the data, logarithmic scales can work better in cases like this, when larger values are spaced further and further apart from each other.
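As a rough illustration of how the chart's inputs could be derived, here is a minimal Python sketch. The source names and numbers are invented, and `bubble_point` is a hypothetical helper, not something from our actual tooling. It computes the two axis values and the bubble size for each source, plus their positions on logarithmic axes.

```python
import math

# Hypothetical traffic sources: (name, visits, sign_ups, revenue in dollars).
# These numbers are invented for illustration only.
SOURCES = [
    ("blog-a.example", 1_000, 50, 400.0),
    ("search.example", 50_000, 1_000, 12_000.0),
    ("forum.example", 200, 4, 10.0),
]

def bubble_point(name, visits, sign_ups, revenue):
    """Break revenue into the three components used for the bubble chart.

    Sources with zero visits or zero sign-ups must be filtered out first,
    since both the ratios and the logarithms are undefined for them.
    """
    sign_up_rate = sign_ups / visits          # x-axis: share of visitors who sign up
    revenue_per_sign_up = revenue / sign_ups  # y-axis: average revenue per sign-up
    return {
        "name": name,
        "x": sign_up_rate,
        "y": revenue_per_sign_up,
        "size": visits,                       # bubble size: total visitors
        # Positions on logarithmic axes, where 1, 10, 100 are equally spaced.
        "log_x": math.log10(sign_up_rate),
        "log_y": math.log10(revenue_per_sign_up),
    }

points = [bubble_point(*row) for row in SOURCES]

for p in points:
    # The three components multiply back to total revenue.
    rebuilt = p["size"] * p["x"] * p["y"]
    print(f'{p["name"]}: rate={p["x"]:.1%}, '
          f'per sign-up=${p["y"]:.2f}, revenue=${rebuilt:.2f}')
```

Any charting tool that supports bubble sizes and logarithmic axes can then plot `x`, `y`, and `size` directly.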
Immediately, we can see that traffic from direct visits and Google searches dwarfs our other traffic sources in terms of total visits (and consequently, total revenue as well). These two sources are also somewhat further to the top right than average, which means they achieve higher revenue per visit. This makes sense: visitors who are ready to make a purchase or know what they want are more likely to search for related terms or seek us out directly. Those bubbles are so big, though, that they’re hiding differences among the other sources. We can remove these two data points to see a clearer picture of the remaining traffic sources:

Now we can dig into what the chart shows us about these segments. Take the visitors arriving from Typekit colophons, for example, where a high volume of visitors and strong sign-up rates are compensating for a relatively low proportion of paid plans. This tells us that to increase revenue here, we want to focus on improving the proportion of paid plans. That might mean reexamining how we’re presenting the benefits of each plan or perhaps how we’re directing these visitors through the sign-up flow. On the other hand, the Lost World’s Fairs traffic source has the opposite problem: plenty of traffic and a decent proportion of paid plans, but not many sign-ups per visit. We know, then, that we should focus on sign-up rates for this segment, perhaps through custom landing pages. And lastly, we can see that several web design blogs are sending quality visitors, with both strong sign-up rates and a large proportion of paid plans, but in low volume. So perhaps we can advertise on sites like these to increase overall traffic.
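The reasoning in these three examples can be sketched as a simple rule: find the component that is weak relative to the other sources, and focus there. The Python below is only an illustration under our own assumptions. The median cut-offs, the `suggest_focus` helper, and the sample numbers are all hypothetical, and revenue per sign-up (the chart's y-axis) stands in for the proportion of paid plans.

```python
from statistics import median

def suggest_focus(sources):
    """Label each source with the component that looks weakest.

    "High" and "low" are judged against the median across all sources,
    which is an arbitrary threshold chosen for this sketch.
    """
    mid_rate = median(s["sign_up_rate"] for s in sources)
    mid_value = median(s["revenue_per_sign_up"] for s in sources)
    mid_visits = median(s["visits"] for s in sources)

    labels = {}
    for s in sources:
        high_rate = s["sign_up_rate"] >= mid_rate
        high_value = s["revenue_per_sign_up"] >= mid_value
        if high_rate and not high_value:
            labels[s["name"]] = "improve revenue per sign-up"
        elif high_value and not high_rate:
            labels[s["name"]] = "improve sign-up rate"
        elif high_rate and high_value and s["visits"] < mid_visits:
            labels[s["name"]] = "increase traffic"
        else:
            labels[s["name"]] = "no single lever stands out"
    return labels

# Invented numbers that mimic the three situations described above.
example_sources = [
    {"name": "colophon", "visits": 5000, "sign_up_rate": 0.08, "revenue_per_sign_up": 2.0},
    {"name": "campaign-site", "visits": 4000, "sign_up_rate": 0.01, "revenue_per_sign_up": 9.0},
    {"name": "design-blog", "visits": 300, "sign_up_rate": 0.07, "revenue_per_sign_up": 10.0},
]

focus = suggest_focus(example_sources)
for name, label in focus.items():
    print(f"{name}: {label}")
```

In practice the chart itself does this job visually. The sketch just makes the decision rule explicit.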
These are all insights a ranking could never reveal. With this new view, we’ve moved past simply figuring out what’s important to thinking about what we can actually do. We can even draft broad strategies tailored to different areas of the chart:

Examining segments in context like this can also help show when certain opportunities stand out. When we did this analysis last summer, we were pleasantly surprised by how well our blog was performing. In addition to decent traffic volume, both the sign-up rate and paid plan proportion were higher than average. So we decided to invest in the blog and prioritized several design and content changes.
To better integrate the blog and encourage more browsing into the Typekit site, we added our site-wide navigation to the blog header. To increase sign-up rates, we improved our promotional material, including a more prominent message box in the top right and a new footer, as well as a list of recent fonts in the right sidebar. We also thought more about the content. Realizing that posts about new fonts were driving strong traffic, we made them even better. We put more time into making our font specimen images larger and more colorful, and we started suggesting font pairings as well. We also introduced a new series of posts, About Face, where we could feature more content like this. Here’s a brief look at the before and after:

The former blog design.
The new (current) blog design.
It paid off. Before we made those changes, 9% of visitors to our blog were clicking into typekit.com, leading to 4 cents in revenue per blog visit. Today, 27% click into typekit.com, and we see 10 cents in revenue per blog visit. Added to increased traffic overall, our blog is now driving about 190% more revenue per month.
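To see how these figures fit together, we can apply the same break-it-apart idea: blog revenue is blog visits times revenue per visit. The short Python sketch below uses only the rounded numbers quoted above, so the implied traffic change is a back-of-the-envelope estimate and not a measured value.

```python
# Figures quoted in the post (rounded, so the results are approximate).
revenue_per_visit_before = 4   # cents
revenue_per_visit_after = 10   # cents
total_revenue_increase = 1.90  # "about 190% more" revenue per month

# Revenue = visits x revenue per visit, so the multipliers divide out.
per_visit_multiplier = revenue_per_visit_after / revenue_per_visit_before
total_multiplier = 1 + total_revenue_increase
implied_traffic_multiplier = total_multiplier / per_visit_multiplier

print(f"Revenue per visit: x{per_visit_multiplier:.2f}")
print(f"Total revenue:     x{total_multiplier:.2f}")
print(f"Implied traffic:   x{implied_traffic_multiplier:.2f}")
```

Under these assumptions, most of the gain comes from higher revenue per visit (about 2.5 times), with the remainder attributable to a more modest rise in traffic.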
Of course, we didn’t necessarily need the chart to know we could improve our blog. We’re always aware of lots of things we’d like to do when we can get around to it. But we do need charts like this to make sense of our options — to understand how they relate to each other so we can make better decisions about our priorities and approach. That’s where this technique is most beneficial. By simplifying our focus to one metric and then looking at how it breaks apart, we can see which items have the most potential, learn what specific areas need attention, and think of smarter ways to improve.
Together with the technique described in our last post, these methods can help us dive into our data and explore what insights it has to offer. It’s a process that involves asking general questions and looking at what happens when we spread the answer out in different ways. But just as important as it is to begin with questions and dive deeper, we also need to constantly monitor activity at a higher level so we can quickly catch when issues arise. In our next post, we’ll discuss our approach to dashboard metrics and how we go about building team-wide transparency into what’s happening with our users and product on an ongoing basis.
How we use data: Shades of gray
December 5, 2011
This is the first in a series of posts from Typekit’s resident data analyst, Mike Sall.
Data is an incredibly valuable resource, but translating it into something useful isn’t always straightforward. Actually, it’s a lot like apartment hunting on Craigslist: you can’t always trust the postings, the photos can be deceptive, and a lot of information is missing. But after checking out enough listings, you start to get a feel for the market. It becomes easier to spot the best options. And you might even learn to ask about new things, like the best local grocery or the easiest place to find parking.
At Typekit, we approach our data in much the same way. We might not know exactly what we are looking for, so we want to be able to discover the things that matter. At the same time, though, we need to be careful about how we interpret what we see.
To do that, our method is threefold: first, we ask generalized questions; second, we illustrate the answers to those questions across many dimensions; and third, we focus on the trends they reveal rather than any single, potentially distorted value.
It’s an exploratory approach that involves simple calculations and visual illustrations. Like apartment hunting, it’s an iterative process. And it runs the risk of oversimplification, since it doesn’t offer the same level of precision that heavy statistical algorithms might. But if you’re cautious, and you focus on trends rather than single values, you can unearth far more valuable insights in the process. Plus, you can pretty much do everything in a basic spreadsheet.
So, how do we put it into practice? In this series we’ll detail four techniques we use to guide us.
The answer is never black and white, so work towards shades of gray
Frequently, the biggest questions that data can help us answer concern understanding customer behavior. These can cover a wide range of topics. For example, what kinds of products are customers buying? Or, how are they interacting with different features?
To show how we approach these kinds of questions, let’s walk through one that is very important to us at Typekit: how many of our customers are cancelling their subscriptions? We put a lot of effort into understanding cancellation behavior because we want to keep our customers happy. If something is driving our customers to cancel, we need to know about it.
To start tackling this question, let’s look at the equation we need to use:
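In symbols (the labels are our own shorthand, not taken from the original figure):

$$
\text{Cancellation percent} = \frac{\text{Customers who cancelled}}{\text{Total customers}} \times 100
$$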

Divide the number of customers who cancel by the total number of customers to arrive at your cancellation percent.
Thankfully this equation is pretty simple — we only need two numbers!
But as soon as we take the next step and look for those numbers, we run into all sorts of new questions. Should we include the customers who used our beta version? Does it still count if the customer renewed for two years before cancelling? Does it make sense to include the customers who cancelled before we shipped a major feature? What if customers cancelled within a couple minutes of signing up — should we still pool them together with the customers who tried us out for a few months?
Of course, we could answer these questions one by one, imposing limitations on who to include until we had a “typical” population of customers, but the result would be totally myopic. For all we know, some of the most interesting insights might be found among the customers that we excluded.
So, instead of trying to shove the data into a single black and white answer, it’s better to spread it out. The shades of gray are a good thing; given our generalized question, we want to find ways to answer it across multiple dimensions.
That’s when all those questions we run into become an asset. Looking over them, they seem to converge on two basic factors: the point in time when customers signed up, and the duration for which they had their subscription before cancelling. Ah ha! Now we have our dimensions, and we can start plotting the cancellation percentage across them.
After trying out several different groupings and visualizations, here’s the chart we ended up with:

Customer cancellation behavior, charted across the month when the customer signed up.
Let’s walk through this chart. On the x-axis, we have the months when customers signed up. So, the area above Jan ’11 represents all the customers who signed up during that month. On the y-axis, the total height of the area represents the total percentage of customers who cancelled. We then split that percentage into stacked bands that represent the different durations for which the customers had their subscriptions before cancelling. For example, the light green band on the bottom represents all the customers who cancelled within a day of signing up. Above that, the darker green band represents all the additional customers who cancelled within the 30-day trial period but after the first day.
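The numbers behind a chart like this can be produced with a simple grouping step. The following Python is a minimal sketch with invented customer records. The `duration_band` cut-offs follow the first two bands described above, while the third band is a hypothetical catch-all where the real chart uses several more.

```python
from collections import Counter, defaultdict
from datetime import date

def duration_band(days_active):
    """Map days between sign-up and cancellation to a band (assumed cut-offs)."""
    if days_active <= 1:
        return "first day"
    if days_active <= 30:
        return "rest of 30-day trial"
    return "after trial"

def cancellation_bands(customers):
    """Return {sign-up month: {band: percent of that month's sign-ups}}.

    `customers` is an iterable of (signed_up, cancelled_on) date pairs,
    with cancelled_on set to None for customers who are still active.
    """
    totals = Counter()
    cancelled = defaultdict(Counter)
    for signed_up, cancelled_on in customers:
        month = signed_up.strftime("%Y-%m")
        totals[month] += 1
        if cancelled_on is not None:
            band = duration_band((cancelled_on - signed_up).days)
            cancelled[month][band] += 1
    return {
        month: {band: 100 * n / totals[month] for band, n in cancelled[month].items()}
        for month in totals
    }

# Invented records for illustration only.
customers = [
    (date(2011, 1, 3), date(2011, 1, 3)),    # cancelled on the first day
    (date(2011, 1, 10), date(2011, 1, 25)),  # cancelled during the trial
    (date(2011, 1, 15), None),               # still active
    (date(2011, 1, 20), None),               # still active
    (date(2011, 2, 1), date(2011, 6, 1)),    # cancelled after the trial
    (date(2011, 2, 5), None),                # still active
]

bands = cancellation_bands(customers)
for month, shares in sorted(bands.items()):
    print(month, shares)
```

Stacking each month's band percentages gives the total height of the area for that month, which is the overall cancellation percentage for customers who signed up then.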
Now we can really explore what the chart reveals. Most prominently, cancellations appear to happen at three specific points in time: the 30-day trial period, the year-end renewal mark, and later when enough failed payment attempts essentially render a customer inactive. Within the 30-day trial period, customers act quickly, with around a third of those cancellations occurring on the first day alone. These numbers are changing, however. The overall cancellation percentage of older customers has been decreasing, mostly due to fewer cancellations at the renewal mark, while failed payment attempts have remained steady. Conversely, cancellations prior to the renewal mark have increased since last year, particularly around last November. Still, these trends are separate from the unique behavior we see in our earliest customers: a spike in failed payments but also fewer initial cancellations (likely attributable to the extra patience of early adopters).
All put together, this is incredibly valuable. It represents a multifaceted understanding of our customers’ cancellation behavior. To improve customer retention, we know there are specific points in time when we can focus additional messaging or improvements. On the first day especially, when customers make quick decisions based on initial impressions, we need to nail that on-boarding experience. Likewise, we know there are large swathes of time when customers are not making these decisions, so we don’t need to focus our attention there.
On that point, now that we know how cancellations are distributed across these periods of time, we can better prioritize our efforts. If we think the renewal period cancellations seem high, for example, we might try out different notification messages or re-examine how we’re welcoming customers back to their account settings. As for initial cancellations, we might take a closer look at the increase we’re seeing; we added some major functionality to our transaction system around November of last year, so perhaps some new step or message produced an unwanted side effect.
The chart also leads to additional questions. For instance, how do these trends and distributions differ for customers who have upgraded or downgraded, or for each of our individual plans? When we look at some of these more specific segments, we find even more differences:

Customer cancellation behavior for each of the main payment plans.
It appears that a larger portion of our Personal Plan customers cancel, especially early on during the free trial period, so we can tailor messaging specifically to these users at this time. It also looks like the increase in cancellations prior to the renewal mark is limited to Portfolio Plans, so we can focus improvements on those customers. And for Performance Plan customers, we can feel comfortable making few changes since we see very little cancellation behavior among them.
The more questions we ask, the more insights we can gain. Plus, this is just what we’re seeing today. We reproduce this analysis on a regular basis, so if any of these trends changes significantly, we’ll see it, and we’ll be able to react.
That’s the beauty of quantifying values like these across other dimensions — it allows us to quickly examine the whole landscape of customer behavior. And beyond that, it can reveal answers to questions we never thought to ask. In this case, we weren’t initially asking how specific groups of customers were cancelling at specific times, but now we’re acting on those discoveries.
Still, this particular technique has its limitations. It works best when we want to understand all our customers generally or the differences between just a few groups, like our three subscription plans. When there are many more categories, such as customers from different countries or the results of different advertising messages, it can become burdensome. In the next post we’ll discuss how we approach that end of the spectrum.