Growth & Marketing

Jes Kirkwood on November 15th 2021

Shopify's VP, Growth Morgan Brown reveals how the company's growth team drives results in an exclusive interview.

Jake Peterson on April 14th 2016

As Head of Customer Success, I wonder whether we should continue to offer support on our free tier because it’s been hard to nail down the costs and benefits. Adding support as a feature is becoming more popular in the SaaS industry. For example, Optimizely recently switched to two plans, and you only get email support as an enterprise customer. My concern is: what if we’re spending too much time supporting very small customers, and it is prohibitively costly to the business?

Beyond free support, I also consider: How efficient is our team? Are we hiring quickly enough or maybe too fast? How can we make the support experience so magical that customers use Segment more and recommend us?

Answering these questions has been difficult, since the data sets required to do these analyses have been siloed within tools. Most of the data I needed to analyze is in Zendesk. Billing and plan data is in Stripe and product usage is in Segment. Now, with Segment Sources, I can tie all of those records into one data warehouse. Using data, instead of our collective intuition, to make better decisions faster feels good.

In this post, I’ll outline my major questions about our success team at Segment and the queries I used to investigate them. We used BIME Analytics since they have a number of out-of-the-box dashboards for Zendesk. I’ve included links to queries where appropriate!

Please note that we replaced the number values in these charts with fake data, but the trends and percentages we saw are real. We hope you can take this as a starting point for your own analysis! Many thanks to analysts Will and Perry on our team for helping me with some of the queries!

Here is a fictitious Zendesk dashboard created with BIME Analytics.

Does our success team have a positive impact on activation, revenue?

To begin my analysis I split our customer base in two cohorts: those who have created a ticket with our Success team and those who haven’t (related queries can be found here). I wanted to see what the differences were between these groups.

API Usage

Since API usage, or how much data people send through Segment, is a good proxy for the value our customers get from our product, I took a look at the average monthly API calls for both cohorts.

Customers who have talked to support send, on average, more than 10x more data to us than customers who haven’t. In this initial analysis, though, many inactive accounts skew the results.

To make this analysis more actionable, we looked at “active” customers (accounts sending more than 500 API calls per month). Here, the gap is smaller but still significant. This isn’t surprising, since we’ve removed a non-trivial sample of accounts that don’t send much data.

Though we should be mindful that this correlation does not imply causation, these queries show that active accounts use Segment twice as much when they also have engaged with our success team.
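As a rough illustration of the cohort comparison above, here is a small Python sketch that splits accounts by whether they have filed a ticket and compares average monthly API calls, first for all accounts and then only for “active” ones (more than 500 calls per month, the threshold used above). The account records and field names are hypothetical and invented for the example; they are not our schema or our real numbers.

```python
from statistics import mean

# Hypothetical account records: field names and values are invented
# for illustration and are not real Segment data.
accounts = [
    {"id": "a1", "has_ticket": True, "monthly_api_calls": 12_000},
    {"id": "a2", "has_ticket": True, "monthly_api_calls": 4_000},
    {"id": "a3", "has_ticket": False, "monthly_api_calls": 6_000},
    {"id": "a4", "has_ticket": False, "monthly_api_calls": 2_000},
    {"id": "a5", "has_ticket": False, "monthly_api_calls": 0},
    {"id": "a6", "has_ticket": False, "monthly_api_calls": 100},
]

ACTIVE_THRESHOLD = 500  # "active" means more than 500 API calls per month


def average_calls_by_cohort(rows, active_only=False):
    """Return mean monthly API calls for the ticket and no-ticket cohorts."""
    if active_only:
        rows = [r for r in rows if r["monthly_api_calls"] > ACTIVE_THRESHOLD]
    cohorts = {True: [], False: []}
    for r in rows:
        cohorts[r["has_ticket"]].append(r["monthly_api_calls"])
    return {
        "ticket": mean(cohorts[True]) if cohorts[True] else 0,
        "no_ticket": mean(cohorts[False]) if cohorts[False] else 0,
    }


all_accounts = average_calls_by_cohort(accounts)
active_accounts = average_calls_by_cohort(accounts, active_only=True)
print("all accounts:", all_accounts)
print("active only:", active_accounts)
```

Dropping the inactive accounts raises the no-ticket average, which is why the gap narrows once you filter to active customers.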

Revenue Impact

Activation in the form of API calls is certainly important, but we also want to know if working with customers to get set up correctly means that they will get enough value to upgrade to a paid service plan.

If users typically use Segment more after talking to support, do they also upgrade and pay more?

Forty-two percent of users who have talked to our success team converted to paying (starting a subscription at any billing plan). Conversely, users who have not talked to our success team converted at a lower rate of 22%.

It’s certainly encouraging to see that our success team has a positive revenue impact, as this result complements the increased API usage analysis of the previous section.

To further confirm success’s contribution to our bottom line, we’d need to do some more exploratory analysis. For example, looking at a handful of accounts in each of those cohorts, seeing what tickets they’ve written in, and learning when/why they upgrade to a paid account. Then, we can work to repeat that with other customers.

How much does our success team cost?

Since our product is highly technical, our amazing success team does not come cheap. Most of them are engineers! Understanding support costs helps us in two ways: 1. knowing how better to allocate our success resources and 2. building a hiring plan.

By combining the total cost of our success team (salaries + benefits + tools our team uses) and the data afforded by the zendesk.ticket_metrics table (tickets, replies, resolution times), we can calculate the:

  • cost per ticket

  • cost per reply, and

  • cost per “day when a ticket remains open”.

To keep things simple, our analysis is going to focus only on the cost per ticket for January 2016.

The total cost of success for that month is $58,333 (the sum of all annualized salaries and tools expenses divided by 12) and we had a total of 1,380 tickets. The cost per ticket is $42.27.
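The arithmetic above is simple enough to sketch in a few lines of Python. The function name is just an illustrative choice; the two inputs are the January 2016 figures quoted above.

```python
def cost_per_ticket(total_monthly_cost: float, ticket_count: int) -> float:
    """Divide the all-in monthly cost of the team by the tickets handled."""
    if ticket_count <= 0:
        raise ValueError("ticket_count must be positive")
    return total_monthly_cost / ticket_count


january_cost = 58_333     # total cost of success for the month, in dollars
january_tickets = 1_380   # tickets created in the same month

january_cost_per_ticket = round(cost_per_ticket(january_cost, january_tickets), 2)
print(january_cost_per_ticket)
```

The same function could be pointed at replies or open-ticket days instead of tickets to get the other two metrics in the list.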

This means that if we theoretically charge our users more than $42.27 to create a ticket with our success team, then we would be covering all of our costs. But who really does that anyway? (Don’t worry, we won’t—karma is one of our values 😃.)

However, this cost does help us figure out how to best allocate our resources. If our customers and ticket volumes grow too quickly, then we would need to find ways to allocate our success resources to alleviate ticket strain. Some obvious methods would be creating help articles or a Stack Overflow-esque community where users can answer other users’ questions (which we’re working on!). Other options would be to sell à la carte premium support, where there are contractual SLAs and maybe even direct phone support. A strong understanding of support costs can help guide the decision around approaching any of those options.

How can we better prioritize new product requests?

While the squeaky wheel gets the grease, we may be missing the larger picture when it comes to prioritizing our road map. For example, one customer might keep asking us for a specific feature, but that might not align with our overall product strategy or other customers’ requests.

We found that layering on the LTV of customers who request features, in addition to the total number of requests for a particular fix or feature, helps us make sure we’re prioritizing the most impactful projects for the product team. While we ultimately don’t want to transform our company into a dedicated dev shop for our highest paying enterprise customers (or grant them full control over our product vision), incorporating LTV and number of requests can be a helpful proxy for demand of smaller features.

This analysis can be done by querying the Zendesk tickets table, generating a cohort of customers who have asked for a specific feature, adding their billing plan, and then calculating the LTV. The LTV would then be the value associated with the product request.

We frequently get integration requests via Zendesk that follow a pattern in the zendesk.ticket.subject: “Requesting integration for “ via our contact page.

We can count the number of discrete integration requests, as well as sum the LTV of all users who submitted requests in SQL using the Zendesk source:

This gives us a strong idea of how much each new integration is worth. The cost of developer resources to onboard or build the integration and time can be weighed against the LTV of the cohort who has requested the integration.
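To make the weighing concrete, here is a minimal Python sketch of the same idea: count requests per integration and sum the LTV of the distinct users who asked for it. It assumes the requested integration’s name follows the “Requesting integration for ” prefix in the ticket subject. The tickets, user IDs, integration names, and LTV values are all hypothetical.

```python
from collections import defaultdict

PREFIX = "Requesting integration for "

# Hypothetical tickets and per-user LTV, invented for illustration.
tickets = [
    {"user_id": "u1", "subject": "Requesting integration for ToolA"},
    {"user_id": "u2", "subject": "Requesting integration for ToolA"},
    {"user_id": "u2", "subject": "Requesting integration for ToolB"},
    {"user_id": "u3", "subject": "Billing question"},
    {"user_id": "u1", "subject": "Requesting integration for ToolA"},
]
ltv_by_user = {"u1": 1200.0, "u2": 300.0, "u3": 50.0}


def integration_demand(tickets, ltv_by_user):
    """Count requests and sum the LTV of distinct requesters per integration."""
    requests = defaultdict(int)
    requesters = defaultdict(set)
    for ticket in tickets:
        subject = ticket["subject"]
        if not subject.startswith(PREFIX):
            continue  # not an integration request
        name = subject[len(PREFIX):].strip()
        requests[name] += 1
        requesters[name].add(ticket["user_id"])
    return {
        name: {
            "requests": requests[name],
            # Each user's LTV is counted once, however often they asked.
            "total_ltv": sum(ltv_by_user.get(u, 0.0) for u in users),
        }
        for name, users in requesters.items()
    }


demand = integration_demand(tickets, ltv_by_user)
for name, stats in sorted(demand.items()):
    print(name, stats)
```

Counting each requester’s LTV only once keeps a single persistent customer from inflating the value of a request.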

How can we scale our success team?

Planning for hiring is part art and part science. The most important key to our hiring is making sure we have enough success engineers for “product coverage”.

The idea of “product coverage” is based on ensuring that our customers who write into our success team will get a response within a reasonable timeframe. Given historic data, we can assume that an individual success engineer can, on average, close 10 tickets per day or 200 tickets per month (given 20 business days).

If there are not enough success engineers to answer tickets, then the average ticket resolution time will creep up. The prevailing equation here is:

Then, we can monitor “Tickets Per Product Per Month” for different products over time to see if the average number of tickets grows too much to become unmanageable by our success team. This would be an early indicator of needing to hire a new success engineer.
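As a sketch of how such a coverage check might work, the Python below assumes that capacity is simply the number of success engineers multiplied by 200 tickets per month (the per-engineer figure given above). That capacity rule is an assumption for illustration, and the product names, ticket counts, and headcounts are hypothetical.

```python
import math

TICKETS_PER_ENGINEER_PER_MONTH = 200  # 10 tickets/day x 20 business days


def engineers_needed(monthly_tickets: int) -> int:
    """Smallest headcount whose assumed capacity covers the ticket volume."""
    return math.ceil(monthly_tickets / TICKETS_PER_ENGINEER_PER_MONTH)


def is_covered(monthly_tickets: int, engineers: int) -> bool:
    """True when assumed capacity meets or exceeds the ticket volume."""
    return monthly_tickets <= engineers * TICKETS_PER_ENGINEER_PER_MONTH


# Hypothetical monthly volumes and headcounts per "product".
monthly_tickets_by_product = {"product_a": 350, "enterprise": 450}
engineers_by_product = {"product_a": 2, "enterprise": 2}

for product, volume in monthly_tickets_by_product.items():
    staffed = engineers_by_product[product]
    status = "covered" if is_covered(volume, staffed) else "over capacity"
    print(f"{product}: {volume} tickets, {staffed} staffed, "
          f"{engineers_needed(volume)} needed ({status})")
```

A product that stays over capacity for several months in a row would be the hiring signal described above.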

In this example, we look at January 2016 and include enterprise as its own “product”, since the SLA demands a higher threshold for coverage.

We see here that in January, the number of enterprise tickets surpassed the required “coverage” by the success engineers. We dug into this number to make sure the reason is genuinely increased support demand from that segment (other reasons tickets could surge include broken product, confused customers, etc.). After looking through the tickets, talking to our success engineers, and confirming with other metrics (new enterprise contracts closed), it turns out we need to hire more!

Another benefit of splitting ticket coverage by product is that we now have the option of hiring specialized success engineers, focused on supporting a specific product. Since our enterprise customers have grown faster than we anticipated, we’re now looking specifically for Implementation Engineers, who are similar to success engineers, but dedicated to working with our enterprise customers.

Data-informed Success

Nowadays, customer service teams can be the reason why you win an enterprise contract over your competitor or why your users buy clothes from your store. Having Zendesk, Stripe, and Segment data available to analyze in Redshift not only helps us make decisions faster and with more confidence, but also helps us be more disciplined about launching experiments.

Hopefully, this post inspired you to ask similar questions about your support team. Keep in mind that the queries and results alone aren’t everything! Be sure to take the time to read through a sample of the tickets to confirm any hypotheses the results are suggesting. Often there is a lurking reason that skews the data in unexpected ways.

Lastly, there are many intangible benefits a success team can provide that can’t really (yet) be quantified. These benefits can build defensible brand equity and leave lasting impressions on customers. Some of these include customer referrals, which can drive down customer acquisition costs, or the reputation of having a personalized, consultative success team, which can help close enterprise deals.

Unite your data today or learn more at our upcoming webinar.

Diana Smith on April 12th 2016

“The ultimate goal of customer support is to make customers insanely happy while growing—and not draining—revenue.”

These are words of wisdom from Graham Murphy, head of customer operations at Mesosphere, an enterprise software company that builds the Datacenter Operating System (DCOS). The software behind the Mesosphere DCOS (including Apache Mesos) runs some of the largest datacenters in the world, along with all of Apple, Twitter, and big portions of Verizon and others. The Mesosphere DCOS is the first operating system that makes it easy for traditional enterprises to run large-scale distributed systems in their datacenter and cloud.

Customer support is critical to Mesosphere’s success because the product plays a key role in customers’ infrastructure. Any time there is a question or an issue, Graham’s team needs to respond quickly to make sure Mesosphere customers’ systems run smoothly. We sat down with Graham to get a sense for what type of analyses he’s running on his support team, what data is critical to understanding their efficiency, and how he’s using Segment Sources to improve the customer experience.

In our Q&A, Graham covers:

  • The top analyses support teams should be running

  • How to prove your support team is contributing to, not just eating up revenue

  • Tips for making sure product and engineering teams listen to your customers

  • How Mesosphere is using Segment Sources to pull together Zendesk, Salesforce, and Intercom data in Amazon Redshift to uncover these insights

Dive into the interview below, or get the PDF.

Customer Operations at Mesosphere

Diana: Thanks for taking the time with us today, Graham. So, tell us about Mesosphere and your role.

Graham: Sure! I’m the head of customer operations for Mesosphere. In short, we’re building an operating system for datacenters, the DCOS. Mesosphere organizes your entire datacenter and cloud infrastructure as if it were a single computer. We help businesses that run distributed systems build, deploy, and manage them much more efficiently.

My role here is to make sure we are providing exceptional service to our customers. Basically everything customer-facing falls into my world: support, escalating tickets to engineering, prioritizing product feedback, and resolving customer problems.

Diana: So what does the support team look like at Mesosphere? How important is it to the business?

Graham: Well, we are an enterprise software company, and we have very large customers who rely on DCOS to power their datacenters. We need to be ultra-responsive, answer customer questions, and squash issues as quickly as possible. Each contract has strict support SLAs (service-level agreements for response time, etc), so closing tickets quickly and efficiently is critical.

As far as our team goes, we’re made up of highly skilled engineers. You can’t support Mesosphere if you’re not an engineer, so… our team is not cheap. We need to make sure we’re maximizing customer happiness while also being fiscally responsible.

Tying Customer Support to Revenue

Seems like your team is under a lot of pressure to perform quickly. How are you measuring performance and working to improve it?

Right now, I’m focused on a few things to make sure our team is really contributing to the overall organization:

  • Better understanding the scope and costs of support

  • Tying support interactions to potential and existing revenue

  • Prioritizing customer feedback for our product and engineering teams

These deliverables made me really excited about Segment Sources because they each require support and sales data in one place. We’re already fairly heavy users of Segment’s other products, but Sources is extremely valuable to me because I used to have to tie together my data from Zendesk, Intercom, and Salesforce by hand.

What do you mean “by hand”?

In the past I had to pull data from the tools’ APIs with generally hacky scripts that would push the data into Redshift or another SQL database. I could build the initial script myself or put a few engineers on it for a few weeks, but they are already at capacity handling support. And, thinking about other companies, I doubt most support teams would be technical enough to do this themselves.

And then, even when you get a hacky cron job put together (or Chronos job, in our case), you still have the headache of really maintaining it. For example, what happens when APIs and schemas change? Simple things like adding an additional custom field to your Zendesk script become a complete pain when you consider the fact that they will also require adding another column to your Zendesk table in Redshift.

We learned the hard way that there are inevitably going to be 10 to 20 different idiosyncrasies between various systems, data structures, and APIs that we’re not going to have thought of the first time around. This means we are going to have to go and fix them and, frankly, that is not our specialty. That’s your focus. I’d rather have you guys do it for us.

All of the Data in One Place

Glad to hear it! So, what Segment Sources are you pulling data from right now?

Zendesk is a big one — it’s the support channel that is under SLA for us. That’s usually where meatier customer questions come in. We also offer support through Intercom in certain parts of our product, so we need that too. We’re syncing Slack data from our customer channel over now with the .set() API for custom object loading, but I’d be more than happy to take a few Segment engineers out for happy hour if I could fast track that source.

To tie our support data to cash, we also use Salesforce, which is our source of truth for revenue. Having worked at Salesforce for a while, I’m a big advocate of the software. Easy access to this data — accounts and opportunities, specifically — in Redshift is very useful for me because it allows me to slice and dice the revenue data and tie it directly to our support and customer data.

At one point, we considered doing all of this analysis in Salesforce by pulling more support information in there but, honestly, I find it lacking a lot of the integrations I’d want. With Segment Sources, I can supplement our data in Salesforce with queries in Redshift across support and product data, which means it’s much more flexible.

Unifying the data sets through Segment makes my performance analysis easier. Now that I can run a single query against a few Redshift tables, I’ve saved dozens of hours of customization work. It’s the unification of the data sources that’s really powerful for me.

Understanding Revenue Impact

What types of analyses are you running now that you have Segment Sources? How are you tying your team’s contributions to revenue? What’s your main goal?

That is a multifaceted question. The short answer is that the ultimate goal of customer support in any organization is to prove you’re providing value. After all, it’s very easy to view customer support as a cost center. The good news is that you can demonstrate customer support is not a cost center, but actually protects and grows revenue, if you have the right data.

Here’s an example: Say you had a hundred customers that encountered a bug. Fifty came and contacted support, and 50 didn’t. With your product usage data, your support data, and your Salesforce data, you can look at the differences between the two groups and speculate as to whether the support interaction influenced whether or not they stayed on as a customer. Note that this is a simplification of a principle that John Goodman covers in detail in “Strategic Customer Service.”

Prioritizing Customer Feedback

Nice! What else are you doing with your support data?

The second way we’re using the data is to help prioritize fixes for our product and engineering teams and make sure our customers’ voices are heard. When we say X number of customers, equaling Y amount of revenue, are experiencing this bug, that’s a lot more compelling when we ask for a fix. And, I can also tie in things like actual impact on revenue, potentially impacted revenue, and future revenue. This makes the support team’s product requests hold a lot more weight.

Besides feature feedback, we’re also working on building individual timelines for each customer across all channels. With my Zendesk, Salesforce, and Intercom data together in one database, it’s really easy to figure out exactly when our last contact with a certain organization happened. I want to do that in a single query, and use it for all sorts of stuff. It can be a dashboard. It can be a report for the exec team. And, it can be a tool for support and sales agents reaching out to high-value customers. There are a lot of possibilities.

Calculating Cost Per Ticket

That’s such a great idea! We might steal that one. I want to bring back cost per ticket for a second, which you mentioned earlier. What are you doing with that?

What I’m really trying to do is look across all channels (SLA-governed or not) and all customers (community customers that aren’t paying us anything, plus customers that are paying us a ton) and get a holistic view of what my team is doing. I want to measure every output.

Once I get all of the top level data (replies, notes, tickets), it’s easy to run some pretty critical projects:

  • Calculating cost per ticket

  • Extrapolating cost to support specific and future customers

  • Building a hiring model

The way that you generally calculate the cost of a ticket is to take the target earnings of all the people on your team. You then add your overhead costs associated with the team, tooling or anything else, and the variable costs. This is your “all in” cost. You then compare that to the number of tickets that were created in that time period. That would be the industry-standard way to calculate cost per ticket.

With this baseline you can get into more creative metrics, like support cost over customer buckets—such as paying customers versus community customers. And, what I look at now is how much it costs for each interaction, because maybe one ticket had 20 interactions.

Those seem like very useful analyses!

Yes, especially because the support we provide is very costly. Right now our enterprise ticket costs are pretty high. Therefore, it’s important for us to actually tie the cost per ticket numbers back to the contracts on specific deals. I’m making up numbers, but let’s say one customer pays us $100 a month, but it costs $500 to support them. Obviously, that’s not sustainable.

So in terms of comparing the cost of goods sold to internal operational costs, it’s very important for us to be able to have that holistic view of every ticket and conversation. And again, for me, it’s really important because if I were to look at all the Sources, I could make a generalization that an Intercom reply costs $100, while a support ticket costs us $2,000. Then applying that to requests per specific customer, I have a much more realistic view of the impact support is having on our organization. I also know how many agents we need to make sure our customers have a great experience.

These types of analyses would be really annoying to do without all of my data in Redshift. Segment Sources makes it easy.

Many thanks to Graham for taking the time to chat with us about customer operations at Mesosphere. If you’re interested in checking out Sources, Segment’s new product for pulling data from cloud services like Zendesk and Salesforce into a SQL database, you can see a demo at our webinar next week.

Dave Gerhardt on March 15th 2016

Data is powerful because it’s black and white. It’s concrete. Data takes the guesswork and gut calls out of decisions. And data has helped transform marketing’s perception from fluffy and “arts and crafts” to a role that is crucial to driving growth and predictable revenue.

But when it comes to startups, sometimes we can go too far in the other direction. We have a tendency to focus so much on being data-driven that we forget about being customer-driven.

As Segment CEO Peter Reinhardt said, “20 hours of great interviews probably would’ve saved us an accrued 18 months of building useless stuff.”

Data is crucial, but data is a look in the rear view mirror.

In the early days of a startup, you can’t lead with just data. Plus, for most early stage companies, it would take months to get results back that are statistically significant – and on the path to finding product market fit, who can afford to pull over and wait to see what the data says?

Quantitative data is most powerful when it’s paired with customer feedback – and the only true way to get customer feedback is to put in the time and effort to talk to your customers. But you don’t need to spend weeks on weeks talking to thousands of people. Henry Devries and Chris Stiehl taught us that just 12-15 one-on-one customer interviews reveal about 80 percent of all possible pain points for your segment.

Here are three simple ways to talk to your customers more often without ever having to leave your standup desk.

Three Ways To Talk To Your Customers More Often

1) Send a welcome email with a purpose

Too many people treat their welcome email as a throw away, or something that gets written at the last minute as a finishing touch to onboarding.

But here’s what you need to know about your welcome email: it is the single most important email that you will send.

At Drift, our welcome email blows all of our other emails out of the water when it comes to engagement, with a 76% open rate and a 25% click rate – compared to the industry average of 21% and 2% respectively (thanks to MailChimp’s benchmarks).

It’s rare that you’re going to get this level of attention again without putting in a ton of effort, so make sure to write your welcome email with a purpose.

Yes, it should be smart, funny, and welcoming – but it should also be designed to get a response. Ask new users why they signed up, what they are looking to accomplish, what they’re struggling with, or what brought them to use your product or service.

Put yourself in the shoes of the reader – what’s the one question you could put in your welcome email that would get a response?

2) Start measuring Net Promoter Score

Net Promoter Score (NPS) was introduced to the world by Bain & Company in 2003, and since then has become the standard for measuring customer satisfaction in SaaS.

Slack CMO Bill Macaitis has said he’s not satisfied when someone signs up and becomes a paying customer. He cares about whether or not new users will recommend Slack to their friends and colleagues.

Atlassian President Jay Simons has said that NPS is the most important leading indicator of future growth – and as a result, every employee at Atlassian gets a digest of customer feedback each month that includes the latest NPS scores and any of the comments that came with it.

Start by asking all of your customers the NPS question daily so you can begin benchmarking the score and, most importantly, track the qualitative feedback.

This is where the real gold is – not the actual score. Take the time to read each response and bucket the feedback appropriately.

Daily NPS works well for early stage startups who might not have a dedicated customer success team. This way, you can manage a few responses a day, vs. trying to deal with thousands of responses at once. Use this as an opportunity to reply to each customer personally and start a conversation. At Drift, for example, we ask promoters for referrals, and reach out to passives and detractors directly to figure out where we need to improve.
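If you want to track the score yourself, the standard NPS calculation is easy to sketch in Python: respondents answering 9 or 10 count as promoters, 0 through 6 as detractors, and the score is the percentage of promoters minus the percentage of detractors. The ratings below are made up for illustration.

```python
def net_promoter_score(ratings):
    """Percentage of promoters (9-10) minus percentage of detractors (0-6)."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical answers to "How likely are you to recommend us?" (0-10).
ratings = [10, 9, 9, 8, 7, 6, 3, 10, 9, 5]
score = net_promoter_score(ratings)
print(score)
```

Passives (7 and 8) count toward the total but not toward either side, so the number alone hides a lot, which is one more reason to read the comments.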

3) Sometimes, you just have to ask

One of the best ways to start talking to more customers is also the most obvious: you just have to ask.

In late 2014, Groove CEO Alex Turnbull noticed a spike in churn and wanted to go beyond the metrics and the dashboards to figure out why – so he sent an email to every single customer asking for a few minutes of their time to talk. He spent more than 100 hours talking to 500 Groove customers and ended up with feedback that helped him right the ship.

In Alex’s case, he reached out to everyone via email at once, which produced hundreds of responses. But just like we talked about with NPS, you can also create ongoing campaigns designed to get customer feedback one by one.

Sending an email blast to every customer is certainly one way to get feedback, but you can also use your customer data to create relevant segments.

In-app messages have become the most valuable tool for us to get feedback from our customers when we need it at Drift, since we can show the message to a particular set of customers when they are actually doing something right inside of our product – and in some cases, you can get feedback in seconds. And more often than not, people are reading email on the go, or at a time when the last thing they’re thinking about is Drift.

How Often Will You Talk To Your Customers?

Whether you work on product or in sales and marketing, talking to a customer is the single most impactful thing you can do every day.

It’s the only way you’ll truly experience the real things they care about, the frustrations they feel every day, and the exact language they use to describe your product and their challenges.

So, how often do you talk to your customers?

P.S. Start talking to customers today, live on your site for free. Learn more about Drift.

Brent Summer on February 4th 2016

We’re all tired of hearing about “big data.” But if you want to learn how to use data to accelerate your company’s growth, don’t tune out yet. As consumers, we are producing more data than ever before: by some reports, as much as 90% of the world’s data was created in the last year. The Internet in Real-Time clearly illustrates the stupefying amounts of data generated on the open web.

Consumers are growing increasingly comfortable sharing their preferences, whether passively, like the brides on theknot.com (an early Segment customer!) who explore around 2,000 fashion products each minute, or more actively, with Amazon wish lists and Facebook likes.

Is your business taking full advantage of all of this data from your customers?

A 2013 report from Bain found that top performing companies are both twice as likely to use data “very frequently” to make decisions and five times as likely to make decisions faster than their market peers. Also, companies that pay attention to what their customers are telling them are more apt to continue making products people actually want.

As a result, we’ve seen the rise of the Chief Data Officer (CDO), and many companies are making a commitment to vigilance regarding the use of data in marketing and product development.

What is a “Data Czar”? (Chief Data Officer)

A data czar, also known as a Chief Data Officer, is a company official tasked with managing a business’s overall data strategy. This includes overseeing data security, ensuring compliance with data collection, and developing novel ways to use data assets for the purpose of business intelligence and analytics. The ultimate aim of a data czar is to harness data to obtain actionable insights and improve decision-making processes that drive business value. 

But What Exactly Does A Chief Data Officer Do?

Forrester surveyed 3,000 companies and found that 45% of companies have assigned someone to oversee data governance strategy. In younger companies, especially, data management may not always rise to a senior-level C-suite role. Instead, this tends to become the responsibility of a Software Engineer, a Product Manager, a Head of Analytics, or a Growth Hacker.

Data collection has obvious benefits and, in some cases, carries significant risk. To capitalize on the opportunity and mitigate the risk, companies, no matter what size, must nominate someone to be in charge of the data. Regardless of where the CDO role sits in the organization, these “Data Czars” seem to have a common set of responsibilities:

  • Identify opportunities for innovation across the entire customer journey.

  • Focus on revenue generation activities.

  • Develop and implement an enterprise-wide reporting solution for metrics and KPIs.

  • Meet regulatory demands and manage risk.

  • Coordinate strategy with other business executives, such as the CTO and COO.

  • Cultivate cross-functional relationships between business units to ensure insights become actionable and drive desirable business outcomes.

  • Ensure data is in a clean, consistent, accurate and actionable format for the rest of the company to easily use.

  • Help build data literacy across the company so various stakeholders can put your data to use.

  • Demonstrate ROI through data collection, enrichment and integration activities.

These data-driven organizations all realize that a well-defined process around data collection and management crucially informs business strategy and improves the company’s ability to model revenue, develop desirable products, undertake impactful initiatives, and a whole host of other activities that can confer a competitive advantage.

In early 2014, Gartner published this advisory report which urges Chief Information Officers (CIO) to get involved in defining the CDO job description and reporting relationships. This same report also reminds us that the CDOs do not “own” the data. Rather a CDO is analogous to a Chief Financial Officer (CFO), wherein the CDO owns a few things, like the schema perhaps, but they are expected to coordinate the use of data by other teams.

Tips for new Data Czars

According to Paul Gillin of SiliconANGLE, “Uber and Netflix are data-driven at their core; that’s their entire business. They don’t need a CDO when everyone is effectively doing that job already.” We’ve seen that at Segment, too. Many of our customers have one person in charge of the data strategy and keeping the execution tidy. Some of our biggest customer advocates have a title like Head of Analytics or VP of Growth while others are a Data Architect or Growth Hacker.

Assigning someone to be responsible for defining the schema, selecting the tool set and ultimately empowering them to prove their own ROI is an essential step toward businesses maximizing an investment in data. Without clear authority and processes, data becomes muddy and untrustworthy as different teams look at their respective tools and draw different conclusions.

For example, you’ve probably heard this type of question: “Why does Mixpanel say there were 3,000 signups when Google Analytics says there were 3,400?” Debugging data discrepancies is time-consuming and costly. When someone has clear ownership over the way data is collected and routed, these sorts of situations can be avoided altogether.

Here are a few tips for those courageous individuals championing the effective use of data in their company:

1. Create a data dictionary

This asserts a specific syntax and the uses of event-level data. A tracking plan is a good place to start. Make sure this is in a publicly-accessible location, no passwords allowed. The aim is to make this a standard reference point for everyone who engages with the data you collect.

2. Be adamant about formatting

To ensure data quality across the board, make sure everyone in your company has a guide for how event data should be named and implemented. Being strict about using a certain format, whether that’s verb-oriented (Viewed Ticket) or object-action (Account Created), is the only way you’ll have clean, easy-to-use data in all of your dashboards. Even simple cAsiNg can mean the difference between clean, trustworthy data and a real mess.
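A naming guide is easier to enforce when a script can check it. The following is a minimal sketch, assuming a convention of two or more Title Case words separated by single spaces; the pattern and function names are hypothetical, and the check covers casing and separators only, not whether the verb or the object comes first.

```python
import re

# Assumed convention for this sketch: Title Case words separated by
# single spaces, e.g. "Account Created" or "Viewed Ticket".
EVENT_NAME_PATTERN = re.compile(r"[A-Z][a-z0-9]*(?: [A-Z][a-z0-9]*)+")


def is_valid_event_name(name):
    """Return True if the name is two or more Title Case words."""
    return EVENT_NAME_PATTERN.fullmatch(name) is not None


def find_invalid_event_names(names):
    """Return the names that break the convention, in input order."""
    return [name for name in names if not is_valid_event_name(name)]


if __name__ == "__main__":
    sample = ["Account Created", "account created", "Viewed_Ticket", "Viewed Ticket"]
    print(find_invalid_event_names(sample))
```

A check like this could run in code review or CI, so a badly cased event is caught before it reaches a dashboard.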

3. Start small

It’s easy to track too much; avoid becoming overwhelmed (and paying radically high technology license fees) by limiting the number of events you track. You can always add more tracking later. Start with the most important questions you have, and track only what you need to answer them.

4. It takes a village

Educating the core product engineers on your data strategy leaves them in charge of their own codebase and focuses the data team on compliance with the data dictionary and meaningful analysis.

5. Build dashboards

Making data visible is one sure-fire way to help others. Partner with business stakeholders to understand what they want to know and deliver that in short order to get more people on your side.

6. Protect the integrity of the system

Above all else, introduce changes in an orderly fashion and build a process around collecting new types of data. The efficacy of your data hinges on your ability to collect it in a consistent, controlled manner. When people trust your data, you win.

I asked Jon Hawkins, Head of Analytics and SEO (aka “Data Cop”) at XO Group, for his advice to someone building out a data team. He said, “Your colleagues should be able to rely on you for a detailed understanding of how the software you bring into the company works, and how the data is modeled. When someone has a question, try to answer it first before directing them to vendor support. Demonstrate your expertise to avoid their losing faith in the system that you’ve architected, the system in which you’ll need their help to maintain its integrity.”

Using Data to Inspire New Features

When I asked Jon for an example of a data-inspired feature, he told me about The Knot Wedding Planner App. The app is pretty robust with impressive features to budget for your wedding, find and communicate with vendors, and all sorts of other things that help couples “tie the knot” in style. A recent analysis yielded some interesting results.

The red arrow shows a feature of the app that was discovered to have incredibly high retention correlation across all cohorts of users.

Billions of API calls pass through Segment to a Redshift database (and a variety of other integrations) that allow Jon’s team to use SQL to run some queries and visualize the data in Mode. This starburst analysis made it obvious that there are some very common user actions in the app. With that insight, they dug deeper using Mixpanel to identify the specific features that had proven to be so popular. It turns out, a surprising number of users were coming back, day after day, to look at their Wedding Countdown.

This wedding countdown is a popular feature among repeat visitors.

Jon said, “The starburst tipped us off that an easter egg feature in our UI may be encouraging some awesome engagement. So we looked at the related screen views and how frequently users came back just to view that screen. It turns out it has huge retention value for us. As each day goes by an overwhelming percentage of users will come back the following day and see how many days are left until their wedding in an unprovoked way (no email, push, etc.), similar to how one would check the weather.” The team of data scientists at The Knot had discovered this simple feature was a huge asset!

The colors show cohorts of users who return to the app after 1, 2, 3, etc. days. The majority of users will return multiple times within the same week and engage with this feature.

Jon continued, “This analysis taught us that a very simple feature was bringing people back to our app again and again. Most apps don’t have that kind of stickiness.” Understanding this user behavior has inspired updates to the UI that will consider metadata like the wedding date and location to make smart, machine-learning powered, recommendations that should help our users discover even more value from the app in a way that is natural and helpful.

In Conclusion

The data revolution is in full swing. According to Deloitte, “The control of processes and systems handling, dealing with and exploiting that data is no longer a ‘nice to have’ but is now becoming a ‘must have’ to contain associated costs within reasonable limits.”

Companies with data in their DNA seem likely to experience even greater ROI if they nominate a Data Czar–someone to oversee data analytics, control the data collection, and implement management protocols. For those companies lagging behind, but ready to catch up, it may require making room at the executive table for a Chief Data Officer. And Segment is here to help you, regardless of your title.

Diana Smith on December 23rd 2015

This time of year, many of us are cooking up New Year’s resolutions to kickstart our personal growth — Hit the gym 3 times a week; be kinder; read 5 books. But what about resolutions for our work lives?

This year, we invite you to join in on our 2016 New Year’s Resolution: Clean Slate Data. Together we want to answer, “How would you redo your tracking if you could start from scratch?” and implement it.

This article outlines 7 tips for you to get started.

Here’s the problem

You go to run your end-of-year reports, and it takes you three times longer than you thought. The dropdowns in your analytics tools are stuffed with user actions you stopped caring about a while ago. A bunch of different developers added tracking at different times, so all of your events are named differently.

You have to ask three teammates what one event (Subscription Started) in your reporting tool means because there is another one (Added Credit Card) that might be telling you the same thing.

You run funnel reports with both Account Created and Signup, but they don’t match. You don’t know which one to believe.

You have a basic grip on the data model, but onboarding a new teammate next year to learn the idiosyncratic schema seems daunting.

It doesn’t have to be this way. What if your data was squeaky clean?

The promise of clean data

To us, clean data means recording just the important events, in a standard format, with a spec accessible to your whole team. With clean data, you’ll be able to:

  • Reduce time to insight with fewer events clogging your systems

  • Empower each team to answer their questions with easy to understand event naming

  • Quickly measure company and team-specific KPIs by designing tracking with these in mind

  • Make confident decisions based on your data again

To make it here, you need to ask yourself a few tough questions:

How would you set up your tracking if you started from scratch? What would you cut? What would you add? How can you make the data in end tools more accurate and easier to use?

Why now

It’s very hard to revamp your data model without breaking analyses. That’s why many people avoid the clean up and live with crufty data.

But all of your reports turn over on the first of the year. You’ve done your reporting for 2015. It’s the perfect time to get clean. Plus, you’re probably planning for 2016 goals across your org. You’ll want to update your tracking to reflect the metrics you’re focused on in the coming months.

If you think about it, the time and effort you put into the scrub down will compound down the road: Your whole team won’t only save time on analysis, but the data will be more actionable, and more accurate. You can stop second guessing your charts and start making confident decisions.

7 tips for getting started

If you’re convinced, here are 7 steps you can follow to achieve clean data.

1. Finalize, document, and save your 2015 reports. If you’re going to be moving to new tracking, it’s important you save all reports that document your performance in 2015. You’ll want to be able to look back and see how your current metrics compare to last year. Save your reports, document what each chart means, and circulate to your team to make sure you’re not missing anything.

2. Identify the problems with your current tracking. Before you start making changes, identify what’s wrong with your current setup. Do you have duplicate events you need to cut? Are your events inaccurate? Should you be sending them from a different location in your app? Does no one on the team know how they can use your data? Collect these issues, and make sure you address them with your new plan.

3. Focus on the most important metrics. Your company overall and each team likely landed on the most important metrics and goals for 2016. Identify the events you need to track to very simply calculate those.

For example, if you’re focusing on driving “paid accounts” as a company goal, make paid account a user trait, rather than forcing your team to analyze that with a bunch of events like Upgraded, Growth Plan, and Startup Plan. If your marketing team needs to know how many pieces of content and which content drives acquisition and retention, make it easy for them with a Marketing Content Viewed event that can be tied to Account Created and Account Upgraded.

Don’t add in a bunch of superfluous events that aren’t answering critical questions you have about your customers or that aren’t tied to your KPIs. Focus on the metrics that matter.

4. Create a new implementation spec, or tracking plan. This tracking plan should serve as the “source of truth” for your tracking. It identifies each event, the properties associated with it, where it should fire, and why you are collecting it. (You can download a sample tracking plan template here.)
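To make the idea concrete, here is a minimal sketch of a tracking plan held as data, with a helper that checks an event against it. The `EventSpec` fields mirror the four things the plan should record (the event, its properties, where it fires, and why); the class, the helper, and the two plan entries are hypothetical illustrations, not a Segment format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventSpec:
    name: str          # event name, following the naming convention
    properties: tuple  # property names every call must include
    fires_from: str    # "client" or "server"
    reason: str        # why the event is collected


# Hypothetical plan entries, for illustration only.
TRACKING_PLAN = {
    spec.name: spec
    for spec in (
        EventSpec("Account Created", ("plan",), "server", "Measures signups"),
        EventSpec("Viewed Landing Page", ("name", "url"), "client", "Measures content reach"),
    )
}


def check_event(name, properties):
    """Return a list of problems; an empty list means the event matches the plan."""
    spec = TRACKING_PLAN.get(name)
    if spec is None:
        return [f"'{name}' is not in the tracking plan"]
    problems = [f"missing property '{p}'" for p in spec.properties if p not in properties]
    problems += [f"unexpected property '{p}'" for p in properties if p not in spec.properties]
    return problems


if __name__ == "__main__":
    print(check_event("Viewed Landing Page", {"name": "Warehouses"}))
```

Keeping the plan in a machine-readable form like this is one option; a shared spreadsheet serves the same purpose as long as everyone can find it.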

To simplify your setup, try to group as many similar events as possible into a higher-level rollup event. For example, instead of capturing Viewed Warehouses Landing Page and Viewed Integrations Landing Page, create one Viewed Landing Page event with name and url properties.
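As a rough sketch of that rollup, the helper below folds any event named "Viewed <Something> Landing Page" into a single Viewed Landing Page event with name and url properties. The function name and the example URL are hypothetical.

```python
PREFIX = "Viewed "
SUFFIX = " Landing Page"
ROLLUP_EVENT = "Viewed Landing Page"


def roll_up(event_name, url):
    """Map 'Viewed <X> Landing Page' to the rollup event plus its properties.

    Events that do not match the pattern are returned unchanged.
    """
    if event_name.startswith(PREFIX) and event_name.endswith(SUFFIX):
        page = event_name[len(PREFIX):-len(SUFFIX)]
        if page:  # empty when the event is already the rollup event
            return ROLLUP_EVENT, {"name": page, "url": url}
    return event_name, {"url": url}


if __name__ == "__main__":
    print(roll_up("Viewed Warehouses Landing Page", "/warehouses"))
    print(roll_up("Viewed Integrations Landing Page", "/integrations"))
```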

Also, be adamant about naming conventions. Choose one way to name events, whether that’s plain text (Completed Order) or object-action (Order Completed). Make sure each event follows your preferred convention.

Once you’ve got your events all spec’d out, send the tracking document to the entire company, so they can use it to build their own analyses, funnels, and marketing segments. No more questions like, “How do I measure X?” should circulate around.

5. Build a “translator” for the old tracking to the new. As you switch over to the new schema, it will be helpful to document how old events translate into new ones. If your team was used to looking at Logins, and those will now be under App Login, you should help them easily discover changes.

To make the new schema more accessible, we suggest you create a “translator” attached to your tracking plan to help people understand how events they were used to seeing are now tracked and why you may have deleted some things. Think of this document as a “French to English” dictionary.

The translator should also help you in 6 months when you want to look back at your 2015 reports to compare key metrics. By then, you’ll probably have forgotten how you changed your schema.
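A translator can be as simple as a lookup table. The sketch below assumes three kinds of entries: renamed events, merged duplicates, and deleted events. Only the Logins to App Login rename comes from the example above; the other entries and the function name are hypothetical.

```python
# Old event name -> new event name (None means the event was dropped),
# plus a note explaining the change for anyone reading old reports.
TRANSLATOR = {
    "Logins": {"new_name": "App Login", "note": "Renamed to follow the new convention"},
    "Signup": {"new_name": "Account Created", "note": "Hypothetical: merged duplicate event"},
    "Clicked Footer Link": {"new_name": None, "note": "Hypothetical: dropped, not tied to a KPI"},
}


def translate(old_name):
    """Return the new event name, None if the event was dropped,
    or the old name unchanged if the translator has no entry for it."""
    entry = TRANSLATOR.get(old_name)
    if entry is None:
        return old_name
    return entry["new_name"]


if __name__ == "__main__":
    for old in ("Logins", "Clicked Footer Link", "Order Completed"):
        print(old, "->", translate(old))
```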

6. Work with tools that can clean up your data. There are a few different options for switching over to your new data model in end analytics tools. The least manual option (great if you don’t need to see your 2015 and 2016 data together) is to send your new tracking to a new project in your out-of-the-box tools. You can easily switch between projects for old analysis, but the new data will be squeaky clean.

Only one out-of-the-box tool that we know of, Indicative, lets you merge two events and rename them without code. (That’s a sweet feature! Other platforms, take note!)

If you’re not using Indicative and you need your historical data to match your new schema, then you can write a script to translate data from the old to the new events. This is possible for Segment customers and folks using other analytics platforms with an HTTP API. (Write a script on top of the S3 integration back into the HTTP API). We’ll admit this is a pretty manual process, so consider if just switching projects will work for your team.

If you’re using a data warehouse, your schema consolidation will be much easier. Tools like Xplenty can help you merge and clean columns, or you can use an INSERT INTO ... SELECT statement in SQL to append old event tables onto new ones.
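Appending an old event table onto a new one might look like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for a real warehouse and an INSERT INTO ... SELECT statement to copy the rows; the table and column names (`logins`, `app_login`, `user_id`, `received_at`) are hypothetical.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with one old and one new event table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE logins (user_id TEXT, received_at TEXT);
        CREATE TABLE app_login (user_id TEXT, received_at TEXT);
        INSERT INTO logins VALUES ('u1', '2015-12-30'), ('u2', '2015-12-31');
        INSERT INTO app_login VALUES ('u1', '2016-01-02');
    """)
    return conn


def append_old_events(conn):
    """Copy every row from the old table into the new one and
    return the new table's total row count."""
    conn.execute(
        "INSERT INTO app_login (user_id, received_at) "
        "SELECT user_id, received_at FROM logins"
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM app_login").fetchone()[0]


if __name__ == "__main__":
    print(append_old_events(build_demo()))
```

Listing the columns explicitly on both sides keeps the copy correct even if the two tables order their columns differently.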

7. Communicate the new process to your team. Now that you’ve scrubbed down your data, it’s important to keep it clean! The best way to do this is to host a meeting with your team where you outline your new schema, share the tracking plan, explain your naming convention, and discuss the process for tracking new events.

Who do they have to run events by? When in the product development process are events decided and implemented? (Hint: Before launch.) Write this process up for future hires.

In a number of our customers’ companies, we’re seeing the rise of a new role that we’re calling the “Data Czar.” This person is responsible for owning the schema and approving all new events that go into their apps and websites. They work with product and analytics leads to ensure each new event is necessary, follows their spec, and is being tracked from the right location (client vs. server). You might want to consider it!

We’re embarking on our own data clean up project this year, and we’ll be sharing our progress along the way. We’d also love to tell your stories. Hit us up with how you’ve been able to clean up your data, and we’d love to feature you on the blog!

Happy cleaning. 🚿

— The Segment Team

Diana Smith on December 14th 2015

If you don’t trust your data, it’s useless. This is just one of many helpful nuggets our customer Fareed Mosavat, Senior Growth PM at Instacart, shared with us this week.

If you’re not familiar, Instacart is an awesome service for delivering groceries and is leading the on-demand economy with crazy growth.

We took some time with Fareed to discuss challenges that face most data-driven product and growth teams. To Fareed, understanding user behavior is essential for devising, running, and interpreting experiments. And, making data easily accessible in a raw format is the only way the company can measure and achieve their specific goals.

In the Q&A, Fareed covers:

  • How the growth team operates at Instacart

  • When out-of-the-box analytics tools work, and when they fall short

  • Why row-level event data is vital for data confidence and advanced analysis

  • How the team discovered that building a data pipeline to Redshift is harder than it looks

  • Why combining data sources into a single database is the “Holy Grail”

Dive in to the interview below, or get the PDF.

Growth Team at Instacart

Diana: Fareed! Thanks for chatting with us today. Why don’t we start with you telling us a little bit about your role and responsibilities at Instacart.

Fareed: Sure, Diana! I’m the growth product manager. Our team is responsible for consumer growth, so growing our user base, retaining them and activating them through the whole funnel.

Diana: Not a small task! Are you a part of the product team?

Fareed: Yes, it’s a multidisciplinary team, but it’s a product team. We’ve got a designer, a couple engineers and myself. And then we work closely with a bunch of other teams, like analytics and marketing.

Segment loves Instacart

Diana: What’s your main focus right now on the growth team?

Fareed: Number one is just making sure that we have everything that our users are doing recorded, measured, and in a good place. Having our data in order is the only way we can make good product decisions, and Segment helps a lot with that.

Then the second thing we’re really focused on is first-time user activation. So, figuring out what are people doing in their first session, why are they dropping off, when are they dropping off, how can we help them get to their first order.

We know there’s a lot of real-world stuff that happens after their first experience, like quality of service and fulfillment, that are sort of outside of our team’s control. So we’re focusing on getting people that first wonderful experience as quickly as possible, working across the mobile apps and the website.

Switching to SQL

Diana: So I know you’re using Segment Warehouses, which loads your user data into Amazon Redshift. Why did you want to do this type of analysis in SQL compared to something like Google Analytics or Mixpanel?

Fareed: Yeah, so we’ve been using Amplitude through Segment for a bunch of stuff, and it’s super helpful. Amplitude helped us look at aggregate numbers, counts, funnels, analyze user segments, and understand how many people are taking certain actions in our product.

But I think there are a couple other reasons why the row-level individual event data, whether it be SQL or somewhere else, is super important.

One is that SQL makes tracking easier to debug. I can watch events fly by in the Segment debugger, but if you have like a lot of data, it’s hard to catch everything. We have very specific taxonomy, and rules, and event names, and everything that need to be correct. With SQL, we can easily diagnose issues like when we forget to pass important traits like userID.

The second is we have a couple of self-defined metrics that are important and specific to the company. And those things tend to be buried in a database somewhere, usually in SQL and sometimes outside of this event data. Being able to merge those metrics and that analysis with our event analytics is really important.
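As an illustration of the debugging check Fareed describes, the sketch below counts events that arrived without a user ID. It uses Python's built-in sqlite3 module as a stand-in for Redshift; the `tracks` table, its columns, and the sample rows are assumptions made for this example.

```python
import sqlite3

# Count events with a missing user ID, grouped by event name.
MISSING_USER_ID_SQL = """
    SELECT event, COUNT(*) AS missing
    FROM tracks
    WHERE user_id IS NULL OR user_id = ''
    GROUP BY event
    ORDER BY missing DESC, event
"""


def build_demo():
    """Create an in-memory table with a few hypothetical events."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tracks (event TEXT, user_id TEXT);
        INSERT INTO tracks VALUES
            ('Order Completed', 'u1'),
            ('Order Completed', NULL),
            ('Item Added', ''),
            ('Item Added', NULL),
            ('Item Added', 'u2');
    """)
    return conn


def events_missing_user_id(conn):
    """Return (event, count) pairs for events sent without a user ID."""
    return conn.execute(MISSING_USER_ID_SQL).fetchall()


if __name__ == "__main__":
    print(events_missing_user_id(build_demo()))
```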

The Holy Grail: One Database. All of the Data.

Diana: What are some of those metrics?

Fareed: Things like quality of service, refunds, consumer support, how much people spend.

We have a lot of steps in our funnel for each order, some of which are online and some of which are not. They are all recorded somewhere, but they happen in different places in the process. The Holy Grail we’re working towards is getting all of the data into one place.

Combining this fulfillment and shipping data with our Segment user behavior data is key for us to connect the dots across the entire customer experience.

With Segment Warehouses, we can put all of the data about how customers are using our apps and websites right into our own Redshift alongside this other data to query as we want.

Diana: What are some of the questions that you’re querying across your Segment data and the internal data to answer?

Fareed: The big one is AB testing and understanding the behavior of users in one test group versus another. We’re currently working towards removing signup from the onboarding flow and making it part of checkout. There isn’t a clear event there, so we need to be able to watch an anonymous user from beginning to end and see their conversion.

We’re measuring what percentage of users that have zero orders, whether they check out on the same day or within seven days, and we want to be able to use any window we want. Those kinds of things are a little bit easier to just define in a set of rules in SQL than it is to try and manipulate a UI to give us exactly what we want.

The second is defining metrics specific to Instacart. Let’s take visitors for example. We have different definitions for visitors: landing page visitors, storefront visitors, and visitors per region. You might be able to figure this out in out-of-the-box tools, but to really trust it you have to know exactly how it was defined. While these out-of-the-box options are great for quick analysis, they tend to be a little bit opaque for understanding session measurement, new vs. returning users, and stuff like that.

Third is making sense of multi-touch attribution. We find that users will visit a couple times before they actually place an order. Previously, we were only attributing that to the last click. But now, because we have all the data, we’re actually able to do a longer attribution cycle and understand how many touches a user had before they actually convert. It’s already been really helpful, but it will become even more important over time as we understand what our marketing ROI looks like and can maximize results from our spends.
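To show the difference between last-click and multi-touch views of the same data, here is a minimal sketch that collects every touch a user had before their first order. It is not Instacart's method; the event tuple layout, the source labels, and the function name are hypothetical.

```python
from collections import defaultdict


def attribution_paths(events):
    """Return {user_id: [sources of visits before the first order]}.

    `events` is an iterable of (user_id, timestamp, kind, source) tuples,
    where kind is "visit" or "order" and timestamps sort chronologically.
    Users who never ordered are left out.
    """
    by_user = defaultdict(list)
    for user_id, _timestamp, kind, source in sorted(events, key=lambda e: (e[0], e[1])):
        by_user[user_id].append((kind, source))

    paths = {}
    for user_id, rows in by_user.items():
        touches = []
        for kind, source in rows:
            if kind == "order":
                paths[user_id] = touches  # stop at the first order
                break
            touches.append(source)
    return paths


if __name__ == "__main__":
    events = [
        ("u1", "2015-12-01", "visit", "email"),
        ("u1", "2015-12-03", "visit", "search"),
        ("u1", "2015-12-04", "order", None),
        ("u2", "2015-12-02", "visit", "ad"),
    ]
    for user, path in attribution_paths(events).items():
        # Last-click credits only path[-1]; multi-touch sees the whole path.
        print(user, "touches:", len(path), "last click:", path[-1] if path else None)
```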

On Building a Data Pipeline

Diana: I heard you were building your own Redshift pipeline for a bit before you chose to go with Warehouses. What made you make the switch?

Fareed: We’ve tried a lot of different things here. I think the biggest reason we use Segment is because it gives us the most portability, so we can use whatever services we want and still be able to like keep our instrumentation clean and in a single place. So, we have used a plethora of things.

Our team has played around with S3. The shopper team is using Outbound, we’re using Amplitude, we’ve tried Mixpanel, Google Analytics, tag manager, etc. We try all this stuff, and it’s just flipping switches.

So with Redshift, it was another thing like that where we said to ourselves, “We can bake it off against our own internal system or just turn it on with Segment.” Our main goal was finding something sustainable long term without sacrificing the costs of getting up and running quickly.

Going from schema-less data to like a schema style SQL database turns out to be harder than it looks. With a team of engineers and number of months, we could build something, but it would take a long time and a lot of work to be as fully featured or scalable as Segment Warehouses.

Building data pipelines is not our primary job. But we do need the data in SQL. Luckily, other people have already solved this problem for us.

Plus, it was really important for this data to be in our own Redshift, which Segment could do for us. At the end of the day this is our data, right? Our users took these actions, they did their thing, and it’s important that no matter what we choose from a vendor standpoint that we own this data and that it exists with us.

Thanks so much to Fareed for chatting with us about the growth team at Instacart, getting your data squeaky clean and all into one place to analyze.

If you’re curious about how Segment Warehouses can help you load your web and mobile data into Redshift or Postgres without writing a line of ingestion code, you can learn more here!

Stephen Levin on November 23rd 2015

When your analytics questions run into the edges of out-of-the-box tools, it’s probably time for you to choose a database for analytics. It’s not a good idea to write scripts to query your production database, because you could reorder the data and likely slow down your app. You might also accidentally delete important info if you have data analysts or engineers poking around in there.

You need a separate kind of database for analysis. But which one is right?

In this post, we’ll go over suggestions and best practices for the average company that’s just getting started. Whichever setup you choose, you can make tradeoffs along the way to improve the performance from what we discuss here.

Working with lots of customers to get their DB up and running, we’ve found that the most important criteria to consider are:

  • the type of data you’re analyzing

  • how much of that data you have

  • your engineering team focus

  • how quickly you need it

What is an analytics database?

An analytics database, also called an analytical database, is a data management platform that stores and organizes data for the purpose of business intelligence and analytics. Analytics databases are read-only systems that specialize in quickly returning queries and are more easily scalable. They are typically part of a broader data warehouse.

What types of data are you analyzing?

Think about the data you want to analyze. Does it fit nicely into rows and columns, like a ginormous Excel spreadsheet? Or would it make more sense if you dumped it into a Word Doc?

If you answered Excel, a relational database like Postgres, MySQL, Amazon Redshift or BigQuery will fit your needs. These structured, relational databases are great when you know exactly what kind of data you’re going to receive and how it links together — basically how rows and columns relate. For most types of analytics for customer engagement, relational databases work well. User traits like names, emails, and billing plans fit nicely into a table as do user events and their properties.

On the other hand, if your data fits better on a sheet of paper, you should look into a non-relational (NoSQL) database like Hadoop or Mongo.

Non-relational databases excel with extremely large numbers (think millions) of semi-structured data points. Classic examples of semi-structured data are texts like email, books, and social media, audio/visual data, and geographical data. If you’re doing a large amount of text mining, language processing, or image processing, you will likely need to use non-relational data stores.

How much data are you dealing with?

The next question to ask yourself is how much data you’re dealing with. If you're dealing with large volumes of data, then it's more helpful to have a non-relational database because it won’t impose constraints on incoming data, allowing you to write faster and with scalability in mind.

Here’s a handy chart to help you figure out which option is right for you.

These aren’t strict limitations and each can handle more or less data depending on various factors — but we’ve found each to excel within these bounds.

If you’re under 1 TB of data, Postgres will give you a good price to performance ratio. But, it slows down around 6 TB. If you like MySQL but need a little more scale, Aurora (Amazon’s proprietary version) can go up to 64 TB. For petabyte scale, Amazon Redshift is usually a good bet since it’s optimized for running analytics up to 2PB. For parallel processing or even MOAR data, it’s likely time to look into Hadoop.

That said, AWS has told us they run Amazon.com on Redshift, so if you’ve got a top-notch team of DBAs you may be able to scale beyond the 2PB “limit.”
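The rough size bands above can be written down as a small lookup. This is only a sketch of the rule of thumb in this section, assuming decimal units (1 PB = 1,000 TB); the function name is hypothetical and the cutoffs are the soft bounds quoted here, not hard limits.

```python
def suggest_database(size_tb):
    """Suggest a database for a data set of `size_tb` terabytes,
    using the approximate bounds quoted in this article."""
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    if size_tb <= 6:        # Postgres slows down around 6 TB
        return "Postgres"
    if size_tb <= 64:       # Aurora can go up to 64 TB
        return "Aurora"
    if size_tb <= 2 * 1000: # Redshift is optimized for up to 2 PB
        return "Amazon Redshift"
    return "Hadoop"


if __name__ == "__main__":
    for size in (0.5, 30, 500, 5000):
        print(size, "TB ->", suggest_database(size))
```

Data volume is only one of the criteria in this post, so treat the result as a starting point rather than a decision.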

What is your engineering team focused on?

This is another important question to ask yourself in the database discussion. The smaller your overall team, the more likely it is that you’ll need your engineers focusing mostly on building product rather than database pipelines and management. The number of folks you can devote to these projects will greatly affect your options.

With some engineering resources you have more choices — you can go either to a relational or non-relational database. Relational DBs take less time to manage than NoSQL.

If you have some engineers to work on the setup, but can’t put anyone on maintenance, choosing something like Postgres, Google SQL (a hosted MySQL option) or Segment Warehouses (a hosted Redshift) is likely a better option than Redshift, Aurora or BigQuery, since those require occasional data pipeline fixes. With more time for maintenance, choosing Redshift or BigQuery will give you faster queries at scale.

Side bar: You can use Segment to collect customer data from anywhere and send it to your data warehouse of choice. See how it works here 👉

Relational databases come with another advantage: you can use SQL to query them. SQL is well-known among analysts and engineers alike, and it’s easier to learn than most programming languages.

On the other hand, running analytics on semi-structured data generally requires, at a minimum, an object-oriented programming background, or better, a code-heavy data science background. Even with the very recent emergence of analytics tools like Hunk for Hadoop, or Slamdata for MongoDB, analyzing these types of data sets will require an advanced analyst or data scientist.

How quickly do you need that data?

While “real-time analytics” is all the rage for use cases like fraud detection and system monitoring, most analyses don’t require real-time data or immediate insights.

When you’re answering questions like what is causing users to churn or how people are moving from your app to your website, accessing your data sources with a slight lag (hourly or daily intervals) is fine. Your data doesn’t change THAT much minute-by-minute.

Therefore, if you’re mostly working on after-the-fact analysis, you should go for a database that is optimized for analytics like Redshift or BigQuery. These kinds of databases are designed under the hood to accommodate a large amount of data and to quickly read and join data, making queries fast. They can also load data reasonably fast (hourly) as long as you have someone vacuuming, resizing, and monitoring the cluster.

If you absolutely need real-time data, you should look at an unstructured database like Hadoop. You can design your Hadoop database to load very quickly, though queries may take longer at scale depending on RAM usage, available disk space, and how you structure the data.

Postgres vs. Amazon Redshift vs. Google BigQuery

You’ve probably figured out by now that for most types of user behavior analysis, a relational database is going to be your best bet. Information about how your users interact with your site and apps can easily fit into a structured format.

analytics.track('Completed Order') — select * from ios.completed_order
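The pairing above suggests a simple mapping from an event name to a table name. The sketch below assumes the table is the lowercased event name with non-alphanumeric runs replaced by underscores, inside a schema named after the source (`ios` here). That rule is inferred from this one example, so check your warehouse's actual naming before relying on it; the function name is hypothetical.

```python
import re


def event_to_table(event_name, schema="ios"):
    """Return the assumed schema-qualified table name for an event."""
    slug = re.sub(r"[^a-z0-9]+", "_", event_name.lower()).strip("_")
    return f"{schema}.{slug}"


def select_all(event_name, schema="ios"):
    """Build the query that reads every row recorded for an event."""
    return f"select * from {event_to_table(event_name, schema)}"


if __name__ == "__main__":
    print(select_all("Completed Order"))
```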

So now the question is, which SQL database to use? There are four criteria to consider.

Scale vs. Speed

When you need speed, consider Postgres: Under 1TB, Postgres is quite fast for loading and querying. Plus, it’s affordable. As you get closer to its limit of 6TB (inherited from Amazon RDS), your queries will slow down.

That’s why when you need scale, we usually recommend you check out Redshift. In our experience we’ve found Redshift to have the best cost to value ratio.

Flavor of SQL

Redshift is built on a variation of Postgres, and both support good ol’ SQL. Redshift doesn’t support every single data type and function that Postgres does, but it’s much closer to industry standard than BigQuery, which has its own flavor of SQL.

Unlike many other SQL-based systems, BigQuery uses the comma syntax to indicate table unions, not joins, according to their docs. This means that if you aren’t careful, regular SQL queries might error out or produce unexpected results. Therefore, many teams we’ve met have trouble convincing their analysts to learn BigQuery’s SQL.
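To see why the difference matters, the sketch below runs both interpretations of the comma on the same two tables. It uses Python's built-in sqlite3 module, where a comma between tables means a cross join, and writes the union interpretation explicitly with UNION ALL. It does not run against BigQuery; the tables `a` and `b` are hypothetical.

```python
import sqlite3


def row_counts():
    """Return (rows from a comma join, rows from a union) for two small tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE a (id INTEGER);
        CREATE TABLE b (id INTEGER);
        INSERT INTO a VALUES (1), (2);
        INSERT INTO b VALUES (3), (4), (5);
    """)
    # Comma as a join: every row of a paired with every row of b.
    comma_rows = conn.execute("SELECT COUNT(*) FROM a, b").fetchone()[0]
    # Comma as a union: the rows of a followed by the rows of b.
    union_rows = conn.execute(
        "SELECT COUNT(*) FROM (SELECT id FROM a UNION ALL SELECT id FROM b)"
    ).fetchone()[0]
    return comma_rows, union_rows


if __name__ == "__main__":
    print(row_counts())
```

With two rows in one table and three in the other, the join yields six rows and the union yields five, so the same query text can return different results depending on which rule the database applies.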

Third-party Ecosystem

Rarely does your data warehouse live on its own. You need to get the data into the database, and you need to use some sort of software on top for data analysis. (Unless you’re a run-SQL-from-the-command-line kind of gal.)

That’s why folks often like that Redshift has a very large ecosystem of third-party tools. AWS has options like Segment Data Warehouses to load data into Redshift from an analytics API, and they also work with nearly every data visualization tool on the market. Fewer third-party services connect with Google, so pushing the same data into BigQuery may require more engineering time, and you won’t have as many options for BI software.

You can see Amazon’s partners here, and Google’s here.

That said, if you already use Google Cloud Storage instead of Amazon S3, you may benefit from staying in the Google ecosystem. Both services make loading data easiest if it already exists in their respective cloud storage repository, so while it won’t be a deal breaker either way, it’s definitely easier to stay with the provider you already use.

Getting Set Up

Now that you have a better idea of what database to use, the next step is figuring out how you’re going to get your data into the database in the first place.

Many people who are new to database design underestimate just how hard it is to build a scalable data pipeline. You have to write your own extraction layer, data collection API, and queuing and transformation layers. Each has to scale. Plus, you need to figure out the right schema down to the size and type of each column. The MVP is replicating your production database in a new instance, but that usually means going with a database that’s not optimized for analytics.

Luckily, there are a few options on the market that can help bypass some of these hurdles and automatically do the ETL for you.

But whether you build or buy, getting data into SQL is worth it.

Only with your raw user data in a flexible, SQL format can you answer granular questions about what your customers are doing, accurately measure attribution, understand cross-platform behavior, build company-specific dashboards, and more.

Segment can help!

You can use Segment to collect user data and send it to data warehouses like Redshift, Snowflake, BigQuery, and more, all in real time and with our simple, powerful analytics API. Get started here 👉

Andy Jiang on November 3rd 2015

Today, we’re excited to share Analytics Academy, a helpful guide to becoming an analytics expert. You’ll start with the basics of how to think about analytics and level up to topics like when to consider SQL, and more!

The Academy is chock full of best practices we’ve learned by working with thousands of companies on their analytics setup, tool stack, and internal infrastructure.

You can sign up (via email or Slack) for the intro course today! Each week we’ll send a lesson to you on topics including proper analytics implementation, maintaining clean and consistent data, which tools are right for you, and how to leverage your raw data.

Why now?

You may have read Analytics Academy from Segment a few years back. But since then, a few important things have changed.

  • It’s easier than ever to start collecting data and using analytics tools. What’s hard is narrowing your focus and making the data in your tools actually useful.

  • The types of things you can do with your data and the landscape of tools to manage those projects have exploded.

  • On the Segment side, we’ve developed a stronger set of best practices by working with thousands of customers on their analytics setup, preferred stack, and evolving set of data challenges.

And now, we’re bringing these learnings to you!

The new content focuses on a single narrative of analytics—how to use it to improve your product, best practices in data collection, frameworks for selecting a set of tools, and ways to cultivate a data-forward organization.

We hope you find the Academy educational and helpful. If you have any topics you’d like us to cover or ideas on the content, we’d love to hear them! Tweet us your thoughts @Segment!

Diana Smith on October 29th 2015

As mobile user acquisition costs skyrocket, push notifications are becoming a key strategy to keep your users engaged past the install. However, choosing a push notification tool can be a job of its own.

In this blog post, we’ll provide a framework for evaluating push notification tools, cover the top players in the market, and discuss which tools are best for you based on your role, company size, and objective.

Developing your criteria

Before you embark on your journey to find an awesome push tool, you have to ask yourself two questions.

1. What types of messages are you sending?

The first question to answer is what types of messages you want to send. There are two types of push notifications your messages can fall under. Similar to the email world, the first (and admittedly less sophisticated) type of message is “batch and blast” or what is called Mechanical push. You’ll send these messages when users opt in to notifications for specific types of topics, like Breaking News or San Francisco 49ers score updates.

Also included in the mechanical category are immediate, event-triggered notifications. For example, you might receive a message from Venmo that “Emily just paid you $23 for Ramen,” or from Visa letting you know that “Your credit card was just charged”. The key to understanding Mechanical push is that these are immediate, list-based, or event-triggered notifications.

If Mechanical messages are the only types of notifications you want to send and you’re looking for other developer services like crash reporting, going with Parse is a good bet for a team of developers. It’s the Mailchimp equivalent for the mobile world: lightweight and easy to get started with.

The second type of notification is Behavioral: you send a personalized message based on someone’s past activity in your app or on real-time information like location. These messages are often super personalized and deep link into a particular, contextual page of your app. For example, Steve Madden sends you a message that those boots you were browsing last week are now available in your size. When you tap in, the message takes you right to the product page for the shoe with your size checked.

Behavioral messages, in both the email and the app world, perform much better, and it’s not rocket science to figure out why! The messages are highly targeted, timely, and personalized. And basically every provider that does behavioral messages also supports mechanical. They usually compete on scale, reliability, and functionality of behavioral triggers.

There are only a few reasons you might not want to use Behavioral notifications: your app is very transactional, or you’re resource constrained. Perhaps you don’t have the technical help to track what your customers are doing in your app, or the time to set up these campaigns.
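The difference between the two message types is easier to see side by side. The sketch below is a toy model, not any vendor's API: the `User` class, both functions, and the `app://` deep-link format are hypothetical names invented for this example. A mechanical push goes to everyone on an opt-in list, while a behavioral push is built per user from what that user actually did.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    topics: set    # topics the user opted in to (drives mechanical push)
    browsed: dict  # product -> size the user looked at (drives behavioral push)


def mechanical_audience(users, topic):
    """List-based push: everyone subscribed to the topic gets the same text."""
    return [(u.name, f"New update: {topic}") for u in users if topic in u.topics]


def behavioral_audience(users, product, sizes_in_stock):
    """Behavioral push: only users who browsed the product in a size that is
    now in stock, each with a personalized message and a deep link."""
    messages = []
    for u in users:
        size = u.browsed.get(product)
        if size is not None and size in sizes_in_stock:
            text = f"Back in stock: {product} in size {size}"
            link = f"app://product/{product}?size={size}"  # hypothetical scheme
            messages.append((u.name, text, link))
    return messages


users = [
    User("Ana", {"49ers"}, {"boots": 8}),
    User("Ben", {"news"}, {"boots": 11}),
    User("Cy", {"49ers", "news"}, {}),
]

print(mechanical_audience(users, "49ers"))
print(behavioral_audience(users, "boots", {8, 9}))
```

Note that the behavioral version needs per-user activity data (`browsed`) to exist in the first place, which is the tracking work mentioned above.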

2. Do you want an all-in-one or focused solution?

The next decision you’ll have to make is if you want a tool that focuses completely on messaging, or if you want something that also offers product analytics and in-app a/b testing.

There are pros and cons to each.

Going with a tool that’s focused on messaging means that the team you’re working with is fully invested in making that experience amazing. Most tools in the messaging category have full-featured options for sending personalized messages, reporting on campaigns, a/b testing copy, and delivering notifications at the right time for each user. Many of the players in the space who started in push are starting to offer email, so look out for that if you’re interested. Kahuna and Appboy are the top players here.

That said, Outbound and Iterable are also good options to check out if you’re looking for a cross-platform tool covering push, email, and SMS. These platforms subscribe to the belief that you should first think about where your customers are getting stuck in your product and then use the right channel to send them a message if they don’t take the next step.

Another type of tool is what we call the “all-in-one” mobile marketing and product suites. They offer push and email services, but also do in-app a/b testing and funnel analytics. If SDK bloat is a big issue for you, these types of tools, including Mixpanel, Urban Airship, Localytics, and Leanplum, might be a better choice.

In terms of pricing, most tools will let you send up to 1,000,000 messages or communicate with 10,000 users for free or on a trial. Then the cost levels up or you have to call to get enterprise pricing. For many of the all-in-one solutions, the push notification and other marketing features are priced as an add-on to the core analytics offering.

Now that you have an idea of your options, let’s get into the nitty gritty of what each of the popular services offer.

The pull of push tools


Kahuna

Kahuna focuses largely on behavioral push notifications. If you are a marketer in the ecommerce space and mobile drives most of your business, Kahuna is a great tool for you. Most of their features help you deliver hyper-personalized communication, from pre-crunching user segments with machine learning to offering delivery at the right time based on a user’s past engagement. Because this segmentation is useful elsewhere, they have also recently added email and Facebook audience targeting capabilities. Kahuna is a great option if you are a part of a larger company (no self-service plan) with significant downloads and are working on engagement and retention. Their features include:

  • Dynamic deep linking that takes users to a unique place in the app based on their past behavior.

  • A “ghost push” feature that tracks which push notifications, in any campaign, cause people to uninstall or opt out, which helps you find the line between helpful and spammy.

  • The option to use their pre-built segments of new, dormant, active, and inactive users or create custom segments based on an unlimited number of events and attributes.

  • A-E testing and message send time optimization to make sure the highest performing message hits your audience at the right time for each individual.

Kahuna is great for

  • Role: Marketers and Product Managers

  • Customer Company Size: 200-10,000

  • Monthly Active Users: 50,000-25,000,000

  • Objective: Drive customer lifetime value through repeat purchases.

  • Industry: Commerce, Media, Travel

Urban Airship

Urban Airship has over 30,000 apps using its service, from startups to large organizations like Walgreens, ABC News, Alaska Airlines, and Airbnb. Helpful for both developer and marketing teams, they offer well-documented APIs for mechanical messaging as well as behavioral targeting capabilities that allow marketers to define segments and deliver highly relevant, real-time messages. Urban Airship offers numerous messaging types such as interactive push notifications, in-app messages, an in-app inbox (message center), and mobile wallet. It also offers audience intelligence, funnel app analytics, a/b testing, and a user-level data streaming service to power behavioral messaging in other channels like email, ad platforms, and more.

Urban Airship offers:

  • A full suite of mobile messaging and content publishing tools including interactive push notifications, in-app messages, rich landing pages, mobile wallet, and support for rich messaging (images and video).

  • Easy to use campaign tools and templates that allow you to add a deep-link, develop a landing page, social share, and define segment attributes for any message.

  • Segmentation tools that allow you to personalize your messaging based on in-app behaviors, location (triggers and history), preference center, and app events, as well as cross-channel data from a CRM or other system.

  • Audience intelligence to identify user-level trends for future campaigns. App analytics are also provided with funnel analysis, conversion reporting, cohort analysis and RFM reporting.

  • A mobile data streaming service that allows you to send user-level information to business systems for omni-channel behavioral targeting and analysis.

Urban Airship is great for

  • Role: Developers, Product Owners and Marketers

  • Customer Company Size: 100-10,000

  • Monthly Active Users: 1,000-50,000,000

  • Objective: Grow and retain your mobile audience.

  • Industry: Retail, Travel, Media, Financial Services


Parse

Parse, recently acquired by Facebook, is a developer-friendly platform for push notifications and analytics. If you’re concerned about the health of your app, want to send basic notifications, and price is a factor, Parse is a good option. Beyond basic push features, they will also help you monitor and investigate bugs and crash issues. With Parse, you can:

  • Segment audiences based on age, location, and language, and schedule messages in advance.

  • Preview notifications exactly as they will appear across the devices you target.

  • Send notifications via the web portal, REST API, or client SDKs.

  • Monitor the effectiveness of your push campaigns with open rate analytics.

  • Confidently evaluate your push messaging ideas with A/B tests to create the most effective, engaging notifications for your app.

Parse is great for

  • Role: Developers

  • Customer Company Size: 25-2,000

  • Monthly Active Users: 500,000-20,000,000

  • Objective: Send basic notifications to customers and understand what is making your app crash.

  • Industry: Gaming, Education, Entertainment


Appboy

Appboy is a communication platform built for mobile-first marketers. If you have a mobile product, but email and in-app notifications sometimes make sense, Appboy has a pretty comprehensive set of features for tackling these channels. Beyond the basics of event-triggered notifications and batch “product update” style in-app messages, Appboy will help you target messages based on a variety of customer attributes. They do very well in the messaging and media industries. Using Appboy, you can:

  • Target content based on detailed user profiles and past in-app behavior, such as the frequency of app use and number of in-app purchases.

  • Set up user action goals for each message and measure conversions.

  • Create groups of users based on their location to send geo-located messages.

  • Send messages based on when each user will most likely engage with your app.

  • Automatically insert details like nearby movie times, customized recommendations and proprietary content directly into messages.

Appboy is great for

  • Role: Mobile Marketers

  • Customer Company Size: 100-1,000

  • Monthly Active Users: 50,000+

  • Objective: Drive mobile-centric customer engagement, retention and advocacy.

  • Industry: Commerce, Media, Messaging


Outbound

Outbound is a newer tool that’s great for smaller companies and startups in the on-demand economy. If you’re laser-focused on getting folks through your funnel, Outbound makes it easy to send messages based on what a user has and hasn’t done in your app. If you have a web-first product with complementary mobile apps, Outbound will help you communicate across email, SMS, and push. You can easily customize notifications based on the device each user prefers. Outbound helps you:

  • Set up automated, trigger-based campaigns based on what users do in your app for email, push, and SMS.

  • Send broadcast campaigns with a one-time message to a segment of your users based on their history.

  • Track any action your users take in your website or app and immediately trigger an event, without creating a segment in advance.

  • Automatically update user info — like new email address — with the Outbound API

  • A/B test if email, push, or SMS performs better for a particular message

Outbound is great for

  • Role: Growth Marketers

  • Customer Company Size: 5–200

  • Monthly Active Users: 50,000-1,000,000

  • Objective: Activate users for a cross-platform product.

  • Industry: On-demand, Financial Tech, Marketplaces


Iterable

Iterable started as a marketing automation platform and now also offers push notifications and SMS. It’s best if you’re working with a consumer audience, want to send messages across web and mobile channels, and use both mechanical and behavioral messages in your campaigns. They have a visual workflow for designing lifecycle and engagement campaigns, which makes it easier to map out your messages.

Iterable is great for

  • Role: B2C Marketers

  • Customer Company Size: 50 - 5,000

  • Monthly Active Users: 20,000 - 20,000,000

  • Objective: Send multi-channel messages for both blast and behavioral campaigns

  • Industry: Commerce, Education, Consumer


Leanplum

Leanplum started in mobile a/b testing but has moved into other parts of mobile optimization, including analytics and push notifications. They are best for mobile-first companies that have over 100,000 MAUs and are looking to measure how push notifications affect key performance indicators down funnel. They offer a bunch of helpful features for behavioral push notifications and enable you to a/b test mobile designs without resubmitting to the App Store. With Leanplum, you can:

  • Trigger personalized messages based on user attributes and specific in-app behaviors.

  • Programmatically tailor the user interface to individual users, for example, switch the default share option to a user’s favorite social platform.

  • Insert custom values within variables and messages to make them feel very personalized.

  • Send messages at the optimal time, based on each individual user’s past usage pattern.

Leanplum is great for

  • Role: Marketers

  • Customer Company Size: 200-1,000

  • Monthly Active Users: 100,000-10,000,000

  • Objective: Drive retention and engagement with event-driven push notifications.

  • Industry: Travel, Media, Retail


Localytics

Localytics is tailored for companies that plan their communications based on how people are interacting in their app. It’s best for commerce, entertainment, and media apps. In Localytics, the push and marketing features are very connected to their analytics insights, meaning that they encourage you to message users who are dropping off along your funnel. They have attribution, cohort, and funnel reporting to help you understand those drop-offs, as well as email and push message options. Localytics helps you:

  • Target campaigns to an existing segment of users, or create an ad hoc filter for each campaign.

  • Schedule campaigns to go out immediately, at a single point in the future, or set up an automated campaign that sends on a recurring, scheduled basis to newly qualified users.

  • Test up to 5 message variants for each campaign and see what wins.

  • Track impressions, clicks, and conversions over time for each campaign, and set up funnels to see how users from each campaign engage over time.

Localytics is great for

  • Role: Growth and Enterprise Marketers

  • Customer Company Size: 250-500

  • Monthly Active Users: 50,000-50,000,000

  • Objective: Drive engagement across the entire user lifecycle.

  • Industry: Commerce, Entertainment, Media


Mixpanel

Mixpanel is best known for delivering super helpful in-app event analytics. Their funnel analysis, engagement and retention reports are polished and easy to use. Over the past few years, they’ve expanded their scope to include email and push messaging as well as mobile a/b testing. If you’re a product manager who already really enjoys Mixpanel’s analytics offering, using them for basic engagement and retention projects could save you some time. Here’s a look at their push features:

  • Send emails, push notifications, in-app notifications, or SMS text messages based on which platforms people prefer for using your product.

  • Set your messages to go out at an optimal time, and Mixpanel will tailor that to each user’s timezone.

  • Experiment with your notifications by a/b testing the subject in an email or the message in a push notification.

  • See how your messages perform against conversion events you’re already sending to Mixpanel.

Mixpanel is great for

  • Role: Product Managers

  • Customer Company Size: 5–500

  • Monthly Active Users: 1,000-1,500,000

  • Objective: Improve funnel conversions with personalized notifications.

  • Industry: Consumer Apps, Social, Media

Bundling and Unbundling

We hope this deep dive gave you a better idea of which push tool is best for you!

As you can see from this list, many of the push platforms are trending toward bundling analytics and marketing “jobs to be done” into a single solution. However, it will be interesting to see if this continues.

In the history of web analytics and email tools, we’ve seen single point solutions eat away market share from monolithic suites, and we predict this trend will continue in mobile. The biggest factor to watch is how technology to reduce SDK bloat evolves.


If you’re interested in exploring these push tools, you might want to look into Segment. We make it easier for you to try push notification, analytics, and optimization services. Instead of integrating each SDK one by one, you can collect customer interaction data with our API, integrate one SDK, and then flip a switch to integrate new tools. (No submitting to the app store!)
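The "integrate once, then flip a switch" idea is a fan-out pattern, and it can be sketched in a few lines. To be clear, this is not Segment's API or implementation: `EventRouter`, its methods, and the tool names `push_tool` and `analytics_tool` are all hypothetical, chosen only to show why adding a tool does not require touching the code that collects events.

```python
class EventRouter:
    """Toy fan-out: collect an event once, forward it to every enabled tool."""

    def __init__(self):
        self._handlers = {}    # tool name -> callable(event, properties)
        self._enabled = set()  # tool names currently switched on

    def register(self, name, handler):
        self._handlers[name] = handler

    def enable(self, name):
        if name not in self._handlers:
            raise KeyError(f"unknown tool: {name}")
        self._enabled.add(name)

    def disable(self, name):
        self._enabled.discard(name)

    def collect(self, event, properties=None):
        """Forward one event to all enabled tools; return the tools reached."""
        delivered = []
        for name in sorted(self._enabled):
            self._handlers[name](event, properties or {})
            delivered.append(name)
        return delivered


received = []
router = EventRouter()
router.register("push_tool", lambda e, p: received.append(("push_tool", e)))
router.register("analytics_tool", lambda e, p: received.append(("analytics_tool", e)))

router.enable("push_tool")
first = router.collect("Completed Order", {"total": 23})

router.enable("analytics_tool")  # the "switch": the collecting call is unchanged
second = router.collect("Completed Order", {"total": 23})

print(first)   # ['push_tool']
print(second)  # ['analytics_tool', 'push_tool']
```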

We currently offer most of these push tools on our platform. You can check out our full list of integrations here, and request support for new tools here.
