A Simple Guide to Running & Optimizing A/B Tests
An introductory guide to building and running A/B tests for greater audience insight.
A/B testing is an experimentation method where you test two versions of the same variable on a subset of customers. In marketing, you run A/B tests to determine if you can influence customer behavior and improve campaign performance by changing a particular element in your marketing materials, UX, or product.
You run A/B tests to influence user behavior at touchpoints that are crucial for moving customers down the funnel. Common examples of A/B testing goals include:
Increasing email open rates
Improving ROI from paid ads
Increasing conversions
Increasing account creations/free trial signups
Decreasing landing page bounce rates
Reducing customer churn
These are broad examples, but in practice, these goals should be tied to specific metrics to track performance. For example, if you were focusing on improving the ROI of paid ads, you would track CTA clicks, lead quality, and subsequent conversions (as a few examples) to understand if these advertisements are resonating with the right audience (on the right channels).
The way you design an A/B test is crucial for getting valid results. As a general outline, make sure you have a clear goal tied to each test, that you have a large enough sample size, and that the test runs for the necessary amount of time before considering the results valid. (Here’s a quick refresher on important A/B testing terms.)
Say you email a coupon to 2,000 customers and test two types of offers:
Version A: “Get 10% off your next purchase”
Version B: “Get 10% off on any item”
5% of recipients click on Version A, while 8% click on Version B. If you’re optimizing for click-through rate, Version B wins.
Or if you wanted to measure which coupon leads to more sales, you would look at the conversion rate for each variation. In this scenario, 60% of recipients who clicked Version A went on to use the coupon, whereas only 10% of those who clicked Version B did. In this case, Version A would win.
So, should you go ahead and implement Version A across the board? Not just yet. With any A/B test, it’s important to compare the results with your baseline performance. Are you seeing a significant improvement? Or are the results only negligible? Any change will require team resources, so it’s important to be savvy about how you’re spending your time (and to measure improvement between your baseline and variations – not just their isolated performances).
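To judge whether a gap like 5% vs. 8% is real rather than noise, a standard two-proportion z-test is one common check. Below is a minimal sketch in Python; it assumes the 2,000 recipients above were split evenly into 1,000 per version, and the counts are illustrative, not from a real campaign.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for whether two observed rates differ more than chance allows."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_a, p_b, z, p_value

# Click-through example: 50 of 1,000 clicked Version A, 80 of 1,000 clicked Version B.
p_a, p_b, z, p_value = two_proportion_z_test(50, 1000, 80, 1000)
print(f"CTR A={p_a:.1%}, CTR B={p_b:.1%}, z={z:.2f}, p={p_value:.4f}")
# Here p is well below 0.05, so the CTR difference is unlikely to be random noise.
```

The same comparison works for conversion rate or any other success metric; just swap in the relevant counts for each variation.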
A basic A/B testing hypothesis goes something like this: “By changing [this element], I will improve [this metric].” For example, “Adding social proof to the email subject line will improve open rate.”
Another way to phrase your hypothesis is “Using [variant A of an element] will yield a higher [metric] than using [variant B].” For example, “Inserting a CTA within the top 25% of the blog post will yield a higher click-through rate than inserting the CTA at the end of the post.”
You can make your hypothesis more granular by focusing on a particular customer segment, such as highest-value buyers or customers living in a given city.
Instead of testing every element you can, focus on optimizing touchpoints that have the most impact on conversions, as well as improving touchpoints that have the highest drop-off rate. You’ll know what these are by mapping and analyzing your customer journey. Marketers often test elements like email subject lines, CTA placement and phrasing, pricing plans, and personalization features.
Say your LinkedIn post for a downloadable guide generates high traffic but few downloads. A high percentage of customers drop off your funnel between arriving on the landing page and clicking on the download button.
The next step is determining which element and variation will significantly boost downloads. Let your data guide you: What elements are customers interacting with and ignoring? What changes have you made to landing pages in the past to improve their performance?
Perhaps customer data shows that a majority of page visitors are rejected when they click on the download button because they’ve entered invalid data. You hypothesize that simplifying your form and removing unnecessary fields will increase downloads – and put it to the test.
A standard recommendation for A/B testing is to use a sample size of at least 1,000 people. The goal is for the sample size to accurately reflect your larger audience, and with larger sample sizes, there’s a stronger case for statistical significance (i.e., the results aren’t random or skewed).
A/B testing tools like AB Tasty and Optimizely come with built-in sample size calculators. These calculators ask you to provide data like:
Baseline conversion rate — How your original version is performing.
Population size — The total number of customers who’ve been exposed to your campaign.
Minimum detectable effect — The smallest improvement in conversions you’re willing to accept.
Desired statistical significance — As mentioned above, statistical significance refers to a test being deemed statistically sound, i.e., not the byproduct of chance. The accepted standard is a 95% confidence level, meaning there’s only a 5% probability that the observed difference is due to chance.
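If you want a rough sense of what these calculators compute, here’s a minimal sketch in Python using the standard two-proportion sample size formula. The 5% baseline and 2-point lift are illustrative inputs, not recommendations, and dedicated tools may use more refined formulas.

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline_rate, min_detectable_effect, alpha=0.05, power=0.80):
    """Approximate sample size per variant for comparing two conversion rates.

    baseline_rate: current conversion rate (e.g., 0.05 for 5%)
    min_detectable_effect: smallest absolute lift worth detecting (e.g., 0.02 for +2 points)
    alpha: significance level (0.05 corresponds to 95% confidence)
    power: probability of detecting the lift if it really exists
    """
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_effect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)

# Example: 5% baseline conversion rate, detect a lift to 7%, 95% confidence, 80% power.
print(sample_size_per_variant(0.05, 0.02))  # roughly 2,210 recipients per variant
```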
A/B testing tools have capabilities like choosing a randomized test group, splitting the audience, and delivering analytics reports (sometimes in real time). Different tools serve various use cases.
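Under the hood, many tools split audiences by hashing a stable user ID, so the same person always sees the same variant. Here’s a minimal sketch of that idea in Python; the function and experiment names are hypothetical, not any particular tool’s implementation.

```python
import hashlib

def assign_variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically bucket a user into a variant.

    Hashing the (experiment, user_id) pair means the same person always sees
    the same variant, while users overall spread roughly evenly across variants.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

print(assign_variant("user_123", "coupon_wording_test"))  # same output on every call
```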
Run A/B tests using tools like:
Apptimize for cross-platform UX
Optimizely for webpages and single-page apps
Twilio SendGrid for email campaigns
Meta Ads Manager for Facebook and Instagram ads
Get the most from your A/B tests by connecting your testing tool to a customer data platform (CDP) like Twilio Segment.
Segment is a customer data platform (CDP), meaning it helps businesses collect and consolidate customer data from across their organization in real time. With unified customer profiles, every team is able to see a customer’s historical data, behavior, and known preferences.
With Segment, teams are able to segment audiences based on shared traits or behaviors (whether it’s job title, place in the funnel, engagement rate, and so forth). By leveraging this data, teams can run more precise A/B tests. For example, if they’re looking to decrease customer churn by re-engaging low-interest customers, it would make sense to exclude highly engaged customers from this segment (who would likely skew results).
Segment also has hundreds of integrations, including various A/B testing tools, so teams can easily connect this data to their platform of choice and launch experiments.
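As one illustration of how experiment data can flow into Segment, here’s a minimal sketch using Segment’s server-side Python library. The “Experiment Viewed” event follows Segment’s A/B testing spec, but the write key, user ID, and experiment names are placeholders, and the package or import path may differ depending on the library version you install.

```python
# Requires Segment's Python library (e.g., pip install segment-analytics-python);
# the import path can differ between library versions.
import segment.analytics as analytics

analytics.write_key = "YOUR_WRITE_KEY"  # placeholder

# Record which variant a user was shown, so conversions downstream can be
# attributed back to the experiment in your warehouse and connected tools.
analytics.track(
    user_id="user_123",                                # placeholder ID
    event="Experiment Viewed",
    properties={
        "experiment_name": "homepage_dashboard_test",  # hypothetical experiment
        "variation_name": "B",
    },
)

analytics.flush()  # send queued events before the process exits
```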
Vista, a multinational design and marketing firm, discovered that their customers found it difficult to navigate to Studio, the part of their website that hosted customers’ design projects. The company hypothesized that a personalized homepage would improve user experience and boost click-through rate from the dashboard. The test homepage’s dashboard gave customers a quick way to access ongoing projects and previous orders, and showed personalized product recommendations.
They tested the performance of their existing homepage (Version A) against a homepage that contained a personalized dashboard (Version B). Vista used Segment to collect all of their event data in one place. Over a six-month period, they saw the following results:
121% increase in click-through rate (CTR) from homepage dashboard
3.28% increase in traffic to Studio stage from homepage dashboard
4.27% increase in traffic to Studio Review stage from homepage dashboard
A/B testing isolates only one element of your UX, marketing materials, etc. If you’re interested in changing multiple elements in a single experiment, it would be better to run a multivariate test.
A/B testing can be simpler to design and quicker to execute than multivariate testing. Analyzing results can also be easier with A/B testing, as you only need to compare two versions.
Choose the elements that help you nudge customers to take action. For example, a subject line entices a reader to open an email, a CTA urges the audience to click, and a coupon encourages a customer to make a purchase. These elements should be strategically chosen based on your specific goals (e.g. improving email open rates, increasing click-through rate, etc.).
Segment provides companies with a central hub for customer data, including data on customer traits, behaviors, and engagement with marketing campaigns. Segment also helps businesses create highly granular customer segments, which can be leveraged for running targeted A/B tests (e.g. if you’re focused on a specific persona, or interested in a specific section of the funnel).