A/B Testing in Data Science

Understand A/B testing from a data science perspective.

What is A/B Testing in Data Science?

A/B testing is a type of experiment in which you split your web traffic or user base into two groups and show each group a different version of a web page, app, email, and so on, with the goal of comparing the results to find the more successful version. In an A/B test, one element is changed between the original (a.k.a. “the control”) and the test version to see whether this modification has any impact on user behavior or conversion rates.
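
To make the split concrete, here is a minimal sketch of one common way to divide users into two groups: hashing each user ID into a bucket. The function name assign_variant, the experiment label, and the 50/50 split are assumptions for illustration, not a prescribed method.

```python
import hashlib


def assign_variant(user_id: str, experiment: str = "cta-button-size") -> str:
    """Deterministically assign a user to 'control' or 'test' (roughly 50/50)."""
    # Hash the experiment name together with the user ID so the same user
    # always lands in the same group for a given experiment.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % 100  # bucket number in the range 0..99
    return "control" if bucket < 50 else "test"


# Hypothetical user IDs, for illustration only.
for uid in ["user-1", "user-2", "user-3"]:
    print(uid, "->", assign_variant(uid))
```

Hashing, rather than picking a group at random on every visit, keeps a returning user in the same group for the whole experiment.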

From a data scientist’s perspective, A/B testing is a form of statistical hypothesis testing or a significance test.


A/B Testing need-to-know terms

The data science behind A/B testing can get complex pretty quickly, but we’ve highlighted a few need-to-know terms to cover the basics.

Null hypothesis 

The null hypothesis, or H0, posits that there is no relationship between two variables. In A/B testing, the null hypothesis assumes that changing one element on a web page (or marketing asset) has no impact on user behavior.

Alternative hypothesis 

On the flip side, an alternative hypothesis suggests the opposite of the null hypothesis: that changing an element will impact user behavior. Take the example below: 

Null hypothesis: The size of a call-to-action button does not impact click rates. 

Alternative hypothesis: Larger call-to-action buttons result in higher click rates.

Statistical significance

Statistical significance indicates that the results of an A/B test are unlikely to be due to chance alone, which gives you grounds to reject the null hypothesis.

This is assessed with the p-value, or probability value: the probability of seeing a difference at least as large as the one you observed if the null hypothesis were true. So, a low p-value says it’s unlikely that the results of the A/B test came from random variation alone.

A common rule of thumb is that when the p-value is 5% (0.05) or lower, the result of the A/B test is considered statistically significant.
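
As a rough sketch of how a p-value could be computed for the button example, the snippet below runs a one-sided two-proportion z-test using only the Python standard library. The choice of a z-test, the function name two_proportion_p_value, and the click and visitor counts are assumptions made up for illustration; they are not real results.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(clicks_a, visitors_a, clicks_b, visitors_b):
    """One-sided p-value for the alternative 'B has a higher click rate than A'."""
    rate_a = clicks_a / visitors_a
    rate_b = clicks_b / visitors_b
    # Under the null hypothesis both versions share one click rate,
    # so pool the two groups to estimate it.
    pooled = (clicks_a + clicks_b) / (visitors_a + visitors_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # Probability of a z-score at least this large if the null hypothesis is true.
    return 1 - NormalDist().cdf(z)


# Made-up counts: control (A) vs. larger button (B).
p = two_proportion_p_value(clicks_a=200, visitors_a=5000,
                           clicks_b=250, visitors_b=5000)
print(f"p-value: {p:.4f}")
print("significant at 5%" if p <= 0.05 else "not significant at 5%")
```

The test is one-sided because the alternative hypothesis in the example is directional (larger buttons get more clicks). A two-sided test, which asks only whether the rates differ, would double the p-value for the same data.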

Confidence level

Think of the confidence level as the complement of the p-value threshold you set for significance. The confidence level indicates how strict your test is about ruling out a random or fluke occurrence: the higher the confidence level, the less likely you are to declare a difference when none actually exists.

If a test is considered statistically significant when the p-value is 5% or lower, then the confidence level is 95%.
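
One place the confidence level shows up in practice is in a confidence interval around the measured difference. The sketch below assumes a normal approximation and uses the same made-up counts as an illustration; the function name diff_confidence_interval is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def diff_confidence_interval(clicks_a, visitors_a, clicks_b, visitors_b,
                             confidence=0.95):
    """Confidence interval for (rate_b - rate_a) using a normal approximation."""
    rate_a = clicks_a / visitors_a
    rate_b = clicks_b / visitors_b
    # Unpooled standard error of the difference between the two rates.
    std_err = sqrt(rate_a * (1 - rate_a) / visitors_a
                   + rate_b * (1 - rate_b) / visitors_b)
    alpha = 1 - confidence  # a 95% confidence level pairs with a 5% threshold
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    diff = rate_b - rate_a
    return diff - z_crit * std_err, diff + z_crit * std_err


# Made-up counts, for illustration only.
low, high = diff_confidence_interval(200, 5000, 250, 5000)
print(f"95% interval for the lift: {low:.4f} to {high:.4f}")
```

If the interval excludes zero, the difference is significant at the matching threshold in a two-sided test. Raising the confidence level (say, to 99%) widens the interval, which makes the test harder to pass.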

Frequently asked questions