
Optimize purchase conversions with Statsig’s experimentation, powered by Twilio Segment

This recipe will walk you through using the data collected from Twilio Segment as the basis for experimentation using a tool like Statsig. We’ll walk through how to measure a purchase conversion funnel, design an experiment to maximize a success metric (purchase event), and glean insights based on user actions to optimize experiences going forward.

Made by Logan Bates

What do you need?

Statsig
Twilio Segment


The crux of any good product decision process is the ability to observe what your customers are doing on your platform so that you can make informed decisions about how best to serve them. Companies that adopt a CDP like Twilio Segment understand the value of automating data collection and devoting more expensive engineering resources to building exciting new features and iterating on products. These companies have made great strides in adopting a data-driven culture, but many fall short in leveraging their data to drive product decisions (be wary of the HiPPO, the highest-paid person's opinion).

This is why the need for a culture of experimentation has become increasingly apparent. The success of product development should be measured in metric impact rather than number of features shipped; there is no easy way to know which feature is more impactful unless you’ve run a controlled experiment comparing the status quo with a potential new feature release. Unfortunately, limited engineering resources prevent many companies from building the kind of internal experimentation platform that top organizations can afford.

Luckily, an A/B testing platform like Statsig is here to help! Twilio Segment integrates seamlessly with Statsig, so the events you already collect do not need to be logged a second time. This means you start with data to power your experiments and only have to add a few steps to begin your journey towards experiment-driven product development!

In this recipe, we’re going to validate a hypothesis using observations collected by Twilio Segment and use Statsig to build our experiment, serve product variants to our users, and ultimately choose a winning variant based on statistical analysis.

Step 1: Send customer data collected by Twilio Segment to your Statsig Workspace

This recipe assumes Segment is collecting and tracking customer purchase events. If not, follow this guide to set up tracking.
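For reference, a tracked purchase event might look like the sketch below. It assumes a Segment client that exposes `track(event, properties)`, as Analytics.js does; the event name `purchase_event` matches the funnel used later in this recipe, while the `trackPurchase` helper and the property names (`order_id`, `revenue`, `currency`) are illustrative assumptions rather than a required schema.

```javascript
// Hypothetical helper: sends a purchase event through a Segment client.
// `analytics` is any object exposing track(event, properties),
// such as the Analytics.js global in the browser.
function trackPurchase(analytics, order) {
  const properties = {
    order_id: order.id, // assumed property names; adjust to your tracking plan
    revenue: order.total,
    currency: order.currency || "USD",
  };
  analytics.track("purchase_event", properties);
  return properties;
}

// Usage with a stand-in client that records calls (swap in the real client).
const recordedCalls = [];
const fakeAnalytics = {
  track: (event, props) => recordedCalls.push({ event, props }),
};
trackPurchase(fakeAnalytics, { id: "order_123", total: 49.99 });
```

Passing the client in as an argument keeps the helper easy to test without a live Segment source.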

Enabling the Twilio Segment integration for Statsig allows Statsig to pull in your tracked customer data. This data will power Statsig’s experiment analysis without instrumenting additional logging or overloading engineering efforts.

Events that Statsig receives will be collected and aggregated in the Metrics tab in the Statsig console. These events will automatically be included in your Pulse results for A/B tests with Statsig's feature gates as well as all your Experiment results.

1. In the Segment App, click Add Destination in the Destinations catalog page. Search for “Statsig” in the Destinations Catalog, and select the Statsig destination.
2. Choose which Source should send data to the Statsig destination.
3. From the Statsig dashboard, copy the Statsig “Server Secret Key”.

4. Enter the Statsig “Server Secret Key” in the “Statsig” destination settings in Segment.

5. On the Statsig Integration page enable the Segment integration.

6. As your Segment events flow into Statsig, we'll see a live Log Stream in the Metrics tab in the Statsig console. We can click one of these events to see the details that are logged as part of the event. Read more about the Statsig Destination here.

Step 2: Determine how to measure success and define your target metric

Now that Twilio Segment data is informing Statsig’s experimentation platform, it’s time to define a success metric so we can determine how effective our experimentation efforts are. In this example, we’re going to use a common success indicator: a purchase conversion funnel, which is a composition of several underlying metrics.

Define Your Success Metric:

Suppose we’re an eCommerce business selling a product. We might define a successful funnel as one where a user enters through a landing page showcasing our flagship products and exits with a purchase event. Therefore, a full conversion funnel would be composed of the following events:

landing_page_view (awareness) > product_view (consideration) > add_to_cart (preference) > checkout_event (intent) > purchase_event (conversion)

Ultimately, we want to design our product to maximize the number of users reaching the conversion step (which implicitly optimizes revenue). This will be the metric we use to design our experiment around and determine how successful we are.
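To make the arithmetic behind this metric concrete, here is a minimal sketch of counting how many users reach each funnel step in order and deriving the overall conversion rate. Statsig computes funnel metrics for you; this is only an illustration, and the function names and sample users are hypothetical.

```javascript
// Funnel steps from the recipe, in order.
const FUNNEL_STEPS = [
  "landing_page_view",
  "product_view",
  "add_to_cart",
  "checkout_event",
  "purchase_event",
];

// Counts how many users reach each step, requiring steps to occur in order.
// `eventsByUser` maps a user ID to that user's event names in time order.
function funnelCounts(eventsByUser, steps = FUNNEL_STEPS) {
  const counts = steps.map(() => 0);
  for (const events of Object.values(eventsByUser)) {
    let nextStep = 0;
    for (const name of events) {
      if (nextStep < steps.length && name === steps[nextStep]) {
        counts[nextStep] += 1;
        nextStep += 1;
      }
    }
  }
  return counts;
}

// Overall conversion: users completing the last step / users entering the funnel.
function overallConversionRate(counts) {
  return counts[0] === 0 ? 0 : counts[counts.length - 1] / counts[0];
}

// Made-up sample data for illustration only.
const sampleEvents = {
  user_a: [...FUNNEL_STEPS],
  user_b: ["landing_page_view", "product_view"],
  user_c: ["landing_page_view", "product_view", "add_to_cart", "checkout_event"],
  user_d: ["landing_page_view"],
};
const counts = funnelCounts(sampleEvents);
const conversionRate = overallConversionRate(counts);
```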

Within the Statsig metric catalog we have the ability to view and analyze the metrics coming from Twilio Segment, but also the ability to create custom metrics, including event counts, user counts, aggregations (ex: total purchases), ratio metrics, and funnel metrics. Let’s create a funnel metric that represents the desired customer flow above:

1. Name, describe, and tag our metric.

2. Choose a funnel metric, count events, and enter the order of events representing our funnel.

3. After funnels are created and populated, we can view the lineage, trends in the metric value, and see a graphical representation of each step and conversions along the way:

Want to learn more about funnel metrics? Check out this doc.

Step 3: Set up your experiment, Analyze the results

Now that we have the data ingestion in place to observe what customers are doing on our website and we have a funnel metric defined to measure success, all that’s left to do is create an experiment to understand the impact of any changes we make. Because we’re interested in maximizing our success metric, we will focus on product experiences that we believe will positively impact the overall funnel conversion rate.

1. Determine the Hypothesis that is Being Tested

Suppose we’ve done some market research and determined that the shape of a product image icon can positively or negatively impact the likelihood of a user clicking to see more information. Today we have rounded square (squircle) icons. Changing the shape is fairly easy to implement and has a large potential revenue impact, so it’s a great candidate for a controlled A/B/n test. Therefore, we will test the following hypothesis:

Altering the product icon shapes will increase the likelihood of users engaging with and ultimately purchasing products. This will lead to an overall increase in funnel conversion rate.

We’ve identified two product icon shapes we’d like to test against the incumbent and measure against our success metric. The control group is our current state, rounded-square icons, and our two test groups are (1) circular icons and (2) square icons.

The plan is to randomly distribute our users into one of three icon shape experiences and evaluate the overall funnel conversion rate.
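Statsig handles this assignment for us, but the idea of stable random bucketing can be sketched as follows. This is a generic hash-based approach for illustration, not Statsig's actual assignment algorithm; `assignGroup` and the group labels are hypothetical names.

```javascript
const crypto = require("crypto");

// The three icon-shape experiences from the plan above.
const GROUPS = ["rounded_square", "circle", "square"];

// Deterministically maps a user to a group by hashing the experiment name
// together with the user ID, so the same user always sees the same variant.
function assignGroup(userID, experimentName, groups = GROUPS) {
  const digest = crypto
    .createHash("sha256")
    .update(`${experimentName}:${userID}`)
    .digest();
  // Use the first 4 bytes as an unsigned integer and reduce to a group index.
  const bucket = digest.readUInt32BE(0) % groups.length;
  return groups[bucket];
}

// The same inputs always produce the same group.
const firstCall = assignGroup("some_user_id", "product_logo_icon_shapes");
const secondCall = assignGroup("some_user_id", "product_logo_icon_shapes");
```

Including the experiment name in the hash means a user's bucket in one experiment does not determine their bucket in another.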

2. Create the Experiment in Statsig

Now that we have all the pieces in place, we can set the experiment up with Statsig. Log into the Statsig console at https://console.statsig.com/

Navigate to Experiments in the left-hand navigation panel

Click on the Create button

Enter the name, hypothesis, and ID type (the unit of randomization) for your experiment as shown in the figure below:

There are several advanced options, but we’ll leave them for now. Want to learn more? Check out Statsig’s experimentation docs.

Configure your Scorecard

This is helpful in ensuring other members of your team viewing your experiment have context on the hypothesis being tested and how success is being measured. Additionally, all metrics added to the Scorecard are pre-computed daily, as well as eligible for more advanced statistical treatments like CUPED and Sequential Testing.

Scorecard metrics include primary metrics - the metrics you are looking to influence directly with your experiment, in this case the purchase conversion funnel.

And our Secondary metrics - the set of metrics you may want to monitor or ensure don't regress during your test, but aren't directly trying to influence. Read more about metrics here.

Read more about best practices for configuring your Scorecard here.

Configure Your Experiment Groups and Parameters

Now we need to configure the different groups we are targeting for testing and define the allocation of our population that should be uniformly distributed across the different variants. In this case we want to run this experiment on all of our users, so we’ll use a 100% allocation.

Next, we need to define any specific targeting we’d like to do on our allocated audience. In this case, we want to target all allocated users, but we have the ability to target users based on user properties (such as email, UserID, audience membership) and environment properties (such as browser, IP address, locale).

Tip: Statsig can sync Twilio Engage Audiences, which can be used for targeting subsets of users that meet certain evaluation criteria.

Select the split %, which is simply the percentage of the allocated population that should go to each experiment group. We’ll choose an even split (33.3%) across the control (rounded square) and the two test groups.

All that’s left to do is configure our test groups and parameters. For each group the users will be allocated to, Statsig will return one or more parameters to inform us about which experiment variant a user should be exposed to. By having a consistent “shape” parameter across the three groups, we can dynamically serve a different user experience without over-complicating our code. Enter the values that the experiment parameter will take for each variant, in this case the three variants we’ve been discussing: rounded square, circle, and square. Read more about Groups vs. Parameters here.
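As a rough mental model, the groups and their shared “shape” parameter can be pictured as the data structure below. This is not Statsig's configuration format (the setup is done in the console); the group names, field names, and the `validateGroups` helper are assumptions for illustration.

```javascript
// Illustrative picture of the three groups and their shared "shape" parameter.
const experimentGroups = [
  { name: "Control", splitPercent: 33.3, parameters: { shape: "rounded_square" } },
  { name: "Circle", splitPercent: 33.3, parameters: { shape: "circle" } },
  { name: "Square", splitPercent: 33.3, parameters: { shape: "square" } },
];

// Sanity check: splits should cover (roughly) 100% and every group should
// define the parameter the application will read.
function validateGroups(groups, parameterName) {
  const problems = [];
  const totalSplit = groups.reduce((sum, g) => sum + g.splitPercent, 0);
  if (Math.abs(totalSplit - 100) > 0.5) {
    problems.push(`splits add up to ${totalSplit}%, expected about 100%`);
  }
  for (const g of groups) {
    if (!(parameterName in g.parameters)) {
      problems.push(`group "${g.name}" is missing parameter "${parameterName}"`);
    }
  }
  return problems;
}

const problems = validateGroups(experimentGroups, "shape");
```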

Note: you cannot start your experiment without adding at least one parameter.

Now that our experiment is properly orchestrated, there are a few advanced features we can optionally configure. Most notable is the target duration (in days) for which we’d like to run the experiment. If you’re unsure how long to run your experiment, or you want to target a minimum detectable effect (MDE) against a certain metric, the power analysis calculator can help. In this example we are targeting a 5% MDE.
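To give a feel for what a power analysis computes, here is a sketch using the standard two-proportion sample size approximation. It is not Statsig's calculator: the 10% baseline conversion rate is a made-up figure, the 5% MDE is treated as a relative lift, and a two-sided 5% significance level with 80% power is assumed.

```javascript
// Standard normal quantiles for the assumed significance level and power.
const Z_ALPHA = 1.96; // two-sided 5% significance level
const Z_BETA = 0.8416; // 80% power

// Approximate users needed per group to detect a relative lift
// in a conversion rate when comparing one test group to control.
function sampleSizePerGroup(baselineRate, relativeMde) {
  const testRate = baselineRate * (1 + relativeMde);
  const variance =
    baselineRate * (1 - baselineRate) + testRate * (1 - testRate);
  const delta = testRate - baselineRate;
  return Math.ceil(((Z_ALPHA + Z_BETA) ** 2 * variance) / delta ** 2);
}

// Assumed 10% baseline conversion and a 5% relative MDE.
const usersPerGroup = sampleSizePerGroup(0.1, 0.05);
console.log(`Approximate users needed per group: ${usersPerGroup}`);
```

Dividing the required users per group by the expected daily traffic per group gives a rough target duration in days.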

Additionally we can choose to apply sequential testing to a time-bound experiment, select our confidence interval, apply a Bonferroni correction, and more.
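For intuition, a Bonferroni correction divides the significance level by the number of comparisons being made. The sketch below assumes a 5% significance level and two comparisons (each test group against control); how Statsig applies the correction in practice may differ, and `bonferroniAlpha` is a hypothetical helper.

```javascript
// Returns the per-comparison significance level after a Bonferroni correction.
function bonferroniAlpha(alpha, comparisons) {
  if (!Number.isInteger(comparisons) || comparisons < 1) {
    throw new RangeError("comparisons must be a positive integer");
  }
  return alpha / comparisons;
}

// Two test groups (circle, square), each compared against the control.
const adjustedAlpha = bonferroniAlpha(0.05, 2);
console.log(`Per-comparison significance level: ${adjustedAlpha}`);
```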

3. Implement the Experiment

To deploy our experiment, our application must pull the experiment configuration from Statsig, and Statsig must receive the events we are experimenting on (which we’ve done in step 1). For this example we’ll use the JavaScript SDK, but the full list of SDKs can be found here.

A. Install and initialize the Statsig JavaScript SDK

You can install the Statsig SDK via npm, yarn or jsdelivr:

npm install statsig-js

After you install the SDK, you must initialize the SDK using an active Client API Key from the API Keys tab under Settings in the Statsig console.

In addition to the Client API key, you must also pass in a Statsig User ID to ensure users can be assigned and tracked for different variants of the experiment.

Note: other units of randomization can be used, but User ID is used here for simplicity.

const Statsig = require("statsig-js");
// initialize returns a promise which always resolves
// (call this from inside an async function so that await is valid)
await Statsig.initialize(
  "client-sdk-key",
  { userID: "some_user_id" },
  { environment: { tier: "staging" } } // optional, pass options here if needed
);

B. Check the experiment in your application to serve different variants

const productShapeExp = Statsig.getExperiment("product_logo_icon_shapes");
const productShape = productShapeExp.get("shape", "rounded_square");

Based on the user’s allocation, we will receive one of the product shape variants, which we can then use to render the shape of product icons in our front end. Note that the second argument, rounded_square, is the fallback value used when no parameter value is available.
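One way to turn the returned shape into something the front end can render is a small lookup, sketched below. The border-radius values and the `iconStyleForShape` helper are assumptions for illustration; only the three shape values come from the experiment setup.

```javascript
// Assumed CSS border-radius value for each shape variant.
const SHAPE_BORDER_RADIUS = {
  rounded_square: "20%",
  circle: "50%",
  square: "0",
};

// Returns a style object for a product icon, falling back to the control
// shape when the value is missing or unrecognized.
function iconStyleForShape(shape) {
  const known = Object.prototype.hasOwnProperty.call(SHAPE_BORDER_RADIUS, shape);
  const radius = known
    ? SHAPE_BORDER_RADIUS[shape]
    : SHAPE_BORDER_RADIUS.rounded_square;
  return { borderRadius: radius };
}

// Example: the value read from the experiment's "shape" parameter.
const circleStyle = iconStyleForShape("circle");
const fallbackStyle = iconStyleForShape("hexagon");
```

Keeping the mapping in one place means a new variant only needs a new entry, not a new code branch.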

As users move through product experiences, Statsig will serve different variations. Twilio Segment will observe user actions and report them back to the Statsig experimentation platform, establishing a feedback loop for the experimentation platform.

4. Analyze the Results, Make a Decision

After we’ve allowed some time for our experiment to run in Statsig, we’re ready to read the results on our scorecard.

To read the results of our experiment, go to the Results tab, where we will see our experiment exposures, and then Metric Lifts. The Metric Lifts section has two main tabs: Scorecard and All Metrics. Let's focus on our Scorecard.

In the example below, the Square variant shows a lift in the overall funnel conversion rate, the success metric we were targeting. Expanding the metrics to examine the entire funnel reveals two key insights:

Both the Square and Circle variants show a lift in top-of-funnel DAU (Land Page View Start DAU). However, only the Square variant shows a statistically significant increase in end-of-funnel DAU (Purchase Event End DAU).

The overall funnel conversion rate improvement for Square is primarily due to the higher conversion from Checkout Event to Purchase Event stages in the funnel.

In this case, the results are clear and our hypothesis has been confirmed: Altering the product shape did increase user engagement and ultimately led to more purchases for the square variation.
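Statsig performs the statistical analysis for us, but the underlying idea of comparing a test group's conversion rate against control can be sketched with a pooled two-proportion z-test. This is a textbook calculation, not Statsig's exact methodology, and the counts below are invented purely for illustration.

```javascript
// Compares a test group's conversion rate with control using a pooled
// two-proportion z-test. Returns the relative lift and the z-score.
function compareConversion(controlConversions, controlUsers, testConversions, testUsers) {
  const controlRate = controlConversions / controlUsers;
  const testRate = testConversions / testUsers;
  const pooledRate =
    (controlConversions + testConversions) / (controlUsers + testUsers);
  const standardError = Math.sqrt(
    pooledRate * (1 - pooledRate) * (1 / controlUsers + 1 / testUsers)
  );
  return {
    relativeLift: (testRate - controlRate) / controlRate,
    zScore: (testRate - controlRate) / standardError,
  };
}

// Invented numbers: 1,000 of 10,000 control users purchased versus
// 1,100 of 10,000 users in a test group.
const result = compareConversion(1000, 10000, 1100, 10000);
// An absolute z-score above 1.96 corresponds to significance at the 5% level
// (two-sided), before any multiple-comparison correction.
const isSignificant = Math.abs(result.zScore) > 1.96;
```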

The final step is to decide which variant to ship as the winner. In this case, we’ll choose Square:

When you ship a group in an experiment, the parameter values from the shipped group will become the default values for all your users going forward. In this case, all product icon shapes will become square.

Wrapping up

By following these three steps, you can integrate a Twilio Segment data pipeline into Statsig's experimentation platform and start making data-driven decisions. With this integration, you'll be able to quickly analyze the data collected by Twilio Segment without additional logging, and use it to improve your product and user experience.

Although we’ve made great progress towards experiment-driven product design, there’s still much work left to do! Statsig is a full-stack solution (server, client, mobile) so the experimentation potential is truly endless. Check out our walkthrough guides to keep the momentum going!
