
Build a Data Health Dashboard With Segment Protocols

Segment Connections is a powerful tool to unify data across all business tools. But not all data is good data. As the common trope goes, “garbage in, garbage out.” How do we prevent this bad data from overwhelming us?

Made by Greg Yeutter

What do you need?

  • Segment Protocols

  • Mixpanel


Segment Protocols enables Business Tier users to proactively prevent and transform bad data so it is clean and consistent across tools. The first major steps to prevent bad data are:

  1. Creating a tracking plan that defines what valid data looks like, so that anything else can be flagged as bad data

  2. Analyzing which data points are not adhering to the tracking plan

Segment Protocols shows you which data points are not adhering to the tracking plan, and it can forward that bad data to downstream analytics and business intelligence tools for analysis. In this guide, we will use that forwarding to build a dashboard for monitoring data health.

For this recipe, we’ll use Mixpanel to build the dashboard, but the concepts are similar for other Segment-supported analytics tools. We’ll set up this dashboard for one Source, but other Sources can be added to the same dashboard by repeating steps 1-4.

Step 1: Create a Tracking Plan

Log into the Segment web application. If you have not already connected a Source, do so now.

Once the Source is enabled, go to the Protocols tab and click New Tracking Plan.

Give your tracking plan a name. If your Source has already sent data, you can build the tracking plan from those events by clicking Import events from source. If this is a new Source with no events yet, select Add events manually.

Select the Source, then click Import and save.

The tracking plan will auto-populate with events that have already flowed in. If you click the arrow next to any event, you can see all properties or traits that have been seen for it.

From here, you can choose to require certain properties or even enforce the data type.

You may also add events and properties to the tracking plan manually. Refer to the Segment documentation for a complete guide to creating your tracking plan.

Once you are happy with your tracking plan, click the dropdown that says 0 connected sources, then Connect source. Select your Source, then click Next.

Review the consequences, then click Connect source.

Your tracking plan should now be live for the selected Source.

Step 2: Configure the Schema

Now, we’ll choose what happens to events that violate the tracking plan. Potential reasons for violation include:

  • Missing required properties. For example, if the price property is required for an order_completed event and is missing, the event is in violation.

  • Invalid property value data types. For example, if an order ID must be a String but an integer is received, the event is in violation.

  • Property values that do not pass applied conditional filtering. For example, if the campaign_id property of an email_opened event does not match the required regular expression, the event is in violation.

Protocols can also monitor for unplanned events, properties, and traits that are not explicitly listed in the tracking plan.
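To make these checks concrete, here is a minimal Python sketch of the kinds of rules described above. It is not how Protocols is implemented: the TRACKING_PLAN structure, the find_violations function, and the campaign_id pattern are all hypothetical names chosen for illustration.

```python
import re

# Hypothetical tracking-plan rules keyed by event name. This structure is
# an illustration only, not Segment's internal or exported format.
TRACKING_PLAN = {
    "order_completed": {
        "required": ["price", "order_id"],
        "types": {"price": (int, float), "order_id": str},
        "patterns": {},
    },
    "email_opened": {
        "required": ["campaign_id"],
        "types": {"campaign_id": str},
        "patterns": {"campaign_id": r"^cmp_\d+$"},  # assumed pattern
    },
}


def find_violations(event_name, properties, plan=TRACKING_PLAN):
    """Return a list of human-readable problems found in one event."""
    rules = plan.get(event_name)
    if rules is None:
        return [f"unplanned event: {event_name}"]

    violations = []
    # Required properties must be present.
    for name in rules["required"]:
        if name not in properties:
            violations.append(f"missing required property: {name}")
    # Present properties must have the planned data type.
    for name, expected in rules["types"].items():
        if name in properties and not isinstance(properties[name], expected):
            violations.append(f"invalid type for property: {name}")
    # String values must match any regular expression applied to them.
    for name, pattern in rules["patterns"].items():
        value = properties.get(name)
        if isinstance(value, str) and not re.fullmatch(pattern, value):
            violations.append(f"pattern mismatch for property: {name}")
    # Properties that are not in the plan at all are unplanned.
    for name in properties:
        if name not in rules["types"]:
            violations.append(f"unplanned property: {name}")
    return violations


print(find_violations("order_completed", {"order_id": 123}))
```

In this sketch, an order_completed event with an integer order ID and no price reports both a missing required property and an invalid type.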

We can choose to allow or block violations and unplanned events:

  • Allow: send to the Destination, despite being unplanned or in violation

  • Block: do not send to the Destination

For unplanned or violating event properties and traits, we can additionally choose to omit those properties or traits while sending the non-violating and planned properties/traits to Destinations.
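The difference between the three options can be sketched in a few lines of Python. This is only an illustration under assumed names: apply_schema_settings and the "allow", "block", and "omit" strings are hypothetical, and the real behavior is configured in the Segment web application rather than in code.

```python
def apply_schema_settings(properties, planned, on_unplanned_property="allow"):
    """Return the properties to deliver, or None if the event is blocked.

    `planned` is the set of property names listed in the tracking plan.
    The setting names are illustrative, not Segment configuration keys.
    """
    unplanned = [name for name in properties if name not in planned]
    if not unplanned or on_unplanned_property == "allow":
        # Nothing to enforce, or enforcement is off: deliver everything.
        return dict(properties)
    if on_unplanned_property == "block":
        # The whole event is withheld from the Destination.
        return None
    if on_unplanned_property == "omit":
        # Deliver the event, but strip the unplanned properties.
        return {k: v for k, v in properties.items() if k in planned}
    raise ValueError(f"unknown setting: {on_unplanned_property}")


event_properties = {"price": 10, "debug": True}
print(apply_schema_settings(event_properties, {"price"}, "omit"))
```

With "omit", the planned price property is still delivered while the unplanned debug property is dropped; with "block", nothing is delivered at all.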

To set this up in your Segment workspace, navigate to the desired Source and select Settings, then Schema Configuration.

In the first section (Unplanned Events, Properties and Values), use the matrix to decide which values are allowed, blocked, or omitted.

Note that this is configured per Source, so the settings applied here will not apply to other Sources.

Step 3: Forward Violations & Blocked Events

To demonstrate the concept of violation forwarding, we are going to set up an additional Source, which we will call a Violation Source. The Violation Source receives violations and/or blocked events from the standard Source. From the Violation Source, events can be sent to an analytics Destination (such as Mixpanel) for monitoring.
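The flow described above can be pictured as a small routing function. This is a conceptual sketch only: route_event, the "destinations" and "violation_source" labels, and the shape of the forwarded payload (including the "Violation Generated" event name and its property names) are assumptions for illustration, so check the Protocols documentation for the actual forwarded payload.

```python
def route_event(event, violations, block_on_violation=False):
    """Return (target, payload) pairs describing where an event would go.

    `violations` is a list of violation descriptions for this event.
    """
    deliveries = []
    # The original event reaches Destinations unless violations are blocked.
    if not violations or not block_on_violation:
        deliveries.append(("destinations", event))
    # Each violation is forwarded to the Violation Source as its own event.
    for description in violations:
        deliveries.append((
            "violation_source",
            {
                "event": "Violation Generated",  # assumed event name
                "properties": {
                    "eventName": event["event"],  # assumed property names
                    "violationDescription": description,
                },
            },
        ))
    return deliveries


bad_event = {"event": "order_completed", "properties": {"order_id": 123}}
for target, payload in route_event(bad_event, ["missing required property: price"]):
    print(target, payload["event"])
```

The point of the separate Violation Source is that the monitoring data travels its own path, so the dashboard can be built without touching the Destinations connected to the standard Source.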

To set this up, we are going to create a new Source. From the Connections tab, select Sources, then Add Source.

Select JavaScript as the type, then Add Source.

Give it a name, such as Violation Source, then click Add Source.

Next, return to the Schema Configuration page from step 2. In the Segment web application, navigate back to the standard Source (not the Violation Source) and select Settings, then Schema Configuration.

Scroll to Forwarding Settings, then enable forwarding for Violations and/or Blocked Events and Traits, and select the Violation Source.

Step 4: Connect Your Analytics Tool

Now, we’ll set up Mixpanel (the Segment Destination where our dashboard will live) and connect it to the Violation Source. This is only a brief overview of setting up Mixpanel; for the full instructions, refer to the Segment documentation for the Mixpanel (Actions) Destination.

From your Segment workspace under Connections, click Destinations, then New Destination.

Search for and select the Mixpanel (Actions) Destination. Click Configure Mixpanel.

Connect the Violation Source, then click Next.

Give your Destination a name, such as Mixpanel Violation Destination, then click Save.

If you haven’t already, in a separate tab, create a free Mixpanel account.

Log into your Mixpanel account, then go to the Mixpanel project settings and copy the unique token and API secret.

Back in Segment, open the Basic Settings tab of the Destination and paste the token and secret. Turn on the switch for Enable Destination, then click Save Changes.

At this point, ensure data is flowing in from your standard Source using the Debugger tab.
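If your standard Source has no live traffic yet, one way to produce a test violation is to send an event that deliberately breaks the tracking plan. The sketch below builds such a request for Segment's HTTP Tracking API using only the Python standard library. The write key, user ID, and event contents are placeholders, and it assumes your Source accepts HTTP API calls authenticated with its write key. The request is only constructed here, not sent.

```python
import base64
import json
import urllib.request

TRACK_URL = "https://api.segment.io/v1/track"


def build_track_request(write_key, user_id, event, properties):
    """Build (but do not send) a track call for the HTTP Tracking API."""
    body = json.dumps(
        {"userId": user_id, "event": event, "properties": properties}
    ).encode("utf-8")
    # Basic auth with the write key as the username and an empty password.
    token = base64.b64encode(f"{write_key}:".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        TRACK_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Placeholder values: an order_completed event with no price and an
# integer order ID, which would violate the example rules from step 2.
request = build_track_request(
    "YOUR_WRITE_KEY", "test-user-1", "order_completed", {"order_id": 123}
)
print(request.get_method(), request.full_url)
# To actually send the event, uncomment the next line:
# urllib.request.urlopen(request)
```

After sending, the event should appear in the Debugger of the standard Source, and the forwarded violation should appear in the Debugger of the Violation Source.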

Step 5: Set Up Your Dashboard

In this step, we’ll create a basic data quality dashboard showing:

  1. All violating events over time

  2. Events ranked by number of violations

This will help you understand the recency and total amount of bad data, respectively, broken down by event. You are welcome to create more, but this is a solid starting point.
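The two charts correspond to two simple aggregations over the forwarded violations. The following Python sketch shows the idea on made-up sample data; the event_name and timestamp field names are hypothetical and do not necessarily match the forwarded payload, and Mixpanel performs the equivalent grouping for you in the report builder.

```python
from collections import Counter
from datetime import datetime


def violations_per_day(violations):
    """Count violations per calendar day (chart 1: violations over time)."""
    counts = Counter(
        datetime.fromisoformat(v["timestamp"]).date().isoformat()
        for v in violations
    )
    return dict(sorted(counts.items()))


def events_ranked_by_violations(violations):
    """Rank event names by violation count (chart 2: bar chart)."""
    return Counter(v["event_name"] for v in violations).most_common()


# Made-up sample data for illustration only.
sample = [
    {"event_name": "order_completed", "timestamp": "2024-03-01T09:15:00"},
    {"event_name": "email_opened", "timestamp": "2024-03-01T17:40:00"},
    {"event_name": "order_completed", "timestamp": "2024-03-02T08:05:00"},
]

print(violations_per_day(sample))
print(events_ranked_by_violations(sample))
```

The first aggregation answers "is bad data still arriving?", while the second answers "which events should we fix first?".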

Log in to Mixpanel, then select New Dashboard in the upper-left corner.

Give the dashboard a name, then click Add, then Insights Report.

Give the chart a name, such as All Violating Events, then under Events & Cohorts, select Your Top Events. Click Save.

If you go back to the Violations dashboard, you should see the first chart. Click Add content to create another chart. Select Insights Report again.

Give it a name, such as All Events with Violations. Under Events & Cohorts, select Your Top Events again.

On the right side, choose Bar chart, then click Save.

Back in your dashboard, you’ll see both charts. You did it!

Wrap Up

In this recipe, we created a tracking plan, defined what to do with violations, and forwarded violations to Mixpanel to create a centralized data health dashboard. There are many more ways to understand your data health, such as forwarding violations to Slack or using Typewriter to enforce your tracking plan in code before it is pushed to production. To learn more about what Segment Protocols can do, refer to the documentation and this video.
