Introducing Protocols: Say goodbye to bad data

By Francisco Alberini, Andy Schumeister, Niels Tindbaek

You can’t make informed decisions when you don’t trust your underlying data. Unfortunately, data mistakes are far too common at most companies. Whether your Order Completed event was accidentally implemented as OrdreCompleted or your products property was coded as a string instead of an array, you have likely spent time cleaning up your data set just to make it useful. Rest assured, you’re not alone: 83% of companies have a dirty secret…dirty data.*

Until now, ensuring high-quality data across your organization has been seen as a people and process problem. But even the most stringent processes are subject to human error and organizational complexity. At Segment, we believe that technology — not process — is the best way to protect data quality at scale.

Today, we’re launching Protocols to help you protect the integrity of your data and the decisions you make with it. Protocols is a new data governance product by Segment and is now available. 

What good is bad data?

If you trusted your data, you wouldn’t have to go on wild goose chases to ensure your analytics and campaign triggers were accurate. You’d feel confident in each report your executive team reads. Getting there requires everyone at your company to agree on what data you’re collecting and why. You’d also need alerts whenever data is invalid, and a way to block bad data from reaching production.

That’s why we built Protocols. Protocols addresses the biggest challenges to achieving data quality:

  • Alignment: Standardize customer data collection with an actionable Tracking Plan.

  • Validation: Diagnose data quality issues with automated reports and alerts.

  • Enforcement: Lock down your schema to keep data in your marketing and analytics tools clean.

Beta customers, like Creative Market and Typeform, have already used Protocols to align their organization around a standardized implementation spec. With Protocols, Creative Market reduced the time it takes to detect data quality issues by 93%. Here’s how you can put the product to use to clean up your company’s data.

Align your company around a standard Tracking Plan

Companies that take data quality seriously typically create implementation specs or tracking plans to align their business objectives with the metrics and events they track. However, these documents are often stored in a spreadsheet or JSON file. As a result, they are difficult to update or access and even harder to enforce. 

To help you standardize customer data collection throughout your organization, your Tracking Plan is now a living, actionable resource within the Segment app. This makes it easy to get engineering, product, marketing, and analytics teams on the same page about which events you collect, what they mean, and the business metrics they drive.


Tracking Plans get your whole company on the same page about customer data.

Whether your company already has a spec or you’re ready to put a more rigorous data governance strategy in place, it’s easy to get started. To create a Tracking Plan, you can:

  • Upload your existing spec from a spreadsheet or JSON file

  • Use the Protocols API 

  • Start from scratch with a simple UI

  • Use industry-specific templates: e-commerce, mobile, video, B2B SaaS, and more
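To make the idea of a spec stored as JSON concrete, here is a minimal sketch of what a tracking plan could look like as structured data. The layout and field names (`events`, `properties`, `type`, `required`) are assumptions chosen for illustration and are not the Protocols schema; the event names come from the examples in this post.

```python
import json

# Hypothetical tracking plan layout. The field names are illustrative
# assumptions, not the format Protocols actually uses.
tracking_plan = {
    "name": "E-commerce Tracking Plan",
    "events": [
        {
            "name": "Order Completed",
            "description": "The user finished checkout.",
            "properties": {
                "order_id": {"type": "string", "required": True},
                "total": {"type": "number", "required": True},
                "products": {"type": "array", "required": True},
            },
        },
        {
            "name": "Lead Captured",
            "description": "A prospect submitted a contact form.",
            "properties": {
                "email": {"type": "string", "required": True},
            },
        },
    ],
}


def planned_event_names(plan):
    """Return the event names the plan defines, sorted alphabetically."""
    return sorted(event["name"] for event in plan["events"])


# Serialize the plan so it can be shared or version-controlled as a JSON file.
spec_json = json.dumps(tracking_plan, indent=2)
print(planned_event_names(tracking_plan))
```

Keeping each event's description next to its properties is what lets a single document answer both "what do we collect?" and "what does it mean?".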


Streamline testing by automatically validating your data

Even with rigorous naming conventions and instrumentation instructions, engineers don’t receive automated feedback to help them identify and resolve issues during implementation. As a result, they must manually validate that the implementation is correct. When you’re responsible for reviewing thousands of lines of code across dozens of events, it’s inevitable that mistakes will happen. 

A single tracking error on a business-critical event, like Lead Captured, can cost your business hundreds of thousands of dollars. The problem is that these bugs are typically detected weeks or months later, and by that time, the damage has been done.

Because your Tracking Plan can be applied to one or many Sources, we can automate this process for you and detect mistakes before they impact production. Instead of manually comparing event payloads against a spec, the Data Validation Report will automatically confirm when your data matches your Tracking Plan. 


The Data Validation Report helps you detect tracking issues.

More importantly, the Data Validation Report will notify and alert you when data doesn’t adhere to your Tracking Plan. With context on the specific violation, you’ll have the information you need to quickly address any errors. You can review violations in the app, subscribe to the daily email digest, or use the Protocols API for real-time alerts. 
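The kind of check described above can be sketched in a few lines. This is an illustration of comparing an event payload against a spec, not how the Data Validation Report is implemented; the `PLAN` structure and the `find_violations` helper are hypothetical names introduced here.

```python
# Hypothetical plan: event name -> property name -> (expected type, required).
# This structure is an assumption for illustration only.
PLAN = {
    "Order Completed": {
        "order_id": (str, True),
        "products": (list, True),
        "coupon": (str, False),
    },
}


def find_violations(event_name, properties, plan=PLAN):
    """Return human-readable violations; an empty list means the event conforms."""
    if event_name not in plan:
        return [f"unplanned event: {event_name!r}"]

    spec = plan[event_name]
    violations = []
    for prop, (expected_type, required) in spec.items():
        if prop not in properties:
            if required:
                violations.append(f"missing required property: {prop!r}")
            continue
        if not isinstance(properties[prop], expected_type):
            violations.append(
                f"property {prop!r} should be {expected_type.__name__}, "
                f"got {type(properties[prop]).__name__}"
            )
    # Flag properties the plan does not know about.
    for prop in properties:
        if prop not in spec:
            violations.append(f"unplanned property: {prop!r}")
    return violations


# The two mistakes from the introduction: a misspelled event name and a
# products property sent as a string instead of an array.
print(find_violations("OrdreCompleted", {"order_id": "o-1", "products": []}))
print(find_violations("Order Completed", {"order_id": "o-1", "products": "shirt"}))
```

Each violation names the specific property and the mismatch, which is the context an engineer needs to fix the instrumentation quickly.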


Defend against rogue events by enforcing your Tracking Plan

Protocols is not just a tool for monitoring and reactive validation. It also provides an enforcement engine designed to protect your production data and keep your marketing tools and warehouses clean. Once you’ve solidified your Tracking Plan and implementation, you can configure your settings to automatically block any data that doesn’t adhere to your spec. That way, only planned and approved data will make it to your Destinations in Segment.

If you’re concerned about permanently discarding data from blocked events, we’ve got you covered. You can send your blocked data to an isolated warehouse to identify the root of the tracking problem. That way, if you decide you do want the data in your marketing or analytics tools, you can work with Segment to re-send the data to your Destinations. 
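Conceptually, enforcement is a routing decision: events that match the plan continue on, and everything else is set aside rather than thrown away. The sketch below illustrates that idea only; `PLANNED_EVENTS` and `route_events` are hypothetical names, and the real product is configured through settings rather than written as code like this.

```python
# Hypothetical set of approved event names taken from a tracking plan.
PLANNED_EVENTS = {"Order Completed", "Lead Captured"}


def route_events(events, planned=PLANNED_EVENTS):
    """Split events into those allowed through and those held in quarantine.

    Quarantined events are kept, not dropped, so they can be inspected
    and replayed later if they turn out to be wanted.
    """
    delivered, quarantined = [], []
    for event in events:
        if event.get("event") in planned:
            delivered.append(event)
        else:
            quarantined.append(event)
    return delivered, quarantined


incoming = [
    {"event": "Order Completed", "properties": {"order_id": "o-1"}},
    {"event": "OrdreCompleted", "properties": {"order_id": "o-2"}},
]
delivered, quarantined = route_events(incoming)
print(len(delivered), "delivered,", len(quarantined), "quarantined")
```

Holding blocked events in a separate store is what makes strict blocking safe to turn on: a typo costs you a replay, not the data.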

Built with developers in mind 

The Protocols API enables you to build, manage, update, and access your Tracking Plan. We also designed the API to make it easy to build applications on top of your Tracking Plan. To streamline your implementation, we’re releasing an open-source NPM project called Typewriter that’s built on the Protocols API.

Typewriter automatically generates a custom analytics client based on your Segment Tracking Plan. This enables you to pull data from a Tracking Plan directly into your code editor of choice, eliminating the need to switch tabs or manually triple-check your analytics implementation.


Typewriter provides inline data spec validation.

Instead of reading from a ticket or spreadsheet, Typewriter generates inline documentation and highlights any issues in your code before you deploy. It’s spellcheck and autocomplete for your Segment instrumentation.
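The idea of a client derived from a plan can be illustrated with a small sketch. This is not Typewriter's output or API: it is a Python stand-in that builds one function per planned event and rejects unknown or missing properties when the function is called, whereas the tooling described above surfaces problems in your editor before you deploy. The `PLAN` layout, `generate_client`, and the `send` callback are all hypothetical.

```python
from types import SimpleNamespace

# Hypothetical plan layout, assumed for illustration.
PLAN = {
    "Order Completed": {"required": ["order_id", "products"], "optional": ["coupon"]},
    "Lead Captured": {"required": ["email"], "optional": []},
}


def _make_tracker(event_name, required, optional, send):
    """Build a function that only accepts the properties the plan defines."""
    allowed = set(required) | set(optional)

    def tracker(**properties):
        missing = [p for p in required if p not in properties]
        unknown = sorted(set(properties) - allowed)
        if missing or unknown:
            raise TypeError(f"{event_name}: missing={missing}, unknown={unknown}")
        send(event_name, properties)

    # Inline documentation derived from the plan.
    tracker.__doc__ = f"Track {event_name!r}. Required: {required}. Optional: {optional}."
    return tracker


def generate_client(plan, send):
    """Return an object with one method per planned event, e.g. order_completed()."""
    methods = {
        name.lower().replace(" ", "_"): _make_tracker(
            name, spec["required"], spec["optional"], send
        )
        for name, spec in plan.items()
    }
    return SimpleNamespace(**methods)


# `send` stands in for whatever function actually delivers the event.
sent = []
client = generate_client(PLAN, lambda name, props: sent.append((name, props)))
client.order_completed(order_id="o-1", products=["shirt"])
print(sent)
```

Because callers use `client.order_completed(...)` instead of typing the event name as a string, a misspelling such as OrdreCompleted becomes an error at the call site rather than a silent new event.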

Protect the integrity of your data

Segment already helps you send consistent and complete data to every tool your team needs. But that’s not enough to build a customer-first organization. To ensure the experiences you build are relevant and the decisions you make are effective, the customer data feeding your marketing tech stack must be accurate. 

We believe data governance is a key component to your customer data infrastructure, and we’re excited for you to get your hands on Protocols. Data quality is the first problem we’re tackling with Protocols, and it’s just the beginning. Next, we will build tools to ensure data privacy and security with PII management, so stay tuned!

Ready to clean up your data? Click here to request access to Protocols. 

To learn more about how you can use Protocols to automate your data governance strategy, sign up for our upcoming webinar here or check out the docs.
