The Segment Infrastructure

Under the hood of the system that processes 1M+ events per second

Collection

Customer data lives everywhere: your website, your mobile apps, and internal tools.

That’s why collecting and processing all of it is a tricky problem. Segment has built libraries, automatic sources, and functions to collect data from anywhere—hundreds of thousands of times per second.

We’ve carefully designed each of these areas to ensure they’re:

- Performant (batching, async, real-time, off-page)
- Reliable (cross-platform, handle rate-limits, retries)
- Easy (setup with a few clicks, elegant, modern API)

Here’s how we do it.
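
To make the batching and retry ideas above concrete, here is a minimal sketch of a client that buffers events and retries a rate-limited send with exponential backoff. It illustrates the general technique only and is not Segment's actual library code: `BatchingClient`, `RateLimited`, and the injected `send` function are hypothetical names, and the sketch flushes synchronously, whereas a production library would usually flush in the background.

```python
import time
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


class RateLimited(Exception):
    """Hypothetical error raised by the transport when the server rate-limits us."""


class BatchingClient:
    """Illustrative only: buffers events and sends them in batches with retries."""

    def __init__(
        self,
        send: Callable[[List[Event]], None],  # assumed transport, e.g. an HTTP POST
        batch_size: int = 20,
        max_retries: int = 3,
        base_delay: float = 0.5,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.send = send
        self.batch_size = batch_size
        self.max_retries = max_retries
        self.base_delay = base_delay
        self.sleep = sleep
        self.queue: List[Event] = []

    def track(self, event: Event) -> None:
        """Buffer one event; flush once a full batch has accumulated."""
        self.queue.append(event)
        if len(self.queue) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Send everything buffered so far, backing off when rate-limited."""
        if not self.queue:
            return
        batch, self.queue = self.queue, []
        for attempt in range(self.max_retries + 1):
            try:
                self.send(batch)
                return
            except RateLimited:
                if attempt == self.max_retries:
                    # Out of retries: keep the events so a later flush can try again.
                    self.queue = batch + self.queue
                    raise
                # Exponential backoff: base_delay, 2x, 4x, ...
                self.sleep(self.base_delay * (2 ** attempt))
```

Injecting `send` and `sleep` keeps the sketch independent of any particular HTTP library and makes the retry path easy to exercise in a test.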

Processing

Data can be messy. Anyone who has dealt with third-party APIs, JSON blobs, and semi-structured text knows that only 20-30% of your time is spent driving insights. Most of your time is spent cleaning the data you already have.

At minimum, you’ll want to make sure your data infrastructure can:

- Handle GDPR suppressions across millions of users
- Validate and enforce arbitrary inputs
- Allow you to transform and format individual events
- Deduplicate retried requests
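
The four requirements above can be sketched as a single in-memory pass over incoming events. This is a simplified illustration under stated assumptions, not a description of Segment's pipeline: the `process` function is hypothetical, events are assumed to be dictionaries carrying string `messageId`, `userId`, and `event` fields, and the suppression list and seen-ID set are plain Python sets rather than the distributed stores a real system would need at scale.

```python
from typing import Any, Dict, Iterable, List, Set

Event = Dict[str, Any]

# Assumed required fields; a real schema would be richer than this.
REQUIRED_FIELDS = ("messageId", "userId", "event")


def is_valid(event: Event) -> bool:
    """Validate: every required field must be present and a non-empty string."""
    return all(
        isinstance(event.get(field), str) and event[field] != ""
        for field in REQUIRED_FIELDS
    )


def process(
    events: Iterable[Event],
    suppressed_users: Set[str],
    seen_ids: Set[str],
) -> List[Event]:
    """Hypothetical pipeline: validate, suppress, deduplicate, then transform."""
    output: List[Event] = []
    for event in events:
        if not is_valid(event):
            continue  # drop malformed input
        if event["userId"] in suppressed_users:
            continue  # honor a suppression request for this user
        if event["messageId"] in seen_ids:
            continue  # a retried request we have already processed
        seen_ids.add(event["messageId"])
        # Transform: normalize the event name without mutating the input.
        cleaned = dict(event, event=event["event"].strip().lower())
        output.append(cleaned)
    return output
```

Keying deduplication on a client-generated message ID is what lets a sender retry safely: a resent event carries the same ID and is dropped on the second pass.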