At this point it's well-accepted that analytics data is the beating heart of a great customer experience. It's what allows us to understand our customer's journey through the product and pinpoint opportunities for improvement.
“How should I secure my production environment, but still give developers access to my hosts?” It’s a simple question, but one which requires a fairly nuanced answer. You need a system which is secure, resilient to failure, and scalable to thousands of engineers–as well as being fully auditable.
Design systems are emerging as a vital tool for product design at scale. These systems are collections of components, styles, and processes to help teams design and build consistent user experiences. It seems like everyone is building one, but there is no playbook on how to take it from the first button to a production-ready system adopted across an organization. Much of the…
“How should we test this?”“Let’s just run it in production and monitor it closely.”— You and your coworker, probably.
Segment is a hub for a tremendous amount of data. It processes peaks of 230,000 events per second inbound, and 280,000 events per second outbound between more than 200 integration partners. You may think of Segment as black box for delivering all this data. You send data once to its tracking API, and it coordinates translating data and delivering it to many destinations.
Segment loads billions of rows of arbitrary events into our customers’ data warehouses every single day. How do we test a change that can corrupt only one field in millions, across thousands of warehouses? How can we verify the output when we don’t even control the input?
Unless you’ve been living under a rock, you probably already know that microservices is the architecture du jour. Coming of age alongside this trend, Segment adopted this as a best practice early-on, which served us well in some cases, and, as you’ll soon learn, not so well in others.
Today, we’re excited to share the architecture for Centrifuge–Segment’s system for reliably sending billions of messages per day to hundreds of public APIs. This post explores the problems Centrifuge solves, as well as the data model we use to run it in production.
Segment receives billions of events from our customers daily and has grown in to dozens of AWS accounts. Expanding in to many more accounts was necessary in order to best align with our GDPR and security initiatives, but it comes at a large complexity cost. In order to continue scaling gracefully we are investing in building tooling for employees to use with many accounts, and…
Go has a robust built-in testing library. If you write Go, you already know this. In this post we will discuss a handful of strategies to level up your Go testing. We have learned from experience on our large Go codebase that these strategies work to save time and effort maintaining the code.
It’s not hyperbole to say that Segment would not exist, if not for open source. We’re heavy users of Kafka, Redis, Terraform, Docker, Golang, and Node.js, just to name a few of the tools we use. And we literally got our start as an open source library launched on Hacker News.
Memory management can be tricky, to say the least. However, after reading the literature, one might be led to believe that all the problems are solved: sophisticated automated systems that manage the lifecycle of memory allocation free us from these burdens.
The way companies manage application secrets is critical. Even today, improper secrets management has resulted in an astonishing number of high profile breaches.
Today we’re excited to open source the various pieces of our logging pipeline. We’ve released a rate-limiting-syslog proxy, a journald fanout service, and a cloudwatch logs CLI. To understand how they work in concert, read on.
The single requirement of all data pipelines is that they cannot lose data. Data can usually be delayed or re-ordered–but never dropped.
Today we’re releasing ksuid, a Golang library for unique ID generation. It borrows core ideas from the ubiquitous UUID standard, adding time-based ordering and more friendly representation formats. In doing the research that went into this library, we uncovered a compelling story that we wanted to share with a larger audience.
Recently we shared the techniques we used to save more than a million dollars annually on our AWS bill. While we went into detail about the various problems and solutions, the most common question we heard was: "I know I’m spending a ton on AWS, but how do I actually break that into understandable pieces?"
In a scrappy B2B startup, user feedback is super valuable, but guerrilla research won’t cut it when you need a more targeted group of users. The Segment Design team found the users we needed and developed an automated process for recruitment and coordinating interviews using our own product and a few integrated applications.
The barrier holding back most open source projects is surprisingly mundane. It’s not test coverage. It’s not performance. It’s not code quality.
Running QA tests for Segment’s UI was taking way too long. Sure, we had strong component-level tests for our UI kit. But to test our whole app we needed to painstakingly poke around looking for oddities.
Today we’re proud to announce the Segment Open Fellowship. The Fellowship is a three month long program supporting three to five open-source developers with $8k per month to focus full-time on their project, no other strings attached.
For an early startup, using the cloud isn’t even a question these days. No RFPs, provisioning orders, or physical shipments of servers. Just the promise of getting up and running on “infinitely scalable” compute power within minutes.
Nightmare is a browser automation library for node.js, designed to be much simpler and easier to use than Phantomjs. We originally built Nightmare to create integration logos with 99Designs Tasks before they had an API, and we still use it in Sherlock. But the vast majority of Nightmare developers—now 55k+ downloads per month—use it for web UI testing and crawling.
At Segment, focus is one of our four core values. But it was difficult for team members to focus in the office, so in June we ran an internal team survey about what helps and hurts focus. The results showed that “chatter and noise” was one of the biggest culprits for distraction around the office. “Slack group channels” came in second.
When it comes to your app, size makes a difference. Bigger apps have fewer downloads, worse reviews, and a harder time penetrating the international market. We measured the exact impact of increased app size, shown below. We’ve also included learnings on how to prevent bloat in your own app.
A few months ago we were at a conference in Half Moon Bay talking with a general manager at a large mobile company, and he said, “One of my projects for the next six months is to reduce SDK bloat in all our apps.” Six months is a substantial investment! So we asked why this mattered so much to him.
At Segment, we’re working hard to make our mobile SDKs the best possible collection options for your analytics data. An SDK can make your data more durable, minimize data transfer, and optimize your app’s battery usage. In this article, we’ll take you through what happens under the hood as a piece of data flows through our iOS and Android SDKs from a button handler in your app…
As part of our push to open up what’s going on internally at Segment – we’d like to share how we run our CI builds. Most of our approaches follow standard practices, but we wanted to share a few tips and tricks we use to speed up our build pipeline.
AWS is the default for running production infrastructure. It’s cheap, scalable, and flexible to whatever configuration you’d like to run on top of it. But that flexibility comes with a cost: it makes AWS endlessly configurable.
Segment’s mobile SDKs are designed to track behavioral data from your app and translate and route that data to hundreds of downstream integrations. One of the SDK’s core tasks is to upload behavioral data to our servers. Since every network request requires your app to power up the device’s radio, uploading this data in real-time can quickly drain a battery.
For the past year, we’ve been heavy users of Amazon’s EC2 Container Service (ECS). It’s given us an easy way to run and deploy thousands of containers across our infrastructure.
Since Segment’s first launch in 2012, we’ve used queues everywhere. Our API queues messages immediately. Our workers communicate by consuming from one queue and then publishing to another. It’s given us a ton of leeway when it comes to dealing with sudden batches of events or ensuring fault tolerance between services.
I recently jumped back into frontend development for the first time in months, and I was immediately struck by one thing: everything had changed.