Geoffrey Keating on June 8th 2021

Data Modeling 101: What Are Data Models?

Through data models, developers, data architects, business analysts, and other stakeholders can agree on the data they’ll capture and for what purposes before building databases and warehouses.

A data model specifies what information to capture, how different pieces of information relate to each other, and how to store it all, establishing data standards for your entire organization. For example, a model for an eCommerce website might specify the customer data you’ll capture. It will define how to label that data and how it relates to product information and the sales process.

Like a blueprint for a house, a data model defines what to build and how, before starting construction, when things become much more complicated to change. This approach prevents database design and development errors, capturing unnecessary data, and duplicating data in multiple locations.

In this article, we’ll cover these basics of data modeling:

  • Understanding different types of data models

  • Why data models are necessary for building a data infrastructure

  • Top three data modeling techniques

Understanding different types of data models

Data models fall into three categories: conceptual, logical, and physical models. They help align stakeholders around the why, how, and what of your data project. Each type of model serves a different purpose and audience in the data modeling process.

Conceptual data models

Conceptual data models visualize the concepts and rules that govern the business processes you’re modeling without going into technical details. You use this visualization to align business stakeholders, system architects, and developers on the project and business requirements: what information the data system will contain, how elements should relate to each other, and their dependencies.

Typically, a conceptual model shows a high-level view of the system’s content, organization, and relevant business rules. For example, a data model for an eCommerce business will contain vendors, products, customers, and sales. A business rule could be that each vendor needs to supply at least one product.

There’s no standard format for conceptual models. What matters is that the model helps both technical and non-technical stakeholders align and agree on the purpose, scope, and design of their data project. Anything from a whiteboard sketch to a formal diagram can serve as a conceptual data model.

Logical data models

A logical data model is based on the conceptual model and defines the project’s data elements and relationships. You’ll see the names of specific entities in the database, as well as their attributes. To stay with the eCommerce example: A logical model shows products are identified through a “product ID,” with properties like a description, category, and unit price.

Data architects and business analysts use the logical data model to plan the implementation of a database management system—software that stores, retrieves, defines, and manages data in a database.
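To make this more concrete, here’s a rough sketch of what the logical entities for that eCommerce example might look like, expressed as Python dataclasses. The entity and attribute names are illustrative assumptions, not a prescribed format; a logical model only needs to name the entities, their attributes, and how they relate.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Product:
    """Logical entity: what we capture about a product, not yet how it's stored."""
    product_id: int
    description: str
    category: str
    unit_price: float


@dataclass
class Sale:
    """A sale references the products it contains (one sale, many products)."""
    sale_id: int
    customer_id: int
    product_ids: List[int]
```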

Physical data models

The physical data model gets technical. Database analysts and developers use it for the design of the database. The model specifies the types of data you’ll store along with technical requirements.

An example of data type specifications is whether a piece of data will be an integer—a number without a decimal point—or a float—a number with a decimal place. Technical requirements include details on storage needs, access speed, and data redundancy—storing a piece of data in multiple locations to increase durability and improve query performance.
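As a contrast with the logical sketch above, here is roughly what part of the physical model could translate into, using Python’s built-in sqlite3 module. The table names, column types, and schema are illustrative assumptions, not a canonical design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database for the example

# The physical model pins down concrete storage types:
# INTEGER for IDs (no decimal point), REAL (a float) for prices.
conn.executescript("""
CREATE TABLE products (
    product_id  INTEGER PRIMARY KEY,
    description TEXT NOT NULL,
    category    TEXT NOT NULL,
    unit_price  REAL NOT NULL
);

CREATE TABLE sales (
    sale_id     INTEGER PRIMARY KEY,
    product_id  INTEGER NOT NULL REFERENCES products(product_id),
    quantity    INTEGER NOT NULL,
    sold_at     TEXT NOT NULL  -- ISO 8601 timestamp
);
""")
conn.close()
```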

In practice, only very large projects, say modeling a container shipping business, move from conceptual to logical to physical models. Most other projects skip the conceptual phase and spend most of their time in logical modeling. Some teams even cover elements from the physical phase simultaneously because the people working on the logical model also do the technical implementation.

Why data models are necessary for building a data infrastructure

Data models turn abstract ideas (“we want to track our global container shipments in real time”) into a technical implementation plan (“we will store an attribute called ‘container GPS location’ in a table called ‘Containers’ as an integer”). They help avoid costly demolition and reconstruction of your data infrastructure because you need to think about the data you’ll need, its relations, the database framework, and even whether your project is viable before creating databases and warehouses.

Data models also help with data governance and legal compliance. They allow you to set standards from the start of the project, so teams don’t end up with conflicting data formats that need cleaning up before anyone can use them or, worse, can’t be used at all.

Data models and standardization help avoid situations like a sign-up field labeled in nearly a dozen different ways across the organization.

You can also identify sensitive information—social security numbers, passwords, credit card numbers—while you’re modeling so you can involve security and legal experts before you start building.

With safe, accurate, and high-quality data, all teams benefit. Product teams can iterate faster and build immersive user experiences. Analytics teams can create queries without heavy workarounds. And marketing teams can improve advertising efforts by personalizing messaging according to user behaviors and traits.

Customer Data Platforms (CDPs) like Segment can do much of the heavy lifting during data modeling projects. Segment’s Connections feature makes it easy to capture, organize, and visualize every customer-facing interaction with your business, whether digital or offline. Protocols lets you define your data standards and enforce them at the point of collection. Using real-time data validation and automatic enforcement controls, you can diagnose issues before they pollute your marketing and analytics tools or data warehouse.

Top three data modeling techniques

There are many different techniques to design and structure a database. You should explore these techniques and decide on the most suitable one for your project at the end of the conceptual phase. These data modeling methodologies define how the database gets structured, and they’re closely tied to the database technology you’ll use for your data project.

For example, many people now default to graph modeling because it’s new and popular, even when a simple relational model would suffice. Understanding the most popular techniques helps you avoid such mistakes.

1. Relational data modeling

In a relational data model, data is stored in tables, and specific elements in one table link to information in other tables. Entities can have a one-to-one, one-to-many, or many-to-many relationship.

Relational databases often use SQL (Structured Query Language), a programming language, for accessing and managing data. They’re frequently used in point-of-sale systems, as well as for other types of transaction processing.

The Entity-Relationship Model—sometimes referred to as ER model—is similar to the relational model. It visualizes the relationships between different elements in a system but without going into technical details. You can use the ER model during the conceptual phase to align technical and non-technical stakeholders.
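As a minimal, runnable illustration of the relational approach (using Python’s built-in sqlite3 module; the tables and data are made up): one customer row links to many order rows through a foreign key, and SQL is how you query across that relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    total       REAL
);
INSERT INTO customers VALUES (1, 'Ada');
INSERT INTO orders VALUES (10, 1, 42.50), (11, 1, 9.99);
""")

# One-to-many: one customer row links to many order rows via the foreign key.
rows = conn.execute("""
    SELECT c.name, COUNT(o.order_id) AS order_count
    FROM customers c JOIN orders o ON o.customer_id = c.customer_id
    GROUP BY c.name
""").fetchall()
print(rows)  # [('Ada', 2)]
conn.close()
```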

2. Dimensional data modeling

To understand dimensional data models, picture a cube. Each side of the cube represents an aspect of the data you’re trying to capture.

For example, suppose your business sells multiple products to different customer segments, and you want to evaluate sales performance over time. You can visualize this as a data cube, with dimensions for time, products, and customer segments. By traveling up, down, left, and right on the axes of the cube, you can make comparisons across all those dimensions. You’ll see how the sales of each of these products compare to each other and different customer segments at any point in time.

You use the cube model during the conceptual phase. One of the most frequent manifestations of such a cube in the logical stage is the “star schema,” like the one sketched below. At first glance, it might look like a relational model, but the star schema is different because it has a central node that connects to many others.
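A minimal star schema for the sales-by-time-product-segment example might look like this; the fact and dimension table names are illustrative, not a prescribed design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one per side of the "cube".
CREATE TABLE dim_date    (date_id    INTEGER PRIMARY KEY, calendar_date TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_segment (segment_id INTEGER PRIMARY KEY, name TEXT);

-- Central fact table: the node that connects out to every dimension.
CREATE TABLE fact_sales (
    date_id    INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    segment_id INTEGER REFERENCES dim_segment(segment_id),
    revenue    REAL,
    units_sold INTEGER
);
""")
conn.close()
```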

3. Graph data modeling

During the conceptual phase, most people sketch a data model on a whiteboard. Such a sketch resembles the graph model. It consists of “nodes” and “edges”: a node represents where the data is stored, and an edge represents the relation between nodes. That directness is also the main advantage of this approach: “what you sketch on the whiteboard is what you store in the database.”

Other techniques require you to translate the output from the conceptual phase into a different format for the logical and physical implementation—for example, going from an ER to a relational model or from a cube model to a star schema. Not so with graph models. You can implement them straight away using technology like Neo4j, a native graph database platform.
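To illustrate that “sketch is what you store” property, here is a minimal sketch using the official neo4j Python driver against a hypothetical local instance; the node labels, relationship type, and credentials are all assumptions for the example.

```python
from neo4j import GraphDatabase  # pip install neo4j

# Connection details are placeholders for a local Neo4j instance.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # The Cypher mirrors the whiteboard sketch: two nodes and the edge between them.
    session.run(
        "MERGE (c:Customer {customer_id: $cid}) "
        "MERGE (p:Product {sku: $sku}) "
        "MERGE (c)-[:PURCHASED]->(p)",
        cid=42, sku="SKU-123",
    )

driver.close()
```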

Data models don't have to be difficult

When you understand the purpose of data models and the process to follow, they’re not challenging to create, especially if you also collect, organize, and standardize your data with Segment. You’ll align all stakeholders before starting technical implementation and avoid costly mistakes or rebuilds. You’ll know what expertise you need on the team to execute your plan and have your data governance defined, too.

Andy Li on June 1st 2021

Access is always changing. When you start at a new company, you usually are given access to a set of apps provisioned to you on day one, based on your team and role. Even on day one, there can be a difference between the access you are granted and the access you need to do your job. This results in two outcomes: underprovisioned or overprovisioned access. 

For the IT and security teams who manage cloud infrastructure accounts, securing access to them can be difficult and scary; the systems are complex, and the stakes are high. If you grant too much access, you might allow bad actors access to your tools and infrastructure, which at best results in a breach notification; at worst, it results in a company-ending, game-over scenario. If you grant too little access, you put roadblocks between your colleagues and the work they need to do, meaning you are decreasing your company’s productivity.

Overprovisioned access

A common approach taken by startups and small companies is to grant access permissively. In these companies, early productivity can be critical to the success of the business. An employee locked out of a system because of missing access means lost productivity and lost income for the business. 

If you give employees permanent admin access to every system, you optimize for velocity, but at the expense of increased risks from compromised employee accounts and insider threats. This results in an increased attack surface. As your company grows, it becomes more important to secure access to critical resources, and this requires a different approach.

Underprovisioned access

If you give employees too little access, it forces them to request access more often. Although new employees are initially given access based on their team and role, new duties and new projects can quickly increase the scope of the access they need. Depending on your company’s process for providing access, this can be cumbersome for the requester, for the approver, or oftentimes, for both. 

Here at Segment, we have production environments across Amazon Web Services (AWS) and Google Cloud Platform (GCP). We need to secure access to these accounts thoughtfully so that our engineers can continue to build fast and safely. At many companies, you might rely on a centralized team to manage internal access. While this is a simple approach, it does not scale – team members have a limited amount of context surrounding requests, and might accidentally over-provision the requester’s access. At Segment, we approached the problem of managing least-privilege cloud access by building Access Service: a tool that enables time-based, peer-reviewed access.

Setting the stage: access at Segment

At Segment, we have hundreds of roles across dozens of SaaS apps and cloud providers, representing different levels of access. In the past, we had to log in to each app or system individually to grant a user access. Our IT team then “federated” our cloud access, using Okta as our Identity Provider. This gave us a single place to manage which users have access to which roles and applications. The rest of this blog post builds on this federated access system.

If your organization hasn’t built something similar, there are plenty of public blog posts and vendor docs that can help you set up your own federated cloud access system.

Mapping Okta apps to AWS roles

By mapping Okta applications to cloud provider roles, engineers are one click away from authenticating to a cloud provider via single sign-on (SSO) with the appropriate permissions.

Each Okta app is mapped to a “Cloud Account Role” (or “Cloud Project Role” for GCP). For example, in AWS, we have a Staging account with a Read role that provides read access to specific resources. In Okta, we have a corresponding app named “Staging Read - AWS Role” that allows engineers to authenticate to the AWS Staging Account and assume the Read role.

This requires configuring an Okta app for each “Cloud Account Role” combination, which at the time of writing is 150+ Okta apps.
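The post doesn’t share Segment’s actual account or role names, but the combinatorial growth is easy to picture with a toy sketch; the accounts and roles below are made up.

```python
# Hypothetical accounts and roles -- the real names aren't shared in the post.
aws_accounts = ["Staging", "Production", "Dev", "Tooling"]
roles = ["Read", "Write", "Admin"]

# One Okta app per (account, role) pair, e.g. "Staging Read - AWS Role".
okta_apps = [f"{account} {role} - AWS Role" for account in aws_accounts for role in roles]

print(len(okta_apps))  # 12 apps for just 4 accounts x 3 roles; Segment has 150+
```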

Configuring GCP with Okta is slightly different, and technical details for how to do this are at the bottom of this blog.

Mapping Okta groups to SaaS app groups

In addition to authentication, Identity Providers can also help with authorization. Users get understandably frustrated when they get access to an application, but don’t have the correct permissions to do their job. 

Identity Providers have agreed upon a common set of REST APIs for managing provisioning, deprovisioning, and group mapping called SCIM (the System for Cross-domain Identity Management).

If an application supports SCIM, you can create groups within your Identity Provider (e.g. Okta), which will map user membership into the application. With this setup, adding users to the Okta group will automatically add them to the corresponding group in the application. Similarly, when a user is unassigned from the application in Okta, their membership in the application group will also be lost. 

SCIM allows us to provide granular, application-level access, all while using our Identity Provider as the source of truth.
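Under the hood, the group-membership sync boils down to standard SCIM 2.0 calls (RFC 7644). Here is a rough sketch of the kind of PATCH an Identity Provider issues when a user is added to a group; the endpoint, IDs, and token are placeholders.

```python
import requests

SCIM_BASE = "https://app.example.com/scim/v2"  # the SaaS app's SCIM endpoint (placeholder)
TOKEN = "scim-bearer-token"                    # placeholder credential
group_id = "engineering-group-id"              # placeholder SCIM group id
user_id = "scim-user-id"                       # placeholder SCIM user id

# Standard SCIM 2.0 PatchOp: add a member to a group.
payload = {
    "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
    "Operations": [
        {"op": "add", "path": "members", "value": [{"value": user_id}]}
    ],
}

resp = requests.patch(
    f"{SCIM_BASE}/Groups/{group_id}",
    json=payload,
    headers={"Authorization": f"Bearer {TOKEN}"},
)
resp.raise_for_status()
```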

With a single place to manage access for all of our cloud providers, the problem should be solved, right? Not quite… 

While the underlying Okta apps and groups system worked great, we quickly ran into more human problems.

Pitfalls of centralized access management

Even with our awesome new Okta+AWS system, we still needed a process for a centralized team to provision access through Okta. At many companies, this team would be IT. At Segment, this was a single person named Boggs. Requests would go into his inbox, and he would manually review the request reason, and decide if there was a more suitable level of access for the task. Finally, he would go to the Okta admin panel and provision the appropriate app to the user. Although this system worked for a time, it was not scalable and had major drawbacks.

Permanent access

Once an app was provisioned to a user, they had access until they left Segment. But while the access was permanent, the need for it usually wasn’t. Unfortunately, our manual provisioning process had no scalable way to ensure access was removed once it was no longer needed. People granted access for one-off tasks ended up with permanent access that hung around long after they actually needed it.

Difficulty scaling due to limited context 

As an engineering manager, Boggs had a strong sense of available IAM roles and their access levels. This allowed him to reduce unnecessary access by identifying opportunities to use less sensitive roles. This context was difficult to replicate and was a big reason why we could not simply expand this responsibility to our larger IT team. 

Most centralized IT teams don’t work closely with all of the apps that they provision, and this makes it difficult for them to evaluate requests. Enforcing the principle of least privilege can require intimate knowledge of access boundaries within a specific app. Without this knowledge, you’ll have a hard time deciding if a requester really needs “admin”, or if they could still do the work with “write” permissions, or even just with “read” access.

Kyle from the Data Engineering team is requesting access to the Radar Admin role to “debug”. What do they actually need access for? Would a Read only role work? And wait… who is Kyle?! Did they start last week? They say that they need this access to do their job and I still need to do mine… APPROVE. 

It was slow

Despite being better equipped than most people to handle access requests, Boggs was a busy engineering manager. Although provisioning access was an infrequent task at first, as the company grew it began to take up valuable chunks of his time, and it became increasingly difficult for him to understand the context of each request.

We considered involving extra team members from our IT team, but this would still take time, as they would need to contact the owners of each system to confirm that access should be granted. Ultimately, having a limited pool of centralized approvers working through a shared queue of requests made response times less than ideal.

Breaking Boggs 

Boggs tried automating parts of the problem away using complex scripted rules based on roles and teams, but there were still situations that broke the system. How would he handle reorgs where teams got renamed, switched, merged, or split? What happens when a user switches teams? What happens when a team has a legitimate business need for short-term access to a tool they don’t already have? Under that system, any access Boggs provisioned lasted forever – unless somebody went in and manually audited Okta apps for unused access.

Ultimately, we found ourselves in a situation where we had a lot of over-provisioned users with access to sensitive roles and permissions. To make sure we understood how bad the problem actually was, we measured the access utilization of our privileged roles. We looked at how many privileged roles each employee had access to, and compared them to how many privileged roles had actually been used in the last 30 days.

The results were astonishing: 60% of access was not being used.
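The post doesn’t show the measurement itself, but conceptually it is a per-employee set comparison; a simplified sketch with made-up data:

```python
# Hypothetical inputs: roles provisioned to each employee vs. roles used in the last 30 days.
provisioned = {
    "alice": {"prod-admin", "staging-admin", "radar-admin"},
    "bob": {"prod-admin", "staging-read"},
}
used_last_30_days = {
    "alice": {"staging-admin"},
    "bob": {"prod-admin", "staging-read"},
}

total = sum(len(roles) for roles in provisioned.values())
unused = sum(
    len(roles - used_last_30_days.get(person, set()))
    for person, roles in provisioned.items()
)
print(f"{unused / total:.0%} of provisioned access was not used")  # 40% for this toy data
```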

Managing long-lived access simply did not scale. We needed to find a way to turn our centralized access management system into a distributed one. 

Access Service

In the real world of access, we shouldn’t see a user's access footprint as static, but instead view it as amorphous and ever-changing.

When we adopted this perspective, it allowed us to build Access Service, an internal app that allows users to get the access they really need, and avoid the failure modes of provisioning too little or too much access.

Access Service allows engineers to request access to a single role for a set amount of time, and have their peers approve the request. The approvers come from a predefined list, which makes the access request process similar to a GitHub pull request with designated reviewers.

As soon as the request is approved, Access Service provisions the user with the appropriate Okta app or group for the role. A daily cron job checks if a request has expired, and de-provisions the user if it has. 
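Access Service’s internals aren’t public, but the daily expiry job is conceptually simple. Below is a hedged sketch, assuming requests are tracked with an expiry timestamp and that deprovisioning means removing the user’s Okta app assignment via Okta’s Apps API; the storage layer and all names are assumptions.

```python
from datetime import datetime, timezone
import requests

OKTA_BASE = "https://example.okta.com"  # placeholder Okta org
OKTA_TOKEN = "okta-api-token"           # placeholder API token

def expired_requests():
    """Placeholder: fetch approved requests whose expiry has passed (e.g. from a database)."""
    return [
        {"okta_app_id": "app-id", "okta_user_id": "user-id",
         "expires_at": datetime(2021, 6, 1, tzinfo=timezone.utc)},
    ]

def deprovision(app_id: str, user_id: str) -> None:
    # Remove the user's assignment to the Okta app, revoking the federated cloud role.
    resp = requests.delete(
        f"{OKTA_BASE}/api/v1/apps/{app_id}/users/{user_id}",
        headers={"Authorization": f"SSWS {OKTA_TOKEN}"},
    )
    resp.raise_for_status()

def run_daily_job() -> None:
    now = datetime.now(timezone.utc)
    for req in expired_requests():
        if req["expires_at"] <= now:
            deprovision(req["okta_app_id"], req["okta_user_id"])

if __name__ == "__main__":
    run_daily_job()
```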

At a high level this is a simple web app, but let’s look closer at some specific features and what they unlock.

Temporary access

The magic of Access Service is the shift from long-lived access to temporary access. Usually, an engineer only needs access temporarily to accomplish a defined task. 

Once that task is done, they have access they no longer need, which violates the principle of least privilege. Fixing this using the old process would mean manually deprovisioning Okta apps – adding yet another task to a workflow that was already painfully manual.

With Access Service, users specify a duration with their access request. Approvers can refuse to approve the request if they think the duration is unnecessarily long for the task. This duration is also used to automatically deprovision their access once the request expires.

Access Service offers two types of durations: “time-based access” and “activity-based access”. 

Time-based access is a specific time period, such as one day, one week, two weeks, or four weeks. This is ideal for unusual tasks such as: 

  • fixing a bug that requires a role you don’t usually need

  • performing data migrations

  • helping customers troubleshoot on production instances you don’t usually access

Activity-based access is a dynamic duration that extends the access expiration each time you use the app or role you were granted. This is ideal for access that you need for daily job functions – nobody wants to make a handful of new access requests every month. However, we don’t offer this type of access for our more sensitive roles: broad-access roles, or roles that touch sensitive data, require periodic approvals to maintain access. Activity-based access provides a practical balance between friction and access, aligning with our goal of enabling our engineers to build quickly and safely.
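Here is a sketch of how the two duration types might be evaluated; the field names and the 30-day inactivity window are assumptions, not Access Service’s actual implementation.

```python
from datetime import datetime, timedelta, timezone

ACTIVITY_WINDOW = timedelta(days=30)  # assumed inactivity window for activity-based access

def is_expired(grant: dict, now: datetime) -> bool:
    if grant["type"] == "time":
        # Time-based: fixed expiry from the approved duration (1 day, 1 week, ...).
        return now >= grant["granted_at"] + grant["duration"]
    # Activity-based: each use pushes the expiry out by the inactivity window.
    return now >= grant["last_used_at"] + ACTIVITY_WINDOW

now = datetime.now(timezone.utc)
print(is_expired({"type": "time",
                  "granted_at": now - timedelta(days=8),
                  "duration": timedelta(weeks=1)}, now))           # True: one-week grant is over
print(is_expired({"type": "activity",
                  "last_used_at": now - timedelta(days=2)}, now))  # False: used recently
```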

Designated approvers

One of the biggest limitations with our previous process was that one person had to approve everything. In Access Service, each app has a vetted list of approvers who work closely with the system. By delegating decision making to experts, we ensure that access is approved by the people who know who should have it. 

To start out with, you can’t approve your own access requests. (Sorry red team.) Each app has a “system owner” who is responsible for maintaining its list of approvers. When a user creates an access request, they select one or more approvers to review their request. Because the approvers list contains only people who work closely with the system, the approvers have better context and understanding of the system than a central IT team.

This makes it easier for approvers to reject unreasonable or too-permissive access requests, and encourages users to request a lower tier of access (for example, telling them to request a read-only role instead of a read/write role). Since incoming requests are “load balanced” between approvers, users also see a much faster response time to their requests. 

Provisioning access always requires two people, much like a GitHub pull request. Users cannot select themselves as an approver, even if they are a system owner. Access Service also supports an “emergency access” mechanism with different approval requirements. This prevents Access Service from blocking an on-call or site reliability engineer if they need access in the middle of the night. 

With system owners appointed for each app, our distributed pool of approvers continues to scale as we introduce new tools with new access roles and levels. This is what the security community calls “pushing left.”

When you “push left”, you introduce security considerations earlier in the development lifecycle, instead of trying to retrofit a system after it is in use. In the software engineering space, “pushing left” resulted in engineers learning more about security. This means that the people most familiar with the systems are the most knowledgeable people to implement security fixes. Since the engineers are the ones who designed and now maintain the software, they have much more context than the central security team. Similarly, Access Service unburdens the central IT team, and empowers system owners to make decisions about who should have access to their systems, and at what level. This significantly reduces the amount of time the IT team spends provisioning access, and frees them up to do more meaningful work.

How it works

Access Service, like many of our internal apps, is accessible to the open internet, but protected behind Okta.

The basic unit of Access Service is a “request”. A user who wants access creates a request that includes four pieces of information: 

  • the application they want access to

  • the duration they want access for

  • a description for why they need access

  • the approver(s) they want to review the request

When they click “Request Access”, Access Service sends the selected approvers a Slack notification. Segment, like many modern companies, lives in Slack, so using that channel makes Access Service a more natural, less disruptive part of people’s workflows. Even if the user requesting access is an approver for the particular app, they must receive approval from a different approver – every request must involve two people.
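The Slack side amounts to a single Web API call. A sketch using Slack’s chat.postMessage endpoint; the token, recipient, and message format are placeholders rather than what Access Service actually sends.

```python
import requests

SLACK_TOKEN = "xoxb-placeholder-bot-token"  # placeholder bot token

def notify_approver(approver_slack_id: str, requester: str, app: str, duration: str) -> None:
    resp = requests.post(
        "https://slack.com/api/chat.postMessage",
        headers={"Authorization": f"Bearer {SLACK_TOKEN}"},
        json={
            "channel": approver_slack_id,  # a DM can be addressed by user ID
            "text": f"{requester} is requesting *{app}* for {duration}. "
                    f"Review it in Access Service.",
        },
    )
    resp.raise_for_status()

notify_approver("U123ABC", "kyle@segment.com", "Radar Read", "1 week")
```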

The access request is tracked in a web app, so you can see what requests you have open, and what roles you currently have access to.

The requester is notified via Slack when their request has been approved, so they know they can now get back to the task they needed access for in the first place.

The results

After we migrated our access process to Access Service, the result was zero long-lived access to any of our privileged cloud roles in AWS and GCP. All access granted to these roles expires if it is not actively used. 

In the graph below, “Access Points” refers to the number of users with access to each admin role. After moving to Access Service, we reduced the number of people who had privileged access by 90%. 

In the next graph, “Active” refers to the number of people who used an app within the last 30 days. This number is higher than the number of Access Points, which shows that more access was used in the last 30 days than was provisioned at the time of measurement.

That seems strange – how could admin apps have been used by more people than the total number of people provisioned access? It’s because expired access had already been automatically deprovisioned, reducing the number of Access Points by the end of the 30-day window!

Conclusion

By acknowledging that access needs are constantly changing, we were able to create a more practical way to manage access control.

Access Service allows us to streamline the access approval process. By routing requests directly to designated approvers, we are able to get fast approvals from people with rich context. The time-based component of access requests allows the service to regularly remove unneeded access, preventing our access attack surface from growing too large. Finally, integrating Slack into the system makes approvals faster, ensures that you know immediately when your request has been approved, and reminds you when the request is expiring so you don’t run into unexpected access loss when just trying to do your job.

While it can be daunting to try to reinvent an existing, well-established process, the results can be incredibly rewarding. Start by writing down your goals, thinking about what you don’t like and what is painful about the current state, and reevaluate your core assumptions. Companies are always changing, and your processes have to keep up; the circumstances that led to the previous system may no longer be applicable today. Most importantly, remember to build with the user’s workflow in mind, because security depends on participation of the whole company.

Future development

Policies

Apps in Access Service are currently individually customizable. However, this can lead to issues with scalability if we want to make changes across multiple, similar apps. For example, if we decide that we want to limit access to several AWS accounts to no more than one week, we would currently have to edit the allowed durations for each individual role. With the introduction of policies, we would be able to map several roles to a single policy, allowing us to easily apply the change from the previous example. 
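A policy could be as simple as a named set of constraints that several roles point at; the sketch below is hypothetical, including all names and limits.

```python
# Hypothetical policy objects shared by several roles.
POLICIES = {
    "sensitive-aws-account": {"max_duration_days": 7, "activity_based_allowed": False},
}

ROLE_POLICIES = {
    "Production Admin - AWS Role": "sensitive-aws-account",
    "Production Write - AWS Role": "sensitive-aws-account",
}

def allowed_durations(role: str) -> list:
    policy = POLICIES[ROLE_POLICIES[role]]
    # Tightening the policy once (e.g. max_duration_days = 7) updates every mapped role.
    return [d for d in (1, 7, 14, 28) if d <= policy["max_duration_days"]]

print(allowed_durations("Production Admin - AWS Role"))  # [1, 7]
```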

Dynamic Roles

Currently, Access Service grants users access to predefined AWS roles. These roles are typically made to be general-purpose, but there may be use-cases not fully captured by an existing role. Instead of configuring a new role for one-off needs, or using an overly permissive role, Access Service could allow users to create a dynamic role. When making a request, users would check boxes corresponding to what permissions they wanted (e.g. “S3 Read”, “CloudWatch Full Access”, etc) to create a custom, dynamic role.
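A sketch of what that might look like with boto3: each checkbox maps to an AWS managed policy, and the selection is attached to a freshly created role. The feature itself is hypothetical, and so are the account ID, role name, and trust policy below.

```python
import json
import boto3

# Hypothetical checkbox -> AWS managed policy mapping.
CHECKBOXES = {
    "S3 Read": "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
    "CloudWatch Full Access": "arn:aws:iam::aws:policy/CloudWatchFullAccess",
}

# Placeholder SAML trust policy so the role can be assumed through federated SSO.
ASSUME_ROLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Federated": "arn:aws:iam::123456789012:saml-provider/Okta"},
        "Action": "sts:AssumeRoleWithSAML",
        "Condition": {"StringEquals": {"SAML:aud": "https://signin.aws.amazon.com/saml"}},
    }],
}

def create_dynamic_role(role_name: str, selected: list) -> None:
    iam = boto3.client("iam")
    iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(ASSUME_ROLE_POLICY),
    )
    for label in selected:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=CHECKBOXES[label])

create_dynamic_role("access-service-dynamic-kyle", ["S3 Read"])
```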


Special thanks to David Scrobonia for creating Access Service and setting up the foundation for this blog. Thank you to John Boggs, Rob McQueen, Anastassia Bobokalonova, Leif Dreizler, Eric Ellett, Pablo Vidal, Arta Razavi, and Laura Rubin, all of whom either built, designed, inspired, or contributed to Access Service along the way.


References

Configuring GCP roles in Okta

Connecting a GCP role to Okta is harder than with AWS, and after struggling to figure it out for a while, we thought it would be worth sharing. To connect a GCP role to our Okta instance, we had to use Google Groups in GSuite. 

First, we created a single GSuite Group for each of our Project-Role pairs. In GCP, a Google Group is a member (principal) that can be assigned a role, and all users added to the group are also assigned that role. 

We then assigned each GCP role to its corresponding Google Group. Next, we needed to connect the Google Groups to Okta. 

You can do this by using Okta Push Groups, which link an Okta “group” to a Google Group. Adding a user to an Okta Push Group automatically adds the correct GSuite user to the Google group. We created an Okta Group for each of the roles and configured it as a Push Group to its corresponding Google Group.

To summarize, the flow looked like this: 

  1. Add Okta User david@segment.com to Okta Group “Staging Read - GCP Role” 

  2. Okta Push Groups adds the GSuite user david@segment.com to the “Staging Read” Google Group

  3. Because he is a member of the “Staging Read” Google Group, david@segment.com is assigned the “Read” IAM role for the “Staging” project.
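Behind the scenes, step 2 amounts to a Google Admin SDK Directory API call, which Okta Push Groups makes for you. Here is a sketch of the equivalent direct call; the service-account file, admin user, and group address are placeholders.

```python
from google.oauth2 import service_account
from googleapiclient.discovery import build

# Placeholder service-account credentials with the Admin SDK group-member scope.
creds = service_account.Credentials.from_service_account_file(
    "service-account.json",
    scopes=["https://www.googleapis.com/auth/admin.directory.group.member"],
    subject="admin@segment.com",  # an admin user to impersonate (placeholder)
)

directory = build("admin", "directory_v1", credentials=creds)

# Equivalent of Okta Push Groups adding david@segment.com to the "Staging Read" group.
directory.members().insert(
    groupKey="staging-read@segment.com",  # placeholder group address
    body={"email": "david@segment.com", "role": "MEMBER"},
).execute()
```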

A BeyondCorp approach to internal apps

All of our internal apps use an OpenID Connect (OIDC) enabled Application Load Balancer (ALB) to connect to Okta. This provides a BeyondCorp approach to access for our internal apps: all are publicly-routable, but are behind Okta. 

This is also nice from a tooling developer standpoint, because not only is authentication taken care of, but we can use the signed JSON web token (JWT) that Okta returns to the server through the ALB to get the identity of the user interacting with Access Service. This allows us to use Okta as a coarse authorization layer and manage which users have access to different internal apps.
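For completeness, here is a sketch of how a service behind such an ALB can read the caller’s identity. AWS passes the user’s claims in the x-amzn-oidc-data header as an ES256-signed JWT, with the signing public key fetchable from a regional endpoint; the region and the framework wiring in the comment are placeholders.

```python
import jwt        # pip install pyjwt[crypto]
import requests

REGION = "us-west-2"  # placeholder AWS region

def identity_from_alb_header(oidc_data: str) -> dict:
    """Verify the ALB-signed JWT from the x-amzn-oidc-data header and return its claims."""
    kid = jwt.get_unverified_header(oidc_data)["kid"]
    pub_key = requests.get(
        f"https://public-keys.auth.elb.{REGION}.amazonaws.com/{kid}"
    ).text
    return jwt.decode(oidc_data, pub_key, algorithms=["ES256"])

# claims = identity_from_alb_header(request.headers["x-amzn-oidc-data"])
# claims["email"] identifies the Okta user interacting with Access Service.
```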

Jim Young on June 22nd 2021

Insights for UK businesses about how attitudes, preferences, and experiences with personalisation have evolved over the past year.

Geoffrey Keating on June 21st 2021

This guide explains what data management is and how it lets organizations capture their data’s upside while removing the downsides of unmanaged data.

Kate Butterfield on June 16th 2021

Get an inside look at the design process for Journeys.

Katrina Wong on June 10th 2021

Joy Mangano shares her journey from QVC to omnichannel retail.

Geoffrey Keating on June 9th 2021

A customer data hub is the primary collection point for all your customer information. It connects to all the channels, platforms, and products your customers use.

Josephine Liu, Sherry Huang on June 9th 2021

Our latest feature, Journeys, empowers teams to unify touchpoints across the end-to-end customer journey.

Geoffrey Keating on June 8th 2021

Data Modeling 101: What Are Data Models?

Through data models, developers, data architects, business analysts, and other stakeholders can agree on the data they’ll capture and for what purposes before building databases and warehouses.

A data model specifies what information to capture, how it relates to each other, and how to store it, establishing data standards for your entire organization. For example, a model for an eCommerce website might specify the customer data you’ll capture. It will define how to label that data and its relation to product information and the sales process.

Like a blueprint for a house, a data model defines what to build and how, before starting construction, when things become much more complicated to change. This approach prevents database design and development errors, capturing unnecessary data, and duplicating data in multiple locations.

In this article, we’ll cover these basics of data modeling:

  • Understanding different types of data models

  • Why data models are necessary for building a data infrastructure

  • Top three data modeling techniques

Understanding different types of data models

Data models get divided into three categories: abstract, conceptual, and physical models. They help align stakeholders around the why, how, and what of your data project. Each type of model serves a different purpose and audience in the data modeling process.

Conceptual data models

Conceptual data models visualize the concepts and rules that govern the business processes you’re modeling without going into technical details. You use this visualization to align business stakeholders, system architects, and developers on the project and business requirements: what information the data system will contain, how elements should relate to each other, and their dependencies.

Typically, a conceptual model shows a high-level view of the system’s content, organization, and relevant business rules. For example, a data model for an eCommerce business will contain vendors, products, customers, and sales. A business rule could be that each vendor needs to supply at least one product.

There’s no standard format for conceptual models. What matters is that it helps both technical and non-technical stakeholders align and agree on the purpose, scope, and design of their data project. All of the below images could be examples of conceptual data models.

Logical data models

A logical data model is based on the conceptual model and defines the project’s data elements and relationships. You’ll see the names of specific entities in the database, as well as their attributes. To stay with the eCommerce example: A logical model shows products are identified through a “product ID,” with properties like a description, category, and unit price.

Data architects and business analysts use the logical data model to plan the implementation of a database management system—software that stores, retrieves, defines, and manages data in a database.

Physical data models

The physical data model gets technical. Database analysts and developers use it for the design of the database. The model specifies the types of data you’ll store along with technical requirements.

An example of data type specifications is whether a piece of data will be an integer—a number without a decimal point—or a float—a number with a decimal place. Technical requirements include details on storage needs, access speed, and data redundancy—storing a piece of data in multiple locations to increase durability and improve query performance.

In practice, only very large projects, say modeling a container shipping business, move from conceptual to logical to physical models. Most other projects skip the conceptual phase and spend most of their time in logical modeling. Some teams even cover elements from the physical phase simultaneously because the people working on the logical model also do the technical implementation.

Why data models are necessary for building a data infrastructure

Data models turn abstract ideas (“we want to track our global container shipments in real time”) into a technical implementation plan (“we will store an attribute called ‘container GPS location’ in a table called ‘Containers’ as an integer”). They help avoid costly demolition and reconstruction of your data infrastructure because you need to think about the data you’ll need, its relations, the database framework, and even whether your project is viable before creating databases and warehouses.

Data models also help with data governance and legal compliance. They allow you to set standards from the start of the project so teams don’t end up with conflicting data formats that need cleaning up before they can use it or, worse, can’t use at all.

Data models and standardization help avoid situations like a sign-up field labeled in nearly a dozen different ways across the organization.

You can also identify sensitive information—social security numbers, passwords, credit card numbers—while you’re modeling so you can involve security and legal experts before you start building.

With safe, accurate, and high-quality data, all teams benefit. Product teams can iterate faster and build immersive user experiences. Analytics teams can create queries without heavy workarounds. And marketing teams can improve advertising efforts by personalizing messaging according to user behaviors and traits.

Customer Data Platforms (CDPs) like Segment can do much of the heavy-lifting during data modeling projects. Segment’s Connections feature makes it easy to capture, organize, and visualize every customer-facing interaction with your business, whether digital or offline. Protocols lets you define your data standards and enforce them at the point of collection. Using real-time data validation and automatic enforcement controls, you can diagnose issues before they pollute your marketing and analytics tools or data warehouse.

Top three data modeling techniques

There are many different techniques to design and structure a database. You should explore these techniques and decide on the most suitable one for your project at the end of the conceptual phase. These data modeling methodologies define how the database gets structured and closely relate to the type of formatting or technology you can use to manage your data project.

For example, many people now default to graph modeling because it’s new and popular, even when a simple relational model would suffice. Understanding the most popular techniques helps you avoid such mistakes.

1. Relational data modeling

In a relational data model, data gets stored in tables, of which specific elements link to information in other tables. Entities can have a one-to-one, one-to-many, or many-to-many relationship.

Relational databases often use SQL (Structured Query Language), a programming language, for accessing and managing data. They’re frequently used in point-of-sale systems, as well as for other types of transaction processing.

The Entity-Relationship Model—sometimes referred to as ER model—is similar to the relational model. It visualizes the relationships between different elements in a system but without going into technical details. You can use the ER model during the conceptual phase to align technical and non-technical stakeholders.

2. Dimensional data modeling

To understand dimensional data models, picture a cube. Each side of the cube represents an aspect of the data you’re trying to capture.

For example, suppose your business sells multiple products to different customer segments, and you want to evaluate sales performance over time. You can visualize this as a data cube, with dimensions for time, products, and customer segments. By traveling up, down, left, and right on the axes of the cube, you can make comparisons across all those dimensions. You’ll see how the sales of each of these products compare to each other and different customer segments at any point in time.

You use the cube model during the conceptual phase. One of the most frequent manifestations of such a cube in the logical stage is the “star schema,” like the one below. At first, it might look like a relational model. Still, the star schema is different because it has a central node that connects to many others.

3. Graph data modeling

During the conceptual phase, most people sketch a data model on a whiteboard. Such a sketch resembles the graph model. It consists of “nodes” and edges—a node represents where the data is stored, the edge the relation between nodes. It’s also the main advantage of this approach: “what you sketch on the whiteboard is what you store in the database.”

Other techniques require you to translate the output from the conceptual phase into a different format for the logical and physical implementation—for example, going from an ER to a relational model or from a cube model to a star schema. Not so with graph models. You can implement them straight away using technology like Neo4j, a native graph database platform.

Data models don't have to be difficult

When you understand the purpose of data models and the process to follow, they’re not challenging to create, especially if you also collect, organize, and standardize your data with Segment. You’ll align all stakeholders before starting technical implementation and avoid costly mistakes or rebuilds. You’ll know what expertise you need on the team to execute your plan and have your data governance defined, too.

Geoffrey Keating on June 1st 2021

Our annual look at how attitudes, preferences, and experiences with personalization have evolved over the past year.

Andy Li on June 1st 2021

Access is always changing. When you start at a new company, you usually are given access to a set of apps provisioned to you on day one, based on your team and role. Even on day one, there can be a difference between the access you are granted and the access you need to do your job. This results in two outcomes: underprovisioned or overprovisioned access. 

For the IT and security teams who manage cloud infrastructure accounts, securing access to them can be difficult and scary; the systems are complex, and the stakes are high. If you grant too much access, you might allow bad actors access to your tools and infrastructure, which at best results in a breach notification; at worst, it results in a company-ending, game-over scenario. If you grant too little access, you put roadblocks between your colleagues and the work they need to do, meaning you are decreasing your company’s productivity.

Overprovisioned access

A common approach taken by startups and small companies is to grant access permissively. In these companies, early productivity can be critical to the success of the business. An employee locked out of a system because of missing access means lost productivity and lost income for the business. 

If you give employees permanent admin access to every system, you optimize for velocity, but at the expense of increased risks from compromised employee accounts and insider threats. This results in an increased attack surface. As your company grows, it becomes more important to secure access to critical resources, and this requires a different approach.

Underprovisioned access

If you give employees too little access, it forces them to request access more often. Although new employees are initially given access based on their team and role, new duties and new projects can quickly increase the scope of the access they need. Depending on your company’s process for providing access, this can be cumbersome for the requester, for the approver, or oftentimes, for both. 

Here at Segment, we have production environments across Amazon Web Services (AWS) and Google Cloud Platform (GCP). We need to secure access to these accounts thoughtfully so that our engineers can continue to build fast and safely. At many companies, you might rely on a centralized team to manage internal access. While this is a simple approach, it does not scale – team members have a limited amount of context surrounding requests, and might accidentally over-provision the requester’s access. At Segment, we approached the problem of managing least-privilege cloud access by building Access Service: a tool that enables time-based, peer-reviewed access.

Setting the stage: access at Segment

At Segment, we have hundreds of roles across dozens of SaaS apps and cloud providers representing different levels of access. In the past, we used to have to log in to each app or system individually to grant a user access. Our IT team managed to “federate” our cloud access and use Okta as our Identity Provider. This gave us a single place to manage which users have access to which roles and applications. The rest of this blog post builds on this federated access system. 

If your organization hasn’t built something similar, the following resources that can help you build and set up your own federated cloud access system.

Blog posts:

Docs:

Mapping Okta apps to AWS roles

By configuring Okta applications to cloud provider roles, engineers are one click away from authenticating to a cloud provider with single sign-on (SSO) with appropriate permissions.

Each Okta app is mapped to a “Cloud Account Role” (or “Cloud Project Role” for GCP). For example, in AWS, we have a Staging account with a Read role that provides read access to specific resources. In Okta, we have a corresponding app named “Staging Read - AWS Role” that allows engineers to authenticate to the AWS Staging Account and assume the Read role.

This requires configuring an Okta app for each “Cloud Account Role” combination, which at the time of writing is 150+ Okta apps.

Configuring GCP with Okta is slightly different, and technical details for how to do this are at the bottom of this blog.

Mapping Okta groups to SaaS app groups

In addition to authentication, Identity Providers can also help with authorization. Users get understandably frustrated when they get access to an application, but don’t have the correct permissions to do their job. 

Identity Providers have agreed upon a common set of REST APIs for managing provisioning, deprovisioning, and group mapping called SCIM (the System for Cross-domain Identity Management).

If an application supports SCIM, you can create groups within your Identity Provider (e.g. Okta), which will map user membership into the application. With this setup, adding users to the Okta group will automatically add them to the corresponding group in the application. Similarly, when a user is unassigned from the application in Okta, their membership in the application group will also be lost. 

SCIM allows us to provide granular, application-level access, all while using our Identity Provider as the source of truth.

With a single place to manage access for all of our cloud providers, the problem should be solved, right? Not quite… 

While the underlying Okta apps and groups system worked great, we quickly ran into more human problems.

Pitfalls of centralized access management

Even with our awesome new Okta+AWS system, we still needed a process for a centralized team to provision access through Okta. At many companies, this team would be IT. At Segment, this was a single person named Boggs. Requests would go into his inbox, and he would manually review the request reason, and decide if there was a more suitable level of access for the task. Finally, he would go to the Okta admin panel and provision the appropriate app to the user. Although this system worked for a time, it was not scalable and had major drawbacks.

Permanent access

Once an app was provisioned to a user, they would have access until they left Segment. Despite having permanent access, they might not need permanent access. Unfortunately, our manual provisioning process did not have a similar scalable way to ensure access was removed after it was no longer needed. People granted access for one-off tasks now had permanent access that hung around long after they actually needed it.

Difficulty scaling due to limited context 

As an engineering manager, Boggs had a strong sense of available IAM roles and their access levels. This allowed him to reduce unnecessary access by identifying opportunities to use less sensitive roles. This context was difficult to replicate and was a big reason why we could not simply expand this responsibility to our larger IT team. 

Most centralized IT teams don’t work closely with all of the apps that they provision, and this makes it difficult for them to evaluate requests. Enforcing the principle of least privilege can require intimate knowledge of access boundaries within a specific app. Without this knowledge, you’ll have a hard time deciding if a requester really needs “admin”, or if they could still do the work with “write” permissions, or even just with “read” access.

Kyle from the Data Engineering team is requesting access to the Radar Admin role to “debug”. What do they actually need access for? Would a Read only role work? And wait… who is Kyle?! Did they start last week? They say that they need this access to do their job and I still need to do mine… APPROVE. 

It was slow

Despite being better equipped than most people to handle access requests, Boggs was a busy engineering manager. Although at first provisioning access was an infrequent task, as the company grew, it began to take up valuable chunks of time and became increasingly difficult to understand the context of each request. 

We considered involving extra team members from our IT team, but this would still take time, as they would need to contact the owners of each system to confirm that access should be granted. Ultimately, having a limited pool of centralized approvers working through a shared queue of requests made response times less than ideal.

Breaking Boggs 

Boggs tried automating parts of the problem away using complex scripted rules based on roles and teams, but there were still situations that broke the system. How would he handle reorgs where teams got renamed, switched, merged, or split? What happens when a user switches teams? What happens when a team had a legitimate business need for short-term access to a tool they didn’t already have? Using that current system, any access Boggs provisioned lasted forever - unless somebody went in and manually audited Okta apps for unused access.

Ultimately, we found ourselves in a situation where we had a lot of over-provisioned users with access to sensitive roles and permissions. To make sure we understood how bad the problem actually was, we measured the access utilization of our privileged roles. We looked at how many privileged roles each employee had access to, and compared them to how many privileged roles had actually been used in the last 30 days.

The results were astonishing: 60% of access was not being used.

Managing long-lived access simply did not scale. We needed to find a way to turn our centralized access management system into a distributed one. 

Access Service

In the real world of access, we shouldn’t see a user's access footprint as static, but instead view it as amorphous and ever-changing.

When we adopted this perspective, it allowed us to build Access Service, an internal app that allows users to get the access they really need, and avoid the failure modes of provisioning too little or too much access.

Access Service allows engineers to request access to a single role for a set amount of time, and have their peers approve the request. The approvers come from a predefined list, which makes the access request process similar to GitHub pull requests with designated approvers

As soon as the request is approved, Access Service provisions the user with the appropriate Okta app or group for the role. A daily cron job checks if a request has expired, and de-provisions the user if it has. 

At a high level this is a simple web app, but let’s look closer at some specific features and what they unlock.

Temporary access

The magic of Access Service is the shift from long-lived access to temporary access. Usually, an engineer only needs access temporarily to accomplish a defined task. 

Once that task is done, they have access they no longer need, which violates the principle of least privilege. Fixing this using the old process would mean manually deprovisioning Okta apps – adding yet another task to a workflow that was already painfully manual.

With Access Service, users specify a duration with their access request. Approvers can refuse to approve the request if they think the duration is unnecessarily long for the task. This duration is also used to automatically deprovision their access once the request expires.

Access Service offers two types of durations: “time-based access” and “activity-based access”. 

Time-based access is a specific time period, such as one day, one week, two weeks, or four weeks. This is ideal for unusual tasks such as: 

  • fixing a bug that requires a role you don’t usually need

  • performing data migrations

  • helping customers troubleshoot on production instances you don’t usually access

Activity-based access is a dynamic duration that extends the access expiration each time you use the app or role you were granted. This is ideal for access that you need for daily job functions – nobody wants to make a handful of new access requests every month. However, we don’t offer this type of access for our more sensitive roles. Broad-access roles, or roles that have access to sensitive data require periodic approvals to maintain access. Activity-based access provides a more practical balance between friction and access, aligning with our goals of enabling our engineers to build quickly and safely

Designated approvers

One of the biggest limitations with our previous process was that one person had to approve everything. In Access Service, each app has a vetted list of approvers who work closely with the system. By delegating decision making to experts, we ensure that access is approved by the people who know who should have it. 

To start out with, you can’t approve your own access requests. (Sorry red team.) Each app has a “system owner” who is responsible for maintaining its list of approvers. When a user creates an access request, they select one or more approvers to review their request. Because the approvers list contains only people who work closely with the system, the approvers have better context and understanding of the system than a central IT team.

This makes it easier for approvers to reject unreasonable or too-permissive access requests, and encourages users to request a lower tier of access (for example, telling them to request a read-only role instead of a read/write role). Since incoming requests are “load balanced” between approvers, users also see a much faster response time to their requests. 

Provisioning access always requires two people, much like a GitHub pull request. Users cannot select themselves as an approver, even if they are a system owner. Access Service also supports an “emergency access” mechanism with different approval requirements. This prevents Access Service from blocking an on-call or site reliability engineer if they need access in the middle of the night. 

With system owners appointed for each app, our distributed pool of approvers continues to scale as we introduce new tools with new access roles and levels. This is what the security community calls “pushing left”

When you “push left”, you introduce security considerations earlier in the development lifecycle, instead of trying to retrofit a system after it is in use. In the software engineering space, “pushing left” resulted in engineers learning more about security. This means that the people most familiar with the systems are the most knowledgeable people to implement security fixes. Since the engineers are the ones who designed and now maintain the software, they have much more context than the central security team. Similarly, Access Service unburdens the central IT team, and empowers system owners to make decisions about who should have access to their systems, and at what level. This significantly reduces the amount of time the IT team spends provisioning access, and frees them up to do more meaningful work.

How it works

Access Service, like many of our internal apps, is accessible to the open internet, but protected behind Okta.

The basic unit of Access Service is a “request”. A user who wants access creates a request that includes four pieces of information (a rough data-model sketch follows the list):

  • the application they want access to

  • the duration they want access for

  • a description for why they need access

  • the approver(s) they want to review the request
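
As an illustration only, the request might be modeled like this; the class and field names are assumptions based on the four pieces of information above.

```python
from dataclasses import dataclass, field
from datetime import timedelta


@dataclass
class AccessRequest:
    """Hypothetical shape of an Access Service request."""
    app: str                       # the application they want access to
    duration: timedelta            # how long they want access for
    reason: str                    # why they need access
    approvers: list[str] = field(default_factory=list)  # who should review it


request = AccessRequest(
    app="aws-production-admin",
    duration=timedelta(days=1),
    reason="Debugging a customer-reported issue in production",
    approvers=["bob@example.com"],
)
```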

When they click “Request Access”, Access Service sends the selected approvers a Slack notification. Segment, like many modern companies, lives much of its day-to-day in Slack, so using this platform makes Access Service a natural, less disruptive part of people’s workflows. Even if the user requesting access is an approver for that particular app, they must receive approval from a different approver: every request involves two people.

The access request is tracked in a web app, so you can see what requests you have open, and what roles you currently have access to.

The requester is notified via Slack when their request has been approved, so they know they can now get back to the task they needed access for in the first place.
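
Notifications like this can be sent through the public Slack Web API; below is a hedged sketch using the slack_sdk Python client. The token, user IDs, and message wording are placeholders, not the service’s actual integration.

```python
from slack_sdk import WebClient

client = WebClient(token="xoxb-...")  # bot token, redacted placeholder


def notify_requester(requester_slack_id: str, app: str, approver: str) -> None:
    """DM the requester that their access request was approved."""
    client.chat_postMessage(
        channel=requester_slack_id,  # a user ID addresses a direct message
        text=f"Your access request for {app} was approved by {approver}.",
    )
```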

The results

After we migrated our access process to Access Service, the result was zero long-lived access to any of our privileged cloud roles in AWS and GCP. All access granted to these roles expires if it is not actively used. 

In the graph below, “Access Points” refers to the number of users with access to each admin role. After moving to Access Service, we reduced the number of people who had privileged access by 90%. 

In the next graph below, “Active” refers to the number of people who used an app within the last 30 days. This number is higher than the number of Access Points, meaning more access was used over the last 30 days than is currently provisioned.

That might seem strange: how could admin apps have been used by more people than currently hold access? It’s because expired access had already been automatically deprovisioned, shrinking the number of Access Points by the end of the 30-day window.

Conclusion

By acknowledging that access needs are constantly changing, we were able to create a more practical way to manage access control.

Access Service allows us to streamline the access approval process. By routing requests directly to designated approvers, we are able to get fast approvals from people with rich context. The time-based component of access requests allows the service to regularly remove unneeded access, preventing our access attack surface from growing too large. Finally, integrating Slack into the system makes approvals faster, ensures that you know immediately when your request has been approved, and reminds you when the request is expiring so you don’t run into unexpected access loss when just trying to do your job.

While it can be daunting to try to reinvent an existing, well-established process, the results can be incredibly rewarding. Start by writing down your goals, thinking about what you don’t like and what is painful about the current state, and reevaluate your core assumptions. Companies are always changing, and your processes have to keep up; the circumstances that led to the previous system may no longer be applicable today. Most importantly, remember to build with the user’s workflow in mind, because security depends on participation of the whole company.

Future development

Policies

Apps in Access Service are currently individually customizable. However, this can lead to issues with scalability if we want to make changes across multiple, similar apps. For example, if we decide that we want to limit access to several AWS accounts to no more than one week, we would currently have to edit the allowed durations for each individual role. With the introduction of policies, we would be able to map several roles to a single policy, allowing us to easily apply the change from the previous example. 
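
As a sketch of what a policy layer could look like, the structures below show many roles mapping to one policy; the names and fields are assumptions meant only to illustrate the idea.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Policy:
    """Hypothetical policy; every role mapped to it inherits its constraints."""
    name: str
    max_duration: timedelta
    allow_activity_based: bool


@dataclass
class Role:
    name: str
    policy: Policy


# Tightening the policy to one week applies to every mapped role at once,
# instead of editing each role's allowed durations individually.
aws_sensitive = Policy("aws-sensitive-accounts", timedelta(weeks=1), allow_activity_based=False)
roles = [
    Role("prod-payments-admin", aws_sensitive),
    Role("prod-analytics-admin", aws_sensitive),
]
```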

Dynamic Roles

Currently, Access Service grants users access to predefined AWS roles. These roles are typically general-purpose, but there may be use cases not fully captured by an existing role. Instead of configuring a new role for one-off needs, or falling back on an overly permissive role, Access Service could allow users to create a dynamic role. When making a request, users would check boxes corresponding to the permissions they want (e.g. “S3 Read”, “CloudWatch Full Access”) to create a custom, dynamic role.
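
One way a dynamic role could be assembled is by composing an IAM policy document from the checked permissions, as in this sketch. The checkbox-to-statement mapping is a simplified assumption, not a proposal for the real permission set.

```python
import json

# Hypothetical mapping from checkbox labels to IAM policy statements.
PERMISSION_STATEMENTS = {
    "S3 Read": {
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": "*",
    },
    "CloudWatch Full Access": {
        "Effect": "Allow",
        "Action": ["cloudwatch:*"],
        "Resource": "*",
    },
}


def build_dynamic_policy(selected: list[str]) -> str:
    """Assemble an IAM policy document from the permissions a user checked."""
    statements = [PERMISSION_STATEMENTS[name] for name in selected]
    return json.dumps({"Version": "2012-10-17", "Statement": statements}, indent=2)


print(build_dynamic_policy(["S3 Read", "CloudWatch Full Access"]))
```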


Special thanks to David Scrobonia for creating Access Service and setting up the foundation for this blog. Thank you to John Boggs, Rob McQueen, Anastassia Bobokalonova, Leif Dreizler, Eric Ellett, Pablo Vidal, Arta Razavi, and Laura Rubin, all of whom either built, designed, inspired, or contributed to Access Service along the way.


Appendix

Configuring GCP roles in Okta

Connecting a GCP role to Okta is harder than connecting an AWS role, and after struggling with it for a while, we thought it was worth sharing our setup. To connect a GCP role to our Okta instance, we had to use Google Groups in GSuite.

First, we created a single GSuite Group for each of our Project-Role pairs. In GCP, a Google Group is a member (principal) that can be assigned a role, and all users added to the group are also assigned that role. 

We then assigned each GCP role to its corresponding Google Group. Next, we needed to connect the Google Groups to Okta. 

You can do this by using Okta Push Groups, which link an Okta “group” to a Google Group. Adding a user to an Okta Push Group automatically adds the correct GSuite user to the Google group. We created an Okta Group for each of the roles and configured it as a Push Group to its corresponding Google Group.

To summarize, the flow looked like this: 

  1. Add Okta User david@segment.com to Okta Group “Staging Read - GCP Role” 

  2. Okta Push Groups adds the GSuite user david@segment.com to the “Staging Read” Google Group

  3. Because he is a member of the “Staging Read” Google Group, david@segment.com is assigned the “Read” IAM role for the “Staging” project.
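
For reference, step 3 works because the Google Group is bound to the role on the GCP project. The sketch below shows one way to create that binding with the Cloud Resource Manager API from Python; the project ID, group address, and role are illustrative, and in practice this binding is typically set up once through the console or infrastructure-as-code.

```python
from googleapiclient.discovery import build  # uses application-default credentials


def grant_group_role(project_id: str, group_email: str, role: str) -> None:
    """Bind a GCP IAM role to a Google Group so its members inherit the role."""
    crm = build("cloudresourcemanager", "v1")
    policy = crm.projects().getIamPolicy(resource=project_id, body={}).execute()
    policy.setdefault("bindings", []).append(
        {"role": role, "members": [f"group:{group_email}"]}
    )
    crm.projects().setIamPolicy(
        resource=project_id, body={"policy": policy}
    ).execute()


# Illustrative values only.
grant_group_role("staging", "staging-read@example.com", "roles/viewer")
```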

A BeyondCorp approach to internal apps

All of our internal apps use an OpenID Connect (OIDC) enabled Application Load Balancer (ALB) to connect to Okta. This gives our internal apps a BeyondCorp-style access model: all of them are publicly routable, but sit behind Okta.

This is also nice from a tooling-developer standpoint: not only is authentication taken care of, but we can use the signed JSON Web Token (JWT) that Okta returns to the server through the ALB to identify the user interacting with Access Service. This lets us use Okta as a coarse authorization layer and manage which users have access to different internal apps.
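
As a sketch of that identity lookup: an ALB with OIDC authentication forwards the user claims to the backend as a signed JWT in the x-amzn-oidc-data header, which the backend can verify against the ALB’s regional public-key endpoint. The region, claim names, and helper below are illustrative assumptions.

```python
import jwt        # PyJWT
import requests


def user_from_alb_header(encoded_jwt: str, region: str = "us-west-2") -> str:
    """Return the authenticated user's identity from the x-amzn-oidc-data header."""
    # The ALB signs the token with ES256 and publishes the public key per key ID.
    kid = jwt.get_unverified_header(encoded_jwt)["kid"]
    public_key = requests.get(
        f"https://public-keys.auth.elb.{region}.amazonaws.com/{kid}", timeout=5
    ).text
    claims = jwt.decode(encoded_jwt, public_key, algorithms=["ES256"])
    # Which claims are present depends on the IdP; email is common with Okta.
    return claims.get("email") or claims["sub"]
```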


Jim Young on February 17th 2021

What is the most important ingredient for a successful customer experience program? 

According to hundreds of business leaders across the globe, the answer is good data. 

With the explosion of digital adoption last year, data quality became a top priority for companies looking to adapt. Unfortunately, many businesses lacked the right technology to manage the increased volume and complexity of customer data.

At the same time, a number of firms found that managing customer data effectively was a challenge that paid off generously. Savvy business leaders turned to customer data platforms as a result, ensuring that all teams across the org could operate with access to clean, reliable data.

Because a customer data platform can empower multiple different teams across the business, we are often asked: “What’s the exact ROI from investing in a CDP?”

To address this question, we recently conducted a survey with Aberdeen, the leading industry research firm, to investigate in precise terms the numerous benefits a CDP can have on your bottom line.

Let’s dive in.

ROI benchmarks to justify your customer data platform investment

One of the simplest ways to illustrate the cumulative ROI of a customer data platform is to measure the performance of businesses that use a CDP against those that do not. Across several KPI categories, businesses that use a CDP are knocking it out of the park.

Notably, 9.1x greater annual growth in customer satisfaction and 2.9x greater YoY revenue growth represent impressive returns on investment for customer-first businesses. 

It should then come as no surprise that, by 2022, close to 90% of enterprise firms will have implemented a CDP across their organizations. 

With customer data platforms in place, these companies are better able to connect and unify first-party data across channels, ensure that data is accurate, and personalize every customer interaction to each individual’s preferences.

Source: Aberdeen, September 2020

It’s important to remember these benchmarks only represent the average impact of a CDP on business outcomes. Some businesses will achieve greater returns, some less. 

That’s why it pays to emulate businesses that have successfully deployed a CDP. To help you maximize the ROI of your CDP, follow these three steps.

How to maximize the ROI from your CDP investment

1) Use data to understand buyer behavior

The ability to seamlessly integrate data from all relevant sources is critical for businesses looking to build a comprehensive understanding of customer behavior. 

Put simply, a CDP helps standardize your data across the organization. This allows your customer-facing teams to better deliver relevant, personalized experiences by segmenting customers based on various criteria such as previous spend, loyalty, or demographics. 

In turn, you can uncover trends and correlations influencing customer behavior that would otherwise stay hidden to businesses without a CDP.

Standardized data can also help your business reduce churn by identifying common elements in the journey of lost customers, and drive revenue by targeting high-profit clients or those with the best product fit. 

SpotHero, a popular parking reservation service, is a great example of a company using Segment to standardize their customer data and drive conversions by unlocking insights into user purchase behavior. 

Source: Aberdeen, September 2020

So you have already established a single view of customer data — now what? 

2) Use data to hyper-personalize customer interactions

The next step is going beyond integrating data across enterprise systems and toward activating customer insights to provide hyper-personalized experiences. 

The research shows that companies using a CDP are better equipped to deliver consistent messages to their customers through multiple channels (89% vs. 82%). In addition to consistency, you can use real-time insights to tailor the content and timing of your interactions to the unique needs of each buyer. 

This capability can substantially improve customer satisfaction, retention, and LTV.  In other words, using consistent data to hyper-personalize your customer interactions will result in a direct impact on your bottom line. 

For companies looking to maximize the ROI of a CDP, personalization is the name of the game. 

Source: Aberdeen, September 2020

3) Use data to continuously improve marketing performance

The last insight we can apply from Aberdeen’s research involves the effect of good data on employee performance.

Consistent, reliable data drives better performance by providing business leaders a frame of reference to evaluate employee activity and determine areas of inefficiency or training needs.  

Although a CDP can empower every team across the organization, the most impactful benefits are often enjoyed by the marketing department. More specifically, the ability to accurately map each step of your customer’s journey or target the most profitable customers can vastly improve the ROI of your marketing campaigns. 

Segment Personas, for instance, is a powerful toolkit for orchestrating the customer journey that allows marketers using Segment’s customer data platform to: 

  • Build custom audiences 

  • Sync those audiences to advertising, email, A/B testing, chat, and other tools in real time

  • Get a single view of the customer across all digital properties and the tools they engage with

Source: Aberdeen, September 2020


The explosion of customer data last year left many businesses scrambling to accelerate their digital transformation roadmaps. Prior to COVID, personalizing the customer experience was an aspirational project for many businesses. Now, it is imperative.

However, the fact of the matter is that nearly 80% of enterprise companies struggle to use data effectively in their customer experience efforts. Poor data quality, fragmented customer insights, and outdated technology each present considerable obstacles for firms that aren’t digitally native.

Fortunately, the rapid expansion of customer data platforms helped many businesses overcome these challenges by improving their ability to harness and activate customer data. 

On this point, Aberdeen’s research is clear — CDP investments are paying dividends. Companies with a CDP are outperforming non-users in annual revenue growth, customer satisfaction, and employee engagement, among other KPIs. 

However, the implementation of new technology is not enough. It’s essential to follow best practices as well. 

Using good data to understand buyer behavior, hyper-personalize customer interactions, and continuously improve marketing performance are three steps you can take today to ensure you are getting the maximum ROI from your CDP investment.
