Identity resolution: the definitive guide
A comprehensive look at how to implement identity resolution for a holistic view of customer behavior.
If you’ve attended a marketing conference in the last decade, you’ve probably heard the phrase “Single view of the customer” thrown around quite a bit.
But despite the hype, the “single view of the customer” is usually an empty rallying cry as opposed to a day-to-day reality.
This is because gathering all the data about your customers and merging it into a single record is still a largely unsolved challenge.
As if expectations weren’t already high enough, companies are also up against constantly evolving data privacy regulations, such as the GDPR and the California Consumer Privacy Act (CCPA). Companies that don’t take steps to future-proof their technology in light of these regulations will get left in the dust.
The crux of the issue is this: to get that elusive single view of the customer, a business needs good identity resolution.
And that hinges on having the right attribution technology to get identity right in the first place.
At Segment, our customer data platform has helped thousands of businesses merge the complete history of each customer into a single profile.
That’s why we’ve put together this short guide, to help you understand the nuts and bolts of identity resolution, and how you can bring it into your organization.
Identity resolution is the process of attributing customer behavior and interactions with your business — across all touchpoints, platforms, or channels — to a single unified customer profile. Identity resolution allows any team across your organization to then access this profile and use it to better serve each individual customer.
This relatively simple explanation belies the complexity of actually unifying customer interactions in the 21st century.
The number of touchpoints customers can have with your business has exploded in the last decade, and a typical customer journey is likely to take place across various devices. Think about it: How many devices have you used in the past hour? Your work laptop, your phone, maybe a smart speaker, and possibly more.
And you’re not alone.
The average household has twenty-one connected devices in thirteen device type categories, and 96% of the world’s digital consumers use a mobile device to connect to the internet.
While the number of touchpoints has risen, so too have customer expectations, and traditional tools like CRM platforms lack the functionality to track cross-device activity like this.
In this year's State of Customer Engagement Report, 81% of brands surveyed said they had a deep understanding of their customers. But less than half of customers agreed.
With customer identity resolution, you can gather data across all of those devices, tie it to one unique user, and then use that data to delight your customers with messaging tailored to them.
This level of insight opens up a whole host of opportunities. Here are three of the most common use cases we see for identity resolution.
Anonymous visitors account for upwards of 98% of all website visitors.
Maybe they’re visitors who haven’t converted yet. Or maybe they have converted but are logged out on arrival.
The path of least resistance is to simply ignore these data points and resign yourself to focus on data from existing customers.
But in doing so, you’re leaving a lot of valuable insights on the table. With identity resolution, you can reconcile anonymous visitor data with your known visitor data and get deeper insights into customer behavior than ever before to boost sales and retention.
Let’s look at an example.
Let’s say there’s a hypothetical shoe brand called SegKicks that has the hottest sneakers on the market. Jane Doe downloads the app on her iPhone but doesn’t bother registering for an account yet.
She then clicks on a few different types of shoes (ShoeA, ShoeB, and ShoeC) but doesn’t add them to the cart. Because Jane hasn’t registered for an account yet, all of these events will be sent through with an anonymousID and an iOS deviceID.
Jane then decides to add ShoeD to her cart. At checkout, she creates a new account with her email and purchases the shoe. When she creates the account she is assigned a userID and the events of her purchase are sent through with an email.
Here’s where the real power of a customer data platform comes in.
Instead of having two different user profiles — “logged out Jane” and “logged-in Jane” — there is only one. This makes perfect sense when you think about it. Jane is the same person, irrespective of whether she’s logged in or logged out.
By linking the original anonymous events and her logged-in activity into one profile, we can start to get a much clearer picture of Jane’s purchase experience. And now that we understand her behavior and her preferences, we can focus on optimizing that experience next time around.
Best of all, she’ll have a single user profile that can grow with her as she continues to interact with the business.
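To make the Jane example concrete, here is a minimal sketch of how anonymous events could be folded into a known profile once she signs up. It is a toy in-memory model, not how Segment implements it: the `ProfileStore` class, its method names, and the IDs `anon-123` and `user-jane` are all made up for illustration.

```python
from collections import defaultdict


class ProfileStore:
    """Toy in-memory store that stitches anonymous events to a known user."""

    def __init__(self):
        self.alias = {}                  # anonymous_id -> user_id
        self.events = defaultdict(list)  # profile key -> list of event names

    def _key(self, anonymous_id, user_id=None):
        # Prefer the known user_id; otherwise use any earlier link, and
        # fall back to the anonymous ID itself.
        return user_id or self.alias.get(anonymous_id, anonymous_id)

    def identify(self, anonymous_id, user_id):
        # Link the anonymous ID to the user and move earlier events over.
        self.alias[anonymous_id] = user_id
        if anonymous_id in self.events:
            earlier = self.events.pop(anonymous_id)
            self.events[user_id] = earlier + self.events[user_id]

    def track(self, anonymous_id, event, user_id=None):
        if user_id:
            self.identify(anonymous_id, user_id)
        self.events[self._key(anonymous_id, user_id)].append(event)


store = ProfileStore()
for shoe in ("ShoeA", "ShoeB", "ShoeC"):
    store.track("anon-123", f"Viewed {shoe}")        # logged-out browsing
store.track("anon-123", "Added ShoeD to cart")
store.identify("anon-123", "user-jane")              # account created at checkout
store.track("anon-123", "Purchased ShoeD", user_id="user-jane")
```

After the last call, the store holds a single profile keyed by `user-jane` that contains all five events in order, rather than one "logged out" and one "logged-in" record.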
Of course, it’s never that simple.
As we saw earlier, customers aren’t just using one touchpoint. They’re interacting with your business via mobile apps, mobile browsers, desktop, and more. So how do you ensure customers have a consistent experience no matter the device?
Let’s continue with the example of Jane Doe. Jane logs into the same mobile app, SegKicks, with the same email address, but this time she uses her partner’s Android phone.
Should we create multiple profiles for Jane because she used an Android phone instead of iOS? Of course not.
Jane is the same person, irrespective of what device she’s using.
Thanks to identity resolution, Jane has the same user profile, with an android.id attached.
Here’s a shocking statistic for you: poor data quality costs organizations an average of around $12.9 million a year in lost opportunities and bad business decisions.
That’s because most marketers are reduced to working with a partial view of their customers, based on breadcrumbs collected from disconnected interactions. They don’t know that Device ID 6954 is already a customer, or that Desktop User X is actually the same person as Mobile User Y.
The result? Untargeted, poorly performing advertising that eats up your valuable marketing budget.
Identity resolution will automatically merge these disconnected profiles, ensuring you won’t have duplicates or other errors.
Let’s return to the example of Jane Doe one last time. SegKicks also has a running app SegRuns. Jane Doe downloads the Android app SegRuns and views a workout.
Her user profile is updated with a new anonymous_id from the SegRuns app.
Having this single view of the customer experience can help SegKicks make their marketing campaigns a lot more effective.
They’ll know not to advertise SegRuns to Jane on any device — she’s already a customer.
They’ll know to send targeted ads for running shoes to Jane on any device — she’s indicated that she’s a fan.
Thanks to identity resolution, SegKicks can suppress ads for specific customer segments and double down on advertising to those with a propensity to buy.
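Once every profile is unified, suppression and targeting come down to simple set logic. The sketch below assumes a hypothetical profile shape with `apps` and `interests` fields. Only Jane comes from the example above; `user-sam` and `user-ali` are invented to show the contrast.

```python
# Hypothetical unified profiles, one per person regardless of device.
profiles = {
    "user-jane": {"apps": {"SegKicks", "SegRuns"}, "interests": {"running"}},
    "user-sam": {"apps": {"SegKicks"}, "interests": {"running"}},
    "user-ali": {"apps": {"SegKicks"}, "interests": set()},
}


def install_campaign_audience(profiles, app):
    """Everyone who does NOT already have the app; existing users are suppressed."""
    return {uid for uid, p in profiles.items() if app not in p["apps"]}


def interest_campaign_audience(profiles, interest):
    """Everyone who has shown the given interest, on any device."""
    return {uid for uid, p in profiles.items() if interest in p["interests"]}


segruns_ads = install_campaign_audience(profiles, "SegRuns")
running_shoe_ads = interest_campaign_audience(profiles, "running")
```

Here Jane drops out of the SegRuns install campaign because she already has the app, but she stays in the running shoe audience.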
We work with a lot of companies to establish successful identity resolution systems, and we’ve identified four things that have the biggest impact on successful implementation.
The number of vendors offering identity resolution is growing fast, and all have their own unique ways of managing identity.
But generally speaking, there are two approaches for accomplishing identity resolution: deterministic and probabilistic.
Deterministic is where you resolve identities based on what you know to be true. It merges new data into customer records by searching for matches among the phone numbers, emails, device IDs, and user IDs you already have. Deterministic identity resolution is a high-confidence approach using first-party data where you know with certainty that this user did that.
Probabilistic is where you resolve identities based on what you predict to be true. It uses predictive algorithms to understand who your customers likely are. Probabilistic identity resolution is a statistical model with a given confidence interval, where you know with a certain amount of confidence that this user will do that.
At Segment, we believe that deterministic is the best approach for identity resolution because it’s based on first-party data your customers actually produce.
For this reason, our Engage feature is 100% deterministic, based on first-party data.
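The difference between the two approaches is easier to see side by side. The following is a rough sketch under stated assumptions, not any vendor's algorithm: the identifier keys, the weak signals (`ip`, `user_agent`, `city`), the point weights, and the threshold of 60 are all invented for illustration.

```python
# Identifiers treated as exact, first-party proof of identity.
STRONG_KEYS = ("user_id", "email", "phone", "device_id")
# Made-up weights (points out of 100) for weaker, inferred signals.
WEAK_WEIGHTS = {"ip": 40, "user_agent": 30, "city": 30}


def deterministic_match(event, profiles):
    """Return the profile that shares an exact identifier, else None."""
    for profile_id, known in profiles.items():
        for key in STRONG_KEYS:
            value = event.get(key)
            if value is not None and value == known.get(key):
                return profile_id
    return None


def probabilistic_match(event, profiles, threshold=60):
    """Return (profile_id, score); profile_id is None below the threshold."""
    best_id, best_score = None, 0
    for profile_id, known in profiles.items():
        score = sum(
            weight
            for key, weight in WEAK_WEIGHTS.items()
            if event.get(key) is not None and event.get(key) == known.get(key)
        )
        if score > best_score:
            best_id, best_score = profile_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score


profiles = {
    "user-jane": {
        "email": "jane@example.com",
        "device_id": "ios-abc",
        "ip": "203.0.113.7",
        "user_agent": "SegKicks/1.0 iOS",
        "city": "Austin",
    }
}

exact = deterministic_match({"email": "jane@example.com"}, profiles)
guess = probabilistic_match(
    {"ip": "203.0.113.7", "user_agent": "SegKicks/1.0 iOS"}, profiles
)
```

The deterministic call either finds a shared identifier or returns nothing. The probabilistic call always carries a score, and someone has to decide how much confidence is enough.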
Identity resolution is not just a data quantity issue. It’s a data quality issue.
Before you jump headfirst into an identity resolution program, it’s imperative you gather all your customer data into one place with a customer data platform (CDP). This central source of truth for all customer information ensures accuracy, allows for good data governance, and improves the efficiency of cross-departmental collaboration.
Depending on your CDP and its capabilities, this could mean anything from standardizing users’ names and addresses to assigning events to web sessions.
Remember — a 360-degree view of the customer is only as good as the quality of each individual degree.
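As a rough illustration of the two cleanup tasks mentioned above, the sketch below standardizes a name and email, and groups event timestamps into sessions. The record fields and the 30-minute inactivity gap are assumptions for this example, not rules taken from any particular CDP.

```python
from datetime import datetime, timedelta


def clean_record(record):
    """Standardize a raw record so the same person matches across sources."""
    return {
        # Collapse stray whitespace and normalize casing. str.title() is
        # naive; real names with apostrophes need gentler handling.
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
    }


def assign_sessions(timestamps, gap=timedelta(minutes=30)):
    """Number each timestamp's session; a new one starts after `gap` of inactivity."""
    sessions, current, previous = [], 0, None
    for ts in sorted(timestamps):
        if previous is not None and ts - previous > gap:
            current += 1
        sessions.append(current)
        previous = ts
    return sessions


cleaned = clean_record({"name": "  jane   DOE ", "email": " Jane.Doe@Example.com "})
visits = [
    datetime(2024, 1, 1, 10, 0),
    datetime(2024, 1, 1, 10, 10),
    datetime(2024, 1, 1, 11, 0),
    datetime(2024, 1, 1, 11, 5),
]
session_numbers = assign_sessions(visits)
```

With these inputs, the first two visits fall into one session and the last two into a second, because the 50-minute pause exceeds the assumed gap.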
Your goal with an identity resolution system is to create a robust and accurate profile of each individual customer. This means you need to collect a lot of data from diverse sources, especially when it comes to identifiers.
An “identifier” is the data you use to identify customers. The most common identifiers are email addresses, login data, and IP addresses. They’re a good place to start creating profiles, but they don’t provide nearly enough useful information.
To fill in the gaps, you need to utilize externalIDs, which are identifiers pulled into your CDP from an external data source. Some examples that Segment uses include user_id, Android and iOS IDs, Google Analytics, anonymous_id, and group_id.
But you can get even more specific by using custom externalIDs. These usually include identifiers unique to your business and your goals, like office phone numbers, device IDs, or offline transaction data. From there, you can tailor your identifiers to the specifics of your business.
We got pretty deep there for a second, but the point is that you have a vast array of options when it comes to identifiers, and the more you utilize those options, the better your customer profiles will be.
Don’t expect to get it completely correct right away; just keep in mind that you should gather as much data as you can and customize how that data is handled.
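One way to picture a profile built from identifiers is as a set of typed values. The sketch below is a simplified model and not Segment's actual schema: the `ExternalId` and `Profile` classes, the `ios.id` type name, the custom `office_phone` type, and every value shown are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExternalId:
    type: str   # e.g. "user_id", "email", "android.id", or a custom type
    value: str


@dataclass
class Profile:
    external_ids: set = field(default_factory=set)

    def add(self, id_type, value):
        # A set means the same identifier is only ever stored once.
        self.external_ids.add(ExternalId(id_type, value))

    def values_of(self, id_type):
        return sorted(e.value for e in self.external_ids if e.type == id_type)


jane = Profile()
jane.add("user_id", "user-jane")
jane.add("email", "jane@example.com")
jane.add("ios.id", "ios-abc")
jane.add("android.id", "android-xyz")
jane.add("anonymous_id", "anon-segkicks-ios")
jane.add("anonymous_id", "anon-segruns-android")
jane.add("office_phone", "+1-555-0100")            # custom, business-specific type
jane.add("email", "jane@example.com")              # duplicate, ignored
```

Keeping the type alongside the value is what lets a single profile hold several anonymous IDs, one per app or device, without confusing them with each other.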
Segment maximizes the value of real-time data collected through various third-party tools and channels. It currently supports hundreds of integrations, including those for e-commerce, business analytics, social media, and A/B testing.
While all of this data can be powerful, it’s made even more so when integrated with any first-party data you store in warehouses. This interoperability of systems helps you mesh together brand-new data with historical data to create a more authentic view of your customer.
If you have data on a customer’s browsing activity, product purchase history, and interactions with customer service, you could view these in relation to one another on the Segment Identity Graph. You can also go further by linking customer profiles in Segment with the data you have stored in data warehouses and data lakes.
As new data events happen (such as purchases or email clicks), data gets updated in real time and sent to the warehouse or lake. Organizations can then enrich their onsite customer data immediately and capitalize on last-minute sales opportunities, performance insights, or sales increases. By tailoring marketing outreach to use both the real-time Segment captured data and the historical data, organizations get the best of both worlds and a more powerful view of the customer.
Segment also integrates with data clean rooms, which are designed as a sort of playground for marketers and business leaders to experiment with customer data without actually having access to any personally identifiable information (PII). These clean rooms allow you to access the powerful data tools in Segment’s pipeline without risking a violation of privacy or consent laws when enriching with second-party data.
If we look at how customer data has evolved over the past decade, first, it was about collecting the right data. Then, as martech grew exponentially, it was about taking action on that data.
While both of these are and will continue to be critical functions of a CDP, neither matters if you can’t identify individuals across all of your systems and channels.
If marketers truly want to get that elusive single view of the customer, they must look inward and get to grips with the fundamentals of identity resolution.
It may seem counterintuitive, but getting the basics right may be the next big trend in customer data.
For a more detailed look into how Segment handles identity resolution, watch our recent webinar on the topic: Demystifying the single view of the customer.
An identity graph is a database that stores all identifiers associated with a particular customer. It helps organizations to identify and understand their customers better.
Identity stitching is the process of merging customer identifiers into a single view in order to have a complete view of your customers.
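The two definitions above can be sketched together: treat each identifier as a node, link identifiers that are seen together, and read off a customer as one connected group. This toy version uses a union-find structure, which is one common way to do it and not necessarily what any given vendor uses. The `IdentityGraph` class and all identifier values are made up for illustration.

```python
class IdentityGraph:
    """Toy identity graph: identifiers are nodes, links merge them into groups."""

    def __init__(self):
        self.parent = {}

    def _find(self, node):
        self.parent.setdefault(node, node)
        while self.parent[node] != node:
            # Path halving keeps lookups fast as groups grow.
            self.parent[node] = self.parent[self.parent[node]]
            node = self.parent[node]
        return node

    def link(self, a, b):
        """Stitch two identifiers (and everything already linked to them)."""
        self.parent[self._find(a)] = self._find(b)

    def identifiers_for(self, node):
        root = self._find(node)
        return {n for n in list(self.parent) if self._find(n) == root}


graph = IdentityGraph()
graph.link(("anonymous_id", "anon-ios"), ("user_id", "user-jane"))
graph.link(("user_id", "user-jane"), ("email", "jane@example.com"))
graph.link(("user_id", "user-jane"), ("android.id", "android-xyz"))
graph.link(("android.id", "android-xyz"), ("anonymous_id", "anon-segruns"))
graph.link(("user_id", "user-sam"), ("email", "sam@example.com"))

jane_ids = graph.identifiers_for(("anonymous_id", "anon-segruns"))
```

Starting from the anonymous ID created in the second app, the lookup returns all five of Jane's identifiers and none of the unrelated user's.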
An identity resolution solution is software that helps organizations combine multiple identifiers of their customers into a single identity. In other words, an identity resolution platform merges different customer data sets into one view. This allows organizations to ensure that they have a complete view of their customers, have a better understanding of the customer's needs and wants, and create more personalized customer experiences.