This is Segment’s Data Council series, where members share stories about using Segment to work with customer data within Enterprise companies. To make sure you don’t miss an episode, subscribe on iTunes, Spotify, or your favorite podcast player. You can also read a lightly edited transcript of the conversation below.
Automation is only as good as the data that powers it. If that data isn’t clean and optimized, you can easily end up embarrassing your brand—by flubbing personalization, for example—or worse, hemorrhaging money you didn’t need to lose. (Harvard Business Review reports that bad data costs the U.S. more than $3 trillion yearly.)
Simply put, data governance leads to better automation. From taming horses to arranging the ideal fruit bowl, Arjun Grama, Instrumentation & Architecture Specialist at Anheuser-Busch InBev, uses vivid stories to showcase what that governance should look like. His experiences accelerating organizational data maturity include kicking off customer data implementations, creating repeatable and trusted processes, and knowing when to raise the bar on growth KPIs.
Arjun honed his customer-data wrangling techniques as a Growth Product Manager at IBM, where he crafted Segment Tracking Plans to transform product lines. Tune in to this episode to hear how to succeed with automation by raising data literacy and focusing your curiosity.
Just because you can automate something doesn't mean you should. Everything automated should always have a reason behind it.
Everything that you're tracking should be tied to a key performance indicator (KPI), a business metric, or a specific use case.
Use governance on your architecture to be both proactive and reactive.
“You could have the greatest data in the world, and it could follow a very clean and specific taxonomy, but if nobody understands it and there's no record of it anywhere, it's meaningless and worthless.”
“Automation can be great once you have a more mature process and you have a better idea of what it is you're trying to achieve. But automating for the sake of automating can also be more trouble than it's worth because writing something into stone when it's a first draft is just a bad idea.”
Guide: What is a Tracking Plan?
Analytics Academy: Making the Most of Your Customer Data
Read the transcript:
Madelyn Mullen: I'm Madelyn Mullen, part of the Segment Product Marketing team. I'm joined by Arjun Grama, Data Instrumentation and Architecture Specialist at Anheuser-Busch InBev. Prior to AB InBev, Arjun was a Growth Product Manager at IBM. Welcome!
Arjun Grama: Thank you! It’s great to be here.
Madelyn: Arjun, to kick us off, would you share a little bit about the business challenges you've worked on?
Arjun: Absolutely. The challenges are not foreign to anybody in this space—the biggest one being a bunch of very separated systems that weren't speaking to each other, with the additional issue of data not really being seen as something to focus on, but mostly a byproduct or something you might be able to get if all the cards land in your favor.
The biggest challenge that we had was, one, just figuring out what systems were out there and then culling that list down to what systems mattered. There were plenty of systems with duplicated data; some held old legacy information that's great for diving into the bookkeeping but isn't relevant to the business today; others just had bad data. Figuring out what was safe to even be paying attention to was one of the first things we worked on. And then, looking at how to tie it all together. IBM had so many acquisitions and so many companies spring up within it that it really wasn't like dealing with one company. It was like dealing with 50 or 100 different small companies. That was definitely an interesting challenge to tackle.
The other half was data literacy. A large part of the reason we had so many different systems was because as a need came up, they would build something to address it. There wasn't necessarily a focus on the larger picture of how this fits into the IBM ecosystem, or more importantly, how this fits into the customer's journey. The other piece of that (in terms of data literacy) was also positioning it in a way that allowed folks within IBM to see it not just as data or something that's just conceptual, but as a tangible piece of the customer journey they could actually look at and understand.
Take the Segment Analytics Academy lesson 'Making the most of your customer data.'
Madelyn: Those are a lot of different actions you are responsible for with the data.
Arjun: We had a great team. I absolutely did not do any of it single-handedly. It was a journey—a journey they're still on. But in the last two years they made significant progress, and there's good momentum to carry them forward.
What good is bad data?
Madelyn: What would make data "good" or "bad" as you're trying to raise data literacy across the organization?
Arjun: It depends on who your audience is. Good data for a product manager who's trying to understand some basic KPIs around the products and maybe track the user journey is going to be different than good data for a product analyst who's really getting into the weeds and dealing with the nitty-gritty.
That said, I have some basic data principles I like to adhere to: "Less is more," meaning if you have to figure out what data to go look at, and then figure out how to understand it—just to answer a single question—that's going to be a huge problem. If you have to do that, chances are, so does everybody else. You need to make it as clear as possible from the get-go what a data point represents. That means, consistent terminology and taxonomy, trying to be specific as opposed to generic when we name our events in our properties, and then tracking all of it. You could have the greatest data in the world, and it could follow a very clean and specific taxonomy, but if nobody understands it and there's no record of it anywhere, it's meaningless and worthless.
And then for bad data, it comes down to not having a clear use. When data is being collected, there should be a specific use case or a specific context in mind. Tracking for the sake of tracking just leads to more headache than it's worth.
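(The "consistent terminology and taxonomy" principle Arjun describes can be sketched as a simple naming check that runs before an event ever leaves the product. The "Object Action" title-case convention and the example event names below are illustrative, not Arjun's actual standard.)

```javascript
// Hypothetical naming convention: title-case "Object Action" event names,
// e.g. "Order Completed" — specific rather than generic, and easy to scan.
const EVENT_NAME = /^[A-Z][a-z]+( [A-Z][a-z]+)+$/;

function isValidEventName(name) {
  return EVENT_NAME.test(name);
}

console.log(isValidEventName('Order Completed')); // true: specific and on-taxonomy
console.log(isValidEventName('clicked_stuff'));   // false: generic, off-taxonomy
```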
Madelyn: When you have these principles to help you identify good and bad data, how do they shape your automation journey?
Arjun: It really helps with automation if you have clean data and actually vice versa. Starting off, it's a very manual process. You're ducking, dodging, and weaving to figure out: Where is a certain data point coming from? What's it supposed to mean? Who's sending it? Why are they sending it? Once we get to a certain stasis or level setting, and we're comfortable—maybe not with the state of the architecture, but we at least know where it is and what's happening with it—it becomes easier to automate.
Automation helps with standards. When things are automated and they break, you know it's because you've got something that was unexpected. It helps you manage expectations, and of course, it takes a very manual process and streamlines it. It can help twofold really: one for the team that's working on implementation, because all of a sudden they're hearing less noise, and they're not just throwing events into the ether. They have a very clear track of what to do. Then on the flip side, as a data consumer, you know exactly what to expect, and you know exactly what you're looking for. Automation can be great once you have a more mature process and a better idea of what it is you're trying to achieve. But automating for the sake of automating can also be more trouble than it's worth, because writing something into stone when it's a first draft is just a bad idea.
Automation with Segment
Madelyn: What happens when Segment becomes part of your automation toolset?
Arjun: Automation gets easier, and all of a sudden you're not writing your first draft in stone. You're writing it in clay, so there's a little bit of flexibility. In general, because of the flexibility of Segment, you can track once and leverage however many times you want. It becomes a little bit easier to try out different processes, because you can pivot them really quickly. And when you find out that automation does or doesn't work, you can adjust accordingly.
With Protocols, it's great. You can put forward tracking plans for each team. You can slice them down as small as you need to or as big as you need to, which again, just makes it easier so that everybody knows exactly which sandbox they're playing in. The other piece of that is also, again, you track once and then you can send it wherever you want.
Madelyn: How would you explain Protocols?
Arjun: If Segment is a highway, you can get to where you need to go. But in a lot of cases you still need to have rules, you need to have maps, and you need to have stoplights. Protocols gives you the infrastructure to actually govern that highway through stop lights, speed limits, HOV lanes, and so on. If an event is a car, you can say: Red cars go here, blue cars go here. If you're a red car but today's a Tuesday, go here. You can get really good granularity and control over your data. So it goes back to the automation, where you can deal with the expected, and you can also get immediately notified about anything unexpected. Because unexpected isn't necessarily bad, but it can be. So it's good to know about it the minute it happens.
Madelyn: How do you see Segment informing your work for this coming year?
Arjun: Segment is my work for the coming year. We're rolling out to several countries within the next year. And that means there are several new instances of Segment and architectures that need to be set up. Segment is my day-to-day, but it's also my foot in the door to learn about anything I want within the organization. So how does it inform my next year? It lets me pick what I want to learn about.
How data maturity evolves
Madelyn: Arjun, how does your team within AB InBev operate, and how big is it?
Arjun: The data team is a whopping four people right now. But Bee's (the larger organization within ABI that we're part of) is the new digital pillar for ABI. We're taking a lot of those analog or manual efforts and bringing them into the 21st century for the convenience of both suppliers and customers. We support the entirety of the Bee's organization, which is 200 folks. But we work most closely with the Product team and then in turn with the Engineering team, whenever they have questions.
Madelyn: When you're thinking about your journey with automation and Segment, how have things been the same when you've moved over to AB InBev?
Arjun: AB InBev has folks who have worked with Segment in previous roles. I'm not the only one. There was already a higher level of data maturity within the org, but some things don't change—like having to go into your product, click a bunch of things, and see what fires in Segment. That's just something you always have to do. You have to level-set and understand how things are being tracked.
There's always the potential trap you fall into, where you have a very specific solution that works really well for you, but it might not necessarily scale up or fit into the larger picture. That was less of an issue here since it is a single product as opposed to the numerous products we were working on at IBM. But some of those same actions are being tracked in a different way, because they're happening in a different part of the application. Even when you know data—you're paying attention to it, and you're focused on it—it's not necessarily easy, and it's still going to be a little bit messy. Expecting that difficulty will reduce the headache.
Madelyn: When you say data maturity was higher within your new role, what makes up data maturity?
Arjun: First of all, it’s the prioritization of collecting and acting on data. In a lot of cases, data is looked at as this magic black box where it's like, "Oh, we'll get data, and then we'll have an answer, and we'll make $1 billion!" And after a month of tracking things, it's like, "Okay, where's my billion dollars?" It's not just “track data, get money.” That'd be great if it was.
The bigger thing is understanding why you're tracking and how you're planning to leverage it. Anytime you go to developers and say, "Hey, implement Segment for this feature," there should be a very specific performance question you're trying to answer or a messaging component that you need to fulfill with this trigger.
The big thing here was that with all the data being collected—while I may not have understood it right off the bat—its use was pretty clear. I could look at anything, and even if I didn't clearly understand the definition, based on the context, I could see why it would be something we would want to track.
The other piece is that it's not data as an afterthought. Data is synced with Product, and each drives the other. When Product releases something, they're going to have tracking for it, so they can see how it performs. And when they want to do an upgrade or they want to roll something new out, they're generally going to have data to back it. That was a nice shift to see data at the forefront with the rest of Product and Engineering.
Madelyn: Switching back to IBM, when you first arrived, what was that data stack like, and how did it change?
Arjun: When I first got to IBM, Segment was at the core of the data stack. That was the first thing they had put in. And then shortly thereafter, they had implemented Amplitude for visualization and Intercom for customer engagement.
It was a good stack, and it functioned pretty well. There were only maybe 15 (if that) IBM products on there. It wasn't at the hundreds that it was by the time I left, so it was a little bit easier to manage. There was a lot of learning. We didn't have tracking plans. We didn't really have anything that could help us govern the data that was being sent, so there was a lot of checking in on Amplitude and looking at schema tabs just to be like, "Hey, you see anything funky?" And if we didn't see anything funky or didn't hear anything from somebody downstream, we were doing okay.
Data was a means to an end there. It was more like these products needed to be able to say, "Hey, I have data." But really, what they were saying is, "I can now message my users through Intercom." That's great, but I don't think there were a lot of analytics done around the data that was being collected or even around the messaging that was going out.
That being said, as we saw more folks start to see the value of the stack, we were able to say: "Okay, rather than just giving you carte blanche to do whatever you want, we're now going to have some standards and some things you have to do because we have needs on the data side. You're going to get all these great benefits but only if you meet some of these criteria." We went from having to chase teams down to having some teams come to us, because they saw the value of the stack. That was nice. The team's doing really well there. They've continued to add to the stack. I know they just brought on WalkMe and won an award there, so the IBM growth team is thriving.
Madelyn: Why should companies act now on data maturity versus waiting until later?
Arjun: Well, I guess the argument is, “We got this far without it, why do we need it now?” To that, all I can say is that if you're okay with getting a B when you could be getting an A, more power to you. But for those who are looking into data and do want to have a more mature data model but aren't sure about timing, the longer you wait, the harder it gets.
You might not realize it, but you're continuing to churn out data whether or not you're governing it or structuring it in a specific way. If you wait six months, that's six months of systems you might have built, integrated, and changed that you're now going to have to deal with. The quicker you start getting a handle on your data and what's actually flowing in your systems, the quicker you can start to clean it up. Alternatively, if it's not worth cleaning up, you can start from scratch. Because whether or not you're cleaning up or just diving in to figure out whether you are going to do the cleanup, the longer you wait, the more there is to work through. It just piles up. It's like procrastinating anything else: The longer you wait, the worse it's going to be.
Mindset of a data wrangler
Madelyn: Arjun, you've referred to yourself as a "data wrangler.” How did you become one?
Arjun: I like asking questions. We're still in the early days of data. I know, clearly, it's been around for decades, eons, but data as we know it is relatively new. While there are SMEs and geniuses and folks who really know their stuff in and out, nobody knows everything, because everything is continuing to grow. The way that we employ it and the way it impacts our day-to-day lives—it's just continuing to evolve.
If this is the Wild West, the data wrangler is the guy going out there and getting a few of those horses. You're not going to tame the entire Wild West, but you could go break a few stallions and have a good stable. What I do is I try to figure out what those things are and go try to wrangle them. If there are a hundred horses thundering down the plains, what are those 10 horses we really want to pay attention to?
That goes back to understanding the systems: what information they have, what's valid versus not, and then what also is actually realistic to try to get. Data wrangling is as much about understanding systems and understanding the data points as it is about understanding people and the relationships they have with each of these tools. Because you're not going to just go to a system and ask it for something. You're going to go to the person who works with the system and ask them.
Madelyn: As a cowboy out on the range, do you have any other cowgirls or cowboys with you?
Arjun: Oh, heck yeah! I'm the only cowboy on the team. The cowgirls are:
Krystal—she's our Head of Data and Global Insights. She's phenomenal and works really heavily with reporting but also helps direct the product teams and is the go-to person for data questions.
You have Christine, who's a Product Analyst, and she knows the product in and out. Anytime I have questions around a piece of metadata or historical knowledge, she's my go-to.
And then you have Kerry, who is heading up data globally. That means customer engagement, messaging, collection, data reporting—all that good stuff.
I have an awesome team, and I'm realizing that even at IBM, I was one of the few guys on my team. Even though the industry is very male-skewed, in my experience, I've had a good amount of diversity.
Madelyn: How do you ask for help if you need it? Who supports all of you?
Arjun: If one of us needs help, the first instinct is to go to one of the others, because they're most closely tied to the work you're doing. They have the most context and are obviously just the most willing to help you. Outside of that, Bee's as a whole is very accessible. I'm able to Slack somebody I've never met before and send them two sentences, a little bit of context, and they're like: "Yeah, throw something on my calendar. Happy to help. Let me know whenever you're free." Honestly, there's not really a set person who we go to, because everybody's so focused on the same mission that if you find a blocker, you just identify who can help you with that blocker and go to them. If you can't figure out who can help you, then you go to your manager, because they might have an idea of who to go to. But it's not so much escalating or having to get help from above. It's just finding the person who knows it, and they'll help.
Madelyn: As your immediate team grows, or as the Bee's overall digital team grows, what do you look for when hiring?
Arjun: From a technical standpoint, we're looking for folks who have some level of familiarity with a data architecture if they're looking for an analyst role. If they're in development, we're looking for folks who have experience with mobile development for both iOS and Android.
But then the other really big piece—and this is not Bee's-specific, this is just ABI—is the focus on people. That means making sure they are genuinely the right fit. From my experience here, the focus is more on finding the right person and then knowing that the role will follow, rather than vice versa.
Building and automating tracking plans
Madelyn: Thinking about what you're doing right now at AB InBev, how does one get started building out a tracking plan?
Arjun: Usually, I'll have the product owner walk through the feature like a user would: What are they clicking on? What are they typing in? What are they selecting? And I'll just write down that list of actions.
Then I'll try and boil it down to something a little less verbose but still clear. Maybe if the action was "user clicks on red button with a green arrow" we would change that to "arrow button clicked" or "red button clicked," depending on what the defining factor is. That way, if there's a red button elsewhere on the product, we can continue to use that event.
We’ll also say, “Okay, when you're tracking each of these things, what are you hoping to gain from it?” Because in certain cases, the same thing can be tracked in a variety of ways, and you don't want to have a bunch of data just for the sake of it. You can have events with just two properties underneath them, and they’ve served their purpose. That’s great. In certain cases, you have events that have 35 properties underneath them, but each of those properties is needed by a different member of the team for their analysis, and that's great as well. A big piece of it is walking through the product and understanding the context and how they expect the user to use it.
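(The red-button example above might look something like the sketch below. The stubbed analytics object stands in for Segment's analytics.js so the snippet runs anywhere, and the event and property names are hypothetical.)

```javascript
// Minimal stub standing in for Segment's analytics.js track call.
const calls = [];
const analytics = { track: (event, properties) => calls.push({ event, properties }) };

// One reusable "Button Clicked" event, distinguished by properties,
// instead of minting "Red Button Clicked" / "Arrow Button Clicked" variants.
analytics.track('Button Clicked', { buttonColor: 'red', buttonIcon: 'arrow' });
analytics.track('Button Clicked', { buttonColor: 'blue', buttonIcon: 'none' });

console.log(new Set(calls.map((c) => c.event)).size); // 1 — both clicks reuse one event
```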
When they're telling us why they want to track something, in a lot of cases, PMs will say: "This is because we think this is what the user is going to do, and we're not sure. We just need to be able to validate whether or not the product is being used the way we think it is." When they say something like that, it gives you really good context that they're paying attention to their user journey, and that's the right mindset to have.
Once they get to a point of knowing these are the data points they want to track, it just becomes a matter of matching those up to existing events in the schema. In the case that there's not an event in the schema that meets their use case, you write a new one with them and make sure it makes sense to them as well as to the analytics team.
Want an example of a tracking plan? Get a Segment customer’s tracking plan to guide you.
Madelyn: How did your superpower of asking questions help you automate away certain activities?
Arjun: For starters, if you keep on asking enough questions, eventually people want to shut you up, so they're willing to do some work to make sure you won't ask any more questions. That's one piece of it. The other piece is also just understanding more. There's the obvious pitfall of asking so many questions that you're not learning anything because of the noise, and you're causing problems or being a hassle to those around you. But if you're genuinely asking questions because you want to understand the larger context, and other people know this, they're very willing to answer them. They know that if you learn something or you're able to make a connection, you'll probably benefit their life.
At ABI, I just try to ask as many questions as possible to get an understanding of how things work, because in a lot of cases, people assume other people know how things work, and somebody doesn't necessarily have a bird's-eye view of everything. Even if you can just carve out 5% of your domain and truly understand it, that can go a long way toward making others more willing to help you, because you're able to help them, and you truly know it in and out. It also gives better context, because it's not just about how things function, it's also why they were made to function like this. Understanding why things were built in a certain way (or why things weren't done in a certain way) can help with future decisions. And when you know a bunch of random things, you can have better ideas, because you can stand on the shoulders of giants.
Madelyn: Who are some of those giants or supporters?
Arjun: Our engineer at IBM was phenomenal. My favorite thing to do was to sit down and argue with him, because at the end of the discussion, one of us understood why the other was right, and we both agreed on it. There were plenty of times when I walked away thinking, "Oh man, I can't believe I thought that." There were a couple of times he walked away saying, "Okay, Arjun is actually correct." As a result of that, I learned things that weren't necessarily my domain. But when I spoke up on them, he listened and vice versa. If he saw me doing something that wasn't his domain, but he knew that I valued his input, he would say something and help me out.
At both IBM and ABI, I've been pretty lucky to work with the people that I have. At IBM, data was very highly prioritized, so we always had really good support, which I very much appreciated.
Ask questions and offer freebies
Madelyn: What do you do as the organization's data maturity grows, and you're thinking about your job responsibilities?
Arjun: It moves from day-to-day implementation to thinking longer-term and taking something you're doing and turning it into a repeatable or automateable process.
You also consider: “Okay, how do we marry these two different data points? How do we make sure this downstream user is getting this data?” Once you have a stabler data flow and you're pretty confident about what's coming into your system, then you ask, “What other systems do we need to be sending this to and what other systems should we be ingesting?”
A good example is the Foxtrot integration we did with Braze. Foxtrot is our logistics company down in the Dominican Republic, and anytime a delivery is about to go out, the Foxtrot system generates an event that says “route created.” It contains the IDs of all the stores that are going to get hit on that route. And we were able to take that data and plug it into Segment through a Function that Oliver from Segment helped us set up. It would look at the Foxtrot data, look at the CustomerID, match it to a StoreID, pass it into Braze, and trigger a message to those store owners saying, “Hey, your items are out for delivery.”
We were able to go from thinking about it to actually accomplishing it in the space of four days, because we were able to use Segment so that we weren't redoing work that had already been done. We weren't trying to reinvent the wheel. We were able to just take what we had, plug it in, and go. Now, there is a small piece of it in there, and we're going to continue to ramp up the amount of information we're getting. So I think it's really just looking for consistency. When you have stability and repeated performance, that's usually a good indicator that you can start looking to either expand the stack or expand the scope of what the stack is covering.
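(The Foxtrot-to-Braze flow Arjun describes is essentially a fan-out: one "route created" event becomes one Braze-bound event per store on the route. The lookup table, handler name, and property shapes below are illustrative, not the production Function.)

```javascript
// Illustrative CustomerID-by-StoreID lookup; the real mapping lives in data.
const storeOwners = { 'store-1': 'customer-9', 'store-2': 'customer-12' };

// Fan a single "route created" event out into one event per store on the
// route; downstream, each one triggers the "out for delivery" message.
function onRouteCreated(event) {
  return event.properties.storeIds.map((storeId) => ({
    event: 'Delivery Out For Store',
    userId: storeOwners[storeId],
    properties: { storeId },
  }));
}

const outbound = onRouteCreated({
  event: 'route created',
  properties: { storeIds: ['store-1', 'store-2'] },
});
console.log(outbound.length); // 2 — one Braze-bound event per store
```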
Madelyn: Would you share one of your biggest learnings from either expanding the scope or expanding the stack?
Arjun: A big one was shifting from Intercom to Braze. At the time, it did not seem like an expansion, it seemed more like an upgrade. I guess you can argue either way, but we thought we were taking out one component and slotting in another component that did the same thing a little bit better for our use case.
And again, the Wild West of data is still growing. As similar as the two products are, they're still very, very different. What we thought was going to be an intense (but only three-month) process took closer to six months and was even more intense than we thought. My biggest learning was just to keep on asking questions. I wasn't owning Braze, I wasn't owning Intercom, so I was a little more on the periphery of it, and I would just accept the information as it came in. As we got deeper into it, I found out that if you don't ask the questions because you assume somebody else did, chances are that somebody else also made that assumption. It's better to have two people ask the same question then to have nobody ask it at all.
Madelyn: Arjun, you had also shared with me before we spoke today about a "freebie story" that gets early wins to your downstream users. How do those freebies help you onboard new Segment use cases and users from Product, Engineering, Marketing, and Analytics?
Arjun: “If you build it, they will come,” is true to a certain extent, but once they get there, they have to want to stay there. The way that we found we could do that was by essentially letting the first few adopters run wild and do whatever they wanted with the stack. Obviously, that’s not sustainable long-term. But if you want folks to come in and co-sign and tell their colleagues and managers and teams about it, they need to feel like they're getting something out of it. You can't expect them to come in and just pick up your process and do a bunch of dev work and trust that they'll see results in a few weeks.
We said: "Okay, you track what you want, however you want. You can send it into Amplitude. We’ll give you access to that database. Do what you like, and you tell us what works and what doesn't work." And that was great! We had really involved teams that were truly helping us help them, because they were learning about the ins and outs of the development. They were seeing what didn't work and what did. They were telling us: "Hey, for page events, make sure you're tracking the referrer. Make sure that you're including it in this way. Make sure that you're setting up your path this way because otherwise it's going to look different from everybody else's."
(Want to try this freebie with your teams? Send them to Segment University!)
It was really good in that sense, but at the same time, it was apples to oranges to bananas to pineapples. There was no consistency between them. And if that team changed all of a sudden, nobody knew the data. So once we had buy-in and teams could see the value of it, it was easier to start moving towards a process and having criteria in checklists that teams needed to meet in order to move to production, because otherwise all this was for naught.
Madelyn: So you were creating a little bit of space for everyone to see what they could do and what their particular fruit would look like.
Arjun: Exactly. It was about being able to plant your own little tree and see what that looks like and then saying, “Okay, now it's time to really plant the forest.”
Madelyn: Arjun, to wrap up this discussion, what are three takeaways you'd like to leave with the listeners as they think about automating away their jobs?
Arjun: Automating away your job is good. You might feel like everything should be automated immediately. That's not the case. Everything should always have a reason for automation. Just because you can doesn't mean you should.
Everything you're tracking should be tied to a KPI or to a metric or to some specific use case. If you've got a data point out there in the world and you don't know what it's doing, it doesn't know what it's doing either. So maybe stop collecting it.
Finally, use safeguards and governance. It's great to be proactive and to have processes and checklists to get teams through implementation, but you need to have guardrails. You need to have governance, because even the best laid plans can go awry. Be proactive, but also be reactive by having governance on your architecture.
Madelyn: Arjun, thank you for sharing your data-wrangling experiences from IBM and now AB InBev.
Arjun: Madelyn, thank you so much for having me.