Data and Analytics Strategy with Head of Data and Analytics at Google Cloud

Watch this CXOTalk episode featuring Bruno Aziza, Head of Data and Analytics at Google Cloud, as he delves into data strategy, governance, real-time data activation, and data products in the digital transformation era. Learn from practical examples, team collaboration tips, and steps to build a successful data and analytics strategy.


Apr 14, 2023

Watch this important CXOTalk episode for a discussion on data and analytics strategy with expert guest Bruno Aziza, Head of Data and Analytics at Google Cloud.

Bruno shares insights on the convergence of data and workloads, data products, building trust through governance, and activating real-time data for success in the era of digital transformation.

Here is more detail on the topics discussed in this episode:

Bruno Aziza is the Head of Data and Analytics at Google Cloud. He has helped companies of all sizes: startups, mid-size, and large public companies. He helped launch Alpine Data Labs (bought by Tibco), AppStream (bought by Symantec), SiSense (bought Periscope Data) & AtScale. He was at Business Objects when it went public (after acquiring Acta & Crystal Reports, & before SAP bought it for $7B). He was at Microsoft when they turned the Data & Analytics business into a $1B giant.

Bruno specializes in high-growth SaaS, enterprise software, everything data, analytics, data science and artificial intelligence. He was educated in the US, France, the UK & Germany. Bruno has written 2 books on Data Analytics and Enterprise Performance Management. His allegiance is to the Analytics Community worldwide.


Michael Krigsman: Welcome to CXOTalk, Episode #784. We're discussing data and analytics strategy with Bruno Aziza, the head of data and analytics for Google Cloud.

Bruno Aziza: I've been in the data space for over 25 years now. There are probably three key main themes that just continue to come back to the conversations with customers.

Convergence of data and workloads

The first one is this idea of convergence, the convergence of data, and the convergence of workloads. Organizations are challenged with this idea of having siloed teams and siloed data, so they want to be able to bring all of that into a unified workflow.

Think about SQL. Think about Spark. Think about machine learning and BI and AI.

Organizations are trying to kind of figure out ways to get these techniques and data silos, but more importantly, people silos, to converge because they're realizing that the way to get the value is to get more people to collaborate more effectively.

Definitely, this idea of convergence is a very real one. You might have heard of the Lakehouse Concept. But the whole idea here is how do I bring people and data together. That's the first trend.

Importance of governance to build trust in data

The second one, which people, I think, are realizing now, particularly as they're modernizing to the cloud, is the concept of governance not as just a way to secure or protect access but as a way to have better trust in your data. And so, I call this Governance with a big G because it talks about, of course, all the basic stuff that you need to do (security access) but also lineage, quality, and being able to discover metadata effectively.

That's really important because, today, bad data costs the average organization a lot of money. I think there was research from Gartner that showed there was at least $12 million per year.

Activation and real-time data

Then the third one is around activation. It's great to have access to all the data and see all the data. It's really essential to be able to trust it. Now you need to be able to empower people to take action.

What we're seeing is people are moving beyond dashboards into what they started to call data products where they want the ability for their folks to get access to analytics and workflow in real-time without having to pay for what I call the data logistics, the work that you have to do to move the data across multiple places before it can be acted upon. The concept of real-time is an important one when you think about data products.

What are data products

Michael Krigsman: This concept of data products is becoming increasingly important. But I think it's a little bit confusing to people. Can you tell us? When we talk about a data product, what does that refer to?

Bruno Aziza: What is a data product? We run into the semantics of defining how you think about a data product.

Think about it from your perspective as a consumer of information today. I'm sure you listen to a lot of music. If you use Spotify today, for instance, that's a great example of a data product. Why is that? It's because it's a consumer-grade experience that is backed up by an enterprise-grade infrastructure.

You don't want Spotify to break on you because it's a very frustrating experience if it does. You can't find the song or the song doesn't play in time, and so forth.

That's a requirement. Customer-grade experience, very useful, very usable. It requires very little training while at the same time considering the concept of data that's being pushed into the application as a business-critical concept.

The second attribute of a data product is that it's a real-time product. In the past, if you've been in the data business, you've been in the data logistics business where you basically take data out of a source, maybe put it into a data mart or an extract, and then build a visual (typically a dashboard) that enables people to ask a certain set of defined, predefined questions.

A lot of the work that's happened in this industry has been about us, the data people, preconditioning, massaging the data, presenting it so people can ask a set of few questions. The product concept is about removing this idea of data extract, so you're now asking questions to the data in real-time, straight into the origination, and you're also opening the ability for people to ask just about any question.

What we're discovering is for every single question the business user has, there are another six questions they have after that. And so, one of the issues that dashboards typically have is that they're static and they've been preconditioned to answer only a set of questions, and so it creates also a lot of frustration for users, which doesn't work in 2023.

Another great example of a data product is Why does it work? It requires no training, a very simple interface, and a limitless amount of data of just about any format (structured, unstructured, semi-structured), and gives you an incredibly personalized experience at an incredible scale. Billions of people use that product.

When you're thinking about this mindset, designing data products is drastically different from building the data warehousing and the dashboards that we used to build—I want to say—ten years ago, but in fact, we're in the middle of this shift here, so I would even say three years ago.

Michael Krigsman: Then to paraphrase what you just said, the data product extends from what the user experience is, so it's the combination of the underlying data, consumer-friendly, potentially real-time, wrapped up in an attractive user experience, and backed up with the technology infrastructure to deliver that data (as just described) with that type of user experience.

Bruno Aziza: That's right. I think there are three concepts here.

There is the data itself, so it needs to be just about any data: structured, unstructured, and semi-structured. It doesn't really matter. It needs to be data that's across just about any cloud, on-premises, and in the cloud. That's notion one – data.

Time is the second notion, this idea; it needs to be real-time to be useful.

Then people, just like you noted. Your employees and your stakeholders are consumers first, and so they expect their data applications to mirror the consumer experience that they have on their phones. If you're very far from that, it just becomes really hard for them to adopt them.

Certainly, being available at scale, we know today that the dashboards of the past (or even the dashboards of today) are typically only adopted by 30% of your employees. You can imagine 70%, 7 out of the 10 people that you've built this dashboard for, will not look at it. And the consequence is there might be great insights, but they won't be able to act on them because they don't find the experience enriching, trustable, or even relevant or timely.

Michael Krigsman: Please subscribe to our YouTube channel and hit the subscribe button at the top of our website so we can send you our newsletter and you can stay up to date with our upcoming shows.

We have a really interesting question from Twitter from Arsalan Khan. This is a great question. He says, "Wouldn't asking questions of the data be dependent upon how much the user knows about the limits of the data? And so, therefore, don't you need enterprise education around the data?"

Bruno Aziza: Unfortunately, the technology up until now, I think, has limited our ability to create innovation across most organizations. I'll give you a very specific example. I think it addresses the question.

If you think about machine learning today and innovating with machine learning, it's a very expensive process. It requires dedicated talent, very specific understanding, and it also requires time that needs to be dedicated to assign a specific innovation to a specific talent.

What we are noticing is, in fact, when you now open up technology to many people, you actually discover that many of your business users have great questions, and we've limited the ability to answer these questions because of the technology limitation.

I think the answer of the future is kind of what Voltaire used to say, right? I'm French, as you might know, and I'm a big fan of French philosophy. Voltaire used to say, "Judge the person by the quality of their questions, not necessarily the quality of their answers."

I think, by opening access, opening the ability to kind of fail and experiment at a very low cost, you'll be able to identify those really, really good questions that business people should be asking – and they have the ideas. Somehow, we need to figure out a way to unlock that.

I think that if you're just keeping it to the people that know the limitation, that know the technology, that have the specialized skill, you have fewer innovations. And the ones that you are pursuing might not be the right ones, and they're expensive to pursue as well.

Today, I think the stat in the market is 20% of the machine learning models actually make it to production. You can imagine you've reduced the number of questions to just a few people, and even those, 80% of them just never make it to realization. It's a real issue, I think, in today's market where data should be a lot more available and innovation should be a lot easier than it is today.

Practical examples of data products

Michael Krigsman: Can you give us some concrete examples of data products in the enterprise and how that's distinguished from dashboards? We all know what dashboards are. But still, for me, the concept of the data product, it's still kind of abstract.

Bruno Aziza: Yeah. Let me give you a specific example you might relate to in the retail business or in the CPG industry – L'Oréal. Again, I'm sorry I'm picking on a French company, but that's just one example that came to mind. L'Oréal was realizing that they needed to have a better relationship with their customers and more frequent relationships so they can understand their preferences better and build better products.

Now, the way that they would interact with customers, in major part people would go to their website, but they would really have a strong relationship by people going to their stores. Go to their stores, visit the stores, talk to an advisor, understand your preference, and so forth.

They built an application called Modiface, which is essentially an application you get from your phone where you can point it at your face and it suggests maybe products that are relevant to what you're wearing. Maybe you ask for advice on, "Here's what I'm doing tonight," or "Here's what I'm doing today," so it creates makeup options for you. It creates all these options that are related to L'Oréal's products.

By having this really tight relationship with the customers where they went from four hours a year by people visiting their stores to now having a daily conversation with their customers, they saw a much higher level of conversion, 42% conversion between their marketing campaigns and people actually buying products.

They even further extended with that knowledge enabling their infrastructure, their physical infrastructure to enable customers to build their own makeup. So, you can, using Modiface, Michael, you can go and build your own lipstick, if you want, that is just the one lipstick that you've created yourself.

That, I think, takes the idea of an excellent customer experience because we know each other a lot better. And if that's your goal, then you can't achieve it by just waiting for your customers to visit your store.

Similar examples will exist in banking. Retail is not the only one. But you can imagine kind of the extension of data products and what data products would actually provide both for the provider of the service but also for those customers that are having a far superior experience – all of that done through data.

Michael Krigsman: There's this highly interactive aspect that's really tailored to what the customer is trying to accomplish very directly at that moment.

Bruno Aziza: That's right, and I think the character of personalization at scale is really important because I think it demonstrates that you as a provider of a service or product, you are really on the same page as your customer. You really understand them.

If you think about why people go to their bank or their grocery store, they're not just there to buy a specific product or a particular service. The next-level experience is they're there because we connect at the experience level. We know what you want, we know you, we can make your experience frictionless, and we can customize it for you.

If you think about Starbucks, it's a great example of mass customization. You're creating your own coffee. And so, the same concept here happens. Can your technology stack support these steps of experiences in the modern world where your customers are expecting that now?

Infrastructure requirements for data products vs. traditional dashboards

Michael Krigsman: Can you describe differences between the infrastructure necessary for traditional dashboards versus what's required to be in place in order to create this type of data product?

Bruno Aziza: There are a few components that are required. Typically, people refer to it as the modern data stack. What's required in order to work through a scenario like this where the real-time nature of the information is paramount to the experience?

I think, first, I'm going to use a few verbs here because I don't want to talk about just purely the technology but the verbs, I think, are what's probably most important. You know I'm a big fan of the theory of jobs to be done.

I don't know if you know this book, Michael, Competing Against Luck, by Clayton Christensen. It's a really good way to think about how people consume innovation, realizing that, as a customer, you're hiring someone to solve a particular job.

These verbs I'm using are there to define what these jobs are. The first one is about being able to ingest and process data of any type, and being able to ingest it at any speed, so being able to ingest it in batch when it's relevant, but most of the time (using the same technology stack), being able to ingest this in real-time. The real-time nature of the ingestion, the transformation is extremely important.

The second verb is about storing and analyzing. Of course, you need to be able to store your data, any data, in just about the same infrastructure. Typically, most organizations today will have a data warehouse, will have a data lake, will have data marts, and will have dedicated places where they store data of different types and typically of different use cases.

Typically, we tend to think that a data lake is better for data scientists that want to work with unstructured data while a data warehouse might be better for business analysts that want to use structured information. What we're seeing is the convergence of these concepts into one big area, if you will, one big data ecosystem or estate that typically is referred to as the data ocean.

Why the ocean? It's because you want to build on infrastructure that allows you to see no end to your data.

Actually, a customer of mine taught me this concept. I was asking him why was he not talking about data lakes, and he told me that the lake is comfortable. It's landlocked. I see the end of my data.

The ocean is very representative of what I'm dealing with on a daily basis. I've got data across my estate. Sometimes I need data-sharing platforms because I've got data from my providers. Sometimes I get data in another cloud. And sometimes I've got structured and unstructured data that I need to analyze together.

The ability to treat data of different types, of different sources, in the same platform is really important not just for the technology itself but also for the teams themselves.

Today, innovating with machine learning typically requires that you move data to a different place, that you require a different team than your business analysts to work on it, and that's where collaboration breaks. You want to converge people into the same type of toolset and the same ability to work on any data from the one place.

Then the final one is the integration around an API-first structure for building these data products. The big, big difference between a dashboard and a data product, the data product is API-first.

The reason why it matters to be API-first is because you're going to want to ping external systems in real-time through APIs, and you're going to drive actions into other systems through an API, which you can't do in the old world of data extracts, static reports, filtering that maybe required action through a meeting or action through an email. That's really kind of the old way of doing business. Now everything is API-first, and everything is ingested through an API, and everything is acted through an API.

Those three areas, right, so ingesting, transforming, storing, analyzing, and then activating are probably the jobs, if you will, the verbs that I would use to describe what your modern data stack should look like.
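As an editorial illustration of the verbs above, here is a minimal in-memory sketch, not anything shown in the episode. The `DataProduct` class, the `on_alert` callback (standing in for an API call into another system), the spend threshold, and the record fields are all hypothetical. It only shows one code path serving both batch and streaming ingestion, questions asked directly against stored data, and an action pushed out as soon as a rule fires.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class DataProduct:
    """Toy stand-in for the ingest -> store/analyze -> activate loop."""

    # Hypothetical callback standing in for an API call into another system.
    on_alert: Callable[[dict], None]
    threshold: float = 100.0  # assumed business rule, for illustration only
    events: list = field(default_factory=list)

    def ingest(self, records: Iterable[dict]) -> None:
        """Same code path for a batch (a list) or a stream (a generator)."""
        for record in records:
            self.events.append(record)  # store
            self._activate(record)      # act as each record arrives

    def total(self, customer: str) -> float:
        """Analyze: answer a question directly against the stored events."""
        return sum(e["amount"] for e in self.events if e["customer"] == customer)

    def _activate(self, record: dict) -> None:
        """Activate: push an action to another system once the rule fires."""
        if self.total(record["customer"]) >= self.threshold:
            self.on_alert({"customer": record["customer"], "action": "offer_upgrade"})


alerts = []
product = DataProduct(on_alert=alerts.append)

# Batch ingestion: a list of historical records.
product.ingest([
    {"customer": "a", "amount": 40.0},
    {"customer": "b", "amount": 10.0},
])


# Streaming ingestion: a generator yielding records one at a time.
def stream():
    yield {"customer": "a", "amount": 70.0}


product.ingest(stream())
print(alerts)
```

In a real stack the callback would be an HTTP or messaging call and the event list would be a warehouse or lake table; the point of the sketch is only that nothing is extracted to a separate place before it can be acted upon.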

Roles and team structure in data science

Michael Krigsman: We have a couple of really interesting questions from Twitter. Lisbeth Shaw wants to know, "What is meant by applying a product approach to data, and what does that look like from a data scientist perspective?"

Bruno Aziza: You know it's not just about the technology but it's also the team. What we've discovered is that while there's an incredible amount of technology that's available for you as a data team leader to innovate, you also need to adapt your structure to this new challenge.

In the past, we were in the business of protecting data and making sure that only the data that was sanctioned could be used by only a few people. That was a cost conversation. You had capacity-based systems, and you could budget that.

Now you've got this relationship with data that's just a limitless amount of data, limitless amount of people, so how do you organize yourself around that? What we've found is the teams that succeed the best, particularly when you talk about data science, typically have five key roles that they organize themselves around.

The first one is the data product manager. So, just like it is in any software development organization, you have someone whose job it is to say, "Here's what the final product is, and I own the result of that from the origination of the data all the way to the consumption of that data itself," and the data product manager tends to be the right role that people hire.

The second one is the program manager. The program manager is the person that's going to partner with their product manager, and they're going to make sure that everything happens on time and per the requirements that were written by the product manager. Typically, a product manager might use a product requirement document or PRD to define what their product should be.

The third role is a UX leader. We always think of ourselves as being very logical. If you're a data scientist yourself or business analyst, you imagine that your stakeholders will react the same way to the applications you're using. But what we're realizing is that's actually not the case.

A big reason why they're not adopting our application is because the UX or the UI needs to look a little different. And so, we always think the best UI is no UI. Think about how can you make it easy for people to consume the application.

Then the fourth role is the data engineer and the data scientist. They tend to work and design and build and deploy the infrastructure but also the applications around all the verbs that I talked about from data storing, analyzing, processing, and so forth. Their job is really about how we make it possible to activate these products at scale.

Then finally, the chief data officer. The great news here, if you're working at a data organization today, Lisbeth, I hope that there is a chief data officer where you work because now 83% of organizations have chief data officers. Ten years ago, only 10% or 12% of organizations had chief data officers.

There's great news for us here because that means that part of your team is someone in the executive suite who is reinforcing this need of this is how we act today. This is why data is now becoming an incredible asset for saving money and making money for the organization.

I know, Michael, I might have over-answered the question here, but I thought it's very relevant for how you think about data science and the other roles involved in the data team.

“Data scientists are lonely”

Michael Krigsman: Absolutely. You're pointing out an extremely crucial aspect of this, which is data science does not exist as an abstract concept. It's not a platonic concept. It needs to be actually executed with people by people.

Bruno Aziza: And it does not exist just in isolation to the rest of the team. I think that's really what sometimes makes the data scientists lonely is that they feel like they're just the only one pushing for this agenda. Really, look to be included in the larger team across these roles that I talked about here.

Michael Krigsman: I'm just tweeting out "Data scientists are sometimes lonely." [Laughter]

Bruno Aziza: It's a difficult profession because I think one of the issues that we deal with is people don't understand how hard of a job working with data can be. And so, if you're a data scientist, you're highly qualified. You almost sometimes have this curse of knowledge where you know how hard it can be, but then you might have business users that have expectations of delivery in a timeframe that maybe your current stack can't allow you to deliver, so you really need to modernize your stack, modernize your team so you can develop solutions at the speed of the business.

Data science in small organizations

Michael Krigsman: We have a couple more questions coming in. I'm just going to take these in order. Arsalan Khan comes back again with another really, really good question. He says, "It seems that to accomplish the things you're describing requires a lot of resources. Small organizations don't have those resources, so is this becoming a kind of have and have-not situation of data?"

Bruno Aziza: This might have been true (this assumption of working with data at scale) in the old days where you did have to buy very expensive, on-premises systems, and you needed people to support them as well. It wasn't just that you needed to spend a lot of money in software itself, but you also needed to spend a lot of money in the team supporting that software or that hardware. In the on-premises world, I think that was very true.

I think what we have noticed over the last ten years is really a complete shift where now you can get access to an incredible amount of compute, an incredible amount of storage for pennies, and you can get it done in the next five minutes.

I'll just give an example. BigQuery has a sandbox. You don't even need a credit card to start working with a data warehousing solution at scale, hosted in the cloud, managed for you. So, one person in five minutes can do something that maybe ten years ago would require a team of ten people and a procurement process that could take six months.

This accessibility of technology has really been incredible in the last ten years. Similarly, machine learning now is more accessible than it has ever been.

If I look at products like BigQuery machine learning, for instance, that is a SQL way to interface with machine learning models. That was never done before. So, in terms of the ability to hire someone that can write machine learning models, that's a process by itself. So much is available today that I think it's also changed the way we, as data people, need to think about what we're building.
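To make the "SQL way to interface with machine learning models" concrete, here is a small editorial sketch that only builds the two SQL statements as Python strings. The statement shapes (`CREATE OR REPLACE MODEL ... OPTIONS ... AS SELECT` and `ML.PREDICT`) are BigQuery ML syntax; the dataset, table, model, and column names are hypothetical and were not mentioned in the episode.

```python
# Hypothetical dataset, table, model, and column names, for illustration only.
DATASET = "my_dataset"


def create_model_sql(dataset: str = DATASET) -> str:
    """Build a BigQuery ML statement that trains a logistic regression model."""
    return f"""
CREATE OR REPLACE MODEL `{dataset}.churn_model`
OPTIONS (model_type = 'logistic_reg', input_label_cols = ['churned']) AS
SELECT tenure_months, monthly_spend, churned
FROM `{dataset}.customers`
""".strip()


def predict_sql(dataset: str = DATASET) -> str:
    """Build a BigQuery ML statement that scores new rows with the model."""
    return f"""
SELECT *
FROM ML.PREDICT(
  MODEL `{dataset}.churn_model`,
  (SELECT tenure_months, monthly_spend FROM `{dataset}.new_customers`)
)
""".strip()


print(create_model_sql())
print(predict_sql())
```

Assuming the `google-cloud-bigquery` package and a configured project, each string could be submitted with `bigquery.Client().query(sql).result()`; that step is left out here so the sketch runs without credentials.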

In the past, when capacity was not available, it would take a long time to procure, and it would take a lot of money to build a team and buy the software required. The question that we would ask is, "Can we build this?" You have an idea and you say, "Can we actually effectively do this?"

Today I think the question is, "Should you do it?" because so much technology is available that now we've got a lot of solutions that are built where they are built because you can. And so, it requires a different mindset, I think, for a lot of teams today that is completely a different one than it was maybe ten years ago.

But the good news here is because you can essentially build anything today, the cost of experimentation just goes down to a ridiculously low cost. Now more ideas can be seen.

Michael Krigsman: It is amazing, the power of the cloud to provide these kinds of resources to anybody at extremely low cost and immediately as well.

Bruno Aziza: That's right.

Michael Krigsman: Incredible. I have another really interesting question. You can see I love taking questions from Twitter and from LinkedIn. You guys in the audience are so smart, so intelligent. It's awesome.

Data products and digital transformation

This is from Chris Peterson. His Twitter name is also on Mastodon. Okay, Chris. Chris wants to know, "Would you say that data products are some of the main goals for your customers' digital transformation efforts?" Or to ask it another way, what's the intersection between these data products and digital transformation more broadly?

Bruno Aziza: I would say that they are tightly related because part of your digital transformation is the modernization of your technology stack. But I think probably most importantly your processes.

I think what we notice is that the companies that tend to just do a lift and shift, meaning take the processes and the applications they have on-premises and just essentially replicate them in the cloud, are realizing there is a good amount of missed opportunity if you do it that way.

I'll give you an example. A recent customer I was talking to has thousands of reports on-prem, and they're looking to modernize. The first approach was, you know, there's got to be a reason for why these reports are there, so I'm just going to take every single one and build a project plan that's going to copy what's happening on-prem and now I'm going to put that in the cloud.

What we looked at when we saw the metrics around these reports is it was a third, a third, a third. What I mean by that is a third absolutely could be lifted and shifted into the cloud and it made a lot of sense because they had a lot of users. They were effectively at the core of the business being run by this organization.

A third needed to be modernized because they had effective metrics that needed to be seen, but they had issues with usability, availability, reliability, so they clearly needed to be modernized, kept and modernized, if you will.

But there's a good third that nobody cared about. They were the wrong reports. They had their own semantics. Nobody used them. They were essentially not as useful as the other two categories. Those, we never moved them or modernized them.
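The "third, a third, a third" triage can be sketched as a simple rule over usage metrics. This is an editorial illustration under stated assumptions: the `Report` fields, the `min_users` cutoff, and the sample reports are hypothetical, and the customer's actual criteria were not given in the conversation.

```python
from dataclasses import dataclass


@dataclass
class Report:
    name: str
    monthly_users: int
    reliable: bool  # hypothetical flag: usable, available, and reliable as-is


def triage(report: Report, min_users: int = 10) -> str:
    """Sort a report into one of the three buckets described above."""
    if report.monthly_users < min_users:
        return "retire"          # nobody uses it: do not move or modernize
    if report.reliable:
        return "lift_and_shift"  # widely used and healthy: move as-is
    return "modernize"           # valuable metrics, poor experience: rebuild


reports = [
    Report("daily_sales", monthly_users=250, reliable=True),
    Report("inventory_aging", monthly_users=80, reliable=False),
    Report("legacy_export_17", monthly_users=0, reliable=True),
]

plan = {r.name: triage(r) for r in reports}
print(plan)
```

The usage check comes first on purpose: a report nobody opens is retired regardless of how reliable it is, which mirrors the decision not to migrate the unused third at all.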

I think where it's similar is, when you think about digitalization, it's not a technology problem. What you're really looking to build is a faster, bigger, stronger organization that is closer to your customers that has better operational efficiency. And if that's the case, then it does require that you have to think about what we're doing today is not what we're going to want to do tomorrow.

Maybe we're solving the same problems, but we're going to solve them differently. Digitalization really is about transformation, not so much just replication.

Team collaboration in data science

Michael Krigsman: You've emphasized the importance of teams and team collaboration. You described five important roles that a data product team should have in place. But can you elaborate on why this concept of collaboration and working together becomes so important with this environment particularly?

Bruno Aziza: Often we get enamored with the incredible amount of innovation being thrown at us. Every week, there's new technology being announced that we think is going to change the world.

I was reading research recently. For every dollar spent on employees in the technology organization, there's anywhere between $0.30 to $0.60 spent on tools. And so, the reason why employees are extremely important to your technology strategy is you can see from a spending standpoint, they are the majority of the spend.
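The arithmetic behind "they are the majority of the spend" can be checked in a few lines. This editorial sketch assumes, for simplicity, that employees and tools are the only two cost lines; the function name is hypothetical.

```python
def employee_share(tool_spend_per_employee_dollar: float) -> float:
    """Share of combined (employee + tool) spend that goes to employees."""
    return 1.0 / (1.0 + tool_spend_per_employee_dollar)


low = employee_share(0.60)   # $0.60 on tools per employee dollar
high = employee_share(0.30)  # $0.30 on tools per employee dollar
print(f"{low:.1%} to {high:.1%}")  # prints "62.5% to 76.9%"
```

Under that assumption, employees account for roughly 62% to 77% of the combined spend across the quoted range.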

Now, of course, there are some industries that have higher tech spend, right? Financial services, technology, and healthcare organizations, we found, tend to spend more on tech, whereas retail and e-commerce might not.

It's going to be different in your industry, but the point is the biggest silos that you have in your organization are actually not the data silos. They're the people silos. And so, being able to figure out how to get teams to collaborate is how you're going to be able to create more innovation.

Where we see it not working is when you have teams divided around disciplines. You have your data scientists over here and your business analysts over there. They're using a different tech stack. They're using different data. They're using different languages. And for them, collaboration is impaired by design. Breaking through that is extremely important.

The other reason why employees are really important is, I think, in our business, even though we're logical and we think that, because we're data people, everything is decided through logic, in fact, that's not how we make decisions. We're emotional people who make decisions that are often justified by data.

We're not logical people. The emotions modify the decision. A great example of that is culture, affecting data culture inside an organization. It's the number one issue for an organization. To be data-driven, you've got to think about that.

Culture is not what you put on the billboard on the wall. Culture is what you do. And so, you have to be able to exemplify. You have to be able to organize your teams in ways that are aligned with your culture and your principles. That's why we spend a lot of time looking at organizational design because that is ultimately what's going to make the difference.

Michael Krigsman: Arsalan Khan comes back yet again. Just before you mentioned culture, he asks – perfect timing—

Bruno Aziza: [Laughter]

How to build a great data culture

Michael Krigsman: He anticipated your comments, and he says, "Okay, Mr. Aziza. What is the ideal data culture and how do you create it?"

Bruno Aziza: I don't know that I have... I wish I had the formula for here are the five things you need to do. But I will tell you what I'm learning from working with customers. You think about Mercado Libre and WPP.

Adrian at Mercado Libre or Di Mayze at WPP are examples to follow on how they've built an incredible data culture across their organizations. There are a few things that they do, and I can share with you what they're doing so you can get a sense.

The first one is they do have data culture principles. That's a required process. You have to go and state this is how we think about data at this organization.

But they don't stop there. Of course, they have the posters. But then what they do is they use practices by which the employees are exposed to this culture on a daily basis. There are a few tactics.

The first one is the easiest thing that you can do today, and it sounds really simple. T-shirts, hats, mugs: brand your data culture initiative and make sure people use that in their daily meetings.

I had a customer, for instance, that was in the gambling business. They called their culture initiative DICE because people would recognize, "Oh, DICE. That's what we work with on a daily basis."

I asked, "What's DICE?" He explained to me it was data integration center of excellence. He came up with an acronym that defines what they're focused on (in this case data integration) but also it's catchy.

That's the first thing. I know it requires us to be a little bit more creative, but it actually makes a big difference.

The second one is they actually hire people to help with data literacy and data enablement. In the case of Adrian at Mercado Libre, he's hired someone who has been in the training business to come in and enable people on using the solutions and data products that they've built.

Of course, we're not talking here about people building data products. We're talking about people consuming data products in an effective way in their daily workflow. That does require a lot of work, with the assumption that the people, their stakeholders, are not expected to be data creators necessarily. You just want to create consumption patterns that are more effective for your organization.

Then the third thing that they do is they do what we call decision-making introspection. Often, you have a practice of postmortem. You go through a process. A decision occurs. It's not yielding the exact results you expected, and you want to assess why that happened.

A lot of the organizations now that are trying to drive this data culture, this data-first culture, they're doing a premortem. Before we go out and make a decision, let's get in a room and imagine everything that could go wrong because there's nothing worse than reacting, and so they want to proact basically on something they imagine could go wrong.

These are simple examples. They're not simple to implement, but they're simple tactics that really reinforce this idea of culture is what we do; it's not what we say. Where do you see those examples throughout your day where you are effectively living your culture?

Michael Krigsman: Clearly, there is a very strong intention and recognition upfront of the importance of culture and, therefore, it is not left to be, "Oh, well, it'll just happen," or it's not left as an afterthought.

Bruno Aziza: That's correct. I think the intention is in having the chief data officer and the CEO, ideally, lead with data, and lead not as a suggestion, not as sponsorship, but as a mandate, realizing data is the way we get this company to the next level. And this means some habits, new habits, are going to show up in ways that maybe we didn't think about before.

Data quality creates trust and confidence in the data

Michael Krigsman: Where does data quality come into play now with data products?

Bruno Aziza: It's probably the most important component because if you think about why it is that people are not adopting the dashboards of the past, of course, there are usability issues. Of course, there is a timeliness issue. Of course, there are all the issues related to just the infrastructure itself and maybe the skill set required to use these dashboards.

The number one reason why people can't make decisions based on the data that's provided to them is because they don't trust it. They just don't trust the data. And trust is the basis of just about anything we do.

If you even think about your team, what makes a team perform is the element of trust. It's that you can communicate to them that you trust that they will actually achieve the opportunity in front of them. And the same thing for data.

This is not a new problem. I was actually discussing this recently. I Googled the principles of data quality.

I found a 1991 paper from MIT. In fact, the author is Richard Wang, who is the person that organizes MIT's CDO symposium. He was listing the 20 attributes of data quality.

Now, there are a lot of ways that you can think about data quality, as you can see, but I would say there are probably three things that matter the most.

The first one is, what do you know about your data today? Is it complete? Is it fresh? Is it rich? Is it secure? The richness of the data really enables you to trust it even more.

The second item is, how does it relate to your people? Are people able to find this data, or is the way it's presented to them relevant to them?

As I said earlier, we're emotional creatures making decisions. We need this to be relevant to us. The nirvana of any data is when it tells you something about yourself that maybe you don't know about yourself. The aspect of being relevant and personal to the person is really important.

Then the third attribute is, is it actually actionable? Is it something where I'm telling you, Michael, "It rained yesterday. You should have gotten an umbrella"?

Well, that's accurate information. You've got great action, but there's nothing you can do about it. It rained yesterday.

This idea of being timely, understandable, and actionable is a key component of how you assess data and how you assess and trust the value of the data that's being provided to you by your organization and your data teams.
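The first of the three questions above (is the data complete, is it fresh?) lends itself to a small check. The following is only a minimal sketch, not something described in the conversation: the function names, the record layout, and the choice of "non-empty required fields" and "maximum age" as measures are all illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def completeness(records, required_fields):
    """Fraction of records that have a non-empty value for every required field.

    `records` is a list of dicts; this layout is an assumption for illustration.
    """
    if not records:
        return 0.0
    complete = sum(
        1
        for record in records
        if all(record.get(field) not in (None, "") for field in required_fields)
    )
    return complete / len(records)


def is_fresh(last_updated, max_age, now=None):
    """True if the data was updated within `max_age` of `now` (default: current UTC time)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_updated <= max_age


# Hypothetical sample data, invented for this example.
sample = [
    {"customer_id": 1, "region": "EU"},
    {"customer_id": 2, "region": ""},  # missing region, so counted as incomplete
]
print(completeness(sample, ["customer_id", "region"]))
print(
    is_fresh(
        datetime(2023, 4, 13, tzinfo=timezone.utc),
        timedelta(days=2),
        now=datetime(2023, 4, 14, tzinfo=timezone.utc),
    )
)
```

Checks like these cover only what you know about the data itself; relevance to people and actionability, the other two attributes, still need human judgment.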

Steps to build a data and analytics strategy

Michael Krigsman: What advice do you have for organizations who are trying to build a data and analytics strategy in line with the principles you were just describing?

Bruno Aziza: The first one is start with people. Don't necessarily just start with technology. I know that's probably surprising because I work for a technology company. But I really do believe that it is the way that we are going to be able to help each other innovate. And so, this concept of the five roles across your data team is a really critical one.

Look at the ratio of data people you have inside your IT organization. That's a really big one when you are thinking about how many people do I need.

In the average organization, anywhere between 2% and 6% of the IT group is data people. In the most mature organizations, anywhere between 15% and 18% of the IT people are data people.

I would benchmark that. Just look at what your organization looks like.
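The benchmark described above comes down to simple arithmetic. Here is a minimal sketch of it, using the 2% to 6% and 15% to 18% ranges quoted in the conversation; the function names, the labels, and the headcounts in the example are hypothetical.

```python
# Ranges quoted in the conversation: share of IT staff who are data people.
AVERAGE_RANGE = (0.02, 0.06)
MATURE_RANGE = (0.15, 0.18)


def data_share(data_headcount, it_headcount):
    """Share of the IT organization made up of data people."""
    if it_headcount <= 0:
        raise ValueError("it_headcount must be positive")
    return data_headcount / it_headcount


def classify(share):
    """Place a share relative to the quoted ranges (the labels are illustrative)."""
    if share < AVERAGE_RANGE[0]:
        return "below the average range"
    if share <= AVERAGE_RANGE[1]:
        return "within the average range"
    if share < MATURE_RANGE[0]:
        return "between the average and mature ranges"
    if share <= MATURE_RANGE[1]:
        return "within the mature range"
    return "above the mature range"


# Hypothetical headcounts: 12 data people in an IT group of 300.
share = data_share(12, 300)
print(f"{share:.1%}: {classify(share)}")
```

Counting who qualifies as a "data person" is left to the organization; the sketch only compares the resulting share to the quoted ranges.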

Second, I would look at building for scale today. The mistake that I think many organizations are making with data is that they might build for small data today, saying, "You know what? Big data is not relevant to me. I'll worry about this in three years."

Unfortunately, what happens is the choices that they make on small data infrastructure do not scale economically and financially for the large data issues that they are in fact having a lot earlier than they thought they would. They're buying into a model that just doesn't help them succeed months after they've made the decision.

Then the third bit is focus on integration and unification, not just of the technology but of the teams. How can you get your teams to collaborate more often at a higher velocity on more problems?

We talked about making machine learning approachable. Please don't just make it a thing that only a few people inside your organization can do.

As we see, it doesn't work. Twenty percent of machine learning models make it to production. That's not how we're going to innovate. We've got to be able to get more people to ask more questions so we can get to which are the most important questions to solve, the ones that are most valuable, the ones that move the needle the most.

Michael Krigsman: Okay, and with that, we have covered a lot of ground today. A huge thank you to Bruno Aziza. He is the head of data and analytics for Google Cloud, and he runs a great (what he calls) carcast. It's like a video podcast but from his car. He always wears sunglasses.

Bruno Aziza: There you go. And I have a pair for you as well, Michael, when you become a guest on my carcast.

Michael Krigsman: I accept those virtually. Bruno, thank you so much for taking your time to be with us. I really, really appreciate it.

Bruno Aziza: Thanks for having me. It's always fun to talk to you.

Michael Krigsman: Everybody who watched, thank you for being here, and especially those folks who ask such awesome questions. Now, before you go, please subscribe to our YouTube channel and hit the subscribe button at the top of our website so we can send you our newsletter and you can stay up to date with our upcoming shows.

With that, check out our website for the upcoming schedule, and we will talk with you soon. Have a great day, everybody.

Published Date: Apr 14, 2023

Author: Michael Krigsman

Episode ID: 784