Join us on CXOTalk episode 778 as we delve into the complexities of AI failure with our guest, Iavor Bojinov, Assistant Professor at Harvard Business School, and QuHarrison Terry, Head of Growth Marketing at Mark Cuban Companies. In this episode, we explore the various types of AI failures that can occur, the causes behind them, and how they differ from traditional technology and IT projects.
Our experts also address the implications of these differences and distinguish between technology failures and human errors that can cause AI failures. In the second segment, we discuss solutions to address AI failure, including practical advice on how to avoid it in the first place. We also touch on the crucial ethical and privacy considerations associated with AI.
This is a must-watch episode for business leaders building an AI group within their organization. Our guests offer valuable insights into AI governance, which can help prevent failures and maximize opportunities.
The conversation includes these topics:
- Understanding AI operations and AI project failures
- Differences between AI projects and traditional technology or IT projects: AI is probabilistic
- Five steps of an AI project
- The shift from rule-based expert systems to probabilistic AI
- Implementing AI in small businesses
- AI pilot projects should have direct implications on revenue
- Categories of AI failure
- Successful projects integrate AI into processes and business operations
- Data sets are a key difference between traditional IT projects and AI projects
- Digital Transformation vs. AI
- Ethical considerations of AI
- Principles of privacy by design
- Addressing bias in decision-making based on data
- Governance and regulation of AI
- Advice to business and technology leaders on preventing AI failures
Iavor Bojinov is an Assistant Professor of Business Administration and the Richard Hodgson Fellow at Harvard Business School. He is the co-PI of the AI and Data Science Operations Lab and a faculty affiliate in the Department of Statistics at Harvard University and the Harvard Data Science Initiative. His research and writings center on data science strategy and operations, aiming to understand how companies should overcome the methodological and operational challenges presented by the novel applications of AI. His work has been published in top academic journals such as Annals of Applied Statistics, Biometrika, The Journal of the American Statistical Association, Quantitative Economics, Management Science, and Science, and has been cited in Forbes, The New York Times, The Washington Post, and Reuters, among other outlets.
Professor Bojinov is also the co-creator of the first-year required MBA course “Data Science for Managers” and has previously taught the “Competing in the Age of AI” and “Technology and Operations Management” courses. Before joining Harvard Business School, Professor Bojinov worked as a data scientist leading the causal inference effort within the Applied Research Group at LinkedIn. He holds a Ph.D. and an MA in Statistics from Harvard and an MSci in Mathematics from King’s College London.
QuHarrison Terry is Head of Growth Marketing at Mark Cuban Companies, a Dallas, Texas venture capital firm, where he advises and assists portfolio companies with their marketing strategies and objectives.
Previously, he led marketing at Redox, focusing on lead acquisition, new user experience, events, and content marketing. QuHarrison has been featured on CNN, Harvard Business Review, WIRED, and Forbes, and is the co-host of CNBC's primetime series No Retreat: Business Bootcamp. As a speaker and moderator, QuHarrison has presented at CES, TEDx, Techsylvania in Romania, Persol Holdings in Tokyo, SXSW in Austin, TX, and more. QuHarrison is a 4x recipient of LinkedIn's Top Voices in Technology award.
Michael Krigsman: Today on CXOTalk, we're exploring the interesting, fascinating world of AI failures. What is an AI failure? How do we recognize it? Most importantly, how do we avoid these things?
Our guests are Iavor Bojinov, Assistant Professor at Harvard Business School, and my guest co-host (welcoming him back) is QuHarrison Terry, who is the head of growth marketing with Mark Cuban Companies. Iavor, just briefly tell us about your work at HBS.
Iavor Bojinov: My work at HBS is around this new field that's emerging on data science and AI operations. What the field is studying is how companies should integrate and deploy AI and data science into their operations. That, in a nutshell, is what all of my research has been about (in the past three or four years) since I joined HBS.
Michael Krigsman: QuHarrison Terry, welcome back to CXOTalk. It's always awesome when you are a co-host with me.
QuHarrison Terry: This is a special topic that's near and dear to myself because every day I find myself thinking about the future and the impact it's going to have on our lives. Where is all this stuff going? It's going to be a great show; me, you, and Iavor. We get to talk about AI.
Michael Krigsman: Iavor, you're researching AI operations. When we talk about an AI failure, what exactly do we mean?
Iavor Bojinov: The definition of AI failure is when AI fails to deliver on the promise. Now, if you want to get a little bit deeper into this, I think it's important to make a distinction between the different AI applications.
At a high level, there are basically two different types of AI. You have internal applications and you have external applications.
Now, external applications, those are the algorithms that companies deploy that their customers see. Think like the Netflix recommendation algorithm. ChatGPT is a great example.
The Lyft matching algorithm. These are algorithms that companies deploy to their customers. It's usually part of a particular product or stands on its own. ChatGPT is just a deployment on its own.
Internal-facing AI projects, those are usually called data science, and they're really for the employees. They're designed to improve the operations of the organization.
For example, this could be some automation in the factory. It could be some recommendation system to help your sales associates prioritize different leads, et cetera.
When you look at failure within these two different contexts, they look quite different. If you're looking at internal applications of AI, then the failure is around failing to achieve operational efficiencies or gains. If you're looking at external applications of AI, the failure is around failing to deliver on revenue growth or cost-cutting.
That's the big distinction that I think is really important and I'm sure we'll touch more on as we continue this conversation.
QuHarrison Terry: How do you see AI projects differing from the general technology projects that an organization is going to take on?
Iavor Bojinov: I think the big difference is that AI is probabilistic in nature. That means it's sometimes wrong, whereas traditional technology or IT applications always give you the same output.
Of course, AI has all of the challenges that come together and come from the IT applications, but they have this additional threshold of, if you ask the same question twice, you could get two different answers. Very often, those answers could actually be wrong.
That adds an extra layer of complexity that you have to deal with, and that requires a lot more education because it's one thing to say, "Use this software and always use the software." It's another thing to say, "Use the software when it gives you the right recommendations but overrule it when it's wrong." Right?
That requires a lot more judgment and a lot more education and understanding. There's lots of overlap, but there are some really interesting and unique aspects when it comes to AI.
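The distinction Bojinov draws can be sketched in a few lines of Python. This is a toy illustration only: the "model" below is just a random sampler standing in for a real AI system, and all names and data are invented.

```python
import random

def deterministic_lookup(order_id, price_table):
    # Traditional IT: the same input always yields the same output.
    return price_table[order_id]

def probabilistic_classifier(text, spam_probability=0.8):
    # A toy stand-in for an AI model: the label is sampled from a
    # probability distribution, so repeated calls on the same input
    # can disagree, and some answers will simply be wrong.
    return "spam" if random.random() < spam_probability else "not spam"

price_table = {"A100": 19.99}
# Deterministic software: asking twice always gives the same answer.
assert deterministic_lookup("A100", price_table) == deterministic_lookup("A100", price_table)

# Probabilistic AI: asking the same question many times usually
# surfaces both answers.
answers = {probabilistic_classifier("win a free prize!!!") for _ in range(1000)}
print(answers)
```

The practical consequence is the one discussed above: a user of the lookup function needs no judgment, while a user of the classifier has to know when to overrule it.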
Michael Krigsman: With traditional IT projects, we can say that they were mechanistic. We had a body of software and hardware that we needed to implement. We had a defined set of data (employee records or whatever it might be). And we had to get the software installed, get the data running, and roll it out to our employees.
The results were known and very clear. Failure looked like not meeting expectations, or maybe the project was over time or over budget.
AI is very different because the results are so open-ended. You have the project aspect, but then you have this black box algorithm and data. What about that? How do these pieces overlay?
Iavor Bojinov: There's one additional thing, which is algorithms change over time. Even if you get your AI project done on time (you deploy it, you scale it, people are using it), six months down the line, the predictive accuracy that you had could have dropped by half, or maybe some new biases could have been introduced. Basically, you have all of the potential IT failures with a whole layer of complexity around the development of it, the evaluation of it, the deployment of it, and even the management and monitoring in the future.
I think, when it comes to AI failure – and we'll talk a little bit more about this, I'm sure, in a minute – I think there are different areas where it fails. It's really important to understand what an AI project actually looks like. Then you can try to identify each of the failures within those steps of that project.
QuHarrison Terry: We call it the probabilistic failures of AI. It's an interesting way to bucket it.
You almost think about AI projects as something that a company endeavors in. You just almost start off and say, "Hey, we're going to fail. But if we don't fail, the outcomes or the probabilities are going to be drastically different than anything we would get within our organization just by our own pursuits." Is that the way you envision that?
Iavor Bojinov: Absolutely. Then the other thing is the bar for AI is much higher than for humans.
Think about self-driving cars. That's an AI. If a self-driving car gets into an accident, that makes mainstream news.
If a self-driving car makes a mistake, that's headlines in the New York Times. If a driver makes a mistake, that doesn't even get mentioned on the local news.
The bar, the threshold that AI has to overcome in order to be widely deployed is just much higher than what humans have to overcome. That's what I mean.
That's because AI is sometimes wrong, and people are not used to using a tool that one out of ten times will just give you the wrong suggestion. But that's what AI does. The other nine times, it's really good and it's much better than what you could do on your own. But sometimes it's just categorically wrong.
That sort of changes the nature of how a person is interacting with the AI. The level of education that they need to have, the level of understanding that they need to have, and the level of autonomy they need to have is very different compared to just a typical software or typical IT product that just says, "Okay. Every single day, this is what you do."
It's completely deterministic. There's no randomness. That's what I mean by the probabilistic nature.
Qu, does that help with the intuition there?
QuHarrison Terry: It makes a ton of sense. It's quite fascinating, too, because when you think of AI and the bar that's set for it, I almost feel like that's just for the next decade or so. Eventually, we won't make a big deal that AI is running a lot of the operations.
I think back to when I was in grade school. It was a big deal to use spellcheck. Right? You didn't want to use a variant of Microsoft Word that didn't include spellcheck.
When you did use that variant, you knew the difference. You were like, I might have grammar issues, or I might have spelled that word incorrectly.
Now, today, spellcheck is ubiquitous across all of our devices. It's not software specific. It's not even platform specific.
If you use iOS, Android, Windows, you're going to get some variant of auto-correct and the bar isn't that high, like you said. It's just kind of accepted, even though it's not that good.
Michael Krigsman: Be sure to subscribe to our YouTube channel and hit the subscribe button at the top of our website so we can keep you up to date on the latest upcoming live shows.
When I was studying traditional IT failures (going back some years now), really the primary causes of failure had to do with less about the technology and more with poor communication among stakeholders, mismanaged expectations among the customer, the software provider, and the services provider, the system integrator. But those are project aspects.
With AI, it seems like all of those pieces are in place. But then again, layered on top of it, you have additional levels of complexity. Again, is that a proper way to look at it from your view?
Iavor Bojinov: I think it's really helpful to break down an AI project into several distinct areas. I think, within each area, there's opportunity for failure. Let me do that right now, and then we can spend a little bit of time talking about each of these five steps.
The first step is project selection. This is when you think about which algorithm or which operational process we should try to digitize and maybe automate. There could be potentially 200 or 300 different projects you can go after, and this is about picking the right project that is going to be both impactful and feasible.
The second step is the actual development of this. This is when you take your idea, you get the data, and you build that prototype to check if it's going to work as you expected.
The third step is the actual evaluation of this. This is when you've built your algorithm, you've tested it on historical data, and now you're going to deploy it on real people and see if it's having the type of impact you hoped that it would have.
Now here, a lot of companies fail to do this step very carefully, and they should be way more careful about it because there are studies coming out of places like Google and Microsoft that say something like 70% to 80% of everything the company tries has a negative or neutral impact on the very same business metrics they were designed to improve. Chances are whatever you've developed doesn't really have that much of an impact on your customers or the users.
The fourth step is then the deployment and the scaling of this to 100%. Usually, in your pilot, you're going to start off maybe with one team, one product area, maybe 5% of your customers. But at some point, you need to launch this to absolutely everyone.
Then the final step is the management of this, which is you continue to monitor. You watch for biases. You watch for drift. You watch for all the other things that could go wrong over time.
These are the five steps of an AI project. Within each of those steps, there are lots of ways the project could fail. In some places it's similar to traditional IT failure, but in some of these five steps, there are some very unique aspects of AI failure.
Michael Krigsman: Let's jump for a moment to Twitter because I love taking questions from the audience. You guys in the audience are so smart, so bright, and we have a couple of questions that have come up. Let's jump there, and we'll come back and look at what you were just describing.
Chris Peterson says, "It sounds like AI today is nothing but machine learning based on neural networks. In the past, industrial AI projects were based on other technologies such as rule-based expert systems, and they created tremendous value, but they didn't have that probabilistic element. Is that just gone now, or is there still a place for these systems?"
Iavor Bojinov: I think, by and large, these systems have gone out of practice because we've shown that, with probabilistic AI, we can do much better. There are definitely some areas where these are still deployed. In manufacturing, for instance, there are still some rule-based systems. But really, we've transitioned to this branch of machine learning.
I love this question because even a few years ago, I would not call this AI. I would call this machine learning. But over the past three or four years, there's just been this transition to use the term AI to basically mean not artificial intelligence, not in the way the computer scientists thought about it in the '60s and '50s when that term came into being.
It's really about any process that is now being automated and a computer is handling, that's usually referred to as AI these days. That kind of encompasses machine learning, and that could be for neural networks, or it could be as simple as a regression. A simple linear regression for many companies is now AI.
QuHarrison Terry: Yeah, yeah, yeah. We've got a question from Twitter that is actually quite good. Arsalan Khan is wondering, "If AI is being constantly updated," the data, the algorithms, et cetera, "this requires a ton of resources. If you're a small business, you might not have the disposable income to throw at these big projects, and you'll be at a disadvantage in unlocking the potentials of AI. What should you do in that case?" What are your thoughts on that, Iavor?
Iavor Bojinov: I think this is a big misconception. You don't need to throw large amounts of resources for the vast majority of machine learning algorithms.
Of course, there are a handful of AI algorithms that need lots of data, lots of compute. These are things like image recognition, video ranking.
The large language models that we've seen like ChatGPT, they need lots of data, lots of compute. But actually, for most other tasks, you really don't need much. You can run everything on your basic laptop with a single core within a few seconds.
Modern computers are really, really powerful. So, for most of your AI applications that can add real value, they're not the ones that are doing image recognition. They're the ones that are maybe helping you figure out which lead you should pursue, maybe which sales lead, or maybe they're looking at a bunch of your products and identifying which of your products are performing really well on Amazon and which of your products are performing really poorly on Amazon. That gives the information that you need to go and change those SKUs and maybe change the description, add some more images, et cetera.
I can do that on a laptop; it takes not even five seconds for that algorithm to run. I think that's a big misconception that you need lots of resources. You don't.
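As a rough illustration of the kind of laptop-scale lead scoring described here, the sketch below ranks open leads by how closely they resemble leads that converted in a previous quarter. The field names, data, and scoring rule are all invented for illustration; a real project would fit a proper model such as a logistic regression.

```python
# Last quarter's leads: (industry, company size, converted?)
HISTORY = [
    ("retail", "large", True), ("retail", "small", False),
    ("health", "large", True), ("health", "small", True),
    ("media",  "small", False), ("media",  "large", False),
]

def conversion_rate(idx, value):
    # Historical conversion rate among leads sharing this feature value.
    outcomes = [converted for *features, converted in HISTORY if features[idx] == value]
    return sum(outcomes) / len(outcomes) if outcomes else 0.5

def score(lead):
    # Naive score: average the per-feature conversion rates.
    industry, size = lead
    return (conversion_rate(0, industry) + conversion_rate(1, size)) / 2

open_leads = [("health", "large"), ("media", "small"), ("retail", "large")]
ranked = sorted(open_leads, key=score, reverse=True)
print(ranked)  # highest-scoring leads first
```

The whole thing runs in well under a second on any laptop, which is the point: most value-adding AI applications are closer to this than to training a large language model.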
QuHarrison Terry: If you were getting started in implementing AI in your org, what area do you think is best positioned for a small company (outside of what you've just mentioned)? Just specifics, if you have something top of mind.
Iavor Bojinov: It really depends company-to-company. The first thing is to definitely get started. Right? [Laughter] You have to pick something.
Usually, I would try to focus on anywhere between two to five pilots, and you want these pilots to have direct implications on revenue. This could also be cost-cutting. It doesn't have to just be revenue. It could be going after profits.
If you're not going after those projects that are going to have a big ROI, you're not really going to see the value. Find those two to five projects that are both feasible (meaning that you have the data, you have the skill, you have the infrastructure to implement this) but also going to be impactful.
QuHarrison Terry: If I'm a leader, I'm in a leadership position, I don't have my team, I'm hearing you tell me I can possibly implement AI to either make more money—
Iavor Bojinov: Yeah.
QuHarrison Terry: –or cut costs. This sounds like about the time you should get started, you should start developing these things, if I'm hearing you correctly.
Iavor Bojinov: Absolutely. Here's the thing. You don't have to go this alone.
If you are a company that has no expertise in these, there are hundreds of different companies out there that are offering some AI-based solution.
Now, you have to look at those carefully and cautiously because a lot of them are just trying to sell you snake oil. But there are definitely some out there that can really help you transform, and you can do that without having to build a lot of these resources in-house.
Now, that's a great way to get started (maybe for one or two projects). But if you want to transform your organization and really become an AI-first firm, you're going to have to hire those people in-house, and you're going to have to really transform both your business and your operating model.
But to get started, really, you don't need that much. You could do this. Some of this you could do yourself. There are a lot of no-code AI solutions where you basically have a sheet that looks like Excel, and you put in all of your sales leads.
You say, "Hey, here are the sales I had in the last quarter. Here are all my leads right now. Can you just rank them for me?" It will just build the model.
You don't even need to worry about what model it's building. You'll build a few different models, and it will say, "Okay, these are the top ones that you should really call," and then you can use your own judgment to really check: Do they agree with my intuition? Do they agree with my sales leads?
For that, you don't need an expert. You just need to understand the problem, and you need to get the data and find an external solution that you can leverage.
Michael Krigsman: Can we break up AI failures into two buckets? Number one is the project aspect.
Iavor Bojinov: Mm-hmm.
Michael Krigsman: The project is on time. It's on budget. It meets the stakeholder goals, whatever the outcome is. And number two is the unexpected results arising because it's AI. Can you break it up and how would you think about it in those two separate ways?
Iavor Bojinov: I think the first category is absolutely right, which is around whether the project fails to really show value. The second one, as you said, is it doesn't have the predictive accuracy we were hoping for. We don't even have the data to train the models to that accuracy. We don't have the infrastructure to deploy it.
Then there's a third element, I think, that is important here which is the human element, which is, we built it. It works really well. But the human doesn't want to use it.
I think you see this a little bit in IT with the change management. But here, the usage and adoption is different because, in IT, it's just, "Do you use this or do you not use this?"
With AI, it's, "Do you use this when the recommendations are correct? And do you not use this when the recommendations are wrong?"
There's that additional piece of it, and I think lots of companies will fail here because they'll try to build an algorithm that is just such a black box that people just don't understand it. They're not involved in the process. Then they just don't know when to use it, when not to use it, what are the risks associated. And that side, I think, is also a big failure that many companies fall into.
QuHarrison Terry: Do you really believe that the user experience of how you integrate these AI suggestions or even outcomes, is that going to play a part into a company being successful or inevitably run into a failure at some point?
Iavor Bojinov: You can have an amazing user interface. But if you don't integrate it into the operations, it still would fail.
Let me give you an example from healthcare. In healthcare, there are so many AI companies that are trying to solve whatever it is.
Very often, the people who are involved in those companies are not physicians, and they don't understand the typical process flow of a physician. What they will do is they will come to the physician's office and they'll say, "Okay, here is an iPad with this beautiful UI. Once you've examined the person, I want you to type up all of these things into this UI. We won't integrate into Epic. We'll just have this beautiful UI. It's been tested by 100 people. Go and use this."
But it's not integrated into the actual process flow of that physician, so they're not going to use it. That's, I think, something that people very often forget is the actual process that individuals go for in their work, and you need to make sure that the AI solution fits into it.
Then the other thing – and, Qu, what we were talking about a little bit earlier as well – is there is that really, really high bar, and people panic when they see a mistake that is being made by the AI.
There's lots of research on this where people have run these experiments. They'll have an AI that's able to diagnose cancer at a much higher rate than an individual clinician, but the second that clinician sees an example of the AI being wrong in a very obvious way where the clinician looks at it and goes, "Oh, no. It's really obvious. If you look over here, it's clear this is cancerous," then it is immediately shut off, and they're like, "I don't want to use this. This is wrong."
You have the UI. You have the process integration. Then you have the trust piece, which is very easy to break in these types of AI examples.
Michael Krigsman: Data Rebel on Twitter comes back and says, "To help understand the difference between AI and IT projects, look at the data sets." That seems like a reasonable point.
Iavor Bojinov: Yeah, absolutely. In IT, the data sets were usually very fixed.
Michael, you gave the example. If it's your employee data, it's a very nice, rectangular table that has employee ID. It has how many years you've been at the company, and so on and so forth.
With AI, the data sets tend to be much larger and they don't tend to be structured in the same way. We could have text. We could have images. We could have videos. All of that is being pooled into one area, very often stored in something like a data lake, and it could be completely unstructured. Yet you want to use this for some sort of prediction.
The cancer example I was giving you, that's not looking at the patient's height and age. That's looking at MRI scans. It's looking at lots of other pieces of information, and it's trying to build a model. The model is just so much more complicated and integrates a wide variety of non-structured and structured data to make those predictions. I think that's a really, really good observation.
QuHarrison Terry: We have another question from Twitter from Arsalan Khan. It revolves around this whole concept of AI.
Iavor Bojinov: Yes.
QuHarrison Terry: There's a term that we've heard in the business industry for a long time, known as digital transformation. Should we consider the use of AI a digital transformation trope, or are they entirely different things, in your opinion?
Iavor Bojinov: Digital transformation is around taking your operating model and trying to digitize it. Going from analog and having hand-written notes and files to having those be digital in maybe some sort of searchable environment.
AI is, I think, sort of the next step after that. You need to be digital in order to fully leverage AI, but those two things are not necessarily the same. You can have a digital operating model without really having AI integrated.
The way I think about it is if you think about the value curve versus how many customers or whatever. You have some sort of curve. Then what AI does, it adds more value. It shifts everything up on that value curve but it is sort of distinct from digital.
I think that's where you're starting to see this term "AI-first companies" also emerging. Before, it used to be digital-first firms. Those were our typical digital native companies. Think Facebook, Meta, so on and so forth.
In the beginning, they were just digital companies. They weren't really using algorithms. Now they're becoming AI-first companies where AI is being integrated into everything that they're doing.
I think Microsoft is a beautiful example of a company that has transformed itself multiple times. They transformed themselves to become a digital company, and now they're transforming themselves into an AI-first company. You see it with the recent announcements of ChatGPT being integrated into Bing and into basically every part of Microsoft; you're starting to see these types of integrations everywhere.
Michael Krigsman: We have another interesting comment from Chris Peterson on Twitter. He's asking about academic programs to help train folks to understand why an AI model has come up with a specific answer.
Iavor Bojinov: Yeah.
Michael Krigsman: If we put this into the AI failure context, to what extent is this issue the lack of transparency and the lack of probabilistic or mechanistic results?
Iavor Bojinov: Yeah.
Michael Krigsman: To what extent is this a category of failure? It doesn't have to be just with prediction. I mean you can think of many different uses for AI. Again, it moves us, a very clear distinction from traditional IT projects where a failure might be we can't print paychecks.
Iavor Bojinov: Over the past couple of years, I've been heavily involved in building out HBS's data science curriculum. In particular, we've actually developed a course called Data Science for Managers, which is starting in the fall.
It's actually going to be part of the required curriculum for all 900 of our MBA students at HBS. That means (along with accounting, marketing, finance) they're going to be learning about data science, and they're going to be learning about data science in AI operations.
Really, that's exactly at the heart of this question because we realize you need to educate people, and you need to start educating at every level, not just at the executive level, but even at the MBA level. We want to educate them to understand these challenges.
Now, where do you see this in the failure points? For me, this really is in the deployment stage, which is when you take your pilot and you want to scale this up.
The way you drive deployment, I have a very simple framework to think about it, which is the E2 adoption framework. It centers on education to increase knowledge, automation to reduce switching costs, and trust to facilitate adoption. Trust is further broken down into employee-centric design, real added value, and transparency. By transparency, I mean being clear on what data this algorithm was trained on, because understanding that will help you understand when you should overrule the algorithm.
Let me give you a very concrete example. Again, I'll go to healthcare because I think there are a lot of really simple examples there.
Imagine you trained an algorithm on detecting cancer in adults. The physician knew that. They knew it's just for adults.
If they ran this algorithm on children, they'd be very skeptical of the results because the physiology of adults and children is fundamentally different. There are lots of different hormones, the density of tissue is different, and so on.
If you knew, and it was really transparent on what the training data was, that physician would understand and be able to overrule this. But if they just had an algorithm that just said, "Oh, use this algorithm to detect cancer," and they had no idea who this algorithm was trained on, they wouldn't really know when to overrule.
Here I gave the example of adults and children, but it could be if you have something that's trained in Europe. It probably doesn't apply directly to Japan or South Korea, et cetera. All of those transparency angles are really important in understanding when you should overrule the algorithms. I think that's a big failure.
QuHarrison Terry: I want to talk about some of the ethical considerations as it relates to AI. There is an ethical and privacy consideration that companies have to talk about. Sometimes, when you just release a new model or when you open up your data set to a new AI thing and it's unproven, you're going to have some really unintended consequences. How should an organization think about that, and how do you handle that in your research?
Iavor Bojinov: Ethical considerations should not be bolt-ons. They should not be something you think about after you've built your algorithm and deployed it.
Ethical considerations need to come at the beginning. They need to be at the center of the conversation, even around deciding "Should we go after this project?" That's actually part of the feasibility assessment: thinking about the privacy, thinking about potential biases (both in the data and in the predictions).
My first observation is that a lot of companies go and build stuff because they're trying to iterate as quickly as they can. That's great. For IT, it was really sensible to iterate, break things, and move as fast as you can.
But with AI, we now face unprecedented scale, where an algorithm like ChatGPT can reach a million people in a few days. If you have biases, if you have privacy issues, they scale and grow so much quicker, which is why you have to think about them at the beginning.
Now, for privacy, there's a core set of principles called Privacy By Design that I would encourage every company to embrace. There are seven principles. I'm not going to go through them now. You can Google that yourself.
A lot of those principles are around making sure that it's something that you do at the beginning. It's not a bolt-on. It enables full functionality. That's how I think about privacy.
When it comes to the bias aspect, this is a really hard question because AI isn't biased by nature, but it takes biases in the data, scales them, and grows them. Here you have to think really hard about whether your data is biased.
And there's actually a group of machine learning experts who have been working over the past decade to make algorithms as unbiased as possible. You could impose constraints that require, for example, no difference in predictive accuracy between men and women, and so on and so forth.
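One simple version of that check, as a sketch rather than a production fairness audit (the labels, predictions, and group attribute below are entirely illustrative), is to compare predictive accuracy group by group:

```python
def group_accuracies(y_true, y_pred, group):
    """Per-group predictive accuracy and the largest gap between groups.

    A large gap is a warning sign that the model performs worse for one
    group, which is what fairness constraints try to rule out.
    """
    accs = {}
    for g in set(group):
        # Collect (label, prediction) pairs belonging to this group
        pairs = [(t, p) for t, p, gr in zip(y_true, y_pred, group) if gr == g]
        accs[g] = sum(t == p for t, p in pairs) / len(pairs)
    return accs, max(accs.values()) - min(accs.values())

# Toy labels, predictions, and a demographic attribute
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]
group  = ["M", "M", "M", "M", "F", "F", "F", "F"]
accs, gap = group_accuracies(y_true, y_pred, group)  # gap of 0.25 here
```

In practice this kind of audit runs over held-out data and multiple metrics, but even this tiny version makes the "no differences between groups" constraint concrete.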
Michael Krigsman: Now we have a comment from Arsalan Khan that's again relating to all of this. He says, "What about the power of veto when using AI for decision-making?"
Iavor Bojinov: Yes.
Michael Krigsman: "What happens when execs or others go with their gut rather than with AI recommendations?" To me, this connects with the whole idea of responsible AI: training people to understand realistically what's going on with the AI so they can make informed decisions and develop some internal radar for when bias might be introduced. And if you're developing these systems, you need to build in mechanisms to avoid that bias. But let's get back specifically to his question about executives who go with gut feel and don't listen to the recommendations of the AI.
Iavor Bojinov: I think there are two parts. The first part is, as we've talked about, AI is sometimes wrong. If there's a very good reason, of course, the executive or even the frontline worker who is using the AI should overrule it.
The place where executives get this wrong all the time is the evaluation piece, when they look at results from something like an experiment and want to use their intuition to overrule those results. But experiments and A/B testing are a way of collecting data from your customers. When executives overrule them, they are basically saying, "I know better than what my customers want."
In those situations, I would really discourage executives from overruling the data. But when it comes to recommendations where you know the algorithm could sometimes be wrong, then yeah, of course, you have to overrule it when it is.
But here's the thing: you then have to go back and evaluate. Was the algorithm actually wrong, or was the person wrong? You need to understand that, because if you keep overruling the algorithm when the algorithm is right, you're going to end up with the wrong answers and you're not really going to get the benefits of AI.
QuHarrison Terry: I work at an enterprise organization, and I am concerned about some of the governance and regulation that's going to be forced upon these companies as it relates to big black-box algorithms. How do you think I should best prepare for that? The more AI we integrate, the more we depend on it and the more it enables us; does that open up an entirely new threat domain to consider as we think about our standards and our org?
Iavor Bojinov: Let's focus on privacy here for a second. If you look at GDPR, one of its core rules is that everyone has a right to be forgotten. That means the company has to delete all of your data. But if you have an algorithm that was trained on someone's data that you're supposed to have deleted, it turns out (through modern machine learning techniques) that data can be recovered from the model.
Now, what does that mean for algorithms? It means we need to develop new strategies to allow us to go into that data and remove that observation from the model.
Now, if you're training a really large generative model like ChatGPT, it turns out removing the data is extremely hard, and you can't really retrain the model because those can take months to train. There's a whole group of researchers working on that particular area. That's just one example of how things are changing.
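As a toy illustration only (not how any production system works), the naive baseline for honoring a deletion request is full retraining without the record, which is exactly what becomes impractical for large models and motivates machine unlearning research:

```python
def train_mean_model(records):
    """'Train' the simplest possible model: predict the training mean."""
    return sum(records) / len(records)

def forget(records, record_to_delete):
    """Honor a right-to-be-forgotten request by retraining from scratch."""
    remaining = [r for r in records if r != record_to_delete]
    return train_mean_model(remaining), remaining

data = [2.0, 4.0, 6.0, 8.0]
model = train_mean_model(data)   # mean of all four records: 5.0
model, data = forget(data, 8.0)  # retrain without the deleted record: 4.0
```

For a mean predictor this retraining is instant; for a model that takes months to train, it isn't, which is the whole difficulty the researchers are working on.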
But I want to come back to the governance issue. Again, this is something where it can't be a bolt-on. You can't do this after the fact.
Governance has to be in the beginning when you are starting the project, and you really need to bring together three groups of people.
- You need to have security there. This was the conversation last week in CXOTalk. You need to have security in there.
- You need to have legal in there because they want to make sure you're complying with the laws.
- Then you need to have someone that's an expert in privacy and the ethics of it so they can raise the right questions.
When it comes to governance, the thing that I've seen be really successful is companies developing tech governance meetings where, if you want to develop a new advanced algorithm, you go through one of these reviews just to check you're really above board on all of these different dimensions. The idea is you get everyone in the same room for 30 minutes or an hour. You explain what's going on, they give you feedback, and you make sure that you have all of these considerations top of mind at the beginning of the project, not at the end.
Michael Krigsman: One of the things that is very clear is the level of complexity. This notion of AI failure is not crystal clear to even define, but we can say that there are many different facets to it.
Very quickly, Chris Peterson comes back, and he says, "From a security perspective, how do we guarantee that the training data isn't poisoned by a bad actor since the data aspect is so important?" Just very quickly, any thoughts about the poisoning of training data?
Iavor Bojinov: This is where you really need to have the right security around the data. What a lot of companies do is they have the real data locked in very secure environments where even the data scientists that work there don't interact with it.
What the data scientists work with are synthetic data sets generated using methods known as differentially private, which ensure that even what the data scientists within the organization see is not the real data but a privacy-protected version of it. Then, if you have everything under this really secure lock and key, where even most of your own employees can't touch it, it becomes much harder to poison the water.
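A minimal sketch of the differential privacy idea is the Laplace mechanism applied to a counting query (the data and epsilon below are illustrative; real systems use vetted DP libraries rather than hand-rolled noise):

```python
import math
import random

def dp_count(values, predicate, epsilon):
    """Epsilon-differentially-private count via the Laplace mechanism.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so Laplace noise with scale
    1/epsilon is enough for epsilon-DP.
    """
    true_count = sum(1 for v in values if predicate(v))
    u = random.random() - 0.5  # uniform on (-0.5, 0.5)
    # Inverse-CDF sample from Laplace(0, 1/epsilon)
    noise = -(1.0 / epsilon) * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)
    return true_count + noise

ages = [23, 35, 41, 29, 52, 38, 61, 27]
noisy = dp_count(ages, lambda a: a < 40, epsilon=1.0)  # true count is 5, plus noise
```

No single released count reveals whether any one individual is in the data, which is the guarantee that lets analysts work without touching the raw records.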
QuHarrison Terry: What is your final advice on preventing AI failures?
Iavor Bojinov: It comes back to thinking about those five different steps that I outlined in the beginning. The first one is really careful thinking about the project selection.
Here there's a lot of great research that's been done, some of it by my colleague here at HBS, Jackie Lane, that basically shows that people entangle feasibility and impact. What that means is that if people judge a project to be highly feasible, they often also judge it to be high impact (or vice versa).
This is just human nature. They've done field experiments that have shown this.
What you have to do is you have to disentangle those things. You have to think about impact first and then feasibility.
That will allow you to overcome those initial challenges of, "Oh, we don't have the right data," "We don't have the right timeframe," or "This is never going to be as impactful as we thought." That's the first bucket of failures you can overcome, and it's a very simple thing to do, but it's really, really powerful.
The second one is when it comes to the actual development. Here you want to think about AI development as any other production process. You want to think about it as almost like in manufacturing.
Two of my colleagues – Michael, you had Karim Lakhani here a few months ago – have this model called The AI Factory, which is about how you can really scale AI and build a production system around it: a data pipeline, algorithms built on top of it, and then an experimentation platform. You can do that, and it's really going to standardize things. It's going to improve speed, and it's going to make everything so much easier internally.
The third piece is the evaluation. This is where you really want to use systematic experimentation to make sure that you're really adding value.
Here my big recommendation is to be scientific. There's a lot of research that says if you take a scientific approach to product development, you're going to do a better job.
What I mean by this is it's as simple as saying "if, then, by, because": if I make this change, then this outcome will improve by this much, because of this evidence. By taking the scientific approach, you become much better at evaluation.
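A minimal sketch of putting numbers behind an "if, then, by, because" claim from an A/B test, comparing two conversion proportions (the counts below are made up for illustration):

```python
import math

def ab_lift(conv_a, n_a, conv_b, n_b):
    """Relative lift of variant B over control A, plus a two-proportion z-score.

    The z-score supplies the "because": evidence the observed lift
    is unlikely to be noise.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    lift = (p_b - p_a) / p_a
    # Pooled standard error for the difference of two proportions
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return lift, (p_b - p_a) / se

# "If we ship the new flow, then conversion improves by ~20%, because z > 3"
lift, z = ab_lift(conv_a=500, n_a=10_000, conv_b=600, n_b=10_000)
```

Stating the expected lift before looking at the data, then checking it against the z-score, is the scientific discipline being described: prediction first, evidence second.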
The fourth one – I already talked about it – is the deployment: the E2 adoption framework. Then the management piece is really about making sure that algorithms have managers.
They are just like employees. They need managers. You can't just build it and leave it in the wild. That's my final point.
QuHarrison Terry: Then lastly, we're going to end on this question. It's from Arsalan Khan on Twitter. Rapid fire. Just answer this in 30 seconds or less. "Who should we sue when AI is wrong, especially when AI is being used in life or death situations?"
Iavor Bojinov: The answer is the human. All of these systems are human-in-the-loop. In healthcare, with the FDA, et cetera, it's always the human who is the final decision-maker. These are decision-augmenting tools, not decision-replacement tools.
Michael Krigsman: I love that answer. Sue the human. It makes perfect sense.
With that, unfortunately, we're out of time. It's been such a quick conversation. We've been speaking with Iavor Bojinov from Harvard Business School and QuHarrison Terry who is the head of growth marketing for Mark Cuban Companies. Gentlemen, thank you both. Iavor, thank you so much for being with us.
Iavor Bojinov: Thank you.
Michael Krigsman: Qu, it's always a pleasure to see you. I'm looking forward to doing this again with you as co-host on CXOTalk.
QuHarrison Terry: Likewise.
Michael Krigsman: Everybody, thank you for watching. Before you go, be sure to subscribe to our YouTube channel and hit the subscribe button at the top of our website so we can keep you up to date on the latest upcoming live shows. Thanks so much, everybody. I hope you have a great day, and we'll see you again next time.
Published Date: Feb 10, 2023
Author: Michael Krigsman
Episode ID: 778