Explore how AI transforms business and society with professors Ravi Bapna and Anindya Ghose, authors of 'Thrive'. They discuss the "House of AI" framework, data engineering, overcoming barriers to AI adoption, and addressing ethical concerns. The experts offer practical insights for business leaders on implementing AI and leveraging it for competitive advantage.
Enterprise AI Strategy: Algorithms and Ecosystems
In episode 852 of CXOTalk, host Michael Krigsman explores the world of Enterprise AI strategy with two distinguished guests: Anindya Ghose, professor at NYU's Stern School of Business, and Ravi Bapna, chair professor at the University of Minnesota's Carlson School of Management. These experts discuss their new book, Thrive: Maximizing Well-Being in the Age of AI, which offers a comprehensive framework for implementing AI in business.
The conversation covers a wide range of topics, including the "House of AI" framework, the importance of data engineering, and challenges of AI adoption in organizations. Ghose and Bapna share practical insights on overcoming barriers to AI implementation, addressing ethical concerns, and building AI-ready workforces. They also explore the potential of AI to drive innovation and competitive advantage, even for smaller companies with limited datasets.
Episode Highlights
Build a Strong AI Foundation
- Prioritize data engineering as the foundation of your AI strategy, allocating at least 70% of resources to cleaning and preparing data.
- Implement the "House of AI" framework, focusing on descriptive, predictive, causal, and prescriptive analytics pillars to drive value from your data.
Overcome Barriers to AI Adoption
- Address the "three I's" hindering AI implementation: inertia, ignorance, and lack of imagination.
- Foster a culture of innovation by educating leadership on AI use cases and potential benefits across various business functions.
Leverage AI for Competitive Advantage
- Explore AI applications beyond predictive modeling, such as causal inference, to understand the "why" behind business outcomes and scale recommendations effectively.
- Use transfer learning and fine-tuning techniques to overcome small dataset limitations and compete with larger companies.
Address AI Ethics and Bias Proactively
- Implement de-biasing processes in your AI workflows, including data cleaning, algorithm adjustment, and output validation.
- Develop metrics to measure fairness in AI models and be prepared to recalibrate when biases are detected.
Cultivate an AI-Ready Workforce
- Upskill existing talent and recruit professionals with a holistic understanding of AI, including causal inference and experimental design capabilities.
- Educate executives on AI potential and use cases to bridge the gap between technical AI capabilities and business leadership.
Key Takeaways
Prioritize Data Engineering for AI Success. Data engineering forms the foundation of successful AI implementation. Spend at least 70% of resources on cleaning and preparing data before diving into modeling. This investment in data quality pays dividends across all AI applications, from descriptive analytics to advanced predictive and causal modeling.
Overcome Barriers to AI Adoption. Address the "three I's" hindering AI implementation: inertia, ignorance, and lack of imagination. Educate leadership on AI use cases and potential benefits across business functions. Start small with existing data and gradually build AI capabilities, using techniques like transfer learning to overcome the limitations of smaller datasets.
Balance Predictive and Causal Analytics. While predictive modeling is valuable, causal inference is crucial for understanding the "why" behind business outcomes and scaling recommendations effectively. Invest in developing causal modeling skills within your workforce. This balanced approach enables more robust decision-making and helps address potential biases in AI systems.
Episode Participants
Anindya Ghose is an award-winning professor of business at NYU Stern and author of the best-selling book TAP: Unlocking the Mobile Economy. Ghose has been named among the top 1% of researchers in his field and recognized as one of 30 management thinkers most likely to shape the future. He has published more than 115 papers in premier scientific journals and peer-reviewed conferences and has given more than 300 talks internationally. He's consulted for Apple, Facebook, Google, Microsoft, Samsung, Snapchat, Tinder and Verizon, among other companies. He has provided expert testimony in many high-profile trials and depositions, including the Tinder vs. Match valuation lawsuit, the Facebook IPO matter, the counterfeit goods case against Amazon, and more. He has been interviewed, and his research profiled, numerous times by the BBC, Bloomberg TV, CNBC, the Wall Street Journal, The Economist, the Financial Times, Fox News, TIME, The Guardian, and elsewhere.
Ravi Bapna is the chair of business analytics and information systems at the University of Minnesota's Carlson School of Management. His research investigates online dating, social media, social engagement, the causal effect of AI and ML innovations such as recommender systems, analytics, economics of information systems, trust and peer influence online, human capital in digital services and online auctions. His work has been published in numerous journals, including Management Science, INFORMS Journal on Computing, Statistical Science, Information Systems Research, Journal of Retailing, MIS Quarterly, and Decision Sciences. His views have also been featured in the Financial Times, Wall Street Journal, Knowledge@Wharton, and The Economic Times, among others. He founded the Analytics for Good Institute at the University of Minnesota and is the inaugural INFORMS ISS Practical Impacts Award winner for his analytics and digital transformation work.
Michael Krigsman is a globally recognized analyst, strategic advisor, and industry commentator known for his deep expertise in digital transformation, innovation, and leadership. He has presented at industry events worldwide and written extensively on the reasons for IT failures. His work has been referenced in the media over 1,000 times and in more than 50 books and journal articles; his commentary on technology trends and business strategy reaches a global audience.
Transcript
Michael Krigsman: Welcome to CXOTalk episode 852. I'm Michael Krigsman, and we are discussing Enterprise AI strategy, looking at algorithms and ecosystems. Our two guests are Anindya Ghose, from the NYU Stern School of Business, and Ravi Bapna, from the University of Minnesota's Carlson School of Management. Together, they have just published a new book. Anindya, this is your second appearance on CXOTalk. So, welcome and tell us about your work.
Anindya Ghose: Hi. I'm a professor at NYU's business school. My research interests are at the intersection of causal inference and machine learning, where I essentially help companies of various sizes—small, medium, large—figure out what to do with their data, using a combination of both predictive and causal inference techniques. But most recently, I've been immersed in some really interesting litigation cases, testifying as an expert witness for some of the largest tech companies in the country, including Google, Meta, Apple, Snapchat, Pinterest, and so on.
Michael Krigsman: Ravi Bapna, this is your first time on CXOTalk, and I'm thrilled you're joining us. Welcome and tell us about your work.
Ravi Bapna: I'm a chair professor of business analytics and information systems at the Carlson School of Management at the University of Minnesota. I also direct the Carlson Analytics Lab and the Analytics for Good Institute.
We started teaching this stuff—what people call AI today—almost 20 years ago. At that time, we were excited about all the data coming out of the internet. In that dot-com revolution, people were writing reviews on Amazon, which was so insightful for learning about consumer preferences. Then came the mobile revolution, which Anindya wrote a book about. Then came the social media revolution, and the data kept growing. However, the gap between all that data and companies' ability to harness it to make better decisions has become even bigger. I think that was the motivation for writing this book, which brings together almost 40 years of research between us. The book itself was almost three years in the making, so we're excited to chat about it today.
Anindya Ghose: There is a missing narrative where everybody's just talking about the negative aspects of AI, even though we both realize that there's a lot of positive upside to AI.
Michael Krigsman: You describe this foundational framework called the "House of AI." Do you want to tell us about that? I have an image of it that I can display on the screen as you talk.
Ravi Bapna: At the foundation of this house, we have data engineering. Companies, like I said, are sitting on tons and tons of data, but they can't get insights out of it. In the Carlson Analytics Lab, over the last decade, we've done 130-plus projects with 90-plus companies, and we've been tracking how much time it takes to get the data cleaned, aggregated, and integrated to then maybe build a predictive model. It's in the range of 60% to 70% of project time.
Actually, that time is spent just working the data, which then sets up the four pillars. So, above the foundation of data engineering, the first pillar is descriptive analytics. The idea is to go beyond reporting and visualization: to use machine learning to find patterns of co-purchases in products and use that to recommend new products, something Amazon was doing 25 years ago. It also means doing what banks do really well: detecting anomalies, finding things that are out of pattern. Maybe that's a risk; maybe that's an opportunity. And then there are things like grouping customers into segments. So, that's the descriptive pillar.
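As a rough illustration of the co-purchase idea Bapna mentions, here is a minimal Python sketch of the counting behind association rules. The basket data and the support/confidence scoring are illustrative assumptions, not anything from the episode.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transaction data: each row is the set of products in one basket.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
    {"bread", "butter", "milk"},
]

n = len(baskets)
pair_counts = Counter()
item_counts = Counter()

for basket in baskets:
    item_counts.update(basket)
    pair_counts.update(combinations(sorted(basket), 2))

# Score each co-purchased pair by support (how often the pair appears) and
# confidence (how often buyers of item a also bought item b) -- the basic
# ingredients of a rule like "customers who bought a also bought b."
for (a, b), count in pair_counts.most_common(3):
    support = count / n
    confidence = count / item_counts[a]
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

Real recommenders scale this same counting to millions of baskets, but the logic is no more exotic than this.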
Then there's the predictive pillar. You might, as a company, want to predict your customer churn. Or maybe you're an HR group, and you want to predict whether your employees will quit. That, by the way, is where the bulk of the work, and the value we see going forward over the next three years, lies.
There's causal analytics, separating correlation and causation. It's a muscle that many people don't exercise well. There's the prescriptive pillar. And then we have a few layers, and maybe Anindya wants to get into the layers.
Anindya Ghose: As you were walking us through the framework, an important recommendation or lesson for practitioners is not to get too enticed by the modeling first. Rather, spend most of your time and resources on the data engineering part. One of the things that Ravi and I say in this book is that at least 70% of your time should be spent cleaning up the data. Most of these datasets, whether already available or still being generated, are going to be raw and messy, so you've got to clean them up. A lot of the organizations we have worked with have made the strategic mistake of not spending enough time cleaning up the data before jumping into the really cool descriptive, predictive, causal, and prescriptive pillars that Ravi was talking about. So that's one thing to keep in mind.
Building on what Ravi was saying, above all of this, both Ravi and I are very cognizant that we care about fairness and equity. At the very top of the house, as the business leader or practitioner tells the story of this house of AI, they have to spend enough time and resources on making sure the system is fair and equitable, no matter what input data is coming in or what outputs are being generated.
One of the really interesting things is that generative AI adds another element to this fundamental group of four pillars and the data engineering component.
I think we are in for a pretty wild ride, in a positive way because there's a lot of upsides that can be harnessed by practitioners, and, you know, we walk the readers through, in a lot of detail, how to go about doing it. It's all there in the book.
Michael Krigsman: Please subscribe to our newsletter and subscribe to our YouTube channel as well. Check out CXOTalk.com. How should folks in the enterprise who are looking at AI initiatives of all different kinds use that framework?
Anindya Ghose: The first step is to make sure you have a team assembled to clean up the data—the data engineering piece. A lot of organizations have the infrastructure to collect it, but they may not have the right people to clean it up and curate it.
Once you have cleaned-up data from your raw data, you're ready to harness any of those four pillars, whether predictive, causal, prescriptive, or descriptive. It has to be done in stages, but it's also iterative because as new data comes from your modeling techniques, you'll have to go back and revisit whether it's all signal, whether there's still some noise, or whether you have to curate it again. I think what we've done in the book is given people a roadmap on how to do this sequentially.
One other thing I would add, which Ravi was saying as well: When we talk about these four pillars, many people will say, "Yes, we have the skills to do predictive," but my thought is that, increasingly, the importance of causal inference is staring right at us. If you cannot understand the "why," you cannot scale up your recommendations.
Ravi Bapna: There's a myth right now, with all the hype around generative AI, that it's the be-all and end-all—that it is the definition of AI. I think large chunks of value for organizations in the next three years—three to five; we can't project beyond that—will still come from the four pillars we were talking about. The catch is that generative AI will help you do these traditional AI tasks better. That's one part I think is really, really important.
Michael, to your question about how organizations should start thinking about this, we can talk about why organizations are unable or not in a good position to distinguish correlation and causation. How come they're not deploying the full scope of the house of AI right now?
Anindya and I have been talking a lot about this, and our view is that there are three "I"s. There is inertia, which is a powerful force. The status quo always wins, especially at the executive level. If leadership is not familiar with this language or hasn't seen a framework like this, they don't know the use cases. It's difficult for them to change their current way of decision-making and do something differently, whether for demand forecasting, optimizing inventory, etc. Inertia is a big force.
There's also plain and simple ignorance, a lack of awareness of all the use cases and potential. I think that's a big part of what we address in the book. We have use cases in literally every aspect. Also, from my experience, our projects span every industry and function.
The final thing is a lack of imagination. The most creative solutions I've been part of in the last 20 years—and Anindya, I'm sure, will agree—have come when somebody thought out of the box: "Oh, if I can do X and Y, put these two pillars together, maybe I can trend-spot." You see examples of companies, like General Mills in my city. They were the market leader, with Yoplait, in the yogurt category, and suddenly you have Chobani. Nobody predicted that because they didn't have the mechanism to do what you might think of as anomaly detection: What is this unusual pattern that's coming? Is it signal? Is it noise? How should we react to it? Smart companies do that.
Michael Krigsman: We have an interesting question from Twitter, from Arslan Khan, who says, "Data is very important for AI. How can organizations know what to collect and what not to collect? And how do they know if they're creating, quote-unquote, 'data bubbles' that might give them wrong recommendations?"
We also have a comment from Rose Semenov on Twitter, who's asking about use cases. Maybe you can talk about this issue of data collection and, in your book, what goes on behind the scenes. For example, in a dating app, or in financial transactions. Can you link this data question to the reality of the use cases?
Anindya Ghose: Ravi's point about inertia is what I thought of when I heard this question. I get this question a lot: "What should we collect and not collect?" Sometimes that extra analysis people do causes inertia: "I don't know what I should collect, so maybe I won't start collecting it." My recommendation to these companies is, "Just get started. This is not a one-shot process." Neither Ravi nor I will tell you that this is a one-shot process. It's very iterative.
In other words, you'll have to experiment and learn along the way to determine which data sets you're collecting are actually useful. You may have a high-level sense at the beginning, but it's hard to know ex-ante for sure, "These three things are most helpful, and those other two things are less." You'll only know that after you've collected the data, fed it into an algorithm, and seen what it predicts.
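One concrete way to "learn along the way" which data carries signal is to score feature usefulness after training. Here is a minimal sketch using scikit-learn's permutation importance; the synthetic dataset and model choice are stand-in assumptions for whatever data sources you have started collecting.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Stand-in data: in practice these columns would be the candidate data sources
# you've begun collecting (demographics, behavioral signals, etc.).
X, y = make_classification(n_samples=1000, n_features=5, n_informative=2,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance estimates how much each input contributes to held-out
# accuracy: a cheap, after-the-fact way to see which collected data sources
# carry signal and which are mostly noise.
result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                random_state=0)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"feature_{i}: {result.importances_mean[i]:.3f}")
```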
Ravi Bapna: To add to that, I think in the end, it boils down to: What is keeping you up at night? What problems do you want to solve?
This is a gap in leadership's ability to think in a data-driven, decision-making way. If they don't know the use cases, they're not going to realize, "Hey, maybe I have a problem in my funnel. I'm losing people at a certain stage, and that's the problem I want to fix. To do that, I need to predict X, Y, and Z, and for that, I need this particular dataset."
I think Arslan is right that we don't want to be spinning our wheels collecting data bubbles, as mentioned. We have to base this on use cases. Once we identify the problem we want to solve, we can start bringing in disparate sources. That's the beauty of today's world. We can integrate—even to build a model of, let's say, customer churn. In the past, people would look at demographics, or maybe they'd go beyond that and do psychographics. But now we have behavioral data. We know what people are doing, what they're buying, what they're saying. We know how likely they are to refer a customer, or to review a product.
So, there are many other behaviors. Are they talking about your product on social media, for example? The smart players in this space will integrate all of these data sources, but it's driven by the problem they want to solve. My conversation with CEOs is often: "What's keeping you up at night? How good is your demand forecasting?"
You'd be surprised how terrible Fortune 500 companies are at forecasting demand. Think about all the downstream implications, the decisions based on that. If your forecast is 10% off, there are 20 other decisions you're making badly based on that bad forecast.
Anindya Ghose: We see that with marketing mix modeling and attribution modeling. Some 75% to 80% of marketers understand its importance, but they're still figuring out how to do it. That's why a lot of the projects we did in NYU's MSBA program involved it.
Michael, going back to your question about dating, can I give you my favorite dating statistic? In our book, we talk about this: two spelling mistakes on your dating profile will reduce your probability of finding a soulmate by 14%. 14%! Two spelling mistakes! So, we all need some help finding soulmates. This is where AI is really helpful. Obviously, we're being somewhat facetious, but it's a fact that people make typos and spelling and grammatical mistakes in their profiles, and it costs them.
An application of AI here would be very low-hanging fruit: use AI to prevent these simple, preventable mistakes. I did some work for Tinder and Match.com a few years ago, and Ravi has done a lot of work with dating companies, too, so we can give some dating advice in addition to AI advice.
Michael Krigsman: This is a first for CXOTalk: giving out dating advice. So, if you're dating, be really careful with your dating profile.
I just want to remind everybody that a Tweet Chat is taking place right now. Pop your questions into Twitter using the hashtag #CXOTalk. If you're watching on LinkedIn, put your question into the LinkedIn chat. Take advantage of this opportunity to ask these two professors whatever you would like.
We have another question on LinkedIn from Michael Walton, who says, "What are your thoughts on AI for food R&D and food manufacturing?"
Ravi Bapna: In Minneapolis, we have a large food and ag business. We have a cluster of companies, from commodity companies like Cargill, all the way up to brands like General Mills. All of them are deeply involved in thinking about AI in a variety of different functions.
If you start thinking about where food comes from: at the Carlson Analytics Lab, we've been working with Land O'Lakes for almost a decade, helping farmers make better decisions about food production. In a typical year, a farmer has to make about 40 important decisions: what seed to use, how much to water, how much to fertilize, and where to fertilize. We're starting to get to a stage where we can do this in a precision-ag way. AI models can distinguish between a wheat stalk and weeds. You want to optimize the production function of farming, and many companies are chipping away at different aspects of this: John Deere, Land O'Lakes, and others.
Downstream, you get to a big brand like General Mills. They are constantly monitoring how the different channels are working and how influencers play a role in making their product the product of choice. Lots of models are being used to predict that.
So, to me, it's an end-to-end play. By the way, this is true in almost every sector that impacts us daily, and food is no exception.
Anindya Ghose: I just mentored a high school student, Michael, who wanted to do a research project trying to figure out the predictors of corn consumption and corn growth. He collected interesting data across the country on precipitation, weather, soil quality, and built a bunch of predictive models to figure out which parts of the country are most conducive to the growth and development of corn and at what points in the year.
I think, as Ravi was saying, this is a nascent but fast-growing industry. You're going to see more of that. The other angle of food is nutrition. When I think of food, I think of nutrition, and I think of personalized health. What Ravi was talking about are the applications of AI in manufacturing food, but there are also applications of AI in the consumption of food: "When do we eat? What do we eat? How does that complement our other wellness behaviors, like exercise and sleep?" Exercise, sleep, and food are the three most important pillars of wellness.
Michael Krigsman: We have another question from Twitter, again from Arslan Khan, who says, "Is data leading to monopolies? Organizations with large customer datasets can use AI, but what about small companies who don't have large customer datasets?"
Anindya Ghose: Yes, data can be helpful, but, in my experience, data is not a substantial barrier to entry or innovation. I'll put this in the context of digital marketing and digital advertising. Have we seen entry into the digital advertising space stop or diminish over the last 20 years? No. If anything, we're seeing more new companies emerging in this space, even companies that don't have consumer-facing data.
Think of The Trade Desk. They're now competing head-to-head with some of the largest tech companies and beating them at their own game. Think of Criteo, another digital advertising company with no consumer-facing products. When they entered, they had zero data. They built it up from zero, and now they're competing with the big Silicon Valley companies and beating them at their own game. While data is definitely helpful, it is not an entry barrier.
Even if you're a small company—and I just dropped a couple of examples, but there are many more—in addition to The Trade Desk and Criteo, there's Magnite and PubMatic. There are so many companies that were nothing on day zero, but today they are publicly listed companies giving well-known tech companies a run for their money.
Ravi Bapna: A couple of things to add. First, even with small or reasonable-sized datasets, you can still use AI and gain insights that you're not getting now. This shouldn’t be an excuse not to, for example, think about understanding your customers better using segmentation, or building a predictive model as to whom you should target for your next promotion. We have seen huge success and lift coming from datasets with 2,500 rows of data, not 25 million. That's not an excuse. You can start using AI; you can start using those capabilities.
For example, Amazon, when they started out, didn't have a lot of consumer data, but they were still giving product recommendations, maybe using an algorithm like association rule mining, which is based on what people are buying and what’s co-purchased. Even something that simple is low-hanging fruit that every company with a thousand transactions can use.
The other point—and Anindya and I were talking about this the other day—is that generative AI, because of how it is set up, is going to remove the advantage of large data monopolies. If I wanted to build a sentiment classifier in the past, I might need a reasonably sized dataset. I might need 10,000 rows of labeled data to build a good, accurate model. Now, I can ask that question in a prompt to ChatGPT, and it will give me the answer. Every small business can access that capability. And, by the way, we should discuss this later: generative AI can also build some of these models for you. If you have a small dataset and you don't have a data scientist to build an XGBoost or machine learning model, generative AI now has that capability. We've been piloting this in the courses we teach at our universities.
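For instance, the "classifier in a prompt" idea Bapna describes can be a few lines of code. This is a hedged sketch using the OpenAI Python SDK, not the authors' implementation; the model name, prompt wording, and the classify_sentiment helper are illustrative assumptions.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def classify_sentiment(review: str) -> str:
    """Zero-shot sentiment classification: no labeled training data required."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical choice; any capable chat model works
        messages=[
            {"role": "system",
             "content": "Classify the sentiment of the review as exactly one "
                        "word: positive, negative, or neutral."},
            {"role": "user", "content": review},
        ],
    )
    return response.choices[0].message.content.strip().lower()

print(classify_sentiment("The delivery was late and the packaging was damaged."))
```

Compare this with the old workflow of collecting and labeling 10,000 reviews before training anything: the capability gap between large and small firms narrows considerably.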
I don't buy the "small data" excuse. Everybody can start climbing this ladder. The last chapter of our book uses the metaphor of the AI summit. I think Anindya is a mountaineer; I spent a lot of time in my youth hiking in the Himalayas—I went to Everest Base Camp when I was 15. There's Base Camp, and there's Camp One and beyond. Base Camp is data engineering. Once you get to Base Camp, you can start doing other things, like descriptive and predictive analytics. You get to Camp Three, where you start thinking about correlation and causation. Everybody can start climbing; everybody can benefit.
I often hear this “small data” excuse, but people do not understand the space well enough to claim that.
Anindya Ghose: That's the inertia we've been talking about. Analysis paralysis leads to inertia, which leads to a lack of innovation. Maybe that's the fourth "I" we should think about: ignorance, inertia, and lack of imagination, leading to a lack of innovation.
Michael Krigsman: But isn't it true that, if you have a small dataset, you will be at a disadvantage in personalization in many different ways?
Ravi Bapna: Not really, Michael. Now, with generative AI, we have fine-tuning. We can take an existing generative AI model and, even with a small dataset, we can change the parameters just enough to customize it for a small startup. There are other architectures as well, such as RAG—retrieval augmented generation—that also allow us to do this. This has been a game changer.
There's a technical aspect of machine learning called transfer learning. The idea is that you can take a model built for X, and, with a little tweaking, you can use it for Y. That tweaking doesn't require tons of data.
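A minimal transfer-learning sketch in PyTorch, under the assumption of an image task: reuse a backbone trained on task X (here, ImageNet classification) and retrain only a small new head for task Y. The two-class head is a hypothetical example of a small-data task.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a model pretrained on ImageNet (task X).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained backbone: its general-purpose features transfer as-is.
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer with a new head for our own small-data task (task Y).
model.fc = nn.Linear(model.fc.in_features, 2)

# Only the new head's parameters are updated, so a small dataset suffices.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```

The "little tweaking" Bapna mentions is exactly this: the millions of frozen parameters came from someone else's big data, and only the tiny new head needs yours.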
Anindya Ghose: Ravi, remember we were texting each other about this very topic just a couple of days ago. This is a great question. Was it Arslan who asked this?
Ravi Bapna: Yes.
Anindya Ghose: We get that question a lot, and both of us are dying to let people know that it's no longer a concern, thanks to transfer learning, RAG, and generative AI. Small data is no longer an entry barrier.
Michael Krigsman: We have another question from Twitter, this one from Ravi Karkara, who says, "Can you shed light on why the world needs to have policy and discussions on ethics?" He wants to know about AI for food specifically. Maybe you can touch on that, but there's also a much broader set of issues. Can you talk about AI ethics and its implications for food, but especially for the enterprise?
Ravi Bapna: First, algorithms themselves are never biased. The algorithm is a concept in math that will give you insights based on the data you feed into it. So, what data are we feeding into it? Let's look at some examples. Amazon was called out several years ago for building a resume screener that was biased against women. In fact, Goldman Sachs and the Apple Card had a similar issue, again in terms of gender bias in AI algorithms. Why?
If you go back to that Amazon resume screener example, how come the algorithm was associating high performance in tech jobs with being male? Well, it's probably because, historically, society has generated data that reflects that bias. Maybe not enough kids in high school, especially girls, are signing up for math and science classes. Therefore, they're not getting STEM degrees; therefore, they're not qualifying for tech jobs. That's not the algorithm's problem. That's society's problem—a process that, for various, complex reasons, has generated biased data.
Algorithms are not the starting point for bias; the data-generating process is. That's where the bias comes from. We are now at a stage where we're designing algorithms to recognize and fix those biases. That's what we teach our managers. Somebody going through the NYU MSBA program that Anindya runs, or somebody going through our program, will have a whole class on de-biasing these algorithms and their results. That's more AI. AI is more likely to be the solution than the problem.
Anindya Ghose: This issue, that we have to be very cognizant of the ethics of AI, is a solvable problem. It's solvable because, at the first stage, when you look at the input data, you can identify elements that may be skewing the results; we call them outliers. If they skew the output, they may be causing the bias. You can identify the outliers, clean up the data, and then feed it into the algorithm. That's how the de-biasing process starts.
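As one concrete version of that first stage, a common first pass is to flag values outside 1.5 times the interquartile range before deciding how to handle them. The column name and numbers below are hypothetical.

```python
import pandas as pd

# Hypothetical input data with one numeric column we suspect is skewing results.
df = pd.DataFrame({"income": [42_000, 48_000, 51_000, 55_000, 2_500_000]})

# Flag values outside 1.5x the interquartile range.
q1, q3 = df["income"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = (df["income"] < q1 - 1.5 * iqr) | (df["income"] > q3 + 1.5 * iqr)

print(df[outliers])        # inspect before deciding to drop, cap, or correct
clean_df = df[~outliers]   # one option: exclude the flagged rows
```

Whether a flagged value is an error, a legitimate extreme, or a bias-inducing artifact is a judgment call; the point is that the detection step itself is routine.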
I think what has happened so far is that, because of the cottage industry of AI criticism, a lot of conversations have focused on one side and not on the fact that this is a very solvable problem. It's really not rocket science. We teach this in our MSBA program, and we see great results. Fear not!
Ravi Bapna: One of the components of the middle layer in our house of AI is reinforcement learning. This is a powerful approach predicated on the idea of explore and exploit.
What is explore and exploit in the context of screening people for job interviews? If you run an algorithm based solely on historical hiring patterns and performance patterns—like Amazon did earlier—it may score a particular candidate really high. An explore-and-exploit algorithm says, "Wait a minute. This person has the highest score, but let me take a chance. 10% of the time, I'm going to try the third highest-scoring person, or the next person."
That automatically diversifies the pool of people you're bringing in. It could be that there are certain subgroups of people—for example, women with super talented tech capabilities—that your algorithm, in the process of exploration, will include. Then it will learn that those people are doing really well, and that brings in equality. We're seeing reinforcement learning, in this kind of explore and exploit fashion, help in de-biasing.
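The explore-and-exploit behavior Bapna describes maps onto a simple epsilon-greedy rule. A minimal Python sketch, where the candidate scores are illustrative and the 10% exploration rate comes from his example:

```python
import random

def pick_candidate(scored_candidates, epsilon=0.10):
    """scored_candidates: list of (name, score); higher score = stronger match."""
    ranked = sorted(scored_candidates, key=lambda c: c[1], reverse=True)
    if random.random() < epsilon:
        # Explore: occasionally take a chance on someone outside the top slot,
        # which diversifies the pool and lets the system learn from subgroups
        # the historical data underrepresents.
        return random.choice(ranked[1:])
    # Exploit: usually go with the current top-scoring candidate.
    return ranked[0]

candidates = [("A", 0.91), ("B", 0.88), ("C", 0.85), ("D", 0.80)]
print(pick_candidate(candidates))
```

In a real reinforcement-learning system, the outcomes of those exploratory picks feed back into the scores, which is how the model learns that the overlooked subgroups perform well.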
We also coach the people we work with in companies, as well as our students, that, when we look at model performance, it's not just about accuracy. There are specific metrics around fairness, such as the true positive rate. Is that different by subgroups? Is it different by men and women? Is it different by race? We have to bake this whole calibration process into the models so that they are also fair. Over the summer, both of us have been advising companies to take this into account.
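A minimal sketch of the subgroup check Bapna describes, comparing the true positive rate across groups rather than looking only at overall accuracy. The arrays are hypothetical model outputs.

```python
import numpy as np

# Hypothetical labels, predictions, and a protected attribute per record.
y_true = np.array([1, 1, 0, 1, 1, 0, 1, 0, 1, 1])
y_pred = np.array([1, 0, 0, 1, 1, 0, 0, 0, 1, 1])
group  = np.array(["m", "m", "m", "m", "m", "f", "f", "f", "f", "f"])

for g in np.unique(group):
    mask = (group == g) & (y_true == 1)   # actual positives in subgroup g
    tpr = y_pred[mask].mean() if mask.any() else float("nan")
    print(f"group {g}: TPR = {tpr:.2f}")

# A large gap between groups is the signal to recalibrate before deploying.
```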
Sometimes this comes at a cost. If you care about fairness, you may not be profit-maximizing in the short run, but in the long run, you may not get sued! I think that's where we are.
Michael Krigsman: But, with respect, it strikes me that you are simplifying this problem to an unnatural degree. Yes, from a mathematical standpoint, you can weed out issues in the data and adjust your algorithms, but there is still a societal context in which these products operate. That is where problems tend to arise, which is why there's so much emphasis around the world, especially in the United States and Europe, on AI regulation.
Arslan Khan comes back and disagrees that algorithms can't be biased. He says his nine-year-old daughter pointed out that the Tesla screen was showing only the silhouette of a man, even when a woman was crossing the road.
Anindya Ghose: I don't think we said that algorithms cannot be biased. We said that if there is a bias, it can be easily de-biased.
Going back to step one, I do think that we are simplifying the process, but Ravi and I have done this many times, and it is a simple process. It's not rocket science. You have to be willing and cognizant. You should be able to change and revisit what you're doing; you should be willing to adapt and experiment.
Many of us get bogged down by the idea that it's a difficult problem. But this de-biasing problem is not rocket science. It takes some work to figure out, but it's a solvable problem.
That's partly why we are optimistic about AI. We're not just speaking in the abstract. Ravi and I have worked with close to 200 companies, and we've been very hands-on. We've done this a lot, and we've gone through this de-biasing process. We've figured out the issues and then used the right models to solve problems across industries and countries. This is a solvable problem. It's not nearly as complicated as people make it out to be, and we should just drop those three "I"s—inertia, ignorance, lack of imagination—and move toward innovation.
Michael Krigsman: On Twitter, Rose Semenov comes back again on bias and fairness, raising the question of AI being a black box, which can lead to a lack of transparency. Can you talk about this lack of explainability and how that fits into the apparent lack of fairness?
Ravi Bapna: A lot of work is being done. Very smart people in research departments and universities are writing PhDs on explaining black box models. In the three-day course I teach to the NYU MBAs, we spend a whole afternoon on explaining black box models. There are approaches; they're not perfect. We recognize that this is a key barrier for many companies adopting AI.
I had a former student who built a sophisticated demand forecasting model. He back-tested it over 25 years, quarter by quarter. When the model predicted that a particular business unit's demand would go down by 6%, the manager's first question was, "On what basis? Can you explain how this prediction was made?" The student couldn't really explain it. It was a complex deep learning model that used techniques like LSTMs.
The manager didn't believe the prediction. Guess what happened? Demand went down by 6%. The company lost that opportunity. That's how they learned. My advice to them was, "Let's run a shadow process. Continue doing what you're already doing, and if somebody is building a black box model, run the two processes in parallel for a year and see which one wins." They were soon convinced. They didn't need to understand in detail how the model was forecasting demand as long as they knew it was accurate. If they knew it was accurate, they could plan accordingly: staff correctly and hire correctly. I think people are now going with that.
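The shadow process boils down to scoring two forecasting pipelines against the same actuals. A sketch with hypothetical numbers; the choice of MAPE as the comparison metric is an assumption, since the episode doesn't specify one.

```python
import numpy as np

# Hypothetical demand figures gathered over the parallel-run period.
actual    = np.array([100, 104, 98, 110, 95, 102])
incumbent = np.array([105, 100, 103, 104, 101, 99])   # current manual process
black_box = np.array([101, 103, 97, 108, 96, 103])    # new model, in shadow mode

def mape(y_true, y_pred):
    """Mean absolute percentage error, lower is better."""
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

print(f"incumbent MAPE: {mape(actual, incumbent):.1f}%")
print(f"black-box MAPE: {mape(actual, black_box):.1f}%")
# Whichever process wins over a full year earns the organization's trust.
```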
People talk a lot about explainability, but let's go to healthcare. Think about medical imaging. If I build a model that takes X-rays and can detect a fracture, do I really care which pixel in the image is responsible for the diagnosis? No. There are many use cases where we get a lot of practical value. Many hospitals have done this. We can detect a fracture from an X-ray. If you're in sub-Saharan Africa and you don't have access to a trained radiologist or an orthopedic doctor, AI can do that for you. In our book, we have a whole section about this. We talk about clinics in Eastern Europe that have used this kind of AI to detect breast cancer—more accurately than humans, actually.
We're going to get to a stage where nobody's talking about the explainability of those incredibly complex models because we are validating their output and seeing that we're saving lives. So, there are two things. A lot of people are working on explainability, but there are also many use cases where we don't care about explainability.
Michael Krigsman: We have another interesting question, again on bias and de-biasing. Lisbeth Shaw says, "What if de-biasing means questioning the existing dataset, which may require starting over? Companies want speed of execution and speed to market. This step may not be approved."
Anindya Ghose: That may happen in theory, but, in practice, there are usually certain observations, or certain variables in a dataset, that are more problematic. I haven't encountered a scenario where the entire dataset is problematic. More likely, let’s say a dataset has 10 variables. Seven are fine, but three are problematic.
In those scenarios, you have to be cognizant of that and figure out a way to fix the problem, either through the data engineering process or by tweaking your infrastructure so that it generates and collects better data in the first place.
To answer that question briefly: the more likely scenario is that you'll have a mixed bag where some parts of the dataset are fine and others are problematic. The world now has many tools to fix those problems: extrapolation, imputation, and curation.
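As one example of the imputation option, a median fill with scikit-learn's SimpleImputer; the dataset and column names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical dataset: most variables are fine, a few have gaps.
df = pd.DataFrame({
    "age":    [34, 41, np.nan, 29, 52],
    "income": [72_000, np.nan, 58_000, 61_000, np.nan],
})

# Imputation fills in the problematic values instead of discarding the dataset;
# a median fill is the simplest strategy.
imputer = SimpleImputer(strategy="median")
df_clean = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
print(df_clean)
```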
Ravi Bapna: The broader message is this: First, we have to recognize that this is an issue, so it's good to be talking about it. Second, we now have specific metrics and literature. There's science behind detecting bias in algorithms—not just based on their accuracy. We can run them against tests for fairness. If I build a model, are the outcomes fair? Let’s say I’m deciding whether someone should get a credit card. I’m a bank, and I look at the true positive rate across different subgroups, and I find significant differences. My students know that's a stopping point. That means we have to go back and recalibrate the model. We may need to oversample certain subgroups and try to learn better from groups that are underrepresented.
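And a sketch of the oversampling step Bapna mentions: duplicating rows from an underrepresented subgroup so a retrained model can learn from it properly. The column names and data are hypothetical.

```python
import pandas as pd
from sklearn.utils import resample

# Hypothetical training data where subgroup "b" is badly underrepresented.
df = pd.DataFrame({
    "group":    ["a"] * 8 + ["b"] * 2,
    "approved": [1, 0, 1, 1, 0, 1, 1, 0, 1, 0],
})

majority = df[df["group"] == "a"]
minority = df[df["group"] == "b"]

# Sample the minority subgroup with replacement until the groups balance.
minority_upsampled = resample(minority, replace=True,
                              n_samples=len(majority), random_state=0)
balanced = pd.concat([majority, minority_upsampled])
print(balanced["group"].value_counts())
```

Oversampling is one of several recalibration options; reweighting records or collecting more data from the underrepresented group serve the same purpose.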
This is part of the education of using AI models correctly. This technology isn't going away, so it’s up to us to educate ourselves on how to use it the best way possible. In many ways, that's the message of our book.
Michael Krigsman: Again, from Arslan Khan: "If organizations deploy biased systems that cause a problem, and they are not aware of the bias, who should the consumer sue? Who is responsible overall for this AI?"
Anindya Ghose: In the US, we are a very litigious society, so suing is easy. That shouldn't be a problem. It’s a little harder overseas.
I would question the premise of this question. If an organization is working with a biased data set, first, to what extent is there bias? Is it meaningful or trivial? Bias can be meaningful when the most important output variable is biased. Bias can be trivial if something tertiary is biased. You need to figure that out first. As I said, there is a roadmap for fixing these problems. You don't need rocket scientists. A good data scientist can fix this problem.
Most organizations are cognizant of the importance of this. My experience is that there's more good than bad out there. CEOs, CTOs, CIOs are not intentionally trying to create biased systems. They are trying to solve problems, and, in doing so, they may inadvertently work with algorithms and datasets that are not de-biased. The important thing is that they are aware of it. In my work with at least 50 companies in 12 industries, they have all been willing to fix this problem. I've never had someone say, “Oh yeah, don't worry about it,” when I pointed out a bias problem.
Michael Krigsman: There are examples outside of AI. The FDA recalls food if there's a problem. Cars are recalled if there’s a problem. We have a history of dealing with unintentional issues and implications.
Let's move on; we're running out of time. You talk a lot about ecosystems with respect to AI. What do you mean by an ecosystem, and why is that so important?
Ravi Bapna: AI is a fast-moving, complex, intangible general-purpose technology. As we said, not many people truly understand it—hence our book, which tries to demystify it.
I think what we have seen—and companies in the Twin Cities area understand this—is the importance of having access to a research-oriented university like the University of Minnesota and its Carlson Analytics Lab. We have great partnerships with almost all the medium-sized and Fortune 500 companies in our area. Companies can come to us with questions, and we can help them find answers. Our teams of graduate students, supervised by faculty, solve those questions. That creates a vibrant ecosystem for us; it creates a virtuous cycle. Our students learn by working with real-world data and problems. These are not theoretical, made-up questions that some professor came up with! Companies get access to talent before the market does, and also get access to faculty and PhD student expertise. I think the NYU ecosystem that Anindya has created is very similar.
Anindya Ghose: We have something called the Capstone Program, which is as real world as it gets. Student groups are embedded in companies, working on real datasets and real problems with a faculty advisor. Ravi has been an advisor; I'm the program director, so I'm pretty hands-on as well.
To your question about ecosystems, I see this as a collaborative effort of academia, industry, and policymakers. The private sector has to be involved. The good news is they are involved, and this is a journey. We're learning along the way, but the good news is that people are on board. Two years ago, if you said, "You run an AI program," people would say, "Oh, what kind of nerdy, geeky thing is that?" But after ChatGPT, everybody wants to be on the AI roadmap.
Michael Krigsman: What are some workforce issues that you're seeing in the companies you consult with?
Anindya Ghose: I'll give you a crisp example. Over the years, we have seen that everyone and their brother can do predictive modeling, but very few people can do causal modeling. One of the workforce-related skill sets that I've been pushing, which is finally being implemented, is revising the curriculum to include a significant presence of that third pillar of causal inference. We're giving students econometrics training, causal machine learning training, and field experiment training.
It's not enough to be good at predictive modeling. You need to be able to do causal inference as well. Recruiters come back and tell us this is helpful because our graduates are tooled up holistically: they understand not just prediction but also why things are happening, what will happen next, and what is causing the changes. We feel very good about this.
Ravi Bapna: There's a supply side of talent that can work within this AI framework, and there's a demand side within companies. In my opinion, there is a large and widening gap. Leaders, executives, senior VPs, directors—they’re not aware of what's possible. I was talking to a former student, an alum, at an event, and I asked her, "How's it going?" She said, "I've got a great job; I won't name the company. I'm really happy, I'm making a great salary, bought a new car. But, honestly, Prof, I'm really bored at work. I'm still running reports and creating visualizations in Excel. You guys taught us all these things—reinforcement learning, deep learning—but my company doesn't have managers and executives who can give us problems that can be solved using those technologies."
That's what I call the demand side problem within companies. We've been working hard to bring in cohorts of executives, through executive education, to educate them. "Look, you're sitting on tons of data. You’re making lots of wrong decisions because you're confusing correlation and causation. Let's start climbing that AI summit! Let's start making business value!"
Michael Krigsman: Arslan Khan asks, "If organizations are doing AI that can impact people's lives directly, in finance, housing, jobs, etc., do we need some sort of government entity to provide checks and balances?" Or, to put it another way, "What do you think about government regulation and policy-making when it comes to AI?"
Anindya Ghose: There's potential for regulation, but nobody knows yet what that regulation should look like. I urge caution. Yes, new technologies can be misused or used well, so there's room for some regulation, but let's not jump the gun and go too far because we’re all—including us—still learning. We don't know what needs to be regulated. That's my high-level answer. Obviously, there's more nuance, but that’s it in a nutshell.
Michael Krigsman: As we wrap up, any final thoughts or advice for folks in the enterprise?
Anindya Ghose: Please read our book, not because we're trying to sell it, but because Ravi and I have put our blood, sweat, and tears into it for the last 20 years each—40 years collectively! We are dying to share this knowledge with you. If I could, I’d give the book away for free. We just don’t have that capability. We would love for you to spread the message. We've tried to demystify this as much as possible, and we'd love to hear your feedback.
Ravi Bapna: I would echo that. This technology is here to stay. If we use it correctly, all the early research shows that we're going to see productivity gains. We are not going to replace human labor; we're going to augment it the right way. It’s up to leaders to get up to speed, educate themselves, and figure out how to do things better. There's no shortage of challenges to solve in society, and there's room for businesses to innovate. AI will be the capability that allows you to do that.
As Anindya said, we're excited to be Sherpas, helping people climb this AI summit.
Michael Krigsman: As you talk with so many organizations of various sizes, can you identify one big stumbling block? What's the biggest challenge, and how do you suggest solving it?
Ravi Bapna: It's the three "I"s that we started with. It’s inertia, lack of awareness (which we call ignorance—maybe a harsh term, but true), and, to get to a more nuanced level, lack of imagination and creativity. Anindya and I have talked about how this leads to a lack of innovation. This is a story we've seen over and over; we know what the barrier is. It’s at the human level.
Anindya Ghose: I'll give a longer-term view. A lot of the managers I work with—including CEOs—are thinking about the next quarter. "How do I make the next quarter look good?” When you think about AI-based transformation, it's not a three or six-month journey. It's a long-term journey, so don't expect results overnight. Be cognizant of the long-term roadmap, trust us on this, and adopt AI. It's going to pay off in the long run.
Michael Krigsman: An enormous thank you to Anindya Ghose and Ravi Bapna. Thank you both for taking the time to be with us. I'm very grateful.
Ravi Bapna: Thank you for having us.
Anindya Ghose: Thank you so much for having us.
Michael Krigsman: Thank you to everyone who watched, especially those who asked excellent questions. Before you go, please subscribe to our newsletter and YouTube channel. Check out CXOTalk.com.
We have amazing shows coming up this fall, so check out CXOTalk.com and join us.
Thanks so much, everybody. I hope you have a great day.
Published Date: Sep 13, 2024
Author: Michael Krigsman
Episode ID: 852