Data, AI, and Algorithms: New Year's Resolutions for 2018

Dr. David A. Bray

Distinguished Fellow

Stimson Center

Anthony Scriffignano

former Chief Data Scientist

Dun & Bradstreet

Michael Krigsman

Publisher

CXOTalk

Myths and hype surround many discussions about artificial intelligence, big data, and modern algorithms. For this episode of CXOTalk, host Michael Krigsman talks with two extraordinary experts who bust the myths and offer straight talk on technology as we head into 2018.

57:26

Dr. Anthony Scriffignano, Chief Data Scientist at Dun & Bradstreet, and Dr. David Bray, Executive Director at People-Centered Internet, speak with CXOTalk about big data, analytics, artificial intelligence, and modern algorithms in 2018.

Scriffignano has over 35 years of experience in IT, Big-4 management consulting, and international business. Scriffignano leverages deep data expertise and global relationships to position Dun & Bradstreet with strategic customers, partners, and governments. A key thought leader in D&B’s worldwide efforts to discover, curate, and synthesize business information in multiple languages, geographies, and contexts, he has also held leadership positions in D&B’s Technology and Operations organizations and served as the primary inventor on multiple patents and patents pending for D&B.

Bray, former Chief Information Officer for the FCC, was named one of the top "24 Americans Who Are Changing the World" under 40 by Business Insider in 2016. He was also named a Young Global Leader by the World Economic Forum for 2016-2021. He also accepted a role of Co-Chair for an IEEE Committee focused on Artificial Intelligence, automated systems, and innovative policies globally for 2016-2017 and has been serving as a Visiting Executive In-Residence at Harvard University since 2015 focusing on leadership strategies for our networked world.

Transcript

Michael Krigsman: Data! Data is a beautiful thing, but there are so many myths. The reality of data, of AI, of analytics is, frankly, almost lost on us. Now, as we begin 2018 with Episode 270 of CxOTalk, we are myth busting. We are exploding the myths of data, AI, algorithms, and analytics. My God, we have got two of the best people on the planet to do that.

Before we begin, I am Michael Krigsman. I'm an industry analyst and the host of CxOTalk. I want to say thank you, as I always do, to Livestream. They have supported us for two years. CxOTalk is really taking off. But, man, Livestream, those guys have been hugely helpful to us. Go to Livestream.com/CxOTalk, and they'll even give you a discount on their plans.

One other request, please. Right this minute, tell a friend about CxOTalk. Send an email. Give them a phone call. Tell a friend to come watch, please. There's a tweet chat going on as we speak on Twitter using the hashtag #CxOTalk.

Without further ado, I want to introduce our two guests today. Let's start. Who do we start with? Let's start with Anthony Scriffignano. He is the chief data scientist at Dun & Bradstreet. He's been a guest on CxOTalk before. Anthony Scriffignano, how are you doing?

Anthony Scriffignano: Michael, it's great to be with you. I'm really excited about the conversation today. Thank you so much for inviting me.

Michael Krigsman: I am, as well, excited. David Bray, welcome back to CxOTalk. How are you?

Dr. David Bray: Doing great, Michael, and always glad to help expand the network of the amazing CXOs that you bring on CxOTalk.

Michael Krigsman: Very briefly, David; you need no introduction, but please tell us a little bit about yourself.

Dr. David Bray: Sure. I am serving as the executive director for what's called The People-Centered Internet coalition. It was co-founded by both Vint Cerf, co-creator of the Internet, and Mei Lin Fung. Our objective is to try and do demonstration projects that measurably improve people's lives using the Internet. It's all too easy to point out what might not be working or what might be wrong, or get angry or depressed. What we really need, though, are positive change agents that are willing to show a better way forward so that that can then be shown to the private sector, to public sector leaders, and we can show the world how we make sense of this rapidly changing world. In addition to that, I'm also a visiting executive in-residence at Harvard, which allows me to think about data, AI, algorithms, and how it's impacting the nature of leadership for our changing world.

Michael Krigsman: All right. Anthony Scriffignano, tell us a little bit about your role at Dun & Bradstreet, please.

Anthony Scriffignano: Thanks very much. I'm the chief data scientist at Dun & Bradstreet. Dun & Bradstreet is a company that's been around. We're in our 176th year right now as a company. It's very rare air in the United States. There are very few, less than a dozen companies that have actually managed to survive that long.

We focus on total risk and total opportunity for our business customers around the world. It's a matter of collecting information from hundreds of countries in different writing systems, in different languages. There are different laws in those different countries, so it's no small step to just get the information together and make it make sense.

Then, to the point that David brought out, to actually make it make sense in an increasingly relevant way. Our customers are asking more difficult questions. Compliance is becoming a very big issue. Privacy is becoming a very big issue, cybersecurity. All of these things are the problem sets that we help our customers address.

Michael Krigsman: Okay. Now, before we dive into our plan for myth-busting today, I want to remind everybody that there is a tweet chat going on right now using the hashtag #CxOTalk. Join in. Ask your questions. Share your comments.

A small request: Please tell a friend about CxOTalk right now. Ask somebody to join us.

All right, so our first myth that we need to talk about is the fact that everybody knows the more data you accumulate, the better your business is going to be. Right? Everybody knows that. Right? David, I see you're kind of nodding your head, so obviously you agree with that.

Dr. David Bray: No, I do not. [Laughter]

Michael Krigsman: [Laughter]

Dr. David Bray: But, I do agree that everybody seems to, or conventional wisdom seems to, think that more data equals better output. I think, really, what both Anthony and I want to underscore is it's really about, one, the quality of the data, but also thinking about the diversity of the data. If you have a lot of data but it's extremely biased or it's missing what you really want to focus on as a business, then it's not going to be practically relevant to what you're trying to achieve.

To put this in a context of how do you actually operationalize and actually think about data, it's really sort of going back to first principles and say, "What is the problem we are trying to solve? What are the insights we're trying to gain?" Once you answer that question, then you can say, "Well, are we looking in the right places? Are we actually analyzing the correct data from those streams? And, is it diverse enough to make sure we're not getting bias introduced into what we are seeing or we're actually missing things and having blind spots?"

Anthony Scriffignano: Yeah, if I can just add to that. The amount of data on earth is doubling at a rate that is arguably unmeasurable right now. There are lots of studies that they're all looking at different things. How much data is transferred on the Internet? How much data is stored on devices? All of those are proxies. We don't really know anymore.

There's a sense of thinking, "Well, gee. That's a lot more fodder. We can learn a lot more from that data." It turns out that data begets data. So, an event happens, and there's a lot of data that is just duplicative. Just having more information doesn't mean you necessarily have more ground truth.

If you use things like machine learning, they see more evidence of something. They train themselves on that, and they believe that those things are more important. Then we become focused on things that are probably less important, but more talked about or more occurring in the data.

David, you brought up the idea of veracity. That's a very big deal. You know I jokingly say, "Well, it's on the Internet. It must be true." Well, we know that's not true, right? Yet, our algorithms, certainly in AI, AI has a tendency to ingest things and then treat it as true. More and more, it's becoming important to have these skills of: how do I know when I have enough data; how do I know the right data; how do I understand the bias; can I defend stopping where I stopped; and how do I know it's true? These are all critically important skills. It's not just, bring me more data. That's almost certainly going to make things worse in today's world.

Michael Krigsman: What's the advice? What's the takeaway here for executives, Anthony?

Anthony Scriffignano: Well, I think the first thing is you have to be able to ask a good question. Don't start with the data. There's a big tendency to start with -- I get a message or a phone call a week from somebody. I call it the "Have I got a dataset for you" conversation.

It's great that you have a dataset. Congratulations. I wish you well. But, what problem am I trying to solve? I talk to you about how our customers' problems are changing. Let's take compliance, for example.

There are a lot of laws around the world that are getting richer and richer that basically say, "You have to know who you're doing business with, and you have to use some sort of best practice to prove that you know who they are, that they're not supporting terrorists, that they're not laundering money, that they're not" -- fill in the blank. Just saying, "Well, I looked them up on the Internet, and I found them on my favorite search engine," that's not nearly good enough, right?

You come to us, and we say, "Well, here's their beneficial owners. Here's their corporate link, et cetera." Great, great start. What are you going to do next?

My big advice would be to start with a question. That question was, "Who am I doing business with?" Don't start with, "Please give me your data, and I'll let you know if I need any help." That's just not going to end well with almost anybody these days.

Michael Krigsman: What does that mean, David Bray, start with the question? How can executives know, when it comes to data, what are the right questions that they should be asking in the first place?

Dr. David Bray: Right. I think, if I could wax poetically, briefly, just because you know we like to interject a combination of both left brain and right brain thinking here on CxOTalk, E.E. Cummings once said, "Always a more beautiful answer that asks a more beautiful question."

As Anthony pointed out, there are now new requirements that are being placed on certain businesses in terms of knowing who you're doing business with. Okay, so that's the answer that I want to achieve. Then the question is, well, how do I know who I'm doing business with at a level that is sufficient and dependable such in case that later someone says, "Well, you actually were working with this following company. Did you happen to know that they were doing money laundering?" or something like that. You can say, "Well, we did these following checks, and this is how we didn't find anything at the time," or, "it was apparent that they weren't doing at the time."

You really need to start thinking about, again, what is the beautiful answer that you want to achieve, and then what's the beautiful questions that you need to make sure that you're doing with the rigor of that. I think the other thing that you can also do is you can also pull together almost different people from your business units and say, "What are the important questions that our customers are asking us, our stakeholders are asking us?" Almost collectively brainstorm what are the interesting questions we're being posed that right now we cannot answer. Then you can create a prioritized list that said, "If I had to answer these top three questions, these would be the things I'd want to achieve."

Anthony Scriffignano: There's something that we talk about a lot. We call it the dispositive threshold. It's not a term you can look up anywhere. I made it up. It's basically the point at which you can dispose of the question. You can answer the question with the data.

The tricky thing is that when you get to that point, you have enough data to answer the question, but it's not necessarily enough data to answer the question right. Now, how do we define right? How do we define good enough? How do we define how wrong my analysis needs to be before it would make a different decision? These are very, very big questions, and they're not answerable necessarily with math.

Sometimes, as David said, you've got to go talk to the users and say, "Are you marketing to them? Are you trying to sue them? Is somebody going to die if we get this wrong?" Those are different levels of adjudication, I would hope. We need to understand the sensitivity, the decision elasticity in the ways in which we're using data in order to use it as practitioners and not just pushing buttons and producing reports.

Dr. David Bray: If I could, real quick, build on what Anthony said, [and share] a practical example. Back in 2001, 2002, I was with the bioterrorism preparedness response program at the U.S. Centers for Disease Control. We were working with the state public health facilities and public health labs to try and actually figure out when they were seeing an increase in flu, things such as that.

Suddenly, one day, one of my team members came running to me and said, "We've just seen a five-times increase in the amount of flu in the southwestern part of the United States." I was like, "Well, that's curious. What's going on here?" When we looked, what happened was they were only updating their record set once every month, and so you got the data volume all at the same time.

Anthony Scriffignano: Yeah.

Dr. David Bray: Unfortunately, that made it all of a sudden look like there was a dramatic spike. And so, as Anthony mentioned, it's about making sure you ask the right question and then also understand the context in which the data is being received and being brought in.
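The reporting-cadence effect described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `reports` records, their dates, and their counts are invented, and `daily_rate` is a hypothetical helper, not anything the CDC program used. The idea is to divide each count by the number of days the report covers before comparing reports.

```python
from datetime import date

# Hypothetical records: (date received, days of cases covered, case count).
# All values are invented for illustration.
reports = [
    (date(2002, 1, 7), 7, 70),     # weekly reporter
    (date(2002, 1, 14), 7, 77),    # weekly reporter
    (date(2002, 1, 21), 7, 63),    # weekly reporter
    (date(2002, 1, 28), 30, 330),  # monthly reporter: one large batch
]

def daily_rate(covered_days: int, cases: int) -> float:
    """Cases per day over the period the report actually covers."""
    return cases / covered_days

# Raw counts make the monthly batch look like a spike.
raw_counts = [cases for _, _, cases in reports]

# Normalizing by the covered period puts every report on the same footing.
rates = [daily_rate(days, cases) for _, days, cases in reports]

print(raw_counts)
print(rates)
```

In this made-up data the largest raw count is more than five times the smallest, while the per-day rates are all within a couple of cases of each other. The apparent spike comes from the update schedule, not from the flu.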

There's also, without going too far down the rabbit hole, a lot of our statistical methods were developed for an era in which I would say is not big data, in which it was more than, say, 40 individual instances or whatever you're looking at, but maybe less than, say, 400,000. And so, there are these things called p-values, which are sort of the confidence level in which you are confident that what you're observing is a nonrandom event, but it's actually something that is statistically significant. The challenge is, as you grow the amount of data, you may find things that show up mathematically as appearing to be statistically significant but, in the real world, might not actually be correlated whatsoever.
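One way to see the sample-size effect is to hold a tiny effect fixed and watch the p-value fall as the number of observations grows. The sketch below is an illustration, not the speakers' method. It assumes a two-sample z-test with a known standard deviation and equal group sizes, and the effect size of 1% of a standard deviation is an arbitrary choice. The function name `two_sided_p_value` is hypothetical.

```python
import math

def two_sided_p_value(effect: float, sd: float, n_per_group: int) -> float:
    """Two-sided p-value of a two-sample z-test (known sd, equal group sizes)."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = effect / standard_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

EFFECT = 0.01  # assumed difference in means: 1% of a standard deviation
SD = 1.0       # assumed known standard deviation

p_values = {n: two_sided_p_value(EFFECT, SD, n) for n in (40, 4_000, 400_000)}

for n, p in p_values.items():
    print(f"n per group = {n:>7}: p = {p:.3g}")
```

With 40 or 4,000 observations per group, the difference is nowhere near the usual 0.05 cutoff. With 400,000 per group, the same negligible difference is flagged as highly significant. Statistical significance here says the effect is unlikely to be exactly zero, not that it is large enough to matter.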

Anthony Scriffignano: Since you brought math into it, I just have to add to that. There is a common assumption in math that when we develop a regression equation that explains what we're looking at well enough to stay within a certain tolerance, we can call it a day and move on. The rest of the data that varies from the equation that we're using to predict the behavior is considered to be random.

It turns out it's not random at all. It contains pockets of bad guys. It contains pockets of fascinating new opportunity. And so, we have to be able to use methods, and sometimes there are AI methods that can do things like this. There are methods that are not necessarily supervised, but we'll get into this later, I'm sure. Methods that are based on observation, recursion, and learning.
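The point about leftover variation can be illustrated with a small residual check. The sketch below assumes invented data in which a small segment `"b"` carries a consistent offset. It fits one ordinary least squares line to everything and then averages the residuals per segment. The data, the segment labels, and the `fit_line` helper are all hypothetical.

```python
from statistics import mean

# Hypothetical observations: (x, y, segment). Values are invented.
# Segment "a" follows y = 2x + 1 with small alternating noise.
data = [(x, 2.0 * x + 1.0 + (0.1 if x % 2 else -0.1), "a") for x in range(10)]
# Segment "b" is a small pocket sitting consistently above that line.
data += [(x, 2.0 * x + 4.0, "b") for x in (3, 4, 5)]

def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar

slope, intercept = fit_line(data)

# Collect residuals (observed minus predicted) per segment.
residuals_by_segment = {}
for x, y, segment in data:
    residuals_by_segment.setdefault(segment, []).append(y - (slope * x + intercept))

mean_residual = {seg: mean(vals) for seg, vals in residuals_by_segment.items()}
print(mean_residual)
```

If the residuals were pure noise, every segment's mean residual would sit near zero. Here segment `"b"` sits well above the fitted line, which is the kind of pocket worth investigating rather than writing off as random error.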

Are we using modern methods, or are we using math from 1980? We really need to ask that question. If the answer is we're using math from 1980, there's no reason to go shoot yourself. It's just that you might want to get a little help.

Michael Krigsman: What should organizations do to ensure that they're using modern mathematics and data science techniques? To ask the question another way, how can this even be possible, Anthony? You've got the largest companies with the most modern and up-to-date data scientists. When you talk about people using the old math, are you talking about maybe farmers? I love farms, but--

Anthony Scriffignano: [Laughter] I don't want to pick on math. Math hasn't changed all that much.

Dr. David Bray: No.

Anthony Scriffignano: It's the way we use it that has changed. Let me give you an example. When Brexit became a thing, everybody came running to us and said, "You guys have a lot of experience. You've been in business. You could see all of this data. Can you tell us what's likely to happen in light of this change?"

First of all, the change hasn't happened yet. Second of all, it's unprecedented. All of the data that we have longitudinally is in an environment that cannot be extrapolated to the one you'd like to ask about. Math tells me, "No, I can't. I can't, with any degree of mathematical rigor, tell you the kinds of the questions you want to ask about what's going to happen." The easy thing is to just say, "No," and close the door.

If you say, "Well, what can you tell me?" Well, can I model, in the data that I have, what might have happened under certain scenarios, like a shift in the supply chain, a shift in governmental policy, or a shift in legislative rigor? Those things have happened in the past, and I can see what kinds of impact they had. I can possibly project that to different scenarios. Then I can tell you, "Within this scenario, these sorts of things seem more likely to happen than within this scenario." I'm giving you an answer to a question you didn't ask, but I'm giving you an answer to a question I can answer. Sometimes the beauty of being able to use all of the science and all these wonderful tools we have is to change the question into something that's more reasonable to answer with the data that's realistically available.

Michael Krigsman: Okay. Now, there's a myth, right? We're talking about myths today and trying to bust through the hype. We all know that the best thing you can do with your data is aggregate all of it together into a data lake so that it's useful to everybody, to all comers. David, any thoughts on this particular point?

Dr. David Bray: Well, there are two factors that make me cautious to think that that is a true myth. Again, we're myth busting today, so I'm happy to be partners with Anthony at serving as busting the myth. First, it's thinking about that the size of the amount of data that's growing on the planet, as he mentioned, is doubling about every 2 years to 18 months. That's growing exponentially.

If you begin to try and put everything all in one place, one is going to grow rapidly, but you can probably handle it with the cloud. But, the amount of time that it takes to actually transfer the data is also going to double about every 18 months. Over time, it's no longer scalable, and you don't have to bring it all into one place only because, if you do, the amount of time that it would take to do that is just going to be so huge. It's really not worth that.

The second thing, though, is also, that's a massive security risk. You think about if you put all the data in one place, why do people rob banks? Supposedly because that's where the money is. The same thing. If you put all your data in one place, then you're just putting a big target on yourself saying, "Here. Come." If this data happens to be valuable, proprietary, or personally identifiable, you might be actually creating a huge target.

What you really want to think about is, how can we actually have the data where it sits where it's already, possibly and ideally, hopefully, secure both in transit and in rest; and think about how we can then apply analytic techniques to it. When you think about the Internet of Everything where we're going upwards to right now about 21 billion to 25 billion network devices on the face of the planet to, in another 5 years, anywhere between 75 billion to 300 billion network devices on the planet, the amount of information that's going to be given by those sensors, that digital exhaust, you're going to have to actually think about, at the device level, what do I keep, what do I discard, and what do I actually say, "Well, I now need to actually have someone help triangulate and use this data with me." We're going to need more network approaches to how you analyze data across different data repositories as opposed to trying to put it all in one place.

Anthony Scriffignano: Let me just add one thing to that. You're indirectly talking about data at rest and data in motion. There's a joke that there are two kinds of people in this world: those that can put things into categories and those that can't. Right?

Michael Krigsman: [Laughter]

Anthony Scriffignano: I think data at rest and data in motion was a helpful metaphor, but there are really many different levels of data in motion. There's data in motion that's changing once a month and data in motion that's changing once a millisecond. I don't even know what it means to try to take that data that's changing once a millisecond and repose it somewhere afterward. It depends on the data, it depends on how it's being reduced, it depends on why we're using it, and all of those footnotes.

But, at the end of the day, for data like that, if I can somehow either listen to it or play it where it lays, my life might get a lot easier. I think this tendency, and it started, I think, with a lot of the original applications in big data, the conversation started with, "Well, first, put all your data here and then push this button," right? Then we got to, "All right, well, just tell me where all your data lives, and I'll sort of point to it and I'll push this button." The first one was a little crazy because of the integrated supply chains and all the rules and how the data gets made. The second one gets crazy because of the permissions and the rules about what data can cross borders, and so forth.

We wind up in this conundrum where I want to see something holistically as if all the data was in one place, but probably the worst thing to do is to try to put all the data in one place. Again, you don't have to just give up and walk out of the room. That's an approachable problem. Hospitals deal with it all the time. There's biometric data that are being produced in real time, and there's billing data that gets produced once a month. We can fix this problem if we'd just back up from it and stop trying to conquer it with a simple assumption.

Michael Krigsman: How exactly will you fix this problem?

Anthony Scriffignano: My problem, and let me just use the world I live in, right? We want to understand total risk and total opportunity. What are the things that move slowly in terms of total risk? Well, how people pay their bills every month. That moves roughly on a monthly basis in different billing cycles, but we understand. You don't pay your bills every millisecond, right?

What moves more quickly? Well, fraud moves pretty quickly, right? Maybe I want to have a more agile process for detecting potential malfeasance than for detecting a change in propensity to pay bills. It's a question of not trying to paint everything with one broad brush and say, "This is the pangalactic answer for everything we're going to do with data."

I think that's how we get to this myth. Put everything in one place and do the same thing with everything. Okay, well then, the whole earth should speak Latin, and we should all use the same calendar. None of that is true.

Michael Krigsman: David, before we go on to our next myth on myth busting, which will be AI and cloud, which will be quite interesting, any final thoughts on this comment that Anthony just made about the desire for pangalactic data? Pangalactic data, what about that? [Laughter]

Dr. David Bray: Well, I think what he's mentioning is that context, context, context. I mean context matters. If there's anything that we want people to take away in 2018 is understanding the context of both how your data is produced, understanding how it's used and how you want to actually have it answer your questions. That really matters.

To give a practical example to C-suite leaders, about 2, 2.5 years ago an average of 72 hours of YouTube video was produced per minute. Now, less than 2 years later, it's more than 500 hours per minute of YouTube video. Now, we don't need to know everything that's in that video. We don't need to know that the pixel right here happens to be beige, brown, or whatever, or the pixel here is more gray or blue. But, it would be good if you could actually begin to apply analytic techniques to the data where it sits because, again, as Anthony mentioned, streaming data is really hard.

Video is an example of streaming data. We could say, "In this video, there are three people: Michael, Anthony, and David. They're having the following conversation." Maybe you want to describe the background, maybe you want to describe the tenor of the conversation, but you don't need to know every bit in that data. You need to know really the important things that are relevant to your context.

What Anthony is mentioning is, figure out the velocity of the data that you're dealing with. If it's at a millisecond update, what you're really looking for are almost like the meta-constructs. What are the trends in that data that are relevant to the question you're asking? Whereas, if it's something that happens once a month, that might be more easily analyzable in a different fashion.

Anthony Scriffignano: There's one other nuance to what David is talking about, which I think is worth bringing up here. The science of understanding metadata is now coming through its adolescence. It used to be that the metadata was just this sort of dictionary that comes with the data that tells us how it's structured. Well, most data is unstructured these days, so good luck with that.

But, there is a lot of metadata, even with a video that's posted. I know when it was posted. I know who posted it. I know how long it is. I know what format it's in. All of those things are part of the metadata.

Imagine that I could look across the metadata that's available in a certain sphere that I'm looking at, and I could see that it suddenly spiked; or that the sentiment of the comments, the mean sentiment of the comments, has suddenly shifted negative; or that a common phrase has emerged that was never common before. That might be just enough to say to me, "Pay attention to this."
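The "pay attention to this" signal described here can be approximated with a simple threshold on metadata counts, without ever pulling in the underlying content. The sketch below is a minimal illustration under stated assumptions: the hourly post counts are invented, the three-standard-deviation threshold is an arbitrary choice, and `is_spike` is a hypothetical helper rather than any particular product's algorithm.

```python
from statistics import mean, stdev

def is_spike(history, latest, threshold=3.0):
    """Return True if `latest` is more than `threshold` standard deviations
    above the mean of `history`."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    spread = stdev(history)
    if spread == 0:
        return latest > history[0]  # flat history: any increase stands out
    return (latest - mean(history)) / spread > threshold

# Hypothetical hourly counts of new posts on a topic (metadata only).
posts_per_hour = [12, 15, 11, 14, 13, 12, 16, 13]

ordinary_hour = is_spike(posts_per_hour, 14)
sudden_burst = is_spike(posts_per_hour, 60)

print(ordinary_hour, sudden_burst)
```

The same pattern could be applied to other metadata signals mentioned above, such as mean comment sentiment or the frequency of a phrase, by swapping in a different series. Only when the check fires would it make sense to fetch and analyze the full data.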

Maybe now I want to go and pull all that data in for the time being because I know that something interesting has happened. It's the way your brain works. I like to say that nobody is paying attention to how their shoes feel until you say that. Then everybody starts to think about their shoes. Your brain does this all the time. We have algorithms that can do this as well.

Michael Krigsman: Okay. We have some questions from Twitter on data, but we've got to move on to AI. My apologies to anybody that has asked a question.

Dr. David Bray: We'll make sure to answer them. As a follow-up, we'll make sure to answer them.

Anthony Scriffignano: In a stream.

Dr. David Bray: Yes. [Laughter]

Michael Krigsman: Yeah, absolutely. If you've asked a question, just keep an eye on Twitter after the show. I want to remind everybody that right now there's a tweet chat using the hashtag #CxOTalk. Join us and participate in this conversation with David Bray from the People-Centered Internet initiative and Anthony Scriffignano, who is the chief data scientist at Dun & Bradstreet.

David, there's this widespread belief that cloud is going to take over and especially driven by AI. It makes sense to store all of our data in the cloud. It's pretty black and white. What do you think about that one?

Dr. David Bray: Obviously, I've been a proponent of cloud, and I think cloud services are a very important part of any organization wanting to modernize itself. However, you do need to think about, as you move to cloud services and, in particular, as you move to AI, does it make sense to be a private cloud, a hybrid cloud, a public cloud? Then, equally, as we already talked about, you don't really want to put all your data in one place for the variety of reasons we talked about that it's going to get really expensive to try and move it in transit. You don't necessarily need to do that. There's also a security risk.

It really is the idea; as Anthony mentioned, that whole pangalactic solution. There is no pangalactic solution to data. There's also no pangalactic solution to cloud or not cloud. I don't want to be Shakespeare and say, "To cloud or not to cloud? That is the question," but it is really thinking about, again, what are you trying to accomplish as a business, what is your mission, and then how best to apply that.

Now, I will say, for artificial intelligence and machine learning, I don't see that a lot of organizations are going to necessarily -- and I'm thinking about small businesses. I'm not thinking about the Fortune 500. A lot of small businesses, midsize businesses, they're probably not going to roll their own machine learning or AI. What they're really going to be looking for is, can they get software as a service offerings that then are applied to their unique data, and they get the advantage not from having written their own AI solution, hosted their own AI solution, or machine learning solution. They get the advantage because they have the unique data and unique processes associated with their business, and then they're applying to that the software as a service, machine learning, or AI solution.

I think that's really going to be how a lot of businesses look towards going forward. It's thinking about what is unique to what you do, and it used to be thought that what was unique to what you do was your software and what you hosted. I would increasingly say that what's unique to what you do is the business and the processes associated with that data that you're actually doing to run the business.

Anthony Scriffignano: I'm just going to add one other thing that a lot of times behavior begets terminology that didn't exist before to describe that behavior. We have to pay attention to the fact that there are two things happening right now that don't really have good names for them yet. One is that there are, let's call them, mega services like quantum computing, some particular cognitive AI, or anything that pretty much you're not going to build your own. You know what I mean? You're never going to build your own. It's classic to be able to offer things like that in the cloud so that they can be made available to a larger group of people instead of saying, "Please give me a billion dollars so I can go put this thing in your basement."

The second thing that's happening is that the things that are part of the Internet of Things together are increasingly disconnected from the Internet. The Internet of Things today is mostly things connected to the Internet, to some application or service, but now we're seeing autonomous cars, drones, and all these things that are disconnected. Those things will increasingly have more powerful AI in them, and they will be able to modify their own goals. Shudder to think, but they will because the environment changes in ways that were unpredicted, so we have AI goal modification disconnected from a central server. I think Skynet is a term that's often used for that in science fiction, but we're very, very close to that.

Do these things have the ability to form a temporal cloud on the fly to share information in the context of their environment, of their purpose, of their mission, of their moment, of their geography, of their - fill in the blank? We don't really have those types of services or a word for those types of services, but you heard it here first. I think those kinds of things are going to start to happen where these sort of ad hoc clouds are going to form. They'll happen in a very virtual way, and very powerful things will start to happen with that.

Dr. David Bray: I would agree 100%. Actually, I like your term. Whether it's ad hoc clouds or ad hoc networking, you'll see this already in the current draft spec for 5G: there is the ability to do ad hoc networking so that you don't have to go through a central provider. Just like we talked about, you don't need to store all your data in one place. Similarly with networks, there'll still be the option to do centralized networking, and there are a lot of advantages to that. That's not going to go away. That actually allows certain economies of scale and efficiencies. But also, as mentioned with these Internet of Things devices, you may form your own private, ad hoc network for a brief period of time, whether it's for transportation capabilities, whether it's for just verification and security purposes, or maybe it's even just that you're with your friends, you're at a concert, and you all just want to create an ad hoc network that is just local to that concert, where you're all sharing the experience in your comments and your conversations.

This is really going to be a trend, and we see that growing desire and a desire that is expressed somewhat by hype, but there is this desire for these distributed solutions to data. A lot of them right now are very nascent, and they're not really mature enough. It's the idea not to store all our data on one platform but have the ability to actually have a locus of control about what we do with our data, what we do with our devices. That's partly why I think, again, if you come back to the common denominator, what's the context in which you're operating? Of course, in my opinion, what are the people-centered aspects, ethics, and values that you want to do with that data and with that network?

Anthony Scriffignano: Let me just add two things. Amen to the last part of what you just said, and that's one of the many reasons you're awesome.

Dr. David Bray: You're awesome too. [Laughter]

Anthony Scriffignano: [Laughter] Ethics and values are not absolute, right? And so, when people start collaborating, and when I say people, when devices, when systems, when processes, when AI agents start to collaborate or when people start to collaborate, they don't necessarily share the same values. They may think they do, or they may realize they don't. Those are very different scenarios.

Just like people have enough trouble coming together and doing anything beyond ordering lunch, different systems that are disparate--authentication, validation, goal alignment--all of these things are really, really, really tricky problems, and they're not going to go away just because we invent a term like ad hoc networking. With the people at the concert, the implication there is, well, they're all Facebook friends or whatever, and they can join this thing if they're sort of pre-identified to each other. You're not going to let the creepy guy in that's just a black silhouette and doesn't have a name.

Well, the creepy guy that does have a black silhouette and doesn't have a name might be an AI agent in an ad hoc network that knows that the thing broke. Right? We've got to get better at that sort of thing. The reason it's hard to talk about it, again, is we don't have nouns and verbs for this yet, so we're going to have to build that.

Michael Krigsman: What about the fact that, as everybody knows, AI is changing the world, has changed the world, and AI is going to serve as our helper and serve as the means for helping us understand that data? What about that? Is that a myth? Is that a reality? Where do we stand in the world on that?

Anthony Scriffignano: "Understand" is a loaded word. We use terms like artificial intelligence and cognitive analytics, but the reality is, machines don't really understand anything. They contextualize things. They put things into ontologies that we conceive of. Sometimes they create their own.

There are neuromorphic methods, things like neural networks and so forth, that can form their own sort of "understanding." But, at the end of the day, if we say that a machine is going to help us understand something, we have to probably change the nature of that to being something more constructivist than positivistic.

A machine might help me understand where there's traffic congestion ahead. That's not really understanding. I understood congestion before this happened. The machine has some data I don't have. It tells me about it. We collectively realize I don't want to go that way, and I go a different way. There's no hermeneutic there. It's not bringing any new understanding of anything. It's just a bunch of data.

For a machine to be able to say to me, "Look, you're thinking about cancer all wrong. You need to be able to think about it in a totally different way." In the drug discovery process, I don't see us there in our children's children's generation yet.

Michael Krigsman: I have seen advertisements on television from some of the very, very largest organizations in the world, technology organizations, assuring me that this is already being done.

Anthony Scriffignano: Define "this." What's being done right now is that advanced artificial intelligence is being used to accelerate drug discovery. Absolutely. Amen. Go faster. Keep doing that.

It hasn't necessarily changed how we understand the process of doing that. It has accelerated it. It has in some ways made things possible that we couldn't do before. I'm going to get outside my swim lane here pretty quickly, but there's a difference between that and fundamentally changing the understanding of understanding, which is what your original question was.

Dr. David Bray: He's absolutely right. At the end of the day, we ascribe; we try to humanize AI and machine learning when in fact it's just math. It's sophisticated math, but it is not thinking like we do. It's even more challenging when people build robots that look like they have the human face to begin to ascribe human behavior to it.

The machine is mimicking the appearance and, in many cases, looks like it's thinking like we do, but it's just doing really complicated math. What it really is doing is advanced pattern matching, pattern identification, and trend identification at a scale, yes, that is much better than what we can do as humans, given the limits of our own brains. It is still very, very, very powerful, and I don't want to undersell what machine learning can do.

However, as Anthony rightly puts, understanding is a harder term to define. In fact, philosophy has spent the last 3,000 years trying to actually define how we understand anything, let alone what knowledge is. We humans haven't reached the conclusion of how we understand things.

I think it's safe to say that machines, as Anthony mentioned, are definitely accelerating the process of discovery, of new insights, with humans working with them, tipping and cueing where humans should pay attention, tipping and cueing where humans should not pay attention. It's not like the machine is, one, thinking like a human or, two, able to then turn to us and say, "I understand how you're seeing the world. It's not right. Here's the right way to see the world."

Michael Krigsman: My inner rudeness just comes out when I get excited because, quite frankly, I have seen these ads on TV from, again, big, huge blue companies. Okay. I've heard this. I've heard this. What they say is that machine learning is fundamentally changing how we relate to the world, how we think and that, in fact, whether it's math or it's hocus-pocus, it's all real.

Anthony Scriffignano: I think that's true. Let's just say the availability of machines that appear to be learning, just to get away from this whole problem of what learning is. It definitely changes. We have a tendency to believe what we see on the Internet. I don't want to get into fake news and everything, but there is a tendency to look things up and then say, "Well, here's the answer because I found it." There is a tendency to not want to take one more step and ask why something is true. There is a tendency to not really care how a square root is calculated anymore because there's a button we can press on a calculator.

I think these things are going to get us in trouble. I think we're starting down the path of not necessarily thoroughly understanding our machines and, at the same time, they're slowly becoming our overlords. Outlook says, "Go here," and you go here. The machine recommends to do something, and so you trade the stock. That's a very dangerous path. I think you should understand a little bit about the advice you're taking and why it was conceived.

At the same time, explainability is halfway out the window right now. There are certain algorithms that produce amazing results, and we don't understand how they get there. They can't "explain" it to us in terms that our ability to understand would be consistent with the explanation. It depends; it's a bit of a semantic argument whether that means that these machines are smarter. I would say they're better at doing that thing than us in that moment, maybe, but that the spark of intuition, the empathy, the artistic nature, the ability to conceive, I think we still own all of that. I think that we lose that because we stopped exercising that muscle memory, not because the machines got better at it.
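[Illustration, not part of the spoken conversation.] The remark above about no longer caring how a square root is calculated can be made concrete. One classic way to compute it by hand or in software is Newton's (Heron's) iteration; whether any particular calculator uses this exact method is not something the conversation states, so the sketch below is only one possible approach. The function name and the tolerance are assumptions chosen for the example.

```python
def newton_sqrt(x: float, tolerance: float = 1e-12, max_iterations: int = 100) -> float:
    """Approximate the square root of x with Newton's (Heron's) iteration.

    Illustrative sketch only: the name, tolerance, and iteration cap are
    assumptions, not something taken from the conversation.
    """
    if x < 0:
        raise ValueError("square root of a negative number is not real")
    if x == 0:
        return 0.0

    # Any positive starting guess works; this one keeps early steps well-behaved.
    guess = x if x >= 1 else 1.0
    for _ in range(max_iterations):
        # Average the guess with x / guess; the two bracket the true root.
        next_guess = 0.5 * (guess + x / guess)
        if abs(next_guess - guess) <= tolerance * next_guess:
            return next_guess
        guess = next_guess
    return guess


if __name__ == "__main__":
    print(newton_sqrt(2.0))
```

Each pass averages the current guess with `x / guess`, so the estimate closes in on the root quickly; a handful of iterations is typically enough for everyday inputs.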

Michael Krigsman: Okay. David, let me then redirect this to another issue that I've heard. As a matter of fact, I talk with a lot of software vendors all the time, and I've heard this. Okay?

We all know that machine learning, artificial intelligence, big data, predictive analytics, basically it's the same thing. Tell us. David, tell us the truth. Come on now.

Dr. David Bray: Well, we could easily devote another two to three hours to giving the taxonomy, but essentially they're not.

Michael Krigsman: The vendors, the software vendors have said that, so how could this be? I don't get it.

Dr. David Bray: A couple things: I would recommend don't believe everything you see on TV or the Internet, first and foremost. Then, two, recognize that oftentimes marketing does play fast and loose with terminology.

I do think it's worth exploring what you said there without going into a whole class on the differences between different types of neural networks, machine learning, structured, unstructured. I think it's just worth recognizing a couple things: One, you are seeing companies that initially came out maybe about a year, year and a half ago--without naming any names--came out and said this was about cognitive computing. You see them sort of moving away from that term because they've recognized that that may have misled or it may not have been the correct framing of what really is going on here because it's not like the machine is thinking like us.

Second, as Anthony mentioned, a practical concern is in Europe. The General Data Protection Regulation goes into effect in May of this year. Part of GDPR is that you actually have to explain if an automated decision is made. This is a thing that you have to practically think about if you're using, whether it's machine learning techniques, algorithms, or anything like that in your organization. You've got to actually ask the question of your data scientists, of your technology people and say, "Are we able to do that?" because technically that's part of GDPR. Again, I'm not a lawyer, but I believe you can actually be fined up to 4% of your company's revenue if you're in violation of GDPR.

Then the last thing I'll say is, how many of us will smack our computer or smack our smartphone when it's not working, or try to whack it or wave it as if it will somehow work better if we physically touch it or something like that?

There's also a wonderful example from Experimental Economics where they actually give someone a sealed envelope and say, "You're going to play the prisoner's dilemma with the computer." The computer has already been told how it's going to play, so it's not going to have any random choice or anything. It's already been told how it's going to play. Basically, it's going to make an offer of a certain amount of money. If you think the offer is fair, you accept it. If you don't think the offer is fair, you can reject.

Now, in that situation where the program has already been explicitly told what to do, you should accept every time. But, time and time again when humans play against a computer, if they feel like an unfair offer is made to them, they reject it even though they've been told and they later get to open up the envelope and see the machine has already been programmed explicitly to behave a certain way. This is partly just a challenge of our own selves that we, our human brains, really aren't used to operating in an environment in which we're interacting with something that's trying to mimic human intelligence, but isn't actually human intelligence.
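[Illustration, not part of the spoken conversation.] The game described above, where one side makes an offer and the other can only accept or reject it, resembles what experimental economists usually call the ultimatum game. The sketch below shows why a pure money-maximizer "should accept every time" and how a fairness-minded responder differs. The pot size and the 30% fairness threshold are hypothetical values picked for the example, not figures from any study.

```python
POT = 10.0  # hypothetical total amount being split


def responder_payoff(offer: float, accept: bool) -> float:
    """The responder receives the offer if they accept, and nothing if they reject."""
    return offer if accept else 0.0


def money_maximizer_accepts(offer: float) -> bool:
    """Accept whenever accepting pays strictly more than rejecting (any positive offer)."""
    return responder_payoff(offer, True) > responder_payoff(offer, False)


def fairness_minded_accepts(offer: float, pot: float = POT, min_share: float = 0.3) -> bool:
    """Reject offers below a share of the pot; the 0.3 threshold is an assumption."""
    return offer >= min_share * pot


if __name__ == "__main__":
    for offer in (1.0, 3.0, 5.0):
        print(offer, money_maximizer_accepts(offer), fairness_minded_accepts(offer))
```

With a low offer such as 1.0 out of 10.0, the money-maximizer accepts while the fairness-minded responder rejects, which is the gap between the "rational" prediction and the behavior described in the conversation.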

Anthony Scriffignano: There are different definitions of AI and/or different definitions of intelligence. One of the definitions is to try to mimic the way we think. Another is to try to behave in a way we would behave. Another is to try to inform our decisions. Well, those are three completely different things.

A great example that's a little bit overused, but when AlphaGo beat the best Go player in the world, everybody said, "Oh, that's it. Computers are smarter than people." No, they're better at playing Go than people.

Now, apparently, there is something better at playing Go than the thing that's better at playing Go than people. That process will continue ad nauseam. I think that's awesome. I think the technology behind it is absolutely amazing. I don't think that means that computers are smarter than us. I think that means I probably don't want to play one in Go. [Laughter]

Dr. David Bray: [Laughter]

Anthony Scriffignano: We get to do other things with our time. If you think about the automation that's in the cockpit of an airplane, the auto flight systems, all these different things, eventually they say, "Well, your airplane. Something that I'm not prepared to handle has just happened. You take over."

There's a reason why they do that. Because it's an inherently complex problem that's changing at a very high rate, and there's a great degree of human life involved, and there are ethical considerations and so forth. There is that everywhere. We give up ground to our AI agent that we choose to give up or that we should give up so we can free ourselves up to do something more important and more valuable.

I think part of your message, David, which I love, is people get to use their time, their talent, and their treasure to do things that make everything better. If we are busy calculating measures of central tendency and trying to figure out what the survey says, we're not doing that. Let's let the machines do that, and let's do something more valuable. Right?

Michael Krigsman: You two are among the people I respect most of everybody that I know. Yet, I hear you saying this, and I talk with software vendors. I don't see a way of reconciling the kind of optimism, the universal optimism, with the almost dystopian reality of contemporary software, enterprise software marketing. How do you accomplish the idealism that you're describing? Who has got the budget? The federal government?

Anthony Scriffignano: No.

Dr. David Bray: [Laughter]

Anthony Scriffignano: You stay in the loop. I used an example recently, and I regretted saying it. I'm probably going to regret saying it now, but if there's some software that monitors my email and conveniently schedules a meeting with me with somebody because I said, in an email, "That's great. Let's set up time to talk," maybe I didn't really mean it when I said it. [Laughter]

There are certain things that I probably ought to stay in the loop on, and there are certain things that maybe I want that, if I've had 27 meetings with David already and I say in an email, "Let's schedule a meeting," then let's schedule a meeting. But, if I've never met with this guy named Michael Krigsman, and I say in an email, "That sounds great. Let's talk," maybe it should ask me.
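[Illustration, not part of the spoken conversation.] The "stay in the loop" rule described above can be sketched as a small decision function: act automatically for a well-established contact, and ask the person first for an unfamiliar one. The function name, the action labels, and the threshold of five prior meetings are all hypothetical; the conversation only gives the two endpoints (27 prior meetings versus none).

```python
from typing import Mapping

MIN_PRIOR_MEETINGS = 5  # hypothetical cutoff for "we already know each other"


def decide_action(
    contact: str,
    prior_meetings: Mapping[str, int],
    min_prior_meetings: int = MIN_PRIOR_MEETINGS,
) -> str:
    """Return "schedule" for familiar contacts, otherwise "ask_user".

    The agent only acts on its own when there is enough history with the
    contact; in every other case the human stays in the loop.
    """
    count = prior_meetings.get(contact, 0)
    if count >= min_prior_meetings:
        return "schedule"
    return "ask_user"


if __name__ == "__main__":
    history = {"David": 27}
    print(decide_action("David", history))    # prints "schedule"
    print(decide_action("Michael", history))  # prints "ask_user"
```

Defaulting to "ask_user" for unknown contacts is a deliberate choice in this sketch: when the agent lacks context, it hands the decision back to the person.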

I think that our devices -- we tend to make these all about ourselves, right? If you think about that happening in a medical setting, you know, should I defibrillate: yes or no? At some point, if you're good enough to tell whether or not you should defibrillate, please go do that and tell me that you've just done it. Right?

Dr. David Bray: [Laughter]

Anthony Scriffignano: We have, obviously, [laughter] technology that does that now. But I will tell you when that technology first became available, it was pretty terrifying, right? The original technology required a doctor or someone trained by a doctor to push that button. We've come a long way.

We should keep coming a long way, but we should not give up the thinking part. We should not give up the reasoning part. We should not give up the advancement part and just get lazy and go expect our agents to do everything for us. We should do better things with the time that's liberated by letting our machines do increasingly valuable things for us.

Dr. David Bray: Michael, to answer your question as well, and amplify everything Anthony said because he is awesome and that's why I really enjoy interacting with both of you on this wonderful conversation, it really is about being able to have a locus of choice, a locus of being able to choose what is either done by the machine or you don't use the machine. I think what we may be sort of discovering is, just like we discovered that if you drink too much of certain beverages or if you eat too much of certain types of food or things like that that that's going to be impacting your life in certain ways, the same is true that if you rush headlong without thinking intentionally about the choices you're making with both your time, how you're using the technology, how you're using the tools, you may find yourself in a circumstance that is not exactly healthy, either to you, to your community, to your nation, and to the world.

At the same time, these are really empowering, powerful tools that can be used as a force for good, and I think they will. There are already examples, again. Working with humans to find better cures for cancer, that's amazing. Being able to actually analyze things faster than humans can and see across multiple doctors is amazing as well.

That said, it's almost like how, when we were in the Agrarian societies, we came up with tools that would begin to help with some of the automation of the farming and things like that. Then that allowed us to move to do other things with our time. This too is now a trend in which we're going to find cases where we do want to use the tools, we do want to use the machines because it allows us to then focus our attention elsewhere. The question then is, what is that that we want to focus our attention to and also can be aware of when the machines may be leading us astray because it may unintentionally be giving us misinformation or it may unintentionally be drawing correlations when in fact there is no correlation there.

That's where I think it really gets back to the roots of this conversation where we started today, which is, it's about the data and making sure that the data that is fed to that machine actually can inform it because it's a lot like a five-year-old. If you feed it bad data, I mean as we all know in computer science: garbage in, garbage out. If you feed that machine bad data, biased data, or inappropriate data, it's going to draw incorrect conclusions. Then you're going to have to be living with the consequences as a business, or an organization, as a result.

Michael Krigsman: Anthony, I thought you were going to jump in on that, but we're just about out of time. We're actually past time. Boy, this was a very, very fast 45 minutes. As we finish up, can each of you offer advice to two different groups of people, maybe three? Okay? These need to be sort of tweet-sized, itsy-bitsy bytes because we really are out of time.

I'll ask you this. Number one, to business people who are hearing about data, AI, all of this stuff, what do they do with this in a practical way? Number two, to the software vendors who are selling these tools. Number three, to the government that is going to have to play a role in all of this as well.

David, should I start with you? Your advice to businesspeople on data, AI, all of this stuff.

Dr. David Bray: Know what you don't know. Build your network and your boards to help inform what you need to know.

Michael Krigsman: Okay. That was pretty fast. Then, David, advice to software industry executives who are selling these tools.

Dr. David Bray: [Laughter] That's a harder one, Michael.

Michael Krigsman: [Laughter]

Dr. David Bray: I think I would probably say, think about how you can enrich your customers' lives. This may be paradoxical. Again, I'm not in marketing, so I may be committing a big faux pas, but undersell, overdeliver, and surprise customers with the connectivity and the time you give back to them.

Michael Krigsman: Okay. That's pretty good. Then, finally, advice to the federal government.

Dr. David Bray: Recognize the world is changing. Our institutions may be outdated and need to be refocused and rethought. It's not just about the federal government. It's about local governments, state governments, world. You talked about who is going to make this happen. It's going to be anybody anywhere, whether it's private sector or public sector, that comes together and tries to show a better, more positive way forward.

Michael Krigsman: Okay. I love how positive you are. Anthony Scriffignano, so what advice--brief advice because we're out of time--what advice do you have for business leaders, business executives, and decision-makers, about data, AI, and all of the stuff that we hear about that we've been talking about?

Anthony Scriffignano: Two things I would say to business leaders. One is, be humble. Don't try to take on everything all at once. There are going to be a lot of people that try to tell you to do it all at once. Don't do it all at once. Take one meaningful step.

The second thing I would tell them is, keep learning. Whatever skills you have in your organization, those are table stakes. This is a very rapidly changing environment. It's not good enough to talk about how smart your people are, how good you are at anything. You've got to keep that learning curve moving forward.

Michael Krigsman: Then advice for software industry executives that are supplying the tools and helping us aggregate the data and use the data.

Anthony Scriffignano: Keep doing what you're doing, but recognize the risk of rushing to market before you think about the unintended use of your products and services. There are way too many times where we're making the same mistakes over and over again in terms of cybersecurity, in terms of things that don't work as intended or that can be used in ways that were not intended. Please pay more attention to that because it's not going to get any better, and it's going to get worse at a much faster rate.

The second thing I would say to them is that there's plenty of pie. There's a lot of opportunity, and there are a lot of attempts to sort of create this specialized, you know, the Betamax versus VHS argument, "We've got this special protocol, and as soon as you adopt it, you could do all these magical things." The magical things happen when things are interoperable, so make them interoperable.

Michael Krigsman: Okay. Finally, Anthony, advice to the federal government, which is going to be involved in this in one way or another. We know that.

Anthony Scriffignano: The Internet doesn't have geographies like the world does, so please pay attention to what's happening around you. A lot of government policy is, "Insert name of my country here first," and that doesn't always work in a technology space. We really do have to think broader about global best practice. GDPR is a good example of trying to think about privacy across at least a region of the world, but it would be nice if the whole world would try to think about privacy, as an example. I know that's very Kumbaya, but there are ways that we can do some of this realistically. There are trade agreements. There are ways in which governments can collaborate and provide benefit to each other for things that they each want.

The other thing I would say is that there are always marginalized others. There's always the haves and the have-nots, and especially with technology. Things get advanced in different parts of the world, and we could do a much better job as governments thinking about how we serve the underserved with some of this amazing abundance of technology and capability that we are enjoying right now.

Michael Krigsman: Finally. I know we did the "finally," and now one more, last finally, a real one. We have an interesting question from Twitter, so very, very quickly. Mitch Lieberman from G2 Crowd asks, "Clock speeds of computers are so fast, and so why isn't AI farther along? What are we missing?" Either one of you want to take a quick shot at that one?

Dr. David Bray: I would say it's debated. It's not just about the speed at which things are done. It's the data, but it's also, as we already talked about, making sense of it: having the metadata.

Speed is definitely why we're able to do a lot of the things that we're able to do today, but this is really the third wave of AI. A lot of what we're doing today, actually the foundation was laid in the '70s and the '80s. It's only now possible because of that clock speed, but what's holding us back is the data, being able to draw correlations and, to some respects, we do need additional new algorithms and new techniques.

Anthony Scriffignano: I'll just add to that that there have traditionally only been two ways to make things faster. You either write a better algorithm or you get a faster environment. You parallelize. You run it on a faster machine. There's something happening right now as we speak that's going to change all of that, and that is that we don't have a zero or a one anymore. We have a qubit that's anything between zero and one, and that changes everything.

As quantum computing comes along, what happens when we have quantum artificial intelligence algorithms that can take advantage of being able to think in this sort of fuzzy way? I don't have an answer for that. I hope to be able to see some of that. I hope to not have to fear some of that. There's great possibility there if we start thinking about it and not just considering it science fiction.
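[Illustration, not part of the spoken conversation.] The phrase "anything between zero and one" is an informal description. In the standard textbook formulation, the state of a single qubit is a superposition of the two basis states, written as:

```latex
% State of a single qubit: a weighted combination of the basis states |0> and |1>
|\psi\rangle = \alpha\,|0\rangle + \beta\,|1\rangle,
\qquad \alpha, \beta \in \mathbb{C},
\qquad |\alpha|^{2} + |\beta|^{2} = 1
```

Here a measurement returns 0 with probability $|\alpha|^{2}$ and 1 with probability $|\beta|^{2}$, so the readout is still a zero or a one; it is the state before measurement that is not restricted to those two values.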

Dr. David Bray: One last thing because I know you said this was final-final, Michael. I think this also underscores something. No company board of directors would have members who didn't know what a profit and loss sheet was. But, with all these things that we just talked about here, if you're a company, or if you're a government, and whoever your trustees are, whoever is actually serving on the steering committee for your organization, doesn't include at least a few people who are able to have these conversations, don't be surprised if your company is not able to keep up with, again, what Anthony has talked about and what you've talked about: the accelerated, exponential pace of change with all these technologies. None of us have all the answers. But, together, as we work as network change agents, we can make sense of what we do for individual organizations, and what we do as a planet and as a species, together as well.

Michael Krigsman: Anthony, just as we go out, what was that that you said about qubits and quantum computing?

Anthony Scriffignano: The question was, clock speeds are getting faster. What I was trying to throw out there was that the fundamental nature of digital computing is changing, and so clock speeds aren't the only thing that is changing and that the shift to quantum computers will change everything much more than slightly faster computers will today because, effectively, things can happen in an instant.

Michael Krigsman: I'm sure that that must be true. I don't know what that actually is.

Anthony Scriffignano: I said "effectively."

Michael Krigsman: [Laughter]

Dr. David Bray: Invite him back, and we can have another conversation. Sure. [Laughter]

Anthony Scriffignano: Yeah. Insert very long, painful conversation here. But, it really does change, and we are really at the cusp of this happening right now. There are machines available today that are doing things that they've only begun to exploit the potential possibilities there.

Michael Krigsman: Okay. Quantum computing. On that note, I want to thank our two guests: David Bray, who is the executive director of People-Centered Internet, and Anthony Scriffignano, who is the chief data scientist at Dun & Bradstreet. I'm Michael Krigsman. You've been watching Episode #270 of CxOTalk. Please, please tell a friend. Like us on Facebook, please.

We'll be back next week with more incredible conversations. Tune in. Join us. Thank you so much for joining us today. Have a great day. Bye-bye.

Published Date: Jan 05, 2018

Author: Michael Krigsman

Episode ID: 493
