Dr. Michael Wu is the Chief Scientist at Lithium Technologies. Michael received his Ph.D. from UC Berkeley’s Biophysics graduate program, where he modeled visual processing within the human brain using math, physics, and machine learning. He is currently applying similar data-driven methodologies to investigate and understand the complex dynamics of the social web. Michael developed the Facebook Engagement Index (FEI), Community Health Index (CHI) and many predictive social analytics with actionable insights. His R&D work at Lithium won him recognition as a 2010 Influential Leader by CRM Magazine.

In addition to the purely empirical methods, Michael also leverages social principles that govern human behavior (from sociology and anthropology, to behavioral economics and psychology, etc.) to decipher the intricate human components of social interactions.

Through this combined bottom-up and top-down approach, Michael has developed a sophisticated predictive model of influence and an evaluative framework for understanding gamification. To tackle challenging open problems (e.g., the value of WOM, social ROI, or the loyalty implications of gamification), Michael collaborates with academicians to conduct research on these unsolved problems.

Michael believes in knowledge dissemination. He speaks internationally at universities, conferences, and enterprises on his findings. His research and insights have been compiled and published in The Science of Social and The Science of Social 2.

Michael was a DOE fellow during his graduate career and was awarded four years of full fellowship under the Computational Science Graduate Fellowship. During his fellowship tenure, he served at the Los Alamos National Lab conducting research in face recognition. Prior to his Ph.D., Michael received his triple major undergraduate degree in Applied Math, Physics, and Molecular & Cell Biology from UC Berkeley.

Video Transcript: Michael Wu, Chief Scientist, Lithium Technologies

Michael:         

(00:03) Hello, welcome to episode number 84 of CXOTalk. Today we’re going to talk about data and analytics. I’m Michael Krigsman and my co-host Vala Afshar is a weary Vala today. He’s weary Vala Afshar because he was in Bali this week and he spent two days travelling in each direction and he was there for two days and he actually just got back. So he is not with us today, and so Vala, if you’re out there we’re thinking of you.

(00:43) But we are going to have a great show and we are here today joined by Michael Wu, who is the chief scientist at Lithium Technologies. Michael, how are you?

Michael:         

(00:55) I’m doing fine thank you.

Michael:         

(00:57) Well, it’s great that you’re here with us, and today we are going to learn about data and analytics. So let’s start by having you give us a sense of your personal background, and tell us a little bit about Lithium and what you do at Lithium.

Michael:         

(01:18) Okay, my background, I would say that... well, where do you want me to start? Shall I start from when I was born?

Michael:         

(01:30) Actually, we have had people on CXOTalk start in the time leading up to high school, and then high school and college, but how about your more recent work experience?

Michael:         

(01:44) Cool, okay. So I’m the chief scientist at Lithium, and some people call that a data scientist, but to me they are all the same. Basically we crunch numbers, build models, and test models to predict things. In this case we try to understand social media, social customer behaviour on different social channels, and predict their behaviour and the effect of their behaviour on the business.

(02:20) That’s in short the focus of my research, and what I do day to day ranges from prototyping algorithms and writing code to thought leadership, for example participating in this show.

(02:41) So I was going to tell you a little about Lithium. Lithium is a social customer experience platform, so we help big brands connect better to their customers, and that’s basically our mission.

(02:56) We do that in three different ways. We provide a community platform that allows brands to build a community for their most passionate customers, basically to help them become advocates of the brand’s products and to provide self-support. We also have a social media response portal, which is basically one place where you can respond to all the conversations on social media: on Twitter, on Facebook, on Google Plus, wherever those conversations may be.

(03:30) Finally, we recently acquired Klout, and that’s another platform that helps brands connect to the influencers that are relevant to their brands, and basically helps them scale their marketing efforts and create those media.

Michael:         

(03:49) So you’re the chief scientist at Lithium. What does that mean, and what does the chief scientist actually do?

Michael:         

(04:05) Well, it’s a lot of things. Like I said, a large part of it is crunching numbers, building models, and testing those models. Once we have built a model and validated that it actually works and is actually predictive... one of my professors used to tell me, you know, as a scientist you have two missions in life. One is that you have to do good research to advance the field.

(04:32) Two is that you have to essentially disseminate the knowledge and educate people. If you don’t do that, the knowledge basically gets put away on a shelf somewhere and nobody benefits from it. So that’s what I do.

Michael:         

(04:51) So as we talk about analytics a big focus of your work is creating very practical results. So why don’t we start with some examples of big data and how it’s useful and how big data can give us non-intuitive insight or understanding about what’s going on. So give us a couple of examples.

Michael:         

(05:17) Okay, so let me cite an example that is not from my work, but I think this is a really interesting example.

(05:25) So there’s a company called ZestFinance, and basically you can apply for loans there. What’s interesting is that they find that when people fill out the application form with only upper case or only lower case letters, they are more likely to default. That’s interesting, and it’s something that you wouldn’t have expected. You would probably expect it to show up in their payment patterns, their financial situation, and all of the traditional factors that underwriters look at.

(06:05) Those are the factors that affect the loan default rate, but it turns out the way a person types or fills out the form actually matters somehow. We don’t exactly know what the mechanism is yet, but we could hypothesize that maybe people who fill out the form with proper letter casing are more careful and more detail oriented, and therefore they are probably more successful, better off financially, and better able to pay off the loan.

(06:46) So maybe that’s a hypothesized mechanism, but we don’t actually know what the mechanism is. It’s possible, but this is what you can find. You can find these unexpected factors.

(07:01) So what they do is actually leverage these insights. The data is just pages and pages of loan applications, but the insight and the information derived from this data is that people actually have a lower default rate when they fill out the form with proper casing.
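To make the casing signal concrete, here is a minimal Python sketch of how such a feature might be derived from a form field and compared against default outcomes. The function names `casing_style` and `default_rate_by_casing` are hypothetical, the sample records are made up for illustration, and nothing here is ZestFinance’s actual method.

```python
def casing_style(text):
    """Classify a form field as 'upper', 'lower', 'mixed', or 'none' (no letters)."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return "none"
    if all(ch.isupper() for ch in letters):
        return "upper"
    if all(ch.islower() for ch in letters):
        return "lower"
    return "mixed"


def default_rate_by_casing(applications):
    """Given (field_text, defaulted) pairs, return the default rate per casing style."""
    totals, defaults = {}, {}
    for text, defaulted in applications:
        style = casing_style(text)
        totals[style] = totals.get(style, 0) + 1
        defaults[style] = defaults.get(style, 0) + int(defaulted)
    return {style: defaults[style] / totals[style] for style in totals}


# Made-up records for illustration only: (name as typed, did the loan default?)
sample = [
    ("John Smith", False),
    ("JANE DOE", True),
    ("mary jones", True),
    ("Ann Lee", False),
    ("BOB RAY", False),
]
rates = default_rate_by_casing(sample)
```

A table like `rates` only shows a correlation in historical data. Whether the signal is useful in underwriting still has to be tested in practice.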

Michael:         

(07:28) So you don’t know what the reason is, but you’re able to develop formulas and analyze the data in order to develop basically predictive causality. Is that a correct way to say it?

Michael:         

(07:42) Yeah, so the causality could go either way. We’re not sure where the causality is yet; that remains to be tested. But what we find is that there is a correlation. Then they actually leverage this insight and put it into action and say, okay, now I use this as one of the factors in my underwriting. You know, it’s not the sole factor, but it adds to the traditional factors that you consider.

(08:19) Then what they actually see is that they can lower their overall default rate. So that means in this case it is causal. If they took action on this and it didn’t affect their loan default rate, then we would know it’s not causal.

(08:37) So the analysis will give you correlation, and when you actually experiment with it and try it out, that’s when you are able to figure out whether it is causal or not.

Michael:         

(08:48) So the analysis, you said, gives you a correlation, but then you actually have to run the experiments in the real world in order to see if it’s causality.

Michael:         

(09:00) Yeah, actually the easiest way to figure out causality is to experiment with it. You know, traditionally in the media industry very similar things happen: companies go and buy media and don’t actually know whether it’s having an effect on their sales or their business KPI.

(09:26) What they see is a correlation: when they show ads, sales go up. But it could be the other way. It could be that people are buying your stuff and then they look at the ads to find out more about you. So the way they actually prove that it is causal is that they just shut off the ads and see if sales come back down, and then show the ads again and see if they go back up.

(09:49) So the fact that you can experiment with it, you can turn it on turn it off at any time and then your business KPI just follows that, that’s a sign of causality.
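The on/off test described here can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the helper name `on_off_lift` is hypothetical, the daily sales figures are invented, and a real experiment would also need to account for noise, seasonality, and sample size.

```python
from statistics import mean


def on_off_lift(kpi, ads_on):
    """Return mean KPI while ads were on minus mean KPI while ads were off."""
    on = [value for value, flag in zip(kpi, ads_on) if flag]
    off = [value for value, flag in zip(kpi, ads_on) if not flag]
    return mean(on) - mean(off)


# Invented daily sales, with ads switched on and off in alternating blocks
daily_sales = [100, 104, 80, 78, 102, 98, 79, 81]
ads_on = [True, True, False, False, True, True, False, False]

lift = on_off_lift(daily_sales, ads_on)
```

If the KPI keeps following the switch each time it is flipped, as it does in this toy series, that supports a causal reading rather than a mere correlation.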

Michael:         

(10:03) Now you mentioned the differences between how data is analyzed in the present and in the past. So before we go deeper into data science, maybe give us some more background or explanation about data collection and analysis in the past versus today, using modern techniques like the ones you’re involved in developing.

Michael:         

(10:30) So I think, you know, there isn’t any business that doesn’t use data. I think throughout the history of business we’ve been using data all along. It’s just that today we have much more data, and we can collect it, store it, and retrieve it much more efficiently.

(10:51) So if you want to understand how data was used before there was big data technology, let’s look at a typical scenario. Say you have a problem that you want to address, a business problem.

(11:08) Typically you start with that problem. You encounter some problem and you ask your analyst to go out and collect the data, and the data that you collect is collected specifically to address that problem, right? That’s why it is highly relevant.

(11:24) So as a result, you know, the data in traditional use is very relevant. But with big data technology such as Hadoop, Hive, and Pig, and this NoSQL technology, what it basically allows us to do is collect data before you even have a question. So you can collect data irrespective of any purpose or problem in mind. You can collect it anyway because it is cheap.

(11:57) So what that does is make the data a lot noisier, because it is not collected to solve any particular problem. So when you actually do have a problem, you now have to go and look at your big data store and basically filter out the data that isn’t relevant to the problem.

Michael:         

(12:17) So in the past you had to be very specific about the problem you were trying to solve, and therefore the data that you needed to collect. Whereas now you can be a large vacuum cleaner.

Michael:         

(12:33) Yeah, to suck up everything.

Michael:         

(12:38) Then you can analyze it, operate on it later and then you can form the questions subsequently as well.

Michael:         

(12:48) That’s right. I mean, that’s one of the advantages: you can look at what interesting problems you can address with the data that you have, as well as going the traditional way.

(12:58) The traditional way, you start with the problem, and now with data you can also do the reverse, but typically in most businesses you still start with the problem. It’s just that now you have an extra step of filtering out the irrelevant data, because some data may not be relevant to the problem you are addressing.

Michael:         

(13:19) But what about really large companies? I’m thinking of, say, Google or Facebook or, on a larger scale, the NSA. I mean, it seems that these companies and the government are just collecting everything they can get their hands on.

(13:40) So in that case it’s not deterministic at all, right? It’s just: whatever data we can get is good. Any data is good data. Is that an accurate statement?

Michael:         

(13:48) Well, whether any data is good data is dependent on the problem, so I would say that’s not completely true. I think it’s true in kind of the idealistic case, because you don’t know what problem you may have in the future.

(14:06) So for example, with ZestFinance, they did not actually ask people, did you fill out your form in lower case? They just had the information anyway. That information is available even though it is not explicit: you can look at all the application forms and see whether people filled out the forms with proper casing and capitalization, or used only upper case or only lower case.

(14:41) So that information is just available in the data, and it may have been considered irrelevant before, but later when you actually do the analysis it turns out it’s not irrelevant and it’s actually correlated. Then when you actually experiment and test it in the real world, you find it to be causal and actually predictive. So it’s actually very relevant in that case.

(15:07) So I think that you may not actually know what the data is useful for in the future, right. It may not be useful for you now, but in the future who knows.     

Michael:         

(15:19) So there is a transformation then that has to take place where you are making some broad assumptions about the type of problem that you may want to solve, and the type of data that is available to you.

(15:42) So lead us through the kind of transformation that has to take place, and the logic of the process that gets you from this large vacuumed-up mass of data to something that is useful and ultimately actionable.

Michael:         

(16:00) So today what big data technology enables business to do is really still largely at the infrastructure layer. All this big data technology that you’ve heard of, like Hive, works at the infrastructure layer.

(16:22) But what business really wants is the insights and the information that will help them make better decisions, for example whether to give this person a loan or not. So one of the surprising insights is that whether people fill out the form with proper capitalization actually helps determine whether they default or not. Those are the insights that business wants.

(16:55) So the data is actually just the application forms that are stored, and you can search and retrieve them. What that means is that big data technology today is actually not meeting the needs of most businesses yet. What it does really well is the foundation, the infrastructure layer, and what the businesses actually want is to get the insights from that data.

(17:27) So what we need is basically analytics and data visualizations that will help us crunch this data and to extract the information and insights and make them available to the business decision-makers, so they can actually take action on those insights.

(17:45) I think that you know the next phase of big data revolution will be essentially proliferation of these analytic algorithms - very general algorithms that are out there for people to take advantage of.

(18:02) For example, you could have an influence algorithm. If you give me your data, I will calculate and figure out the influencers for you, and maybe in the future there will be more of these types of algorithms. If you give me the data for a person’s tweets, I could figure out whether they are male or female, and their age, just from what they talk about and their linguistic profile.

(18:30) So there may be these types of algorithms that are provided as a service that you can use.

Michael:         

(18:36) But when I think of how most companies use data today, and when I think about the products that enterprise software companies provide, as far as I can tell, for the most part it’s collecting data and then operating on that data and looking backwards. So now you have a dashboard that, even though these software companies call it analytics, in truth is really a pretty visual story summarizing the data. You’re talking about something different than that.

Michael:         

(19:18) I mean, I think that big data has this maturity journey as well. I think people need to become sophisticated and learn how to use data and how to use these algorithms.

(19:35) So speaking about getting from raw data to the information and insights that people actually want, there are, I think, three categories or three classes of analytics that people can do to get to the insights.

(19:54) The first class is basically what you just described. These are what I call descriptive analytics, and they are usually shown in dashboards. What you see in a dashboard is a summary of what has happened. It is a summary of historical data that you have collected. So whether they aggregate it, sum it up, average it, or do some other simple computation on it, it’s a summary. Those are descriptive analytics, and I would say that 80% of the analytics that most companies do fall into that category.
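As a small illustration of what a descriptive summary amounts to computationally, here is a Python sketch that aggregates historical values the way a dashboard tile might. The function name `summarize` and the weekly figures are assumptions made up for this example.

```python
from statistics import mean


def summarize(values):
    """Summarize historical values: pure description, no prediction."""
    return {
        "count": len(values),
        "total": sum(values),
        "average": mean(values),
        "min": min(values),
        "max": max(values),
    }


# Invented example: community posts per week over the last four weeks
weekly_posts = [120, 135, 128, 150]
summary = summarize(weekly_posts)
```

Everything in `summary` is computed from data that has already been measured, which is what distinguishes descriptive analytics from the predictive and prescriptive classes.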

Michael:         

(20:40) Basically it’s a summary report, and it’s usually presented in a dashboard, a visual format. Only now we can say, well, it is also mobile responsive and it has got all sorts of fancy features, but basically you are talking about reporting.

Michael:         

(20:58) That’s right. So that’s what I call descriptive analytics. Basically you are just trying to summarize what has happened. But there are also two other classes of analytics that are really useful for people to essentially get from data to insights.

(21:16) The next class, and we have talked about this a little bit, is predictive analytics, and then the class after that is more advanced, and that’s prescriptive analytics.

(21:30) I think the maturity journey of anybody, whether it is a company or a person, usually goes like this: you always start with descriptive analytics, and then if you get enough data and become more sophisticated, you can actually build predictive analytics. And if you are more advanced, then you essentially do prescriptive analytics.

Michael:         

(21:54) So we have descriptive analytics, which is essentially pretty reporting, and is oftentimes positioned by software companies as predictive analytics even though it’s really not. Then we have predictive analytics, which tells us the future in a sense. Then we have prescriptive analytics. So can you drill down a bit into the differences between predictive and prescriptive analytics?

Michael:         

(22:24) So that’s very interesting. In some ways I would say that some people use descriptive analytics and summary reports as predictive analytics. I think in some cases they can be used that way, and these are situations where I would say the dynamics change very slowly.

(22:46) For example, if you want to predict whether tomorrow is going to rain or not, you can always say it’s the same as today, right, because the weather doesn’t change very abruptly. So if you just predict the same as today, you will pretty much get it right most of the time. It’s probably not good enough, but you’ll get pretty close.

(23:08) So I would say that the simplest type of predictive analytics is a trend line. A trend line is a predictive analytic that everyone is familiar with. You look at the data and it follows some trend, and basically you can see that if it continues to follow this trend, tomorrow or in the future it will be at this value.

(23:38) So if it gets hotter and hotter and summer is coming, or right now it is getting colder and colder and winter is coming, then you know that tomorrow is probably going to follow a similar trend. So a trend line is, I would say, the simplest type of predictive analytics.
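The trend-line idea can be sketched as an ordinary least-squares line fitted to equally spaced observations and extended one step ahead. This is a minimal Python illustration with made-up temperatures; the function names `fit_trend_line` and `extrapolate` are hypothetical.

```python
def fit_trend_line(ys):
    """Least-squares line through points (0, ys[0]), (1, ys[1]), ...

    Returns (slope, intercept). Requires at least two observations.
    """
    n = len(ys)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def extrapolate(ys, steps_ahead=1):
    """Follow the fitted trend past the last observation."""
    slope, intercept = fit_trend_line(ys)
    return intercept + slope * (len(ys) - 1 + steps_ahead)


# Made-up daily temperatures that warm by two degrees each day
temps = [50, 52, 54, 56, 58]
tomorrow = extrapolate(temps)  # the trend continued one more day
```

The value in `tomorrow` is an output of the fitted line, not a measurement, which is the distinction drawn later in the conversation.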

(23:55) But I would say that predictive analytics – I want to emphasize a point that they don’t have to predict in the time domain. They don’t have to predict I would say in the future. You can actually predict things in the past as well, but these are called the general predictive analytics. Basically, in this case you are trying to use data that you have to predict data that you don’t have. That’s what predictive analytics really allows you to do.

(24:30) So in the case of temporal prediction, when you are trying to project into the future, the data that you have is historical data and the data that you don’t have is future data, which you can never have, right? Nobody can actually go to the future and measure what happens in the future. That holds for all predictions about the future, even though they look the same as measured data. If I predict that tomorrow’s temperature is going to be 60°, I can predict that.

(25:01) That’s a number, and you can plot it on the same graph, store it, and retrieve it in exactly the same way as the measured data that you actually have. You can measure today’s data, but notice the difference here.

(25:13) Tomorrow’s temperature is not actually measured; nobody can actually measure tomorrow’s temperature. You can’t go to the future, measure that tomorrow is going to be 60°, and come back and report it.

(25:24) Tomorrow’s temperature is actually predicted. It is a result of computation, the output of some model, not something that is measured. So that’s one distinction that I wanted to draw for people to understand.

(25:41) If you understand that, then predictive analytics is really simple. It’s basically what you put into a model, and the output of the model tells you something that you don’t already know and don’t have. Those are basically predictive analytics.

(26:00) I would say in social media there are a couple of types of predictive analytics that people are actually familiar with. For example, sentiment analysis is actually a predictive analytic. With sentiment analysis, nobody actually goes out and reports that their sentiment is positive for Apple or Android or whatever. They just say, I love my iPhone, I love my new Android, and they say it in natural language.

(26:30) So the natural language is the data that we have, the input, and the model uses linguistic processing, you know, natural language processing, and a statistical model. So we build a model saying that when people use this type of language, that typically means that they have positive sentiment or negative sentiment. So it’s actually computed and has not actually been measured.
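To show that sentiment is a computed output rather than a measured value, here is a deliberately tiny Python sketch. It swaps the statistical NLP model mentioned above for a toy word-list lookup, so it is only a stand-in: the word lists and the function name `predict_sentiment` are assumptions made up for this example.

```python
# Toy lexicons, invented for illustration; a real model would be learned from data
POSITIVE = {"love", "great", "awesome", "happy"}
NEGATIVE = {"hate", "terrible", "awful", "angry"}


def predict_sentiment(text):
    """Predict 'positive', 'negative', or 'neutral' from the words in the text."""
    words = [word.strip(".,!?").lower() for word in text.split()]
    score = sum(word in POSITIVE for word in words) - sum(word in NEGATIVE for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


label = predict_sentiment("I love my iPhone")
```

The input is natural language that people actually wrote. The label is the model’s prediction about something nobody reported directly.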

Michael:         

(26:57) Okay, let’s come back to that in a moment. We have a comment from Twitter, and I’ll just tell everybody that we are talking with Michael Wu, who is the chief scientist of Lithium and really one of the top practical data scientists in the world. So this is your opportunity to ask him whatever you want, and I encourage you to ask questions.

(27:26) But we have a comment from Alan Duncan, who is an analytics analyst at Gartner. He makes a point about the type of data collection that you were discussing earlier. Alan says, if you don’t have a problem, what’s the point of collecting data? Data only exists for some purpose. He calls it horse feathers.

Michael:         

(27:54) I would say that it depends on the time horizon. If it’s a problem that’s never going to be dealt with in a year or two or five years or something like that, then you probably don’t care. But if you are talking about the NSA or something like that, their time horizon is much longer. They may have to look back to when someone was born, 20 years ago, so all this data becomes important when your time horizon is much longer. So it depends: different types of problems essentially have different time horizons and different timescales.

(28:48) For business, because business I would say changes fairly rapidly and the market changes pretty quickly, things that are too old or too outdated maybe you don’t have to keep. But for an organisation like the NSA, the problems that they deal with have a much longer time horizon and you do have to keep that data. Even though it may not have value for you today, it may be valuable 20 years later, and you don’t know that.

(29:29) I mean, the problem may not show up today, but it may arise 20 years later. If you wait until 20 years later and then realise, oh shoot, I didn’t collect that data, then it’s too late.

Michael:         

(29:43) So, time is clearly a function of relevance, or relevance is a function of time plus other factors.

Michael:         

(29:55) That’s right. I would say that it is context. Relevance is context specific. Context specific means it’s relevant to who, when, and where. For example, a few weeks ago I was travelling in Budapest, so the weather there and the traffic there were really relevant to me. It may not be relevant to you because you’re not there. But this week I came back to San Francisco, so the weather there and the traffic there are no longer relevant to me and I don’t really care about them.

(30:32) So if I have to keep track of the data, I have to solve the problem of getting around places, and I don’t have to keep it beyond a couple of weeks, because I know where I’m going to be. But after I get to places, I need to know where to get food and what the best restaurants are, so those are what I would call relevant data and relevant information for me when I’m there.

(30:58) So basically it’s the who, where, when, and what, all the context of the problem that you are trying to solve. Right, so for me, I’m trying to get there and make sure that I can get around, get to places, be on time to speak, and also enjoy the food there and all of that. So that’s the problem I am trying to solve.

(31:25) So what’s relevant and what is not relevant is determined by the problem you’re trying to solve. If this data actually addresses the problem, then it’s relevant, but what is relevant is specific to the problem. If you have the same problem as I do, then that same data will be relevant to you. If you don’t have the same problem as I do, then that data will not be relevant to you.

Michael:         

(31:48) So we were talking before about sentiment and social media and I know that in your work you’re very careful to connect the analysis that you do to what you call actionable results.

Michael:         

(32:07) So that has to do with the third class of analytics, which is what I call prescriptive analytics. Let me just give you an example of what prescriptive analytics is.

(32:22) So the simplest type of predictive analytics is a trend line, and the simplest prescriptive analytics is, for example, Google Maps. It tells you where you need to go. It prescribes a route for you to get to where you want to go; that’s in essence what it does. But remember that predictive analytics doesn’t have to happen in the time domain, right? You can predict people’s sentiment in the sentiment domain.

(32:57) Once you have a model, you can predict things in the past. You can put the Bible through a predictive sentiment classifier, and you can actually see when particular characters in the Bible are angry or happy or something like that.

(33:17) You can predict things in the past. It’s the same thing for prescriptive analytics: it doesn’t have to happen in the geospatial domain. It can happen in the business domain, where the destination or the goal that you want to get to is some business KPI. For example, I want to achieve the highest customer satisfaction or the greatest lift in revenue; those would be the business KPIs that you are trying to reach. And prescriptive analytics will prescribe what you need to do and what you need to focus on in order to get to those KPIs.

Michael:         

(33:56) So how does the business know what the appropriate KPIs are, and therefore the right type of analytics to apply? How is that determined?

Michael:         

(34:09) The business knows what KPI they measure on and it’s the same traditional business KPI. There is no difference. Basically, social media metrics – these are operational metrics, I mean the number of impressions, number of fans, number of followers – all of these are what I call operational metrics. These are not business level KPI.

(34:40) Let me step back a little bit. If the goal of the business is to have more fans, and I don’t know if any such business exists, then that would be a business KPI. But most businesses are not out there to get more fans; most businesses are out there to make more money or to save money or to serve customers.

Michael:         

(35:04) Actually, if we think about many start-ups, especially social start-ups who are not focused on revenue, their measurement is: do we have more fans? Do we have a growing user base, even though it may be free and they know it’s costing them money and they are in effect subsidizing every one of those users? So I think those use cases are real ones.

Michael:         

(35:25) Yeah, so I think in that case, during those early phases, if that’s the goal of their business, then that would be the proper KPI for them. But if your business is more mature, for example you have shareholders and you have to make sure that the business is profitable, then that would be your KPI. You’re not a start-up anymore. If you are a start-up, yes, maybe that could be your KPI for a while, but eventually you will have to shift over to the more traditional, pretty standard type of business KPI.

Michael:         

(36:13) So GoodData just tweeted an interesting chart describing different types of analytics, and in addition to descriptive, predictive, and prescriptive they added diagnostic. They say diagnostic is: why did this happen, and what insights can I gain? So it sort of sits between prescriptive and predictive.

Michael:         

(36:38) I would say that is still descriptive analytics. Descriptive analytics tells you what happened and you can use that data. Diagnostic or not diagnostic is really a use of the data. You can use predictive analytics or descriptive analytics or even prescriptive analytics to diagnose what happened in your business process or your data collection process, and operation and all of that.

(37:04) So that’s what I would call an application distinction, rather than a distinction from the perspective of how the data is being manipulated or modelled.

Michael:         

(37:19) So now let’s go back to social media. With social media we tend to focus on what some people call vanity metrics: how many people like us, and the things that feed our ego, like the things that you were describing earlier that make us feel good. But you’ve made the point that those types of metrics, and simply looking at those types of metrics, are far less important than understanding our customers and their needs and how we are going to get more customers. So maybe link all of that up for us.

Michael:         

(37:57) I think whether it is descriptive, predictive or prescriptive, the ultimate goal is to help business decision-makers take action on the analytics of the data that they see.

(38:20) So that means actionability is really important. I haven’t really defined what it is yet, but this is a term that has been thrown around, and a lot of people say that they provide actionable analytics, but what do they actually mean? So maybe I will take a little digression to discuss that before answering your question.

(38:46) So actionable is a type of analytics and is also prescriptive analytics. Prescriptive analytics, remember, has to tell you a course of action that you can take to affect the outcome. If you cannot take an action then basically it’s not prescriptive analytics.

(39:19) So with prescriptive analytics there are two things that are actually important. One is that we have to understand that prescriptive analytics is a specific type of predictive analytics. So let’s take a look at predictive analytics in the problem of temporal prediction, and say you are trying to predict tomorrow’s weather.

(39:39) So that’s a very simple problem that everybody understands, and we know that if we look at the near future, say tomorrow or two days from now, we can predict the weather in these couple of days very well. But if we look further and further into the future, let’s say a week, 10 days, or maybe a month, then your prediction gets worse and worse.

(40:03) So there is this notion of what we call a predictive window, and that means within this window the error that you make in prediction is still acceptable to you. So that’s what we call a predictive window.

(40:19) When we talk about actionability, you have to have another measure called reaction time, and that basically means the time that it takes for you to act on what you have learned from these predictions. So one of the most important criteria for actionability is that your reaction time has to be shorter than the predictive window. Let me illustrate what that means.

(40:47) With the weather prediction again, we can predict weather pretty well for, say, five to 10 days or something like that. If you try to predict further into the future, the error gets pretty big and the prediction becomes not very useful to us.

(41:13) So what’s the reaction time for us to take action on the insight we can learn from this model? If we know that tomorrow is going to rain, you just bring an umbrella, and it takes you only a day at the most to take action. So in this case the reaction time is about a day, but the predictive window is accurate up to maybe 10 days. The reaction time is shorter than the predictive window, and that means today’s weather models are actually actionable models.

(41:50) Let’s take a look at a more difficult problem. Let’s say earthquake prediction.

(41:57) So today, earthquake prediction is actually not very good, and we can only predict maybe a few seconds before the earthquake actually happens. So what is the reaction time for us to take action on the insight that we learn from this prediction? If we know that the earthquake is going to happen, it will probably take us on the order of minutes to tens of minutes to get to safety or prepare ourselves for an earthquake.

(42:26) So I would say in this case the reaction time is tens of minutes, which is much longer than the predictive window, which is on the order of a few seconds. So this earthquake model is what we call a non-actionable model, and that means that even though it can provide you with an insight, you can’t really act on it. You basically have to prepare ahead of time, to make sure that when it does happen you can act right away.
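
The actionability criterion described above (reaction time shorter than the predictive window) can be sketched in a few lines of Python. This is only an illustration: the names `Forecast` and `is_actionable` are hypothetical, and the specific numbers (a 5-second earthquake warning, a 10-minute reaction) are assumptions chosen to match the rough orders of magnitude mentioned in the conversation, not figures from a real model.

```python
from dataclasses import dataclass

SECONDS_PER_MINUTE = 60
SECONDS_PER_DAY = 24 * 60 * 60


@dataclass
class Forecast:
    """A predictive model summarized by two durations, both in seconds."""
    name: str
    predictive_window_s: float  # how far ahead the prediction error stays acceptable
    reaction_time_s: float      # how long it takes to act on the prediction


def is_actionable(forecast: Forecast) -> bool:
    # Actionable only if we can finish reacting before the window closes.
    return forecast.reaction_time_s < forecast.predictive_window_s


# Assumed values: weather is predictable ~10 days out, reacting takes ~1 day.
weather = Forecast("weather", 10 * SECONDS_PER_DAY, 1 * SECONDS_PER_DAY)
# Assumed values: a few seconds of warning, ~10 minutes to reach safety.
earthquake = Forecast("earthquake", 5, 10 * SECONDS_PER_MINUTE)

for forecast in (weather, earthquake):
    label = "actionable" if is_actionable(forecast) else "non-actionable"
    print(f"{forecast.name}: {label}")
```

With these assumed inputs the weather model comes out actionable and the earthquake model does not, which mirrors the two examples in the discussion.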

Michael:         

(43:02) So we are almost out of time, but we have another question from Twitter, which is kind of an interesting one, and it connects back to what we were talking about earlier about social media and sentiment and all of that. This is from Christopher Kelly, who says he’s wondering about your thoughts on how we can measure the value of brand awareness if it’s not measured monetarily, if it’s not about money. Are there other ways to measure the value of brand awareness using data?

Michael:         

(43:34) Yeah, I think ultimately it still comes down to money eventually, right? It’s just that it may not come in your next quarter; it may be 20 years later. Take branding: Coca-Cola has been around for so long, and look at the value of the brand there. So I think it still comes down to the monetary value eventually, but you have to look at a much longer time horizon.

(44:07) That means a lot of factors go into it beyond just awareness: people’s sentiment about the particular brand and all that stuff. I think these all go into that, but I don’t think this is a simple question to answer. I think you have to do a much longer-term analysis to see the long-term effect and understand the value of brand awareness.

Michael:         

(44:37) So we are almost running out of time again, but before we go, give us practical, simple advice, first for marketers and second for individuals who want to improve their awareness on social media, because that’s actually what everybody wants.

Michael:         

(45:12) Well, I would say for business, you should definitely start out with some business problem. As somebody tweeted, when you don’t have a problem, the data is actually not providing value, and that is true. But you don’t know what problem you may have in the future. So the problem is actually very important.

(45:37) So if you want to have a big data initiative or a big data strategy, first identify some problems so that the data that you collect has more immediate value. Certainly, data has a long shelf life, and it may prove valuable 10 or 20 years down the line. But you don’t realise that value until you actually address a problem 10 or 20 years down the line, so that is what I would call latent value, and you don’t realise it yet.

(46:09) Then you basically have to see what kind of attributes, what information and what insights you can get from this data that are actually predictive of these business-level KPIs, whether it is revenue lift or brand awareness. In the long run it still comes down to value.

(46:34) So for individuals, I would say that there is lots of data out there about people’s consumption of social media or participation in social media. And you can actually look at that data.

(46:48) I actually write a blog myself, and for me, I’m very interested in how much I should write and how frequently I should write to get the biggest bang for my effort. If you are a company, that may not be the problem. Certainly if you write more you are going to get more page views, but I am resource constrained and I don’t have a lot of time.

(47:18) So I typically like to look at metrics like output per unit input: how many page views I get per blog I write. Certainly if you write more blogs you’re going to get more page views, but you get a diminishing return when you write more.

(47:39) If you write one every minute, if somebody could actually write one blog every minute, a lot of those blogs will just get pushed away and they won’t get read. So you may get more page views overall, but you may not actually get more page views per blog.

(47:59) But if you don’t write frequently enough, then certainly your readership will drop off as well. So there is some kind of middle ground where you will get the most page views per blog that you write.

(48:17) So what’s the right frequency for that? For my own blog I’ve actually done some analysis, and it turns out to be somewhere between 11 and 12 days. That’s the frequency I should publish at to get the most views per blog.

(48:34) You can do some simple types of analysis like that for yourself so that you maximize your social media efforts.
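
As a rough sketch of the kind of simple analysis described above, the Python below computes page views per post for several publishing intervals and picks the interval with the highest value. The numbers in `observations` are invented purely for illustration (they are not Michael’s blog data and will not reproduce his 11 to 12 day result), and the helper names `views_per_post` and `best_interval` are hypothetical.

```python
# Hypothetical observations over equal-length periods:
# (days between posts, posts published, total page views)
observations = [
    (1, 30, 3000),
    (3, 10, 1800),
    (7, 4, 1000),
    (14, 2, 560),
    (30, 1, 150),
]


def views_per_post(posts: int, total_views: int) -> float:
    """Output per unit input: page views earned per post written."""
    return total_views / posts


def best_interval(rows) -> int:
    """Return the publishing interval (in days) with the most views per post."""
    best_row = max(rows, key=lambda row: views_per_post(row[1], row[2]))
    return best_row[0]


for interval, posts, total in observations:
    ratio = views_per_post(posts, total)
    print(f"every {interval:>2} days: {total} total views, {ratio:.0f} per post")

print("best interval (days):", best_interval(observations))
```

In this made-up data, total page views keep rising as posts get more frequent, while views per post peak at an intermediate interval. That is the diminishing-return pattern the analysis is meant to expose.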

Michael:         

(48:43) Okay, well this could go on for a long time and I want to thank you. We’ve been talking with Michael Wu, who is the chief scientist of Lithium Technologies, and I hope I’ve understood a fraction of what you have told us today. Thank you so much Michael for joining us, it’s been a very insightful conversation and I really appreciate all of the tweets and the questions that we had as well. I hope you will come back and join us again Michael another time.

Michael:         

(49:15) It’s my pleasure. I’d love to come back sometime.

Michael:         

(49:19) And everybody who has been watching, thank you so much and to my friendly co-host, Vala Afshar, wherever you are out there in the world I hope it’s going well and we’ll see you again next week, and everybody will see you again next week, same time, same place on CXOTalk. Thank you so much, bye-bye.