Join CXOTalk for an exclusive interview with Neema Raphael, Chief Data Officer and Head of Data Engineering at Goldman Sachs, as we explore the data revolution in finance, digital transformation strategies, and the future of the industry.
Data and Analytics at Goldman Sachs
In this episode of CXOTalk, we host Neema Raphael, the Chief Data Officer at investment bank Goldman Sachs, who explores the importance of data strategy in the financial services industry.
Raphael explains Goldman Sachs' three-pronged approach to managing data: through platform development, content curation, and governance and quality control. The conversation includes a discussion on the challenges and opportunities that come with handling complex, real-time data in a high-stakes environment such as financial services. He also highlights the firm's commitment to open-source platforms, notably its platform called "Legend," as a tool to promote data interoperability throughout the industry. During the conversation, Raphael offers advice on developing a successful data strategy and discusses the role of cloud technology in modern data management.
The conversation includes these topics:
- Why data is important in financial services
- Goldman Sachs’ approach to managing data
- Challenges and opportunities of managing real-time data at scale
- Managing data complexity and the role of technology infrastructure
- Why Goldman Sachs open-sourced its data platform
- How Goldman Sachs assigns financial value to data
- Ensuring data quality at scale
- Importance of infrastructure, automation, and platform in managing data
- Implementing data policies and governance
- About Goldman Sachs’ data team
- Promoting data interoperability through open-source platforms
- The challenge of building an open-source community
- Balancing data governance and innovation: insights from the financial industry
- Advice from Goldman Sachs on building a successful data strategy
- The role of cloud computing in data strategy and its impact on business outcomes
Neema Raphael is Chief Data Officer and Head of Data Engineering at Goldman Sachs. Previously, Neema was head of Research and Development Engineering, responsible for determining the firm's strategy for emerging technology such as digital assets and artificial intelligence. Prior to that, he led various engineering teams in Core Engineering and was a member of the Core Strats team within the Securities Division that built SecDb. Neema joined the firm in 2003 as an analyst in the Technology Division and was named managing director in 2013.
Michael Krigsman: Today we are discussing data at Goldman Sachs. We're speaking with Neema Raphael, the firm's head of data engineering and their chief data officer.
Neema Raphael: I've been at Goldman about 20 years, always as a software engineer, data engineer, or strat (as we call quants here). Now, as you mentioned, I run our data engineering team and serve as chief data officer for Goldman.
Data engineering at Goldman is probably a little bit different than other places. I sort of organize my brain and my team in sort of three buckets.
We have our platform team, which hopefully I'll have a little bit more time to talk about, and our curation, our content and curation, team. And also, as the chief data officer with my chief data officer hat on, sort of our governance and quality team. That's sort of the background of the work we do.
Why data is important in financial services
Michael Krigsman: Data is so obviously important to financial services, but it would be really interesting to hear your perspective on the role of data, how data fits into the world of Goldman Sachs.
Neema Raphael: Information is the lifeblood of financial services. People say, "Well, what is that?"
I think, in financial services (and at Goldman, of course), information is actually our currency. So, a lot of decision-making, a lot of helping our clients, a lot of innovation is all based on what information we see out in the world and what information we see internally. Organizing that, making sure it's readily accessible, making sure people could get to that data quickly to do their work or to help their clients starts becoming sort of like a core component of doing business at Goldman Sachs.
Goldman Sachs’ approach to managing data
Michael Krigsman: You mentioned the various aspects of your role, how you break things up. Do you want to dive into that and tell us a little bit more about that?
Neema Raphael: The first thing I think about data is that it has to be a first-class asset at Goldman. People have to believe that it could help solve problems, help our clients, help us innovate in financial services.
The first thing we sort of did is, okay, how do we want to work with data at Goldman? Again, as an engineer, that always pushed me to sort of think about data as an engineer would think about their code.
The analogy I always say – or maybe it's not even an analogy – but the way I always talk about data is you've got to think about data just like you were thinking about your code. That sort of has pushed us to build a platform team that really thinks about that, thinks about data engineering as an engineering function, thinks about the workflows of data engineers, how we could help get leverage and scale to our businesses so that they could start making decisions better, faster, cheaper, easier. And so, the platform team (on my team) is really laser-focused on that.
How do you make data engineering workflows just seamless, beautiful, and all the things that developers are used to? IDEs, code completion, making sure that we have reproducibility of our data like we have of code. Their first-class role is to think only about how we make data, and working with data, awesome.
Then we have a content team. Again, like I said, a lot of our workflows at Goldman run on data, so you could imagine part of our team is a real-time market data team, so all the data streaming from exchanges or different venues and things like that in real-time, super low latency.
We're talking about millisecond latency here to get to trader screens or to get into the algorithms. And so, there's a whole team of mine, which is in the curation business and the content business, whether it's real-time market data or things like reference data that we use like who are our clients or what products can we trade, and things like that. And so, that team is really, I'd say, sort of a data engineering bent team that uses our platforms to sort of do their work to make sure the data is amazing quality-wise, accessible, and ready to go.
Then the third bit is really about the framework that we do data governance and data quality around. A little bit around policy and frameworks, but then how do we push our platform team to then build in those mechanisms or controls on the platform so when you do your work at Goldman, you sort of get those for free, I'd say, or complimentary.
That's sort of a little more about the three buckets we talk about here.
Challenges and opportunities of managing real-time data at scale
Michael Krigsman: What I find particularly interesting is that, unlike most businesses, which of course have their business data, you're dealing with huge volumes of real-time data, with enormous amounts of money and risk at stake. I have to imagine that that places an additional kind of intense burden on you as you're thinking about this whole technology landscape.
Neema Raphael: I would sort of flip it in a positive. We don't think of it as a burden. We think of it as a real opportunity. That I think is the first sort of mindset shift you have to think about when you're in these sort of high-stakes games.
Yeah, of course, it's high stakes. Of course, things could go wrong. But really, it's more about the opportunities to make sure that, again, we're helping our clients do the right thing. We're helping the economy. We're helping the whole financial system do the right thing. We take pride in that instead of a sort of burden thing.
One thing I'd say is I actually liked that you said huge amounts of data. I actually say, at Goldman Sachs, we're in the sort of medium-sized data but very complex data world.
Our data is super complex. The types of data are complex. The speed is complex. There's everything from super real-time low latency all the way to end-of-day batch processing.
The relationships between our data are complex. The product catalog is complex. It's not just SKUs of whatever. It's actual financial instruments that people have made up. The complexity is really the interesting and cool challenge here versus, I'd say, the volume.
Managing data complexity and the role of technology infrastructure
Michael Krigsman: What makes the data complex? You were just describing a little bit. Can you drill into that?
Neema Raphael: The complexity, again, comes from various angles. Like you mentioned, the complexity comes from the speed is one dimension, but also the relationships of the data. You could imagine there's data about the stocks and bonds that our clients want to trade for one example, then there's a whole infrastructure around those.
I don't know how many people here are familiar with financial services, but there are derivatives on those products, so now you're not just talking about stocks and bonds. You're talking about really complex algorithms plus complex data elements, with relationships to those stocks and bonds, that people have made up in the world. The layering of this data, and the complexity that comes from people having such creative ideas to help our clients, starts pushing the boundaries of how this data is related and interconnected and curated.
Hopefully, that gives a little bit of a sense.
Michael Krigsman: You are managing these very large volumes of real-time data. Do you want to tell us at all about your technology infrastructure? It's not a question I usually ask, but it seems that, in this case, given the equities, the derivatives, everything else that you had just mentioned, it seems the technology infrastructure has to play a very important role.
Neema Raphael: The infrastructure absolutely plays a huge role here. We have an incredible infrastructure team. And you could imagine, again, that this goes all the way down to sort of the hardware layers, like the networking, the stack, the computers you use, the network cards in the computers you use. All of this stuff sort of actually matters at this sort of latency and scale.
Again, I don't want to super over-index on the real-time stuff. The real-time stuff is very important, a very critical and important part of our world, for sure. And the team does an amazing job all the way from the hardware layer up.
But we also have data challenges just as the real-time data comes in when you do a trade. Right? That also starts being complex, and so we have built sort of a data platform here we call Legend, and we've actually open-sourced it recently (in the last couple of years).
We've built this tech stack over the last ten years, internally. About two years ago, maybe three now, we open-sourced it. It's fully on GitHub. You could check it out. We gave the code to a nonprofit, an open-source nonprofit called FINOS. Check it out on GitHub. I'm happy to talk more about that as well.
Why Goldman Sachs open-sourced its data platform
Michael Krigsman: Why did you open-source it? At first glance, it seems like an odd decision to me because one would think that a firm like Goldman Sachs would want to keep all of that infrastructure to yourself.
Neema Raphael: There are a couple of reasons. One is, as we were building this, we talked to a lot of clients who were having exactly the same data challenge as we were.
You know we're obviously a very client-centric firm, so we thought, "Look. This has helped us so much internally. This platform has helped us so much internally. We'd love to give it a chance to help our clients, to help push the industry forward as well."
And so, you also have to separate a little bit about the platform itself versus the content that we curate and work in the platform. We haven't yet – or maybe never will – open-source the actual content, but we could talk about that as well.
But the platform and the work we've done to sort of standardize how we work with data, we thought that was so powerful that actually giving it out to the community and building a community around that could actually help the whole industry and our clients and ourselves.
Michael Krigsman: You are wanting to share best practices, approaches to working with data, things like that with the community, and then, of course, everybody is going to have their own data content within those constructs.
Neema Raphael: That's exactly right. The interesting part is, even if we don't give out the content, we are working with industry-standard bodies to even describe the data and structure the data and the linkages to the data.
It's a really cool and powerful technique working with these standardization bodies on questions like, what does a derivative look like? Even if our data looks slightly different than our clients', at least we could standardize on how we talk about those things. That's what the Legend platform really excels at.
Michael Krigsman: It's really a contribution to the broader data science, data and analytics community, essentially, ultimately.
Neema Raphael: Absolutely. Absolutely.
I think the platform is a general-purpose platform for working with data. Then we specialize on some standards and things like that, which is, I think, another contribution to the financial industry in general: how do you get to that interop and standardization of at least data contracts, or how we talk about different terms and relationships of different pieces of information.
Michael Krigsman: Please subscribe to our YouTube channel and hit the subscribe button at the top of our website.
We have a number of questions on Twitter, so why don't we jump over to some of those? We have an interesting question from Arsalan Khan. Arsalan is a regular listener, and so thank you, Arsalan, for listening and for this great question.
How Goldman Sachs assigns financial value to data
He says, "Given that data is such an important asset, how do you assign the financial value to that data? Who is it that decides what data has financial value and how much it's worth?"
I'll just add to that. I'm assuming that attachment of value is one of the things that guides your priorities in terms of where your team focuses.
Neema Raphael: I run our core data team, so what we do is work with our business lines on what they need for their clients, to serve their clients better, what they need for their business to scale, what they need from us to help them sort of get an edge or innovate in the data realm.
A lot of it is about the value of the overall outcome versus just ascribing value to some piece of data. And so, the way I think about, from a core team, helping people is really what is the outcome we're trying to drive for our clients and for our businesses and for innovation, and really look at that holistically.
Again, I think it's a little bit of a misnomer or mistake to sort of try to say, "Okay, this data set creates this much value. This creates that much value."
It's like, "What did we do for that business and for our clients?" Really we take those wins as platform and data wins collectively with those teams.
Michael Krigsman: That makes perfect sense because the point is not some piece of data or body of data. The point is what are we doing with that data.
Neema Raphael: Exactly.
Michael Krigsman: And as you said, what are the resulting business outcomes? Then you have a framework for valuing the data because we're trying to get to the outcome.
Sorry. I didn't mean to answer for you. [Laughter]
Neema Raphael: No, no, no. That's exactly right. That's great. That's exactly the right summary, and that's exactly how we do ROI, our return on investment on the work we do.
We actually work hand-in-hand with those teams and say, "Okay, were we able to reduce risk by X?" or "Were we able to help our clients do Y better?" or faster, or get them into a better position for Y, or "Did we enable the business to do some new thing that they were not able to do?" That's really the ROI calc at that level.
Ensuring data quality at scale
Michael Krigsman: We have another question from Twitter, another great question. This is from Natalie Bean who says, "How do you balance data quality and data quantity?" We haven't even spoken about data quality yet, so this is a great question.
Neema Raphael: When you think about data as a first-class concept, when you do the same things with your data that you do with your code (like think about your data architecture upfront, think about how you structure your data, how it relates to other pieces of data) and you do that work upfront, we have seen that, yeah, the upfront work takes a little bit longer, but the huge benefits of that become apparent as you're trying to scale these things.
We deal with it. We deal with the scale and the volume and the complexity like any engineering org does. We've built tools. We've built platforms. Then we make sure that those scale to those problems.
Now when we do data at Goldman Sachs, it's like we do it on the platform and everyone knows, "Okay, now the quants are going to get their data in days instead of months because we have set up the right platform constructs and engineering constructs for that."
Again, it's not a perfect silver bullet answer of, like, "As things grow, there's some equation," but we have seen that the investment in the platform and the tools and the workflows, that's the thing that helps us scale.
Importance of infrastructure, automation, and platform in managing data
Michael Krigsman: Again, I find it really fascinating that the platform and, as you said, the workflows play such a crucial role. But I suppose it's entirely logical that when you need that data (when that data is so important, and it needs to be right, and it needs to be consistently right) that you need that infrastructure and the automation to make it happen at that level of quality.
Neema Raphael: That's exactly right. This is my personal view coming from an engineering background. I solve scale problems and volume problems and complexity problems by breaking it down into engineering steps and engineering platforms. And so, that's sort of been our ethos of this whole Legend platform is really about that, how to attack that problem from an engineering lens.
Implementing data policies and governance
Michael Krigsman: We have another excellent question from Twitter. This is from Lisbeth Shaw who is now alluding to governance, which you mentioned earlier but we definitely should talk about that. She says, "How do you build in the mechanisms that support data policies?"
Neema Raphael: The point of the platform isn't that it just magically makes these things go away. It's that it makes you think of them upfront. It makes you think of these concepts as you're doing your data design and your data work.
Really, when you're on the Legend platform thinking about your data, the first thing you think is, what is your data contract? That's actually the first bit of the workflow.
Describe your data. Describe how you want to publish it. Describe how you want other people to see it or consume it. How you want to track things like lineage. All of that is sort of built in, baked in as a first-class concept.
Again, I don't want to oversell the thing. You still have to think about it. You still have to do it. But the point is bringing it up front instead of hiding it as some secondary thing.
You have to think about the entitlements, the security, the encryption. All of that you think of as sort of first-class assets, first-class concepts, as you're designing your data flow or your data workflow or your data production or data consumption patterns. All of that sort of comes together in the platform.
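To make the "data contract first" idea above concrete, here is a minimal Python sketch of what describing a data set up front might look like. It is purely illustrative and is not Legend's API or modeling language: the names `Field`, `DataContract`, and `validate`, and the example data set, are all hypothetical. The only point it tries to show is that the schema, ownership, intended consumers, and lineage are declared before any data flows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One column in the (hypothetical) contract."""
    name: str
    type_: type
    required: bool = True


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract: schema, owner, intended consumers, upstream lineage."""
    name: str
    owner: str
    fields: tuple[Field, ...]
    consumers: frozenset[str] = frozenset()
    upstream: tuple[str, ...] = ()  # lineage: data sets this one is derived from

    def validate(self, record: dict) -> list[str]:
        """Return a list of contract violations for one record (empty if valid)."""
        errors = []
        for f in self.fields:
            value = record.get(f.name)
            if value is None:
                if f.required:
                    errors.append(f"missing required field: {f.name}")
                continue
            if not isinstance(value, f.type_):
                errors.append(
                    f"{f.name}: expected {f.type_.__name__}, got {type(value).__name__}"
                )
        return errors


# Illustrative contract for a made-up "trades" data set.
trades = DataContract(
    name="trades",
    owner="core-data",
    fields=(
        Field("trade_id", str),
        Field("notional", float),
        Field("trader_note", str, required=False),
    ),
    consumers=frozenset({"risk"}),
    upstream=("exchange_feed",),
)

good_errors = trades.validate({"trade_id": "T1", "notional": 1_000_000.0})
bad_errors = trades.validate({"trade_id": 7})
```

In this sketch a record that breaks the contract is reported at publish time rather than discovered later by a consumer, which is the "bring it up front" behavior described in the answer above.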
Michael Krigsman: It's fascinating, again, to me because it seems that your emphasis is definitely placed higher on that infrastructure and platforms than other chief data officers that I've spoken with. But at the same time, you're dealing with a level of data complexity combined with the speed and the financial consequence associated with it.
Neema Raphael: That's right.
Michael Krigsman: I think few other companies would have that combination of circumstances.
Neema Raphael: Exactly. The quality, the data has to be right. It has to be consistently right, as you said. And the ramifications of that are pretty big.
Michael Krigsman: We have another question relating to governance, and this is again from Arsalan Khan. He says, "How do you decide what data is good to use, what is not? How do you address biases in the data you're collecting and using?" And this is interesting. He says, "Do the business lines agree with your conclusions?" So, how do you also get everybody on the same page around this stuff?
Neema Raphael: We are not some isolated team in the corner with pointy hats doing this in isolation. I think that's the first, most important thing to get. We are hands-on-keyboard together with the businesses making sure that the data we're using, first, is right, but right for them and right for the use case, and really is solving their problems.
The first point I would make is aligning ourselves with business outcomes and with the businesses is the first thing, so not being in some isolated backroom like, "Oh, we know best about everything, every piece of data."
Again, the financial domain is so complex that we would never even pretend to do that. We have people who are highly skilled in financial areas so that we can make those decisions with our businesses in a joint venture fashion, but we would never say we're the experts in everything.
I think the first piece is that you have to be connected to the business and the business outcomes. Again, the quality and the governance then become an aligning incentive and a sort of joint venture.
There's a healthy tension, of course. We want to do things at scale and they want to sort of solve the problem immediately. But the point is, again, that bringing these two teams together really helps accelerate that and then get to the right answers and the right data.
Michael Krigsman: This then leads to the question about the composition of your team. It's obvious that you have very deep technology and data expertise. But as you just alluded, in order to do your job, again, I have to assume that you need equivalent financial depth of expertise, especially when you go into concepts like derivatives, as you were describing earlier.
Neema Raphael: We do have a team of strats or quants in the maybe more financial world who have come from being on the desk, understanding how data is used on the desks, and have sort of a STEM background, whether it's technology, math, physics, whatever, who are equivalent counterparts on my team that work with these groups and actually understand the financial domain but cross over to the technical domain. That's what we call our data design and curation team. Pretty cool, high-powered crew that really is bound to those business flows.
About Goldman Sachs’ data team
Michael Krigsman: Those folks, which team are they a part of? Are they part of your organization or part of the finance or trading organization? Where do they actually fit?
Neema Raphael: In our team, we have a subset of that team, but those sorts of people are also embedded in the businesses as well. There are what we call embedded desk strats or embedded quants in various businesses. But then I have a sort of small selection of that team that works specifically on data with those other business teams – if that makes sense.
Michael Krigsman: What are the elements that comprise a successful data team such as you have?
Neema Raphael: It really follows sort of the structure of the team, like the platform team is high-powered software engineers. They come from a software background. Their goal is to make the software bulletproof and build the right workflows for data engineers.
Then the second bucket is basically data engineers, people who are content experts but can use the platform, configure the platform, build data pipelines, build curation pipelines, build data models that then get shared out.
Then the third bucket is really this sort of hybrid. Again, I'll use the word strat because that's what we use internally. This hybrid team that really straddles deep finance and deep tech together.
Then the fourth is sort of our governance framework team. They're the people who sort of set the policies, set the framework, and work with the divisions to sort of make sure they're working in the bounds of our framework.
Promoting data interoperability through open-source platforms
Michael Krigsman: One of the topics that we have not really touched upon is the notion of data interoperability. I know that's important to you, so can you tell us about that?
Neema Raphael: We touched a little bit upon it. That's a big, big reason we open-sourced our Legend platform was exactly for that.
We felt that if we could bring some platform standardization in the industry for us, our clients, our counterparties, that then at least the discussions about how data works or how it should be connected or how it should be described in the industry, that we could help push that forward. We could help push that forward with standards, with other bodies. They could all do that work in one sort of sane way in our platform – well, it's not even our platform – in the open-source community platform, and that will benefit everybody.
It's absolutely a big, big play. I'll mention one project we're doing in the FINOS community, which is the open-source nonprofit that we gave Legend to.
We brought in ISDA, which is the derivatives standards body. They have built a data model for derivatives called CDM (Common Domain Model). That data model is now available in Legend for people to collaborate on in a collaborative environment out in the wild.
It has nothing to do with Goldman Sachs. They're just using the platform. So, these are pretty cool things that we're trying to do to push the interop and standardization out.
The challenge of building an open-source community
Michael Krigsman: How has the uptake been in the broader community of this platform and data interoperability?
Neema Raphael: The cool part is that our code is out there. We have clients actually deploying it, using it. We have other counterparties also looking at it doing POCs in their own environment.
The flipside of that is building an open-source community is very difficult. We vastly underestimated how much time and effort and energy goes into making sure people know how to use the thing. The documentation. Do they understand the value? Can they contribute code back? All of these things.
We knew it would take time, but I think we sort of grossly underestimated how much time to build that big community around these projects.
Michael Krigsman: What are the benefits that you have presented to the community as to why they should engage in this?
Neema Raphael: It has helped us do data better at Goldman. It has helped us organize our data. It has helped break down silos. It has helped the data quality massively. It has helped the data governance aspects massively internally, and so we point to sort of those success stories about how we do it internally that we could also help clients do the same thing. And if there's appetite, of course, that we could also help the broader industry break down those silos as well.
Michael Krigsman: We have another couple of questions that are coming in from Twitter. Do you think of your data or your platform as a product that you can sell, given that you've built all of this infrastructure?
Neema Raphael: I will say that's not the intention of doing it. I think, though, again, we have seen that as clients use it, as other banks use it, that people sort of come back to us and ask, "Okay, well, what could the support model be?" or "Hey, can you host this as a SaaS for us so that we don't have to deal with setting up the infrastructure or running the infrastructure?" or things like that.
Again, just not to oversell anything, I think we're still in the very crawl stages of even thinking about that community and how we could help do better and better. But our intention isn't to make money off of this thing. It's really about the standardization parts.
Now, if people see a large adoption and see a value, I think it could be an exciting potential opportunity.
Balancing data governance and innovation: insights from the financial industry
Michael Krigsman: We have another question from Twitter. Can you give any examples where using the platform has resulted in innovative ways for users to use or combine the data?
Neema Raphael: We sort of took, again, a data-driven approach to helping our firm and our risk managers and our salespeople and our traders understand that risk from a data-driven perspective. We have combined all that data and information into our platform and have basically – I hate to use this word – democratized the access to that.
Our team has linked a bunch of that data together and done some interesting analysis. But more importantly, we have put those base sets of data, and the relationships between those data that may not be obvious, into our users' hands so that now they have access to that information. Now they could come up with creative things or risk-mitigating things on top of that data. I think that's been maybe one that resonates pretty recently here.
Michael Krigsman: It's kind of the reverse of shadow IT that CIOs didn't like years ago. I'm going back maybe five years. You're actually taking the data and, I assume, the tools, and placing it in the hands of users so that they can be creative, they can innovate with that data.
Neema Raphael: Exactly, but in a fully governed, quality way where that shadow IT problem isn't really a problem. We have put all the right guardrails and all the right governance around that so now people can actually innovate in a safe space, in a safe way, but it's not just locked to our team.
Michael Krigsman: Can you give us some insight into what kind of governance you use, or how you balance the need for security and privacy against making that data available, accessible, and easy to use for folks?
Neema Raphael: Before we do anything with data, we have, first of all, very strict policies and frameworks about understanding who is allowed to see what data. And even if you're allowed to see it, should you see it?
We call it "need to know." Just because you're allowed to see it, maybe you don't actually need to know that client information.
First of all, as an overarching theme, I'd say one of the most critical pieces is writing those policies down and actually then enforcing them in the platform. But before we do anything, that is a clear thing.
Then we have very clear rules about who can get access to what, how, and then again, to be a little broken record, encode those rules in the platform and make sure that that is the basic layer.
If we get that wrong, nothing matters. So, the first thing is to make sure that that baseline works before giving access to anything.
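The two-step rule described above (being allowed to see data is not enough; there must also be a need to know) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Goldman Sachs' or Legend's implementation: the `User` type, the `can_access` and `read` functions, and the data set names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    entitlements: frozenset[str]  # data sets the user is allowed to see
    need_to_know: frozenset[str]  # data sets the user's current work requires


def can_access(user: User, dataset: str) -> bool:
    """Grant access only if the user is entitled AND has a need to know."""
    return dataset in user.entitlements and dataset in user.need_to_know


def read(user: User, dataset: str, store: dict) -> list:
    """Single gate in front of the store: every read passes the policy check."""
    if not can_access(user, dataset):
        raise PermissionError(f"{user.name} may not read {dataset}")
    return store[dataset]


# Made-up data and users for illustration only.
store = {"client_positions": [{"client": "C1", "position": 100}]}

analyst = User(
    name="analyst",
    entitlements=frozenset({"client_positions"}),
    need_to_know=frozenset(),  # entitled, but no current need to know
)
coverage = User(
    name="coverage",
    entitlements=frozenset({"client_positions"}),
    need_to_know=frozenset({"client_positions"}),
)

rows = read(coverage, "client_positions", store)

try:
    read(analyst, "client_positions", store)
    analyst_denied = False
except PermissionError:
    analyst_denied = True
```

Routing every read through one function is the design choice worth noting here: the written policy lives in `can_access`, so it is enforced at the baseline layer rather than re-implemented by each consumer.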
Michael Krigsman: In other words, as long as the foundation of governance, risk control, compliance with regulations, whatever is necessary, as long as those elements are in place, then you're able to share the data and let people – I was going to say – have free range. Obviously, that's not the case, but have enough rein where they can use that data creatively in the service of whatever their business goals happen to be.
Neema Raphael: Much better said than I said it, but exactly. Exactly the right mental model.
Advice from Goldman Sachs on building a successful data strategy
Michael Krigsman: What advice do you have for folks who are building a data strategy (based on what you've learned and done at Goldman Sachs)?
Neema Raphael: Make sure you attach yourself to business outcomes, the things that people care about at your company. As technologists in general – maybe I'll make a broad generalization – sometimes we care a lot more about the tech than the outcome.
But specifically for a data strategy, I think it's even more important to over-index on the outcomes because data sort of becomes this nebulous thing where now, like, "Well, what does it mean to have a data strategy? What does data even mean? Why do I need that? In the abstract, I get why I want information. But what are you talking about?"
The first, first thing I always talk about is to attach yourself to business outcomes and show how the data and the data strategy actually make those cheaper, faster, better, easier. It makes money for your clients. It helps reduce risk. It helps save money. That's the first sort of baseline advice I give to everybody.
Below that then becomes sort of, okay, set out what is the platform strategy going to be. How are you going to actually make the engineering work? Think about the business workflows and the developer workflows.
Then we talked a little bit about the org structure and the framework. I think those are also key pieces.
Make sure that if you're going to do an engineering strategy, you have a strong platform team. If you're going to have a content team, make sure they actually understand the domain that you're working with. And so, bringing those org pieces together.
But the number one key is to make sure you're driving outcomes.
The role of cloud computing in data strategy and its impact on business outcomes
Michael Krigsman: What's the relationship between your data strategy and cloud? Where does cloud fit into all of this? I can't believe we haven't spoken about that.
Neema Raphael: I think of the cloud again as a tool in the toolbox. It's not an end in itself. It's in my arsenal for how I want to build my platform.
Yeah, I want infinite scalability. Other people have done the hard work on the infrastructure front. I want great databases that people have built, to not rebuild that on my own.
To me, the cloud is a great enabler. It's a great tool in my toolbox. It's a great way to get the scale. It's definitely a big part of the overall data strategy.
But again, just to be clear, I just want to make sure, like, it's not a thing for the sake of a thing. It's like, okay, these are great components, great pieces of infrastructure, so that I can now stand on the shoulders of giants and execute. That's the way I think about it. It's a super important part of the strategy, just to be clear.
Michael Krigsman: I can see that you're very purpose driven. You are always coming back to the reference point, "Why are we doing it? What are we doing? What are we getting out of it from the business outcome standpoint?" That's very, very, very clear.
Finally, one last question. What is the relationship between data and building business models, economic models, financial models? How do those pieces fit together?
Neema Raphael: In technology, it comes down to algorithms and data. Those are the two big inputs.
You need the data to be great. You need it to be clean. You need it to be organized. You need to make sure all those pieces are set. It's easily accessible. It's findable. It's governed.
Then that becomes a major input into the algos, whether you're doing forecasting or algorithmic trading or helping your client with something. Those pieces just have to work together as a team to get the job done.
Michael Krigsman: Again, I'm putting words in your mouth, but to summarize what you've just been talking about: it's the linkage to the outcome and being clear about what data is going in, what the expected result is at the other end, and ensuring that the two match up.
Neema Raphael: Yep. Yep. Exactly.
Michael Krigsman: With that, unfortunately, we are out of time. Neema, I just want to say a huge thank you for spending time with us. I really, really appreciate it.
Neema Raphael: Thank you. Thanks for the great questions. Thanks to the audience for amazing questions. I had a really fun time talking to you, Michael. Thanks for having me.
Michael Krigsman: Everybody, thank you for watching. I just want to say a huge thank you to Neema Raphael. He is the head of global data engineering and the chief data officer of Goldman Sachs.
Now before you go, please subscribe to our YouTube channel and hit the subscribe button at the top of our website. Actually, you know, the subscribe button has moved to the bottom of our website, so hit the subscribe button at the bottom of our website so we can send you our newsletter and keep you up to date on our upcoming live shows.
Thanks so much, everybody. Hope you have a great day, and we'll see you next time.
Published Date: Apr 28, 2023
Author: Michael Krigsman
Episode ID: 785