View Recording: Responsible AI: Data Governance
June 20, 2024

How Responsible AI Data Governance Drives Real Business Value

Is your organization ready to scale AI safely and effectively? Responsible AI data governance is the foundation for unlocking value from Microsoft Fabric, Purview, and advanced AI solutions. In this webinar, Concurrency’s experts reveal how to prepare, certify, and govern your data for AI, ensuring compliance, agility, and trust. Learn practical frameworks and see how leading companies in Chicago, Milwaukee, and Minneapolis are leveraging Microsoft and ServiceNow expertise to operationalize AI with confidence.

WHAT YOU’LL LEARN
In this webinar, you’ll learn:
How to frame your AI and data governance strategy for maximum business impact
Steps to move from raw, unstructured data to certified, trusted datasets
How to operationalize RAG (Retrieval Augmented Generation) and vector databases for enterprise AI
Practical use of Microsoft Fabric and Purview for data lineage, labeling, and compliance
Building effective guardrails for AI safety, groundedness, and access control

FREQUENTLY ASKED QUESTIONS

How do we choose and certify datasets for AI use across departments?
Start by identifying value-driven use cases, then define certified datasets with clear ownership and schema. Use tools like Microsoft Fabric to ensure ongoing data lineage, and Purview for labeling and access control. This process builds trust and enables cross-departmental AI adoption.

What governance controls should wrap prompts, inputs/outputs, and models in RAG apps?
Apply three layers: respect data access permissions, set strict input/output controls at the application level, and implement model guardrails to prevent unauthorized or unsafe responses. Microsoft’s latest tools help automate these controls for enterprise AI.

How do Microsoft Fabric and Purview work together to manage lineage, labels, and access?
Fabric provides the platform for ingesting, certifying, and sharing data, while Purview overlays security, compliance, and labeling. Together, they enable seamless data flow, clear lineage, and robust governance across your Microsoft ecosystem.

What steps convert raw, messy ERP/doc data into cleansed, retrievable knowledge?
Ingest data into a modern platform, cleanse and structure it, then certify datasets for use. Use chunking and vector databases for unstructured content, and maintain lineage and attestation for trust. This ensures your AI is powered by accurate, up-to-date information.

How does groundedness detection reduce hallucinations in enterprise AI?
Groundedness detection validates that AI responses are based on actual source data, reducing the risk of hallucinations. Microsoft’s Azure AI Studio and related tools now offer built-in groundedness checks for safer, more reliable enterprise AI.

ABOUT THE SPEAKERS
Nathan Lesneski, Chief Technology Officer at Concurrency, is a recognized leader in AI, data ecosystems, and Microsoft-centric governance. With deep expertise in Microsoft Fabric, Purview, and ServiceNow integration, Nathan helps organizations design and operationalize responsible AI strategies. He is joined by Solution Architect Brian Haydin, who brings hands-on experience in AI solution delivery and data platform modernization.

EVENT TRANSCRIPT
Concurrency OK, welcome to Responsible AI Data Governance. 0:0:11.846 –> 0:0:19.736 Concurrency We are going to have a great conversation today about how you prepare your data estate to be able to gain ground in leveraging AI within your organization. 0:0:20.126 –> 0:0:31.796 Concurrency This conversation is one that we have been really looking forward to because so many companies are trying to gain ground in AI, but they sometimes forget the fact that they need to get data ready in order to make that happen.
0:0:32.6 –> 0:0:35.426 Concurrency So we’re going to talk a lot about how you prepare yourself for that journey. 0:0:35.586 –> 0:0:39.706 Concurrency And today on this call you have myself, Nathan Lesneski. 0:0:39.716 –> 0:0:41.616 Concurrency I’m Concurrency’s chief technology officer. 0:0:41.846 –> 0:0:43.936 Concurrency I would love for you to connect with me on LinkedIn. 0:0:43.946 –> 0:0:53.556 Concurrency I have a weekly newsletter on AI and all things AI leadership and also do a lot of posts and conversations around the AI data ecosystem. 0:0:53.566 –> 0:0:56.296 Concurrency So I’d love to talk with you and connect with you online. 0:0:56.486 –> 0:0:57.556 Concurrency And we also have Brian. 0:0:57.566 –> 0:0:58.946 Concurrency Brian, you want to introduce yourself as well? 0:0:59.546 –> 0:0:59.896 Brian Haydin Yeah. 0:0:59.906 –> 0:1:0.756 Brian Haydin Hi, I’m Brian Haydin. 0:1:0.766 –> 0:1:2.736 Brian Haydin I’m a solution architect at Concurrency. 0:1:3.66 –> 0:1:5.106 Brian Haydin Been working with this AI technology for, 0:1:6.956 –> 0:1:10.136 Brian Haydin yeah, probably 8 years now, and thanks for having me. 0:1:11.56 –> 0:1:11.526 Concurrency Awesome. 0:1:11.536 –> 0:1:12.946 Concurrency So glad that you are here. 0:1:13.216 –> 0:1:18.186 Concurrency One thing that we’d all love to happen in this conversation today is for you to certainly use the chat. 0:1:18.196 –> 0:1:28.726 Concurrency So when you have questions or things that you’d like to learn about as we go through this conversation today, by all means drop those questions in the chat and we will address them in real time as we’re going through the conversation. 0:1:28.936 –> 0:1:34.546 Concurrency And we’ll also maybe even hold it to the end if it’s a big enough topic and we can cover it at our Q&A section as well. 0:1:34.856 –> 0:1:37.826 Concurrency So leverage that chat, put your questions out there.
0:1:37.896 –> 0:1:42.196 Concurrency Let’s engage in helping you to be able to move the ball forward on the data and AI front. 0:1:43.326 –> 0:1:44.766 Concurrency So what are you gonna learn today? 0:1:44.826 –> 0:1:47.896 Concurrency Uh, we have really endeavored to make sure that this is useful to you. 0:1:48.166 –> 0:1:51.796 Concurrency There are four things that we are going to accomplish within this session. 0:1:52.106 –> 0:1:57.176 Concurrency The first thing we’re going to do is frame the AI and data preparation conversation. 0:1:57.186 –> 0:1:58.716 Concurrency How should you think about it? 0:1:58.816 –> 0:2:20.506 Concurrency And I think that’s important to consider because the way you might have thought about framing your AI conversation, the way you think about the relationship between data and your AI journey, and how you think about data have all shifted in the last year, as companies have been able to more assertively gain ground in their AI and data journey. 0:2:20.516 –> 0:2:24.286 Concurrency So we’re going to frame that up and then we’re going to talk about how you think about your data estate. 0:2:24.476 –> 0:2:29.406 Concurrency And then from that, we’re going to talk about how you consider your AI data model. 0:2:29.416 –> 0:2:32.936 Concurrency What I mean by data model, that’s a sort of loosey-goosey term. 0:2:32.946 –> 0:2:44.446 Concurrency What I mean by AI data model is how do I think about the sorts of data that I have within establishing an AI solution for my customers, my internal customers or my external customers? 0:2:44.626 –> 0:2:52.876 Concurrency How do I think about the data that sits in that ecosystem, from preparing to positioning to the data that flows out of the system?
0:2:53.466 –> 0:3:10.236 Concurrency How do I think about that model? And then on the tail end of this, we’re going to talk a little bit about AI data safety and some of the tools that go into picturing and proposing and positioning your AI solution well, to be able to protect what you’re delivering to your end customers at the end of the day. 0:3:10.246 –> 0:3:16.106 Concurrency And this kind of goes into data preparedness and governance guardrails that sit around these solutions. 0:3:16.116 –> 0:3:26.196 Concurrency Brian and I are gonna have a conversation about these topics and hopefully it’s interesting to you and you learn something from it and it’s an opportunity for you to take this into follow-up conversations within your business. 0:3:27.606 –> 0:3:38.536 Concurrency So we’re gonna start this by talking about what data governance looks like in the context of an organization, and then we’ll flow this into that framing conversation. 0:3:38.546 –> 0:3:51.116 Concurrency So there are a few things that companies are thinking about as they seek to establish a successful data governance journey. The first piece of this is that they realize that the data landscape continues to grow. 0:3:51.126 –> 0:3:58.336 Concurrency You’ve probably seen other presentations in the past where people talk about the rate of increase of data that’s happening within most organizations. 0:3:58.346 –> 0:4:4.246 Concurrency This is particularly true as connected devices become even more significant within our organizations. 0:4:4.256 –> 0:4:12.226 Concurrency It’s not just data that exists from, like, the business system, but the very nature of the business, the core product of the business, producing data that can be used to accomplish good.
0:4:12.336 –> 0:4:32.866 Concurrency I was talking with an individual who is very much into the OT side of the manufacturing ecosystem, and he refers to OT as sort of this undiscovered country of value that has existed for so long, but so infrequently do we leverage the data that exists within the OT ecosystem or data that’s coming from our very products. 0:4:33.156 –> 0:4:34.746 Concurrency Great opportunity for us to talk about. 0:4:34.756 –> 0:4:35.676 Concurrency Like, how do I harness that? 0:4:36.736 –> 0:4:42.456 Concurrency The second is having to understand that operational silos exist within the business. 0:4:42.466 –> 0:4:56.966 Concurrency So we’re going to talk more about this: those operational functions being intentional parts within the business, but also limiting the extent to which they are truly silos that have difficulty sharing data with each other to accomplish organizational outcomes. 0:4:58.516 –> 0:5:17.86 Concurrency The third, and something that’s directly related to it, is this balance that exists between governance and enablement, and this is this idea of data agility, the idea that data exists to accomplish valuable outcomes for our business, and without those valuable outcomes, governance really has no function. 0:5:17.436 –> 0:5:23.706 Concurrency If I have data but the data doesn’t accomplish anything, then I’m governing something that has no real benefit to the business. 0:5:23.896 –> 0:5:32.756 Concurrency On the flip side, if I’m trying to accomplish benefit, but I don’t govern it, I don’t apply the necessary controls, then I’m gonna fail at this last one, which is 0:5:32.796 –> 0:5:34.306 Concurrency ensuring that compliance
0:5:34.316 –> 0:6:2.396 Concurrency exists with external regulations, or you could even pivot that last statement to say expectations from the customer: what the customer is expecting from me in terms of my handling of their data. And nowhere is that more significant than in security, when you see some of these platforms that you use on a daily basis being compromised and taken advantage of, with the data that you hold dear, your identity, being used in ways that really are inappropriate. 0:6:2.646 –> 0:6:5.136 Concurrency Brian, I’m gonna put you on the spot for a second. 0:6:5.366 –> 0:6:8.436 Concurrency Which one of these do you think is the most important? 0:6:8.536 –> 0:6:15.766 Concurrency Which do you think is the most critical or maybe even the most difficult in the context of the scaling organization? 0:6:17.436 –> 0:6:21.666 Brian Haydin I think that data agility is the most critical for organizations. 0:6:22.136 –> 0:6:36.386 Brian Haydin You know, this technology is growing so fast. Where we are today compared to six months ago is phenomenal, and the biggest impediment that organizations are facing right now is the ability to actually use this. 0:6:36.746 –> 0:6:38.606 Brian Haydin But it’s also really difficult, right? 0:6:38.656 –> 0:6:41.686 Brian Haydin And we’re gonna get to, you know, some of the Purview discussion. 0:6:42.276 –> 0:6:44.46 Brian Haydin There’s work that goes into it. 0:6:44.576 –> 0:6:54.576 Brian Haydin The good news is there are a lot of really good tools that are out there for it, like Microsoft Fabric, that’ll help accelerate some of that. 0:6:55.186 –> 0:6:57.266 Brian Haydin But that I think would be the pillar I would choose. 0:6:58.26 –> 0:6:59.66 Concurrency Umm, OK, thank you. 0:7:0.666 –> 0:7:1.156 Concurrency I agree. 0:7:1.206 –> 0:7:2.176 Concurrency I think you’re totally right.
0:7:2.186 –> 0:7:11.476 Concurrency Getting to a point where you truly do have data agility is so unlike where most organizations are at, and that goes to this conversation 0:7:11.486 –> 0:7:12.776 Concurrency we’re gonna have about framing. 0:7:12.886 –> 0:7:24.616 Concurrency So one of the things that I think companies have to think about is that data has changed in terms of the information that we’re leveraging within our greater estate. 0:7:24.626 –> 0:7:35.706 Concurrency And one of the biggest changes within the last year and a half has been the availability of general knowledge that’s being applied into our data estate from outside sources, 0:7:36.6 –> 0:7:48.546 Concurrency none more significant than large language models, which bring with them this general ability to converse and engage and have conversations in a way that’s so human-like. 0:7:48.616 –> 0:8:1.456 Concurrency But it’s based on this idea of a general set of capabilities, a general set of knowledge, not just specific understanding but general knowledge that allows for that interchange, that feels more human-like or feels more enabled 0:8:1.886 –> 0:8:10.956 Concurrency than perhaps a model that’s built on natural language has in the past. But that gets combined with this idea of enterprise knowledge that’s unique to us. 0:8:11.226 –> 0:8:35.536 Concurrency And so this is a pattern of combining general knowledge, whether it’s external information from managed sources that are on the Internet or other sources that are from partners, with internal knowledge which is trusted, combined together into patterns like RAG that bring both kinds of knowledge together into a pattern that serves my customer need.
0:8:36.16 –> 0:8:50.646 Concurrency But realize that this is a whole ecosystem of governance that we frequently haven’t had to manage in the past. Many of our established data platforms and data patterns have been so focused on a narrow view of what enterprise knowledge looks like. 0:8:50.726 –> 0:9:2.66 Concurrency And the enterprise knowledge has been so focused on this idea of just what exists in our ERP system or structured databases, and not even as broad as what true enterprise knowledge has become. 0:9:2.566 –> 0:9:3.506 Concurrency So that’s really 0:9:4.606 –> 0:9:11.836 Concurrency reinforced, I think, in where we’ve seen maturity curves in the past and how they’re changing in the future. 0:9:11.846 –> 0:9:21.516 Concurrency So this maturity curve, it’s one I’ve actually been using for some time, but it’s one that I think is changing in the context of that previous slide you just saw. 0:9:21.806 –> 0:9:32.76 Concurrency So I’m going to explain this first, but then I’m going to pivot it to talk about how the estate has changed and how we need to think about it even more broadly than this. 0:9:32.326 –> 0:9:43.836 Concurrency So in this maturity curve, you can see that many organizations start right down here in the lower left-hand corner, and the way you would think about that kind of starting point is: I’ve got a business system, 0:9:44.436 –> 0:9:54.86 Concurrency but the way that I view data from that business system is by exporting it to some sort of vehicle, a report, or even more commonly an Excel spreadsheet. 0:9:54.236 –> 0:10:15.256 Concurrency And most of us at this point feel very seen, because the way we’ve handled data in our organizations is just that: we have a business system, our finance department exports it to Excel or our inventory management department exports it to Excel, they build pivot tables, and then they use that to organize their next steps.
0:10:15.606 –> 0:10:44.716 Concurrency It’s very much like data gets frozen at that point, and then there’s human intuition applied to it to be able to take that next step. From there you see organizations realizing, "Oh, nobody can truly trust that data," because it’s still a person exporting it, cleansing it, doing something with it, and the repeatability, the stability of that data estate as a platform for a scaled understanding of how we leverage data to establish trust in the organization is missing. 0:10:44.896 –> 0:10:56.626 Concurrency So organizations have spent a lot of time building these traditional on-premises data warehouses built on overnight ETLs that run and then are visualized through a variety of dashboards and scorecards. 0:10:57.156 –> 0:11:8.206 Concurrency And one of the challenges with that approach has been that it took a lot of time to get to value, and many organizations spent two to three years just going down that road. 0:11:8.656 –> 0:11:17.146 Concurrency It was "If you build it, they will come," and they never really finished building it, and the outcomes of the actual build never really got to that point. 0:11:17.156 –> 0:11:18.556 Concurrency And sometimes that can even be 0:11:18.566 –> 0:11:31.986 Concurrency the modern service estates, where I dumped everything in Snowflake and didn’t really get to value, and you have to ask yourself, did I really even get there? Which sort of moves to this next stage, this idea of a modern data framework. 0:11:32.46 –> 0:11:33.376 Concurrency Can I invest? 0:11:33.576 –> 0:11:36.276 Concurrency I dumped a bunch of stuff in Snowflake, I 0:11:37.16 –> 0:11:38.586 Concurrency have been trying to leverage that. 0:11:38.996 –> 0:11:45.986 Concurrency I didn’t really get to the cleansing part, and I’m still ultimately maybe dumping it back to Excel out of
0:11:46.26 –> 0:11:56.966 Concurrency that managed data estate in the cloud, not realizing that I have to build myself a framework that lets me get to these higher-level capabilities. Because where is real value established? 0:11:57.516 –> 0:12:2.126 Concurrency It’s not just getting to the dashboard that people can trust, which is important. 0:12:2.256 –> 0:12:11.576 Concurrency It’s getting to predicting something that’s going to happen, prescribing the action I can take, and then storytelling based upon that data on a potential choice that I could be making in the future. 0:12:11.966 –> 0:12:18.756 Concurrency All of that is based on establishing a baseline in front of the internal customer. 0:12:19.66 –> 0:12:25.286 Concurrency Now all of this is true in the context of a sort of structured data estate. 0:12:26.386 –> 0:12:36.826 Concurrency The next step upon this is to say we have a set of data for which that is true in this structured data ecosystem. 0:12:37.256 –> 0:12:50.126 Concurrency What’s made this even more complicated is I now have all this unstructured data that’s become part of that picture and all these partner relationships that enter into that consumed estate. 0:12:50.196 –> 0:12:53.326 Concurrency So when I’m establishing my bronze, 0:12:53.336 –> 0:12:57.606 Concurrency silver, gold pattern that existed back at this 0:12:58.466 –> 0:13:9.256 Concurrency sort of maturity curve, as I’m moving into preparing data to do high-capability workloads on the AI front, which, you know, is really a prerequisite for me to go up here, 0:13:9.766 –> 0:13:16.776 Concurrency I now realize that that has broadened beyond the structured data estate to now be inclusive of the unstructured data estate as well.
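The bronze, silver, gold pattern mentioned above can be illustrated with a minimal sketch. It is not taken from the webinar: the record fields, the cleansing rules, and the function names `to_silver` and `to_gold` are all assumptions chosen for illustration, and it covers only structured rows, not the unstructured content the speaker goes on to describe.

```python
# Hypothetical raw order rows as they might land in a "bronze" layer.
BRONZE = [
    {"order_id": "1001", "customer": " Acme ", "amount": "250.00"},
    {"order_id": "1002", "customer": "acme", "amount": "not-a-number"},
    {"order_id": "1003", "customer": "Globex", "amount": "100.50"},
    {"order_id": "1001", "customer": " Acme ", "amount": "250.00"},  # duplicate
]


def to_silver(rows):
    """Cleanse: normalize names, parse amounts, drop bad and duplicate rows."""
    seen, silver = set(), []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # unparseable amounts are rejected in this sketch
        if row["order_id"] in seen:
            continue  # keep only the first row per order_id
        seen.add(row["order_id"])
        silver.append({
            "order_id": row["order_id"],
            "customer": row["customer"].strip().title(),
            "amount": amount,
        })
    return silver


def to_gold(rows):
    """Aggregate: total order amount per customer, ready for reporting."""
    totals = {}
    for row in rows:
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    return totals


silver = to_silver(BRONZE)
gold = to_gold(silver)
print(gold)
```

In a real platform each layer would be a governed table rather than an in-memory list; the point of the sketch is only that each stage has an explicit, repeatable rule instead of a person cleansing an export by hand.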
0:13:17.406 –> 0:13:32.166 Concurrency Brian, what have you seen as the biggest challenge companies have as they realize they have to broaden their picture to include sets of data that they wouldn’t have even thought about before as being part of that broader ecosystem? 0:13:34.986 –> 0:13:36.706 Brian Haydin That is a tough one. 0:13:37.556 –> 0:13:49.986 Brian Haydin So, uh, where a lot of these conversations around bringing in these disparate data systems have landed has been the lakehouse, you know, sort of architecture, right. 0:13:48.396 –> 0:13:48.616 Concurrency Umm. 0:13:50.46 –> 0:14:0.756 Brian Haydin And being able to land your data into a lake and then be able to incorporate that into your workspace in Fabric. 0:14:1.6 –> 0:14:4.46 Brian Haydin Umm, you know, and do something meaningful with it. 0:14:4.736 –> 0:14:7.166 Brian Haydin That’s where, you know, that’s where we’ve been using the tools. 0:14:7.806 –> 0:14:15.36 Brian Haydin Umm, you know, in terms of what other sources people have access to, 0:14:15.346 –> 0:14:17.136 Brian Haydin I mean, to me, it’s just a knowledge thing. 0:14:17.826 –> 0:14:25.976 Brian Haydin You know, documents that you normally wouldn’t have considered to be data sources, things with, like, picture elements and whatnot. 0:14:26.646 –> 0:14:32.296 Brian Haydin You know, those are now meaningful things. Like images, you can actually use them to do real work. 0:14:32.746 –> 0:14:35.576 Brian Haydin So I don’t know if that helps. 0:14:36.46 –> 0:14:51.746 Concurrency Yeah, I mean, to your point, one of the biggest data sources we’re seeing in the generative AI space isn’t just the content that exists in the ERP system, but might be documents that are used for customer support.
0:14:51.846 –> 0:14:59.216 Concurrency So like, we’re working with a customer right now, they have thousands of documents that represent their installation build plans. 0:14:59.476 –> 0:15:7.486 Concurrency And without those documents, their end customer can’t use their product, but it is so poorly structured, 0:15:7.496 –> 0:15:13.736 Concurrency it’s so disparately deployed across their environment, that it’s difficult for those end customers to be able to find that information. 0:15:13.746 –> 0:15:18.226 Concurrency So they seek solutions to make it easier for customers to do business with them. 0:15:18.556 –> 0:15:21.966 Concurrency But it’s not like they’re just pulling it from an ERP system. 0:15:21.976 –> 0:15:24.996 Concurrency They’re having to understand, like, how do I leverage that data? 0:15:25.6 –> 0:15:28.286 Concurrency That’s as important as what exists in our ERP system 0:15:28.296 –> 0:15:36.436 Concurrency from a customer point of view, but now needs to be surfaced in more usable approaches that I hadn’t considered in the past just because of, like, 0:15:36.646 –> 0:15:42.166 Concurrency how we made it. You know, in a sense, we just kind of accepted that it was somewhat unusable or difficult to take advantage of. 0:15:42.206 –> 0:15:44.206 Concurrency So yeah. 0:15:42.216 –> 0:15:43.616 Brian Haydin Yeah, and real-time data too. 0:15:43.706 –> 0:15:45.376 Brian Haydin Like, that’s another important part. 0:15:45.466 –> 0:15:51.536 Brian Haydin You know, that was difficult to use, you know, because models took, you know, so long to generate. 0:15:51.546 –> 0:15:53.256 Brian Haydin How do you even think about it? 0:15:53.546 –> 0:15:55.936 Brian Haydin I’ve got millions of transactions coming in per minute 0:15:55.946 –> 0:15:57.876 Brian Haydin from all these, you know, IoT devices. 0:15:58.866 –> 0:15:59.706 Brian Haydin Yeah, for sure.
0:16:1.606 –> 0:16:15.516 Concurrency So all of that fills in from a consumption standpoint, but then realize that as you are leveraging AI solutions, you are creating new data, and you are creating data that you wouldn’t necessarily have had in the past. 0:16:15.526 –> 0:16:25.476 Concurrency Take the generated content that is produced via your AI system: especially in generative AI, you’re producing text exchanges with your customers that are present in transcripts. 0:16:25.736 –> 0:16:31.916 Concurrency You’re producing responses, code that might be built, or even, as Brian said, the telemetry, 0:16:32.136 –> 0:16:43.676 Concurrency telemetry data, not just of the data that’s entering into the system, but of the user interaction within that generative AI system and how I might apply and improve it in the future. 0:16:43.766 –> 0:16:59.256 Concurrency This real-time intermediary step of what I’m creating as I’m interacting with my customer is a new piece of data that I may not have considered in the past, because it was previously a human performing that work, or it might be just a human performing that work, not a human in partnership with the AI system. 0:16:59.626 –> 0:17:5.956 Concurrency So that exists as a whole new space that I have to maintain as something I care about, and then tied to that 0:17:5.966 –> 0:17:41.456 Concurrency is this idea of who owns all this and how they think about the maintenance of that data, which brings us all back to that whole governance idea: from consumption, what data is necessary to serve my use case, to the creation of data that exists throughout its lifecycle, to how I maintain and control what data is flowing through what system, and how I think about customer data versus data that is coming from an external partner, to data that I don’t wanna expose without any controls around it, to employee-sensitive data, especially in internal AI 0:17:41.516 –> 0:17:42.306 Concurrency use cases.
0:17:42.536 –> 0:17:50.166 Concurrency What don’t I want a system providing back to my end customer? All of this exists within the governed estate. 0:17:50.866 –> 0:17:55.616 Concurrency So when you’re building these AI systems, all of this exists. 0:17:55.626 –> 0:18:13.716 Concurrency We have the knowledge that encompasses everything that we just talked about, but also actions that are being performed by the AI system, with data being produced as a result, and then even as we start to go up the stack, there are sets of activities that are formatted in the context of instructions that are performed, even delegated sets of activities. 0:18:13.726 –> 0:18:21.926 Concurrency So one of the futures I’ve been suggesting companies think about is this idea that right now, when we think about a lot of AI agents, we’re so familiar with the, like, 0:18:21.936 –> 0:18:22.666 Concurrency "I ask a question, 0:18:22.676 –> 0:18:30.46 Concurrency I get information back" pattern, and that’s very much the infancy of what AI systems are gonna be producing. 0:18:30.196 –> 0:19:1.216 Concurrency Think about the future of AI systems as: I’ve asked it to perform an action, it performs that action, and it may even be a delegated set of AI agents that each perform subparts of that action. For me to have trust in that system means I have trust in the data and trust in the successful execution of those subparts of the action that are delegated to AI agents, in a similar way to which, if I was delegating to a set of employees to perform actions, I trust that they can perform it, that they know 0:19:1.306 –> 0:19:4.716 Concurrency how to do it, and that they’re working from the right data to get that work done. 0:19:5.426 –> 0:19:6.766 Concurrency Same kind of idea exists here. 0:19:7.746 –> 0:19:18.476 Concurrency So this is a very busy picture, but it really pivots to the change, and when I noted, like, how do I think about data?
0:19:18.486 –> 0:19:22.696 Concurrency This is the way that we’re really urging you to think about data in your ecosystem. 0:19:23.46 –> 0:19:24.436 Concurrency Start with the end in mind. 0:19:24.766 –> 0:19:27.276 Concurrency So if there’s one major suggestion I would have for you, 0:19:27.286 –> 0:19:28.616 Concurrency it’s start with the end in mind. 0:19:29.6 –> 0:19:31.216 Concurrency Don’t think about your data estate as a monolith. 0:19:31.426 –> 0:19:40.886 Concurrency Think about it as value streams that are leveraging data to accomplish good within your organization, and that begins with this idea of value-driven use cases. 0:19:41.446 –> 0:19:45.496 Concurrency So inside of each of these value-driven use cases, and these are examples, right? 0:19:45.506 –> 0:19:49.666 Concurrency Your business might have different value-driven use case domains. 0:19:51.56 –> 0:19:58.906 Concurrency Each of these relates to a set of consumers that are leveraging data: executive financial visibility. 0:19:58.996 –> 0:20:0.526 Concurrency Where’s my truck? 0:20:0.716 –> 0:20:6.276 Concurrency Personnel optimization, like employees: what kind of turnover do I have? 0:20:6.286 –> 0:20:7.986 Concurrency Where’s turnover more significant? 0:20:7.996 –> 0:20:19.286 Concurrency What are things happening within that facility that might cause more turnover? Demand and inventory optimization, manufacturing optimization, the actual OT ecosystem itself. 0:20:19.596 –> 0:20:21.46 Concurrency Quality data or data 0:20:21.386 –> 0:20:22.676 Concurrency that’s online from your system. 0:20:22.686 –> 0:20:25.776 Concurrency Video analysis from the real product you’ve delivered to your customers. 0:20:26.266 –> 0:20:30.246 Concurrency Sales optimization data or even, like, digital optimization data.
0:20:30.606 –> 0:20:45.406 Concurrency All of that represents that sort of formation of use cases that are coming out of each of these different domains, and then relating to that is this idea of the data domain that serves it. 0:20:45.466 –> 0:20:53.16 Concurrency So underneath each of these are these ideas of data domains that are present in the business. 0:20:53.26 –> 0:21:1.886 Concurrency And another thing to reinforce is that IT may govern the system, but the business is the owner of this data. 0:21:2.976 –> 0:21:5.126 Concurrency And sometimes organizations don’t think about it that way. 0:21:5.136 –> 0:21:13.906 Concurrency The owner in the business truly is the one that actually understands the data, understands its purpose and its function within your organization. 0:21:14.436 –> 0:21:54.816 Concurrency So underneath that exist owners, and then centric to any one of those are consumers of that data. That could be business consumers of that data, where there are models produced and there’s a data product serving them, and data scientists in AI scenarios that are more about giving access to the raw data so they can position an AI model that’s able to respond to a need. And then across the bottom of this entire estate exists this idea of a certified data set that represents what the truth is, what data can be used as you are working to 0:21:54.866 –> 0:22:7.16 Concurrency accomplish those value-driven use cases. And sometimes across this you’re exchanging data or leveraging data to accomplish greater needs, like a customer 360, which might cross several data domains 0:22:7.26 –> 0:22:15.696 Concurrency to accomplish good, and there are more efficient models that exist now from a technical standpoint to reduce duplication of data to serve that goal. 0:22:15.846 –> 0:22:21.296 Concurrency But from a sort of top view, this is how you need to start thinking about it.
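The picture described above (value-driven use cases on top, business-owned data domains underneath, certified data sets across the bottom) can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, fields, example domains, and the `readiness` helper are hypothetical and do not come from the webinar or from any Microsoft Fabric or Purview API.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    certified: bool = False  # True once the business owner has certified it


@dataclass
class DataDomain:
    name: str
    owner: str  # a business owner, not IT (hypothetical example values below)
    datasets: list = field(default_factory=list)


@dataclass
class UseCase:
    name: str
    required_datasets: list  # names of the data sets the use case depends on


def readiness(use_case, domains):
    """Split a use case's required data sets into certified and missing."""
    certified = {d.name for dom in domains for d in dom.datasets if d.certified}
    ready = [n for n in use_case.required_datasets if n in certified]
    missing = [n for n in use_case.required_datasets if n not in certified]
    return ready, missing


finance = DataDomain("Finance", owner="Controller",
                     datasets=[Dataset("general_ledger", certified=True)])
sales = DataDomain("Sales", owner="VP Sales",
                   datasets=[Dataset("opportunities", certified=True),
                             Dataset("customer_contacts", certified=False)])

# A customer 360 crosses several data domains, as described in the talk.
customer_360 = UseCase("Customer 360",
                       ["opportunities", "customer_contacts", "general_ledger"])

ready, missing = readiness(customer_360, [finance, sales])
print("ready:", ready)
print("missing:", missing)
```

Starting from the use case and asking which required data sets are not yet certified is one way to express the "use case down" framing: the gap list is where preparation effort goes first.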
0:22:21.306 –> 0:22:22.856 Concurrency You notice that I didn’t start data up. 0:22:22.866 –> 0:22:25.836 Concurrency I started use case down, because we have to start with: 0:22:25.886 –> 0:22:40.586 Concurrency what am I actually trying to accomplish with my data, and how do I then think about data as an input to that, which I either have right now in a prepared sense or I don’t? And it positions me to focus energy to get that data ready. 0:22:41.646 –> 0:22:51.6 Concurrency Brian, you know, how are you seeing companies succeed at thinking about this picture? Any thoughts that you have on this? 0:22:52.766 –> 0:23:2.756 Brian Haydin So, you know, I’ve been talking about Fabric a lot, and I think every time you called on me, I’ve said Fabric, but I think it’s important, really. 0:22:56.56 –> 0:22:56.276 Concurrency Umm. 0:22:58.846 –> 0:22:59.106 Concurrency Yeah. 0:23:3.546 –> 0:23:4.526 Brian Haydin So a lot of, 0:23:4.536 –> 0:23:10.66 Brian Haydin you know, the customers that I’m talking with are beginning journeys, you know, from something to something, right. 0:23:10.76 –> 0:23:12.486 Brian Haydin And a lot of times it’s in the Fabric ecosystem. 0:23:12.816 –> 0:23:19.586 Brian Haydin And so the conversations that I’m having with them right now are starting with that governance and the security at the beginning. 0:23:19.596 –> 0:23:33.886 Brian Haydin It’s so much easier to do it when you know that it’s gonna be needed and that those controls are already in place than it is for you to, like, build up these data domains and then try to backtrack and figure it out. 0:23:34.276 –> 0:23:39.996 Brian Haydin So I’m having a lot of conversations around this and incorporating that into our delivery stack 0:23:40.896 –> 0:23:44.466 Brian Haydin to make sure that, you know, security and governance are first-class citizens. 0:23:45.996 –> 0:23:46.816 Concurrency Yeah, that’s huge.
0:23:46.826 –> 0:23:58.246 Concurrency I mean, just even talking Fabric for a minute, like, one of the coolest things about Fabric is, you know, a lot of times when we talked about this structure, there’s a lot of, like, publisher-subscriber models. 0:23:58.256 –> 0:24:21.686 Concurrency Like, it duplicates the data to another domain to use it. Fabric has this idea of not moving the data but then making it available in another domain, which is a huge efficiency, but still maintaining the data domain idea, which goes right to your point of governance and establishing an effective playground for people to be able to do the work. 0:24:14.16 –> 0:24:14.276 Brian Haydin Umm. 0:24:22.756 –> 0:24:23.106 Brian Haydin Yeah. 0:24:23.116 –> 0:24:26.576 Brian Haydin And you know, put this in the context of, like, the AI story, right. 0:24:26.836 –> 0:24:35.826 Brian Haydin I mean, there’s so many different use cases, from, like, you know, the HR chatbot that’s gonna serve up somebody’s salary because somebody didn’t, you know, think through the problem correctly. 0:24:36.696 –> 0:24:42.706 Brian Haydin But, you know, the AI technology is, to some organizations, you know, it’s the unknown. 0:24:42.716 –> 0:24:43.246 Brian Haydin It’s scary. 0:24:43.256 –> 0:24:54.746 Brian Haydin It’s a new thing, I guess I want to learn about this, um, and, you know, putting the governance into the conversation at the beginning helps to allay some of those fears that people are having right now. 0:24:56.556 –> 0:24:56.866 Concurrency Totally. 0:24:59.306 –> 0:25:12.496 Concurrency So to that point of starting with the end in mind, we suggest that organizations, as they’re getting started, think about building a value analysis around the ways they’re going to leverage data.
0:25:12.506 –> 0:25:25.96 Concurrency You notice that this is very much around, like, understanding a category of value, a name of that use case, a description of what you’re trying to accomplish, and its value category, operations versus revenue. 0:25:25.266 –> 0:25:53.66 Concurrency And then even how difficult it is to do. And what you might find is that in some cases your data is already ready but it hasn’t been governed, such as document data, or your data may be ready but you haven’t figured out how to use it yet, or it might be a situation where the data isn’t ready at all and you need to position a forward strategy to be able to make that data a useful asset in the future. Like, we’ve worked with companies that are just like, man, I wish we had better understanding of our customers, so 0:25:53.526 –> 0:26:10.6 Concurrency we need to position a way for us to get that data as we sell them new products, which is starting from this sort of ground up. Like, we need to change the way that we’re gathering data from our customers, and it’s a journey to get there, and we know that it’s a sort of long stick to be able to get to the end of that game. 0:26:10.976 –> 0:26:13.226 Concurrency But we also know that that’s the right thing to do. 0:26:13.396 –> 0:26:16.266 Concurrency So it might position us for some value 0:26:16.276 –> 0:26:20.586 Concurrency we’re gonna get in the future, understanding that this is a goal of ours as an organization. 0:26:22.356 –> 0:26:23.106 Concurrency So what 0:26:23.116 –> 0:26:29.816 Concurrency we’re gonna pivot into now is, sort of, if this was like act one, two, and three, right? 0:26:29.826 –> 0:26:31.146 Concurrency We just got through act one. 0:26:31.376 –> 0:26:37.366 Concurrency Act two is now let’s talk about how data is used in building an AI solution, and then act three, 0:26:37.376 –> 0:26:39.946 Concurrency of course, we’re gonna talk about some governance topics surrounding that.
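As a rough illustration of the value analysis described above, the sketch below records each use case with the attributes the speaker lists (name, description, value category, difficulty) plus a data readiness state, and then orders them. The class name, the readiness labels, the 1 to 5 difficulty scale, and the sort order are all assumptions made for this example, not something shown in the webinar.

```python
from dataclasses import dataclass

# Hypothetical readiness states, loosely following the three situations
# described in the talk: data ready but ungoverned, data ready but not yet
# used, and data not ready at all.
READINESS_RANK = {"ready_ungoverned": 0, "ready_unused": 1, "not_ready": 2}


@dataclass
class UseCase:
    name: str
    description: str
    value_category: str   # "operations" or "revenue"
    difficulty: int        # assumed scale: 1 (easy) to 5 (hard)
    data_readiness: str    # one of the READINESS_RANK keys


def prioritize(use_cases):
    """Order use cases so the most ready, least difficult ones come first."""
    return sorted(
        use_cases,
        key=lambda u: (READINESS_RANK[u.data_readiness], u.difficulty),
    )


backlog = [
    UseCase("Customer 360", "Unified view of each customer", "revenue", 5, "not_ready"),
    UseCase("HR chatbot", "Answer policy questions from the manual", "operations", 2, "ready_ungoverned"),
    UseCase("Quote assistant", "Draft quotes from ERP pricing", "revenue", 3, "ready_unused"),
]

for use_case in prioritize(backlog):
    print(use_case.name, use_case.value_category, use_case.data_readiness)
```

A real analysis would likely weigh business value as well; the point here is only that readiness becomes an explicit, sortable attribute of each use case.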
0:26:40.476 –> 0:26:49.26 Concurrency So when you think about building a copilot, and what I mean by a copilot is, don’t just think about a copilot as, like, just Microsoft Copilot. 0:26:49.36 –> 0:26:51.516 Concurrency OK, think about a copilot generically as a platform. 0:26:52.156 –> 0:27:4.966 Concurrency Realize that a copilot can leverage a diversity of information, and that very little of this may be governed right now in your environment, and much of this might be content that lives in your SharePoint environment. 0:27:5.156 –> 0:27:12.176 Concurrency So as you’re starting even to roll out M365 Copilot, one of the first conversations we’re having is: do you know what’s shared or not? 0:27:12.186 –> 0:27:15.146 Concurrency You have 10 years’ worth of legacy SharePoint data out there. 0:27:15.196 –> 0:27:17.746 Concurrency Do you even know what people’s access to that looks like? 0:27:17.856 –> 0:27:21.926 Concurrency Do you remember when you turned on Delve and what you could see or not see about a person’s usage? 0:27:22.366 –> 0:27:53.686 Concurrency That’s, like, multiplied by 10 as you start enabling things like M365 Copilot, but then also this accessibility of data that becomes available in Dataverse, and especially Dataverse becoming really, really accessible with Copilot Studio, and things like Fabric surfacing that data more accessibly for us. And, you know, where I was going especially is to think about the data as it flows through all those data sources and ecosystems. 0:27:53.696 –> 0:27:56.376 Concurrency Think about it as it flows through this pattern. 0:27:56.486 –> 0:28:17.186 Concurrency OK, so maybe just to define RAG for a second: this is retrieval augmented generation, the idea that I have a static model that allows me to interact with my end customers, but ultimately the data that feeds the intelligence of the system is your data. It’s your data and it’s your data alone.
0:28:17.376 –> 0:28:27.446 Concurrency But you need to prepare it for it to be effective, and that might be preparing a document, or it might be preparing a business system to be able to be integrated into a platform. 0:28:27.676 –> 0:28:34.976 Concurrency So on the left-hand side, you see the ingestion of that data from whatever raw data you have, and again that could be a document. 0:28:35.446 –> 0:28:39.446 Concurrency It could be a, you know, SQL database, whatever that raw data is. 0:28:40.486 –> 0:28:46.816 Concurrency And as you review that data, you realize, ohh, that’s what we’ve been using, like, there’s a problem. 0:28:46.826 –> 0:28:48.746 Concurrency We need to get it 0:28:48.826 –> 0:28:52.496 Concurrency cleansed, the data that represents our end solution. 0:28:52.506 –> 0:29:10.196 Concurrency So even if we go to the most basic RAG pattern: I’m building an HR chatbot, and you look at your employee manual and you’re like, ooh, I gotta improve the quality of that manual, because that’s the data that my HR chatbot’s gonna use to be able to interact with my end customer inside the business. 0:29:10.716 –> 0:29:13.886 Concurrency And it’s not answering certain types of questions because my data sucks. 0:29:14.136 –> 0:29:28.676 Concurrency So how do I improve that cleansed data? And that’s the accountability of the business. You’re saying: business, you need to apply intelligence to that data source, maybe in partnership with tech, to be able to get to that pattern. 0:29:30.446 –> 0:29:47.856 Concurrency And then as you’re flowing through that, you are potentially chunking that data, meaning I’m breaking it up into smaller pieces that are ultimately feeding a vector database, and a vector database is essentially a way of understanding data that shows adjacencies between one word and another.
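A minimal sketch of the chunk-then-index flow just described, under loud assumptions: the "embedding" here is a plain word-count vector and the "vector database" is an in-memory list, whereas a real system would use a learned embedding model and a dedicated vector store. Word counts only match identical words, so they do not capture the kind of semantic adjacency the speaker describes; they just make the mechanics of chunking, indexing, and nearest-chunk retrieval visible. All function names are hypothetical.

```python
import math
from collections import Counter


def chunk_text(text, max_words=7):
    """Break a document into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(w.strip(".,?!").lower() for w in text.split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(text, max_words=7):
    """Chunk the text and store each chunk next to its vector."""
    return [(chunk, embed(chunk)) for chunk in chunk_text(text, max_words)]


def retrieve(index, question, k=1):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


manual = (
    "Employees accrue fifteen vacation days per year. "
    "Expense reports are due within thirty days of travel."
)
index = build_index(manual)
print(retrieve(index, "How many vacation days do employees accrue?"))
```

The retrieved chunk is what would be handed to the model as grounding context, which is why the quality of the cleansed source text matters so much.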
0:29:47.926 –> 0:29:59.976 Concurrency So think, like, boat and propeller, or draft and depth, these ideas of these relationships between words that we understand because we are humans that know adjacency. 0:30:0.696 –> 0:30:14.836 Concurrency Umm, but sometimes, if you think about, like, I’m learning another language, I might not know the relationship between one foreign word and another foreign word in a way that a native speaker might know. That’s sort of how a vector database is starting to help us, right? 0:30:14.846 –> 0:30:25.866 Concurrency So what’s happening is that data is flowing into the creation of a vector database that then serves the interaction that happens in retrieval response patterns 0:30:26.76 –> 0:30:30.246 Concurrency with that end customer, and that end customer is: I’m asking questions, 0:30:30.296 –> 0:30:30.856 Concurrency I’m responding. 0:30:31.326 –> 0:30:33.216 Concurrency I’m asking the model to do something. 0:30:33.226 –> 0:30:35.486 Concurrency It’s using the vector database to inform the action 0:30:35.496 –> 0:30:40.916 Concurrency that’s then taken, and then it’s logging and maintaining information that happens in between us. 0:30:46.66 –> 0:30:46.706 Brian Haydin Pretty easy. 0:30:48.296 –> 0:30:50.976 Concurrency Uh, what part of this is the hardest from your perspective? 0:30:51.16 –> 0:31:0.246 Concurrency Like, if I’m a business that’s trying to get started in AI, understanding I have a huge diversity of data and maybe I’ve picked a use case, what’s the hardest part of this? 0:31:0.296 –> 0:31:2.936 Concurrency Or maybe even, like, a couple hardest parts from your perspective. 0:31:3.476 –> 0:31:14.876 Brian Haydin Yeah, I think the big challenge for a lot of organizations is that first step going from raw data to cleansed data, and you know, we hear the protests from customers all the time. 0:31:14.886 –> 0:31:17.496 Brian Haydin Well, I can’t pull this out of my ERP right now.
0:31:17.786 –> 0:31:19.636 Brian Haydin Things are such a mess right now. 0:31:19.646 –> 0:31:20.536 Brian Haydin I just don’t even wanna, 0:31:20.716 –> 0:31:27.796 Brian Haydin I don’t even wanna mess with it, um, but to me it’s almost a cop-out. 0:31:28.476 –> 0:31:32.986 Brian Haydin The whole point is that your data is not going to be ready when you pull it in, right? 0:31:32.996 –> 0:31:39.326 Brian Haydin It’s up to the data owners to make sense of it and clean it so that it is a certified data set. 0:31:39.336 –> 0:31:40.386 Brian Haydin So you can start to use it. 0:31:40.756 –> 0:31:47.446 Brian Haydin So I would say, first off, overcoming that challenge with the organization, definitely a lot of conversations happen. 0:31:47.906 –> 0:31:53.226 Brian Haydin And then in conjunction with this, in terms of starting these projects, it’s hard work. 0:31:53.526 –> 0:32:6.416 Brian Haydin You know, I mean, uh, imagine going into a storage unit that hasn’t been touched in 20 years and trying to, you know, find that notebook that you remember putting there. That’s what it’s like. 0:32:0.806 –> 0:32:0.966 Concurrency Yeah. 0:32:6.966 –> 0:32:8.326 Brian Haydin So it’s a lot of work. 0:32:9.576 –> 0:32:12.726 Concurrency I sort of feel like this with use cases that are really document driven. 0:32:13.556 –> 0:32:24.856 Concurrency It’s like you built a document-driven scenario, intake all these documents, and then in the response you’re finding that it’s answering with the document from 20 years ago. 0:32:25.276 –> 0:32:27.756 Concurrency It’s like, why is this answering this wrong? 0:32:27.766 –> 0:32:39.886 Concurrency Well, wait, in that storage unit was a document from 20 years ago that is completely wrong now, not representing our product, but it’s used to answer the question. Stewarding that data has become an interesting problem.
0:32:40.556 –> 0:32:40.976 Brian Haydin For sure. 0:32:44.396 –> 0:32:48.726 Concurrency Umm, so I I I’ll I find this fascinating. 0:32:48.736 –> 0:32:59.466 Concurrency This this part of this conversation, which is like what are the three golden rules of building the data that’s surrounds building an AI application and the AI application model itself. 0:32:59.556 –> 0:33:1.46 Concurrency And this is sort of AI. 0:33:1.56 –> 0:33:2.476 Concurrency Application in a microcosm. 0:33:2.486 –> 0:33:4.766 Concurrency OK, you have this idea of a prompt. 0:33:5.76 –> 0:33:6.966 Concurrency It’s interacting with the AI application. 0:33:7.116 –> 0:33:11.836 Concurrency There’s user content, there’s skills and resources that facilitate that. 0:33:12.76 –> 0:33:24.476 Concurrency It’s ah that that AI applications ability to perform its task and then there’s content that you’re outputting back to the user, you know, as a result of that, there’s some rules that exist regarding doing this well. 0:33:24.686 –> 0:33:33.816 Concurrency So the first of that is you have data that has access controls and it has boundaries surrounding it that someone is accountable for. 0:33:33.826 –> 0:33:41.376 Concurrency Ideally, the system owner right most the first rule is to respect that data. 0:33:42.316 –> 0:33:58.236 Concurrency So an example of that would be, let’s say that you have you have data that exists in SharePoint that is, uh, ingested into an AI application and that data that exists in SharePoint allows only certain people to be able to access that information. 0:33:58.686 –> 0:34:12.536 Concurrency Your AI application should respect those rules for those individuals, and it should respect the labels surrounding those documents or those assets that are then translated into the consumer interacting with it. 
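The first rule, respecting the source system's access controls and labels, can be sketched as a filter applied to retrieved documents before anything reaches the model. This is an illustrative sketch only: the `Document` and `User` classes, the group names, and the three-level label ranking are hypothetical and do not represent SharePoint's or Purview's actual permission model.

```python
from dataclasses import dataclass

# Hypothetical sensitivity ordering; real labels are defined by the organization.
LABEL_RANK = {"general": 0, "confidential": 1, "highly_confidential": 2}


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset   # groups granted access in the source system
    label: str = "general"


@dataclass(frozen=True)
class User:
    name: str
    groups: frozenset
    max_label: str = "general"  # highest label this user may see (assumption)


def can_read(user, doc):
    """A user may read a document only if group access AND label both allow it."""
    in_group = bool(user.groups & doc.allowed_groups)
    label_ok = LABEL_RANK[doc.label] <= LABEL_RANK[user.max_label]
    return in_group and label_ok


def trim_results(user, candidates):
    """Drop retrieved documents the asking user could not open at the source."""
    return [doc for doc in candidates if can_read(user, doc)]


handbook = Document("d1", "Vacation policy ...", frozenset({"all_staff"}))
salaries = Document("d2", "Salary bands ...", frozenset({"hr"}), "highly_confidential")

employee = User("Ana", frozenset({"all_staff"}))
hr_lead = User("Raj", frozenset({"all_staff", "hr"}), "highly_confidential")

print([d.doc_id for d in trim_results(employee, [handbook, salaries])])
print([d.doc_id for d in trim_results(hr_lead, [handbook, salaries])])
```

Filtering at retrieval time, per user, is what keeps an HR chatbot from serving a salary document to someone who could never have opened it in the source system.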
0:34:12.746 –> 0:34:28.676 Concurrency Because sometimes people might ignore those, intake them, and then allow for a person to get access to data that they should not have access to. The separation of concern that exists regarding those documents needs to be maintained so the right people have access to the right things. 0:34:28.766 –> 0:34:29.876 Concurrency So that’s rule one. 0:34:30.366 –> 0:34:38.996 Concurrency Rule two is that your application should govern specifically what kinds of inputs and outputs you’re able to perform. 0:34:39.6 –> 0:34:57.766 Concurrency So you’re building, like, a pattern or a set of controls here, and some of these are now available from Microsoft to further control what you can and can’t do against your application, but facilitating very strict input/output controls to the AI application through the vehicle of the prompt. 0:34:58.16 –> 0:35:20.496 Concurrency But the prompt itself is actually governed by a set of guardrails, and then the third here is that there are model controls that put guardrails around what the AI model itself can do, and essentially what will happen is that if your AI model has access to something, it’s going to give the information back to the application. 0:35:20.506 –> 0:35:26.926 Concurrency So we need to establish guardrails that surround that picture so we’re able to provide the right data back. 0:35:27.766 –> 0:35:39.376 Concurrency But, you know, in a way which is sufficiently protected to enable the user to have the right access to the right data and not be able to compromise my model in a way that it really shouldn’t be. 0:35:39.606 –> 0:35:46.196 Concurrency So that’s why this AI application layer really exists before this spot, because there’s so many ways to, like,
0:35:46.206 –> 0:35:57.206 Concurrency if I just get raw access to a model, like any sort of large language ecosystem that has a RAG pattern behind it, I’m gonna find that I can compromise it in ways that I really shouldn’t be allowed to. 0:35:57.896 –> 0:36:13.416 Concurrency Um, Brian, from your perspective, which has been the most evolving of these? Think about the last year, OK, from when we first started doing RAG to now. What do you think has changed the most in how you think about this over that last year? 0:36:15.366 –> 0:36:21.386 Brian Haydin Ohh, the tooling right now has become a little bit more ubiquitous and unified. 0:36:21.436 –> 0:36:27.76 Brian Haydin You know, to be able to support security and governance, there’s still some discrepancies. 0:36:27.86 –> 0:36:35.526 Brian Haydin You know, I’m thinking of, you know, some of the Copilot Studio governance that doesn’t have the synergy that the other platforms would have. 0:36:35.406 –> 0:36:35.646 Concurrency Hmm. 0:36:36.196 –> 0:36:47.996 Brian Haydin You know, you have some ability to govern, but I think we’re getting closer to being able to establish the three golden rules pretty easily and out of the box, in whatever medium 0:36:48.66 –> 0:36:54.36 Brian Haydin that you want to explore with AI, so I think you’re gonna see that come together a little bit this year. 0:36:54.746 –> 0:36:57.556 Brian Haydin Certainly people don’t pay enough attention to it. 0:36:58.286 –> 0:37:12.676 Brian Haydin The other thing too is that I like some of the features that are coming out in Azure AI Studio, like the groundedness detection, and you know, really starting to elevate safety and risk, you know, with these technologies. 0:37:14.6 –> 0:37:18.966 Concurrency I totally agree on that groundedness detection thing, and that is truly an evolving space.
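To make the groundedness idea concrete, here is a deliberately naive sketch: it flags sentences in a model's answer whose words are mostly absent from the retrieved source text. This is not how Azure AI Studio's groundedness detection works internally (that is not described in the webinar); the word-overlap heuristic, the function names, and the 0.6 threshold are all assumptions chosen only to illustrate the "double check the answer against the source" concept.

```python
import re


def _tokens(text):
    """Lowercase alphanumeric tokens of a piece of text, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def ungrounded_sentences(answer, sources, threshold=0.6):
    """Return the answer sentences with too little word overlap with the sources."""
    source_tokens = set()
    for source in sources:
        source_tokens |= _tokens(source)

    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        tokens = _tokens(sentence)
        if not tokens:
            continue
        support = len(tokens & source_tokens) / len(tokens)
        if support < threshold:
            flagged.append(sentence)
    return flagged


sources = ["Employees accrue fifteen vacation days per year."]
answer = (
    "Employees accrue fifteen vacation days per year. "
    "Unused days are paid out in cash."
)
print(ungrounded_sentences(answer, sources))
```

A production check would compare meaning rather than surface words, but the control point is the same: the response is validated against the source before it is returned to the user.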
0:37:18.976 –> 0:37:37.226 Concurrency But one of the bigger problems that existed was the AI model responding with content attributed to a source document or database that actually didn’t come from that database, or from that document, and groundedness really representing this idea that we’re gonna double check. 0:37:37.236 –> 0:37:38.6 Concurrency We’re gonna do this 0:37:38.16 –> 0:37:42.316 Concurrency double check on the data to make sure you actually responded with something that actually existed. 0:37:42.936 –> 0:37:43.326 Concurrency Umm. 0:37:43.556 –> 0:37:48.336 Concurrency Essentially a trust validation, but that being part of the platform, not something you actually have to go build now. 0:37:48.926 –> 0:37:49.136 Brian Haydin Yeah. 0:37:51.426 –> 0:37:53.736 Concurrency OK, so each of these represent layers of defense. 0:37:53.746 –> 0:37:58.216 Concurrency They represent controls and, uh, activities 0:37:58.226 –> 0:38:0.46 Concurrency we have to build into our application platforms. 0:38:3.546 –> 0:38:11.166 Concurrency OK, so each of these sort of layer into this idea of AI content safety as we think about our governance. 0:38:11.456 –> 0:38:15.306 Concurrency The first is: how do I train my user to interact with my system? 0:38:15.556 –> 0:38:19.826 Concurrency So what are the ways that I ask them to interact? 0:38:19.876 –> 0:38:26.506 Concurrency How do I prompt them appropriately to get the data they need? And that’s both a security thing and an ease-of-use thing. 0:38:26.516 –> 0:38:30.566 Concurrency So, like, AI systems that start off with: here’s the way that you can work with me.
0:38:30.576 –> 0:38:49.656 Concurrency And here’s kinds of questions you can ask, and here’s ones that I won’t respond to. Giving them clear guidance as to how you can interact with the AI system is just a best practice in general, but then also building policy around that initial intake to say: nope, you’re not getting through, you can go through this spot. 0:38:49.726 –> 0:38:54.176 Concurrency But I’m not letting you go through over here, because that’s not what my system is built to do. 0:38:54.226 –> 0:38:55.46 Concurrency Like, you’re not. 0:38:55.566 –> 0:39:15.66 Concurrency My system doesn’t exist to perform that function for you. And then building in place those data access controls that further, like, allow it to go down certain routes, and then that also existing in the context of the application itself and the AI model, both represented as ways to be able to provide that protection, and governance has to cross all of those. 0:39:16.516 –> 0:39:26.796 Concurrency So as we sort of go into the next stage of this, we’re going to talk about how data flows into this type of application model. 0:39:27.466 –> 0:39:31.36 Concurrency And there’s two big domains we always talk about when we talk about AI. 0:39:31.86 –> 0:39:34.786 Concurrency And it’s also super relevant in the context of managing that data. 0:39:34.796 –> 0:39:39.956 Concurrency One, it’s really this management between commodity and mission-driven data and AI use cases. 0:39:40.306 –> 0:39:46.136 Concurrency So commodity meaning, really, that many of the platforms I use are going to have AI platforms. 0:39:47.306 –> 0:40:2.506 Concurrency Lesser interesting thing to say, but if I have Office 365, M365 Copilot, Salesforce, Dynamics, ERP platforms, you name it, they’re all gonna have a type of copilot that gets lit up with their platform just because it’s what everyone’s doing. 0:40:2.516 –> 0:40:2.736 Concurrency But
0:40:2.746 –> 0:40:7.906 Concurrency naturally, it’s a way for us to drive value inside that commodity platform. 0:40:7.916 –> 0:40:19.636 Concurrency We have to realize that a certain amount of data governance is gonna happen or need to happen in the context of what it’s delivering to our customers before it even gets to, like, bigger level use cases. 0:40:19.886 –> 0:40:40.956 Concurrency So if I think about M365 Copilot, an example of what I might use to govern the data that’s available to it is SharePoint permissions and labeling, but the overarching technology that I might look at is something like Purview for Office 365, which then forces the application of labels, understands what labels exist, where they exist, how they’re being used, to 0:40:40.966 –> 0:40:47.556 Concurrency think about that broader commodity ecosystem. And then on the sort of mission-driven side, we have the same challenge, right? 0:40:47.566 –> 0:40:49.436 Concurrency We have this need for a certified data set. 0:40:49.486 –> 0:41:0.46 Concurrency We have this need for establishing the owners of that data, and that is more highly structured data sources or sort of pools of less structured data. 0:41:0.56 –> 0:41:2.206 Concurrency But it’s not a SaaS platform per se. 0:41:2.216 –> 0:41:13.926 Concurrency It might be something that’s more like living in a data lake in some domain, so we also have a need to manage that ecosystem, and sometimes this can cross IT concerns. Like, the people who probably care about 0:41:13.976 –> 0:41:20.416 Concurrency the commodity side, especially for Office 365 right now, are probably different than the people right now who care about data, 0:41:20.666 –> 0:41:34.396 Concurrency you know, on the mission-driven side, like more highly structured data. And realize that your concern about that data needs to cross both of those domains. IT needs to think about all of it, and it needs to cross the entire ecosystem.
0:41:35.626 –> 0:41:54.136 Concurrency So I showed this diagram before. I’m bringing it back because I want you to remember that each of these functional towers represents diversities of domain, diversities of data that serve it, and there’s a governance flow for each of those towers. 0:41:54.316 –> 0:42:21.646 Concurrency So we’re gonna sit on this for a second, because I think this is a good way for us to think about the overall picture. We have a source system, and that source system is usually a business system, or it could be outputted data from a connected device, an IoT platform, or it could even be data created by an individual user, like a document that sits in this source system, and that’s consumed by something. It’s consumed by that modern data platform. 0:42:21.936 –> 0:42:51.766 Concurrency And that modern data platform being something like Fabric, and the movement into that modern data platform happening in an automated way, not just a one-time move, but something that establishes this lineage of trust, that this flow is something that’s ongoing, that it is updated, it’s respected, it’s coming from this source system. Because if I cut off the source system from the modern data platform and I only move it once, it’s only as good as the last time I moved it. 0:42:52.186 –> 0:43:8.26 Concurrency Now I’m thinking about it in a continuous flow, which gets to this critical point which many companies don’t have yet, which is you have a certified data set, and I don’t mean spend the next two years building your, like, your mapped data platform, right? 0:43:8.36 –> 0:43:10.406 Concurrency I’m talking here specifically about: 0:43:10.416 –> 0:43:24.306 Concurrency I’ve got a use case. I need to understand what data is necessary to certify in order for me to have trust that I’m executing that use case well and have certified data that serves it, and that’s an agreed schema.
0:43:24.556 –> 0:43:31.246 Concurrency It’s published for use by the consumers, and then I have an understanding of how it got there. Is that ELT, is it ETL? 0:43:31.256 –> 0:43:33.386 Concurrency Like, what process did it go through? 0:43:33.396 –> 0:43:46.436 Concurrency What lineage existed for it to get from point A to point B? Which then moves me into this idea of who can access the data, which is potentially the actual rights that existed here on step one, and moving that into: 0:43:46.446 –> 0:43:47.736 Concurrency How is it being used? 0:43:47.946 –> 0:43:50.476 Concurrency Is it a data science use case where I’m truly just 0:43:50.526 –> 0:43:53.946 Concurrency doing prediction and there’s no intermediary step? 0:43:54.906 –> 0:43:56.66 Concurrency Is it a big data use case? 0:43:56.206 –> 0:44:8.676 Concurrency Is it connectivity between people and how it’s consumed, or is it dashboards, self service, which then has its own scoped access? And then at the end of this pattern exists this overall flow of 0:44:8.686 –> 0:44:46.456 Concurrency how I’m bringing that on an ongoing basis to my customers. And Brian, earlier you mentioned preparing the data being one of the more difficult challenges, like even just getting from source system to, you know, the platform itself being one of the more difficult parts of that picture. From your perspective, how does that relate to certification of data, and how do I judge what part of that I’m actually certifying? Or, I guess, explain to me the relationship between that preparation period and 0:44:46.466 –> 0:44:47.746 Concurrency the certification of the data. 0:44:50.96 –> 0:44:50.646 Brian Haydin Yeah. 0:44:50.656 –> 0:44:53.366 Brian Haydin So OK, let’s take, like, a new customer.
0:44:53.616 –> 0:45:3.46 Brian Haydin You know, as an example, so my sales team might consider a new customer to be somebody that hasn’t done business with us in the last 12 months. 0:45:3.586 –> 0:45:7.746 Brian Haydin But and and I have like another department. 0:45:7.756 –> 0:45:18.146 Brian Haydin Maybe it’s my compliance department that considers like anybody like a new customer that’s ever like a customer that’s ever done business with us is not a new customer like it’s just if this is the first time. 0:45:18.596 –> 0:45:20.326 Brian Haydin So, like who owns that data? 0:45:20.846 –> 0:45:43.676 Brian Haydin You know, to determine what the custom, you know what defines the new customer first, like that’s a challenge getting like different, you know, people just say, OK, I own this and make the decision and then obviously it’s it becomes like preparing that data you know and then the certified data set is what’s actually delivered to the data consumers. 0:45:43.926 –> 0:45:48.196 Brian Haydin So you mentioned before, like you know, there’s the business consumers that seems to be pretty easy. 0:45:44.6 –> 0:45:44.266 Concurrency Umm. 0:45:48.466 –> 0:45:58.966 Brian Haydin I’m gonna use my silver, gold kind of medallions, you know, probably for some of the the visualizations, but the data scientists, they’re gonna want something a little bit more raw. 0:45:58.976 –> 0:46:1.126 Brian Haydin They’re probably gonna want, like, something in the bronze layer. 0:46:1.826 –> 0:46:10.396 Brian Haydin Umm, you know, in order to do their their work and bronze might, you know, might actually have conflicts of like what defines new customer. 0:46:10.786 –> 0:46:23.546 Brian Haydin So like you know that to me is like where those like delineations are, you know, the certification is not only in terms of a readiness, but also kind of geared towards who the consumer is. 0:46:25.826 –> 0:46:28.876 Concurrency Huge point that especially like preparation of data. 
0:46:28.886 –> 0:46:30.926 Concurrency What it needs to go through for a person to use it right? 0:46:29.946 –> 0:46:30.206 Brian Haydin Yep. 0:46:33.626 –> 0:46:33.806 Concurrency So. 0:46:35.106 –> 0:46:49.556 Concurrency Umm, So what you’ll notice is I took that picture, I turned it on side and I was as I was preparing the slide, it was sort of like what are you doing with this slide? It’s. 0:46:49.266 –> 0:46:49.746 Brian Haydin My quote. 0:46:51.716 –> 0:46:51.966 Concurrency No. 0:46:51.976 –> 0:46:52.486 Concurrency Makes sense. 0:46:52.766 –> 0:46:57.246 Concurrency Uh, so it really looks like a verticalized stack there, right? 0:46:57.256 –> 0:47:9.586 Concurrency You’re starting from something to get to that consumed asset for a domain and maybe even you have to combine that together to accomplish an end game or outcome. 0:47:9.776 –> 0:47:16.146 Concurrency But you’re turning outside and flowing up through that cycle for a consumer to be able to leverage it. 0:47:16.156 –> 0:47:18.566 Concurrency And the question is that that’s a lot of work. 0:47:18.576 –> 0:47:19.106 Concurrency Holy cow. 0:47:19.806 –> 0:47:30.26 Concurrency But how do I trust this data if I don’t go through that pattern and some of your data might be closer to that level of preparedness than others, but it’s the hard work that needs to go. 0:47:30.36 –> 0:47:36.686 Concurrency Your team needs to go through to be able to trust the data to deliver something that really matters, and that’s really what we’re getting to here. 0:47:36.696 –> 0:47:47.356 Concurrency Like some, there’s some things in your business that just kind of matter, but there’s a lot of things in your business that really matter, and sometimes the most important things that create new revenue or operational savings, they’ve really matter. 
0:47:47.486 –> 0:47:58.6 Concurrency It means that your data needs to matter, and the accuracy and precision of your data need to matter. It needs to go through a process that enables it to be as effective and useful as possible. 0:47:59.216 –> 0:48:8.706 Concurrency So all of that is needing some surrounding tooling, so in this last piece I’m just going to cover a couple tools that I think would be useful for you to think about. 0:48:8.716 –> 0:48:15.526 Concurrency And I want you to really ready your questions if you have any. Here at the end, I will have a few minutes, I think, to cover a few of these. 0:48:15.596 –> 0:48:21.886 Concurrency So one of the tools that I would encourage you to look at as you’re examining leveraging things like Fabric is Microsoft Purview. 0:48:21.936 –> 0:48:28.596 Concurrency I talked about that just recently. If you have an Office 365 environment, you probably already know that this is part of your picture. 0:48:29.306 –> 0:48:32.796 Concurrency But if you have, like, a cloud data estate, you may or may not be using this. 0:48:33.106 –> 0:48:37.786 Concurrency This is a tool set that, or even something roughly comparable to it, 0:48:37.796 –> 0:48:41.256 Concurrency you need: something that gives you the picture of the data estate. 0:48:41.526 –> 0:48:51.736 Concurrency We’ve referred to this also as the picture of the elephant. You’ve heard the story: a bunch of blind people go up to an elephant and it’s a rope, or it’s a tree trunk, or it’s a hose, or whatever. 0:48:51.746 –> 0:48:52.706 Concurrency Well, it’s an elephant, right? 0:48:52.716 –> 0:48:54.216 Concurrency But they all have different pieces of it. 0:48:54.506 –> 0:49:4.126 Concurrency This sort of picture of the elephant is also something that a platform like this really is oriented around. Its goal is: give me that picture. 0:49:4.636 –> 0:49:8.146 Concurrency Also allow me to apply security to the data.
0:49:8.256 –> 0:49:9.446 Concurrency Allow me to govern it. 0:49:9.456 –> 0:49:20.686 Concurrency The data map, the picture, and then, like, understanding the risk and compliance ecosystem that surrounds it, and going into Purview is probably a conversation in and of itself. 0:49:20.696 –> 0:49:24.516 Concurrency It’s its own sort of deep dive; realize that it’s a stack. 0:49:24.526 –> 0:49:31.486 Concurrency That’s evolved substantially over the last several years, and it’s one that has become more and more integrated just in Fabric itself. 0:49:31.496 –> 0:49:50.176 Concurrency So, I mean, just talking Fabric for a second, if you’re a Power BI user, you’re already a Fabric user; like, Power BI is basically now an extension of Fabric, or Fabric is an extension of Power BI, and it’s essentially making every Power BI user a data user, a data preparedness user. 0:49:50.386 –> 0:49:57.416 Concurrency So a lot of the tooling from Purview is getting integrated right within that state, like lineage and certified data sets. 0:49:57.426 –> 0:50:16.676 Concurrency So the idea of endorsing a set of data, that data being certified by a person as the trusted set of data that’s used by a set of business consumers that are producing Power BI reports, is super useful. Like, one of the most challenging things for self-service is, like, I see four versions of this same table. 0:50:16.686 –> 0:50:19.236 Concurrency Like, which one is the one I should actually consume from? 0:50:19.526 –> 0:50:32.176 Concurrency Noting that this is certified, noting that it exists in a certain level of the medallion, so people know what to choose and what to use, especially in scaled environments. Like, sometimes smaller organizations are like, I only have one report writer. 0:50:32.186 –> 0:50:34.76 Concurrency It’s all in it, like, OK, I get it. 0:50:34.706 –> 0:50:35.436 Concurrency Organization. 0:50:35.446 –> 0:50:37.556 Concurrency That’s much more scaled out than.
0:50:37.706 –> 0:50:40.456 Concurrency You don’t have time to have, like, one person building all your reports. 0:50:40.466 –> 0:50:43.416 Concurrency You’re scaling this out across the organization, and that’s truly the idea. 0:50:43.426 –> 0:50:45.516 Concurrency Like, how do I democratize data access? 0:50:46.196 –> 0:50:56.566 Concurrency So you democratize data access by saying, build what you need, but do it on the certified data set, and that is based on this idea of lineage. 0:50:56.626 –> 0:50:59.356 Concurrency The lineage being, like, where did you come from? 0:50:59.506 –> 0:51:0.616 Concurrency We all have lineage. 0:51:0.626 –> 0:51:2.656 Concurrency From a personal standpoint, like. 0:51:3.176 –> 0:51:3.866 Concurrency I. 0:51:4.56 –> 0:51:6.86 Concurrency But there’s data lineage too, right? 0:51:6.96 –> 0:51:7.386 Concurrency It came from this source. 0:51:7.476 –> 0:51:8.706 Concurrency It flowed through these steps. 0:51:8.716 –> 0:51:9.906 Concurrency It got to this point. 0:51:10.96 –> 0:51:15.646 Concurrency It’s one of the most common things in companies that don’t have lineage in data and are back at that maturity level 0:51:15.656 –> 0:51:18.126 Concurrency one: I don’t trust your data. 0:51:18.136 –> 0:51:19.546 Concurrency Is this really the truth? 0:51:19.556 –> 0:51:21.176 Concurrency Is this, where did this come from? 0:51:21.186 –> 0:51:23.206 Concurrency What did you do to get it to look like this? 0:51:23.456 –> 0:51:30.986 Concurrency And there’s sort of a lack of alignment on whether this is truly representative of the truth of the environment. 0:51:31.136 –> 0:51:35.306 Concurrency Data lineage helps us to establish that truth, both by knowing: 0:51:35.456 –> 0:51:36.426 Concurrency Here’s the source. 0:51:36.436 –> 0:51:37.566 Concurrency Here’s how it got here. 0:51:37.576 –> 0:51:38.896 Concurrency Here’s how I certified it.
0:51:38.976 –> 0:51:43.86 Concurrency Here’s the people accountable for signing off that that data is accurate. 0:51:43.196 –> 0:51:48.616 Concurrency And then, uh, aligning individuals within the business to be able to provide that access. 0:51:48.626 –> 0:51:51.426 Concurrency Hey, attestation of the data itself. 0:51:51.506 –> 0:51:59.276 Concurrency So lineage is kind of like one of those nice things, and the more that’s just integrated into what I do, as opposed to, like, this extra step, is super important. 0:52:0.116 –> 0:52:3.776 Concurrency So OK, so we’ve got about 8 minutes left. 0:52:4.6 –> 0:52:6.826 Concurrency We’d love to answer any additional questions that you have. 0:52:6.836 –> 0:52:8.596 Concurrency Let’s kind of bounce over to the chat here. 0:52:8.606 –> 0:52:9.956 Concurrency I think, OK. Yes. 0:52:9.966 –> 0:52:10.956 Concurrency Oh, Amy, thank you. 0:52:11.216 –> 0:52:19.136 Concurrency I’m gonna cover this before we, like, as you’re preparing any questions, but we do have some things you can take as next steps, so ways that you can work with us. 0:52:19.346 –> 0:52:34.746 Concurrency We would love to do a free data governance assessment to take a look at how you’re thinking about your data estate and position that for you to be able to make forward steps, and that may be, like, you already understand your estate, you’re not sure how to secure it and govern it. 0:52:34.756 –> 0:52:39.786 Concurrency Maybe you’re not even sure how you’re gonna use your data in the future, and maybe you need to back up and think about the end in mind. 0:52:40.36 –> 0:52:48.746 Concurrency So ways that we can help with that are the data governance assessment; we can also step into that end in mind with our AI and Copilot envisioning sessions. 0:52:48.976 –> 0:52:52.866 Concurrency We’ve run over 50 of these with executive teams in the last six months.
0:52:53.76 –> 0:53:6.86 Concurrency This is us helping you to think about, how does the mission of my business become augmented by the availability of artificial intelligence and data to be able to help me scale and create more revenue and operational savings within the organization. 0:53:7.576 –> 0:53:10.26 Concurrency That’s something that we invest in for you. 0:53:10.76 –> 0:53:20.26 Concurrency So these are all things that we invest in to help create an effective partnership, but even more so help you to take advantage of data to accomplish powerful outcomes within your organization. 0:53:20.36 –> 0:53:25.896 Concurrency So as you close down, fill out the survey. A, we want to know if this is useful to you. 0:53:25.986 –> 0:53:26.186 Concurrency B. 0:53:26.196 –> 0:53:27.616 Concurrency We’d love to work with you again. 0:53:28.156 –> 0:53:30.56 Concurrency Indicate what those things could be. 0:53:30.146 –> 0:53:33.276 Concurrency If you have suggestions as to how we could do this better, give us those suggestions. 0:53:33.286 –> 0:53:37.26 Concurrency We want to get your feedback, or if you loved it, tell us what you loved about it. 0:53:37.36 –> 0:53:37.946 Concurrency So we can do more of it. 0:53:38.416 –> 0:53:42.166 Concurrency And with that, we would love to take any outstanding questions you have. 0:53:42.226 –> 0:53:43.536 Concurrency So thank you. 0:53:44.66 –> 0:53:45.506 Concurrency Go ahead and drop any questions you have. 0:53:45.516 –> 0:53:48.286 Concurrency We’ll stick around and hang out. 0:53:57.136 –> 0:53:59.46 Amy Cousland I saw a question about the slides. 0:53:59.56 –> 0:53:59.846 Amy Cousland They will be available. 0:54:0.526 –> 0:54:1.196 Concurrency Thanks, Amy. 0:54:1.266 –> 0:54:4.16 Concurrency Yep, we’ll make sure that that’s available to everyone if, uh. 0:54:4.56 –> 0:54:5.526 Concurrency Make sure to reach out if you want those slides.
0:54:11.776 –> 0:54:14.416 Concurrency Awesome. Everyone, have a wonderful afternoon. 0:54:14.496 –> 0:54:15.646 Concurrency Thank you for your time. 0:54:16.96 –> 0:54:19.986 Concurrency We really enjoyed presenting this to you; it’s a great topic for us to dig into. 0:54:20.616 –> 0:54:22.526 Concurrency Please reach out. Follow up. 0:54:22.596 –> 0:54:24.726 Concurrency We’d love to have more conversations, and have a great day.