View Recording: Responsible AI: Data Governance
June 20, 2024

In the rapidly evolving landscape of artificial intelligence, ensuring the integrity, security, and ethical use of data is paramount. Join us for an insightful workshop where we will explore the critical role of robust data governance in maximizing the effectiveness and reliability of AI systems.

This webinar will cover:
Understanding Data Governance: Learn the fundamentals of data governance and its importance in the context of AI.
Best Practices for Data Management: Discover strategies for managing data quality, consistency, and security to enhance AI outcomes.
Compliance and Ethics: Discuss the legal and ethical considerations in AI data governance, including privacy concerns and regulatory compliance.
Case Studies and Real-World Examples: Gain insights from real-world applications of data governance in AI projects, highlighting successes and lessons learned.
Future Trends: Explore emerging trends and technologies in data governance that will shape the future of AI.

Whether you’re a business leader, data scientist, or AI practitioner, this webinar will provide you with actionable insights and practical tools to optimize your AI initiatives through effective data governance.

Transcription

Nathan Lasnoski OK, welcome to Responsible AI: Data Governance.
0:0:11.846 –> 0:0:19.736 Nathan Lasnoski We are going to have a great conversation today about how you prepare your data estate to gain ground in leveraging AI within your organization.
0:0:20.126 –> 0:0:31.796 Nathan Lasnoski This conversation is one that we have been really looking forward to because so many companies are trying to gain ground in AI, but they sometimes forget that they need to get their data ready in order to make that happen.
0:0:32.6 –> 0:0:35.426 Nathan Lasnoski So we’re going to talk a lot about how you prepare yourself for that journey.
0:0:35.586 –> 0:0:39.706 Nathan Lasnoski And today on this call you have myself, Nathan Lasnoski.
0:0:39.716 –> 0:0:41.616 Nathan Lasnoski I’m Concurrency’s chief technology officer.
0:0:41.846 –> 0:0:43.936 Nathan Lasnoski I would love for you to connect with me on LinkedIn.
0:0:43.946 –> 0:0:53.556 Nathan Lasnoski I have a weekly newsletter on AI and all things AI leadership, and I also do a lot of posts and conversations around the AI data ecosystem.
0:0:53.566 –> 0:0:56.296 Nathan Lasnoski So I’d love to talk with you and connect with you online.
0:0:56.486 –> 0:0:57.556 Nathan Lasnoski And we also have Brian.
0:0:57.566 –> 0:0:58.946 Nathan Lasnoski Brian, you want to introduce yourself as well?
0:0:59.546 –> 0:0:59.896 Brian Haydin Yeah.
0:0:59.906 –> 0:1:0.756 Brian Haydin Hi, I’m Brian Haydin.
0:1:0.766 –> 0:1:2.736 Brian Haydin I’m a solution architect at Concurrency.
0:1:3.66 –> 0:1:5.106 Brian Haydin I’ve been working with this AI technology for,
0:1:6.956 –> 0:1:10.136 Brian Haydin yeah, probably 8 years now, and thanks for having me.
0:1:11.56 –> 0:1:11.526 Nathan Lasnoski Awesome.
0:1:11.536 –> 0:1:12.946 Nathan Lasnoski So glad that you are here.
0:1:13.216 –> 0:1:18.186 Nathan Lasnoski One thing that we would love to happen in this conversation today is for you to use the chat.
0:1:18.196 –> 0:1:28.726 Nathan Lasnoski So when you have questions or things that you’d like to learn about as we go through this conversation today, by all means drop those questions in the chat and we will address them in real time as we’re going through the conversation.
0:1:28.936 –> 0:1:34.546 Nathan Lasnoski And we may even hold a question to the end if it’s a big enough topic, and cover it in our Q&A section as well.
0:1:34.856 –> 0:1:37.826 Nathan Lasnoski So leverage that chat, put your questions out there.
0:1:37.896 –> 0:1:42.196 Nathan Lasnoski Let’s engage in helping you move the ball forward on the data and AI front.
0:1:43.326 –> 0:1:44.766 Nathan Lasnoski So what are you gonna learn today?
0:1:44.826 –> 0:1:47.896 Nathan Lasnoski We have really endeavored to make sure that this is useful to you.
0:1:48.166 –> 0:1:51.796 Nathan Lasnoski There are four things that we are going to accomplish within this session.
0:1:52.106 –> 0:1:57.176 Nathan Lasnoski The first thing we’re going to do is frame the AI and data preparation conversation.
0:1:57.186 –> 0:1:58.716 Nathan Lasnoski How should you think about it?
0:1:58.816 –> 0:2:20.506 Nathan Lasnoski And I think that’s important to consider because the way you might have thought about framing your AI conversation, the relationship between data and your AI journey, and how you think about data is something that has shifted in the last year, and has shifted as companies have been able to more assertively gain ground in their AI and data journey.
0:2:20.516 –> 0:2:24.286 Nathan Lasnoski So we’re going to frame that up and then we’re going to talk about how you think about your data estate.
0:2:24.476 –> 0:2:29.406 Nathan Lasnoski And then from that, we’re going to talk about how you consider your AI data model.
0:2:29.416 –> 0:2:32.936 Nathan Lasnoski What do I mean by data model? It’s a sort of loosey-goosey term.
0:2:32.946 –> 0:2:44.446 Nathan Lasnoski What I mean by AI data model is: how do I think about the sorts of data that I have when establishing an AI solution for my customers, whether internal customers or external customers?
0:2:44.626 –> 0:2:52.876 Nathan Lasnoski How do I think about the data that sits in that ecosystem, from preparing to positioning to the data that flows out of the system?
0:2:53.466 –> 0:3:10.236 Nathan Lasnoski How do I think about that model? And then on the tail end of this, we’re going to talk a little bit about AI data safety and some of the tools that go into picturing and proposing and positioning your AI solution well, to be able to protect what you’re delivering to your end customers at the end of the day.
0:3:10.246 –> 0:3:16.106 Nathan Lasnoski And this kind of goes into the data preparedness and governance guardrails that sit around these solutions.
0:3:16.116 –> 0:3:26.196 Nathan Lasnoski Brian and I are gonna have a conversation about these topics, and hopefully it’s interesting to you, you learn something from it, and it’s an opportunity for you to take this into follow-up conversations within your business.
0:3:27.606 –> 0:3:38.536 Nathan Lasnoski So we’re gonna start this by talking about what data governance looks like in the context of an organization, and then we’ll flow this into that framing conversation.
0:3:38.546 –> 0:3:51.116 Nathan Lasnoski So, of the things that companies are thinking about as they seek to establish a successful data governance journey, the first is that they realize that the data landscape continues to grow.
0:3:51.126 –> 0:3:58.336 Nathan Lasnoski You’ve probably seen other presentations in the past where people talk about the rate of increase of data that’s happening within most organizations.
0:3:58.346 –> 0:4:4.246 Nathan Lasnoski This is particularly true as connected devices become even more significant within our organizations.
0:4:4.256 –> 0:4:12.226 Nathan Lasnoski It’s not just data that exists from, like, the business system, but the very nature of the business, the core product of the business, producing data that can be used to accomplish good.
0:4:12.336 –> 0:4:32.866 Nathan Lasnoski I was talking with an individual who is very much into the OT side of the manufacturing ecosystem, and he refers to OT as sort of this undiscovered country of value that has existed for so long, but so infrequently do we leverage the data that exists within the OT ecosystem or data that’s coming from our very products.
0:4:33.156 –> 0:4:34.746 Nathan Lasnoski Great opportunity for us to talk about.
0:4:34.756 –> 0:4:35.676 Nathan Lasnoski Like, how do I harness that?
0:4:36.736 –> 0:4:42.456 Nathan Lasnoski The second is having to understand the operational silos that exist within the business.
0:4:42.466 –> 0:4:56.966 Nathan Lasnoski So we’re going to talk more about this: those operational functions being intentional parts of the business, but also limiting the extent to which they are truly silos that have difficulty sharing data with each other to accomplish organizational outcomes.
0:4:58.516 –> 0:5:17.86 Nathan Lasnoski The third, and something that’s directly related to it, is this balance that exists between governance and enablement. This is this idea of data agility, the idea that data exists to accomplish valuable outcomes for our business, and without those valuable outcomes, governance really has no function.
0:5:17.436 –> 0:5:23.706 Nathan Lasnoski If I have data but the data doesn’t accomplish anything, then I’m governing something that has no real benefit to the business.
0:5:23.896 –> 0:5:32.756 Nathan Lasnoski On the flip side, if I’m trying to accomplish benefit, but I don’t govern it, I don’t apply the necessary controls, then I’m gonna fail at this last one, which is
0:5:32.796 –> 0:5:34.306 Nathan Lasnoski ensuring that compliance
0:5:34.316 –> 0:6:2.396 Nathan Lasnoski exists with external regulations, or you could even pivot that last statement to say customer expectations: what the customer is expecting from me in terms of my handling of their data. And nowhere is that more significant than with data security, when you see some of these platforms that you use on a daily basis being compromised and taken advantage of, with the data that you hold dear, your identity, being used in ways that really are inappropriate.
0:6:2.646 –> 0:6:5.136 Nathan Lasnoski Brian, I’m gonna put you on the spot for a second.
0:6:5.366 –> 0:6:8.436 Nathan Lasnoski Which one of these do you think is the most important?
0:6:8.536 –> 0:6:15.766 Nathan Lasnoski Which do you think is the most critical, or maybe even the most difficult, in the context of the scaling organization?
0:6:17.436 –> 0:6:21.666 Brian Haydin I think that data agility is the most critical for organizations.
0:6:22.136 –> 0:6:36.386 Brian Haydin You know, this technology is growing so fast. Where we are today from six months ago is phenomenal, and the biggest impediment that organizations are facing right now is the ability to actually use this.
0:6:36.746 –> 0:6:38.606 Brian Haydin But it’s also really difficult, right?
0:6:38.656 –> 0:6:41.686 Brian Haydin And we’re gonna get to, you know, some of the Purview discussion.
0:6:42.276 –> 0:6:44.46 Brian Haydin There’s work that goes into it.
0:6:44.576 –> 0:6:54.576 Brian Haydin The good news is there are a lot of really good tools out there for it, like Microsoft Fabric, that’ll help accelerate some of that.
0:6:55.186 –> 0:6:57.266 Brian Haydin But that, I think, would be the pillar I would choose.
0:6:58.26 –> 0:6:59.66 Nathan Lasnoski Umm, OK, thank you.
0:7:0.666 –> 0:7:1.156 Nathan Lasnoski I agree.
0:7:1.206 –> 0:7:2.176 Nathan Lasnoski I think you’re totally right.
0:7:2.186 –> 0:7:11.476 Nathan Lasnoski Getting to a point where you truly do have data agility is so unlike where most organizations are at, and that goes to this conversation
0:7:11.486 –> 0:7:12.776 Nathan Lasnoski we’re gonna have about framing.
0:7:12.886 –> 0:7:24.616 Nathan Lasnoski So one of the things that I think companies need to think about as they form this idea of data is that data has changed in terms of the information that we’re leveraging within our greater estate.
0:7:24.626 –> 0:7:35.706 Nathan Lasnoski And one of the biggest changes within the last year and a half has been the availability of general knowledge that’s being applied into our data estate from outside sources,
0:7:36.6 –> 0:7:48.546 Nathan Lasnoski none more significant than that of large language models, which bring with them this general ability to converse and engage and have conversations in a way that’s so human-like.
0:7:48.616 –> 0:8:1.456 Nathan Lasnoski But it’s based on this idea of a general set of capabilities, a general set of knowledge, not just specific understanding but general knowledge that allows for that interchange that feels more human-like, or feels more enabled,
0:8:1.886 –> 0:8:10.956 Nathan Lasnoski than perhaps a model that’s built on natural language has in the past. But that gets combined with this idea of enterprise knowledge that’s unique to us.
0:8:11.226 –> 0:8:35.536 Nathan Lasnoski And so there is this pattern of combining general knowledge, whether it’s external information from managed sources that are on the Internet or other sources that are from partners, but more so internal knowledge, which is trusted, combined together into patterns like RAG that bring that trusted knowledge together into a pattern that serves my customer need.
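The pattern described here, a general-purpose language model combined with trusted enterprise knowledge through retrieval-augmented generation (RAG), can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the speakers' implementation: the `Document` fields, the `internal`/`partner` source labels, the sample documents, and the word-overlap scoring are all hypothetical stand-ins. A real system would typically use a vector index for retrieval and would send the resulting prompt to a language model, which is not called here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    source: str  # hypothetical labels: "internal" (trusted) or "partner"
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its words as a set, without punctuation."""
    return {word.strip(".,?!").lower() for word in text.split()} - {""}


def retrieve(question: str, corpus: list[Document], top_k: int = 2) -> list[Document]:
    """Toy retrieval: rank documents by how many words they share with the question."""
    q_words = tokenize(question)
    scored = [(len(q_words & tokenize(doc.text)), doc) for doc in corpus]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:top_k] if score > 0]


def build_prompt(question: str, context: list[Document]) -> str:
    """Ground the model's general knowledge in the retrieved enterprise documents."""
    lines = ["Answer using only the context below. Cite the document id you used.",
             "", "Context:"]
    for doc in context:
        lines.append(f"[{doc.doc_id} | {doc.source}] {doc.text}")
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)


# Illustrative data only; none of this comes from the webinar.
CORPUS = [
    Document("install-101", "internal", "Mount the bracket before wiring the pump controller."),
    Document("hr-7", "internal", "Vacation requests are approved by your manager."),
    Document("partner-3", "partner", "Pump controller firmware is supplied by the partner."),
]

question = "How do I install the pump controller?"
hits = retrieve(question, CORPUS)
prompt = build_prompt(question, hits)
print(prompt)
```

Keeping the source label next to each retrieved passage is one simple way to let governance rules (for example, "never surface partner data to external customers") be applied before the prompt is ever sent.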
0:8:36.16 –> 0:8:50.646 Nathan Lasnoski But realize that this is a whole ecosystem of governance that we frequently haven’t had to manage in the past. Many of our established data platforms and data patterns have been so focused on even a nuanced view of what enterprise knowledge looks like.
0:8:50.726 –> 0:9:2.66 Nathan Lasnoski And the enterprise knowledge has been so focused on this idea of just what exists in our ERP system or structured databases, and not even as broad as what true enterprise knowledge has become.
0:9:2.566 –> 0:9:3.506 Nathan Lasnoski So that’s really
0:9:4.606 –> 0:9:11.836 Nathan Lasnoski reinforced, I think, in where we’ve seen maturity curves in the past and how they’re changing in the future.
0:9:11.846 –> 0:9:21.516 Nathan Lasnoski So this maturity curve is one I’ve actually been using for some time, but it’s one that I think is changing in the context of that previous slide you just saw.
0:9:21.806 –> 0:9:32.76 Nathan Lasnoski So I’m going to explain this first, but then I’m going to pivot it to talk about how the estate has changed and why we need to think about it even more broadly than this.
0:9:32.326 –> 0:9:43.836 Nathan Lasnoski So in this maturity curve, you can see that many organizations start right down here in the lower left-hand corner, and the way you would think about that kind of starting point is: I’ve got a business system.
0:9:44.436 –> 0:9:54.86 Nathan Lasnoski But the way that I view data from that business system is by exporting it to some sort of vehicle report or, even more commonly, an Excel spreadsheet.
0:9:54.236 –> 0:10:15.256 Nathan Lasnoski And most of us at this point feel very seen, because the way we’ve handled data in our organizations is just that: we have a business system, our finance department exports it to Excel or the inventory management department exports it to Excel, they build pivot tables, and then they use that to organize their next steps.
0:10:15.606 –> 0:10:44.716 Nathan Lasnoski It’s very much like data gets frozen at that point, and then there’s human intuition applied to it to take that next step. From there you see organizations realizing, oh, nobody can truly trust that data because it’s still a person exporting it, cleansing it, doing something with it, and the repeatability, the stability of that data estate as a platform for a scaled understanding of how we leverage data to establish trust in the organization is missing.
0:10:44.896 –> 0:10:56.626 Nathan Lasnoski So organizations have spent a lot of time building these traditional on-premises data warehouses built on overnight ETLs that run and are then visualized through a variety of dashboards and scorecards.
0:10:57.156 –> 0:11:8.206 Nathan Lasnoski And one of the challenges with that approach has been that it took a lot of time to get to value, and many organizations spent two to three years just going down that road.
0:11:8.656 –> 0:11:17.146 Nathan Lasnoski “If they build it, they will come,” and they never really finished building it, and the outcomes of the actual build never really got to that point.
0:11:17.156 –> 0:11:18.556 Nathan Lasnoski And sometimes that can even be
0:11:18.566 –> 0:11:31.986 Nathan Lasnoski the modern data estates, where I dumped everything in Snowflake and didn’t really get to value, and you have to ask yourself, did I really even get there? Which sort of moves to this next stage, this idea of a modern data framework.
0:11:32.46 –> 0:11:33.376 Nathan Lasnoski Can I invest?
0:11:33.576 –> 0:11:36.276 Nathan Lasnoski I dumped a bunch of stuff in Snowflake. I
0:11:37.16 –> 0:11:38.586 Nathan Lasnoski have been trying to leverage that.
0:11:38.996 –> 0:11:45.986 Nathan Lasnoski I didn’t really get to the cleansing part and I’m ultimately maybe still dumping it back to Excel out of that.
0:11:46.26 –> 0:11:56.966 Nathan Lasnoski That managed data estate in the cloud, not realizing that I have to build myself a framework that lets me get to these higher-level capabilities, because where is real value established?
0:11:57.516 –> 0:12:2.126 Nathan Lasnoski It’s not just getting to the dashboard that people can trust, which is important.
0:12:2.256 –> 0:12:11.576 Nathan Lasnoski It’s getting to predicting something that’s going to happen, prescribing the action I can take, and then storytelling based upon that data on a potential choice that I could be making in the future.
0:12:11.966 –> 0:12:18.756 Nathan Lasnoski And all of that is based on establishing a baseline for me to establish in front of the customer internally.
0:12:19.66 –> 0:12:25.286 Nathan Lasnoski Now all of this is true in the context of a sort of structured data estate.
0:12:26.386 –> 0:12:36.826 Nathan Lasnoski The next step upon this is to say we have a set of data for which that is true in this structured data ecosystem.
0:12:37.256 –> 0:12:50.126 Nathan Lasnoski What’s made this even more complicated is I now have all this unstructured data that’s become part of that picture and all these partner relationships that enter into that consumed estate.
0:12:50.196 –> 0:12:53.326 Nathan Lasnoski So as I’m establishing my bronze,
0:12:53.336 –> 0:12:57.606 Nathan Lasnoski silver, gold pattern that existed back at this,
0:12:58.466 –> 0:13:9.256 Nathan Lasnoski this sort of maturity curve, as I’m moving into preparing data to do high-capability workloads on the AI front, which, you know, is really a prerequisite for me to go up here,
0:13:9.766 –> 0:13:16.776 Nathan Lasnoski I now realize that that has broadened beyond the structured data estate to now be inclusive of the unstructured data estate as well.
0:13:17.406 –> 0:13:32.166 Nathan Lasnoski Brian, what have you seen as the biggest challenge companies have as they’ve realized they have to broaden their picture to include sets of data that they wouldn’t have even thought about before as being part of that broader ecosystem?
0:13:34.986 –> 0:13:36.706 Brian Haydin Yeah, that is a tough one.
0:13:37.556 –> 0:13:49.986 Brian Haydin So, where a lot of these conversations around bringing in these disparate data systems have landed has been the lakehouse, you know, sort of architecture, right.
0:13:48.396 –> 0:13:48.616 Nathan Lasnoski Umm.
0:13:50.46 –> 0:14:0.756 Brian Haydin And being able to land your data into a lake and then be able to incorporate that into your workspace in Fabric.
0:14:1.6 –> 0:14:4.46 Brian Haydin You know, and do something meaningful with it.
0:14:4.736 –> 0:14:7.166 Brian Haydin That’s where, you know, we’ve been using the tools.
0:14:7.806 –> 0:14:15.36 Brian Haydin You know, in terms of what other sources people have access to,
0:14:15.346 –> 0:14:17.136 Brian Haydin I mean, to me, it’s just a knowledge thing.
0:14:17.826 –> 0:14:25.976 Brian Haydin You know, documents that you normally wouldn’t have considered to be data sources, things with, like, picture elements and whatnot.
0:14:26.646 –> 0:14:32.296 Brian Haydin You know, those are now meaningful. Things like images, you can actually use them to do real work.
0:14:32.746 –> 0:14:35.576 Brian Haydin So I don’t know if that helps.
0:14:36.46 –> 0:14:51.746 Nathan Lasnoski Yeah, I mean, to your point, one of the biggest data sources we’re seeing in the generative AI space isn’t just the content that exists in the ERP system, but might be documents that are used for customer support.
0:14:51.846 –> 0:14:59.216 Nathan Lasnoski So, like, we’re working with a customer right now; they have thousands of documents that represent their installation, build plans.
0:14:59.476 –> 0:15:7.486 Nathan Lasnoski And without those documents, their end customer can’t use their product, but it is so poorly structured,
0:15:7.496 –> 0:15:13.736 Nathan Lasnoski so disparately deployed across their environment, that it’s difficult for those end customers to find that information.
0:15:13.746 –> 0:15:18.226 Nathan Lasnoski So they seek solutions to make it easier for customers to do business with them.
0:15:18.556 –> 0:15:21.966 Nathan Lasnoski But it’s not like they’re just pulling it from an ERP system.
0:15:21.976 –> 0:15:24.996 Nathan Lasnoski They’re having to understand, like, how do I leverage that data?
0:15:25.6 –> 0:15:28.286 Nathan Lasnoski That’s as important as what exists in our ERP system
0:15:28.296 –> 0:15:36.436 Nathan Lasnoski from a customer point of view, but it now needs to be surfaced in more usable approaches that I hadn’t considered in the past, just because of, like,
0:15:36.646 –> 0:15:42.166 Nathan Lasnoski how we made it. You know, in a sense, we just kind of accepted that it was somewhat unusable or difficult to take advantage of.
0:15:42.206 –> 0:15:44.206 Nathan Lasnoski So yeah.
0:15:42.216 –> 0:15:43.616 Brian Haydin Yeah, and real-time data too.
0:15:43.706 –> 0:15:45.376 Brian Haydin Like, that’s another important part.
0:15:45.466 –> 0:15:51.536 Brian Haydin You know, that was difficult to use because models took, you know, so long to generate.
0:15:51.546 –> 0:15:53.256 Brian Haydin How do you even think about it?
0:15:53.546 –> 0:15:55.936 Brian Haydin I’ve got millions of transactions coming in per minute
0:15:55.946 –> 0:15:57.876 Brian Haydin from all these, you know, IoT devices.
0:15:58.866 –> 0:15:59.706 Brian Haydin Yeah, for sure.
0:16:1.606 –> 0:16:15.516 Nathan Lasnoski So all of that fills in from a consumption standpoint, but then realize that as you are leveraging AI solutions, you are creating new data, and you are creating data that you wouldn’t necessarily have had in the past.
0:16:15.526 –> 0:16:25.476 Nathan Lasnoski With the generated content that is produced via your AI system, especially in generative AI, you’re producing text exchanges with your customers that are present in transcripts.
0:16:25.736 –> 0:16:31.916 Nathan Lasnoski You’re producing responses, code that might be built, or even, as Brian said, the telemetry.
0:16:32.136 –> 0:16:43.676 Nathan Lasnoski Telemetry data of not just the data that’s entering into the system, but the user interaction within that generative AI system and how I might apply and improve it in the future.
0:16:43.766 –> 0:16:59.256 Nathan Lasnoski This real-time intermediary step of what I’m creating as I’m interacting with my customer is a new piece of data that I may not have considered in the past, because it was previously a human performing that work, or it might be just a human performing that work, not a human in partnership with the AI system.
0:16:59.626 –> 0:17:5.956 Nathan Lasnoski So that exists as a whole new space that I have to maintain as something I care about, and then tied to that
0:17:5.966 –> 0:17:41.456 Nathan Lasnoski is this idea of who owns all this and how they think about the maintaining of that data, which brings us all back to that whole governance idea: from consumption, what data is necessary to serve my use case, to the creation of data that exists throughout its lifecycle, to how I maintain and control what data is flowing through what system, and how I think about customer data versus data that is coming from an external partner, to data that I don’t wanna expose without any controls around it, to employee-sensitive data, especially in internal AI
0:17:41.516 –> 0:17:42.306 Nathan Lasnoski use cases.
0:17:42.536 –> 0:17:50.166 Nathan Lasnoski What don’t I want a system providing back to my end customer? All of this exists within the governed estate.
0:17:50.866 –> 0:17:55.616 Nathan Lasnoski So when you’re building these AI systems, all of this exists.
0:17:55.626 –> 0:18:13.716 Nathan Lasnoski We have the knowledge that encompasses everything that we just talked about, but also actions that are being performed by the AI system, with data being produced as a result, and then even as we start to go up the stack, there are sets of activities that are formatted in the context of instructions that are performed, even delegated sets of activities.
0:18:13.726 –> 0:18:21.926 Nathan Lasnoski So one of the futures I’ve been suggesting companies think about is this idea that right now, when we think about a lot of AI agents, we’re so familiar with the, like,
0:18:21.936 –> 0:18:22.666 Nathan Lasnoski I asked a question,
0:18:22.676 –> 0:18:30.46 Nathan Lasnoski I get information back pattern, and that’s very much the infancy of what AI systems are gonna be producing.
0:18:30.196 –> 0:19:1.216 Nathan Lasnoski Think about the future of AI systems as: I’ve asked it to perform an action, it performs that action, and it may even be a delegated set of AI agents that each perform subparts of that action. For me to have trust in that system means I have trust in the data and trust in the successful execution of those subparts of the action that are delegated to AI agents, in a similar way to which, if I was delegating to a set of employees to perform actions, I trust that they can perform it, that they know
0:19:1.306 –> 0:19:4.716 Nathan Lasnoski how to do it, and that they’re working from the right data to get that work done.
0:19:5.426 –> 0:19:6.766 Nathan Lasnoski Same kind of idea exists here.
0:19:7.746 –> 0:19:18.476 Nathan Lasnoski So this is a very busy picture, but it really pivots to the change I noted: how do I think about data?
0:19:18.486 –> 0:19:22.696 Nathan Lasnoski This is the way that we’re really urging you to think about data in your ecosystem.
0:19:23.46 –> 0:19:24.436 Nathan Lasnoski Start with the end in mind.
0:19:24.766 –> 0:19:27.276 Nathan Lasnoski So if there’s one major suggestion I would have for you,
0:19:27.286 –> 0:19:28.616 Nathan Lasnoski it’s start with the end in mind.
0:19:29.6 –> 0:19:31.216 Nathan Lasnoski Don’t think about your data estate as a monolith.
0:19:31.426 –> 0:19:40.886 Nathan Lasnoski Think about it as value streams that are leveraging data to accomplish good within your organization, and that begins with this idea of value-driven use cases.
0:19:41.446 –> 0:19:45.496 Nathan Lasnoski So inside of each of these value-driven use cases, and these are examples, right?
0:19:45.506 –> 0:19:49.666 Nathan Lasnoski Your business might have different value-driven use case domains.
0:19:51.56 –> 0:19:58.906 Nathan Lasnoski Each of these relates to a set of consumers that are leveraging data. Executive financial visibility.
0:19:58.996 –> 0:20:0.526 Nathan Lasnoski Where’s my truck?
0:20:0.716 –> 0:20:6.276 Nathan Lasnoski Personnel optimization, like for employees: what kind of turnover do I have?
0:20:6.286 –> 0:20:7.986 Nathan Lasnoski Where’s turnover more significant?
0:20:7.996 –> 0:20:19.286 Nathan Lasnoski What are things happening within that facility that might cause more turnover? Demand and inventory optimization, manufacturing optimization, the actual OT ecosystem itself.
0:20:19.596 –> 0:20:21.46 Nathan Lasnoski Quality data, or data
0:20:21.386 –> 0:20:22.676 Nathan Lasnoski that’s online from your system.
0:20:22.686 –> 0:20:25.776 Nathan Lasnoski Video analysis from the real product you’ve delivered to your customers.
0:20:26.266 –> 0:20:30.246 Nathan Lasnoski Sales optimization data or even, like, digital optimization data.
0:20:30.606 –> 0:20:45.406 Nathan Lasnoski All of that represents that sort of formation of use cases that are coming out of each of these different domains, and then relating to that is this idea of the data domain that serves it.
0:20:45.466 –> 0:20:53.16 Nathan Lasnoski So underneath each of these are these ideas of data domains that are present in the business.
0:20:53.26 –> 0:21:1.886 Nathan Lasnoski And another thing to reinforce is that IT may govern the system, but the business is the owner of this data.
0:21:2.976 –> 0:21:5.126 Nathan Lasnoski And sometimes organizations don’t think about it that way.
0:21:5.136 –> 0:21:13.906 Nathan Lasnoski They may think about it otherwise, but an owner in the business truly is the one that actually understands the data, understands its purpose and its function within your organization.
0:21:14.436 –> 0:21:54.816 Nathan Lasnoski So underneath that exist owners, and then centric to any one of those are consumers of that data. That could be business consumers of that data, where there are models produced and there’s a data product serving them, and data scientists in AI scenarios that are more about getting access to the raw data so they can position an AI model that’s able to respond to a need. And then across the bottom of this entire estate exists this idea of a certified data set that represents what the truth is, what data can be used as you are working to
0:21:54.866 –> 0:22:7.16 Nathan Lasnoski accomplish those value-driven use cases. And sometimes across this you’re exchanging data or leveraging data to accomplish greater needs, like a customer 360, which might cross several data domains
0:22:7.26 –> 0:22:15.696 Nathan Lasnoski to accomplish good, and there are more efficient models that exist now from a technical standpoint to reduce duplication of data to serve that goal.
0:22:15.846 –> 0:22:21.296 Nathan Lasnoski But from a sort of top view, this is how you need to start thinking about it.
0:22:21.306 –> 0:22:22.856 Nathan Lasnoski You notice that I didn’t start data up.
0:22:22.866 –> 0:22:25.836 Nathan Lasnoski I started use case down because we have to start with:
0:22:25.886 –> 0:22:40.586 Nathan Lasnoski What am I actually trying to accomplish with my data, and how do I then think about data as an input to that, which I either have right now in a prepared sense or I don’t? It positions me to focus energy to get that data ready.
0:22:41.646 –> 0:22:51.6 Nathan Lasnoski Brian, you know, how are you seeing companies sort of succeed at thinking about this picture? Any thoughts that you have on this?
0:22:52.766 –> 0:23:2.756 Brian Haydin So, you know, I’ve been talking about Fabric a lot, and I think every time you’ve called on me, I’ve said Fabric, but I think it’s important, really.
0:22:56.56 –> 0:22:56.276 Nathan Lasnoski Umm.
0:22:58.846 –> 0:22:59.106 Nathan Lasnoski Yeah.
0:23:3.546 –> 0:23:4.526 Brian Haydin So a lot of,
0:23:4.536 –> 0:23:10.66 Brian Haydin you know, the customers that I’m talking with are beginning journeys, you know, from something to something, right.
0:23:10.76 –> 0:23:12.486 Brian Haydin And a lot of times it’s in the Fabric ecosystem.
0:23:12.816 –> 0:23:19.586 Brian Haydin And so the conversations that I’m having with them right now are starting with that governance and the security at the beginning.
0:23:19.596 –> 0:23:33.886 Brian Haydin It’s so much easier to do it when you know that it’s gonna be needed and that those controls are already in place than it is for you to, like, build up these data domains and then try to backtrack and figure it out.
0:23:34.276 –> 0:23:39.996 Brian Haydin So I’m having a lot of conversations around this and incorporating that into our delivery stack
0:23:40.896 –> 0:23:44.466 Brian Haydin To make sure that, you know, security and governance is a first-class citizen. 0:23:45.996 –> 0:23:46.816 Nathan Lasnoski Yeah, that’s huge. 0:23:46.826 –> 0:23:58.246 Nathan Lasnoski I mean, just even talking Fabric for a minute, one of the coolest things about Fabric is, you know, a lot of times when we talked about this structure, there’s a lot of, like, publisher-subscriber models. 0:23:58.256 –> 0:24:21.686 Nathan Lasnoski Like, it duplicates the data to another domain to use it. Fabric has this idea of not moving the data but still making it available in another domain, which is a huge efficiency, while still maintaining the data domain idea, which goes right to your point of governance and establishing an effective playground for people to be able to do the work. 0:24:14.16 –> 0:24:14.276 Brian Haydin Umm. 0:24:22.756 –> 0:24:23.106 Brian Haydin Yeah. 0:24:23.116 –> 0:24:26.576 Brian Haydin And, you know, put this in the context of, like, the AI story, right? 0:24:26.836 –> 0:24:35.826 Brian Haydin I mean, there’s so many different use cases, from, like, you know, the HR chatbot that’s gonna serve up somebody’s salary because somebody didn’t, you know, think through the problem correctly. 0:24:36.696 –> 0:24:42.706 Brian Haydin But, you know, the AI technology is, to some organizations, you know, the unknown. 0:24:42.716 –> 0:24:43.246 Brian Haydin It’s scary. 0:24:43.256 –> 0:24:54.746 Brian Haydin It’s a new thing, “I guess I want to learn about this,” um, and, you know, putting the governance into the conversation at the beginning helps to allay some of those fears that people are having right now. 0:24:56.556 –> 0:24:56.866 Nathan Lasnoski Totally.
0:24:59.306 –> 0:25:12.496 Nathan Lasnoski So to that point of starting with the end in mind, we suggest that organizations, as they’re getting started, think about building a value analysis around the ways they’re going to leverage data. 0:25:12.506 –> 0:25:25.96 Nathan Lasnoski You notice that this is very much around understanding a category of value, a name for that use case, a description of what you’re trying to accomplish, and its value category, operations versus revenue. 0:25:25.266 –> 0:25:53.66 Nathan Lasnoski And then even how difficult it is to do. What you might find is that in some cases your data is already ready but it hasn’t been governed, such as document data; or your data may be ready but you haven’t figured out how to use it yet; or it might be a situation where the data isn’t ready at all and you need to position a forward strategy to make that data a useful asset in the future. Like, we’ve worked with companies that are just like, “Man, I wish we had a better understanding of our customers,” so 0:25:53.526 –> 0:26:10.6 Nathan Lasnoski we need to position a way for us to get that data as we sell them new products, which is starting from the ground up: we need to change the way that we’re gathering data from our customers. It’s a journey to get there, and we know that it’s a sort of long stick to be able to get to the end of that game. 0:26:10.976 –> 0:26:13.226 Nathan Lasnoski But we also know that that’s the right thing to do. 0:26:13.396 –> 0:26:16.266 Nathan Lasnoski So it might position us for some value 0:26:16.276 –> 0:26:20.586 Nathan Lasnoski we’re gonna get in the future, understanding that this is a goal of ours as an organization. 0:26:22.356 –> 0:26:23.106 Nathan Lasnoski So what 0:26:23.116 –> 0:26:29.816 Nathan Lasnoski we’re gonna pivot into now is, sort of, if this was like act one, two, and three, right? 0:26:29.826 –> 0:26:31.146 Nathan Lasnoski We just got through act one.
0:26:31.376 –> 0:26:37.366 Nathan Lasnoski Act two is, now let’s talk about how data is used in building an AI solution, and then act three, 0:26:37.376 –> 0:26:39.946 Nathan Lasnoski of course, we’re gonna talk about some governance topics surrounding that. 0:26:40.476 –> 0:26:49.26 Nathan Lasnoski So when you think about building a copilot, and what I mean by a copilot is, don’t just think about a copilot as just Microsoft Copilot. 0:26:49.36 –> 0:26:51.516 Nathan Lasnoski OK, think about a copilot generically as a platform. 0:26:52.156 –> 0:27:4.966 Nathan Lasnoski Realize that a copilot can leverage a diversity of information, that very little of this may be governed right now in your environment, and that much of this might be content that lives in your SharePoint environment. 0:27:5.156 –> 0:27:12.176 Nathan Lasnoski So, as you’re starting even to roll out M365 Copilot, one of the first conversations we’re having is: do you know what’s shared or not? 0:27:12.186 –> 0:27:15.146 Nathan Lasnoski You have 10 years’ worth of legacy SharePoint data out there. 0:27:15.196 –> 0:27:17.746 Nathan Lasnoski Do you even know what people’s access to that looks like? 0:27:17.856 –> 0:27:21.926 Nathan Lasnoski Do you remember when you turned on Delve and what you could see or not see about a person’s usage? 0:27:22.366 –> 0:27:53.686 Nathan Lasnoski That’s, like, multiplied by 10 as you start enabling things like M365 Copilot. But then there’s also this accessibility of data that becomes available in Azure and Dataverse, and especially Dataverse becoming really, really accessible with Copilot Studio, and things like Fabric surfacing that data more accessibly for us. And, you know, where I was going especially is to think about the data as it flows through all those data sources and ecosystems. 0:27:53.696 –> 0:27:56.376 Nathan Lasnoski Think about it as it flows through this pattern.
0:27:56.486 –> 0:28:17.186 Nathan Lasnoski OK, so maybe just to define RAG for a second: this is retrieval-augmented generation, the idea that I have a static model that allows me to interact with my end customers, but ultimately the data that feeds the intelligence of the system is your data. It’s your data and it’s your data alone. 0:28:17.376 –> 0:28:27.446 Nathan Lasnoski But you need to prepare it for it to be effective, and that might be preparing a document, or it might be preparing a business system to be able to be integrated into a platform. 0:28:27.676 –> 0:28:34.976 Nathan Lasnoski So on the left-hand side, you see the ingestion of that data from whatever raw data you have, and again that could be a document. 0:28:35.446 –> 0:28:39.446 Nathan Lasnoski It could be a, you know, SQL database, whatever that raw data is. 0:28:40.486 –> 0:28:46.816 Nathan Lasnoski And as you review that data, you realize, “Oh, that’s what we’ve been using? There’s a problem.” 0:28:46.826 –> 0:28:48.746 Nathan Lasnoski We need to get to it 0:28:48.826 –> 0:28:52.496 Nathan Lasnoski being cleansed, the data that represents our end solution. 0:28:52.506 –> 0:29:10.196 Nathan Lasnoski So even if we go to, like, the most basic RAG pattern: I’m building an HR chatbot, and you look at your employee manual and you’re like, “Ooh, I gotta improve the quality of that manual,” because that’s the data that my HR chatbot’s gonna use to be able to interact with my end customer inside the business. 0:29:10.716 –> 0:29:13.886 Nathan Lasnoski And it’s not answering certain types of questions because my data sucks. 0:29:14.136 –> 0:29:28.676 Nathan Lasnoski So how do I improve that cleansed data? That’s the accountability of the business. You’re saying, “Business, you need to apply intelligence to that data source,” maybe in partnership with tech, to be able to get to that pattern.
0:29:30.446 –> 0:29:47.856 Nathan Lasnoski And then as you’re flowing through that, you are potentially chunking that data, meaning I’m breaking it up into smaller pieces that ultimately feed a vector database, and a vector database is essentially a way of understanding data that shows adjacencies between one word and another. 0:29:47.926 –> 0:29:59.976 Nathan Lasnoski So think, like, boat and propeller, or draft and depth, these ideas of relationships between words that we understand because we are humans that know adjacency. 0:30:0.696 –> 0:30:14.836 Nathan Lasnoski Um, but sometimes, like, if you think about learning another language, I might not know the relationship between one foreign word and another foreign word in the way that a native speaker might know. That’s sort of how a vector database is starting to help us, right? 0:30:14.846 –> 0:30:25.866 Nathan Lasnoski So what’s happening is that data is flowing into the creation of a vector database that then serves the interaction that happens in retrieval and response patterns 0:30:26.76 –> 0:30:30.246 Nathan Lasnoski with that end customer, and that end customer is: I’m asking questions. 0:30:30.296 –> 0:30:30.856 Nathan Lasnoski I’m responding. 0:30:31.326 –> 0:30:33.216 Nathan Lasnoski I’m asking the model to do something. 0:30:33.226 –> 0:30:35.486 Nathan Lasnoski It’s using the vector database to inform the action 0:30:35.496 –> 0:30:40.916 Nathan Lasnoski that’s then taken, and then it’s logging and maintaining information that happens in between us. 0:30:46.66 –> 0:30:46.706 Brian Haydin Pretty easy. 0:30:48.296 –> 0:30:50.976 Nathan Lasnoski Uh, what part of this is the hardest from your perspective? 0:30:51.16 –> 0:31:0.246 Nathan Lasnoski Like, if I’m a business that’s trying to get started in AI, understanding I have a huge diversity of data and maybe I’ve picked a use case, what’s the hardest part of this?
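The chunk, index, and retrieve flow described above can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions: `embed` is a toy bag-of-words stand-in for a real embedding model, so it only captures shared words and not the semantic adjacency (boat and propeller) that a real vector database provides. All function names and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Break a document into smaller pieces of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents, max_words=40):
    """Ingest cleansed documents: chunk each one and store (source, chunk, vector)."""
    index = []
    for name, text in documents.items():
        for chunk in chunk_text(text, max_words):
            index.append((name, chunk, embed(chunk)))
    return index


def retrieve(index, question, k=1):
    """Return the k chunks most similar to the question, with their source names."""
    q = embed(question)
    ranked = sorted(index, key=lambda entry: cosine(q, entry[2]), reverse=True)
    return [(name, chunk) for name, chunk, _ in ranked[:k]]


# Hypothetical cleansed sources for an HR chatbot.
documents = {
    "employee_manual": "Employees accrue fifteen vacation days per year.",
    "boat_guide": "The propeller sits below the draft line of the boat.",
}
index = build_index(documents)
print(retrieve(index, "How many vacation days do employees get?"))
```

The retrieved chunk is what would be handed to the model as grounding, which is why the quality of the cleansed source text matters so much.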
0:31:0.296 –> 0:31:2.936 Nathan Lasnoski Or maybe even, like, a couple of the hardest parts from your perspective. 0:31:3.476 –> 0:31:14.876 Brian Haydin Yeah, I think the big challenge for a lot of organizations is that first step, going from raw data to cleansed data, and, you know, we hear the protests from customers all the time. 0:31:14.886 –> 0:31:17.496 Brian Haydin “Well, I can’t pull this out of my ERP right now. 0:31:17.786 –> 0:31:19.636 Brian Haydin Things are such a mess right now. 0:31:19.646 –> 0:31:20.536 Brian Haydin I just don’t even wanna, 0:31:20.716 –> 0:31:27.796 Brian Haydin I don’t even wanna mess with it.” Um, but to me it’s almost a cop-out. 0:31:28.476 –> 0:31:32.986 Brian Haydin The whole point is that your data is not going to be ready when you pull it in, right? 0:31:32.996 –> 0:31:39.326 Brian Haydin It’s up to the data owners to make sense of it and clean it so that it is in a certified data set, 0:31:39.336 –> 0:31:40.386 Brian Haydin so you can start to use it. 0:31:40.756 –> 0:31:47.446 Brian Haydin So I would say, first off, overcoming that challenge with the organization is definitely where a lot of conversations happen. 0:31:47.906 –> 0:31:53.226 Brian Haydin And then in conjunction with this, in terms of starting these projects, it’s hard work. 0:31:53.526 –> 0:32:6.416 Brian Haydin You know, I mean, imagine going into a storage unit that hasn’t been touched in 20 years and trying to find that notebook that you remember putting there. That’s what it’s like. 0:32:0.806 –> 0:32:0.966 Nathan Lasnoski Yeah. 0:32:6.966 –> 0:32:8.326 Brian Haydin So it’s a lot of work. 0:32:9.576 –> 0:32:12.726 Nathan Lasnoski I sort of feel like this with use cases that are already document driven.
0:32:13.556 –> 0:32:24.856 Nathan Lasnoski It’s like you built a document-driven scenario, you intake all these documents, and then in the response you’re finding that it’s answering with the document from 20 years ago. 0:32:25.276 –> 0:32:27.756 Nathan Lasnoski It’s like, why is this answering this wrong? 0:32:27.766 –> 0:32:39.886 Nathan Lasnoski Well, wait, in that storage unit was a document from 20 years ago that is completely wrong now, not representing our product, but it’s used to answer the question. Stewarding that data has become an interesting problem. 0:32:40.556 –> 0:32:40.976 Brian Haydin For sure. 0:32:44.396 –> 0:32:48.726 Nathan Lasnoski Um, so I find this fascinating, 0:32:48.736 –> 0:32:59.466 Nathan Lasnoski this part of this conversation, which is: what are the three golden rules for the data that surrounds building an AI application and the AI application model itself? 0:32:59.556 –> 0:33:1.46 Nathan Lasnoski And this is sort of an AI 0:33:1.56 –> 0:33:2.476 Nathan Lasnoski application in a microcosm. 0:33:2.486 –> 0:33:4.766 Nathan Lasnoski OK, you have this idea of a prompt. 0:33:5.76 –> 0:33:6.966 Nathan Lasnoski It’s interacting with the AI application. 0:33:7.116 –> 0:33:11.836 Nathan Lasnoski There’s user content, there’s skills and resources that facilitate 0:33:12.76 –> 0:33:24.476 Nathan Lasnoski that AI application’s ability to perform its task, and then there’s content that you’re outputting back to the user as a result of that. There are some rules that exist regarding doing this well. 0:33:24.686 –> 0:33:33.816 Nathan Lasnoski So the first of those is: you have data that has access controls, and it has boundaries surrounding it that someone is accountable for, 0:33:33.826 –> 0:33:41.376 Nathan Lasnoski ideally the system owner, right? The first rule is to respect that data.
0:33:42.316 –> 0:33:58.236 Nathan Lasnoski So an example of that would be, let’s say that you have data that exists in SharePoint that is, uh, ingested into an AI application, and that data in SharePoint allows only certain people to be able to access that information. 0:33:58.686 –> 0:34:12.536 Nathan Lasnoski Your AI application should respect those rules for those individuals, and it should respect the labels surrounding those documents or those assets, which are then translated to the consumer interacting with it. 0:34:12.746 –> 0:34:28.676 Nathan Lasnoski Because sometimes people might ignore those, intake them, and then allow a person to get access to data that they should not have access to. The separation of concerns that exists regarding those documents needs to be maintained so the right people have access to the right things. 0:34:28.766 –> 0:34:29.876 Nathan Lasnoski So that’s rule one. 0:34:30.366 –> 0:34:38.996 Nathan Lasnoski Rule two is that your application should govern specifically what kinds of inputs and outputs you’re able to perform against certain data. 0:34:39.6 –> 0:34:57.766 Nathan Lasnoski So you’re building, like, a pattern or a set of controls here, and some of these are now available from Microsoft to further control what you can and can’t do against your application, facilitating very strict input and output controls to the AI application through the vehicle of the prompt. 0:34:58.16 –> 0:35:20.496 Nathan Lasnoski But the prompt itself is actually governed by a set of guardrails, and then the third here is that there are model controls that put guardrails around what the AI model itself can do. Essentially what will happen is that if your AI model has access to something, it’s going to give the information back to the application. 0:35:20.506 –> 0:35:26.926 Nathan Lasnoski So we need to establish guardrails that surround that picture so we’re able to provide the right data back.
0:35:27.766 –> 0:35:39.376 Nathan Lasnoski But, you know, in a way which is sufficiently protected, to enable the user to have the right access to the right data and not be able to compromise my model in a way that it really shouldn’t be. 0:35:39.606 –> 0:35:46.196 Nathan Lasnoski So that’s why this AI application layer really exists before this spot, because there are so many ways to, like, 0:35:46.206 –> 0:35:57.206 Nathan Lasnoski if I just get raw access to a model, like any sort of large language ecosystem that has a RAG pattern behind it, I’m gonna find that I can compromise it in ways that I really shouldn’t be allowed to. 0:35:57.896 –> 0:36:13.416 Nathan Lasnoski Um, Brian, from your perspective, which has been the most evolving of these? Think about the last year, OK, from when we first started doing RAG to now. What do you think has changed the most in how you think about this over that last year? 0:36:15.366 –> 0:36:21.386 Brian Haydin Oh, the tooling right now has become a little bit more ubiquitous and unified, 0:36:21.436 –> 0:36:27.76 Brian Haydin you know, to be able to support security and governance. There’s still some discrepancies. 0:36:27.86 –> 0:36:35.526 Brian Haydin You know, I’m thinking of some of the Copilot Studio governance that doesn’t have the synergy that the other platforms would have. 0:36:35.406 –> 0:36:35.646 Nathan Lasnoski Hmm. 0:36:36.196 –> 0:36:47.996 Brian Haydin You know, you have some ability to govern, but I think we’re getting closer to being able to establish the three golden rules pretty easily and out of the box, in whatever medium, 0:36:48.66 –> 0:36:54.36 Brian Haydin uh, that you want to explore with AI. So I think you’re gonna see that come together a little bit this year. 0:36:54.746 –> 0:36:57.556 Brian Haydin Certainly people don’t pay enough attention to it.
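The three golden rules above can be sketched as three small checks in the application layer. This is an illustration under assumptions, not how any Microsoft product implements them: the `Chunk` type, the group names, the blocked topics, and the blocked labels are all hypothetical, and rule three is reduced here to a final label filter on what goes back to the user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    allowed_groups: frozenset  # copied from the source system's permissions at ingestion
    label: str = "general"     # sensitivity label carried over from the source document


# Hypothetical policies; a real application would load these from governed config.
BLOCKED_TOPICS = ("salary", "ignore previous instructions")
BLOCKED_LABELS = {"confidential"}


def visible_chunks(chunks, user_groups):
    """Rule 1: only return chunks the user could already open in the source system."""
    return [c for c in chunks if c.allowed_groups & set(user_groups)]


def check_input(prompt):
    """Rule 2: refuse prompts outside what the application is built to do."""
    lowered = prompt.lower()
    return not any(topic in lowered for topic in BLOCKED_TOPICS)


def check_output(chunks):
    """Rule 3 (simplified): a last guard on what is handed back to the user."""
    return [c for c in chunks if c.label not in BLOCKED_LABELS]


def answer_context(prompt, chunks, user_groups):
    """Apply all three checks and return the chunks the model may ground on."""
    if not check_input(prompt):
        return []
    return check_output(visible_chunks(chunks, user_groups))


chunks = [
    Chunk("PTO policy", frozenset({"all-staff"})),
    Chunk("Exec comp plan", frozenset({"hr-admins"}), "confidential"),
]
print([c.text for c in answer_context("What is the PTO policy?", chunks, {"all-staff"})])
```

Keeping the permissions and labels attached to each chunk at ingestion is the design choice that makes rule one enforceable at query time; if they are dropped during intake, there is nothing left to respect.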
0:36:58.286 –> 0:37:12.676 Brian Haydin The other thing too is that I like some of the features that are coming out in Azure AI Studio, like the groundedness detection, and, you know, really starting to elevate safety and risk with these technologies. 0:37:14.6 –> 0:37:18.966 Nathan Lasnoski I totally agree on that groundedness detection thing, and that is truly an evolving space. 0:37:18.976 –> 0:37:37.226 Nathan Lasnoski But one of the bigger problems that existed was the AI model responding with content from a source document or database that actually didn’t come from that database, or from that document, and groundedness really represents this idea that we’re gonna double check. 0:37:37.236 –> 0:37:38.6 Nathan Lasnoski We’re gonna do this kind of 0:37:38.16 –> 0:37:42.316 Nathan Lasnoski double check on the data to make sure you actually responded with something that actually existed. 0:37:42.936 –> 0:37:43.326 Nathan Lasnoski Umm. 0:37:43.556 –> 0:37:48.336 Nathan Lasnoski Essentially a trust validation, but that being part of the platform, not something you actually have to go build now. 0:37:48.926 –> 0:37:49.136 Brian Haydin Yeah. 0:37:51.426 –> 0:37:53.736 Nathan Lasnoski OK, so each of these represent layers of defense. 0:37:53.746 –> 0:37:58.216 Nathan Lasnoski They represent controls and, uh, activities 0:37:58.226 –> 0:38:0.46 Nathan Lasnoski we have to build into our application platforms. 0:38:3.546 –> 0:38:11.166 Nathan Lasnoski OK, so each of these sort of layer into this idea of AI content safety as we think about our governance. 0:38:11.456 –> 0:38:15.306 Nathan Lasnoski The first is: how do I train my user to interact with my system? 0:38:15.556 –> 0:38:19.826 Nathan Lasnoski So what are the ways that I ask them to interact?
0:38:19.876 –> 0:38:26.506 Nathan Lasnoski How do I prompt them appropriately to get the data they need? That’s both a security thing and, like, an ease-of-use thing. 0:38:26.516 –> 0:38:30.566 Nathan Lasnoski So, like, AI systems that start off with, “Here’s the way that you can work with me, 0:38:30.576 –> 0:38:49.656 Nathan Lasnoski and here’s the kinds of questions you can ask, and here’s ones that I won’t respond to.” Giving them clear guidance as to how they can interact with the AI system is just a best practice in general, but then also building policy around that initial intake to say, “Nope, you’re not getting through. You can go through this spot, 0:38:49.726 –> 0:38:54.176 Nathan Lasnoski but I’m not letting you go through over here, because that’s not what my system is built to do.” 0:38:54.226 –> 0:38:55.46 Nathan Lasnoski Like, you’re not, 0:38:55.566 –> 0:39:15.66 Nathan Lasnoski my system doesn’t exist to perform that function for you. And then building in place those data access controls that further allow you to go down certain routes, and that also then existing in the context of the application itself and the AI model, both represented as ways to be able to provide that protection. Governance has to cross all of those. 0:39:16.516 –> 0:39:26.796 Nathan Lasnoski So as we sort of go into the next stage of this, we’re going to talk about how data flows into this type of application model. 0:39:27.466 –> 0:39:31.36 Nathan Lasnoski And there’s two big domains we always talk about when we talk about AI, 0:39:31.86 –> 0:39:34.786 Nathan Lasnoski and it’s also super relevant in the context of managing that data. 0:39:34.796 –> 0:39:39.956 Nathan Lasnoski It’s really this management between commodity and mission-driven data and AI use cases. 0:39:40.306 –> 0:39:46.136 Nathan Lasnoski So commodity, meaning really that many of the platforms I use are going to have AI platforms.
0:39:47.306 –> 0:40:2.506 Nathan Lasnoski Lesser interesting thing to say, but if I have Office 365, M365 Copilot, Salesforce, Dynamics, ERP platforms, you name it, they’re all gonna have a type of copilot that gets lit up with their platform just because it’s what everyone’s doing. 0:40:2.516 –> 0:40:2.736 Nathan Lasnoski But 0:40:2.746 –> 0:40:7.906 Nathan Lasnoski naturally, it’s a way for us to drive value inside that commodity platform. 0:40:7.916 –> 0:40:19.636 Nathan Lasnoski We have to realize that a certain amount of data governance is gonna happen, or need to happen, in the context of what it’s delivering to our customers before it even gets to, like, bigger-level use cases. 0:40:19.886 –> 0:40:40.956 Nathan Lasnoski So if I think about M365 Copilot, an example of what I might use to govern the data that’s available to it is SharePoint permissions and labeling, but the overarching technology that I might look at is something like Purview for Office 365, which then forces the application of labels and understands what labels exist, where they exist, and how they’re being used, to 0:40:40.966 –> 0:40:47.556 Nathan Lasnoski think about that broader commodity ecosystem. And then on the sort of mission-driven side, we have the same challenge, right? 0:40:47.566 –> 0:40:49.436 Nathan Lasnoski We have this need for a certified data set. 0:40:49.486 –> 0:41:0.46 Nathan Lasnoski We have this need for establishing the owners of that data, and that is more highly structured data sources or sort of pools of less structured data. 0:41:0.56 –> 0:41:2.206 Nathan Lasnoski But it’s not a SaaS platform per se. 0:41:2.216 –> 0:41:13.926 Nathan Lasnoski It might be something that’s more like living in a data lake in some domain, so we also have a need to manage that ecosystem. And sometimes this can cross IT concerns, like, the people who probably care about
0:41:13.976 –> 0:41:20.416 Nathan Lasnoski the commodity side, especially for Office 365 right now, are probably different than the people right now who care about data, 0:41:20.666 –> 0:41:34.396 Nathan Lasnoski you know, on the mission-driven side, like more highly structured data. And realize that your concern about that data needs to cross both of those domains. IT needs to think about all of it, and it needs to cross the entire ecosystem. 0:41:35.626 –> 0:41:54.136 Nathan Lasnoski So I showed this diagram before. I’m bringing it back because I want you to remember that each of these functional towers represents diversities of domain and diversities of data that serve it, and there’s a governance flow for each of those towers. 0:41:54.316 –> 0:42:21.646 Nathan Lasnoski So we’re gonna sit on this for a second, because I think this is a good way for us to think about the overall picture. We have a source system, and that source system is usually a business system, or it could be outputted data from a connected device and an IoT platform, or it could even be data created by an individual user, like a document, that sits in this source system, and that’s consumed by something. It’s consumed by that modern data platform. 0:42:21.936 –> 0:42:51.766 Nathan Lasnoski And that modern data platform being something like Fabric, and then the movement into that modern data platform happening in an automated way, not just a one-time move, but something that establishes this lineage of trust: that this flow is ongoing, that it is updated, it’s respected, it’s coming from this source system. Because if I cut off the source system from the modern data platform and I only move it once, it’s only as good as the last time I moved it.
0:42:52.186 –> 0:43:8.26 Nathan Lasnoski Now I’m thinking about it as a continuous flow, which gets to this critical point that many companies don’t have yet, which is: you have a certified data set. And I don’t mean, like, spend the next two years building your mapped data platform, right? 0:43:8.36 –> 0:43:10.406 Nathan Lasnoski I’m talking here specifically about: 0:43:10.416 –> 0:43:24.306 Nathan Lasnoski I’ve got a use case, and I need to understand what data is necessary to certify in order for me to have trust that I’m executing that use case well, and have certified data that serves it, and that’s an agreed schema. 0:43:24.556 –> 0:43:31.246 Nathan Lasnoski It’s published for use by the consumers, and then I have an understanding of how it got there. Is that ELT? Is it ETL? 0:43:31.256 –> 0:43:33.386 Nathan Lasnoski Like, what process did it go through? 0:43:33.396 –> 0:43:46.436 Nathan Lasnoski What lineage existed for it to get from point A to point B? Which then moves me into this idea of who can access the data, which is potentially the actual rights that existed here on step one, and moving that into: 0:43:46.446 –> 0:43:47.736 Nathan Lasnoski How is it being used? 0:43:47.946 –> 0:43:50.476 Nathan Lasnoski Is it a data science use case where I’m truly just, 0:43:50.526 –> 0:43:53.946 Nathan Lasnoski I’m doing prediction and there’s no intermediary step? 0:43:54.906 –> 0:43:56.66 Nathan Lasnoski Is it a big data use case? 0:43:56.206 –> 0:44:8.676 Nathan Lasnoski Is it connectivity between people and it’s consumed, or is it dashboards, self-service, which then has its own scoped access? And then at the end of this pattern exists this overall flow of
0:44:8.686 –> 0:44:46.456 Nathan Lasnoski how I’m bringing that on an ongoing basis to my customers. And Brian, earlier you mentioned preparing the data being one of the more difficult challenges, like even just getting from source system to, you know, the platform itself being one of the more difficult parts of that picture. From your perspective, how does that relate to certification of data? How do I judge what part of that I’m actually certifying? Or, I guess, explain to me the relationship between that preparation period and 0:44:46.466 –> 0:44:47.746 Nathan Lasnoski the certification of the data. 0:44:50.96 –> 0:44:50.646 Brian Haydin Yeah. 0:44:50.656 –> 0:44:53.366 Brian Haydin So OK, let’s take, like, a new customer, 0:44:53.616 –> 0:45:3.46 Brian Haydin you know, as an example. So my sales team might consider a new customer to be somebody that hasn’t done business with us in the last 12 months. 0:45:3.586 –> 0:45:7.746 Brian Haydin But I have, like, another department. 0:45:7.756 –> 0:45:18.146 Brian Haydin Maybe it’s my compliance department, which considers that a customer that’s ever done business with us is not a new customer. It’s only a new customer if this is the first time. 0:45:18.596 –> 0:45:20.326 Brian Haydin So, like, who owns that data, 0:45:20.846 –> 0:45:43.676 Brian Haydin you know, to determine what defines the new customer? First, that’s a challenge, getting different people to just say, “OK, I own this,” and make the decision. And then obviously it becomes, like, preparing that data, you know, and then the certified data set is what’s actually delivered to the data consumers. 0:45:43.926 –> 0:45:48.196 Brian Haydin So you mentioned before, like, you know, there’s the business consumers. That seems to be pretty easy. 0:45:44.6 –> 0:45:44.266 Nathan Lasnoski Umm.
0:45:48.466 –> 0:45:58.966 Brian Haydin I’m gonna use my silver, gold kind of medallions, you know, probably for some of the visualizations, but the data scientists, they’re gonna want something a little bit more raw. 0:45:58.976 –> 0:46:1.126 Brian Haydin They’re probably gonna want, like, something in the bronze layer, 0:46:1.826 –> 0:46:10.396 Brian Haydin um, you know, in order to do their work, and bronze might actually have conflicts over what defines a new customer. 0:46:10.786 –> 0:46:23.546 Brian Haydin So, you know, that to me is where those delineations are. The certification is not only in terms of readiness, but also kind of geared towards who the consumer is. 0:46:25.826 –> 0:46:28.876 Nathan Lasnoski Huge point, especially, like, preparation of data, 0:46:28.886 –> 0:46:30.926 Nathan Lasnoski what it needs to go through for a person to use it, right? 0:46:29.946 –> 0:46:30.206 Brian Haydin Yep. 0:46:33.626 –> 0:46:33.806 Nathan Lasnoski So. 0:46:35.106 –> 0:46:49.556 Nathan Lasnoski Um, so what you’ll notice is I took that picture and I turned it on its side, and as I was preparing the slide, it was sort of like, “What are you doing with this slide?” It’s. 0:46:49.266 –> 0:46:49.746 Brian Haydin My quote. 0:46:51.716 –> 0:46:51.966 Nathan Lasnoski No. 0:46:51.976 –> 0:46:52.486 Nathan Lasnoski Makes sense. 0:46:52.766 –> 0:46:57.246 Nathan Lasnoski Uh, so it really looks like a verticalized stack there, right? 0:46:57.256 –> 0:47:9.586 Nathan Lasnoski You’re starting from something to get to that consumed asset for a domain, and maybe you even have to combine those together to accomplish an end game or outcome. 0:47:9.776 –> 0:47:16.146 Nathan Lasnoski But you’re turning it on its side and flowing up through that cycle for a consumer to be able to leverage it. 0:47:16.156 –> 0:47:18.566 Nathan Lasnoski And the question is, that’s a lot of work. 0:47:18.576 –> 0:47:19.106 Nathan Lasnoski Holy cow.
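Brian’s “new customer” example can be sketched as two competing definitions applied to the same raw (bronze) order history, with the certified (gold) output using whichever definition the data owner chose. This is a hypothetical illustration: the function names, the 365-day window standing in for “12 months”, and the sample orders are all assumptions.

```python
from datetime import date, timedelta


def prior_orders(orders, customer_id, as_of):
    """Bronze-level lookup: all order dates for a customer before as_of."""
    return [d for cid, d in orders if cid == customer_id and d < as_of]


def is_new_for_sales(orders, customer_id, as_of, window_days=365):
    """Sales definition: no business with us in the last 12 months."""
    cutoff = as_of - timedelta(days=window_days)
    return not any(d >= cutoff for d in prior_orders(orders, customer_id, as_of))


def is_new_for_compliance(orders, customer_id, as_of):
    """Compliance definition: new only if they have never done business with us."""
    return not prior_orders(orders, customer_id, as_of)


def certify(orders, customer_ids, as_of, definition):
    """Gold-level output: one agreed flag per customer, using the owner's chosen definition."""
    return {cid: definition(orders, cid, as_of) for cid in customer_ids}


# Raw order history as (customer_id, order_date) pairs.
orders = [("acme", date(2021, 3, 1)), ("globex", date(2024, 1, 15))]
today = date(2024, 6, 20)

print(certify(orders, ["acme", "globex", "initech"], today, is_new_for_sales))
print(certify(orders, ["acme", "globex", "initech"], today, is_new_for_compliance))
```

The two printed dictionaries disagree on the customer whose last order is several years old, which is exactly the conflict that someone has to own and resolve before the data set can be called certified.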
0:47:19.806 –> 0:47:30.26 Nathan Lasnoski But how do I trust this data if I don’t go through that pattern? Some of your data might be closer to that level of preparedness than others, but it’s the hard work that 0:47:30.36 –> 0:47:36.686 Nathan Lasnoski your team needs to go through to be able to trust the data to deliver something that really matters, and that’s really what we’re getting to here. 0:47:36.696 –> 0:47:47.356 Nathan Lasnoski Like, there’s some things in your business that just kind of matter, but there’s a lot of things in your business that really matter, and sometimes the most important things, the ones that create new revenue or operational savings, they really matter. 0:47:47.486 –> 0:47:58.6 Nathan Lasnoski It means that your data needs to matter, and the accuracy and precision of your data need to matter. It needs to go through a process that enables it to be as effective and useful as possible. 0:47:59.216 –> 0:48:8.706 Nathan Lasnoski So all of that needs some surrounding tooling, so for this last piece I’m just going to cover a couple of tools that I think would be useful for you to think about. 0:48:8.716 –> 0:48:15.526 Nathan Lasnoski And I want you to really ready your questions if you have any. Here at the end, I will have a few minutes, I think, to cover a few of these. 0:48:15.596 –> 0:48:21.886 Nathan Lasnoski So one of the tools that I would encourage you to look at as you’re examining leveraging things like Fabric is Microsoft Purview. 0:48:21.936 –> 0:48:28.596 Nathan Lasnoski I talked about that just recently. If you have an Office 365 environment, you probably already know that this is part of your picture. 0:48:29.306 –> 0:48:32.796 Nathan Lasnoski But if you have, like, a cloud data estate, you may or may not be using this. 0:48:33.106 –> 0:48:37.786 Nathan Lasnoski This is a tool set that, like, or even something roughly comparable to it,
0:48:37.796 –> 0:48:41.256 Nathan Lasnoski You need something that gives you the picture of the data estate. 0:48:41.526 –> 0:48:51.736 Nathan Lasnoski We’ve also referred to this as the picture of the elephant. You’ve heard the story: a bunch of blind people go up to an elephant, and it’s a rope, or it’s a tree trunk, or it’s a hose, or whatever. 0:48:51.746 –> 0:48:52.706 Nathan Lasnoski Well, it’s an elephant, right? 0:48:52.716 –> 0:48:54.216 Nathan Lasnoski But they all have different pieces of it. 0:48:54.506 –> 0:49:4.126 Nathan Lasnoski This sort of picture of the elephant is something that a platform like this is really oriented around. Its goal is: give me that picture. 0:49:4.636 –> 0:49:8.146 Nathan Lasnoski Also allow me to apply security to the data. 0:49:8.256 –> 0:49:9.446 Nathan Lasnoski Allow me to govern it. 0:49:9.456 –> 0:49:20.686 Nathan Lasnoski The data map is the picture, and then there’s understanding the risk and compliance ecosystem that surrounds it. Going into Purview is probably a conversation in and of itself. 0:49:20.696 –> 0:49:24.516 Nathan Lasnoski It’s its own sort of deep dive. Realize that it’s a stack 0:49:24.526 –> 0:49:31.486 Nathan Lasnoski that’s evolved substantially over the last several years, and it’s one that has become more and more integrated into Fabric itself. 0:49:31.496 –> 0:49:50.176 Nathan Lasnoski So, I mean, just talking Fabric for a second: if you’re a Power BI user, you’re already a Fabric user. Power BI is basically now an extension of Fabric, or Fabric is an extension of Power BI, and it’s essentially making every Power BI user a data user, a data preparedness user. 0:49:50.386 –> 0:49:57.416 Nathan Lasnoski So a lot of the tooling from Purview is getting integrated right within that estate, like lineage and certified data sets.
0:49:57.426 –> 0:50:16.676 Nathan Lasnoski So the idea of endorsing a set of data, that data being certified by a person as the trusted set of data that’s used by a set of business consumers who are producing Power BI reports, is super useful. One of the most challenging things for self-service is: I see four versions of this same table. 0:50:16.686 –> 0:50:19.236 Nathan Lasnoski Like, which one is the one I should actually consume from? 0:50:19.526 –> 0:50:32.176 Nathan Lasnoski Noting that this is certified, noting that it exists at a certain medallion level, lets people know what to choose and what to use, especially in scaled environments. Sometimes in smaller organizations it’s like, I only have one report writer. 0:50:32.186 –> 0:50:34.76 Nathan Lasnoski It’s all in it, like, OK, I get it. 0:50:34.706 –> 0:50:35.436 Nathan Lasnoski In an organization 0:50:35.446 –> 0:50:37.556 Nathan Lasnoski that’s much more scaled out than that, 0:50:37.706 –> 0:50:40.456 Nathan Lasnoski you don’t have time to have, like, one person building all your reports. 0:50:40.466 –> 0:50:43.416 Nathan Lasnoski You’re scaling this out across the organization, and that’s truly the idea. 0:50:43.426 –> 0:50:45.516 Nathan Lasnoski Like, how do I democratize data access? 0:50:46.196 –> 0:50:56.566 Nathan Lasnoski So you democratize data access by saying: build what you need, but do it on the certified data set. And that is based on this idea of lineage. 0:50:56.626 –> 0:50:59.356 Nathan Lasnoski The lineage being, like, where did you come from? 0:50:59.506 –> 0:51:0.616 Nathan Lasnoski We all have lineage. 0:51:0.626 –> 0:51:2.656 Nathan Lasnoski From a personal standpoint, like. 0:51:3.176 –> 0:51:3.866 Nathan Lasnoski I. 0:51:4.56 –> 0:51:6.86 Nathan Lasnoski But there’s data lineage too, right? 0:51:6.96 –> 0:51:7.386 Nathan Lasnoski It came from this source. 0:51:7.476 –> 0:51:8.706 Nathan Lasnoski It flowed through these steps.
0:51:8.716 –> 0:51:9.906 Nathan Lasnoski It got to this point. 0:51:10.96 –> 0:51:15.646 Nathan Lasnoski One of the most common things in companies that don’t have data lineage, and are back at that maturity level 0:51:15.656 –> 0:51:18.126 Nathan Lasnoski one, is: I don’t trust your data. 0:51:18.136 –> 0:51:19.546 Nathan Lasnoski Is this really the truth? 0:51:19.556 –> 0:51:21.176 Nathan Lasnoski Where did this come from? 0:51:21.186 –> 0:51:23.206 Nathan Lasnoski What did you do to get it to look like this? 0:51:23.456 –> 0:51:30.986 Nathan Lasnoski And there’s sort of a lack of agreement on whether this is truly representative of the truth of the environment. 0:51:31.136 –> 0:51:35.306 Nathan Lasnoski Data lineage helps us to establish that truth by knowing: 0:51:35.456 –> 0:51:36.426 Nathan Lasnoski Here’s the source. 0:51:36.436 –> 0:51:37.566 Nathan Lasnoski Here’s how it got here. 0:51:37.576 –> 0:51:38.896 Nathan Lasnoski Here’s how I certified it. 0:51:38.976 –> 0:51:43.86 Nathan Lasnoski Here are the people accountable for signing off that the data is accurate. 0:51:43.196 –> 0:51:48.616 Nathan Lasnoski And then aligning individuals within the business to be able to provide that 0:51:48.626 –> 0:51:51.426 Nathan Lasnoski attestation of the data itself. 0:51:51.506 –> 0:51:59.276 Nathan Lasnoski So lineage is one of those nice things, and the more it’s just integrated into what I do, as opposed to being this extra step, the better; that’s super important. 0:52:0.116 –> 0:52:3.776 Nathan Lasnoski So OK, we’ve got about 8 minutes left. 0:52:4.6 –> 0:52:6.826 Nathan Lasnoski We’d love to answer any additional questions that you have. 0:52:6.836 –> 0:52:8.596 Nathan Lasnoski Let’s kind of bounce over to the chat here. 0:52:8.606 –> 0:52:9.956 Nathan Lasnoski I think, OK. Yes. 0:52:9.966 –> 0:52:10.956 Nathan Lasnoski Oh, Amy, thank you.
0:52:11.216 –> 0:52:19.136 Nathan Lasnoski I’m gonna cover this while you’re preparing any questions: we do have some things you can take as next steps, so ways that you can work with us. 0:52:19.346 –> 0:52:34.746 Nathan Lasnoski We would love to do a free data governance assessment to take a look at how you’re thinking about your data estate and position you to be able to take steps forward. That may be that you already understand your estate, but you’re not sure how to secure and govern it. 0:52:34.756 –> 0:52:39.786 Nathan Lasnoski Maybe you’re not even sure how you’re gonna use your data in the future, and maybe you need to back up and think about the end in mind. 0:52:40.36 –> 0:52:48.746 Nathan Lasnoski So ways that we can help with that are the data governance assessment, and we can also step into that end in mind with our AI and Copilot envisioning sessions. 0:52:48.976 –> 0:52:52.866 Nathan Lasnoski We’ve run over 50 of these with executive teams in the last six months. 0:52:53.76 –> 0:53:6.86 Nathan Lasnoski This is us helping you to think about: how does the mission of my business become augmented by the availability of artificial intelligence and data, to help me scale and create more revenue and operational savings within the organization? 0:53:7.576 –> 0:53:10.26 Nathan Lasnoski That’s something that we invest in for you. 0:53:10.76 –> 0:53:20.26 Nathan Lasnoski So these are all things that we invest in to help create an effective partnership, but even more so to help you take advantage of data to accomplish powerful outcomes within your organization. 0:53:20.36 –> 0:53:25.896 Nathan Lasnoski So as you close down, fill out the survey. A, we want to know if this is useful to you. 0:53:25.986 –> 0:53:26.186 Nathan Lasnoski B, 0:53:26.196 –> 0:53:27.616 Nathan Lasnoski we’d love to work with you again. 0:53:28.156 –> 0:53:30.56 Nathan Lasnoski Indicate what those things could be.
0:53:30.146 –> 0:53:33.276 Nathan Lasnoski If you have suggestions as to how we could do this better, give us those suggestions. 0:53:33.286 –> 0:53:37.26 Nathan Lasnoski We want to get your feedback. Or if you loved it, tell us what you loved about it 0:53:37.36 –> 0:53:37.946 Nathan Lasnoski so we can do more of it. 0:53:38.416 –> 0:53:42.166 Nathan Lasnoski And with that, we would love to take any outstanding questions you have. 0:53:42.226 –> 0:53:43.536 Nathan Lasnoski So thank you. 0:53:44.66 –> 0:53:45.506 Nathan Lasnoski Go ahead and drop any questions you have. 0:53:45.516 –> 0:53:48.286 Nathan Lasnoski We’ll stick around and hang out. 0:53:57.136 –> 0:53:59.46 Amy Cousland I saw a question about the slides. 0:53:59.56 –> 0:53:59.846 Amy Cousland They will be available. 0:54:0.526 –> 0:54:1.196 Nathan Lasnoski Thanks, Amy. 0:54:1.266 –> 0:54:4.16 Nathan Lasnoski Yep, we’ll make sure that those are available to everyone. 0:54:4.56 –> 0:54:5.526 Nathan Lasnoski Make sure to reach out if you want those slides. 0:54:11.776 –> 0:54:14.416 Nathan Lasnoski Awesome. Everyone, have a wonderful afternoon. 0:54:14.496 –> 0:54:15.646 Nathan Lasnoski Thank you for your time. 0:54:16.96 –> 0:54:19.986 Nathan Lasnoski We really enjoyed presenting this to you; it’s a great topic for us to dig into. 0:54:20.616 –> 0:54:22.526 Nathan Lasnoski Please reach out. Follow up. 0:54:22.596 –> 0:54:24.726 Nathan Lasnoski We’d love to have more conversations. Have a great day.
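The lineage and certification ideas described in this session (here’s the source, here’s how it got here, here’s who certified it, and which medallion layer it sits in) can be sketched as a small data structure. This is only an illustrative sketch under stated assumptions, not how Microsoft Purview or Fabric actually represent lineage or endorsement: every name in it (`LineageStep`, `DatasetRecord`, `pick_trusted`, the `crm_export` source, the step descriptions) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LineageStep:
    """One transformation the data flowed through (hypothetical shape)."""
    description: str
    layer: str  # medallion layer after this step: "bronze", "silver", or "gold"


@dataclass
class DatasetRecord:
    """A data set plus the lineage facts needed to decide whether to trust it."""
    name: str
    source: str                                   # where the data came from
    steps: List[LineageStep] = field(default_factory=list)
    certified_by: Optional[str] = None            # who is accountable for signing off

    @property
    def layer(self) -> str:
        # Data with no recorded steps is treated as raw (bronze) in this sketch.
        return self.steps[-1].layer if self.steps else "bronze"

    @property
    def is_certified(self) -> bool:
        return self.certified_by is not None

    def describe(self) -> str:
        path = " -> ".join([self.source] + [s.description for s in self.steps])
        status = f"certified by {self.certified_by}" if self.is_certified else "not certified"
        return f"{self.name} [{self.layer}, {status}]: {path}"


def pick_trusted(candidates: List[DatasetRecord]) -> List[DatasetRecord]:
    """Keep only certified versions, so report builders know which table to use."""
    return [d for d in candidates if d.is_certified]


# Two versions of the "same" table: a raw copy and a curated, certified one.
raw = DatasetRecord("customers_raw", source="crm_export")
curated = DatasetRecord(
    "customers",
    source="crm_export",
    steps=[
        LineageStep("deduplicate", "silver"),
        LineageStep("apply new-customer definition", "gold"),
    ],
    certified_by="data steward",
)

for dataset in pick_trusted([raw, curated]):
    print(dataset.describe())
```

The design choice here mirrors the point made in the talk: the raw bronze copy stays available for consumers such as data scientists, while self-service report builders are steered to the certified version whose source, steps, and accountable person are all recorded.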