Azure cost optimization that actually sticks—start with tagging, policy, and RI

Are you aware that a significant portion of cloud environments is often overlooked, leading to unnecessary costs and security vulnerabilities? It’s time to address this issue and optimize your Azure environment for better efficiency and security!

We’re excited to invite you to our upcoming webinar, where we’ll delve into the critical aspects of ‘spring cleaning’ your Azure environment. Discover how you can identify and eliminate waste, enhance FinOps practices, and establish robust governance guardrails to support both FinOps and DevSecOps patterns.

Key topics we’ll cover include:

Understanding the scope and impact of waste in cloud environments (20-40%)
Strategies for identifying and eliminating unnecessary resources
Best practices for establishing governance guardrails in Azure
Leveraging FinOps and DevSecOps principles for enhanced efficiency and security

Whether you’re a cloud architect, DevOps engineer, or IT manager, this webinar will provide valuable insights and practical tips to optimize your Azure environment effectively.

If Azure costs are creeping up, start with the fundamentals: consistent tagging, budgets, and reserved instances. In this webinar, Concurrency’s Microsoft MVPs and cloud architects show you how to spring-clean your Azure environment—replace legacy tools, design micro-segmented networks, enforce MFA/PIM, and build landing zones that scale. Serving teams in Chicago, Milwaukee, and Minneapolis, we share practical steps to reduce spend and strengthen governance—plus recovery and redeployability patterns for IaaS and PaaS. Watch now and download the transcript to execute with confidence.

WHAT YOU’LL LEARN

In this webinar, you’ll learn:

A proven tagging strategy to unlock accurate cost analysis and budgets.
How to right-size and shift VMs to reserved instances for fast savings.
Why you should replace legacy backup, monitoring, and patching tools with native Azure services.
How to design verticalized, micro-segmented networks using NSGs.
Practical recovery patterns for PaaS services and IaC redeployability.

FREQUENTLY ASKED QUESTIONS

How do we start Azure cost optimization if tagging is inconsistent?

Begin by defining 6–8 required tags (for example owner, app, environment, cost center, and classification). Enforce them via Azure Policy so new resources cannot deploy without tags. Then backfill tags on existing resources and create budgets by environment (prod/non-prod) to target high spenders.
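
As a rough illustration of the first two steps, the sketch below checks a resource's tags against a required set and builds a deny rule in the shape Azure Policy uses for "require a tag" definitions. The tag names and the helper functions are assumptions made for this example, not a standard published by the speakers.

```python
# Hypothetical required-tag set, based on the examples named in the answer above.
REQUIRED_TAGS = ["owner", "app", "environment", "costCenter", "classification"]


def missing_tags(resource_tags, required=REQUIRED_TAGS):
    """Return the required tags that are absent or blank on a resource."""
    tags = resource_tags or {}
    return [name for name in required if not str(tags.get(name, "")).strip()]


def deny_if_tag_missing_rule(tag_name):
    """Build a policy rule body that denies deployment when tag_name is missing.

    This is only the "policyRule" part of a definition; assigning it to a
    scope is a separate step that is not shown here.
    """
    return {
        "if": {"field": f"tags['{tag_name}']", "exists": "false"},
        "then": {"effect": "deny"},
    }


if __name__ == "__main__":
    vm_tags = {"owner": "team-a", "app": "billing", "environment": " "}
    print(missing_tags(vm_tags))
    print(deny_if_tag_missing_rule("owner"))
```

The same missing_tags check can drive the backfill: run it over an export of existing resources and work through whatever it reports.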

What’s the quickest way to reduce Azure VM costs—RI or right-sizing?

Do both in sequence: right-size to common SKUs, then purchase reserved instances across those SKUs. In the webinar, this combination is cited as the largest single-action saving during cleanups, especially when moving pay-as-you-go VMs to RIs.
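
To see why the order matters, here is a small worked calculation. Every number in it (hourly rates, the 40% reservation discount, the VM count) is a made-up placeholder for illustration only; real rates and discounts depend on SKU, region, and term.

```python
HOURS_PER_MONTH = 730  # common approximation used in monthly cost estimates


def monthly_cost(hourly_rate, vm_count, hours=HOURS_PER_MONTH):
    """Pay-as-you-go cost for vm_count identical VMs running all month."""
    return hourly_rate * vm_count * hours


def rightsize_then_reserve(current_rate, rightsized_rate, ri_discount, vm_count):
    """Compare three states: as-is, right-sized, and right-sized plus reserved."""
    as_is = monthly_cost(current_rate, vm_count)
    rightsized = monthly_cost(rightsized_rate, vm_count)
    reserved = rightsized * (1 - ri_discount)
    return {
        "as_is": round(as_is, 2),
        "rightsized": round(rightsized, 2),
        "rightsized_reserved": round(reserved, 2),
        "total_saving": round(as_is - reserved, 2),
    }


if __name__ == "__main__":
    # Hypothetical inputs: 10 VMs, $0.40/h today, $0.20/h after right-sizing,
    # and an assumed 40% discount for reserving the smaller SKU.
    print(rightsize_then_reserve(0.40, 0.20, 0.40, 10))
```

Reserving before right-sizing would lock in the discount on the larger, more expensive SKU, which is why the answer puts right-sizing first.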

How should we design micro-segmented networks for legacy apps in Azure?

Move from flat, lateral networks to verticalized application segments. Place components per app in dedicated resource groups/subscriptions, control traffic with NSGs, and require access through governed layers. This limits lateral movement and aligns with cloud-consistent security.
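
The sketch below models how a network security group decides whether traffic is allowed: rules are checked in priority order (lowest number first) and the first match wins. It is a simplified model, not the Azure SDK; the Rule type, the subnets, and the convention that port 0 means "any port" are assumptions for this example. Azure's default rules allow traffic inside a virtual network, so the explicit low-priority deny is what actually separates one application from another.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    priority: int  # lower number is evaluated first
    source: str    # source address prefix in CIDR form
    port: int      # destination port; 0 means any port (convention of this sketch)
    access: str    # "Allow" or "Deny"


def evaluate(rules, source_ip, port):
    """Return the access decision of the first rule that matches, by priority."""
    for rule in sorted(rules, key=lambda r: r.priority):
        in_prefix = ip_address(source_ip) in ip_network(rule.source)
        if in_prefix and rule.port in (0, port):
            return rule.access
    return "Deny"  # nothing matched: treat as denied in this simplified model


# Hypothetical inbound rules for application 1's database subnet.
APP1_DB_RULES = [
    Rule(priority=100, source="10.1.1.0/24", port=1433, access="Allow"),  # app 1 web tier
    Rule(priority=4000, source="10.0.0.0/8", port=0, access="Deny"),      # everything else internal
]

if __name__ == "__main__":
    print(evaluate(APP1_DB_RULES, "10.1.1.5", 1433))  # app 1 web tier to its own database
    print(evaluate(APP1_DB_RULES, "10.2.0.9", 1433))  # app 2 trying the same port
```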

What’s the recommended recovery approach for Azure PaaS services?

Use Azure Backup where covered and pair it with deletion protection. For PaaS, emphasize redeployability via infrastructure as code (Bicep/Terraform) so services, networks, and containers can be rebuilt quickly, while data restores cover state.
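
One way to make this auditable is to keep a simple inventory and flag recovery gaps per resource. The sketch below assumes a hand-built inventory with hypothetical field names (backup_supported, backup_enabled, backup_delete_protected, iac_managed); it does not call any Azure API.

```python
def recovery_gaps(resources):
    """Map each resource name to the recovery gaps found for it."""
    gaps = {}
    for res in resources:
        issues = []
        if res.get("backup_supported") and not res.get("backup_enabled"):
            issues.append("backup not enabled")
        if res.get("backup_enabled") and not res.get("backup_delete_protected"):
            issues.append("backups can be deleted")
        if not res.get("iac_managed"):
            issues.append("not redeployable from code")
        if issues:
            gaps[res["name"]] = issues
    return gaps


# Hypothetical inventory entries for illustration.
INVENTORY = [
    {"name": "vm-erp-01", "backup_supported": True, "backup_enabled": True,
     "backup_delete_protected": True, "iac_managed": True},
    {"name": "sql-orders", "backup_supported": True, "backup_enabled": False,
     "iac_managed": True},
    {"name": "func-pricing", "backup_supported": False, "iac_managed": False},
]

if __name__ == "__main__":
    for name, issues in recovery_gaps(INVENTORY).items():
        print(name, "->", ", ".join(issues))
```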

How many subscriptions should a landing zone strategy use?

Subscriptions aren’t the enemy—lack of structure is. Follow the Cloud Adoption Framework pattern: separate identity, connectivity, and landing zone subscriptions under management groups; vend subscriptions per workload domain to scale governance cleanly.

ABOUT THE SPEAKERS

Nathan Lasnoski is Concurrency’s Chief Technology Officer and a long-time Microsoft MVP in Azure. Will Parker is a Technical Architect specializing in cloud and DevOps, with prior cybersecurity experience. Together, they guide organizations through cost optimization, governance, and landing zone design to ensure Azure environments are secure, efficient, and ready to scale.

EVENT TRANSCRIPT


Concurrency 0:11 OK, everyone, welcome to Spring Cleaning Your Azure Environment. This came about as a result of me doing some spring cleaning in my house, and I thought, wow, there’s one thing that many people don’t do regularly enough: take a look at the organization of their Azure environment. Just like we move those boxes from the old house to the new house, they’ve been sitting down there for the last 10 years, and you finally get that dumpster and chuck everything in it, this is your opportunity to have that same conversation around your Azure environment. So we’re really happy to be here today with you. We have two really nice opportunities here. We’re gonna break down not only how to think about spring cleaning your environment; we’re also going to give you some tips about things to look for, going into different details inside of each domain. So I’ll introduce myself. My name is Nathan Lasnoski. I’m Concurrency’s Chief Technology Officer. I’ve been with Concurrency for 22 years (man, that sounds like a while) and a Microsoft MVP for 14, almost 15, now, and my Microsoft MVP domain is Azure. So this is right up my alley. I’m really excited to talk about this today, and this is actually going back to some stuff that we presented at Microsoft Ignite recently. So I’m looking forward to sharing a lot of that content, and we’ve updated it a little bit to keep this fresh. So OK, I also want to introduce Will Parker. Will, you wanna just talk a little bit about yourself and your background?

Will Parker 1:44 Yeah, absolutely. Hi everyone. I’m Will Parker, technical architect here at Concurrency, primarily in the cloud and DevOps space. Been working in Azure for about 8 years now. Prior to that, I was pretty fully into Department of Defense cybersecurity work, and had a lot of experience with, you know, a lot of these types of things.
We’re gonna talk about cost optimization and those sorts of things that help once you have your Azure environment: how do we really tighten up the loose ends and make everything work the way it should? So, excited to be here. Happy to talk about this. Thanks, Nathan.

Concurrency 2:21 Awesome. Cool. Alright, so for our audience, there is a chat feature and a Q&A feature. We would love for you to use them liberally. We love answering nuanced questions as we’re having these conversations, so don’t be shy. Make sure you put your questions in the chat as we go along. You’ll notice that some of the slides will go by relatively quickly, and that’s simply because we have way too much content for an hour. So you’ll see some things that are gonna be very detailed and we’ll just kind of flash them up. You’ll get the deck afterward, and we would also love to talk with you after this webinar is done to help you make this real. OK. So what we’re going to start this conversation with is this idea of a cloud maturity curve. I think it’s important to understand where you’re at in order to be able to improve it. And rather than just going into the 10 different domains that we’re going to talk about today, I want you to think about your environment as a whole and how it fits into some of these buckets. So I presume a lot of you who are here are people who have workloads in Azure already (that’s why we’re talking about spring cleaning) and have a need to optimize that environment. And the question it sometimes gets to is: did I ever optimize in the first place? Sometimes we move into the new house and we never really moved in.
Well, like, stuff just kind of ended up where it ended up; my basement is just as messed up as it was on the day that I moved in, and I never really built the right operational capabilities around that environment. Or I did, and the operational part of that sort of broke down. Maybe I built it the first time but never really took care of it, and now I’m in the process of going back around and looking at it. So as we’re talking about this today, I want you to just think about it: are you an organization that is just starting, with scattered cloud across your organization, where your Azure environment is operating in a way that never really got the governance capabilities applied to it? Or did you get to this position where you put those governance capabilities in place, but you know that, yeah, it just hasn’t been perfect? We haven’t gotten the right optimization out of it. Maybe my cost is still out of whack, or I put it in place and it just lost steam; I never really followed through on the swing of operationalizing that environment. And by clearing both of those blockers, in a sense returning to what has to happen here, this is what allows us to do these things higher on the maturity curve: to be an effective, streamlined cloud product development organization, to be an innovation differentiator to our business in a way that is efficient, in a way that’s secure and has safety applied to it, and in a way that allows us to know that we’re getting the most of our investment. This is the conversation to have at that spot: are we getting the most of our investment? Are we doing it right? So there’s a couple of different domains that might be present in your Azure environment today. If you moved from a legacy on-premises data center, you probably are operating some workloads in Horizon 2, which is this idea that I have virtual machines. They are Windows Active Directory joined.
They are connected to my corpnet and they’re operating in my Azure environment. I may be using a diversity of tools with them. I might be backing them up or managing them with things that are third-party or older products, or maybe I didn’t really implement the right controls when I moved to Horizon 2, so maybe you have some security gaps, or we haven’t implemented the resource governance, classification, or tagging properly. So this state is one that many companies find themselves in because they moved into the cloud, but they didn’t necessarily take all the right steps as they moved to Horizon 2. But this conversation isn’t only about Horizon 2. It’s also about Horizon 3. So if your organization is building applications that are cloud native, you’re doing data workloads, AI workloads, and these are workloads that are not things you migrated from on-premises. Maybe you have a team that you don’t even own building applications in the cloud, outside of IT, and it’s a little bit of chaos out there, and we need to put some guardrails around that environment and recapture the cat that got out of the bag. So our conversation today is inclusive of both the Horizon 2 state and the Horizon 3 state, in the context of building an Azure environment that operates well. So I’m going to build this out, and if you want to see it all in one shot, these are our top ten domains that we’re going to be covering in this session today. I’m gonna leave this slide very quickly, but if you wanna just have one screenshot that captures everything, here it all is at once. But now what we’re going to do is go into each individual domain. So if I was spring cleaning an Azure environment, not surprisingly, the first thing I’m going to do is ensure I have a methodology for cost reviews and organizational discipline surrounding it.
This is the number one thing that organizations don’t do very well, and it’s simply that they don’t know how to optimize Azure costs. They aren’t applying resource limits. They aren’t doing reserved instances. They are selecting the wrong sizes of resources. There’s so many opportunities that exist within this particular domain. But the most important thing to be able to do that is to be able to analyze those costs. So tagging, applying tagging across the environment: this is something that goes back to how you define your policy, but you need to be able to judge every one of your resources upon some of these contexts. So who’s the owner, person or group? The business unit it’s associated with, or functional domain. The application it’s supporting, the technology, as well as dev, prod, or QA, because that then allows us to make intelligent choices around those costs. You have a variety of options to do that slicing and dicing, but Azure Cost Management does a pretty darn good job of allowing us to look at those costs. But if you can’t look at the cost in the context of these domains, then this is sort of job one, because the number one thing you find out if you haven’t put these constraints down is that it’s really difficult to optimize Azure costs if you don’t know who owns them or what they do. So it’s difficult to then say, well, you know, this is under-consumed, but who do I talk to about it? What is it doing, and do I need this level of resourcing? It’s hard to have that conversation if you’re not the one who’s directly working with that software or that platform. So having these constraints and context around it is really important. In doing so, you then gain the ability to start slicing it based upon different controls. So resource group name or the service name, but even more so, notice here an example.
Can you delineate your environments between production and nonproduction? That would be a very concrete example of where a tag would be really important. It also might be how you then say, well, this is a subscription for non-prod, this is a subscription for prod. And then of course there are licensing benefits associated with non-prod environments that you can start to apply. So for example Cosmos database: any subscription can have a non-prod environment for free. So you might be paying for Cosmos databases that are in your production subscription that haven’t been applied to the free license, and you just continue to run up the register. So these are opportunities where we always need to make sure: if I can take advantage of free licenses or licenses available in my different non-prod environments, or turning them down, huge benefits. Also, being able to slice this based upon business unit. You’ll just notice that, like, hey, this is just tagging. Yeah, it’s just tagging, but that then applies to the ability to make effective decisions. Also, setting budgets associated with the environment and then controlling those costs by targeting high spenders. Having intentional conversations around overprovisioning, around planning cost models around the environment to rationalize, and then making intentional choices: how many reserved instances do I have? How am I applying them, and where can I turn down resources, or have overnight resources that get shut off rather than being left on? Now, everybody talks about these things. What I’m calling you to do is to do them, because this is the number one way that you save money in your Azure environment: just by applying these kinds of controls.
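
The slicing and budgeting described above can be sketched in a few lines. This is an illustrative sketch only: the cost records, tag names, and budget figures are hypothetical, and a real export from Azure Cost Management would have a different shape.

```python
from collections import defaultdict


def cost_by_tag(records, tag):
    """Sum cost per value of one tag; resources without it land in 'untagged'."""
    totals = defaultdict(float)
    for rec in records:
        key = (rec.get("tags") or {}).get(tag, "untagged")
        totals[key] += rec["cost"]
    return dict(totals)


def over_budget(totals, budgets):
    """Return the groups whose spend exceeds their budget (no budget counts as 0)."""
    return sorted(group for group, spend in totals.items()
                  if spend > budgets.get(group, 0.0))


# Hypothetical monthly cost records.
RECORDS = [
    {"name": "vm-erp-01", "cost": 500.0, "tags": {"environment": "prod"}},
    {"name": "sql-orders", "cost": 300.0, "tags": {"environment": "prod"}},
    {"name": "vm-test-07", "cost": 120.0, "tags": {"environment": "nonprod"}},
    {"name": "disk-old", "cost": 40.0, "tags": {}},
]

if __name__ == "__main__":
    totals = cost_by_tag(RECORDS, "environment")
    print(totals)
    print(over_budget(totals, {"prod": 1000.0, "nonprod": 100.0}))
```

Untagged spend has no budget, so it is always flagged, which matches the point that cost nobody owns is the hardest to optimize.
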
Will Parker 11:20 On that note, Nathan, definitely with these tags, something that I see happen a lot in these engagements with different clients is, you know, we propose a set of tags, and what you showed is usually a good balance: having too many is hard to manage, and having too few you don’t get enough granularity. And it’s important to set those standards as soon as possible, because as the environment grows it becomes harder and harder to go back and tag these things, and things like that.

Concurrency 11:20 So go ahead, man.

Will Parker 11:49 But ultimately, also controlling those things via policy, because what ends up happening is you’ll have an instance where maybe an organization has 600 virtual machines, and then it looks like trying to find a needle in a haystack just to find something that belongs to a specific application, depending on how costs are controlled. So, making sure those things are enforced by policy. What also ends up happening is, you know, someone needs to make a new resource in Azure, a virtual machine, something to that effect, and it’s like, I don’t know the business owner, I don’t know this. So have some standard documentation just to have that information readily available, so it doesn’t become a nuisance to the entire organization as well. It really is, when prepared properly, a huge force multiplier in terms of understanding costs and controlling them and keeping them down, which is one of the number one concerns of every organization in terms of cloud in general: just the sheer cost.

Concurrency 12:49 Totally.
So building on that is this idea of optimization, which is applying reserved instances, which is applying automatic downsizing, and essentially shifting the costs associated with any of these resources to the actual business unit that is spending that cost. So if there was a, like, top-quartile delineation on cost management, the most effective organizations are taking the P&L cost of that resource and shifting it to the business area that is spending on that resource, and then making them accountable (not necessarily responsible, but accountable) for the actual costs, and then giving them a reason why they want that optimized, right? So if it’s just sitting in IT, it’s always difficult to get the business to care about that cost. If you start to shift that across the organization, it becomes a very real and practical part of what it means to be optimized. So that’s something that you’ll want to talk about as well. So this is an example of a slide that we’re going to go through, but Will, if you want to just kind of hit on a couple of these that you think are really meaningful before we move on, I think that would be awesome.

Will Parker 14:11 Absolutely. Two things to point out here, definitely, are right-sizing and reserved instances. It’s a pain, especially if there’s a lot of workloads in the environment already, trying to kind of say, OK, what common SKUs can we agree on so we can actually purchase some of these reserved instances? It is worth the upfront effort, because 10 times out of 10, every time we go into some environment, the number one biggest low-hanging fruit, the one that’s going to be the largest cost saving in a single action, is to move VMs from a pay-as-you-go model to a reserved instance model. That is where the majority of savings are to be had.
In addition to that, things like Azure container services being more cost effective than AKS. A lot of times Kubernetes has been a big, kind of, shiny object for a lot of organizations in the past few years, and while it is a great target to get to, it is a tool with a specific purpose. If your environment isn’t necessarily set up for running many, many microservices at once that require complex forms of ingress and egress, it may be worth using something as simple as Azure Container Instances or something like that, just to reduce the overhead of all these managed services that AKS provides you that your organization may not need at the time. It’s always possible to move Docker images and things like that to AKS and deploy them in that fashion. But if you don’t need the actual scaling capabilities of Kubernetes or the interconnectedness of Kubernetes, it may be worth looking at some of these similar options that could provide significant cost savings.

Concurrency 15:48 Awesome. Thank you. I’d say if I was gonna add one more, it would be: actually make cost optimization a core activity of one of your team members. That’s another big reason why this never gets done: people are really busy and they don’t apply the necessary time to make it happen. So they end up getting stuck and it just never really becomes a priority. Make it a priority for a person and it will pay off. They will be able to find that cost optimization. OK. So, number two domain. Man, we are going to have to move faster, because we made it through one and it’s 11:17. All right. So number two: relieve dependency on legacy tools and approaches. So if I was going to optimize an environment, one of the things we find that adds so much complexity is when companies pull legacy tooling over to Azure because it’s what they used before. So, their legacy backup tools.
Their load balancers. Their, you know, any abstraction platform that sits in front of native tools that deploy things. The monitoring platforms that you might be using, the patching platforms you might be using: there is a tremendous opportunity to move to what’s native in Azure to drive cost optimization of your licensing costs supporting the environment, but even more so just simplification of the ecosystem surrounding the environment. Now, I know backup is probably the big question mark. There’s definitely some reasons why people choose things like Rubrik in their environments, but on the flip side, those aren’t configured via infrastructure as code, and that leads to some other challenges later. So legacy tooling for legacy tooling’s sake, it’s the Bad News Bears approach. Put yourself in a position where you can leverage what’s right in the platform. A lot of times, load balancers and firewalls are another good example, where you’re paying an extra license fee just to run something that worked on-premises when you could be using the native Azure tooling that does the same job for a lot less cost. So think about these in the context of where you can optimize the cost associated with using those tools. Note that there is a diverse set of capabilities that apply to most of the operational activities that you would be performing in your Azure environment, inclusive of patching. So if you’re still using Configuration Manager to patch your servers that exist in Azure, there’s an opportunity to move to things like Azure patch management, and by inclusion all the rest of the tooling. And then, note: get good at using the Resource Graph. The Resource Graph is a vehicle to understand what’s deployed within the environment and report that back out, because it is the live environment.
You’re not having to ask questions of some spreadsheet you have; you can ask the environment itself to understand what’s deployed at any given point. OK, number three: clean up the mess in the basement and keep it clean. So this is kind of a general domain, but I think it’s important to say, here’s the things that we find in Azure environments that are just done poorly and need to get addressed. Check for public endpoints. Validate access to storage accounts. If a resource isn’t tagged or isn’t used, shut it down; get to a point where untagged things aren’t allowed in your environment. Log Analytics is a huge cost sink: am I actually using that effectively, or is that something that I need to move over to something else? Run through Azure Advisor or Defender to do an analysis of the environment. All of these end up being significant opportunities to not just optimize the environment but put it in a better position. I’d say if you do nothing else, just do this as a starting point. Will, what’s your opinion? What sticks out to you?

Will Parker 19:40 Primarily the public endpoints, just because in every single major compromise you see, especially for cloud or hybrid based organizations, oftentimes it is something like an IaaS-based service or PaaS-based service that wasn’t fully understood and had a public endpoint or something like that. And that’s not a reason to be afraid of public endpoints, but to understand where they are and what access people have to those. As well, the subscriptions one is another piece, because it’s easy for old leftover resources to be left out. I don’t think I’ve seen an organization that doesn’t have a management group full of retired subscriptions or something to that effect, something to be taken care of later and things like that. And it’s good to just clean those things up.
So you don’t get rogue resources being spun up, and you don’t have some service costing you quite a bit of money every month that kind of gets passed by. So those would be my choices.

Concurrency 20:28 OK, number four: adapt security controls to a cloud-consistent world. So this might take a little explanation. The number one way that organizations are compromised is via lateral movement. So what do we mean by that? It means a person who is trying to compromise your environment will take advantage of an end-user computing workstation, typically, and then use that vehicle to move laterally inside the environment to other workstations until they find something that gives them access to a server environment that lets them move between server environment elements. The problem is that we’ve made this relatively easy inside of our environments, because there’s very little protection, because everything kind of talks to everything, and it’s been really difficult to implement micro network segmentation in our legacy infrastructures. The important change that needs to be implemented when you move to an Azure infrastructure, and this is particularly important for Horizon 2 because it’s the most unnatural to people, is to implement a verticalized network design where applications are truly segmented from each other and then possess a security layer that governs what talks in and what talks out. So on the left you can see a legacy infrastructure where most things talk to most things, and on the right you can see a more modern Horizon 2 architecture that represents applications that are verticalized. And a little bit better look at that, I think, is this one, where you have application one, two, and three.
Each of these applications contains resources inside of the application, but in order to talk to each resource, they have to traverse the network security groups to have that interaction, not only between resources but also with the end-user computing devices. This is probably the number one miss that I see organizations make as they’re moving legacy resources to the cloud. So if I’m moving server resources from my legacy environment into Azure, it’s super important that this is the opportunity to apply that resource governance via the network security groups. If you haven’t even implemented things like network security groups, or you haven’t delineated applications by resource group or, even more so, by subscription, you’re missing that opportunity. So this is the cleanup that really needs to happen if your resources already exist in this state, or it needs to be a new definition of what’s important to be deployed if you are building a new environment. And then, Will, I’ll pause there for a second. Any additional comments on that?

Will Parker 23:28 No, it’s just a really good point that, you know, this is one of those areas where there’s a fundamental difference between on-prem and in-cloud networking, where before the firewall was our edge and now identity is our edge. And oftentimes you’ll have your spoke networks or something like that that have their own means of ingress and things like that, especially for cloud native applications, and that’s an OK pattern to adopt. But it does feel incredibly unnatural. So it is worth looking at, especially as you’re making that move instead of after the move has already been completed, just because reorganizing network resources can be quite a lift if there’s a lot of workloads in there already. So that’s a great point, Nathan.

Concurrency 24:04 Awesome. And that kind of touches on the context of identity, right?
It’s interesting. There’s always a statistic that comes out from Microsoft: the number of Azure accounts, Azure admin accounts, that don’t have multi-factor authentication enabled, and it’s always in the single digits of ones that do have it enabled. You’re like, what? 90% of accounts don’t have MFA? Yup. I mean, that’s really sad. You can understand why it’s so easy to compromise Azure environments oftentimes, because the username and password is really all that’s governing whether people actually have access or not. So it’s super important that you have multi-factor auth, but then also using privileged access management. Will, you want to talk a little bit about why privileged access management is something that, especially for Azure admin accounts, should be enabled?

Will Parker 25:00 Yeah. So typically the model that IT administrators have used is to have separate admin accounts, things like that, which is a pattern that works in Azure. It just becomes less secure. Where privileged access and identity management comes into play could even be for resource governance and things like that. Let’s say you have IT administrators that need to have administrative control over the environment, just not all the time. Or let’s say a lot of that’s governed by DevOps processes or automated processes where we have service principals hooked up with these things. We don’t necessarily want IT administrators to have contributor or change rights within the environment. We wanna keep everything in code, except for those instances where they need to get in there and fix something or things like that.
We can assign them roles so they can access those roles when they need them, and also be able to audit those things and keep the attack surface lower by not having these all-governing, all-knowing admin accounts out there in the wild that are, you know, high-value targets for adversaries.

Concurrency 26:00 Great point. Yeah, I think the idea of “I have a separate account for admin duties” is sort of starting to recede, right? Is that really beneficial, or is it better for us to apply much more concrete security around privileged access management for, you know, most functions? Conditional access: the idea that my environment in Azure should only be accessed via a managed device. That’s really conditional access, the idea of conditions. What are the conditions? I’m a user that has not been compromised, of low risk, from a low-risk location, accessing from a managed device, such as Intune managed. So that allows me to know, OK, this isn’t being used from some device that I have no control over and God knows what is running on it, right? This is an opportunity for us to make sure that what is actually being used to access my environment is one that is safe, or at least mitigate the extent to which it isn’t. And then of course RBAC applied. Certainly lots of issues associated with how people have applied security controls inside of their Azure environment to do this well. OK. So a couple of other critical security controls. We hit this: multi-factor auth on accounts, conditional access on all accounts, learn about public endpoints, micro network segmentation, deletion protection. Really important thing. Interestingly enough, we have seen organizations that have been compromised and had ransomware. Of course, all of us probably have.
But we’ve seen organizations that have had that happen to their on-premises environment, and the Azure environment was protected in a sense because it was micro-segmented, so the ability to move into that environment was weakened. But for those environments that were compromised in Azure, deletion protection was really important, because if deletion protection was turned on and it required PIM, that added layers of protection that were able to mitigate some of that movement inside the Azure environment itself.

Will Parker 28:12
Deletion protection is even good for accidental things. Take your hub firewall: that should never, ever be deleted.

Concurrency 28:15
Ah, yeah.

Will Parker 28:21
But things happen, and that could cause a huge outage. For those things, if someone really does need to go in and modify core resources like that, it’s an extra step where they have to say, “OK, yes, I am explicitly turning this off so I can go in and manage this thing.” So no accidents even happen. The benefits are multiple in that.

Concurrency 28:41
That’s a huge point. Yeah, just accidental deletion, right? Not even someone intending to cause a problem. All right, recovery strategy. Many organizations are in a position where they don’t know how to recover their Azure environments, whether they were deployed well in the first place or not. So: enforce Azure Backup for any service it covers. It doesn’t cover everything; there are certain services that have SaaS-enabled backups, like Cosmos DB, for example. And especially this one in the middle here: ensure all PaaS services have an understood recovery pattern. Sometimes we’re really good at recovering the virtual machines, but getting the PaaS services back is a big mystery.
That covers the data environment, the AI workloads, and the container-based workloads, and that’s where redeployability is really the thing, in addition to the ability to recover the actual data itself: the ability to redeploy the container environment, the PaaS services, and the AI services that are being deployed through code rather than through something you had to click, click, click to deploy. It’s really important that you know how to get those back, because if you don’t, it can take a really long time, or you’re rebuilding from scratch if you haven’t built a pattern to do that effectively. Make sure you can audit that state to ensure that everything is protected, and also that your backups are stored with protection enabled so that the backups can’t be deleted. So it’s really important that you understand both what the Azure Backup service covers and the redeployability of those PaaS services. Redeployability is also important in the context of networking and your corp and management interfaces, which you may have initially configured by click, click, click rather than deploying them as infrastructure as code. Anything to call out here, Will?

Will Parker 30:44
Honestly, I just want to point out the top left there: enforce Azure Backup for any service it covers. If you have production workloads, it is very trivial to enable these things, and it’s a very easy thing to overlook, assuming you’re covered because you looked at the Backup center and saw a lot of things there. A simple policy to enroll everything as it is created or migrated is the perfect policy to have.

Concurrency 31:09
Mm-hmm. It’s important to understand the different types of recovery environments: what you are covered against and what you are not covered against. This is something that most people find out after something bad has happened rather than before.
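The audit idea discussed above (every resource is either enrolled in backup or has an understood recovery pattern) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Resource` record, the `BACKUP_COVERED_KINDS` set, and the sample inventory are hypothetical and not taken from the webinar or from any Azure API. In practice the inventory would come from an export of your own environment.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical inventory record (assumption for illustration only).
@dataclass(frozen=True)
class Resource:
    name: str
    kind: str                         # e.g. "vm", "cosmosdb", "container"
    backup_enrolled: bool             # enrolled in the central backup service
    recovery_pattern: Optional[str]   # documented redeploy/recovery pattern

# Assumption: resource kinds that the central backup service covers.
BACKUP_COVERED_KINDS = {"vm", "sql", "file-share"}

def audit(resources):
    """Return (name, finding) pairs for resources with no recovery story."""
    findings = []
    for r in resources:
        if r.kind in BACKUP_COVERED_KINDS:
            # Covered kinds must actually be enrolled.
            if not r.backup_enrolled:
                findings.append((r.name, "covered by backup but not enrolled"))
        elif not r.recovery_pattern:
            # Everything else needs a documented recovery pattern.
            findings.append((r.name, "no documented recovery pattern"))
    return findings

# Made-up sample inventory.
inventory = [
    Resource("vm-app01", "vm", True, None),
    Resource("vm-app02", "vm", False, None),
    Resource("cosmos-orders", "cosmosdb", False, "native backup plus IaC redeploy"),
    Resource("aks-frontend", "container", False, None),
]

findings = audit(inventory)
for name, finding in findings:
    print(f"{name}: {finding}")
```

The split between the two checks mirrors the distinction made in the talk: services the backup tool covers should be enrolled, and everything else needs a known way to be redeployed.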
So understanding the different recovery tools and where you should be using what is a conversation in and of itself. OK, number 6, and this could probably have been earlier in the deck, but I think it’s important to cover regardless of where it is: implementing a consistent and detailed tagging structure that’s representative of organizational policy goals. Policy and tagging go together. Policy is the way you enforce the controls you need in the environment, and tagging is the way that you understand what’s in the environment. And sometimes that understanding is aligned with a certain type of policy that needs to be deployed. So the first thing I’d say is: have a naming structure. This is one type of naming structure, but just have one. If you go out to your environment and you see “Nathan’s lab,” you probably have a problem, right? You need to put yourself in a position where your naming standards give you an understanding of what’s out there and let you quickly identify what kind of resource you’re looking at. Now, I’ve heard some people say, “Well, that gives people who have compromised your environment the ability to quickly find things that are important, so we should use other naming.” If you’re at that point, you are already too far gone. It’s much more important for you to understand your environment so you can protect it than it is to build the curvy streets of Waukesha to keep people from finding where they need to go. So it’s very important that you name things well, but even more important that you have tags that allow you to understand the environment. So: owning team, which we talked about a second ago, business unit, application name, moniker, cost center.
But what I didn’t talk about earlier was this idea of classification. Classification is essentially saying: what criticality of resource am I looking at here, and what is its security profile? That’s where you can use that tag to say that this type of classification will have this policy applied to it, which then applies to the actual resource itself. So a high-priority, high-security resource versus the lunch menu application might have different controls applied to them, or different requirements for recovery that will be reviewed as a component of that policy. The classification is essentially enabling you to say that all of these things aren’t the same. Some of them require more protection and controls than others, based upon the kind of application it is. So this is a shot at a kind of exhaustive list of what those tags could be. Your mileage will vary. Usually when we have these conversations, we’re determining what actually makes sense for an environment, but somewhere between this whole list and maybe half of it is where most people land. If you only have one tag, and it’s maybe application name or something, you’ve got a start, but there’s definitely opportunity to grow on that and get better. If you have no tagging, then you really have to do some cleanup, and it’s going to take some time. In environments that don’t have tagging applied, it just takes a lot of work to retrofit that. That’s one of the only things that you can’t fix in a relatively short period of time; tagging just requires a lot of legwork to get right if you didn’t do it to begin with. Will, is there anything in here that you’d call out as an underappreciated tag that’s important to talk about?
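As a rough illustration of how tagging and policy go together, the Python sketch below checks a resource’s tags against a required set and then looks up the controls implied by its classification tag. The tag keys, the classification levels, and the control names are all assumptions made up for this example; they are not the list shown on the slide and not an Azure Policy definition.

```python
# Assumed tag keys, loosely based on the tags named in the talk.
REQUIRED_TAGS = (
    "owning-team",
    "business-unit",
    "application-name",
    "cost-center",
    "classification",
)

# Hypothetical mapping from classification to the controls a policy
# would enforce for that class of workload.
CONTROLS_BY_CLASSIFICATION = {
    "high": {"backup_required": True, "deletion_lock": True, "public_endpoints": False},
    "low": {"backup_required": False, "deletion_lock": False, "public_endpoints": True},
}

def missing_tags(tags):
    """Return the required tag keys that are absent or empty, in order."""
    return [key for key in REQUIRED_TAGS if not tags.get(key)]

def controls_for(tags):
    """Look up the controls implied by a resource's classification tag."""
    missing = missing_tags(tags)
    if missing:
        raise ValueError(f"missing required tags: {', '.join(missing)}")
    level = tags["classification"].lower()
    if level not in CONTROLS_BY_CLASSIFICATION:
        raise ValueError(f"unknown classification: {level}")
    return CONTROLS_BY_CLASSIFICATION[level]

# Made-up example: a critical workload with a complete tag set.
payroll = {
    "owning-team": "finance-platform",
    "business-unit": "finance",
    "application-name": "payroll",
    "cost-center": "1234",
    "classification": "High",
}
payroll_controls = controls_for(payroll)

# Made-up example: the lunch menu app with only one tag applied.
lunch_menu_gaps = missing_tags({"application-name": "lunch-menu"})
```

The point of the design is that the resource itself never names its controls; it only declares what it is, and the policy layer decides what that classification requires.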
Will Parker 35:14
From the standpoint of data security, that’s one I’d point out. In addition to having internal levels of classification, like confidential, secret, top secret, those types of things, there’s also the question of what types of policies those systems are beholden to. Are you a FedRAMP organization, or do you need that? What regulatory compliance standards are you held to? This is especially important when you’re delineating between your developers having freedom of movement in a dev environment versus a prod environment where you have very strict data classification protocols, which will require reviews and all those sorts of things. If they need to work on systems that are dev related, and maybe they’re not handling live HIPAA data or something like that, you don’t want everything grouped under that one tag. It can be very important to delineate those things, so that, one, you can audit them at any point in time if you need to and not have to scramble, and also so you can enable your dev teams to move more freely in these restricted environments.

Concurrency 36:13
Totally. OK, so let’s talk about provisioning. Nobody likes sitting in line, but we also want there to be structure. There’s a balance between enabling people to be effective in the new world and governing to ensure it’s done right, and somewhere in this balance we need to figure out the actual picture. Now, there’s a good way and there’s a bad way in the context of Azure. The bad assumption is that everything needs to be done in the portal. Many environments that are difficult to recover, difficult to manage, or unorganized are ones that just did ClickOps in the portal, and that was the vehicle for all their deployment.
Now, for many people this is a super obvious understanding, but many started using Azure the way they did everything in the on-premises environment, with ClickOps. Why wouldn’t you do the same thing in Azure? And maybe they didn’t have those existing skills. So one of the things we see most often is the movement to this kind of environment, in which the Azure portal is largely view-only, with the exception of the sandboxes, and source code is the vehicle. Source code in this case means infrastructure as code, not even app code. The infrastructure as code is deployed through the control plane to the environment, and that’s the case for initial provisioning, for service modifications, and for service procedures like deletion; all should flow through a release management pipeline. The reason this is so important is that over here exist the versions of the code that’s being deployed, so I can always go back to the old version that I had deployed before if something broke when I deployed. That’s unlike a lot of ClickOps scenarios where it’s, “Wait, why did we deploy it that way? How did I deploy it that way? Was that the right way to do it?” A lot of times the person who built it doesn’t even work here anymore, and nobody knows why they configured it that way. So it’s super important for you to think about moving to this environment now. Of course, this means new skills: learning Bicep or learning Terraform. These are skills that your IT team may not yet have, but it’s a really important opportunity to bring their skills along for the ride. So when you get to Horizon 3 and everyone’s deploying cloud-native applications, which of course wouldn’t be deployed through ClickOps, your people are relevant; they have the necessary skills to work within that ecosystem.
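The benefit described above, that every deployed version is recorded so you can always go back to the previous one, can be modeled with a very small Python sketch. The `ReleaseHistory` class is a hypothetical toy, not how a real release pipeline or any Azure, Bicep, or Terraform tooling is implemented; it only shows why a recorded history makes rollback a lookup instead of an investigation.

```python
class ReleaseHistory:
    """Toy model of a pipeline's append-only record of deployed versions."""

    def __init__(self):
        self.deployed = []  # versions in the order they were deployed

    @property
    def current(self):
        """The version currently live, or None if nothing was deployed."""
        return self.deployed[-1] if self.deployed else None

    def deploy(self, version):
        """Record a deployment of the given infrastructure-code version."""
        self.deployed.append(version)
        return version

    def rollback(self):
        """Redeploy the version that was live before the current one."""
        if len(self.deployed) < 2:
            raise RuntimeError("no earlier version to roll back to")
        previous = self.deployed[-2]
        # The rollback is itself recorded, so the history stays auditable.
        self.deployed.append(previous)
        return previous

history = ReleaseHistory()
history.deploy("v1")
history.deploy("v2")            # suppose this deployment broke something
restored = history.rollback()   # go back to what was running before
```

With ClickOps there is no equivalent of `deployed`: the only record of the earlier state is whatever someone remembers.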
Will, what’s been your practical experience with people moving through that skills development to be able to work within this ecosystem?

Will Parker 39:17
Yeah. So I would say there are definitely stages to it. The ultimate end goal is to have an entire CI/CD environment with versioned releases of your separate modules, having your own Terraform repository, Bicep repository, or whatever it is. But ultimately, just taking the steps to start deploying with infrastructure as code in the first place, even if it’s by more primitive means, and actually defining your entire environment in code, I think is incredibly important.

Concurrency 39:35
Mm-hmm.

Will Parker 39:42
Anyone with a certain level of scripting knowledge should be able to pick up something like Terraform fairly quickly. And the peace of mind of being able to deploy your entire prod environment again, your entire dev environment, or an entire region with a couple of commands at the command line, instead of going through the insanity that ClickOps in the portal can be, is huge, because it’s a matter of taking a week to redeploy your environment versus 20 minutes. Just being able to define your environment in Terraform or some sort of IaC of your choice is huge. It takes time to get to fully automated CI/CD pipelines and automatic deployments when things are updated; that’s a level of maturity that’s down the road. But just adopting Terraform in the first place is huge. It’s a step in the right direction.

Concurrency 40:37
Here’s why. OK, so in this context it’s important that you build out the right structure, and sometimes this is an opportunity for organizations to retrofit to an appropriate structure.
This is especially true if they’re finding they’re running into scale problems or management problems, or they can’t apply policy in the way they’d like to. What you can see here is an example of a bad structure, and why it’s bad is that everything is consolidated into one subscription or a small number of subscriptions. This anti-pattern came about as a result of what happened when people first started building Azure environments: they had a lot of subscriptions with no structure and no naming, it was very hard to find out where things were, and there was a lot of subscription creep without any subscription governance. You need to land somewhere in between. What I mean by that is, subscriptions aren’t the enemy. Subscription scale is not really a problem, but scaling without governance is what creates the bigger issues. Trying to consolidate everything underneath a small number of subscriptions leads to an inability to manage, and it runs against how even Microsoft has talked about how scale works, in something called the Cloud Adoption Framework, which includes this design. What you can see in this picture is the management group structure that the Cloud Adoption Framework typically prescribes, which is this idea of the platform functions being in their own sets of subscriptions. As you can see below: identity, management, and connectivity, connectivity meaning this is what talks to the on-premises environment; this is the VPN. Then landing zones, being where your workloads actually go. This is a really interesting thing: most people say, “I need to deploy a landing zone for my Azure.” Landing zones is plural, right? Landing zones means different workloads get different subscriptions based upon the level of segmentation we want, and this is a vertically scaling element of the architecture.
So what’s important is that you have them attached to the tree, that you have a method to the madness of how you’re doling out subscriptions, and that you have a subscription vending process that does that in an organized way, either with lots of subscriptions or a moderate number, based upon what works for your organization. But it’s separate from identity, management, and connectivity. It’s separate from the decommissioned subscriptions, and it’s especially different from the sandboxes, which are subscriptions that are not connected to your corp net and are used for testing stuff out. Will, any other learnings that you’ve had from helping build out a structure like this?

Will Parker 43:42
Yeah. It may come across as something that’s pretty arduous, especially if you’re already in Azure and you want to move to this model. But I have to say, when it comes to managing resources, especially while growing as an organization, the biggest immediate payoff to adopting a structure like this one in the Cloud Adoption Framework is policy application. You can apply blanket policies to your connectivity subscriptions or your connectivity management group, especially if you have multiple regions, where you have very specific lockdown policies on what can and can’t be done in your network. Those policies on your connectivity subscription would be crippling in a landing zone subscription, where you need people to be making changes and deployments. So it really helps you delineate what policies you need at a very high level. You don’t have a million policies all at the subscription or resource group level, forgetting whether one got applied to this one or not. It’s all inherited, just like from a group policy angle.
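The inheritance behavior described above can be sketched as a small tree walk in Python: a scope’s effective policies are its own assignments plus everything assigned to its ancestors. The `Scope` class, the group names, and the policy names below are illustrative assumptions that loosely follow the structure discussed in the talk; they are not the Azure management group API or real policy definitions.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Scope:
    """A management group or subscription in a hypothetical hierarchy."""
    name: str
    parent: Optional["Scope"] = None
    policies: set = field(default_factory=set)

def effective_policies(scope):
    """Collect policies assigned at this scope and at every ancestor."""
    result = set()
    node = scope
    while node is not None:
        result |= node.policies
        node = node.parent
    return result

# Assumed hierarchy and policy names, for illustration only.
root = Scope("root", policies={"require-tags"})
platform = Scope("platform", parent=root)
connectivity = Scope("connectivity", parent=platform,
                     policies={"deny-network-changes"})
landing_zones = Scope("landing-zones", parent=root,
                      policies={"enforce-backup"})
app_sub = Scope("sub-app01", parent=landing_zones)

connectivity_effective = effective_policies(connectivity)
app_effective = effective_policies(app_sub)
```

In this sketch the network lockdown applies only under the connectivity branch, so the landing zone subscription inherits the tagging and backup policies without being blocked from making changes.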
And while it is fundamentally different from group policy, you can draw a lot of parallels there in how to think about it and how to set this up, and this structure ultimately helps enable that.

Concurrency 44:47
Yeah, I agree. I’ve similarly used group policy as an analogy for this, right? The way people thought about group policy has a lot in common with how you build out management group structures. We won’t go into more detail on this, but just know that each of these is connected to its requisite parts. For example, the identity subscription contains the domain controllers that exist in Azure to facilitate authentication of the virtual machines you have in Azure, et cetera. There’s a lot of secondary build-out of this. OK, number 9. This really goes to how you organize your team to do cloud well, and to building out a cloud governance backlog. The most effective teams have a center of excellence team or a set of architects that focus on optimizing their Azure environment, governing and enabling it in the necessary balance. But then they also build out a backlog that contains the activities they are performing against those domains. So they have a backlog epic for governance, for migrations, for cloud-native activities being supported, and for cloud data activities being supported, and within these are PBIs or subtasks assigned to team members who then perform those activities. And I find that many, many cloud management teams know some of this stuff has to happen, but they never articulated that these are action items that need to be performed, stack ranked them, put them into a backlog, and had someone actually make a choice to do them or not do them. They just sort of say they don’t have time available.
So they don’t really have time to go audit their backups or to build the policy appropriate for the classification of workloads. Don’t make that on you; make it on the organization to make appropriate choices about what gets prioritized in that stack. But the first thing you need to do is itemize the work: what are the things we need to get better at? The most effective teams are really rigorous about making that a thing that exists and building it out well. And that’s supported by the last item for our conversation today, this idea of establishing a cloud product team, which contains a couple of different elements. If I’m building an operational product team, this can be one person doing all of it, which would be difficult, or it could be a number of people split between teams. Team one, or functional domain one, is legacy data center operations. Let’s say I’ve got a whole bunch of stuff happening in Azure, but I’ve also got a bunch of workloads running on prem. You need to have somebody still caring for all that legacy data center operation that is running in Horizon 1. It’s likely a different set of tools than what you are now running in Horizon 2, or it should be, and certainly a different set of tools from what’s happening in Horizon 3, and it needs to continue to operate. Assuming that continues to operate, you then need to build out these other functions. Governance is another function: it builds out the governed operational state, it maintains that backlog, it performs cost management activities, it builds out policies and maintains them, and it enables new application teams being onboarded into the environment by having the right policies in place. That then leads to a group that is doing adoption activities.
Adoption activities would be enabling a new application team to use functions within Azure, enabling them to light up new data workloads, or engaging with the dev team on dev-centric skills. This is the adoption functionality that has to happen to make them effective within the environment. So that’s the balance, right? There’s the balance between governance and adoption that needs to exist. But then there’s also this operational function. If you have migrated a bunch of virtual machines into Azure, you need to have a team that is essentially operating them. If these are one, two, three, four, someone’s doing task one and someone’s doing task four in different horizon states, and those are somewhat different sets of skills. Or maybe the same group is doing both, but that would be really difficult. So you have to have that operations function, one that understands how the migrated virtual machines are operating in the modern state, is capable of recovering them, knows whether or not they’ve been compromised, understands the networking ins and outs, and can troubleshoot it. That’s different from core governance and core adoption functions, so know that it’s its own function in and of itself. And then the last function is this idea of application cloud DevOps. Let’s say I’ve got a team that’s building a ground-up application in the cloud. Most of the time that team has all the things it needs to be effective. You’ve given it a subscription and the ability to deploy into the Azure environment under certain constraints, and they’re kind of off and running. You might have a DevOps team member who’s helping them with onboarding, but largely they’re self-maintaining.
So these application cloud DevOps teams need to have the ability to build and deploy, and to work with the center of excellence to do that work. Sometimes the DevOps function exists in the centralized group and works with an application team, or the application team might have enough work that it has its own person and its own controls. So why is this important for cleanup? Why is this spring cleaning? Well, it’s hard to do things well unless you have structured yourself well. So if you need to think about how your cloud product teams are organized, how skills are applied, and how people go about their work, this is a great starting point. As we go through this today, I want you to think about it this way: corporate IT has responsibility for the cloud platform itself. Security may be a different group. And bit by bit, as the business takes on more and more control of the ecosystem, those application teams will become more and more self-managing, and that’s an element of how we actually go about making that real. So as we try to do this well, think about your Azure environment as a road. If your road is bumpy and difficult to drive down, then you’re going to have trouble going fast. And if there’s one thing I understand about roads, it’s how to go fast on a bike. This isn’t me, but it’s an example of something I like to do. The only reason this person can go fast is that they’re riding on a flat road and they know with confidence that what’s in front of them is going to support the bike, because it doesn’t turn very well, and it’s easy to fall off, and all those kinds of things. But you sure can go fast once you’ve gotten yourself organized.
And what you might be finding is that some of the things we just talked about are getting in the way of you being effective on that journey, and that’s where we want to help you clear the way. So think about all these things that we talked about today. All of them are relevant to how you take next steps, and what we would love to offer as you leave today is a complimentary Azure optimization assessment. If you think any of these might be present in your environment and you’re thinking, “Oh man, I wish someone would take a look at what I’ve got, give me some forward prescription on how to make it better, and talk about it; maybe they can help me out,” that’s what we would love to do. We’d love to have a conversation with you about making this real in your Azure environment and helping you either get cost optimization, drive new opportunities in the environment in an appropriate way, or just structure it for the first time. Maybe you’ve got stuff out there and you’re thinking, “We’ve never structured this well, and now is the time for us to do it effectively.” All right, so we got there with time to spare. Just confirming there aren’t any additional questions: go ahead and drop those in the chat if you have any questions for Will or me, and we would love to answer them. If not, we hope you have a great rest of your day. Let’s just pause a few seconds in case anybody has questions. Awesome. All right. Well, I hope you all have a great day. Thank you for joining us. Make sure you fill out the survey, and we will see you next time. Thank you.

Will Parker 53:54
That’s all.