20240730 UEFI Secure Boot Issue

    11:53PM Aug 2, 2024

    Speakers:

    Rob Hirschfeld

    Claus Strommer

    Keywords: system, bios, secure boot, keys, firmware, boot, patch, updates, signed, kernel, trusted, vendor, os, flash, hardware, bootloader, downtime, talk, thought, faster

    Rob Hirschfeld, hello. I'm Rob Hirschfeld, CEO and co-founder of RackN and your host for the Cloud 2030 podcast. In this episode, we explore the UEFI certificate issue, in which Secure Boot is potentially compromised because certificates that are included in most UEFI BIOSes have been leaked or compromised in ways that could easily be used as an attack vector. It's a very significant flaw and something that should be on your radar to fix and patch. We're going to talk about what the issue is, why it's important, how Secure Boot works, and what you can do to mitigate this problem in your own infrastructure. A really, really important episode for anybody running or managing desktops, data centers, or any infrastructure of any type. So enjoy it.

    I was hoping to start today with this UEFI Secure Boot key compromise question, and if it's interesting enough, we could go the whole time. Yeah,

    I did. I did see this article, or this headline, crop up in my news feed. Yeah, the other thing, the article that I saw a little bit before this one, was a news article that reminded me of you, actually. It was basically saying that a significant number of enterprise users don't use Secure Boot at all.

    Which is very true.

    Yeah, I do remember you talking about that in one of these sessions. And so the first thing that I'm thinking of here, in the context of, again, Secure Boot being broken, and at the same time a lot of people not using it, is: well, okay, who is affected by this? It's a

    good question.

    I think probably so

    Let's, let me back up, just to make sure that everybody's on the same page and that I'm understanding what the problem is correctly, because it's worth going through what Secure Boot is and how it works, because there are layers to this. It's not simply using HTTPS to do the boot. If you're doing Secure Boot, what you're supposed to get is that, in Secure Boot mode, the BIOS actually enforces whether or not the operating system is a trusted operating system. So the operating system images or kernels are signed in such a way that when the BIOS boots them, it can check those signatures, and if they don't match, it won't allow them to boot. If they do match, then it'll proceed with the boot process. In a nutshell, that's my understanding of what we're saying when we talk about Secure Boot. Is that fair, for the RackN team? Did I get it right? Highly simplistic,

    and everybody's still on mute. You did okay.

    I mean, the idea is in the nuances: each layer is supposed to verify and maintain the chain of trust as it goes. So the BIOS verifies the bootloader, the bootloader then is supposed to verify the next layer, and so forth and so on. In so much as all the layers build up a chain of trust, that process is maintained. Usually the primary elements that drive that are the BIOS itself, which is supposed to be signed by a key known to the core system, and then the initial bootloader, be that GRUB, or a network bootloader, or even the kernel directly. All those transition through various stages, and each of them is supposed to be signed. Where you get into some tricky things are questions like: did the Linux kernel implement Secure Boot for kernel modules, and signing kernel modules? Those kinds of things become follow-on actions that the OS has to implement to validate. Or, like ESXi: they make rules that all of the software being deployed on ESXi has to come from a signed VIB that matches a chain of trust, so that it follows the chain of trust. So each layer is responsible for adding to the chain. But that's the basis.
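
    To make that layering concrete, here is a minimal toy sketch in Python. Everything in it is made up for illustration: real firmware verifies Authenticode signatures on PE/COFF images against the EFI signature databases (db and dbx), not bare SHA-256 digests like these.

```python
import hashlib

# Toy model of the UEFI Secure Boot chain of trust. Real firmware checks
# cryptographic signatures against the 'db' allow-list and 'dbx' deny-list;
# plain SHA-256 digests stand in here so the control flow stays visible.

db = {"<sha256 of signed shim>", "<sha256 of signed grub>"}   # allowed
dbx = {"<sha256 of revoked bootloader>"}                      # revoked

def digest(image: bytes) -> str:
    return hashlib.sha256(image).hexdigest()

def verify_stage(image: bytes) -> bool:
    """Each layer runs a check like this on the next layer before jumping to it."""
    d = digest(image)
    if d in dbx:        # a dbx hit wins even if the signature would be valid
        return False
    return d in db

def boot(stages: list[bytes]) -> None:
    # Firmware verifies the bootloader, the bootloader verifies the kernel,
    # and so on. One failed check halts the whole boot.
    for stage in stages:
        if not verify_stage(stage):
            raise SystemExit("Security violation: untrusted image, halting boot")
        # ...control would transfer into `stage` here...
```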

    And one of the reasons why I think people don't turn this on is that if you can't start that process, the systems are basically going to hang. Like, what happens if you say, I want UEFI in secure mode, and then you can't start the chain going? From that perspective,

    it just locks up, yeah. It prints a nasty message, in various shapes and forms, that says this is not a trusted layer and execution is stopping.

    Okay. And then does it boot loop? Or does it require somebody to reset?

    It varies by the location of the failure, and BIOS settings and stuff like that. So if it were a network boot failure, then it might choose to boot loop. If it were a hard disk failure, it may choose to just permanently stop, requiring you to intervene, right? Yeah,

    at least in my experience, the system behaves as if there is no bootable disk, and prompts for user input,

    which, in an environment like an edge deployment, where it might be hard to actually fix that, means you've now removed a backup path to say, hey, wait a second, no, no, just boot regular, please. You're out of luck until you either reset the BIOS, which, if you're doing this right, probably now has a password on it, so it's not so easy to reset, or somebody actually shows up with a trusted OS. It's a pretty significant lift from a recovery perspective, or a resilience perspective.

    Right. And in certain situations, it is certainly the desirable behavior, to minimize the chance of casual intrusion. But yeah, it was certainly not designed with cloud architectures in mind.

    All right, so now we're into the oops, what did I just do to myself? Or at least the reason why. Now, the benefits are high, right? Without this, anybody could potentially show up on your network and stream in an untrusted OS and take over your machine. So it's sad that people don't turn this on, for the reasons we just named, but it's probably a really rock solid thing to do. I guess you could install a trusted OS for malicious reasons. A hacker could show up with a trusted kernel, reimage the system with, say, a signed AlmaLinux kernel, but then do untrusted things in it. It's a little more sophisticated of an attack to show up with an unsigned OS. Am I underplaying that? How big of a deal, or how likely, would it be for somebody to bring in an untrusted OS? Or why would somebody do that?

    If you're

    Sorry, go ahead. You wouldn't need to do that at all. All that Secure Boot on modern Linux distros will check is that the kernel is signed, because that's what the bootloader loads and verifies; the initrd isn't checked. What they do after that, based off of the kernel command line, is not inherently protected. That's, in fact, how we make Sledgehammer do its thing

    on our systems, where we are loading a completely trusted kernel, right? And pretty much also a trusted initrd, but we are,

    but once it is loaded, you know, we can do whatever we want to do with the system in Sledgehammer, even though

    we're building our own OS in Sledgehammer. But it's signed, is what you're saying; it's still a trusted OS, yeah.

    Fundamentally, the kernel that we load with the bootloader is still signed, so it doesn't break the UEFI Secure Boot chain of trust. The fact that we've completely rewritten the initrd doesn't matter, because that's not checked right now. There are some proposals to make it do that. I'm sort of leery about that, mostly because it would reduce flexibility enough that what we do would become a much more convoluted path. If you want to provision anything over the network, at some point you have to relinquish control of what the system is going to do to user space.
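
    If you want to see that boundary on a live Linux box, here is a hedged sketch. It assumes the sbverify tool from the sbsigntools package is installed, and the kernel path is just an example; adjust for your distro.

```python
import subprocess

# The bootloader only verifies the Authenticode signature embedded in the
# kernel image itself. On most distros today the initrd and the kernel
# command line carry no equivalent signature, which is the gap discussed
# above. sbverify (from sbsigntools) lists the signatures on a signed kernel.

kernel = "/boot/vmlinuz"   # example path; varies by distro and version

# List the signatures embedded in the PE/COFF kernel image.
subprocess.run(["sbverify", "--list", kernel], check=True)

# Note that there is nothing comparable to run against the initrd:
# it is simply not part of what Secure Boot verifies here.
```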

    Right. I mean, we could do that with a full version of any OS. Sledgehammer is just optimized for speed and a minimal footprint.

    All that UEFI Secure Boot gives you is a guarantee that what you've booted is signed by a trusted key. But just like with SSL, just because you have the padlock in your address bar doesn't mean that the site is trustworthy,

    right? Yes, they can still do malicious stuff. They're just going to do it while making sure that no one in the middle can sniff your traffic.

    So, I mean, y'all aren't selling me on why people should turn on Secure Boot, which is maybe why people aren't doing it.

    Well, for the consumer, this is not going to make much of a difference. But when Secure Boot came out, there was an actual, believable threat, and I say believable, not necessarily practical, at that time, of the BIOS being compromised and loading essentially persistent malicious code.

    I would say that what Secure Boot is optimized to protect against is targeted more towards the needs of regular Windows users, who don't want to customize their boot process. They don't want the ability to inject custom stuff at startup. They actually don't know enough about their systems to know what they want or not; they expect them to boot into Windows, and, you know, not to be taken over by whatever boot sector virus du jour is going around. Which used to be

    a big issue. Ah, okay. So it stops

    those sorts of attacks cold.

    So the issue isn't necessarily that I'm breaking the boot chain here and injecting a new bootloader. The issue is that somebody hacks my Windows or Linux or VMware (there's actually another bug to talk about, with a VMware hack, yeah) and modifies the kernel because they got privileged access, which is an entirely easy to imagine scenario. It's not

    just easy to imagine; it used to be my bread and butter when I worked desktop support at Dell.

    So they get there, the system reboots, and now they have silent kernel access, because they've modified the kernel in such a way that now they have a back door. Okay, so my misunderstanding here is that I thought Secure Boot was protecting you during the boot process. It's actually making sure that somebody hasn't taken a running system, hacked into it, and added a back door that you're not aware of.

    It's a tripwire, yeah,

    Okay, that's an important miss, because I usually think of Secure Boot and think of it as the boot process getting secured. Funny how I could infer that from the name, I don't know.

    Well, it is that, and it does that. It's just that the threat model it targets is not the general sort. It's not tailored towards, basically, what we do, for those of us who want to be able to switch operating systems and provision stuff over PXE and potentially change on the fly what's being loaded. That makes it awkward to not break the chain of trust just because we want to boot into Sledgehammer one minute and into ESXi the next. But if you're

    rebooting on a frequent basis, your threat level of, oh, somebody's going to take over my kernel, goes down, because you're literally reinstalling a trusted kernel from your own sources over and over again.

    If you're reinstalling on a regular basis, not just rebooting, yeah. Sorry,

    correct, reinstalling.

    And in the enterprise, the kind of environment where Secure Boot gives you the most bang for the buck is when you're dealing with confidential computing, which is where, basically, you want to be able to guarantee, all the way down to the hardware, that the components are trusted. So you're able to trust the hardware, the processor. You're trusting your storage, you're trusting your memory, you're trusting your bootloader, which is where Secure Boot comes in. You're trusting your kernel. And then all of those things build upon each other to guarantee that the environment where you're actually running your code is protected.

    Yes, that works as well, but that's usually the domain of very savvy end users who actually go to the bother of doing their own key enrollment and cert management for UEFI Secure Boot.
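
    For the curious, that do-it-yourself path looks roughly like the sketch below. It assumes the common openssl, sbsign (from sbsigntools), and mokutil utilities; the file names are placeholders, the exact steps vary by distro and firmware, and getting this wrong can leave a machine refusing to boot until you intervene from firmware setup.

```python
import subprocess

# Rough shape of rolling your own Secure Boot trust: generate an owner key,
# sign your kernel with it, and enroll the certificate so the platform
# trusts what you signed. File names are illustrative; run as root.

def sh(*cmd):
    subprocess.run(cmd, check=True)

# 1. Generate an owner key and a self-signed certificate.
sh("openssl", "req", "-new", "-x509", "-newkey", "rsa:2048", "-nodes",
   "-keyout", "MOK.key", "-out", "MOK.crt",
   "-subj", "/CN=My Owner Key/", "-days", "3650")

# 2. Sign the kernel image with it.
sh("sbsign", "--key", "MOK.key", "--cert", "MOK.crt",
   "--output", "/boot/vmlinuz-signed", "/boot/vmlinuz")

# 3. Enroll the cert. mokutil wants DER; it queues the enrollment and asks
#    you to confirm it, with a one-time password, at the next boot.
sh("openssl", "x509", "-in", "MOK.crt", "-outform", "DER", "-out", "MOK.der")
sh("mokutil", "--import", "MOK.der")
```

    That is the shim/MOK route. Enrolling keys at the firmware level, replacing the PK and db outright as the high-trust shops described below do, uses the efitools utilities instead and is considerably less forgiving.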

    Yeah, and that is actually the part I wanted to get to next. For the poor lay person, or even for most IT people, key management for Secure Boot is a completely foreign, un-understood black box, and clearly not user serviceable. It's not

    something that I would have an expectation that a normal user would be able to manage, because it's really easy to mess up. Yeah,

    It's something I wouldn't even expect a normal IT team to be able to manage, even though they should be able to, for Secure Boot to actually work as intended. And I think that is one of the biggest problems, which also goes back to the article that triggered this whole conversation: in most situations, you end up just trusting the keys that come out of the box, and that's it.

    And the default operational model, you know, booting into whatever Windows wants to have pre-signed, or whatever your upstream Linux distro provides, that model works. Yeah, I

    mean, it also didn't help that when Secure Boot came out, there was an incentive for Microsoft to exclude everyone else and make it difficult to replace those keys, because that would guarantee them a stronghold on the boot sector.

    Yeah, there's some of that too, but I think that's mostly historical these days. Yeah,

    yeah, it's,

    it definitely hasn't seen as much usability research as TLS, for example.

    Speaking of TLS, it was announced yesterday that Let's Encrypt is dropping OCSP support in favor of just certificate revocation lists.

    Sorry, it looks like Rob dropped off.

    Yeah, which is why I'm trying to keep the conversation alive here.

    Oh yeah, Let's Encrypt. Yeah, certificate revocation is tricky, that thing. Yeah, I saw that going by.

    But the article came to mind because I was thinking, well, what Secure Boot needs is a CRL, and it doesn't have that.

    It does have that. The problem is, to update the CRL, you basically have to flash something. Exactly, yeah. And we all know that regular firmware updates aren't exactly on everyone's schedule.
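
    You can at least inspect the revocation list you already have without flashing anything. Below is a simplified sketch that reads the dbx variable Linux exposes through efivarfs; the GUID in the path is the standard EFI image security database GUID, and the parsing is a bare-bones reading of the EFI_SIGNATURE_LIST layout, with error handling omitted.

```python
import struct

# The firmware's deny-list (dbx) is exposed on Linux via efivarfs.
# Layout: 4 bytes of variable attributes, then one or more EFI_SIGNATURE_LIST
# structures: a 16-byte type GUID, three uint32s (ListSize, HeaderSize,
# SignatureSize), an optional header, then the revocation entries themselves.

DBX = "/sys/firmware/efi/efivars/dbx-d719b2cb-3d3a-4596-a3bc-dad00e67656f"

with open(DBX, "rb") as f:
    data = f.read()[4:]        # skip the 4-byte attributes word

entries = 0
offset = 0
while offset + 28 <= len(data):
    list_size, hdr_size, sig_size = struct.unpack_from("<III", data, offset + 16)
    body = list_size - 28 - hdr_size   # bytes holding the actual entries
    entries += body // sig_size        # each entry: 16-byte owner GUID + data
    offset += list_size

print(f"dbx contains {entries} revocation entries")
```

    The point stands, though: reading it is easy, but getting a newer dbx into the firmware still means flashing something.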

    Yeah, good

    luck. Even over decades, they can't be updated as fast as Let's Encrypt does it, where, you know, your cert is good for 90 days and then it's dead. Hope you have a new one by then.

    It's understandable as well. A system may be offline for years at a time before it's brought back online. Well,

    that's the difference between online in the OS and online in the EFI firmware.

    Actually, that does bring up an interesting scenario. Does Secure Boot cause problems for, like, hardware museums, or will it cause problems for them in the longer term? Let's say a system stays offline for two decades and then is found in someone's attic, and they can't boot it because the Secure Boot keys have expired.

    possibly, but that's only on systems where you can't just go and turn it off.

    Well, we see plenty of those in industrial systems,

    The thing with industrial systems where you can't turn Secure Boot off is that they're old enough that Secure Boot wasn't ever a thing. And the issue is that they're running on whatever mid-80s to mid-90s hardware they're running on top of, and, you know, they have to source everything from eBay, and that stuff's going away.

    Oh, you meant turning Secure Boot off; I thought you meant turning the system off. Oh, yeah, sorry. Here's actually another scenario. Let's say a system, by some miracle, keeps running for a couple decades without firmware updates, and then it reboots, and then it's not bootable anymore because the key has expired.

    Yeah. I mean, that's another failure mode that Secure Boot can have.

    Anything else in the news recently?

    Let's see if Rob's gonna be back online.

    Rob lost his internet, yeah, is what he said. So, with regard to the UEFI one: they included keys that were compromised, it sounded like.

    There was a repo out there that included totally unencrypted private keys that managed to wind up in some firmware. And it was found that, with AMI and a few of the other big UEFI firmware vendors, some of the test private keys they had were kind of an open secret and were protected by four-character-long passwords. Anyone who can get a hold of those can trivially brute-force decrypt them and then sign whatever they want, packages or bootloaders or whatever.
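
    To put a four-character password in perspective, here is a back-of-the-envelope sketch. The character set and guess rate are assumptions, since the reporting didn't spell either out, and real cracking rigs are far faster than this.

```python
# How long does a 4-character password protect an encrypted private key?
# Assume the full printable-ASCII alphabet and a single machine doing a
# modest one million key-derivation attempts per second. This is an offline
# attack against a leaked file, so no lockouts or rate limits apply.

alphabet = 95                  # printable ASCII characters
length = 4
guesses_per_second = 1_000_000

keyspace = alphabet ** length  # 95**4 = 81,450,625 candidates
worst_case = keyspace / guesses_per_second

print(f"keyspace: {keyspace:,} candidates")
print(f"worst case: {worst_case:.0f} seconds (~{worst_case / 60:.1f} minutes)")
# About 81 seconds worst case: effectively no protection at all once the
# encrypted key file leaks.
```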

    Because there's one problem of getting the keys out there, right, or having keys that have been compromised. The other is that you have to have the private key of that public key to take advantage of it. And so what I wasn't sure of is how available those private keys were. I totally get, you know, those are deprecated and we threw them away. Well, those were supposed to stay unknown.

    Yeah, that's not necessarily the case. It said that these were test keys, and they clearly said test key, do not use, in one of the key description fields, but they got into firmware anyway, and have been there for decades. And it turns out that it's pretty easy to find the matching private key and decrypt it.

    That's the part that I wasn't sure of, right? Because that's the key part. Yeah. Well, you know, if they're internal test team keys, okay, great, right? Or something that got lost or eaten or thrown away. But they were leaked; that is the issue.

    And that, to me, is the bigger problem, right? These test keys have their private keys available for any schmo to pick up. That makes it much more problematic, yeah. For me, the problem is that the remediation is going to be almost impossible without touching each of those by hand.

    Yeah, right. There are going to be a lot of systems affected, and for a lot of those, the hardware vendor only provides firmware patching for certain operating systems. Say the system is too old to run a newer operating system, or it uses another operating system, like swapping Windows for Linux, that is not supported by the hardware vendor. In terms of the systems affected, not all of them are going to be patchable, because we know the hardware vendors only support certain OSes, and there might be hardware that is past the official support window of the vendor, so they're not going to release any firmware update anyway. Or they're not likely to, at least.

    So, I mean, on this: are you talking about the defect now? Sorry about dropping off.

    About the keys that got leaked.

    Okay. It changed our conversation; or, you know, before I dropped off, it changed my understanding, right? This means it's much easier to add in a kernel or change something and then make it look like it's fine on these systems. It's not about interrupting the boot process. And you were saying that we're not likely to have people distributing new keys?

    Right. I would imagine that some vendors are still going to distribute new keys, but I can also see a lot of systems being practically unpatchable. Because, for example, if the vendor distributes firmware patching only for Windows, well, what if you're using Linux or BSD or some other OS? You lose the capability of flashing the firmware, and the keys, in many cases, can only be revoked by flashing the firmware. You're SOL there.

    I mean, there's no easy fix from that perspective, as far as I can tell. Is there anything, from the RackN team's perspective; is this something that we could fix from within the OS, like boot into Sledgehammer, or do out-of-band management to update the keys?

    Generally, you have to flash something to update the keys.

    Right. But don't we flash the BIOS all the time? Or is this a different BIOS?

    I mean, we flash the BIOS whenever a customer runs the flash-the-BIOS workflow.

    So they could update their keys. They could remove the risky keys here.

    Generally not them; their vendor, whoever released the BIOS, would do that, right?

    Sorry, right. No, but this could be a path, right? So here's, I know when we've talked about this, I've heard that for military folks, or ones with extreme trust requirements, they sign their own OS. They have their own keys. They remove all the keys but the ones they use to sign their OS, and then, literally, that machine can only boot whatever OS they've signed. But my understanding was that the process to change the keys on a trusted system, and maybe I'm thinking of TPM and this is different than TPM, would literally require you to be in front of the machine, with, you know, the equivalent of a retinal scan to make that change. Or am I misunderstanding how this would work? Or maybe it's different than the TPM chips; maybe that's where my misunderstanding is. Victor, right? Is this different than TPM, a Trusted Platform Module, where the signatures are carried in a protected root of trust?

    It's different, but related, okay?

    I mean, conceivably, from that perspective, hackers could insert their own keys, right?

    if they had already compromised the system, yeah.

    That assumes a sufficiently clever hacker, and unless you're up against a state actor, that's usually not the case. Okay.

    This is a mess.

    Uh, yeah, kinda.

    So, as a minimum, people should be planning, well, they should be doing this. So here I'm back to: all the hardware vendors should be putting out a patch that eliminates these keys, or at least revokes them. Okay, so, real quick: do BIOS patches include Certificate Revocation Lists?

    Yes, they do. Okay,

    all right. So, as a minimum, the Certificate Revocation List should be updated. That's still a firmware patch, but that should be,

    you know,

    thus yet another reason why regular firmware patching is actually a very good idea.

    That's it. You know, I think, and this includes me, and I've been living in the BIOS and firmware world, I'd never thought about a Certificate Revocation List as being part of what you get when you're patching that firmware. I usually think of it as bug fixes, you know, potential vulnerabilities. But that's a pretty basic thing to be making sure you're revoking along those lines. Wait, so this security issue could be as simple as: the assumption is that the UEFI BIOS does correctly enforce a Certificate Revocation List, and you need to push a patch that includes the current revocations.

    Well, quote, simple.

    Yeah, simple in the sense that you hid a lot of process-oriented heavy lifting behind the simple thing. Yes, it's simple to flash the firmware. However, we have many customers where flashing the firmware on a regular basis is an entirely foreign concept, or they will only do it if their OEM tells them to, or they're only allowed to flash to specific versions for whatever workload-related reasons; and until the list of acceptable firmware for whatever app they're running is updated to contain the firmware that fixes the vulnerability, they're not going to flash to it.

    Or they may be using a system that should not be restarted. Yeah.

    Or they can't afford the downtime, or any one of the myriad reasons, okay,

    Right. Well, if they, yeah,

    Or it's a unique system that they cannot risk flashing.

    aren't using UEFI Secure Boot anyway. But maybe they, maybe they

    can't afford the downtime of flashing. But can they afford the downtime of an attack? I mean, something that could have been prevented by flashing. That depends on the comparative risk of system compromise if it's not flashed, versus the risk of downtime from the flashing itself. Again, if you look at industrial systems, five minutes of downtime can be billions of dollars in cost. If you're looking at, for example, power plant control systems, you cannot have downtime. Yeah, but if five minutes of downtime costs millions of dollars, how much downtime does a ransomware attack cost? That would probably cost hours, right, or days even. Absolutely. I'm not saying this is rational. I'm saying that when you present this to your boss, your boss is going to come back and say, we can't do this right now, because I don't want to be responsible for millions of dollars of downtime right now.

    I mean, different orgs and different people have a different, you know, calculus for the risk-reward of doing these sorts of things. Yeah,

    it's easy to talk also about redundancy and having failover mechanisms in place. Realistically, however, not every industry has this, particularly when you're dealing with legacy systems.

    And one of the biggest concerns, where we started with this, is that it's in desktops. And, you know, for desktops, you'd better be planning to do regular reboots and updates. For embedded systems, you'd better be planning to do updates, you know, reboots and reflashes. CrowdStrike teaches us you'd better have a process for being able to do a reboot on those systems,

    or 15 reboots. Yeah.

    Perfect, exactly. Yeah. Good. Well,

    The nice thing is that this is solvable by hygiene. It's a significant risk for people, but at the end of the day, it's something that you would take on as a hygiene fix,

    assuming, again, that the vendor gives you software that works on your desktop, right. If you're using a run-of-the-mill Windows system, you're probably fine. Now, if you're using, let's say, a system that cannot upgrade to Windows 11 because the hardware doesn't meet its requirements, and the vendor, for some reason, releases the fix as software that only works on Windows 11, what do you do? Right?

    But I mean, even on the desktop systems I have, it does look like I get BIOS updates from the vendor. I guess maybe it's coming in through a different software path. On my newer systems, I feel like I get notices that, you know, Dell pushed a patch, and do I want to patch my BIOS or not, as part of what looks to me like the Windows Update system. Am I misreading that?

    It depends largely on the vendor. Some vendors integrate it with Windows Update; others have their own tools that either need to be run manually or run as a daemon, and there's questionable quality among those as well. And then you have people like me who are entirely on Linux systems, and it's always a gamble to see if the Linux firmware update tool includes your hardware or not.
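
    On Linux, the tool in question is typically fwupd, backed by the Linux Vendor Firmware Service (LVFS). A hedged sketch of the check: it assumes fwupdmgr is installed, and whether your hardware, or a dbx entry, shows up at all depends entirely on vendor participation.

```python
import subprocess

# fwupd, via the Linux Vendor Firmware Service, is the usual path for
# firmware updates on Linux. On supported systems it even distributes UEFI
# dbx revocation-list updates as an ordinary updatable "device".

def fwupd(*args):
    subprocess.run(["fwupdmgr", *args], check=False)

fwupd("get-devices")   # is your hardware (and a "UEFI dbx" entry) listed?
fwupd("refresh")       # pull current update metadata from the LVFS
fwupd("get-updates")   # any pending firmware or dbx updates?
# Applying them is `fwupdmgr update`, which is exactly the "you have to
# flash something" step discussed earlier.
```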

    To be fair, as an enterprise, you probably have better control over that. But it's not always something that you actually consider as an IT team.

    And, you know, it's just interesting to me. These are layers, all these pieces, where, you know, when we hear, oh, every once in a while I should patch my BIOS, or, you don't worry about it, you wait until there's a crisis. Some of this stuff is making the news, but some of these things could actually be handled much more quietly, right? Actually, let me go all the way back. This is conceivably old news. These certificates should already be in the revocation lists. Yeah, if everything had gone to plan, then the vendors would already have them in the revocation lists. Although, Victor, you're the one who said earlier: that assumes people can even still support the older gear, right? Your gear could be old enough that it's not going to get a BIOS patch anytime soon.

    Yeah. I mean, that's always a risk.

    Yeah, I mean, hindsight is 20/20, of course, but if these keys were test keys only and were never supposed to actually go into production systems, then, going with what you were saying, Rob, it does make sense that they should have been in the revocation list in the first place, and the only difference would be that the test environment does not include those particular revocations. The other way around, where we accept them everywhere and you have to remember to revoke them in production, is what we got, I suppose.

    That would have been it, yes. It should have been logical: if your test certificates are a normal part of this process, then having them in the revocation list would make the most sense, because then you're also testing the revocation. Sorry, right. Have we learned that revocation lists don't actually work? Is that one of the challenges?

    And these are lessons that we've learned in IT much faster, because our environment moves faster than hardware.

    A bold claim, but yes. It's not just that it moves faster; it's that the lessons are

    learned both on the good side and on the bad side. It moves faster because the attack opportunities come faster, and because the bugs get introduced faster; as a result, we need to patch things more often. Yeah, but going back to the previous comment about desktop systems and it being common sense to patch frequently: well, that might be starting to become reasonable now, but you've got to remember that until not very long ago, when it came to hardware and firmware, the common wisdom was, if it ain't broke, don't touch it.

    Yeah, that's true.

    We see that quite a bit, actually. When we were talking about this, Victor, I asked you, or no, was this in the last call about patching, where we talked about the balance, and maybe this is the survey question, right: lag behind, or patch often? No, Victor, you and I were talking about this one-on-one, right? There are two schools of thought. You can say, I'm not going to take patches until they're proven and mature, or you can live on the bleeding edge and take whatever's coming out as fast as you possibly can. It's sort of hard; there's not much middle ground. And Victor, you came back with: live on the bleeding edge. Take the latest, with all the risks. Right back to the CrowdStrike risk, which was maybe a little bit more bleeding edge than people thought they'd signed up for.

    Yeah, well, part of it is that if you establish a policy that you're going to live on the bleeding edge and take changes from upstream basically as fast as they occur, then in parallel you're going to be forced to develop a policy on how to handle fixing things when something inevitably breaks. As opposed to the more reserved policy, which is the more commonly implemented one, of never taking updates unless you're forced to, which generally doesn't encourage a good breakage mitigation strategy.

    The expectation is that by updating often, you mitigate the risk of updating, because your updates are not as big, and in most cases it does absolutely fine.

    Your updates are not as big, you keep up to date with all the bug fixes, you know, all that other fun stuff. As opposed to living on code that is four or five years old, has a slew of known bugs, and is basically a ticking time bomb waiting for one of them to go off.

    Yeah, but even with regular updates, where the failure modes are smaller by the nature of statistics, there is still a chance of a big failure, and you still need to have a DR strategy. When you do slow updates, your DR strategy becomes more pressing, and the effort to do an update essentially becomes very close to the effort of doing a DR. Unfortunately, many teams end up forgetting about that part and end up moving fast without any thought about rollback or restoration.

    And to be fair, in some scenarios, the cost of having rollback capabilities was inconceivable, to use that word, and it shows. Later on, the technology became fast enough, or flexible enough, to support that, but that means you still have a large portion of legacy systems that are not supported, where you're essentially stuck with either manual recovery, as we've seen with the CrowdStrike issue, or manual updates, as with the UEFI key firmware issue here.

    Interesting. It seems to always come back to your ability to keep up, to have a strong process by which you are ingesting and applying patches and changes to your systems. That's, like, the answer in almost every case. All right, well, this has been a good discussion. I, as usual, learned something new, and it really corrected my impression of why it's important to turn on Secure Boot; we should actually be working even harder to make sure that people do that. Hopefully this conversation will help people. I appreciate everybody's time and your patience with my technical glitch. Next week we are off, and then we'll go back to the regular schedule: eventing, which, you were right, we didn't even get through it, eventing versus logging, two weeks from now. Thank you, everybody. Talk to you later. Thank

    you. Thanks, guys.

    Bye. Well, what a fantastic discussion. I don't know anywhere else on the internet, or in the pod universe, where people are having this type of pragmatic, accessible conversation about really significant infrastructure, security, and operational topics. So I hope you enjoyed this conversation. If you were interested in this, this is our bread and butter; please check out our other episodes. You can find out what we've done and where we're going at the2030.cloud. I hope you enjoy it, and see you soon. Thank you for listening to the Cloud 2030 podcast. It is sponsored by RackN, where we are really working to build a community of people who are using and thinking about infrastructure differently, because that's what RackN does. We write software that helps put operators back in control of distributed infrastructure, really thinking about how things should be run, and building software that makes that possible. If this is interesting to you, please try out the software. We would love to get your opinion and hear how you think this could transform infrastructure more broadly. Or just keep enjoying the podcast, coming to the discussions, and, you know, laying out your thoughts on how you see the future unfolding. It's all part of building a better infrastructure operations community. Thank you.