CrowdStrike’s Falcon Sensor linked to Linux crashes, too • The Register

sabreW4K3@lazysoci.al · 4 months ago

CrowdStrike’s Falcon Sensor linked to Linux crashes, too • The Register

rem26_art@fedia.io · 4 months ago

they seem extremely competent at writing bad software

Wooki@lemmy.world · 4 months ago

Line must go up

baatliwala@lemmy.world · 4 months ago

That line isn’t going to recover for a while now

Possibly linux@lemmy.zip · 4 months ago

But the publicity

SayCyberOnceMore@feddit.uk · 4 months ago

Not sure the devs are to blame when there are statements like:

Kurtz therefore has the possibly unique and almost-certainly-unwanted distinction of having presided over two major global outage events caused by bad software updates.

So, I’m guessing it’s the business that’s not supporting good dev->test->release practices.

But, I agree with your point; their overall software quality is terrible.

rem26_art@fedia.io · 4 months ago

true true. If the general business pressures are not conducive to proper software release practices, no amount of programming skill can help them.

eee@lemm.ee · 4 months ago

“The most secure system is a system that’s not live. Crowdstrike, bringing you the best-in-class security.”

Possibly linux@lemmy.zip · 4 months ago

“I don’t test often but when I do it is in production”

highduc@lemmy.ml · edit-2 4 months ago

Ofc it is. And can’t do any updates because Crowdstrike doesn’t support newer kernels. Apparently security means running out of date packages. 🤡

Bitrot@lemmy.sdf.org · 4 months ago

That first issue was triggered by Falcon, but it was legitimately a bug in Red Hat’s kernel, triggered by BPF.

JWBananas@lemmy.world · 4 months ago

Nobody:

Crowdstrike:

Responsabilidade@lemmy.eco.br · 4 months ago

Difference between open source software and closed source software:

CrowdStrike’s bad code makes Linux crash -> the sysadmin has control over the system and can rapidly fix the issue by disabling CrowdStrike module -> downtime is limited
CrowdStrike’s bad code makes Windows crash -> the sysadmin has limited control over the system and relies on Windows/CrowdStrike people to fix the issue -> demand is too high because the issue hit many computers around the world at the same time -> huge downtime while a few people at Microsoft and/or CrowdStrike fix the issue one by one, manually

Bitrot@lemmy.sdf.org · 4 months ago

This is a laughably bad take.

You do realize sysadmins were fixing the Windows issue and not just waiting on Microsoft and CrowdStrike - right? They just had to delete a file.

Responsabilidade@lemmy.eco.br · 4 months ago

Oh! So that’s why the outage takes so long to recover from! Just deleting a file takes so long!

I’m glad you said it!

superkret@feddit.org · edit-2 4 months ago

You have no idea what you’re talking about.
The fix is to boot into safe or recovery mode, delete a file, reboot. That’s it.

It takes so long because millions of PCs are affected, and those are usually administered remotely.
So sysadmins have to drive to multiple places, while their usual workloads wait.
On top of that, you need the encryption recovery keys for each PC to boot into safe mode.
Those are often stored centrally on a server - which may also be encrypted and affected.
Or on an Azure file share, which had an outage at the same time.
Maybe some of the recovery keys are missing. Then you have to reinstall the PC and re-configure every application that was running on it.
And when all of that is over, the admins have to get back on top of all the tasks that were sidelined, which may take weeks.
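
For illustration only, the "delete a file" step could be sketched in Python like this. The pattern `C-00000291*.sys` comes from CrowdStrike's public remediation guidance and is not stated in this thread; the function name and the demo file names are made up. On an affected PC the target would be the CrowdStrike driver directory, reachable only after the safe-mode boot and recovery-key steps above, so the demo runs against a throwaway directory instead.

```python
import tempfile
from pathlib import Path

# Assumption: file-name pattern taken from CrowdStrike's public remediation
# guidance; it is not stated anywhere in this thread.
BAD_FILE_PATTERN = "C-00000291*.sys"


def remove_bad_channel_files(driver_dir: Path, pattern: str = BAD_FILE_PATTERN) -> list[str]:
    """Delete regular files in driver_dir whose names match pattern.

    Returns the names of the deleted files, sorted.
    (Hypothetical helper, not part of any CrowdStrike tooling.)
    """
    removed = []
    for path in sorted(driver_dir.glob(pattern)):
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed


if __name__ == "__main__":
    # Demo on a throwaway directory so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        demo_dir = Path(tmp)
        # Hypothetical file names, used only to exercise the pattern.
        (demo_dir / "C-00000291-00000000-00000032.sys").write_text("bad")
        (demo_dir / "C-00000285-00000000-00000001.sys").write_text("ok")
        print(remove_bad_channel_files(demo_dir))
        print(sorted(p.name for p in demo_dir.iterdir()))
```

The deletion itself is trivial. As the list above says, the slow part is getting each machine to a state where it can be run at all.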

Bitrot@lemmy.sdf.org · edit-2 4 months ago

Uh, yes. Physically touching thousands of computers to boot them into safe mode and delete a file is time consuming. It turns out physically touching thousands of machines is time consuming anywhere, especially when it is all of them at once.

Which is why your take is laughably bad. Stick to the tech and not zealotry next time, and maybe not CNN for tech news.

JWBananas@lemmy.world · 4 months ago

Sysadmin here. Wtf are you talking about? All we did was “rapidly fix the issue by disabling Crowdstrike module.” Or really, just the one bad file. We were back online before most people even woke up.

What do you think Crowdstrike can do from their end to stop a boot loop?

SquigglyEmpire@lemmy.world · 4 months ago

…what?

A busted kernel module/driver/plug-in/whatever that triggers a bootloop is going to require intervention on any platform no matter whether the code happens to be published somewhere out on the internet or not. On top of that, Windows allows you to control/remove 3rd party kernel drivers just like on Linux, which is exactly what many of us have been stuck doing on endless devices for the last three days.

I fully advocate for open-source software and use it where I can, but I also think we should do that by talking about its actual advantages instead of just making up nonsense that will make experienced sysadmins spit out their coffee.

MangoPenguin@lemmy.blahaj.zone · 4 months ago

The fix on Windows was just removing the bad file; there was no reliance on CrowdStrike to fix the initial issue that I know of.

Rentlar@lemmy.ca · 4 months ago

I’ve kept having to make this point repeatedly every time someone writes “It’s not a Microsoft/closed source problem, it happened to Linux too”.