• A faulty update issued by CrowdStrike led to worldwide outages on Friday. CrowdStrike is a cybersecurity vendor that develops software to help companies detect and block hacks. It uses cloud technology to apply cyber protections to internet-connected devices, and the software requires deep access to systems to scan for threats. A fix has been issued, but it could be hard to implement: engineers will have to go into each individual data center running Windows to apply it, manually entering complex encryption keys on encrypted machines.

    Monday, July 22, 2024
  • On July 19, CrowdStrike released a sensor configuration update to Windows systems that triggered a logic error, resulting in system crashes and blue screens. The issue was caused by an updated Channel File 291 (a config file) that controls how Falcon evaluates named pipe execution. CrowdStrike has corrected the error and updated the file, but many systems are still affected.

  • CrowdStrike caused a global outage by pushing a faulty configuration update to its Falcon product, crashing 8.5 million Windows machines. The update aimed to enhance threat detection but contained a logic error that caused the CSAgent.sys driver to crash the operating system. Recovery was slow and manual, requiring physical access to each impacted machine. While CrowdStrike is primarily responsible, Microsoft's inability to restrict third-party software from running at kernel level, due to a 2009 agreement with the European Commission, also worsened the situation.

  • During and after CrowdStrike's global outage last week, the brand swiftly implemented a comprehensive communications response. It created a 'Remediation and Guidance Hub' on its blog, including a letter from the CEO, technical details, and additional resources, with a website banner directing users to the hub. CrowdStrike provided regular updates on X and LinkedIn, conducted two broadcast interviews on the morning the issue occurred, and added support content on YouTube. The messaging across all channels was tight, specific, empathetic, and apologetic.

  • CrowdStrike's software update on July 19 caused a massive IT outage that disrupted air traffic, particularly for major airlines like Delta, United, and American, which experienced widespread cancellations. Southwest remained largely unaffected, while Delta faced the most severe consequences due to its reliance on Windows systems and a lack of effective disaster recovery plans. This analysis uses ADS-B Exchange data to compare takeoff rates before, during, and after the update, giving a picture of the scale and duration of the disruption.

  • An update to a sensor configuration by CrowdStrike on July 19 caused the largest IT outage in history, significantly impacting the US aviation industry. Delta was hit the hardest, followed by United, with American and Southwest experiencing lesser or no effects. Delta's recovery took an extended period, resulting in thousands of flight cancellations.

  • A report says that CrowdStrike prioritizes speed over quality, which may have contributed to its recent failure that paralyzed airlines and caused significant financial losses. Former employees claim they repeatedly raised concerns about rushed deadlines, insufficient training, and increasing technical problems, but their warnings were ignored. The company disputes these claims, stating that it is committed to quality control and that the information came from disgruntled former employees.
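
    The before/during/after takeoff-rate comparison mentioned in the ADS-B Exchange item above can be sketched roughly as follows. This is only an illustration of the general idea, not the method used in the linked analysis: the function name, the hourly bucketing, and the sample counts are all hypothetical placeholders, not real flight data.

```python
from statistics import mean


def takeoff_rate_change(hourly_takeoffs, outage_start, outage_end):
    """Compare mean hourly takeoffs before, during, and after an outage.

    hourly_takeoffs: list of takeoff counts, one per hour (hypothetical input).
    outage_start / outage_end: list indices bounding the outage window
    (start inclusive, end exclusive).
    """
    before = hourly_takeoffs[:outage_start]
    during = hourly_takeoffs[outage_start:outage_end]
    after = hourly_takeoffs[outage_end:]
    if not before or not during or not after:
        raise ValueError("each of the three periods needs at least one hour")

    rates = {
        "before": mean(before),
        "during": mean(during),
        "after": mean(after),
    }
    # Percentage drop in the takeoff rate during the outage vs. the baseline.
    rates["drop_pct"] = 100 * (rates["before"] - rates["during"]) / rates["before"]
    return rates


# Made-up placeholder counts for one airline, NOT real ADS-B Exchange data.
counts = [100, 100, 100, 100, 40, 60, 50, 50, 90, 90]
result = takeoff_rate_change(counts, outage_start=4, outage_end=8)
print(result)
```

    Running the same comparison per airline would make it possible to rank carriers by the size of the drop and by how long the "after" rate stays below the baseline.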