Anthropic’s Project Glasswing Flags Thousands of Critical Software Flaws in First Month

Anthropic and about 50 partner organizations used Anthropic's Claude Mythos Preview model under Project Glasswing to scan systemically important software during the initiative's first month, identifying over 10,000 vulnerabilities rated high or critical. While discovery rates outpaced expectations, Anthropic says progress is now constrained by the pace at which issues can be validated, disclosed to maintainers, and patched.

Key Points

Project Glasswing and partners identified over 10,000 high- or critical-severity vulnerabilities in the first month, with discovery now limited by the capacity to validate and patch findings.
Major providers including Cloudflare, Mozilla, Palo Alto Networks, Microsoft, and Oracle reported substantially higher detection or patch volumes, affecting cloud, security, browser, and enterprise software sectors.
Anthropic has disclosed hundreds of high- or critical-severity bugs to maintainers, launched Claude Security in public beta, and partnered with the Open Source Security Foundation's Alpha-Omega project to help process reports.

Summary: In its initial 30 days, Project Glasswing applied Claude Mythos Preview across critical software stacks and open-source repositories, producing more than 10,000 high- or critical-severity vulnerability findings. Partners reported dramatic jumps in detection, but Anthropic says follow-through - validation, disclosure, and remediation - is the main limiter on progress.

Anthropic launched Project Glasswing with roughly 50 partner organizations, applying Claude Mythos Preview to evaluate code and infrastructure deemed systemically important. The company reported that the program detected in excess of 10,000 vulnerabilities classified as high or critical in its opening month of operation.

Several participating companies described sharp increases in the rate of bug discovery. Cloudflare reported uncovering 2,000 bugs across systems it considers critical to its operations, of which 400 were categorized as high- or critical-severity defects. Cloudflare also said its bug-finding rate rose by more than tenfold during the testing period. Other partners reported comparable increases in detection volumes.

The UK-based AI Security Institute noted that Mythos Preview became the first model to fully solve both of the institute's cyber ranges end to end, signaling advances in the model's capability to reason through complex security challenges. In parallel, Mozilla used Mythos Preview in tests that resulted in 271 vulnerabilities being found and fixed in Firefox 150 - a count that the company said is more than ten times the number identified in Firefox 148 when tested with Claude Opus 4.6.

Anthropic provided aggregated data on open-source scanning as well. Using Mythos Preview, the company reported scanning more than 1,000 open-source projects and detecting an estimated 6,202 high- or critical-severity vulnerabilities out of 23,019 total findings. Of the 1,752 high- or critical-rated vulnerabilities that independent security research firms subsequently assessed, 90.6% were confirmed as valid and 62.4% were verified at high- or critical-severity levels.
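As a rough check on what those percentages mean in absolute terms, the short Python sketch below converts them into approximate counts. It assumes that both the 90.6% and 62.4% figures apply to the 1,752 independently assessed findings and that rounding to the nearest whole finding is acceptable; the function and constant names are illustrative and do not come from Anthropic.

```python
# Figures quoted in the article. Applying both rates to the 1,752
# assessed findings is an assumption about how they were reported.
TOTAL_FINDINGS = 23_019
HIGH_OR_CRITICAL_ESTIMATE = 6_202
ASSESSED = 1_752
VALID_RATE = 0.906
CONFIRMED_SEVERITY_RATE = 0.624


def implied_count(total: int, rate: float) -> int:
    """Approximate number of findings implied by a reported rate."""
    return round(total * rate)


def share_pct(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    print("confirmed valid:", implied_count(ASSESSED, VALID_RATE))
    print("confirmed high/critical:", implied_count(ASSESSED, CONFIRMED_SEVERITY_RATE))
    print("high/critical share of all findings (%):",
          share_pct(HIGH_OR_CRITICAL_ESTIMATE, TOTAL_FINDINGS))
    print("share of high/critical estimate assessed so far (%):",
          share_pct(ASSESSED, HIGH_OR_CRITICAL_ESTIMATE))
```

Under those assumptions, roughly 1,587 of the assessed findings were confirmed as valid and about 1,093 were confirmed at high or critical severity, while the assessed sample covers a little over a quarter of the estimated 6,202 high- or critical-severity findings.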

Among individual discoveries, Anthropic identified a vulnerability in wolfSSL, a widely used cryptography library. That issue would have allowed an adversary to forge certificates and host fraudulent websites impersonating banks or email providers. The vulnerability was tracked as CVE-2026-5194 and has been patched.

Anthropic said it has disclosed 530 high- or critical-severity bugs to project maintainers to date. Of those, 75 have been patched and 65 have been accompanied by public advisories. Anthropic indicated that a high- or critical-severity bug found by Mythos Preview takes about two weeks on average to patch, underscoring that the principal constraint on the program's throughput is not discovery but the downstream steps of verification, disclosure, and remediation.
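The gap between disclosure and remediation can be made concrete with a small calculation. The sketch below uses only the three counts quoted above. It does not assume the 65 advisories are a subset of the 75 patched bugs, since the article does not say, and the names are illustrative.

```python
# Counts quoted in the article for high- or critical-severity bugs.
DISCLOSED = 530
PATCHED = 75
WITH_ADVISORY = 65


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Disclosed bugs that have not yet been patched.
awaiting_patch = DISCLOSED - PATCHED

if __name__ == "__main__":
    print("patched (% of disclosed):", pct(PATCHED, DISCLOSED))
    print("with public advisory (% of disclosed):", pct(WITH_ADVISORY, DISCLOSED))
    print("disclosed but not yet patched:", awaiting_patch)
```

On these figures, about 14.2% of disclosed bugs have been patched and 455 remain open, which is consistent with the article's point that remediation, not discovery, is the limiting step.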

Vendors across the enterprise security and software stack have adjusted release and remediation patterns in response. Palo Alto Networks said its most recent update included more than five times the number of patches it usually ships. Microsoft indicated that the total volume of new patches it issues will continue to trend larger. Oracle reported that it is discovering and fixing vulnerabilities across its product portfolio at multiples of its prior pace.

Anthropic also highlighted internal tools and partnerships supporting remediation. The company released Claude Security in public beta to Claude Enterprise customers three weeks ago; Anthropic says that tool, using Claude Opus 4.7, has been used to patch over 2,100 vulnerabilities. To help maintainers cope with the increased volume of incoming bug reports, Anthropic established a collaboration with the Open Source Security Foundation's Alpha-Omega project.

Anthropic has not made Mythos-class models publicly available, stating that further safeguards are needed to reduce misuse risks. The company framed the current phase of Project Glasswing as one where discovery has outstripped the capacity for validation and repair, making remediation pipelines the critical lever for reducing exposure in systemically important software.

Key takeaways:

Project Glasswing and partner scans found over 10,000 high- or critical-severity vulnerabilities in the first month, with discovery now limited by verification and patching throughput.
Cloudflare, Mozilla, Palo Alto Networks, Microsoft, and Oracle reported material increases in vulnerabilities discovered or patches released, signaling broad effects across cloud, browser, security, and enterprise software providers.
Anthropic has disclosed hundreds of high- or critical-severity bugs to maintainers and is supporting remediation through Claude Security and a partnership with the Open Source Security Foundation.

Risks and uncertainties:

Remediation bottlenecks - The initiative's pace is constrained by the time required to validate, disclose, and patch vulnerabilities, which could prolong exposure for critical infrastructure and enterprise systems.
Dependency on maintainers - Effective reduction of risk depends on maintainers and vendors applying fixes; delays or resource constraints among maintainers could leave vulnerabilities unpatched.
Scope of impact - Widespread high-severity findings in systemically important software suggest potential implications for cloud services, enterprise security, and consumer-facing applications if patches are not applied promptly.

Risks

Remediation bottlenecks: progress is constrained by verification, disclosure, and patching speed, posing continued exposure for critical infrastructure (affects cloud and enterprise software).
Maintainer dependency: successful mitigation depends on maintainers and vendors applying fixes in a timely manner, which may be limited by resources (affects open-source and enterprise vendors).
Widespread impact potential: high- or critical-severity vulnerabilities in systemically important software could affect multiple sectors if patches are delayed (affects cybersecurity, cloud services, and consumer applications).
