Working with Mozilla to improve Firefox security
Artificial intelligence models can now independently identify high-severity vulnerabilities in complex software. As we recently documented, Claude has discovered more than 500 zero-day vulnerabilities (security flaws unknown to software maintainers) in well-tested open-source software.
In this post, we share details of a collaboration with Mozilla researchers in which Claude Opus 4.6 discovered 22 vulnerabilities in two weeks. Of these, Mozilla classified 14 as high severity: nearly a fifth of all high-severity Firefox vulnerabilities reported in 2025. In other words, AI now enables the detection of serious security vulnerabilities at a much faster pace.
As part of this partnership, Mozilla triaged a large number of reports from us, helped us understand which kinds of results warranted a bug report, and shipped fixes to hundreds of millions of users in Firefox 148.0. Their collaboration, and the technical lessons learned along the way, provide an example of how AI security researchers and maintainers can work together to address this urgent moment.
From benchmark evaluations to a security partnership
At the end of 2025, we noticed that Opus 4.5 was close to saturating CyberGym, a benchmark that tests whether LLMs can reproduce known vulnerabilities. We wanted a harder, more realistic evaluation, one with the level of technical difficulty found in a modern web browser. So we first built a database of historical Firefox Common Vulnerabilities and Exposures (CVEs) to see whether Claude could reproduce them.
We chose Firefox because it is both a large, mature codebase and one of the best-tested open-source security projects in the world. This makes it a far more challenging test of AI's ability to find new security vulnerabilities than the open-source software we had previously used to evaluate our models' security capabilities.
Our first step was to use Claude to find previously identified CVEs in older versions of the Firefox codebase. We were surprised that Opus 4.6 was able to reproduce a high percentage of these historical CVEs, given that each had originally required significant human effort to discover. But it was still unclear how much to trust this result, because at least some of these historical CVEs may have been in Claude's training data.
So we set Claude to work finding new vulnerabilities in Firefox, which by definition could not have been previously reported. We initially focused on Firefox's JavaScript engine, then expanded to other parts of the browser. The JavaScript engine was a natural starting point: it is a self-contained component of the Firefox codebase that can be tested in isolation, and it is security-critical due to its large attack surface (it processes untrusted external code whenever users browse the web).
Within twenty minutes, Claude Opus 4.6 had uncovered a use-after-free (a type of memory-safety vulnerability that could allow attackers to overwrite freed memory with malicious content) in the JavaScript engine. One of our researchers confirmed the bug in an isolated virtual machine running the latest version of Firefox, then forwarded it to two other researchers, who also confirmed it. We then filed a bug report on Mozilla's issue tracker, Bugzilla, with a description of the vulnerability and a proposed patch (written by Claude and reviewed by the reporting team).
By the time we had validated and submitted this first vulnerability, Claude had produced fifty more unique crashing inputs. While we were triaging these crashes, a Mozilla researcher contacted us. After a technical discussion about our respective processes, and after sharing some vulnerabilities we had manually verified, he encouraged us to submit all of our findings in bulk without validating each one, even though we were not certain that every crashing test case had security implications. By the end of this effort, we had scanned approximately 6,000 C++ files and submitted a total of 112 unique reports, including the high- and medium-severity vulnerabilities mentioned above. Most of the issues have been fixed in Firefox 148, and the rest will be fixed in future releases.
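Bulk triage of crashing inputs like this typically begins with deduplication, since the same underlying bug is often rediscovered many times. As a minimal sketch (illustrative only, not our actual pipeline), one common approach is to bucket crashes by a hash of the top stack frames:

```python
import hashlib

def crash_signature(stderr: str, frames: int = 3) -> str:
    """Hash the top stack frames so semantically identical crashes
    collapse into one bucket. Frame parsing here is a simplified
    stand-in for a real symbolized-backtrace parser."""
    frame_lines = [l for l in stderr.splitlines() if l.strip().startswith("#")]
    top = "\n".join(frame_lines[:frames])
    return hashlib.sha256(top.encode()).hexdigest()[:16]

def dedupe(crash_logs: dict[str, str]) -> dict[str, str]:
    """Map each unique signature to one representative input path."""
    unique: dict[str, str] = {}
    for path, log in crash_logs.items():
        unique.setdefault(crash_signature(log), path)
    return unique
```

Two crashes whose top frames match are treated as duplicates of one bug, so reviewers only see one representative input per signature.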
When looking for these kinds of bugs in external software, we are always aware that we may have missed something about the codebase that would make a finding a false positive. We do our best to validate our results, but there is always room for error. Mozilla triaged and addressed our reports (although not all of them turned out to be security-related). Mozilla researchers have since begun experimenting with Claude internally for security purposes.
From identifying vulnerabilities to writing rudimentary exploits
To measure the limits of Claude's offensive cyber capabilities, we also developed a new evaluation to determine whether Claude could exploit the bugs we had discovered. In other words, we wanted to understand whether Claude could develop the kinds of tools an attacker would use to weaponize these bugs and run malicious code.
To do this, we gave Claude access to the vulnerability reports we had sent to Mozilla and asked it to create exploits targeting each one. To prove that it had successfully exploited a vulnerability, we asked Claude to demonstrate a realistic attack. Specifically, we required that it read and write local files on the target system, just as a real adversary would.
We ran this test hundreds of times with different starting points and spent about $4,000 in API credits. Even so, Opus 4.6 managed to turn a vulnerability into a working exploit in only two cases. This tells us two things. First, Claude is much better at finding these bugs than at exploiting them. Second, identifying security vulnerabilities is currently much cheaper than exploiting them. The fact that Claude can sometimes succeed at autonomously developing rudimentary browser exploits, however, is notable in itself.
"Rudimentary" is an important caveat here. The exploits Claude wrote only work in a test environment that deliberately disables some of the security features found in modern browsers. Most importantly, this includes the sandbox, which is designed to limit the impact of exactly these kinds of vulnerabilities. Firefox's defense in depth would therefore be effective at mitigating these particular exploits. But sandbox-escape vulnerabilities are not unknown, and Claude's exploit is an essential component of an end-to-end attack. You can read more about how Claude developed one of its Firefox exploits on the Frontier Red Team blog.
What's next for AI-powered cybersecurity?
Early signs of AI-enabled exploit development underscore how important it is for defenders to speed up detection and remediation. To that end, we'd like to share some of the technical and procedural best practices we discovered while conducting this analysis.
First, as we explore "remediation agents" that use LLMs to develop and validate bug fixes, we've developed some methods that we hope will help maintainers use LLMs like Claude to triage and process security reports more quickly.1
In our experience, Claude is most useful when it can verify its own work with another tool. We call this class of tools "task verifiers": a reliable way to check whether the AI agent has actually achieved its goal. Task verifiers give the agent accurate, ground-truth feedback as it searches, allowing it to iterate its way to success.
Task verifiers helped us detect the Firefox vulnerabilities described above,2 and in separate work we found they were also useful for fixing bugs. A good patching agent should verify at least two things: that the vulnerability is actually fixed, and that the intended functionality of the program is preserved. In our work, we built tools that automatically checked whether the original bug could still be triggered after a proposed fix, and separately ran test suites to identify regressions (changes that accidentally break something else). We expect maintainers will know best how to build these verifiers for their own codebases; most importantly, giving the agent a reliable way to test both of these properties greatly improves the quality of its results.
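The two checks above can be sketched as a minimal task verifier. This is an illustrative skeleton, not our actual tooling: the three callables stand in for project-specific commands that apply the patch, replay the original crashing input, and run the test suite.

```python
from typing import Callable

def verify_patch(
    apply_patch: Callable[[], bool],       # e.g. apply a unified diff
    crash_reproduces: Callable[[], bool],  # replay the original crashing input
    tests_pass: Callable[[], bool],        # run the project's test suite
) -> bool:
    """A candidate fix passes only if the patch applies, the original
    bug no longer triggers, and no existing tests regress."""
    if not apply_patch():
        return False  # patch does not even apply cleanly
    if crash_reproduces():
        return False  # check 1 failed: the vulnerability is not fixed
    return tests_pass()  # check 2: intended functionality is preserved
```

An agent can call a verifier like this after each candidate fix, discarding patches that fail either check before a human ever sees them.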
We cannot guarantee that every agent-produced patch that passes these checks is ready to be merged. But task verifiers give us more confidence that a generated patch fixes the specific vulnerability while preserving the application's functionality, which we consider the minimum bar for a trustworthy patch. Indeed, we recommend that maintainers apply the same judgment when reviewing AI-generated patches that they would apply to any other patch from an external author.
Streamlining the bug and patch submission process: we know maintainers are stretched thin. So our approach is to give maintainers information they can use to trust and verify reports. The Firefox team identified three components of our submissions that were key to building confidence in our results:
- Minimal test cases
- Detailed proof of concept
- Candidate patches
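As a sketch, a submission bundle with these three components might be modeled as follows (the field names are illustrative, not a real Bugzilla schema):

```python
from dataclasses import dataclass

@dataclass
class VulnReport:
    """The three components of a submission that build maintainer trust.
    Field names are hypothetical, chosen only for this illustration."""
    minimal_test_case: str  # smallest input that still triggers the crash
    proof_of_concept: str   # detailed reproduction steps and crash analysis
    candidate_patch: str    # unified diff proposing a fix

    def is_complete(self) -> bool:
        """A report is only submission-ready if all three parts are present."""
        return all([self.minimal_test_case,
                    self.proof_of_concept,
                    self.candidate_patch])
```

A completeness check like this can gate bulk submissions so that maintainers never receive a report missing one of the three components.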
We strongly encourage researchers using LLM-powered vulnerability research tools to include evidence such as validation and reproduction steps when submitting reports based on those tools' output.
We have also published our Coordinated Vulnerability Disclosure (CVD) operating principles, which describe the processes we follow when working with maintainers. Our processes currently follow industry norms, but as models improve, we may need to adjust them to keep pace with new capabilities.
The urgency of the moment
Frontier language models are now world-class vulnerability researchers. In addition to the 22 CVEs we identified in Firefox, we have used Claude Opus 4.6 to discover vulnerabilities in other important software projects, such as the Linux kernel. In the coming weeks and months, we will continue to report on how we are leveraging our models and working with the open-source community to improve security.
For now, Opus 4.6 is better at identifying and fixing vulnerabilities than at exploiting them. This gives defenders an advantage. And with the recent release of Claude Code Security in a limited research preview, we are bringing vulnerability discovery (and patching) capabilities directly to open-source maintainers and developers.
But judging by the pace of progress, the gap between models' vulnerability-discovery and exploitation capabilities is unlikely to last long. As models cross these capability thresholds, we will need to consider additional safeguards and mitigations to prevent malicious actors from abusing our models.
We urge developers to take advantage of this window and redouble their efforts to secure their software. For our part, we plan to significantly expand our cybersecurity efforts, including working with developers to find vulnerabilities (following the CVD process outlined above), developing tools to help maintainers review bug reports, and directly offering patches.
If you're interested in supporting our security efforts, whether by building new harnesses for identifying vulnerabilities in open-source software; triaging, patching, and reporting vulnerabilities; or developing robust CVD processes for the AI era, apply for open roles here.
1. All tips shared here are based on our use of Claude, but they should apply to the LLM of your choice.
2. Patched by Mozilla independently.
