
Bugcrowd has launched reinforcement learning environments built entirely on authentic open-source vulnerabilities, enabling frontier AI teams to train cybersecurity models on real software flaws instead of synthetic simulations.
Bugcrowd has launched Reinforcement Learning (RL) Environments, a product designed to help AI developers train models that can find, exploit, and fix vulnerabilities in real open-source software.
Built using technology acquired through Mayhem Security, the platform includes hundreds of thousands of training environments derived exclusively from open-source vulnerabilities, complete with real source code and verifiable outcomes. Bugcrowd said no customer data or security researcher data is used at any stage of training.
The launch addresses a major limitation in current AI cybersecurity training, where many systems rely on synthetic datasets that fail to accurately reflect real software vulnerabilities. According to the company, models that perform well in controlled testing environments often struggle against actual software flaws.
The RL Environments allow AI agents to interact directly with vulnerable software through bug discovery, exploitation, and remediation tasks while receiving immediate scored feedback through reinforcement learning cycles.
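To make the feedback loop concrete, here is a minimal Python sketch of how a discovery task and a remediation task might each return a score. It is an illustration under stated assumptions, not Bugcrowd's implementation: the names `ToyVulnEnv`, `vulnerable_parse`, and `StepResult` are hypothetical, the target is a toy function rather than real open-source code, and the exploitation task is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable

Parser = Callable[[bytes], int]


def vulnerable_parse(data: bytes) -> int:
    """Toy target (hypothetical): copies input into 8 slots with no bounds check."""
    buf = [0] * 8
    for i, byte in enumerate(data):
        buf[i] = byte  # raises IndexError once len(data) > 8
    return sum(buf)


@dataclass
class StepResult:
    reward: float  # scored feedback returned to the agent
    info: str      # short human-readable reason for the score


class ToyVulnEnv:
    """Hypothetical environment: every score comes from running the code."""

    BENIGN = b"\x01\x02\x03"   # input the target handles correctly
    CRASHING = b"A" * 16       # input known to trigger the flaw

    def __init__(self, target: Parser = vulnerable_parse):
        self.target = target

    def discover(self, candidate: bytes) -> StepResult:
        # Reward the agent only if its input actually crashes the target.
        try:
            self.target(candidate)
        except IndexError:
            return StepResult(1.0, "crash reproduced")
        return StepResult(0.0, "no crash")

    def remediate(self, patched: Parser) -> StepResult:
        # Reward a patch that survives the crashing input and
        # still matches the original behavior on benign input.
        try:
            patched(self.CRASHING)
            same = patched(self.BENIGN) == self.target(self.BENIGN)
        except Exception:
            return StepResult(0.0, "patch still crashes")
        if not same:
            return StepResult(0.0, "patch changed benign behavior")
        return StepResult(1.0, "patch verified")


env = ToyVulnEnv()
print(env.discover(b"short"))                     # too short to overflow
print(env.discover(b"A" * 9))                     # one byte past the buffer
print(env.remediate(lambda d: sum(d[:8])))        # truncating patch
```

In this sketch the reward is never a judgment call: it is computed by executing the target or the patch, which is the sense in which an outcome can be called verifiable.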
Bugcrowd said the infrastructure shortens development timelines for frontier AI teams, letting them begin training within weeks instead of spending years building enterprise-grade environments internally. According to the company, leading LLM providers are already using the product to develop more security-capable AI models.
“The gap between what AI agents are trained on and what they encounter in the real world is where security breaks down,” said Dave Gerry, CEO of Bugcrowd. “Our RL Environments give frontier teams the infrastructure to build AI that learns security from real vulnerabilities, not approximations of them.”