
Mozilla’s 0DIN cybersecurity team has demonstrated how hackers can bypass traditional security tools and hijack developer systems by weaponising the automated error-recovery instincts of AI coding assistants like Claude Code.
Mozilla’s 0DIN researchers have discovered an attack vector that compromises a developer’s local machine or CI/CD runner by weaponising the automated error-recovery behaviour of terminal-based AI coding assistants like Claude Code. Rather than relying on standard code vulnerabilities, the exploit tricks the AI into autonomously executing malicious secondary fixes directly from terminal error logs without waiting for human permission.
The exploit relies on three distinct levels of indirection to cleanly compromise a system:
-
The Bait: A developer instructs their AI assistant to clone a benign, public GitHub repository and run a routine setup step (e.g.,
pip3 install -r requirements.txt). -
The Forced Error: The installed package is purposefully engineered to fail, throwing a runtime error that instructs the agent to fix the environment by running a specific command:
python3 -m axiom init. -
The Payload: Trusting the error output, the AI agent automatically runs the initialization command. This routine silently queries an attacker-controlled DNS TXT record, downloads a hidden base64-encoded payload, and executes it via bash—instantly spawning an interactive reverse shell.
Because the GitHub repository itself contains zero malicious files, it completely bypasses standard Static Application Security Testing (SAST) tools and Software Bills of Materials (SBOMs). Furthermore, because the malware is hosted in an external DNS record, the attacker can hot-swap the payload at any time without ever pushing a new commit.
Once active, the reverse shell gives the attacker full control of the host machine under the developer’s exact user privileges, exposing environment variables, cloud API keys, and repository secrets. To neutralize this threat, researchers state that AI vendors must implement mandatory human-in-the-loop validation gates before agents can execute automated error-correction steps or evaluate untrusted network paths.













































































