DeepReinforce Open Sources Ornith-1.0 Coding Models

DeepReinforce has open-sourced its Ornith-1.0 coding model family, releasing model weights and research on Hugging Face to help developers build, study and extend self-improving AI coding agents.

DeepReinforce has open-sourced Ornith-1.0, a family of agentic coding models in four sizes: a compact 9B Dense model, a 31B Dense model, a 35B Mixture-of-Experts (MoE) model and a flagship 397B MoE model. The model weights and technical report are available on Hugging Face, allowing developers and researchers to run, study and build on the models.

Ornith-1.0 introduces a reinforcement learning (RL) approach that enables the model to generate both coding solutions and the RL scaffolds that guide them, removing the need for manually designed training harnesses. During each RL step, the model first proposes a refined scaffold for a task before generating a solution conditioned on it. Rewards from successful rollouts improve both the scaffold and solution, allowing task-specific coding strategies to emerge automatically.
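
As a rough illustration of the two-stage step described above, the sketch below proposes a scaffold, generates a solution conditioned on it, and scores each rollout. It is not DeepReinforce's implementation: the names `Rollout`, `rl_step` and `shared_advantages` are hypothetical, the policy calls are stand-in callables, and the mean-reward baseline is an assumption chosen only to show how a single reward signal could be credited to both the scaffold and the solution.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rollout:
    scaffold: str   # strategy text the model proposed for the task
    solution: str   # code generated while conditioned on that scaffold
    reward: float   # 1.0 if the verifier accepted the solution, else 0.0


def rl_step(
    task: str,
    propose_scaffold: Callable[[str, int], str],
    generate_solution: Callable[[str, str, int], str],
    verify: Callable[[str, str], bool],
    num_rollouts: int = 4,
) -> List[Rollout]:
    """Collect rollouts for one task: scaffold first, then solution."""
    rollouts = []
    for i in range(num_rollouts):
        scaffold = propose_scaffold(task, i)            # stage 1
        solution = generate_solution(task, scaffold, i)  # stage 2
        reward = 1.0 if verify(task, solution) else 0.0
        rollouts.append(Rollout(scaffold, solution, reward))
    return rollouts


def shared_advantages(rollouts: List[Rollout]) -> List[float]:
    """Assumed baseline: reward minus the group mean.

    In this sketch the same advantage would weight the update for both
    the scaffold and the solution of a given rollout.
    """
    mean = sum(r.reward for r in rollouts) / len(rollouts)
    return [r.reward - mean for r in rollouts]
```

In a real training run the stand-in callables would be samples from the same model, and the advantages would feed a policy-gradient update; both details are outside what the announcement specifies.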

To reduce the risk of reward hacking, DeepReinforce employs a three-layer defence comprising a fixed trust boundary that isolates the execution environment, a deterministic monitor that detects attempts to modify verification mechanisms or access protected paths, and a frozen LLM judge that overrides the verifier when gaming behaviour is detected.
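
The sketch below shows one way the second and third layers could combine into a final reward. It is an assumption-laden illustration rather than the company's code: the protected path prefixes, the `Verdict` type and the `final_reward` function are hypothetical, the judge is a stand-in callable, and the first layer (the isolated execution environment) is assumed to exist outside this snippet.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Hypothetical protected locations; the real list is not given in the source.
PROTECTED_PREFIXES = ("/verifier/", "/tests/hidden/")


@dataclass(frozen=True)
class Verdict:
    reward: float
    reason: str


def monitor_flags(touched_paths: Iterable[str]) -> bool:
    """Deterministic check: did the rollout touch any protected path?"""
    return any(p.startswith(PROTECTED_PREFIXES) for p in touched_paths)


def final_reward(
    verifier_passed: bool,
    touched_paths: Iterable[str],
    judge_says_gaming: Callable[[], bool],
) -> Verdict:
    """Combine monitor, verifier and judge into one reward.

    Layer 1 (the fixed trust boundary) is assumed to be enforced by the
    sandbox itself and is not modelled here.
    """
    # Layer 2: a rule-based monitor zeroes the reward on any violation.
    if monitor_flags(touched_paths):
        return Verdict(0.0, "monitor: protected path touched")
    # Layer 3: the frozen judge can override a passing verifier.
    if verifier_passed and judge_says_gaming():
        return Verdict(0.0, "judge: gaming detected, verifier overridden")
    return Verdict(1.0 if verifier_passed else 0.0, "verifier")
```

The judge is only consulted when the verifier passes, since a failing rollout already earns no reward in this sketch.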

According to DeepReinforce, the 397B flagship achieves 77.5 on Terminal-Bench 2.1 and 82.4 on SWE-Bench Verified, matching Claude Opus 4.7 while outperforming open-source peers including MiniMax M3 and DeepSeek-V4-Pro. The 35B model reportedly surpasses similarly sized Qwen and Gemma models, while the 9B Dense variant delivers competitive coding performance on resource-constrained hardware. Built on pretrained Gemma 4 and Qwen 3.5, Ornith-1.0 extends DeepReinforce’s open RL research following CUDA-L1 and the IterX optimisation loop for code agents.
