Google DeepMind has open-sourced Gemma Scope 2, a massive interpretability toolkit that lets researchers trace how AI models think.
Google DeepMind has released Gemma Scope 2, a comprehensive open-source interpretability suite designed to map and trace internal reasoning circuits across the entire Gemma 3 model family. Positioned as a microscope for large language models, the suite lets researchers inspect how decisions form inside AI systems rather than treating them as opaque black boxes.
The release enables researchers to trace internal circuits linked to hallucinations, jailbreaks, and deceptive or unsafe reasoning, supporting root-cause debugging instead of surface-level mitigations such as reinforcement learning from human feedback. Google describes the project as its most ambitious transparency effort to date.
According to the Language Model Interpretability Team at Google DeepMind, “To our knowledge, this is the largest ever open-source release of interpretability tools by an AI lab to date.”
Gemma Scope 2 is fully open source, with interpretability model weights released on Hugging Face and an interactive visualisation demo hosted on Neuronpedia. The suite covers every layer and sub-layer across all Gemma 3 models, ranging from 270M to 27B parameters.
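For researchers who want to experiment with the released weights, the snippet below is a minimal sketch of pulling a single sparse autoencoder checkpoint from Hugging Face and inspecting its parameter shapes. The repository ID and file path used here are illustrative placeholders, not the release's actual layout, which is documented in the Gemma Scope 2 collection on Hugging Face.

```python
import numpy as np
from huggingface_hub import hf_hub_download

# Placeholder repo id and file path -- check the official Gemma Scope 2
# collection on Hugging Face for the real repository names and layout.
REPO_ID = "google/gemma-scope-2-example"
FILENAME = "layer_12/width_16k/params.npz"

# Download one checkpoint file and load its arrays.
path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)
params = np.load(path)

# A sparse autoencoder checkpoint typically holds encoder/decoder weights,
# biases and, for JumpReLU variants, a per-feature threshold.
for name in params.files:
    print(name, params[name].shape)
```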
The scale is unprecedented. Google DeepMind stated: “Producing Gemma Scope 2 involved storing approximately 110 Petabytes of data, as well as training over 1 trillion total parameters.”
At its core, the suite introduces JumpReLU sparse autoencoders, replacing traditional TopK activation methods with dynamic, learnable thresholds that filter out noise while preserving high-fidelity signals. Combined with cross-layer and skip transcoders, the tools shift interpretability from single-layer snapshots to full circuit-level tracing across model layers.
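The JumpReLU idea is simple to sketch: each feature has a learned threshold, and a pre-activation is kept only if it clears that threshold, otherwise it is zeroed. The NumPy sketch below illustrates the forward pass under that formulation; the function, parameter names, and dimensions are illustrative, not the toolkit's actual API.

```python
import numpy as np

def jumprelu_sae_forward(x, W_enc, b_enc, W_dec, b_dec, threshold):
    """Forward pass of a JumpReLU sparse autoencoder (illustrative).

    x         : residual-stream activation, shape (d_model,)
    W_enc     : encoder weights, shape (d_model, d_sae)
    threshold : learned per-feature threshold, shape (d_sae,)
    Returns the sparse feature activations and the reconstruction of x.
    """
    pre_acts = x @ W_enc + b_enc
    # JumpReLU: keep a pre-activation only where it clears its learned
    # threshold; everything below the threshold is zeroed (noise filtering).
    feature_acts = np.where(pre_acts > threshold, pre_acts, 0.0)
    reconstruction = feature_acts @ W_dec + b_dec
    return feature_acts, reconstruction

# Toy example with random parameters; real SAEs are trained on model
# activations, and the dimensions here are purely illustrative.
rng = np.random.default_rng(0)
d_model, d_sae = 64, 512
x = rng.normal(size=d_model)
feats, recon = jumprelu_sae_forward(
    x,
    W_enc=rng.normal(size=(d_model, d_sae)) * 0.1,
    b_enc=np.zeros(d_sae),
    W_dec=rng.normal(size=(d_sae, d_model)) * 0.1,
    b_dec=np.zeros(d_model),
    threshold=np.full(d_sae, 0.5),
)
print(f"{int((feats > 0).sum())} of {d_sae} features active")
```

Unlike a fixed TopK rule, which always keeps the same number of features per token, the per-feature thresholds are learned during training, so how many features fire can vary with the input.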
By open-sourcing model-wide safety diagnostics, Google is positioning Gemma Scope 2 as shared public infrastructure for AI safety research. However, extreme compute and storage demands mean practical use remains largely limited to well-funded research labs and academic institutions.