Netflix Senior Engineer Tejas Chopra’s open-source Project Headroom is helping organisations cut AI costs by reducing redundant LLM context, reportedly saving users US$700,000 while reclaiming 200 billion tokens.
An open-source tool created by Netflix Senior Engineer Tejas Chopra is gaining traction for tackling one of enterprise AI’s fastest-growing challenges: soaring LLM usage costs.
Called Project Headroom, the software reduces AI token consumption by compressing context before it reaches a model. Since its January release, the project has reportedly saved users an estimated US$700,000 and recovered about 200 billion tokens for other workloads.
The need is significant. Chopra estimates that up to 90% of tokens sent to large language models can be redundant, increasing costs without improving results. By stripping unnecessary context while preserving important information, Headroom aims to lower spending and potentially improve model performance.
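The reported figures allow a rough back-of-the-envelope check. The sketch below derives the blended price per million tokens implied by US$700,000 saved across 200 billion tokens, then applies Chopra's upper-bound 90% redundancy estimate to a workload. The function names and the 1-billion-token monthly workload are illustrative assumptions, not figures from the project.

```python
def implied_price_per_million(savings_usd: float, tokens_saved: float) -> float:
    """Blended US$ per million tokens implied by a savings total and a token count."""
    return savings_usd / (tokens_saved / 1_000_000)


def monthly_saving(tokens_per_month: float, price_per_million: float,
                   redundant_fraction: float) -> float:
    """US$ saved per month if the redundant share of tokens is never sent."""
    return tokens_per_month / 1_000_000 * price_per_million * redundant_fraction


# Figures reported in the article: US$700,000 saved, 200 billion tokens recovered.
price = implied_price_per_million(700_000, 200_000_000_000)
print(f"Implied blended price: ${price:.2f} per million tokens")

# Hypothetical workload (assumption): 1 billion tokens a month, 90% redundant.
saving = monthly_saving(1_000_000_000, price, 0.90)
print(f"Upper-bound monthly saving: ${saving:,.0f}")
```

The implied price is a blended average across whichever models users ran, so it should be read as an order-of-magnitude figure rather than any provider's list price.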
Headroom is fully open source and currently at version 0.22. The project has already attracted around 2,000 GitHub stars and more than 120 forks, with adoption spanning multiple Netflix teams and external projects.
“A lot of our users are people who have been really burned by token costs, more than anything else,” said Chopra.
A key differentiator is that its compression is lossless and reversible. Running locally as a proxy on developer machines, the tool compresses server logs, MCP tool outputs, database outputs, file trees, documentation chunks and JSON responses. It can reportedly strip up to 90% of server-log data and around 70% of redundant JSON data from the context sent to a model, using techniques such as CacheAligner, AST compression, JSON compression, DOM compression and Compress Cache and Retrieve (CCR).
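To make the idea of reversible compression concrete, here is a minimal sketch of two generic techniques of the kind named above: storing the repeated keys of JSON records once, and caching the original payload so it can be retrieved by reference. It is an illustration under stated assumptions, not Headroom's implementation or API. Every name in it (`compact`, `expand`, `stash`, `retrieve`) is hypothetical.

```python
import hashlib
import json

# Hypothetical in-memory cache; a real tool would likely persist this elsewhere.
_cache: dict[str, str] = {}


def compact(records: list[dict]) -> dict:
    """Store shared keys once and keep only the row values."""
    keys = list(records[0]) if records else []
    if any(list(r) != keys for r in records):
        return {"raw": records}  # mixed shapes: leave the payload untouched
    return {"keys": keys, "rows": [[r[k] for k in keys] for r in records]}


def expand(packed: dict) -> list[dict]:
    """Exact inverse of compact()."""
    if "raw" in packed:
        return packed["raw"]
    return [dict(zip(packed["keys"], row)) for row in packed["rows"]]


def stash(text: str) -> str:
    """Cache the original text and return a short reference to it."""
    ref = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
    _cache[ref] = text
    return ref


def retrieve(ref: str) -> str:
    """Return the original text for a reference produced by stash()."""
    return _cache[ref]


records = [
    {"level": "INFO", "service": "api", "status": 200},
    {"level": "INFO", "service": "api", "status": 200},
    {"level": "WARN", "service": "api", "status": 503},
    {"level": "INFO", "service": "api", "status": 200},
]
packed = compact(records)
print(len(json.dumps(records)), "->", len(json.dumps(packed)), "characters")
print(expand(packed) == records)
```

Because `expand` restores the records exactly, nothing is lost by sending the compact form. The `stash` and `retrieve` pair shows the complementary pattern, where only a short reference travels with the prompt and the full payload is fetched on demand.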
Future plans include support for financial datasets, audio, image and video workloads, alongside a forthcoming open-source project called Headlight for tracking token origins across AI workflows.