
AI.cc’s 2026 infrastructure report shows open-source AI models rapidly becoming the backbone of enterprise AI, driving major cost reductions, multi-model adoption, and a shift away from single-provider dependence.
Open-source and open-weight AI models captured 38% of enterprise token volume in Q1 2026, up from 11% a year earlier, according to AI.cc’s 2026 AI API Infrastructure Report analysing 2.4 billion API calls across more than 8,000 enterprise and developer accounts globally.
The report identifies open-source pricing disruption, multi-model routing, and aggregation-scale pricing as the key forces behind a 67% year-over-year drop in enterprise AI token costs, which fell from $18.40 per million tokens in Q1 2025 to $6.07 in Q1 2026.
Models including DeepSeek V4-Flash, DeepSeek V3.2, Qwen 3.5 9B, Gemma 4, GLM-5.1, Llama 4 Maverick, and Mistral Small 4 emerged as major enterprise infrastructure layers for cost-efficient AI deployment.
“The headline finding is unambiguous: enterprise token costs fell 67% year-over-year,” the report stated.
AI.cc said multi-model deployment has “crossed from experimental to default architecture across virtually all enterprise customer segments”, with enterprises increasing average model usage from 2.1 models per account in Q1 2025 to 4.7 in Q1 2026.
The report’s “Tiered Intelligence Stack” architecture, now dominant across 64% of enterprise accounts by token volume, routes high-volume workloads to lower-cost open-source models while reserving premium proprietary models for advanced reasoning and coding tasks.
Enterprises fully implementing these architectures achieved median blended costs of $2.31 per million tokens, compared to $18.40 for frontier-only deployments — an 87.4% reduction in effective AI costs.
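The tiered routing pattern the report describes can be illustrated with a minimal sketch. The routing rules, model tiers, and open-source price points below are illustrative assumptions, not figures from the report; only the $18.40 frontier-only baseline comes from the article itself.

```python
# Illustrative sketch of a "Tiered Intelligence Stack" router:
# high-volume workloads go to cheaper open-source tiers, while
# advanced reasoning and coding stay on a premium frontier model.
# Tier names, task categories, and open-source prices are assumed
# for illustration; $18.40/M tokens is the report's frontier baseline.

COST_PER_M_TOKENS = {
    "open_source_fast": 0.30,   # small open-weight model (assumed price)
    "open_source_mid": 1.20,    # mid-size open-weight model (assumed price)
    "frontier": 18.40,          # frontier-only baseline from the report
}

def route(task_type: str) -> str:
    """Pick a tier for a workload based on task complexity."""
    if task_type in ("classification", "extraction", "summarization"):
        return "open_source_fast"
    if task_type in ("rag_answering", "translation"):
        return "open_source_mid"
    return "frontier"  # advanced reasoning and coding tasks

def blended_cost(workload: dict[str, float]) -> float:
    """Token-volume-weighted cost per million tokens for a workload mix.

    `workload` maps task type -> share of total token volume.
    """
    total = sum(workload.values())
    return sum(
        COST_PER_M_TOKENS[route(task)] * share
        for task, share in workload.items()
    ) / total

# Example mix: mostly high-volume routine tasks, some frontier work.
mix = {"classification": 0.5, "rag_answering": 0.35, "coding": 0.15}
print(f"blended: ${blended_cost(mix):.2f}/M tokens "
      f"vs frontier-only ${COST_PER_M_TOKENS['frontier']:.2f}/M")
```

With this hypothetical mix the blended rate lands well below the frontier-only price, which is the mechanism behind the report's $2.31-versus-$18.40 comparison (1 − 2.31/18.40 ≈ 87.4%); the exact saving depends entirely on the workload split and per-model pricing.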