Nebius has introduced Token Factory, a platform built to deploy and scale open-source AI models, directly challenging the closed cloud ecosystems of AWS, Azure, and Google Cloud.
Nebius has launched Token Factory, a new platform designed to deploy, optimise, and scale open-source and custom AI models with enterprise-grade reliability and control. The platform is positioned as a direct challenger to closed cloud ecosystems offered by AWS, Microsoft Azure, and Google Cloud Platform, alongside emerging AI infrastructure providers such as Fireworks and Baseten.
Token Factory supports more than 60 leading open-source models, including DeepSeek, OpenAI’s GPT-OSS, Meta’s Llama, Nvidia’s Nemotron, and Qwen. The platform also lets enterprises host and manage their own custom models, giving them the flexibility to avoid vendor lock-in and diversify their model portfolios.
The company says Token Factory is optimised for high performance, offering sub-second latency, autoscaling throughput, and 99.9 per cent uptime, even for workloads that exceed hundreds of millions of tokens per minute. Existing Nebius AI customers will receive an automatic upgrade to the new platform.
Nebius emerged as a “neocloud” provider in 2024 after separating from Yandex in the wake of sanctions, and currently operates data centres across the United States, Europe, and Israel. While cloud providers can increase profit margins by layering software services over infrastructure, Nebius leadership says it is prioritising customer expansion and platform depth over margins.
“Simply having infrastructure is far from enough. We want to become a large enterprise, but we do not wish to be merely a utility company,” said Roman Chernin, co-founder and Chief Business Officer of Nebius. He added that enterprises are shifting to diversified model portfolios, and that the company has built a scalable platform to help clients “seamlessly switch from whatever they started with to what they need at scale.”


