Sarvam.ai Brings 20% Indian Data To Global Open Source AI

Sarvam.ai Unveils Open Source-Led Sovereign AI Push With India’s First Foundational LLM Launch

Sarvam.ai is set to launch India’s first foundational LLM early next year, built with 15–20% Indian-origin data to strengthen the nation’s open, sovereign AI ecosystem.

Sarvam.ai will launch India’s first foundational large language model (LLM) by early next year, marking a major milestone in the country’s effort to establish sovereign AI while advancing open source innovation rooted in Indian data. The India AI Mission has selected Sarvam.ai as the first startup to build the nation’s foundational AI model, signifying strategic importance and national trust in the company’s technology leadership.

The indigenously developed model will have 120 billion parameters and will be trained on over 17 trillion tokens. Crucially, between 15 and 20 percent of the training dataset originates from India, a massive leap from the less than one percent representation seen in current open source models. This high concentration of Indian data is intended to produce far more localised and culturally aligned intelligence than existing global large language models offer.
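To put those shares in absolute terms, here is a rough back-of-the-envelope sketch. It assumes the reported 17 trillion tokens as a floor (the article says "over") and applies the 15 to 20 percent range directly; the function name and constants are illustrative and do not come from Sarvam.ai.

```python
# Back-of-the-envelope estimate of Indian-origin tokens implied by the
# figures reported in the article. All names here are illustrative.

TOTAL_TOKENS = 17e12  # "over 17 trillion" tokens; treated as a lower bound


def indian_token_range(total_tokens, low_share=0.15, high_share=0.20):
    """Return the (low, high) token counts for the reported share range."""
    return total_tokens * low_share, total_tokens * high_share


low, high = indian_token_range(TOTAL_TOKENS)
print(f"{low / 1e12:.2f} to {high / 1e12:.2f} trillion tokens")
```

Under these assumptions, the Indian-origin portion works out to roughly 2.55 to 3.4 trillion tokens, compared with under 0.17 trillion if the share were below one percent of the same total.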

Vivek Raghavan, Co-founder of Sarvam.ai, stressed that foundational control over such technology is non-negotiable for India’s future. “This technology is so important that if you don’t know how the core of this technology works by starting from scratch, you risk being left behind completely,” he said.

Raghavan, who previously contributed to India’s digital stack, including Aadhaar, while working at AI4Bharat with Infosys co-founder Nandan Nilekani, believes open, sovereign AI will allow India to define its own rules of innovation.

Sarvam.ai also plans to collaborate with Indian enterprises to co-develop domain-specific models, leveraging enterprise data while ensuring data sovereignty. The launch positions India to influence the direction of global open source AI with multilingual depth, cultural nuance, and independent capability.
