Launched by Sarvam AI, Sarvam 1 LLM is Trained in English and Ten Indic Languages


On October 24, Sarvam AI, an artificial intelligence (AI) firm backed by Lightspeed, unveiled Sarvam 1, a large language model (LLM). In a post on X (formerly Twitter), the company described it as India's first indigenous multilingual LLM, trained from scratch on domestic AI infrastructure in ten Indian languages and English.

Sarvam 1 supports English and ten major Indian languages: Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, and Telugu. The model has two billion parameters and was trained on Nvidia H100 Graphics Processing Units (GPUs).

Sarvam AI uses Nvidia services and AI4Bharat's open-source technology

To optimise and deploy conversational AI agents with sub-second latency, Sarvam AI also uses a range of Nvidia products and services, including its microservices, conversational AI tools, LLM software, and inference server.

Beyond Nvidia, the model draws on AI4Bharat's open-source technology and language resources, and on Yotta's data centres for compute infrastructure. According to a blog post by the AI startup, Sarvam 1's strong performance and computational efficiency make it especially well suited to real-world uses, including deployment on edge devices.

Specifically, Sarvam 1 outperforms Gemma-2-2B and Llama-3.2-3B on several standard benchmarks, including MMLU, ARC-Challenge, and IndicGenBench, while achieving results comparable to Llama 3.1 8B, the company stated.

Functioning of the Various LLMs Launched by the Company

The AI firm introduced Open Hathi, India's first Hindi LLM, in December 2023. That model was built on Meta AI's Llama2-7B architecture, with the tokeniser extended to 48,000 tokens. Sarvam 1, by contrast, was trained on a corpus of two trillion tokens.

That corpus of two trillion tokens of synthetic Indic data was made possible by an efficient tokeniser and a custom data pipeline that can produce diverse, high-quality text while preserving factual accuracy. Sarvam claimed that its latest model matches or surpasses much larger models such as Llama 3.1 8B while being four to six times faster during inference.

In artificial intelligence, inference is the process by which a trained model makes predictions or draws conclusions from new data using the patterns it learned during training. Compared with existing Indic datasets, the company's pretraining corpus, Sarvam-2T, contains documents that are twice as long and three times higher in quality, and includes eight times as much scientific material. Sarvam-2T holds around 2 trillion Indic tokens in total. Apart from Hindi, which makes up over 20% of the data, the tokens are distributed nearly evenly among the ten supported languages.
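
To give a feel for the scale of that language split, here is a rough back-of-the-envelope sketch in Python. It is not Sarvam's published breakdown: it assumes Hindi takes exactly 20% (the article says only "over 20%"), that the total is exactly 2 trillion tokens, and that the remainder is shared equally among the other nine languages. The function name `estimate_token_split` is hypothetical.

```python
# Rough estimate of tokens per language in a 2-trillion-token corpus.
# Assumptions (not from the source): Hindi is exactly 20% and the other
# nine languages share the remainder equally.

TOTAL_TOKENS = 2_000_000_000_000   # "around 2 trillion" per the article
HINDI_SHARE_PERCENT = 20           # article says "over 20%"; 20 used as a floor

LANGUAGES = [
    "Bengali", "Gujarati", "Hindi", "Kannada", "Malayalam",
    "Marathi", "Oriya", "Punjabi", "Tamil", "Telugu",
]


def estimate_token_split(total, hindi_percent, languages):
    """Return an approximate token count per language (integer arithmetic)."""
    others = [lang for lang in languages if lang != "Hindi"]
    hindi_tokens = total * hindi_percent // 100
    per_other = (total - hindi_tokens) // len(others)
    split = {lang: per_other for lang in others}
    split["Hindi"] = hindi_tokens
    return split


split = estimate_token_split(TOTAL_TOKENS, HINDI_SHARE_PERCENT, LANGUAGES)

if __name__ == "__main__":
    for lang in sorted(split):
        print(f"{lang:<10} ~{split[lang] / 1e9:,.0f} billion tokens")
```

Under these assumptions, Hindi accounts for about 400 billion tokens and each of the other nine languages for roughly 178 billion. The real figures will differ to the extent that Hindi's share exceeds 20% and the other languages are only "nearly" evenly represented.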

