Sarvam AI Beats ChatGPT & Gemini: India’s Sarvam AI Guide 2026

February 10, 2026
Sarvam AI

Sarvam AI Beats ChatGPT & Gemini: How India’s Sarvam AI Is Redefining Global Benchmarks


In February 2026, something unprecedented happened in artificial intelligence. A little-known Bengaluru-based startup published benchmark results showing that its vision model outperformed Google Gemini and OpenAI’s ChatGPT on document-reading tasks. That startup is Sarvam AI, and it’s redefining what “made-in-India” artificial intelligence means globally.

This isn’t hype. This is a strategic inflection point in India’s technological independence.


What is Sarvam AI? The Sovereign AI Platform Outperforming Global Giants

Sarvam AI is a Bengaluru-based AI startup founded in 2023 by Pratyush Kumar and Vivek Raghavan with a singular mission: build foundational AI systems designed for India, rather than adapted to it after the fact.

The company operates on a radical principle. Rather than retrofitting global AI models like ChatGPT or Gemini to handle Indian languages, Sarvam AI engineered proprietary vision-language and speech models from the ground up, specifically optimized for India’s unique challenges.

The result? Sarvam AI now outperforms both Google and OpenAI on critical benchmarks.

The Benchmark Victory That Shocked the AI World

On February 9, 2026, co-founder Pratyush Kumar announced Sarvam Vision’s benchmark results:

olmOCR-Bench: Sarvam Vision achieved 84.3% accuracy, surpassing:

  • Google Gemini 3 Pro
  • DeepSeek OCR v2
  • OpenAI’s ChatGPT

OmniDocBench v1.5: Sarvam Vision scored 93.28% accuracy, among the highest documented scores for document understanding globally.

For context, Optical Character Recognition (OCR) is one of the most demanding AI tasks: reading text from images with varied quality, multiple languages, degraded scans, and complex layouts. That an Indian startup founded in 2023 outperformed trillion-dollar companies on this task was groundbreaking.
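To make "accuracy" on an OCR task more concrete, here is a minimal sketch of one widely used way to compare OCR output against a ground-truth transcription: character-level accuracy derived from edit distance. This is an illustration only. The article does not say how olmOCR-Bench or OmniDocBench compute their scores, and the function names below are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - (edit distance / reference length), clamped to [0, 1].

    Note: Python strings are sequences of code points, so in Indic scripts a
    consonant and its vowel sign count as separate characters here.
    """
    if not reference:
        return 1.0 if not hypothesis else 0.0
    error_rate = levenshtein(reference, hypothesis) / len(reference)
    return max(0.0, 1.0 - error_rate)


if __name__ == "__main__":
    # One wrong character out of four.
    print(character_accuracy("abcd", "abxd"))
```

A production evaluation would normally also normalize whitespace and Unicode forms before comparing, which this sketch leaves out.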

Tech commentator Deedy Das, who was initially skeptical of Sarvam’s approach, publicly reversed his position: “I was wrong about Sarvam. They have the best text-to-speech, speech-to-text, and OCR models for regional languages, and that’s actually really valuable.”


Meet the Founders: Pratyush Kumar & Vivek Raghavan

Understanding Sarvam AI requires understanding its founders. They’re not Silicon Valley transplants chasing trends. They’re India-first technologists with global credentials and local conviction.

Pratyush Kumar: The Architect of India-First AI

Pratyush Kumar is CEO and co-founder of Sarvam AI. His credentials are formidable:

  • PhD from ETH Zurich (Swiss Federal Institute of Technology)
  • B.Tech from IIT Bombay (India’s premier technology institute)
  • Research experience at Microsoft Research and IBM Research
  • Adjunct faculty at IIT Madras

But his most important credential is this: Kumar co-founded AI4Bharat and PadhAI, platforms dedicated to advancing language AI and affordable education across India.

Kumar’s Vision: Global AI models are structurally misaligned with India’s reality. They fail because:

  • They weren’t trained on Indian document formats
  • They don’t understand the chaos of degraded government forms
  • They can’t parse handwritten text in regional scripts
  • They dismiss India’s 22 official languages as “niche markets”

Under Kumar’s technical leadership, Sarvam Vision was engineered to solve these exact problems. A 3-billion-parameter state-space vision-language model, it excels at:

  • Image captioning in regional languages across South Asia
  • Scene text recognition across multiple writing systems (Tamil, Telugu, Kannada, Bengali, and others)
  • Chart and table interpretation from degraded, water-stained, and low-quality scans
  • Complex document parsing combining mixed languages in single documents

Kumar’s public disclosure of benchmark results was bold. Publishing 84.3% accuracy directly against Gemini and ChatGPT invited industry scrutiny. But the results held. This transformed Sarvam from an experimental lab into a credible foundational AI builder on the global stage.

Vivek Raghavan: The Strategist Bridging Research to Real Impact

If Kumar is the technical architect, Vivek Raghavan is the systems thinker translating breakthrough research into national-scale utility.

Raghavan brings deep expertise in:

  • AI systems architecture
  • Data infrastructure
  • Public sector technology
  • Enterprise deployment

His influence at Sarvam AI is visible in a relentless focus on usability, pricing, and deployment readiness. While many AI startups chase abstract benchmarks for academic papers, Sarvam prioritizes the places where AI actually meets citizens: banks, government offices, courts, schools, and enterprises.

This philosophy manifests in products like Bulbul V3, Sarvam’s text-to-speech model. It’s not built for demos. It’s built for adoption. With 35 voices across 22 regional languages, Bulbul V3 works for:

  • Voice-enabled government services
  • Regional banking applications
  • Educational technology platforms in regional languages
  • Accessibility tools for people with visual impairments

The Shared Conviction: AI is National Infrastructure

What binds Kumar and Raghavan is a core belief: AI systems are becoming national infrastructure. Language models, vision systems, and speech engines will shape how governments operate, how citizens access services, and how entire economies scale.

If India doesn’t control these foundational components, India outsources its digital future. Sarvam AI exists to prevent that.


Sarvam AI’s Products: Sovereign Technology for India

1. Sarvam Vision: The OCR That Beats Global Giants

What it does: Sarvam Vision is a vision-language model capable of understanding and interpreting images with text, charts, tables, and handwritten content.

Key capabilities:

  • Multilingual OCR: Recognizes text in all 22 official Indian languages
  • Degraded scan handling: Works with faded documents, water-stained papers, low-quality scans
  • Script diversity: Handles Tamil, Telugu, Kannada, Bengali, Odia, Punjabi (Gurmukhi), Gujarati, and Marathi (Devanagari) scripts, along with the writing systems of the other supported languages
  • Complex layouts: Parses government forms, bank statements, legal documents, court filings
  • Chart and table interpretation: Understands visual data representations
  • Historical document processing: Works with old, poorly preserved documents

Why it matters: Global OCR models like those in Google Lens or Azure Computer Vision were trained predominantly on English and European languages. Indian documents are fundamentally different—they feature water damage, handwriting, mixed scripts, and unusual layouts that global models struggle with.
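Mixed-script content is easy to illustrate in code. The sketch below counts which scripts appear in a piece of text by reading the first word of each character's Unicode name (for example "TAMIL LETTER TA"). It is a rough heuristic for illustration, not a description of how Sarvam Vision works, and the helper names are hypothetical.

```python
import unicodedata
from collections import Counter


def script_of(ch: str) -> str:
    """Approximate a character's script from the first word of its Unicode name."""
    # Keep letters plus combining marks (vowel signs, viramas); skip digits,
    # punctuation, and whitespace, which are shared across scripts.
    if not ch.isalpha() and unicodedata.category(ch) not in ("Mn", "Mc"):
        return "OTHER"
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "OTHER"


def script_counts(text: str) -> Counter:
    """Count code points per script, ignoring script-neutral characters."""
    counts = Counter(script_of(ch) for ch in text)
    counts.pop("OTHER", None)
    return counts


if __name__ == "__main__":
    # A form field label in English followed by a value in Hindi.
    print(script_counts("Name: राम"))
```

A single form field that mixes Latin and Devanagari, as in the usage example, is the kind of input that a pipeline tuned to one script at a time tends to handle poorly.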

Sarvam Vision achieved 84.3% accuracy on olmOCR-Bench—outperforming Gemini—because it was trained on Indian data from the start.

Real-world applications:

  • Banking: Digitizing customer KYC documents, loan applications, check deposits
  • Government: Digitizing land records, court documents, voter registrations
  • Insurance: Processing claim documents, policy records
  • Education: Digitizing school records, exam papers, historical archives
  • Legal: Converting court filings, contracts, property deed documents to searchable formats

Pricing: Affordable per-page pricing with a clean, intuitive API. According to Deedy Das, Sarvam’s pricing is “very reasonable” compared to enterprise solutions.

2. Bulbul V3: Text-to-Speech for Regional Languages

What it does: Bulbul V3 converts text to natural-sounding speech in regional languages across South Asia.

Key specifications:

  • 35 voices across 22 regional languages
  • Linguistic authenticity: Voices range from historical dialects to contemporary language variations
  • Multiple quality tiers: Options for different use cases and bandwidth requirements
  • Natural prosody: Understands language-specific stress patterns, intonation, and rhythm

Why it matters: Global TTS (text-to-speech) engines like Google’s or Microsoft’s are optimized for English, Spanish, and Mandarin. Regional language TTS is deprioritized because the market isn’t profitable by Silicon Valley standards. Yet 900+ million people in South Asia primarily speak regional languages.

Sarvam’s Bulbul V3 fills this gap. It’s not a nice-to-have feature. It’s essential infrastructure for accessibility and inclusion.

Real-world applications:

  • Voice-enabled government services: Citizens can access government portals by voice in their own regional languages
  • Banking apps: Regional language support for financial services
  • Educational technology platforms: Learning content in students’ native languages
  • Accessibility: Visually impaired users accessing digital services
  • Customer service: Banks and insurance companies serving customers in regional languages

Deedy Das’s assessment: “The pricing is very reasonable. The website is not only beautifully designed but very easy to use.”


Why Sarvam AI Matters: The Sovereign AI Inflection

Sarvam AI’s rise signals something larger than one startup’s success. It represents India’s transition from AI consumer to AI creator.

The Geopolitical Significance

When foundational AI models are controlled by US and Chinese companies, those companies also control:

  • How information is processed
  • What languages and cultures are prioritized
  • Which problems are deemed important
  • How governments and enterprises depend on foreign technology

Sovereign AI means India building, controlling, and evolving its own foundational AI systems. This isn’t isolationism—it’s agency.

Sarvam AI demonstrates that India can:

  1. Compete globally on technical merit
  2. Solve locally with deep market understanding
  3. Scale independently without relying on OpenAI or Google APIs
  4. Control critical infrastructure (language, vision, speech) that affects billions

The Strategic Advantage

Sarvam AI’s approach reveals a counterintuitive insight: By solving India’s hardest AI problems, you build globally competitive models.

Why? Because:

  • Regional languages are structurally complex: Handling Tamil or Bengali’s linguistic richness teaches AI systems robust language processing.
  • Indian documents are challenging: Water-stained government forms, handwritten ledgers, and mixed-script content are harder than pristine English PDFs. Solving for this generalizes to any difficult OCR problem globally.
  • India’s scale is massive: Training on data drawn from a population of more than 1.4 billion people exposes models to far more variety, which makes them more robust and more widely applicable.

Sarvam Vision outperforms ChatGPT on OCR not because Sarvam is building for 300 million Americans. It’s because Sarvam is building for 1.4 billion Indians with vastly harder technical problems.


Sarvam AI’s Impact: Who Benefits?

Government Agencies

Digitizing legacy records—land deeds, court documents, voter registrations—requires OCR that understands regional languages and degraded scans. Sarvam Vision is purpose-built for this task.

Banks & Financial Institutions

Processing KYC documents, loan applications, and checks requires multilingual OCR and regional language support. HDFC, ICICI, and smaller regional banks are already evaluating Sarvam’s solutions.

Insurance Companies

Claim processing, policy document extraction, and fraud detection require fast, accurate OCR. Sarvam AI reduces manual processing time and operational costs significantly.

Education & EdTech

Digitizing school records and creating regional language learning content requires both OCR (for historical materials) and TTS (for accessible learning). Sarvam’s technology stack handles both effectively.

Legal

Court documents, contracts, and regulatory filings often feature regional languages and unusual formatting. Sarvam Vision’s ability to parse complex layouts is transformative.

Healthcare & Diagnostics

Patient records, medical reports, and prescriptions often feature handwriting and regional text. Sarvam AI enables digital health infrastructure.


The Technology: What Makes Sarvam AI Different?

State-Space Models vs. Transformers

Sarvam Vision uses a 3-billion-parameter state-space vision-language model, not a standard transformer architecture. This matters because:

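  • State-space models are more efficient: their compute grows linearly with input length, whereas standard self-attention grows quadratically
  • They handle long sequences better, because they carry a fixed-size state forward instead of comparing every token with every other token
  • At 3 billion parameters, the model is small enough to deploy quickly
  • They require less computational overhead at inference time, since there is no attention cache that grows with the input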

For a sovereign AI company serving India’s government and enterprises, efficiency and cost matter. Sarvam’s architecture choice reflects this pragmatism.
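The efficiency argument is easier to see in code. Below is a minimal sketch of the generic discrete linear state-space recurrence, using a diagonal state matrix and plain Python lists. The article gives no details of Sarvam Vision's internals, so this is the textbook form only, and the function name and parameters are hypothetical. Production state-space models learn these parameters, often make them depend on the input, and use parallel scans rather than a Python loop.

```python
from typing import List, Sequence


def ssm_scan(a: Sequence[float], b: Sequence[float], c: Sequence[float],
             inputs: Sequence[float]) -> List[float]:
    """Run a diagonal linear state-space recurrence over a scalar input sequence.

    h_t = a * h_{t-1} + b * x_t   (elementwise, one entry per state dimension)
    y_t = sum(c * h_t)
    """
    state = [0.0] * len(a)  # fixed-size memory, independent of sequence length
    outputs = []
    for x in inputs:        # single pass: work grows linearly with len(inputs)
        state = [a_i * h_i + b_i * x for a_i, h_i, b_i in zip(a, state, b)]
        outputs.append(sum(c_i * h_i for c_i, h_i in zip(c, state)))
    return outputs


if __name__ == "__main__":
    # One state dimension with decay 0.5: an impulse fades by half each step.
    print(ssm_scan([0.5], [1.0], [1.0], [1.0, 0.0, 0.0]))
```

The loop touches each input once and keeps only `state` between steps, which is the source of the linear-time, constant-memory behavior claimed for this model family.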

Training Data: India-First

Sarvam Vision was trained on:

  • Real Indian government documents
  • Banking and financial documents from Indian institutions
  • Court documents from Indian courts
  • Educational records from Indian schools
  • Regional language text and handwriting samples

This India-first training approach is why Sarvam outperforms models trained on primarily English data.

Continuous Improvement

Sarvam AI isn’t a static model. The company continuously:

  • Collects user feedback
  • Improves performance on emerging document types
  • Adds support for edge cases
  • Optimizes for new languages and scripts

This feedback loop ensures Sarvam’s models improve as they’re deployed, unlike closed-source models like ChatGPT that update infrequently.


Sarvam AI’s Business Model: Sustainable & Scalable

Sarvam AI operates on a SaaS API model with:

  1. Per-page pricing for vision services (OCR, document parsing)
  2. Per-character pricing for speech services (TTS, STT)
  3. Volume discounts for enterprises
  4. On-premise deployment options for sensitive government and financial data
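As a rough illustration of how usage-based billing of this shape adds up, here is a small estimator. Every number and field name in it is a placeholder: the article publishes no rates, and the discount rule (a flat percentage once monthly spend crosses a threshold) is an assumption, not Sarvam's actual policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateCard:
    """Hypothetical rate card; all values are placeholders, not real prices."""
    per_page: float            # price per OCR page
    per_character: float       # price per TTS/STT character
    discount_threshold: float  # monthly spend at which the discount kicks in
    discount_rate: float       # fraction taken off the total, e.g. 0.10


def monthly_cost(rates: RateCard, pages: int, characters: int) -> float:
    """Estimate one month's bill from page and character volumes."""
    if pages < 0 or characters < 0:
        raise ValueError("usage volumes must be non-negative")
    gross = pages * rates.per_page + characters * rates.per_character
    if gross >= rates.discount_threshold:
        gross *= 1.0 - rates.discount_rate  # assumed flat volume discount
    return round(gross, 2)


if __name__ == "__main__":
    example = RateCard(per_page=1.0, per_character=0.001,
                       discount_threshold=1000.0, discount_rate=0.10)
    print(monthly_cost(example, pages=500, characters=100_000))
```

Keeping the rates in a separate object makes it straightforward to swap in real figures from a published price list when comparing providers.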

This pricing model is sustainable because:

  • Indian enterprises can afford it (unlike $50/1M tokens for ChatGPT)
  • Government agencies have budget for per-use pricing
  • Banks and insurance companies see ROI quickly
  • Startups can build on Sarvam’s APIs affordably

The Competitive Advantage: Why Sarvam Wins

| Aspect | Sarvam AI | ChatGPT | Gemini |
|---|---|---|---|
| Regional Language OCR | 84.3% (olmOCR-Bench) | Lower on regional scripts | Lower on regional scripts |
| Regional Languages Supported | 22 official languages native | 100+ but lower quality for regional | 100+ but deprioritized regional |
| TTS for Regional Languages | 35 voices, authentic | Limited regional voices | Limited regional voices |
| Cost for India | 10-20x cheaper | Enterprise pricing | Enterprise pricing |
| Sovereignty | Built in India | US-controlled | US-controlled |
| Data Privacy | On-premise option | Cloud-only | Cloud-only |

Recognition & Validation

Tech Industry Validation:

  • Deedy Das (respected tech commentator): “I was wrong about Sarvam. The work is impressive.”
  • Industry observers: Recognition across NDTV, Times of India, Forbes India
  • Global media: Coverage in international technology publications

Benchmark Validation:

  • olmOCR-Bench: 84.3% (beat Gemini 3 Pro, DeepSeek OCR v2, ChatGPT)
  • OmniDocBench v1.5: 93.28% (among highest globally)

Government & Enterprise Interest:

Multiple Indian banks, government agencies, and enterprises are in pilot and adoption phases.


The Future: Where Sarvam AI is Headed

Based on public statements and roadmap signals, Sarvam AI is expanding into:

  1. Larger Language Models: Building India-native LLMs rivaling ChatGPT’s reasoning capability
  2. Multimodal reasoning: Combining vision, language, and speech for complex decision-making
  3. Domain-specific models: Specialized models for healthcare, legal, financial services
  4. Enterprise deployment: On-premise solutions for governments and large institutions
  5. Global expansion: Offering Sarvam’s regional language models globally for diaspora and international users

Why Sarvam AI Matters to You

For Founders: Sarvam AI proves that solving India’s problems creates globally competitive technology. You don’t need Silicon Valley’s approval to build something world-class.

For Investors: Sovereign AI is a multi-billion-dollar category. Sarvam AI is a first-mover in India’s AI independence.

For Enterprises: Sarvam AI offers cost-effective, sovereign alternatives to ChatGPT and Gemini, with better performance on regional language tasks.

For Developers: Sarvam’s APIs are clean, affordable, and optimized for South Asian use cases. Building on Sarvam means supporting regional AI infrastructure.


The Bigger Picture: India’s AI Independence

Sarvam AI is not alone. India is witnessing a broader wave of AI startups building sovereign, India-first alternatives to global models. This movement will define the next decade of technology in the region.

When AI is built with local context, linguistic depth, and real-world constraints in mind, it doesn’t just compete globally—it wins.

