Google releases Gemini 3.1 Flash-Lite at $0.25/M input — 2.5x faster than prior generation

SHAREPOST ON X →SHARE ON LINKEDIN →

Source

↗

Blog post

Google DeepMind blog

What Happened

Google released Gemini 3.1 Flash-Lite, an efficiency-focused model delivering 2.5x faster response times and 45% faster output generation compared to earlier Gemini versions. Input pricing is $0.25 per million tokens — among the lowest of any frontier-class model currently available. The model is accessible via Google AI Studio and Vertex AI.

Why It Matters

At $0.25/M input, Gemini 3.1 Flash-Lite prices below DeepSeek's API tier and well below GPT-4o Mini's comparable pricing, intensifying a cost race at the high-volume developer tier. Speed and cost together target the workloads where developers currently use smaller, cheaper models — classification, summarization, extraction, routing — but want frontier-quality outputs. Google is subsidizing market share in the developer tier where Anthropic and OpenAI currently have stronger mind share.