google · May 7, 2026

Google: Gemini 3.1 Flash Lite

google/gemini-3.1-flash-lite
Context: 1.0M
Max Output: 65.5K
Input / 1M: $0.25
Output / 1M: $1.50

About

Gemini 3.1 Flash Lite is Google's most cost-efficient Gemini model. It is generally available and optimized for low-latency, high-volume workloads. It accepts text, image, video, audio, and PDF inputs and produces text output, with a roughly 1M-token context window and up to 64K output tokens. It targets lightweight agentic workflows, simple data extraction, and applications where responsiveness and API cost are the primary constraints. Configurable thinking levels (minimal, low, medium, high) allow fine-grained cost/performance trade-offs, and the model includes targeted improvements for instruction following and audio-input quality.

Capabilities

Context Length: 1.0M
Max Output: 65.5K
Reasoning: Yes
In: text, image, video, file, audio
Out: text
Intelligence Index: 33.5 (#50 of 133)
Coding Index: 30.1 (#67 of 118)

Reasoning & Knowledge

GPQA Diamond: 82.2%
HLE: 16.2%

Coding & Agentic

SciCode: 41.9%
Terminal-Bench Hard: 24.2%

Source: Artificial Analysis

Type: Price / 1M tokens
Input: $0.25
Output: $1.50
Cache Read: $0.025
Cache Write: $0.083333
Audio Input: $0.50
Audio Cache: $0.05
Reasoning: $1.50
Image Input: $0.25
Web Search: $0.01 / call
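
The per-token prices above can be turned into a rough per-request estimate. The following is a minimal sketch, not an official billing formula: the function name `estimate_cost` and its parameters are hypothetical, and it assumes each token category is billed independently and simply added up. The page does not say whether reasoning tokens are counted separately from output tokens or whether cached tokens are excluded from the input count, so check actual usage records before relying on it.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the pricing table above.
PRICE_PER_MILLION = {
    "input": Decimal("0.25"),
    "output": Decimal("1.50"),
    "cache_read": Decimal("0.025"),
    "audio_input": Decimal("0.50"),
    "reasoning": Decimal("1.50"),
}
WEB_SEARCH_PER_CALL = Decimal("0.01")  # USD per web search call
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens=0, output_tokens=0, cache_read_tokens=0,
                  audio_input_tokens=0, reasoning_tokens=0, web_search_calls=0):
    """Return an estimated cost in USD as a Decimal.

    Assumption: categories are additive and billed independently.
    """
    usage = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read_tokens,
        "audio_input": audio_input_tokens,
        "reasoning": reasoning_tokens,
    }
    token_cost = sum(
        PRICE_PER_MILLION[kind] * Decimal(count) / MILLION
        for kind, count in usage.items()
    )
    return token_cost + WEB_SEARCH_PER_CALL * web_search_calls
```

Under these assumptions, `estimate_cost(input_tokens=200_000, output_tokens=10_000)` evaluates to 0.065 USD. `Decimal` is used instead of floats so that small per-request amounts do not pick up rounding noise when summed over many calls.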

OpenAI-compatible · Model ID google/gemini-3.1-flash-lite

curl https://api.elliotgate.com/v1/chat/completions \
  -H "Authorization: Bearer sk-omg-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-3.1-flash-lite",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
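
The same request can be sketched in Python with only the standard library. The endpoint, model ID, headers, and body mirror the curl command above. The helper names `build_request` and `send` are hypothetical, and the key is the same placeholder, to be replaced with a real one.

```python
import json
import urllib.request

API_URL = "https://api.elliotgate.com/v1/chat/completions"
MODEL_ID = "google/gemini-3.1-flash-lite"


def build_request(api_key, prompt):
    """Build (but do not send) a chat completion request."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    reply = send(build_request("sk-omg-your-api-key", "Hello!"))
    print(json.dumps(reply, indent=2))
```

Because the endpoint is described as OpenAI-compatible, the reply text is presumably found under `choices[0].message.content`, but that response shape is an assumption and is not shown on this page. The sketch therefore prints the whole JSON body.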