Google · Active

Gemini 2.5 Flash Lite API

Google's most affordable Gemini model. Ideal for high-volume, cost-sensitive applications like classification and simple extraction.

💰 Save up to 70% vs official Google pricing
Input: $0.10 / 1M tokens
Output: $0.40 / 1M tokens
Context window: 1M tokens

Technical Specifications

Provider: Google
Model Family: Gemini 2.5 Flash Lite
Release Date: 2026-05
Context Window: 1M
Max Output Tokens: 8,192
Input Price: $0.10 / 1M tokens
Output Price: $0.40 / 1M tokens
Vision Support: Yes ✓
Function Calling: No
JSON Mode: No
Streaming: Yes ✓
Fine Tuning: Not Available
Status: Active ✓

Overview

Gemini 2.5 Flash Lite is Google's most affordable Gemini model, released in 2026-05. It is aimed at high-volume, cost-sensitive applications such as classification and simple extraction.

With a 1M-token context window and a maximum output of 8,192 tokens, Gemini 2.5 Flash Lite suits free-tier and ultra-low-cost workloads. At $0.10 per 1M input tokens and $0.40 per 1M output tokens, it is the lowest-priced model in Google's Gemini lineup.
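To make the per-token prices concrete, here is a minimal cost estimate in Python. It only applies the two rates quoted on this page; the workload figures (500 input and 10 output tokens per call) are hypothetical and chosen to resemble a short classification request.

```python
from decimal import Decimal

# Prices quoted on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.10")
OUTPUT_PRICE_PER_M = Decimal("0.40")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_M * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_M * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Hypothetical workload: ~500 input and ~10 output tokens per call.
per_call = estimate_cost(500, 10)
print(f"per call:     ${per_call:.6f}")
print(f"per 1M calls: ${per_call * 1_000_000:.2f}")
```

Decimal arithmetic is used so that very small per-call amounts are not distorted by binary floating-point rounding when multiplied up to large volumes.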

Gemini 2.5 Flash Lite supports two capabilities: vision (image input) and streaming. Fine-tuning is not available for this model. It serves as a free-tier option for developers building applications that combine text and image input.
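Streaming is one of the two listed capabilities, so a short sketch of consuming a streamed reply may help. It assumes the endpoint follows the OpenAI streaming format (`stream=True`, text delivered in `chunk.choices[0].delta.content`), which the page implies by calling the API OpenAI-compatible but does not spell out. The helper names `iter_text` and `stream_reply` are hypothetical, and `client` is assumed to be an `openai.OpenAI` instance configured as in the Python example in the API Examples section.

```python
from typing import Iterable, Iterator


def iter_text(chunks: Iterable) -> Iterator[str]:
    """Yield the text fragments from OpenAI-style streaming chunks."""
    for chunk in chunks:
        # Some chunks (for example a trailing usage chunk) carry no choices.
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        if piece:  # role-only and finish chunks have content=None
            yield piece


def stream_reply(client, prompt: str) -> str:
    """Stream one chat completion, echo it as it arrives, return the full text.

    `client` is assumed to be an openai.OpenAI instance pointed at the
    base URL shown in the API examples on this page.
    """
    stream = client.chat.completions.create(
        model="gemini-2-5-flash-lite",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    parts = []
    for piece in iter_text(stream):
        print(piece, end="", flush=True)
        parts.append(piece)
    print()
    return "".join(parts)
```

Separating `iter_text` from the network call keeps the chunk handling testable without an API key.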

Through AI API Hub, you can access Gemini 2.5 Flash Lite with USDT and USDC payments, with no credit card required. Access goes through a fully OpenAI-compatible API: change your base URL and start building in 30 seconds.

API Examples

Python

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.apiyihe.org/v1"
)

response = client.chat.completions.create(
    model="gemini-2-5-flash-lite",
    messages=[
        {"role": "user", "content": "Hello"}
    ]
)

print(response.choices[0].message.content)

JavaScript / Node.js

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.API_KEY,
  baseURL: "https://api.apiyihe.org/v1"
});

const response = await client.chat.completions.create({
  model: "gemini-2-5-flash-lite",
  messages: [
    { role: "user", content: "Hello" }
  ]
});

console.log(response.choices[0].message.content);

cURL

curl https://api.apiyihe.org/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gemini-2-5-flash-lite",
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'

Supported Features

Vision / Image Input: ✅ Supported
Audio / Voice Input: ❌ Not Available
Function Calling: ❌ Not Available
JSON Mode: ❌ Not Available
Streaming: ✅ Supported
Fine-Tuning: ❌ Not Available
Multimodal: ⚠️ Partial (image input only, no audio)
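Since image input is listed as supported, the following sketch shows one way to build a request with an inline image. It assumes the endpoint accepts the OpenAI-style `image_url` content part with a base64 data URL; the page does not confirm this, so treat it as an assumption to verify. The helper name `build_image_message` is hypothetical.

```python
import base64


def build_image_message(prompt: str, image_bytes: bytes,
                        mime_type: str = "image/png") -> list[dict]:
    """Build an OpenAI-style chat `messages` list with one inline image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                # Assumption: the gateway accepts OpenAI-style image parts.
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }
    ]
```

The returned list can be passed as the `messages` argument of `client.chat.completions.create(...)` with the model name `gemini-2-5-flash-lite`.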

Benchmark Scores

Benchmark: Score
MMLU: Not Publicly Available
GPQA: Not Publicly Available
SWE-Bench: Not Publicly Available
HumanEval: Not Publicly Available
GSM8K: Not Publicly Available
MATH: Not Publicly Available
MMMU: Not Publicly Available
Scores are from official provider publications. Empty fields indicate benchmarks not yet publicly disclosed.

Pricing History

Gemini 2.5 Flash Lite was released in 2026-05 by Google and is currently publicly available via AI API Hub.

Current Pricing: $0.10 per 1M input tokens · $0.40 per 1M output tokens. Pay-as-you-go with no minimum commitment.

Pricing Model: Token-based billing (pay per use). No subscription fees. No hidden costs.

💡 Google occasionally updates pricing. AI API Hub reflects current pricing in real time. All prices are in USD. Pay with USDT or USDC, with no currency conversion fees.

Frequently Asked Questions

What is Gemini 2.5 Flash Lite?

Gemini 2.5 Flash Lite is Google's most affordable Gemini model, aimed at high-volume, cost-sensitive applications such as classification and simple extraction. It features a 1M context window, supports vision and streaming, and is available through AI API Hub with USDT/USDC payments.

How much does Gemini 2.5 Flash Lite cost?

Gemini 2.5 Flash Lite pricing: $0.10 per 1M input tokens, $0.40 per 1M output tokens. Pay-as-you-go with no minimum commitment. Sign up at AI API Hub and start with as little as $5.

Gemini 2.5 Flash Lite vs GPT-5.5?

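Gemini 2.5 Flash Lite costs $0.10 per 1M input tokens and has a 1M context window. GPT-5.5 costs $5.00 per 1M input tokens and has a 256K context window. On input price, Gemini 2.5 Flash Lite is the more cost-effective of the two, and it is positioned as a free-tier model. Compare them at /compare/gemini-2-5-flash-lite-vs-gpt-5.5/.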

Gemini 2.5 Flash Lite context window?

Gemini 2.5 Flash Lite has a 1M context window, capable of processing up to 1,048,576 tokens in a single request. Maximum output tokens: 8,192.
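The two limits can be checked before sending a request. The sketch below picks a `max_tokens` value that respects both figures quoted above. It makes a conservative assumption that prompt and output tokens share the 1,048,576-token window, which the page does not state, and the function name `plan_max_tokens` is hypothetical. Counting the prompt's tokens is left to the caller.

```python
CONTEXT_WINDOW = 1_048_576  # tokens, as listed on this page
MAX_OUTPUT_TOKENS = 8_192   # tokens, as listed on this page


def plan_max_tokens(prompt_tokens: int, desired_output: int) -> int:
    """Return a max_tokens value that respects both listed limits.

    Assumes (conservatively) that prompt and output share the context
    window. Raises ValueError if the prompt leaves no room for output.
    """
    if prompt_tokens < 0 or desired_output <= 0:
        raise ValueError("prompt_tokens must be >= 0 and desired_output > 0")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt does not fit in the context window")
    return min(desired_output, MAX_OUTPUT_TOKENS, remaining)
```

For a short prompt the output cap of 8,192 is the binding limit; the shared-window check only matters when the prompt approaches the full 1M tokens.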

Does Gemini 2.5 Flash Lite support function calling?

No, Gemini 2.5 Flash Lite does not natively support function calling. For function calling use cases, consider Google's flagship models.
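Because both function calling and JSON mode are listed as unavailable, structured results for classification or extraction would have to be requested in the prompt and parsed from plain text. The following is a minimal sketch of such a parser, not a feature of the API; the function name `extract_json_object` is hypothetical, and it assumes the prompt asked the model to answer with a single JSON object.

```python
import json


def extract_json_object(text: str) -> dict:
    """Parse the first JSON object found in free-form model output."""
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object in model output")
    # raw_decode parses one JSON value and ignores any trailing text.
    # It raises json.JSONDecodeError (a ValueError) on malformed input.
    obj, _ = json.JSONDecoder().raw_decode(text[start:])
    return obj


reply = 'Sure! {"label": "spam", "confidence": 0.9} Let me know if you need more.'
print(extract_json_object(reply)["label"])
```

A caller would typically retry the request or fall back to a default label when `ValueError` is raised.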

Is Gemini 2.5 Flash Lite multimodal?

Partially. Gemini 2.5 Flash Lite supports vision (image input) but not native audio processing.

Gemini 2.5 Flash Lite API rate limits?

Gemini 2.5 Flash Lite is rate-limited to 10K requests per minute (RPM). Higher-tier plans offer increased throughput. For high-volume production use, consider Google's faster variant models.
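A client can stay under the quoted limit by pacing its own requests. The sketch below spaces calls evenly at 60 / RPM seconds apart. The class name `RequestPacer` is hypothetical, and the page does not say how the limit is enforced (per key, per account, or with bursts allowed), so even spacing is simply a conservative choice.

```python
import time


class RequestPacer:
    """Space out requests so they stay under a requests-per-minute cap."""

    def __init__(self, rpm: int, clock=time.monotonic, sleep=time.sleep):
        if rpm <= 0:
            raise ValueError("rpm must be positive")
        self._interval = 60.0 / rpm  # minimum seconds between requests
        self._clock = clock
        self._sleep = sleep
        self._next_slot = clock()

    def wait(self) -> float:
        """Block until the next request slot; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_slot - now)
        if delay:
            self._sleep(delay)
        # Reserve the following slot relative to the one just used.
        self._next_slot = max(now, self._next_slot) + self._interval
        return delay


pacer = RequestPacer(rpm=10_000)  # the limit quoted above
```

Calling `pacer.wait()` immediately before each API request keeps a single-threaded client at or below the cap; the injectable `clock` and `sleep` arguments exist only to make the class easy to test.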

How to access Gemini 2.5 Flash Lite API?

Access Gemini 2.5 Flash Lite through AI API Hub: (1) Register at api.apiyihe.org/register?aff=8JZC, (2) Deposit USDT/USDC, (3) Get your API key instantly, (4) Use the OpenAI-compatible endpoint https://api.apiyihe.org/v1 with model name "gemini-2-5-flash-lite". Start building in under 30 seconds.

Get Gemini 2.5 Flash Lite API Access

Pay with USDT & USDC. Same model, up to 70% less.

Create Account
Get API Key