Zhipu AI · Active

GLM-5-Turbo API

Faster variant of GLM-5.1 at a slightly lower price. Good for interactive applications.

💰 Save up to 70% vs official Zhipu AI pricing

TL;DR

Price: $0.70/1M input · $3.10/1M output

Context: 128K · max 8,192 output

Provider: Zhipu AI

Cost advantage: Cheaper than official API · No credit card

GLM-5-Turbo — cheaper than the official Zhipu AI API

Access GLM-5-Turbo through AI API Hub and pay less per token. Same OpenAI-compatible endpoint, lower cost.

INPUT / 1M tokens	$0.70
OUTPUT / 1M tokens	$3.10
CONTEXT WINDOW	128K

Technical Specifications

Provider	Zhipu AI
Model Family	GLM-5-Turbo
Release Date	2026-05
Context Window	128K
Max Output Tokens	8,192
Input Price	$0.70 / 1M tokens
Output Price	$3.10 / 1M tokens
Vision Support	No
Function Calling	Yes ✓
JSON Mode	No
Streaming	Yes ✓
Fine Tuning	Not Available
Status	Active ✓

Overview

GLM-5-Turbo is Zhipu AI's current GLM model, released in May 2026. It is a faster variant of GLM-5.1 at a slightly lower price, and a good fit for interactive applications.

For developers, the headline numbers are a 128K context window and up to 8,192 output tokens per response, enough headroom to send long inputs without chunking them. At $0.70/1M input and $3.10/1M output, it suits high-volume pipelines where token cost dominates.
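To see what those prices mean for a concrete workload, here is a minimal sketch of the arithmetic. The request sizes and volume are hypothetical; only the two per-token prices come from this page.

```python
# Prices quoted on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.70
OUTPUT_PRICE_PER_M = 3.10


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the listed prices."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Hypothetical workload: 100,000 requests of 2,000 input and 500 output tokens each.
per_request = estimate_cost(2_000, 500)
print(f"per request: ${per_request:.5f}")
print(f"per 100,000 requests: ${per_request * 100_000:.2f}")
```

At these assumed sizes a request costs about $0.00295, or roughly $295 per 100,000 requests. Output tokens are more than four times the price of input tokens, so capping reply length is usually the quickest way to cut the bill.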

On the capability side, GLM-5-Turbo exposes 4 features: Web Search, Tool Use, Bilingual, and Streaming. Fine-tuning isn't supported, so you'll work with the model as shipped. It's text-only, so route image or audio workloads elsewhere.

The practical appeal of routing GLM-5-Turbo through AI API Hub is simplicity: one OpenAI-compatible endpoint, USDT & USDC payments, no credit card, and you're calling the API in under 30 seconds — just swap your base URL.

What Makes GLM-5-Turbo Different

How GLM-5-Turbo is used

GLM-5-Turbo is used for general-purpose text tasks — chat, summarization, drafting, classification, and extraction. Function calling extends it to agent-style workflows where it invokes external APIs. For specialized workloads (coding, reasoning, vision), a purpose-tuned sibling may perform better.

Pricing position within Zhipu AI

GLM-5-Turbo costs $0.70/1M input, which is 69% above the lineup average. Zhipu AI's input prices range from $0.00 (cheapest) to $0.85 (most expensive). Two siblings cost less and one costs more. This positioning makes it a sensible default when you're unsure which variant to pick.

GLM-5-Turbo's role in the lineup

The glm family has 4 active variants, and GLM-5-Turbo is the second most expensive of them. It balances cost against capability, which suits production workloads that need more than the cheapest tier but not the top-priced model.

Real-world use cases

Typical deployments include customer support chatbots, content drafting and summarization, classification pipelines, and extraction workflows. GLM-5-Turbo handles the standard text-in/text-out case reliably. Route specialized tasks (vision, coding, reasoning) to purpose-tuned siblings.
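Because JSON Mode is listed as not available, an extraction workflow cannot rely on the model returning bare JSON. One defensive approach, sketched below under that assumption, is to ask for a JSON object in the prompt and then parse whatever comes back leniently. The helper name `parse_json_reply` and the sample reply are illustrative, not part of any SDK.

```python
import json
from typing import Optional


def parse_json_reply(reply: str) -> Optional[dict]:
    """Pull a JSON object out of a model reply, or return None.

    The model may wrap the object in prose or a code fence, so only the
    span from the first "{" to the last "}" is parsed.
    """
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(reply[start : end + 1])
    except json.JSONDecodeError:
        return None


# Hypothetical reply text, as it might come back from an extraction prompt.
sample = 'Here is the result: {"name": "Ada", "age": 36}'
print(parse_json_reply(sample))
```

When the helper returns None, a pipeline would typically retry the request or send the item to a manual review queue rather than guess at the fields.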

vs sibling models

Compared with its siblings on input price: GLM-5.1 costs $0.15/1M more, GLM-4.5-Air costs $0.59/1M less, and GLM-4.7 Flash costs $0.70/1M less. All three share the same 128K context window. Choose GLM-5-Turbo when a balanced cost-to-capability ratio fits your workload.
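The sibling price differences can be checked against the "69% above the lineup average" figure quoted earlier. The sketch below assumes the lineup is exactly these four models, and that each sibling's input price is GLM-5-Turbo's $0.70 plus or minus the stated difference.

```python
# Input prices in USD per 1M tokens, derived from the differences quoted above.
glm_5_turbo = 0.70
lineup = {
    "GLM-5.1": glm_5_turbo + 0.15,
    "GLM-5-Turbo": glm_5_turbo,
    "GLM-4.5-Air": glm_5_turbo - 0.59,
    "GLM-4.7 Flash": glm_5_turbo - 0.70,
}

average = sum(lineup.values()) / len(lineup)
premium_pct = (glm_5_turbo / average - 1) * 100

print(f"lineup average: ${average:.3f}/1M")
print(f"GLM-5-Turbo vs average: {premium_pct:+.0f}%")
```

Under those assumptions the average comes to $0.415/1M and GLM-5-Turbo lands 69% above it, which matches the figure on this page.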

API Examples

Python

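from openai import OpenAI

# Point the official OpenAI SDK at the AI API Hub endpoint.
client = OpenAI(
    api_key="YOUR_API_KEY",  # replace with your AI API Hub key
    base_url="https://api.apiyihe.org/v1"
)

# Send a single-turn chat request to GLM-5-Turbo.
response = client.chat.completions.create(
    model="glm-5-turbo",
    messages=[
        {"role": "user", "content": "Hello"}
    ]
)

# The reply text is in the first choice.
print(response.choices[0].message.content)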

JavaScript / Node.js

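import OpenAI from "openai";

// Point the official OpenAI SDK at the AI API Hub endpoint.
const client = new OpenAI({
  apiKey: process.env.API_KEY, // your AI API Hub key
  baseURL: "https://api.apiyihe.org/v1"
});

// Send a single-turn chat request to GLM-5-Turbo.
const response = await client.chat.completions.create({
  model: "glm-5-turbo",
  messages: [
    { role: "user", content: "Hello" }
  ]
});

// The reply text is in the first choice.
console.log(response.choices[0].message.content);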

cURL

curl https://api.apiyihe.org/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "glm-5-turbo",
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'

Supported Features

Vision / Image Input	❌ Not Available
Audio / Voice Input	❌ Not Available
Function Calling	✅ Supported
JSON Mode	❌ Not Available
Streaming	✅ Supported
Fine-Tuning	❌ Not Available
Multimodal	❌ Not Available
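Streaming is listed as supported, so a reply can be shown token by token instead of after the full response arrives. The sketch below assumes the endpoint follows the OpenAI streaming format, where each chunk carries its text in `choices[0].delta.content`. `collect_stream` and `stream_reply` are illustrative helper names, and the API key is read from an `API_KEY` environment variable.

```python
import os
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Print streamed text as it arrives and return the full reply."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # some chunks may carry no choices (e.g. a usage-only chunk)
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
            parts.append(delta)
    print()
    return "".join(parts)


def stream_reply(prompt: str) -> str:
    """Send one prompt with stream=True and collect the reply."""
    from openai import OpenAI  # imported lazily so collect_stream works without the SDK

    client = OpenAI(api_key=os.environ["API_KEY"], base_url="https://api.apiyihe.org/v1")
    stream = client.chat.completions.create(
        model="glm-5-turbo",
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    return collect_stream(stream)


# Offline check with stand-in chunks shaped like the SDK's streaming objects.
fake_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])
    for text in ("Hel", "lo", None)
]
assert collect_stream(fake_chunks) == "Hello"
```

Calling `stream_reply("Hello")` with a valid key would print the reply incrementally. For an interactive application, this is what makes the response feel immediate.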

Benchmark Scores

Benchmark	Score
MMLU	Not Publicly Available
GPQA	Not Publicly Available
SWE-Bench	Not Publicly Available
HumanEval	Not Publicly Available
GSM8K	Not Publicly Available
MATH	Not Publicly Available
MMMU	Not Publicly Available

Scores are taken from official provider publications. "Not Publicly Available" marks benchmarks that have not yet been disclosed.

Pricing History

GLM-5-Turbo was released by Zhipu AI in May 2026 and is currently publicly available via AI API Hub.

Current Pricing: $0.70 per 1M input tokens · $3.10 per 1M output tokens. Pay-as-you-go with no minimum commitment.

Pricing Model: Token-based billing (pay per use). No subscription fees. No hidden costs.

💡 Zhipu AI occasionally updates pricing. AI API Hub reflects current pricing in real-time. All prices in USD. Pay with USDT or USDC — no currency conversion fees.

Compare Alternatives

Model	Provider	Price	Context
GPT-5.5	OpenAI	$5.00/1M	256K
GPT-5.4	OpenAI	$2.50/1M	256K

Comparisons: GLM-5-Turbo vs GPT-5.5 · GLM-5-Turbo vs GPT-5.4 · GLM-5-Turbo vs GPT-4.1 · GLM-5-Turbo vs GPT-4.1 Mini

Frequently Asked Questions

What is GLM-5-Turbo?

GLM-5-Turbo is Zhipu AI's current GLM model, a faster variant of GLM-5.1 at a slightly lower price that is well suited to interactive applications. It offers a 128K context window and supports Web Search, Tool Use, Bilingual, and Streaming. You can access it through AI API Hub using USDT or USDC, with no credit card required.

How much does GLM-5-Turbo cost?

GLM-5-Turbo is priced at $0.70 per 1M input tokens and $3.10 per 1M output tokens, billed pay-as-you-go with no minimum. Through AI API Hub you can start with as little as $5 and scale from there.

How does GLM-5-Turbo compare to GPT-5.5?

They're built for different jobs. GLM-5-Turbo costs $0.70/1M input with a 128K window; GPT-5.5 runs $5.00/1M input with a 256K window. GLM-5-Turbo is the more cost-effective pick and still offers the speed of a faster GLM-5.1 variant. See the full side-by-side at /compare/glm-5-turbo-vs-gpt-5.5/.

What is the GLM-5-Turbo context window?

GLM-5-Turbo has a 128K context window, capable of processing up to 128,000 tokens in a single request. Maximum output tokens: 8,192.
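A quick way to use these two limits is to work out how many tokens are left for the prompt once room for the reply is reserved. The sketch below assumes output tokens count against the same 128,000-token window, which this page does not state explicitly, so treat the result as a conservative budget. Counting the prompt's tokens is left to whichever tokenizer you use.

```python
CONTEXT_WINDOW = 128_000   # tokens, per the figure above
MAX_OUTPUT_TOKENS = 8_192  # tokens, per the figure above


def max_prompt_tokens(reserved_output: int = MAX_OUTPUT_TOKENS) -> int:
    """Tokens left for the prompt after reserving room for the reply."""
    if not 0 < reserved_output <= MAX_OUTPUT_TOKENS:
        raise ValueError("reserved_output must be between 1 and 8,192")
    return CONTEXT_WINDOW - reserved_output


def fits_in_context(prompt_tokens: int, reserved_output: int = MAX_OUTPUT_TOKENS) -> bool:
    """True if the prompt plus the reserved reply fits in the window."""
    return prompt_tokens <= max_prompt_tokens(reserved_output)


print(max_prompt_tokens())       # budget when the full 8,192 output tokens are reserved
print(fits_in_context(100_000))  # a 100,000-token prompt
print(fits_in_context(125_000))  # a 125,000-token prompt
```

With the full output allowance reserved, the prompt budget is 119,808 tokens, so the 100,000-token prompt fits and the 125,000-token one does not.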

Does GLM-5-Turbo support function calling?

Yes, GLM-5-Turbo supports function/tool calling, allowing you to define functions that the model can invoke. This enables AI agents, API integrations, and structured data extraction.
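As a rough illustration of that flow, the sketch below defines one tool, lets the model request it, runs it locally, and sends the result back for a final answer. It assumes the endpoint accepts the OpenAI chat-completions `tools` parameter unchanged. `get_weather`, `run_tool`, and `ask_with_tools` are hypothetical names, and the weather data is canned.

```python
import json
import os
from typing import Optional


def get_weather(city: str) -> dict:
    """Hypothetical local tool; a real one would call a weather service."""
    return {"city": city, "forecast": "sunny", "temp_c": 21}


# Tool schema in the OpenAI chat-completions "tools" format.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

REGISTRY = {"get_weather": get_weather}


def run_tool(name: str, arguments_json: str) -> str:
    """Run a requested tool and return its result as a JSON string."""
    if name not in REGISTRY:
        return json.dumps({"error": f"unknown tool: {name}"})
    return json.dumps(REGISTRY[name](**json.loads(arguments_json)))


def ask_with_tools(question: str) -> Optional[str]:
    """One round of tool use: request, run any tool calls, ask again."""
    from openai import OpenAI  # imported lazily so run_tool works without the SDK

    client = OpenAI(api_key=os.environ["API_KEY"], base_url="https://api.apiyihe.org/v1")
    messages = [{"role": "user", "content": question}]
    first = client.chat.completions.create(model="glm-5-turbo", messages=messages, tools=TOOLS)
    reply = first.choices[0].message
    if not reply.tool_calls:
        return reply.content  # the model answered directly
    messages.append(reply)  # keep the assistant turn that requested the tools
    for call in reply.tool_calls:
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": run_tool(call.function.name, call.function.arguments),
        })
    final = client.chat.completions.create(model="glm-5-turbo", messages=messages, tools=TOOLS)
    return final.choices[0].message.content
```

Keeping the registry separate from the API call means the tool dispatch can be tested without network access. Unknown tool names return an error payload instead of raising, so the model can recover on the next turn.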

Is GLM-5-Turbo multimodal?

No, GLM-5-Turbo is a text-only model. For multimodal use cases, consider models with vision/audio capabilities.

What are the GLM-5-Turbo API rate limits?

GLM-5-Turbo is rate-limited to 10K requests per minute (RPM). Higher-tier plans offer increased throughput. For high-volume production use, consider one of Zhipu AI's faster variants.
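If a client does hit the limit, retrying with exponential backoff is a common way to recover. The helper below is a generic sketch, not something AI API Hub provides. It assumes a rate-limited call raises an exception you can name in `retry_on`, and the delays are illustrative defaults.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_backoff(
    call: Callable[[], T],
    retry_on: tuple = (Exception,),
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, retrying with exponential backoff and jitter on failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Base delays of 0.5s, 1s, 2s, ... plus up to 100% random jitter.
            delay = base_delay * 2**attempt
            sleep(delay + random.uniform(0, delay))
    raise RuntimeError("max_attempts must be at least 1")
```

With the OpenAI Python SDK you would typically pass `retry_on=(openai.RateLimitError,)` and wrap the request in a lambda. That only helps if the endpoint signals throttling with a 429 response, which is an assumption here. The `sleep` parameter exists so the helper can be tested without real waiting.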

How do I access the GLM-5-Turbo API?

Access GLM-5-Turbo through AI API Hub: (1) Register at api.apiyihe.org/register?aff=8JZC, (2) Deposit USDT/USDC, (3) Get your API key instantly, (4) Use the OpenAI-compatible endpoint https://api.apiyihe.org/v1 with model name "glm-5-turbo". Start building in under 30 seconds.

Get GLM-5-Turbo API Access

Pay with USDT & USDC. Same model, up to 70% less.

Create an Account