DeepSeek API Python Guide: V3 & R1 Integration

Introduction

DeepSeek V3 targets GPT-4o-class quality at roughly 90% lower per-token prices (see the comparison below). This guide covers integrating DeepSeek V3 and R1 into Python applications through AI API Hub's OpenAI-compatible endpoint.

Why DeepSeek?

| Feature | DeepSeek V3 | GPT-4o |
|---|---|---|
| Input price (USD per 1M tokens) | $0.27 | $2.50 |
| Output price (USD per 1M tokens) | $1.10 | $10.00 |
| Context (tokens) | 128K | 128K |
| Streaming | Yes | Yes |
| Code generation | Excellent | Excellent |
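
To make the price gap concrete, the short sketch below computes the relative savings from the list prices in the table. The `PRICES` dictionary and the `savings_pct` helper are illustrative names, not part of any SDK, and the numbers are only as current as the table itself.

```python
# Prices in USD per 1M tokens, copied from the comparison table.
PRICES = {
    "deepseek-v3": {"input": 0.27, "output": 1.10},
    "gpt-4o": {"input": 2.50, "output": 10.00},
}


def savings_pct(kind: str) -> float:
    """Percentage by which DeepSeek V3 undercuts GPT-4o for 'input' or 'output' tokens."""
    cheap = PRICES["deepseek-v3"][kind]
    baseline = PRICES["gpt-4o"][kind]
    return (baseline - cheap) / baseline * 100


print(f"Input savings:  {savings_pct('input'):.1f}%")   # 89.2%
print(f"Output savings: {savings_pct('output'):.1f}%")  # 89.0%
```

Both figures land just under 90%, which is where the headline number comes from.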

Quick Setup

import openai

client = openai.OpenAI(
    api_key="your-api-key",
    base_url="https://api.apiyihe.org/v1"
)

Basic Completion

response = client.chat.completions.create(
    model="deepseek-v3",
    messages=[
        {"role": "system", "content": "You are a Python expert."},
        {"role": "user", "content": "Write a function to find the longest palindrome in a string."}
    ],
    temperature=0.3,
    max_tokens=2000
)

print(response.choices[0].message.content)

Streaming with DeepSeek

stream = client.chat.completions.create(
    model="deepseek-v3",
    messages=[{"role": "user", "content": "Explain machine learning"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
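
The loop above prints tokens but discards them. If the full reply is also needed afterwards (for logging or follow-up turns), one option is to accumulate the pieces while printing. This is a minimal sketch: `collect_stream` is a hypothetical helper, and the stand-in chunks only mimic the attribute shape used in the loop above so the example runs without a network call.

```python
from types import SimpleNamespace


def collect_stream(chunks) -> str:
    """Print streamed text as it arrives and return the complete reply."""
    parts = []
    for chunk in chunks:
        # Some chunks may carry no choices at all; skip them.
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text:
            print(text, end="", flush=True)
            parts.append(text)
    print()  # finish the line once the stream ends
    return "".join(parts)


# Stand-in chunks for a dry run; with a live client, pass `stream` instead.
def _fake(text):
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


demo = [_fake("Machine "), _fake(None), SimpleNamespace(choices=[]), _fake("learning")]
full_text = collect_stream(demo)
```

With the real API, `full_text = collect_stream(stream)` replaces the bare `for` loop.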

DeepSeek R1: Chain-of-Thought Reasoning

response = client.chat.completions.create(
    model="deepseek-r1",
    messages=[{
        "role": "user",
        "content": "Solve: Stations A and B are 280 miles apart. A train leaves A at 60 mph and another leaves B at 80 mph at the same time, heading toward each other. When do they meet?"
    }],
    temperature=0.1,
    max_tokens=3000
)
# R1 outputs reasoning steps before the final answer
print(response.choices[0].message.content)
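
How the reasoning reaches your code depends on the gateway. DeepSeek's own API exposes it in a separate `reasoning_content` field on the message, while some OpenAI-compatible proxies inline it in `<think>` tags at the start of `content`. This guide does not state which form this endpoint uses, so the sketch below handles both as an assumption. `split_reasoning` is a hypothetical helper, and the demo message is a stand-in rather than real model output.

```python
import re
from types import SimpleNamespace

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(message):
    """Return (reasoning, answer) from a chat message object."""
    content = message.content or ""
    # Case 1: reasoning in a separate field (assumed name: reasoning_content).
    reasoning = getattr(message, "reasoning_content", None)
    if reasoning:
        return reasoning.strip(), content.strip()
    # Case 2: reasoning inlined in <think> tags inside the content.
    match = THINK_RE.search(content)
    if match:
        answer = THINK_RE.sub("", content, count=1)
        return match.group(1).strip(), answer.strip()
    # Neither form present: treat everything as the answer.
    return "", content.strip()


demo = SimpleNamespace(content="<think>Add the two speeds.</think>Divide the distance by 140 mph.")
reasoning, answer = split_reasoning(demo)
print(answer)
```

With a live response, call `split_reasoning(response.choices[0].message)` and show only `answer` to end users.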

Function Calling with DeepSeek V3

import json

def search_database(query: str) -> list:
    db = {"users": [{"name": "Alice"}, {"name": "Bob"}]}
    return db.get(query, [])

# `functions` / `function_call` are deprecated in the OpenAI API;
# `tools` / `tool_choice` are the current equivalents.
tools = [{
    "type": "function",
    "function": {
        "name": "search_database",
        "description": "Search the database by table name, e.g. 'users'",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"]
        }
    }
}]

response = client.chat.completions.create(
    model="deepseek-v3",
    messages=[{"role": "user", "content": "Find all users"}],
    tools=tools,
    tool_choice="auto"
)

msg = response.choices[0].message
if msg.tool_calls:
    # Arguments arrive as a JSON string and must be parsed before use
    call = msg.tool_calls[0]
    result = search_database(**json.loads(call.function.arguments))
    print(f"Function result: {result}")
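
The snippet above calls `search_database` directly, which works for a single function. Once several functions are exposed, a small registry keyed by name keeps the dispatch in one place and guards against malformed arguments. This is a sketch under assumptions: `REGISTRY` and `run_call` are hypothetical names, and the error format is arbitrary.

```python
import json


def search_database(query: str) -> list:
    db = {"users": [{"name": "Alice"}, {"name": "Bob"}]}
    return db.get(query, [])


# Map the function names advertised to the model onto Python callables.
REGISTRY = {"search_database": search_database}


def run_call(name: str, arguments: str) -> str:
    """Execute a model-requested call and return the result as a JSON string."""
    func = REGISTRY.get(name)
    if func is None:
        return json.dumps({"error": f"unknown function: {name}"})
    try:
        kwargs = json.loads(arguments)  # arguments arrive as a JSON string
        return json.dumps({"result": func(**kwargs)})
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON or unexpected argument names from the model.
        return json.dumps({"error": str(exc)})


print(run_call("search_database", '{"query": "users"}'))
# {"result": [{"name": "Alice"}, {"name": "Bob"}]}
```

Pass the name and arguments string from the model's call into `run_call`. Returning errors as JSON rather than raising lets you send the failure back to the model in a follow-up message so it can retry.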

Cost Optimization

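# DeepSeek V3 prices in USD per 1M tokens (from the comparison table)
INPUT_PRICE_PER_M = 0.27
OUTPUT_PRICE_PER_M = 1.10

class CostTracker:
    """Accumulates token usage across calls and estimates total spend."""

    def __init__(self):
        self.total_input = 0
        self.total_output = 0
        self.calls = 0

    def record(self, usage):
        # `usage` is the `response.usage` object returned with each completion
        self.total_input += usage.prompt_tokens
        self.total_output += usage.completion_tokens
        self.calls += 1

    @property
    def cost(self):
        return (self.total_input / 1_000_000 * INPUT_PRICE_PER_M +
                self.total_output / 1_000_000 * OUTPUT_PRICE_PER_M)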

tracker = CostTracker()
response = client.chat.completions.create(model="deepseek-v3", messages=[...])
tracker.record(response.usage)
print(f"Total cost: ${tracker.cost:.4f}")

Node.js Quick Start

import OpenAI from "openai";
const client = new OpenAI({
  apiKey: "your-api-key",
  baseURL: "https://api.apiyihe.org/v1",
});
const response = await client.chat.completions.create({
  model: "deepseek-v3",
  messages: [{ role: "user", content: "Hello DeepSeek!" }],
});
console.log(response.choices[0].message.content);

Common Issues

| Issue | Fix |
|---|---|
| Model returns Chinese | Add "Respond in English" to the system prompt |
| Function calling not working | Use DeepSeek V3 (R1 doesn't support functions) |
| Slow responses | Reduce max_tokens, or enable streaming to cut perceived latency |
| 429 Rate Limit | Implement exponential backoff |
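
For the 429 case, exponential backoff means waiting longer after each failed attempt: about 1 s, then 2 s, then 4 s, with some random jitter so that parallel clients do not retry in lockstep. The following is a minimal sketch, not a production retry policy. `with_backoff` and its parameters are hypothetical, and the `flaky` function stands in for an API call so the example runs offline.

```python
import random
import time


def with_backoff(call, retry_on=(Exception,), max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Run `call()`; on a retryable error wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)


# Dry run with a stand-in that fails twice, then succeeds.
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = with_backoff(flaky, retry_on=(RuntimeError,), sleep=delays.append)
print(result, len(delays))  # ok 2
```

With the live client, wrap the request in a lambda and pass `retry_on=(openai.RateLimitError,)` so that only rate-limit errors are retried. The `openai` client also accepts a `max_retries` argument at construction, which may be enough for light workloads.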

FAQ

Q: Is DeepSeek V3 as good as GPT-4o for coding? A: Broadly, yes. Published benchmarks show comparable coding performance, and on many tasks DeepSeek V3 matches or exceeds GPT-4o at a fraction of the cost. Results vary by task, so test on your own workload.

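Q: Can I switch between DeepSeek and OpenAI in the same code? A: Yes. Both use the same Chat Completions request format, so through a gateway that serves both you only change the model parameter. If you call OpenAI directly instead, also change base_url and api_key.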

Q: Does DeepSeek support image inputs? A: No. DeepSeek V3 and R1 are text-only. Use GPT-4o or Gemini for multimodal tasks.

Q: What's the maximum context length? A: 128,000 tokens for both V3 and R1.