```bash
pip install "openai>=1.40.0"   # quote it, or the shell treats > as a redirect
```
That's the only dependency. Doubao is OpenAI-compatible, so the standard openai package handles everything.
Sign up at aiapi-pro.com — no credit card, no phone number, $0.50 free credit. Copy the `sk-...` key from the dashboard and export it:

```bash
export NOVAI_API_KEY="sk-YOUR_KEY"
```
```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://aiapi-pro.com/v1",
    api_key=os.environ["NOVAI_API_KEY"],
)

response = client.chat.completions.create(
    model="doubao-seed-2.0-pro",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
# → "The capital of France is Paris."
```
That's it. The only differences from a stock OpenAI call are the `base_url` and the API key; everything else is the standard SDK.
If you're rendering tokens in a UI, streaming is essential — first-token latency feels instant.
```python
stream = client.chat.completions.create(
    model="doubao-seed-2.0-pro",
    messages=[{"role": "user", "content": "Write a 200-word story about a developer who finally found a cheaper LLM."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```
Chunks arrive within ~300ms of the request from a Hong Kong client.
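If you want a number rather than a feeling, you can time the first chunk yourself. The helper below is a sketch, not part of the SDK (the name `first_token_latency` is ours); it works on any iterable of chunks shaped like the SDK's stream.

```python
import time

def first_token_latency(stream):
    """Time the gap before the first non-empty delta, collecting the full text.

    Accepts any iterable of chunks shaped like the OpenAI SDK's stream
    (chunk.choices[0].delta.content); returns (seconds, full_text).
    """
    start = time.perf_counter()
    first = None
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            if first is None:
                first = time.perf_counter() - start
            parts.append(delta)
    return first, "".join(parts)
```

Call it as `latency, text = first_token_latency(stream)` with the stream from the snippet above.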
If you need to fire 100 requests in parallel, use AsyncOpenAI:
```python
import asyncio
import os

from openai import AsyncOpenAI

aclient = AsyncOpenAI(
    base_url="https://aiapi-pro.com/v1",
    api_key=os.environ["NOVAI_API_KEY"],
)

async def summarize(text: str) -> str:
    r = await aclient.chat.completions.create(
        model="doubao-seed-2.0-pro",
        messages=[{"role": "user", "content": f"Summarize in 1 sentence:\n\n{text}"}],
        max_tokens=80,
    )
    return r.choices[0].message.content

async def main():
    docs = ["The quick brown fox..."] * 50
    results = await asyncio.gather(*(summarize(d) for d in docs))
    for r in results[:3]:
        print("-", r)

asyncio.run(main())
```
50 parallel requests typically complete in 4–6 seconds. You'll hit the free-tier rate limit (10 RPM) quickly — top up to lift it.
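One way to stay under the cap client-side is to bound how many requests are in flight at once. This is a generic sketch (the name `bounded_gather` and the default limit are our choices); a semaphore bounds concurrency rather than requests per minute, but it is usually enough to smooth bursts. Pass it the same `summarize(d)` coroutines as above.

```python
import asyncio

async def bounded_gather(coros, limit=8):
    """Like asyncio.gather, but with at most `limit` coroutines in flight."""
    sem = asyncio.Semaphore(limit)

    async def run(coro):
        async with sem:
            return await coro

    return await asyncio.gather(*(run(c) for c in coros))
```

Results come back in submission order, just like plain `asyncio.gather`.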
Doubao supports the OpenAI tool-calling protocol. Same JSON schema as GPT.
```python
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather in a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="doubao-seed-2.0-pro",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
)

call = resp.choices[0].message.tool_calls[0]
print(call.function.name, call.function.arguments)
# → get_weather {"city":"Tokyo","unit":"celsius"}
```
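The snippet stops at printing the call; to finish the protocol you execute the function yourself and send the result back as a `role: "tool"` message. A minimal sketch, where the `run_tool_call` helper and the `dispatch` dict are our names, not SDK API:

```python
import json

def run_tool_call(call, dispatch):
    """Execute one tool call locally and build the follow-up message.

    `call` comes from resp.choices[0].message.tool_calls; `dispatch`
    maps tool names to plain Python callables.
    """
    args = json.loads(call.function.arguments)
    result = dispatch[call.function.name](**args)
    return {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)}
```

Append the assistant message and this dict to `messages`, then call `client.chat.completions.create` again so the model can phrase the final answer.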
| Error | What it means | Fix |
|---|---|---|
| `openai.AuthenticationError: 401` | Bad / missing API key | Check the key starts with `sk-` and has no trailing whitespace |
| `openai.NotFoundError: model not found` | Wrong model ID | Use exactly `doubao-seed-2.0-pro` (lowercase, hyphens not underscores) |
| `openai.RateLimitError: 429` | Free-tier RPM cap | Add tenacity backoff or top up to remove the cap |
| `httpx.ConnectError` | Network blocked | Verify outbound HTTPS to aiapi-pro.com is allowed; some corporate proxies need configuration |
| Empty `message.content` | Model returned a tool call instead | Check `message.tool_calls` first when you pass `tools` |
| Slow first token (>3 s) | Cold start in a low-traffic region | Use `stream=True`; fire a warm-up call after each deploy |
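The last two rows are worth guarding against in code. A small sketch (the `message_text` name is ours) that reads a response without assuming `content` is populated:

```python
def message_text(resp):
    """Return the text of a chat response, or None when the model chose a tool call."""
    msg = resp.choices[0].message
    if getattr(msg, "tool_calls", None):
        return None  # handle msg.tool_calls first
    return msg.content or ""
```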
For 429s and timeouts, wrap calls with tenacity's exponential backoff (the original `retry=lambda ...` predicate was not a valid tenacity retry condition; `retry_if_exception_type` is the supported form):

```python
from openai import APITimeoutError, RateLimitError
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

@retry(
    wait=wait_exponential(multiplier=1, min=1, max=20),
    stop=stop_after_attempt(5),
    retry=retry_if_exception_type((RateLimitError, APITimeoutError)),
)
def chat(prompt: str) -> str:
    r = client.chat.completions.create(
        model="doubao-seed-2.0-pro",
        messages=[{"role": "user", "content": prompt}],
        timeout=30,
    )
    return r.choices[0].message.content
```
$0.50 free credit. No credit card. The code above will run as soon as you paste your key.
Get a Doubao Key →

Code examples tested with openai-python 1.40.0 in May 2026. SDK upgrades are backward-compatible for the patterns shown here.