```bash
pip install "openai>=1.40.0"   # quote it, or the shell treats > as a redirect
```
That's the only dependency. Doubao is OpenAI-compatible, so the standard openai package handles everything.
Sign up at aiapi-pro.com — no credit card, no phone number, $0.50 free credit. Copy the `sk-...` key from the dashboard and export it:

```bash
export NOVAI_API_KEY="sk-YOUR_KEY"
```
```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://aiapi-pro.com/v1",
    api_key=os.environ["NOVAI_API_KEY"],
)

response = client.chat.completions.create(
    model="doubao-seed-2.0-pro",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
# → "The capital of France is Paris."
```
That's it. The only differences from a stock OpenAI call are the `base_url` and the API key; everything else is the standard SDK.
If you're rendering tokens in a UI, streaming is essential — first-token latency feels instant.
```python
stream = client.chat.completions.create(
    model="doubao-seed-2.0-pro",
    messages=[{"role": "user", "content": "Write a 200-word story about a developer who finally found a cheaper LLM."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```
Chunks arrive within ~300ms of the request from a Hong Kong client.
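If you want a number rather than a feeling, you can time the first chunk yourself. The helper below is a sketch, not part of the SDK (the name `first_token_latency` is ours); it works on any iterable of chunks shaped like the SDK's stream.

```python
import time

def first_token_latency(stream):
    """Time the gap before the first non-empty delta, collecting the full text.

    Accepts any iterable of chunks shaped like the OpenAI SDK's stream
    (chunk.choices[0].delta.content); returns (seconds, full_text).
    """
    start = time.perf_counter()
    first = None
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            if first is None:
                first = time.perf_counter() - start
            parts.append(delta)
    return first, "".join(parts)
```

Call it as `latency, text = first_token_latency(stream)` with the stream from the snippet above.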
If you need to fire 100 requests in parallel, use AsyncOpenAI:
```python
import asyncio
import os

from openai import AsyncOpenAI

aclient = AsyncOpenAI(
    base_url="https://aiapi-pro.com/v1",
    api_key=os.environ["NOVAI_API_KEY"],
)

async def summarize(text: str) -> str:
    r = await aclient.chat.completions.create(
        model="doubao-seed-2.0-pro",
        messages=[{"role": "user", "content": f"Summarize in 1 sentence:\n\n{text}"}],
        max_tokens=80,
    )
    return r.choices[0].message.content

async def main():
    docs = ["The quick brown fox..."] * 50
    results = await asyncio.gather(*(summarize(d) for d in docs))
    for r in results[:3]:
        print("-", r)

asyncio.run(main())
```
50 parallel requests typically complete in 4–6 seconds. You'll hit the free-tier rate limit (10 RPM) quickly — top up to lift it.
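One way to stay under the cap client-side is to bound how many requests are in flight at once. This is a generic sketch (the name `bounded_gather` and the default limit are our choices); a semaphore bounds concurrency rather than requests per minute, but it is usually enough to smooth bursts. Pass it the same `summarize(d)` coroutines as above.

```python
import asyncio

async def bounded_gather(coros, limit=8):
    """Like asyncio.gather, but with at most `limit` coroutines in flight."""
    sem = asyncio.Semaphore(limit)

    async def run(coro):
        async with sem:
            return await coro

    return await asyncio.gather(*(run(c) for c in coros))
```

Results come back in submission order, just like plain `asyncio.gather`.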
Doubao supports the OpenAI tool-calling protocol. Same JSON schema as GPT.
```python
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather in a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="doubao-seed-2.0-pro",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
)

call = resp.choices[0].message.tool_calls[0]
print(call.function.name, call.function.arguments)
# → get_weather {"city":"Tokyo","unit":"celsius"}
```
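The snippet stops at printing the call; to finish the protocol you execute the function yourself and send the result back as a `role: "tool"` message. A minimal sketch, where the `run_tool_call` helper and the `dispatch` dict are our names, not SDK API:

```python
import json

def run_tool_call(call, dispatch):
    """Execute one tool call locally and build the follow-up message.

    `call` comes from resp.choices[0].message.tool_calls; `dispatch`
    maps tool names to plain Python callables.
    """
    args = json.loads(call.function.arguments)
    result = dispatch[call.function.name](**args)
    return {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)}
```

Append the assistant message and this dict to `messages`, then call `client.chat.completions.create` again so the model can phrase the final answer.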
| Error | What it means | Fix |
|---|---|---|
| `openai.AuthenticationError: 401` | Bad / missing API key | Check the key starts with `sk-` and has no trailing whitespace |
| `openai.NotFoundError: model not found` | Wrong model ID | Use exactly `doubao-seed-2.0-pro` (lowercase, hyphens not underscores) |
| `openai.RateLimitError: 429` | Free-tier RPM cap | Add tenacity backoff or top up to remove the cap |
| `httpx.ConnectError` | Network blocked | Verify outbound HTTPS to aiapi-pro.com is allowed; some corporate proxies need configuration |
| Empty `message.content` | Model returned a tool call instead | Check `message.tool_calls` first when you pass `tools` |
| Slow first token (>3 s) | Cold start in a low-traffic region | Use `stream=True`; fire a warm-up call after each deploy |
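The last two rows are worth guarding against in code. A small sketch (the `message_text` name is ours) that reads a response without assuming `content` is populated:

```python
def message_text(resp):
    """Return the text of a chat response, or None when the model chose a tool call."""
    msg = resp.choices[0].message
    if getattr(msg, "tool_calls", None):
        return None  # handle msg.tool_calls first
    return msg.content or ""
```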
For 429s and timeouts, wrap calls with tenacity's exponential backoff (the original `retry=lambda ...` predicate was not a valid tenacity retry condition; `retry_if_exception_type` is the supported form):

```python
from openai import APITimeoutError, RateLimitError
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

@retry(
    wait=wait_exponential(multiplier=1, min=1, max=20),
    stop=stop_after_attempt(5),
    retry=retry_if_exception_type((RateLimitError, APITimeoutError)),
)
def chat(prompt: str) -> str:
    r = client.chat.completions.create(
        model="doubao-seed-2.0-pro",
        messages=[{"role": "user", "content": prompt}],
        timeout=30,
    )
    return r.choices[0].message.content
```
$0.50 free credit. No credit card. The code above will run as soon as you paste your key.
Get a Doubao Key →

Code examples tested with openai-python 1.40.0 in May 2026. SDK upgrades are backward-compatible for the patterns shown here.