Alibaba's Qwen (Tongyi Qianwen) family of models has become one of the strongest contenders in the AI landscape, especially for multilingual tasks, long-context understanding, and instruction following. In many benchmarks, Qwen-Max rivals GPT-4o and Claude 3.5 Sonnet.
The challenge for international developers? Qwen's official API is served through Alibaba Cloud (Aliyun), which typically requires Chinese identity verification and payment through Chinese channels. This effectively locks out most developers outside China.
**Qwen-Max** — Alibaba's flagship model, with the strongest reasoning and generation capabilities in the Qwen family. Comparable to GPT-4o on many benchmarks.
Pricing: $0.40 / 1M input tokens · $1.20 / 1M output tokens

**Qwen-Plus** — Balanced performance and cost. A great fit for production applications that need strong capabilities without premium pricing.
Pricing: $0.20 / 1M input tokens · $0.60 / 1M output tokens

**Qwen-Turbo** — Fastest and most affordable. Ideal for high-volume tasks like classification, extraction, and simple Q&A, with excellent cost-efficiency.
Pricing: $0.06 / 1M input tokens · $0.20 / 1M output tokens
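To see what these rates mean in practice, here's a quick back-of-the-envelope cost calculator using the per-million-token prices listed above (the helper function and token counts are illustrative, not part of any SDK):

```python
# Per-1M-token prices (USD) from the tiers listed above.
PRICES = {
    "qwen-max":   {"input": 0.40, "output": 1.20},
    "qwen-plus":  {"input": 0.20, "output": 0.60},
    "qwen-turbo": {"input": 0.06, "output": 0.20},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request, based on the published per-1M rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a 2,000-token prompt with a 500-token reply on each tier.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 2000, 500):.6f}")
```

Even on the flagship tier, that request costs a fraction of a cent — the tiers mainly matter at volume.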
Step 1: Sign up at aiapi-pro.com with your email. No phone number or identity verification needed.
Step 2: Get your API key from the dashboard (starts with nvai-).
Step 3: Make your first API call:
```python
from openai import OpenAI

client = OpenAI(
    api_key="nvai-your-api-key",
    base_url="https://aiapi-pro.com/v1"
)

# Try Qwen-Max for strongest performance
response = client.chat.completions.create(
    model="qwen-max",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Compare React and Vue for a new project"}
    ],
    temperature=0.7
)

print(response.choices[0].message.content)
```
Multilingual excellence: Qwen was trained with a strong emphasis on Chinese and English, but it also performs well in Japanese, Korean, French, German, Spanish, and more. If you're building multilingual products, Qwen is arguably the best value.
Long context: Qwen-Max supports a 32K-token context window, while Qwen-Turbo handles up to 128K tokens. This makes the family excellent for document analysis, code review, and conversation summarization.
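When a document exceeds the window, a common workaround is to split it into chunks that fit. A minimal sketch — the 4-characters-per-token heuristic is a rough approximation, and a real tokenizer gives exact counts:

```python
def chunk_text(text: str, max_tokens: int = 30_000, chars_per_token: int = 4) -> list[str]:
    """Split text into pieces that fit a model's context window.

    Uses a crude chars-per-token heuristic; swap in a real tokenizer for accuracy.
    """
    max_chars = max_tokens * chars_per_token
    chunks = []
    while text:
        chunks.append(text[:max_chars])
        text = text[max_chars:]
    return chunks

doc = "x" * 250_000                       # ~62K "tokens" by the heuristic
pieces = chunk_text(doc, max_tokens=30_000)
print(len(pieces))                        # 3 chunks of at most 120,000 chars
```

In practice you would split on paragraph or section boundaries rather than raw character offsets, then summarize each chunk and merge the results.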
Instruction following: Qwen models excel at structured output, JSON generation, and following complex multi-step instructions — critical for building reliable AI agents and pipelines.
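Even when a model follows JSON instructions well, it pays to parse defensively — models sometimes wrap their answer in a markdown fence. A small hypothetical helper (not part of any SDK):

```python
import json
import re

def extract_json(reply: str) -> dict:
    """Parse a JSON object from a model reply, tolerating ```json fences."""
    # Strip a markdown code fence if the model added one around its answer.
    match = re.search(r"```(?:json)?\s*(\{.*\})\s*```", reply, re.DOTALL)
    payload = match.group(1) if match else reply
    return json.loads(payload)

# Works whether or not the model fenced its answer:
print(extract_json('{"lang": "fr", "sentiment": "positive"}'))
print(extract_json('```json\n{"lang": "fr", "sentiment": "positive"}\n```'))
```

A `json.JSONDecodeError` from this helper is your signal to retry the request or tighten the prompt.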
| Use Case | Best Model | Why |
|---|---|---|
| Code generation | DeepSeek-v3.2 | Best coding benchmarks |
| Chinese/English translation | Qwen-Max | Native bilingual training |
| Document analysis (long) | Moonshot-128K | 128K native context |
| High-volume classification | Qwen-Turbo | $0.06/1M input, fast |
| General assistant | Qwen-Plus | Best performance/price ratio |
| Vision tasks | GLM-4.6V | Multimodal input |
| Free testing | GLM-4.6V-Flash | Completely free |
The beauty of NovAI is that all these models are accessible through the same API key and endpoint. You can route different tasks to different models based on their strengths.
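A minimal router along those lines might look like this. The task labels are illustrative, and the model ID strings for the non-Qwen models are assumptions — check the provider's model list before using them:

```python
# Map task types to model IDs from the comparison table above (illustrative).
MODEL_ROUTES = {
    "code": "deepseek-v3.2",
    "translate": "qwen-max",
    "long_doc": "moonshot-128k",
    "classify": "qwen-turbo",
    "chat": "qwen-plus",
}

def pick_model(task: str) -> str:
    """Return the model ID for a task, falling back to the general assistant."""
    return MODEL_ROUTES.get(task, "qwen-plus")

print(pick_model("classify"))   # qwen-turbo
print(pick_model("unknown"))    # qwen-plus (fallback)
```

Because every model sits behind the same endpoint, routing is just a string swap in the `model` field — no separate clients or credentials.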
```python
import json

import httpx

url = "https://aiapi-pro.com/v1/chat/completions"
headers = {
    "Authorization": "Bearer nvai-your-api-key",
    "Content-Type": "application/json"
}
payload = {
    "model": "qwen-turbo",
    "messages": [{"role": "user", "content": "Write a haiku about programming"}],
    "stream": True
}

with httpx.stream("POST", url, json=payload, headers=headers) as r:
    for line in r.iter_lines():
        if line.startswith("data: ") and "[DONE]" not in line:
            chunk = json.loads(line[6:])
            delta = chunk["choices"][0]["delta"].get("content", "")
            print(delta, end="", flush=True)
```
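The per-line SSE handling above can be factored into a helper you can unit-test without hitting the network — a sketch (this function is illustrative, not part of httpx):

```python
import json

def parse_sse_line(line: str):
    """Extract the content delta from one SSE line, or None if there isn't one."""
    if not line.startswith("data: ") or "[DONE]" in line:
        return None
    chunk = json.loads(line[len("data: "):])
    return chunk["choices"][0]["delta"].get("content")

# Typical lines seen on the stream:
print(parse_sse_line('data: {"choices": [{"delta": {"content": "Code"}}]}'))  # Code
print(parse_sse_line("data: [DONE]"))                                          # None
```

Keeping the parsing separate from the HTTP call makes it easy to handle keep-alive blanks and malformed chunks in one place.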
Free account, free GLM model for testing, and $5 minimum top-up when you're ready. 8 Chinese AI models, one API key.
Get Started Free →