NovAI is the first and only API platform in the world that supports OpenAI's new Responses API (/v1/responses) for Chinese AI models. This means tools like Open Cowork, OpenAI Agents SDK, and any application built on the Responses API can now use GLM-5, DeepSeek-v3.2, Qwen-Max, and more — at a fraction of GPT-4o pricing.
In early 2025, OpenAI introduced the Responses API — a new unified endpoint (POST /v1/responses) designed to replace the legacy Chat Completions API for agent-based applications. Tools like Open Cowork, the OpenAI Agents SDK, and an increasing number of AI agent frameworks have adopted this new API format.
Here's the problem: no Chinese AI provider supports the Responses API. Zhipu (GLM-5), DeepSeek, Alibaba (Qwen), MiniMax, and Moonshot all exclusively offer the legacy /v1/chat/completions endpoint. If you try to point Open Cowork or any Responses API tool at these providers, you'll get a 404 Not Found error.
This creates a massive gap: the most cost-effective AI models in the world (Chinese models are 10-100x cheaper than GPT-4o) are completely inaccessible to the fastest-growing category of AI tools.
```
POST /v1/responses → Zhipu API → 404 Not Found
POST /v1/responses → NovAI → auto-translate → /chat/completions → GLM-5 / DeepSeek / Qwen → 200 OK
```
NovAI now provides a fully compliant /v1/responses endpoint that automatically translates between OpenAI's Responses API format and the Chat Completions format used by Chinese providers. The translation is seamless and invisible to the client application.
What NovAI handles behind the scenes:
| Responses API Feature | Translation | Status |
|---|---|---|
| `input` (string or array) | → `messages` array | Supported |
| `instructions` | → `system` message | Supported |
| Multimodal content (images) | → image_url format | Supported |
| Streaming (SSE events) | → Full 9-event sequence | Supported |
| Function/tool calling | → tools array | Supported |
| Reasoning content | → Thinking/CoT output | Supported |
| `developer` role | → `system` role | Supported |
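The mapping above can be sketched in a few lines of Python. This is a hypothetical reconstruction of the translation layer (NovAI's actual implementation is server-side and not public); the field names come straight from the table:

```python
def translate_to_chat_completions(req: dict) -> dict:
    """Sketch: convert a Responses API payload into Chat Completions format."""
    messages = []
    # `instructions` becomes a leading system message
    if "instructions" in req:
        messages.append({"role": "system", "content": req["instructions"]})
    inp = req.get("input", "")
    if isinstance(inp, str):
        # A bare string is a single user message
        messages.append({"role": "user", "content": inp})
    else:
        for msg in inp:
            # `developer` role maps to `system`
            role = "system" if msg["role"] == "developer" else msg["role"]
            messages.append({"role": role, "content": msg["content"]})
    out = {"model": req["model"], "messages": messages}
    if "max_output_tokens" in req:
        out["max_tokens"] = req["max_output_tokens"]
    if "stream" in req:
        out["stream"] = req["stream"]
    return out
```

Everything the client sends stays in Responses format; only the upstream call to the Chinese provider uses the translated shape.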
| Model | Provider | Input Price | Output Price | vs GPT-4o |
|---|---|---|---|---|
| `glm-5` | Zhipu AI | $0.004/1M | $0.004/1M | 1,250x cheaper |
| `deepseek-v3.2` | DeepSeek | $0.20/1M | $0.40/1M | 25x cheaper |
| `qwen-max` | Alibaba | $1.60/1M | $6.40/1M | 3x cheaper |
| `qwen-plus` | Alibaba | $0.40/1M | $1.20/1M | 12x cheaper |
| `qwen-turbo` | Alibaba | $0.05/1M | $0.20/1M | 100x cheaper |
| `minimax-text-01` | MiniMax | $0.20/1M | $1.60/1M | 10x cheaper |
| `glm-4.6v` | Zhipu AI | $0.40/1M | $1.20/1M | 12x cheaper |
| `glm-4.6v-flash` | Zhipu AI | Free | Free | Infinite savings |
GLM-5 deserves special attention: Zhipu's latest flagship model includes a built-in chain-of-thought reasoning engine with output quality comparable to GPT-4o, at $0.004 per million tokens. That's over 1,000x cheaper than OpenAI. Until now, international users had no way to use this model with modern agent tools.
Sign up at aiapi-pro.com — email only, no credit card. You'll get $0.50 in free credits and an API key starting with `nvai-`.
In any Responses API-compatible tool, set:
- Base URL: `https://aiapi-pro.com/v1`
- API Key: `nvai-your-key-here`
- Model: `glm-5`
That's it. No code changes, no middleware, no proxies. The tool will send Responses API requests and receive proper Responses API responses — NovAI handles all the translation invisibly.
Open Cowork is an open-source AI desktop assistant that uses the Responses API exclusively. Here's how to connect it to Chinese models through NovAI:
In Open Cowork's model settings:

- Provider: OpenAI
- Base URL: `https://aiapi-pro.com/v1`
- Model: `glm-5` (or `deepseek-v3.2`, `qwen-max`, etc.)
Previously: Configuring Open Cowork with Zhipu's API directly would return a 404 error because Zhipu doesn't support /v1/responses. NovAI solves this completely.
The OpenAI Agents SDK uses the Responses API internally. To route it through NovAI:
```python
from openai import AsyncOpenAI
from agents import Agent, Runner, set_default_openai_client

# Point the Agents SDK's default client at NovAI
client = AsyncOpenAI(
    api_key="nvai-your-key-here",
    base_url="https://aiapi-pro.com/v1",
)
set_default_openai_client(client)

agent = Agent(
    name="My Agent",
    instructions="You are a helpful coding assistant.",
    model="deepseek-v3.2",
)

result = Runner.run_sync(agent, "Write a Python function to sort a list")
print(result.final_output)
```
Every agent call will use DeepSeek at $0.20/1M tokens instead of GPT-4o at $5/1M — a 25x cost reduction with comparable code quality.
You can also call the endpoint directly over HTTP:

```http
POST https://aiapi-pro.com/v1/responses
Authorization: Bearer nvai-your-key
Content-Type: application/json

{
  "model": "glm-5",
  "input": "Explain quantum computing in simple terms.",
  "stream": false,
  "instructions": "You are a physics tutor for beginners.",
  "temperature": 0.7,
  "max_output_tokens": 2000
}
```
A non-streaming request (`"stream": false`) returns a single Responses object:

```json
{
  "id": "resp_abc123...",
  "object": "response",
  "created_at": 1773595519,
  "status": "completed",
  "model": "glm-5",
  "output": [
    {
      "id": "msg_xyz789...",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "Quantum computing uses quantum bits...",
          "annotations": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 15,
    "output_tokens": 200,
    "total_tokens": 215
  }
}
```
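The same request can be sent with nothing but the Python standard library. `build_request` mirrors the headers and body shown above; the key is a placeholder:

```python
import json
import urllib.request

API_KEY = "nvai-your-key-here"  # placeholder

def build_request(payload: dict) -> urllib.request.Request:
    """Construct the POST /v1/responses request shown above."""
    return urllib.request.Request(
        "https://aiapi-pro.com/v1/responses",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def send(payload: dict) -> dict:
    """Send the request and parse the Responses API JSON body."""
    with urllib.request.urlopen(build_request(payload)) as resp:
        return json.loads(resp.read())
```

With a valid key, `send({"model": "glm-5", "input": "Hello"})` returns the parsed response object as a dict.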
When "stream": true, NovAI returns the full OpenAI-compliant SSE event sequence:
```
event: response.created
event: response.in_progress
event: response.output_item.added
event: response.content_part.added
event: response.output_text.delta   (repeated for each token)
event: response.output_text.done
event: response.content_part.done
event: response.output_item.done
event: response.completed
```
This is byte-for-byte compatible with OpenAI's own streaming format. Any library or framework that parses OpenAI Responses API streams will work without modification.
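A stream like this is also easy to consume by hand. The sketch below parses raw SSE lines and concatenates the `response.output_text.delta` payloads; per OpenAI's streaming format, each delta event's `data:` line carries a JSON object with a `delta` field:

```python
import json

def collect_text(sse_lines) -> str:
    """Reassemble the final text from Responses API SSE lines by
    concatenating the payloads of response.output_text.delta events."""
    parts, event = [], None
    for line in sse_lines:
        if line.startswith("event: "):
            event = line[len("event: "):].strip()
        elif line.startswith("data: ") and event == "response.output_text.delta":
            parts.append(json.loads(line[len("data: "):])["delta"])
    return "".join(parts)
```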
The input field accepts the same formats as OpenAI's Responses API:
A plain string:

```json
{"model": "glm-5", "input": "Hello, world!"}
```

A structured message array (the `developer` role is translated to `system`):

```json
{"model": "glm-5", "input": [
  {"role": "developer", "content": "You are a code reviewer."},
  {"role": "user", "content": "Review this function: def add(a,b): return a+b"}
]}
```

Multimodal input for vision models:

```json
{"model": "glm-4.6v", "input": [
  {"role": "user", "content": [
    {"type": "input_text", "text": "What's in this image?"},
    {"type": "input_image", "image_url": "https://example.com/photo.jpg"}
  ]}
]}
```
Tool definitions use the Responses API's flattened function format (`name` and `parameters` at the top level, not nested under a `function` key):

```json
{
  "model": "deepseek-v3.2",
  "input": "What's the weather in Tokyo?",
  "tools": [
    {
      "type": "function",
      "name": "get_weather",
      "description": "Get current weather for a city",
      "parameters": {
        "type": "object",
        "properties": {
          "city": {"type": "string"}
        },
        "required": ["city"]
      }
    }
  ]
}
```
| Platform / Tool | Direct Chinese API | Via NovAI |
|---|---|---|
| Open Cowork | 404 Error | Works |
| OpenAI Agents SDK | Not Compatible | Works |
| Custom Responses API Apps | Not Compatible | Works |
| OpenAI Python SDK (responses) | Not Compatible | Works |
| Cursor / Continue IDE | Works (chat/completions) | Works (both APIs) |
| LangChain / LlamaIndex | Works (chat/completions) | Works (both APIs) |
The AI industry is moving rapidly toward agent-based architectures. OpenAI's Responses API is becoming the standard interface for these systems. By bridging this gap, NovAI enables a new paradigm:
Build with the latest agent frameworks. Pay Chinese model prices.
An agent pipeline that would cost $50/day with GPT-4o can run for $0.04/day with GLM-5 through NovAI — without changing a single line of code in your agent framework.
NovAI supports both the legacy /v1/chat/completions endpoint and the new /v1/responses endpoint. You can use whichever your application needs, or both simultaneously. All models are available through both endpoints.
$0.50 free credits. No credit card required. Works in 30 seconds.
Get API Key Free →