API Reference
The 答安 API is fully compatible with the OpenAI API format. All requests go to a single Base URL:
Base URL
https://model.token618.com/api/open-apis/v1
Authentication
Authorization: Bearer YOUR_API_KEY
Endpoint Overview
| Method | Path | Description |
|---|---|---|
| POST | /chat/completions | Chat completion (core endpoint) |
| POST | /images/generations | Image generation |
| POST | /embeddings | Text embeddings |
| GET | /models | List available models |
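As a rough illustration of how the Base URL, the Authorization header, and the paths in the table fit together, here is a minimal sketch using only Python's standard library. The helper name `build_request` is hypothetical and not part of any SDK, and the request is only constructed, not sent.

```python
import json
import urllib.request

BASE_URL = "https://model.token618.com/api/open-apis/v1"


def build_request(path, api_key, payload=None):
    """Build (but do not send) a request for one of the endpoints above.

    `build_request` is a hypothetical helper, not part of any SDK.
    A payload implies POST; no payload implies GET (e.g. /models).
    """
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if payload is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    method = "POST" if payload is not None else "GET"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


req = build_request("/models", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

To actually send the request, the returned object could be passed to `urllib.request.urlopen`.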
Chat Completions
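The most commonly used endpoint. It supports both streaming (SSE) and non-streaming responses and can be called directly with the OpenAI SDK.
POST /chat/completions
Request parameters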
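| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Model ID (e.g. GLM-5.1, DeepSeek-V3.2) |
| messages | array | Required | List of messages; each has a role (system / user / assistant / tool) and content |
| stream | boolean | Optional | Whether to stream the output (default false) |
| temperature | number | Optional | Sampling temperature between 0 and 2 (default 1). Higher values make the output more random |
| max_tokens | integer | Optional | Maximum number of tokens to generate. If unset, the model's default applies |
| top_p | number | Optional | Nucleus sampling probability threshold (default 1). Adjust either this or temperature, not both |
| stop | string / array | Optional | Sequences at which generation stops; up to 4 |
| tools | array | Optional | List of function tools the model may call (see the Function Calling section) |
| tool_choice | string / object | Optional | Controls function calling: auto, none, or a specific function name |
| response_format | object | Optional | Output format, e.g. { "type": "json_object" } |
| stream_options | object | Optional | Streaming options; set { "include_usage": true } to receive usage in the final chunk |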
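The limits in the table (temperature between 0 and 2, at most 4 stop sequences, two required fields) can be checked on the client before a request is sent. The sketch below is one possible way to do that; `build_chat_payload` is a hypothetical helper, and the server still performs its own validation.

```python
def build_chat_payload(model, messages, *, stream=False, temperature=None,
                       max_tokens=None, top_p=None, stop=None):
    """Assemble a /chat/completions body, checking the documented limits.

    Hypothetical client-side helper; the server validates independently.
    """
    if not model or not messages:
        raise ValueError("model and messages are required")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    # `stop` may be a single string or a list of strings.
    stop_count = 1 if isinstance(stop, str) else len(stop or [])
    if stop_count > 4:
        raise ValueError("at most 4 stop sequences are allowed")

    payload = {"model": model, "messages": messages, "stream": stream}
    # Optional fields are omitted when unset so the server defaults apply.
    optional = {"temperature": temperature, "max_tokens": max_tokens,
                "top_p": top_p, "stop": stop}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_chat_payload(
    "GLM-5.1",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_tokens=256,
)
print(payload)
```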
Non-streaming request example
Request
curl https://model.token618.com/api/open-apis/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "GLM-5.1",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ],
    "stream": false
  }'
Response
{
  "id": "chatcmpl-xxx",
  "object": "chat.completion",
  "model": "GLM-5.1",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 8,
    "total_tokens": 28
  }
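}
Streaming Output (SSE)
With stream: true, the endpoint returns an incremental response in Server-Sent Events (SSE) format. Each chunk carries a small piece of the generated text, and the stream ends with a [DONE] marker.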
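For clients that read the raw SSE stream rather than using an SDK, the parsing step described above might look like the following sketch. The function name `iter_sse_chunks` and the sample lines are illustrative assumptions; the sketch only handles `data:` lines and the `[DONE]` marker.

```python
import json


def iter_sse_chunks(lines):
    """Yield parsed JSON chunks from SSE lines until the [DONE] marker.

    `lines` is any iterable of decoded text lines; blank lines and
    non-data fields are skipped.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield json.loads(data)


# Hand-written sample events in the chunk format, for illustration only.
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]

# Concatenate the content deltas to rebuild the full message.
text = "".join(
    chunk["choices"][0]["delta"].get("content") or ""
    for chunk in iter_sse_chunks(sample)
)
print(text)  # Hello!
```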
Streaming request example
Request
curl https://model.token618.com/api/open-apis/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "GLM-5.1",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
Response
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" How"},"finish_reason":null}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" can I help"},"finish_reason":null}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" you today?"},"finish_reason":"stop"}]}
data: [DONE]
Streaming usage statistics
By default, streaming responses do not include the usage field. To get token usage, add this parameter:
stream_options parameter
{
  "model": "GLM-5.1",
  "messages": [{"role": "user", "content": "Hello!"}],
  "stream": true,
  "stream_options": { "include_usage": true }
}
Once set, the last chunk before [DONE] carries the complete usage statistics.
SDK examples
Python (OpenAI SDK)
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://model.token618.com/api/open-apis/v1"
)

stream = client.chat.completions.create(
    model="GLM-5.1",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
    stream_options={"include_usage": True},
)

for chunk in stream:
    # Print each text delta as it arrives.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
    # With include_usage set, usage arrives on the last chunk.
    if chunk.usage:
        print(f"\nUsage: {chunk.usage}")
Node.js (OpenAI SDK)
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://model.token618.com/api/open-apis/v1",
});

const stream = await client.chat.completions.create({
  model: "GLM-5.1",
  messages: [{ role: "user", content: "Hello!" }],
  stream: true,
  stream_options: { include_usage: true },
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
  if (chunk.usage) {
    console.log("\nUsage:", chunk.usage);
  }
}
Function Calling (Tool Use)
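The tools parameter lets the model call functions you define. When needed, the model returns a function call request; your code runs the function and passes the result back so the model can continue the conversation.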
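The "run the function and pass the result back" step can be sketched without any network call. In the sketch below, `TOOL_REGISTRY`, `run_tool_call`, and the placeholder `get_weather` are all hypothetical names with made-up return data; the input dict mirrors the tool_calls shape in the OpenAI format.

```python
import json


def get_weather(city):
    # Placeholder implementation with made-up data, for illustration only.
    return {"city": city, "condition": "sunny"}


# Hypothetical registry mapping tool names to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_call(tool_call):
    """Execute one tool call (as a dict) and build the `tool` reply message."""
    fn = tool_call["function"]
    handler = TOOL_REGISTRY.get(fn["name"])
    if handler is None:
        result = {"error": f"unknown tool: {fn['name']}"}
    else:
        # `arguments` arrives as a JSON-encoded string, not a dict.
        result = handler(**json.loads(fn["arguments"]))
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": json.dumps(result, ensure_ascii=False),
    }


call = {
    "id": "call_abc123",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"city\": \"北京\"}"},
}
print(run_tool_call(call))
```

Keeping the name-to-function mapping in one place means an unknown tool name produces an error payload for the model instead of an unhandled exception.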
Request example
Python (Function Calling)
from openai import OpenAI
import json

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://model.token618.com/api/open-apis/v1"
)


def get_weather(city):
    # Hypothetical placeholder: replace with your own weather lookup.
    return {"city": city, "weather": "unknown"}


# Step 1: send a request that includes tools
response = client.chat.completions.create(
    model="GLM-5.1",
    messages=[{"role": "user", "content": "北京今天天气怎么样?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "获取指定城市的天气信息",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "城市名称"
                    }
                },
                "required": ["city"]
            }
        }
    }],
    tool_choice="auto"
)
message = response.choices[0].message

# Step 2: check whether the model wants to call a function
if message.tool_calls:
    tool_call = message.tool_calls[0]
    args = json.loads(tool_call.function.arguments)

    # Step 3: run your own function logic
    weather_result = get_weather(args["city"])

    # Step 4: return the result to the model
    response2 = client.chat.completions.create(
        model="GLM-5.1",
        messages=[
            {"role": "user", "content": "北京今天天气怎么样?"},
            message,
            {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(weather_result, ensure_ascii=False)
            },
        ]
    )
    print(response2.choices[0].message.content)
Response
// Response format when the model returns tool_calls
{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": null,
      "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
          "name": "get_weather",
          "arguments": "{\"city\":\"北京\"}"
        }
      }]
    },
    "finish_reason": "tool_calls"
  }]
}
finish_reason values
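The finish_reason field in the response indicates why generation ended:
| Value | Description |
|---|---|
| stop | The model finished normally or hit a stop sequence |
| length | Output was truncated at the max_tokens limit |
| tool_calls | The model decided to call a function |
| content_filter | Content was blocked by the safety filter |
| error | An error occurred during generation (in streaming, the HTTP status code is still 200) |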
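Client code usually branches on finish_reason. The mapping below is one possible policy, given only as a sketch: the function name `next_action` and the suggested actions are assumptions, not behavior prescribed by the API.

```python
def next_action(finish_reason):
    """Map a finish_reason to a suggested client action (hypothetical policy)."""
    actions = {
        "stop": "use the output as-is",
        "length": "raise max_tokens or ask the model to continue",
        "tool_calls": "run the requested functions and send back the results",
        "content_filter": "treat the content as blocked; revise the prompt",
        "error": "treat the output as incomplete; inspect the error object",
    }
    if finish_reason is None:
        # Streaming chunks carry null until the final chunk.
        return "keep reading: the stream has not finished"
    return actions.get(finish_reason, "unknown finish_reason; handle defensively")


for reason in ("stop", "length", "tool_calls"):
    print(reason, "->", next_action(reason))
```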
Images
Image generation endpoint; supports models such as DALL-E.
POST /images/generations
Request
curl https://model.token618.com/api/open-apis/v1/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "z-image-turbo",
    "prompt": "A cute cat wearing a hat",
    "n": 1,
    "size": "1024x1024"
  }'
Embeddings
Text embedding endpoint; supports a range of embedding models.
POST /embeddings
Request
curl https://model.token618.com/api/open-apis/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "Hello world"
  }'
Models
Returns the list of models available to the current account.
GET /models
Request
curl https://model.token618.com/api/open-apis/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
Response
{
  "object": "list",
  "data": [
    {
      "id": "GLM-5.1",
      "object": "model",
      "owned_by": "daan"
    },
    {
      "id": "deepseek-chat",
      "object": "model",
      "owned_by": "deepseek"
    }
  ]
}
Error Handling
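The 答安 API reports errors at two levels, so that errors can be caught correctly in each scenario.
Non-streaming requests / before a streaming request starts
A standard HTTP error status code and a JSON error body are returned, so the error type can be determined from the HTTP status code as usual.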
Response
{
  "error": {
    "code": 401,
    "message": "Invalid API key provided",
    "metadata": {}
  }
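}
During a streaming request (mid-stream error)
If an error occurs while a stream is in progress, the HTTP status code remains 200, but the SSE event carries the error details and finish_reason is "error".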
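Because the HTTP status stays 200 in this case, a client has to inspect each parsed chunk to notice the failure. A minimal sketch follows; the `StreamError` class and `check_chunk` function are hypothetical names, and the chunk is assumed to be already parsed into a dict.

```python
class StreamError(Exception):
    """Raised when a chunk reports an error although the HTTP status was 200."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def check_chunk(chunk):
    """Raise StreamError if a parsed chunk signals a mid-stream failure."""
    error = chunk.get("error")
    reasons = [c.get("finish_reason") for c in chunk.get("choices") or []]
    if error or "error" in reasons:
        error = error or {}
        raise StreamError(error.get("code"), error.get("message", "unknown error"))
    return chunk


bad = {
    "error": {"code": 502, "message": "Provider error"},
    "choices": [{"index": 0, "delta": {"content": None}, "finish_reason": "error"}],
}
try:
    check_chunk(bad)
except StreamError as exc:
    print("stream failed:", exc)  # stream failed: 502: Provider error
```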
Response
data: {"error":{"code":502,"message":"Provider error"},"choices":[{"index":0,"delta":{"content":null},"finish_reason":"error"}]}
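data: [DONE]
Error codes
| Status code | Type | Description |
|---|---|---|
| 400 | Invalid parameters | Malformed request parameters or a missing required parameter |
| 401 | Authentication failed | API key is invalid, missing, or expired |
| 402 | Insufficient balance | Account balance cannot cover this request |
| 403 | Permission denied | No access to the model or resource |
| 408 | Request timeout | Request processing timed out; retry later |
| 429 | Rate limited | Request rate exceeded; lower concurrency or contact support to raise the limit |
| 500 | Server error | Internal server error; 答安 automatically retries or switches channels |
| 502 | Upstream error | The upstream provider returned an error; 答安 automatically tries a backup channel |
| 503 | Temporarily unavailable | The model is temporarily unavailable; it usually recovers within a few minutes |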
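Retry recommendations
- On 429/502/503, retry with exponential backoff
- 401/402/403 are business errors that a retry will not fix; check your API key and balance
- The 答安 gateway has built-in failover: on 500/502 it automatically tries a backup channel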
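The retry recommendations above could be implemented roughly as follows. This is a sketch under stated assumptions: `send` is a hypothetical zero-argument callable returning `(status_code, body)`, and the delay base, cap, retry count, and jitter scheme are illustrative choices rather than values specified by the API.

```python
import random
import time

RETRYABLE = {429, 502, 503}  # status codes the list above suggests retrying


def backoff_delays(max_retries=4, base=0.5, cap=8.0):
    """Return capped exponential delays: base, 2*base, 4*base, ..."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def call_with_retry(send, max_retries=4, sleep=time.sleep, jitter=random.random):
    """Call `send()` until it returns a non-retryable status.

    `send` is a hypothetical zero-argument callable returning
    (status_code, body). Retries stop when attempts run out.
    """
    status, body = send()
    for delay in backoff_delays(max_retries):
        if status not in RETRYABLE:
            break
        # Add up to 100% random jitter so clients do not retry in lockstep.
        sleep(delay * (1 + jitter()))
        status, body = send()
    return status, body


# Simulated responses; sleep and jitter are stubbed so the demo runs instantly.
responses = iter([(429, None), (503, None), (200, {"ok": True})])
waits = []
result = call_with_retry(lambda: next(responses), sleep=waits.append, jitter=lambda: 0.0)
print(result, waits)  # (200, {'ok': True}) [0.5, 1.0]
```

Statuses such as 401, 402, and 403 fall outside `RETRYABLE`, so they are returned to the caller immediately without waiting.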