TokenLX
QWEN

alibaba/qwen-plus

1M context · $0.12/M tokens input · $0.3/M tokens output · 5.70B tokens served · Text

Balanced Qwen tier with strong Chinese + reasonable cost. The pragmatic default for production Chinese apps.

Key strengths

  • Excellent Chinese fluency
  • Moderate cost
  • Solid reasoning
  • Tool calling support

Use cases

  • Chinese customer service
  • Content generation
  • Document summarization
  • Localization
chinese · balanced

Alibaba's alibaba/qwen-plus is a frontier text generation model in the Qwen family. It excels at complex reasoning, agentic workflows, code generation, and long-form writing tasks, with native support for streaming, tool calling, JSON mode, and multi-turn conversations.

The model handles long-context inputs gracefully and is particularly effective for software engineering, multi-step research, and end-to-end project execution. Its tokenizer and pricing are optimized for high-throughput production workloads, with a competitive cost profile relative to other models in its tier.

alibaba/qwen-plus is fully OpenAI-compatible: drop in your existing OpenAI Python or Node SDK and set the base URL (`base_url` in Python, `baseURL` in Node) to `https://api.tokenlx.ai/v1`. TokenLX transparently routes your requests to the optimal provider endpoint while preserving streaming, function-calling, and structured-output semantics.
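Since the page lists tool calling among the model's strengths, a minimal sketch of a tool-calling request through the OpenAI Python SDK may help. The `get_weather` tool, the `build_request` and `run_example` helpers, and the `TOKENLX_API_KEY` environment variable are hypothetical names chosen for illustration; the model ID and base URL follow the sample code on this page. The `tools` structure and `message.tool_calls` are the standard OpenAI Chat Completions shapes, and it is assumed here that TokenLX passes them through unchanged.

```python
import json
import os

# Hypothetical tool definition for illustration; not something TokenLX provides.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_request(user_text: str) -> dict:
    """Build keyword arguments for client.chat.completions.create."""
    return {
        "model": "qwen-plus",  # model ID as written in this page's sample code
        "messages": [{"role": "user", "content": user_text}],
        "tools": [WEATHER_TOOL],
    }


def run_example() -> None:
    """Send the request. Needs the openai package and an API key in
    TOKENLX_API_KEY (an assumed variable name)."""
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.tokenlx.ai/v1",
        api_key=os.environ["TOKENLX_API_KEY"],
    )
    response = client.chat.completions.create(**build_request("Weather in Hangzhou?"))
    # tool_calls is None when the model answers directly instead of calling a tool.
    for call in response.choices[0].message.tool_calls or []:
        args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
        print(call.function.name, args)
```

Calling `run_example()` performs the network request; `build_request` alone can be used to inspect or log the payload before sending it.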

Performance

Compare different providers across TokenLX · All locations.

Throughput: 66 tok/s
Latency: 127 ms
E2E Latency: 200 ms
Tool Call Errors: 0.07%
Output Errors: 0.38%
Time to First Token: 95 ms
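
As a rough back-of-envelope reading of these figures, the wall-clock time of a streamed reply can be approximated as time to first token plus output tokens divided by throughput. The sketch below assumes that simple model and ignores queueing, network variance, and prompt length, so it will not reproduce the reported E2E latency, which presumably reflects shorter responses. `estimate_stream_seconds` is a hypothetical helper, not a TokenLX API; its defaults are the values listed above.

```python
def estimate_stream_seconds(
    output_tokens: int,
    ttft_ms: float = 95.0,           # Time to First Token from the table above
    throughput_tok_s: float = 66.0,  # Throughput from the table above
) -> float:
    """Approximate seconds to stream a reply: TTFT + tokens / throughput."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if throughput_tok_s <= 0:
        raise ValueError("throughput_tok_s must be positive")
    return ttft_ms / 1000.0 + output_tokens / throughput_tok_s
```

Under these assumptions, a 660-token reply would take roughly ten seconds to stream in full.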

Effective Pricing

Actual cost per million tokens across providers over the past 7 days.

Input: $0.12 per 1M tokens
Output: $0.3 per 1M tokens
Cache read: $0.02 per 1M tokens
Cache write: $0.12 per 1M tokens
Reasoning: $1.18 per 1M tokens

Input tiers

  • 0M - 0.128M: $0.12 per 1M tokens
  • 0.128M - 0.256M: $0.35 per 1M tokens
  • 0.256M - 1M: $0.71 per 1M tokens

Output tiers

  • 0M - 0.128M: $0.3 per 1M tokens
  • 0.128M - 0.256M: $2.95 per 1M tokens
  • 0.256M - 1M: $7.09 per 1M tokens
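
To see how the tiers affect a single request, here is a minimal cost-estimation sketch. It rests on assumptions the page does not confirm: that the tier is selected by the request's input token count, that the selected tier's rates apply to all input and output tokens of that request (not marginally), and that the upper bound of each range is inclusive. `Tier`, `TIERS`, and `estimate_cost_usd` are hypothetical names; cache and reasoning rates are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    max_input_tokens: int    # upper bound of the tier (assumed inclusive)
    input_usd_per_m: float   # USD per 1M input tokens
    output_usd_per_m: float  # USD per 1M output tokens


# Rates copied from the input and output tier lists above.
TIERS = (
    Tier(128_000, 0.12, 0.30),
    Tier(256_000, 0.35, 2.95),
    Tier(1_000_000, 0.71, 7.09),
)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate one request's cost, assuming the tier is chosen by input size."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    for tier in TIERS:
        if input_tokens <= tier.max_input_tokens:
            total = (input_tokens * tier.input_usd_per_m
                     + output_tokens * tier.output_usd_per_m)
            return total / 1_000_000
    raise ValueError("input exceeds the 1M-token context window")
```

Under these assumptions, a request with 200,000 input tokens and 2,000 output tokens falls in the middle tier and costs about $0.076, while the same output after a 1,000-token prompt is billed at the lowest rates.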

Recent activity

Total usage per day on TokenLX (last 30 days).

Prompt: 1.60B
Completion: 4.09B
Reasoning: 491.13M

Sample code & API

TokenLX normalizes requests and responses across providers. Use any OpenAI SDK or our native SDK.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.tokenlx.ai/v1",
    api_key="sk-tokenlx-...",
)

# Non-streaming
response = client.chat.completions.create(
    model="qwen-plus",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="qwen-plus",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
)
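for chunk in stream:
    # Some chunks (e.g. a trailing usage-only chunk) may carry no choices; skip them.
    if not chunk.choices:
        continue
    print(chunk.choices[0].delta.content or "", end="", flush=True)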

Replace sk-tokenlx-... with your key from the dashboard.