TokenLX
DUBAO

bytedance/doubao-seed-1.6

128K context · $0.12/M tokens input · $1.18/M tokens output · Thinking: Supported · 0.07B tokens served · Text

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

Key strengths

  • Chinese performance
  • Volcano Engine native
  • Good latency
  • Multi-size tiers

Use cases

  • Chinese products
  • Social / content
  • E-commerce
  • Internal tools

Thinking token billing

When Thinking mode is enabled, reasoning tokens generated by the model are counted as billable output. This can increase total usage beyond the visible answer tokens.
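
As a rough illustration of the billing rule above, the sketch below estimates the cost of one request when reasoning tokens are billed as output. The function name and the token counts are hypothetical. The default rates are the base prices listed on this page ($0.12 input and $1.18 output per 1M tokens), and actual billing may differ by tier or provider.

```python
def estimate_cost_usd(prompt_tokens, answer_tokens, reasoning_tokens=0,
                      input_per_m=0.12, output_per_m=1.18):
    """Estimate request cost, counting reasoning tokens as billable output."""
    billable_output = answer_tokens + reasoning_tokens
    return (prompt_tokens * input_per_m + billable_output * output_per_m) / 1_000_000


# Hypothetical request: 1,000 prompt tokens and 300 visible answer tokens.
without_thinking = estimate_cost_usd(1_000, 300)
# Same request, plus 2,000 reasoning tokens generated in Thinking mode.
with_thinking = estimate_cost_usd(1_000, 300, reasoning_tokens=2_000)

print(f"without Thinking: ${without_thinking:.6f}")
print(f"with Thinking:    ${with_thinking:.6f}")
```

In this made-up case the visible answer is the same length in both calls, yet the Thinking request costs several times more because the reasoning tokens dominate the billable output.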

chinese · bytedance

ByteDance's bytedance/doubao-seed-1.6 is a frontier text generation model in the Doubao family. It excels at complex reasoning, agentic workflows, code generation, and long-form writing tasks, with native support for streaming, tool calling, JSON mode, and multi-turn conversations.

The model handles long-context inputs gracefully and is particularly effective for software engineering, multi-step research, and end-to-end project execution. Its tokenizer and pricing are optimized for high-throughput production workloads, with a competitive cost profile relative to other models in its tier.

bytedance/doubao-seed-1.6 is OpenAI-compatible: keep your existing OpenAI Python or Node SDK and switch `baseURL` to `https://api.tokenlx.ai/v1`. TokenLX routes each request to a provider endpoint while preserving streaming, function-calling, and structured-output semantics.

Performance

Compare different providers across TokenLX · All locations.

Throughput: 53 tok/s
Latency: 111 ms
E2E Latency: 175 ms
Tool Call Errors: 0.07%
Output Errors: 0.38%
Time to First Token: 80 ms

Effective Pricing

Actual cost per million tokens across providers over the past 7 days.

Input: $0.12 per 1M tokens
Output: $1.18 per 1M tokens
Cache read: $0.02 per 1M tokens
Input tiers
  • 0M - 0.032M: $0.12 per 1M tokens
  • 0M - 0.032M: $0.12 per 1M tokens
  • 0.032M - 0.128M: $0.18 per 1M tokens
  • 0.128M - 0.256M: $0.35 per 1M tokens
Output tiers
  • 0M - 0.032M: $1.18 per 1M tokens
  • 0M - 0.032M: $0.3 per 1M tokens
  • 0.032M - 0.128M: $2.36 per 1M tokens
  • 0.128M - 0.256M: $3.54 per 1M tokens
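
The tier lists above can be read as a lookup keyed on request size. The sketch below assumes the tier is selected by the number of prompt tokens and that each range includes its upper bound; neither detail is stated on this page. The second 0M - 0.032M row (output at $0.3) is left out because the condition that triggers it is not documented here. `TIERS` and `rates_for` are hypothetical names.

```python
# (upper bound in tokens, input $/1M, output $/1M), copied from the tier lists.
# Assumption: tiers are chosen by prompt length; 0.032M is read as 32,000 tokens.
TIERS = [
    (32_000, 0.12, 1.18),
    (128_000, 0.18, 2.36),
    (256_000, 0.35, 3.54),
]


def rates_for(prompt_tokens):
    """Return (input_rate, output_rate) in USD per 1M tokens for a prompt size."""
    for upper, input_rate, output_rate in TIERS:
        if prompt_tokens <= upper:
            return input_rate, output_rate
    raise ValueError("prompt exceeds the largest listed tier")


print(rates_for(10_000))
print(rates_for(50_000))
```

Keeping the tiers in a sorted list means the first matching upper bound wins, so adding or adjusting a tier only requires editing `TIERS`.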

Recent activity

Total usage per day on TokenLX (last 30 days).

Prompt: 21.25M
Completion: 45.69M

Sample code & API

TokenLX normalizes requests and responses across providers. Use any OpenAI SDK or our native SDK.

Control Thinking cost

Disable Thinking when you do not need explicit reasoning, or set a lower budget_tokens value to cap the reasoning length. Only enable return_thoughts when you need to inspect the thinking process.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.tokenlx.ai/v1",
    api_key="sk-tokenlx-...",
)

# Non-streaming
response = client.chat.completions.create(
    model="doubao-seed-1.6",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    # Optional: enable Thinking / reasoning.
    extra_body={
        "thinking": {
            "enabled": True,
            "budget_tokens": 2048,
            "return_thoughts": True,
        }
    },
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="doubao-seed-1.6",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
    extra_body={
        "thinking": {
            "enabled": True,
            "budget_tokens": 2048,
            "return_thoughts": True,
        }
    },
)
for chunk in stream:
    # Some chunks may arrive with an empty choices list; skip those.
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)

Replace sk-tokenlx-... with your key from the dashboard.
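
The cost controls described under "Control Thinking cost" can be sketched as a small helper that builds the `extra_body` payload. The field names (`enabled`, `budget_tokens`, `return_thoughts`) come from the sample above; the helper `thinking_options` is hypothetical, and it assumes that sending `"enabled": False` is how Thinking is switched off, which this page does not spell out.

```python
def thinking_options(enabled=True, budget_tokens=None, return_thoughts=False):
    """Build an extra_body dict for the `thinking` request field."""
    if not enabled:
        # Assumption: an explicit False disables reasoning entirely.
        return {"thinking": {"enabled": False}}
    thinking = {"enabled": True, "return_thoughts": return_thoughts}
    if budget_tokens is not None:
        # Caps the reasoning length, and therefore the billable reasoning tokens.
        thinking["budget_tokens"] = budget_tokens
    return {"thinking": thinking}


# No explicit reasoning requested.
no_thinking = thinking_options(enabled=False)
# Reasoning capped at 512 tokens, thoughts not returned in the response.
capped = thinking_options(budget_tokens=512)

print(no_thinking)
print(capped)
```

Either dict can be passed as `extra_body=...` to `client.chat.completions.create`, in place of the inline `extra_body` used in the sample.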