Alibaba's Qwen series is a family of Chinese-first LLMs with strong bilingual (Chinese and English) support, offered in tiers ranging from turbo to max.
Key strengths
- Strong Chinese
- Good English
- Multiple size tiers
- Tool calling
Use cases
- Chinese assistants
- Bilingual content
- Enterprise chat
- Cross-border apps
Alibaba's alibaba/qwen-gte-rerank-v2 is a cross-encoder reranking model. Designed for the second stage of retrieval pipelines, it scores query-document pairs to improve relevance ranking from initial vector or BM25 retrieval.
It is particularly effective in hybrid search systems and RAG applications, where placing the most relevant documents at the top of the result set materially improves downstream generation quality.
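The two-stage pattern described above can be sketched as follows. This is a minimal illustration, not the model's implementation: `overlap_score` is a hypothetical stand-in for the cross-encoder (in a real pipeline the score would come from the reranker), and `rerank` and the sample documents are invented for the example.

```python
from typing import Callable, List, Tuple


def overlap_score(query: str, document: str) -> float:
    """Toy stand-in for a cross-encoder: fraction of query terms found in the document."""
    query_terms = set(query.lower().split())
    doc_terms = set(document.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


def rerank(
    query: str,
    candidates: List[str],
    score: Callable[[str, str], float],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Score each (query, candidate) pair and return the top_n by descending score."""
    scored = [(doc, score(query, doc)) for doc in candidates]
    # list.sort() is stable, so tied candidates keep their first-stage order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Candidates as they might come back from a first-stage vector or BM25 search.
first_stage = [
    "Shipping times for international orders",
    "How to reset your account password",
    "Password requirements and account security",
]
top = rerank("reset account password", first_stage, overlap_score, top_n=2)
```

Keeping the scorer as a plain callable means the toy function can later be swapped for a call to the hosted reranker without changing the surrounding pipeline code.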
alibaba/qwen-gte-rerank-v2 is served through TokenLX's OpenAI-compatible API: keep your existing OpenAI Python or Node SDK and switch `baseURL` to `https://api.tokenlx.ai`. TokenLX routes your requests to the optimal provider endpoint while preserving streaming, function-calling, and structured-output semantics for models that support them.
Sample code & API
TokenLX normalizes requests and responses across providers. Use any OpenAI SDK or our native SDK.
# Python: call the HTTP API directly
# Endpoint: POST https://api.tokenlx.ai/v1/rerank  (assumed rerank path; confirm in the TokenLX API reference)
# Headers: Authorization: Bearer $TOKENLX_API_KEY
# Body: { "model": "qwen-gte-rerank-v2", "query": "...", "documents": ["...", "..."] }  (assumed field names)
Replace $TOKENLX_API_KEY with your key from the dashboard (sk-aihubrouter-…).
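As a rough sketch of what a direct HTTP call might look like using only the Python standard library: the base URL, the bearer-token header, and the model name come from this page, while the `/v1/rerank` path and the `query`, `documents`, and `top_n` body fields are assumptions modeled on common rerank APIs and are not confirmed here. The helper names `build_rerank_request` and `send` are hypothetical.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.tokenlx.ai"  # from this page
RERANK_PATH = "/v1/rerank"  # ASSUMPTION: path is not confirmed by this page


def build_rerank_request(query, documents, api_key, top_n=3):
    """Build (but do not send) a POST request for a rerank call."""
    payload = {
        "model": "qwen-gte-rerank-v2",
        "query": query,          # assumed field name
        "documents": documents,  # assumed field name
        "top_n": top_n,          # assumed field name
    }
    return urllib.request.Request(
        BASE_URL + RERANK_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30.0):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request makes no network call; call send(request) to execute it.
request = build_rerank_request(
    "reset account password",
    ["How to reset your account password", "Shipping times"],
    api_key=os.environ.get("TOKENLX_API_KEY", "YOUR_KEY"),
)
```

Separating request construction from sending makes it easy to inspect the exact URL, headers, and JSON body before anything goes over the network, which helps when adjusting the assumed field names to match the actual API reference.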