TokenLX

Models

202 models · 12 providers

Showing 202 of 202 models
OPENROUTER
openrouter/claude-opus-4.8-fast
frontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by openrouterMar 3, 2026128K context26.7B tokens
$10.00/M tokens in$50.00/M tokens out
OPENROUTER
openrouter/claude-opus-4.8
frontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by openrouterJan 2, 2026128K context25.6B tokens
$5.00/M tokens in$25.00/M tokens out
BEDROCK
aws/claude-opus-4-8
frontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by awsFeb 5, 2026128K context30.5B tokens
$5.00/M tokens in$25.00/M tokens out
BEDROCK
aws/claude-opus-4-8-openai
frontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by awsFeb 12, 2026128K context26.2B tokens
$5.00/M tokens in$25.00/M tokens out
QWEN
alibaba/qwen3.7-max
Thinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaApr 5, 20261M context6.02B tokens
$1.77/M tokens in$5.31/M tokens out
TENCENT
tencent/deepseek-v4-pro
Thinking

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencentMar 28, 2026128K context3.60B tokens
$1.77/M tokens in$3.54/M tokens out
QWEN
alibaba/deepseek-v4-pro
Thinkingopen-weightcheap

DeepSeek — open-weight Chinese LLM family. Strong cost-to-quality ratio and good code generation.

by alibabaApr 10, 20261M context4.42B tokens
$1.77/M tokens in$3.54/M tokens out
TENCENT
tencent/deepseek-v4-flash
Thinking

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencentApr 1, 2026128K context4.87B tokens
$0.15/M tokens in$0.3/M tokens out
QWEN
alibaba/deepseek-v4-flash
Thinkingopen-weightcheap

DeepSeek — open-weight Chinese LLM family. Strong cost-to-quality ratio and good code generation.

by alibabaJan 23, 20261M context5.06B tokens
$0.15/M tokens in$0.3/M tokens out
KIMI
moonshot/Kimi-K2.6
Thinkingchineselong-context

Moonshot Kimi — long-context Chinese model known for strong document reading and comprehension.

by moonshotJan 27, 2026262K context26.8B tokens
$0.95/M tokens in$4.00/M tokens out
TENCENT
tencent/Kimi-K2.6

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencentApr 13, 2026262K context36.7B tokens
$0.96/M tokens in$3.99/M tokens out
QWEN
alibaba/Kimi-K2.6

Text generation model. Compatible with the OpenAI Chat Completions API.

by alibabaJan 9, 2026262K context29.9B tokens
$0.96/M tokens in$3.99/M tokens out
BEDROCK
aws/claude-opus-4-7
Thinkingfrontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by awsMar 6, 2026200K context31.7B tokens
$5.00/M tokens in$25.00/M tokens out
BEDROCK
aws/claude-opus-4-7-openai
Thinkingfrontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by awsMay 25, 2026200K context38.7B tokens
$5.00/M tokens in$25.00/M tokens out
BEDROCK
aws/claude-mythos
Thinking

Text generation model. Compatible with the OpenAI Chat Completions API.

by awsApr 28, 2026200K context0.51B tokens
$3.30/M tokens in$16.50/M tokens out
BEDROCK
aws/claude-mythos-openai
Thinking

Text generation model. Compatible with the OpenAI Chat Completions API.

by awsFeb 9, 2026200K context0.07B tokens
$3.30/M tokens in$16.50/M tokens out
QWEN
alibaba/qwen3.6-plus
Thinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaJan 24, 20261M context4.70B tokens
$0.3/M tokens in$1.77/M tokens out
QWEN
alibaba/qwen3.6-flash
chinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaFeb 20, 20261M context3.29B tokens
$0.18/M tokens in$1.06/M tokens out
QWEN
alibaba/wan2.7-image

Text-to-image model. Generates original images from natural-language prompts.

by alibabaFeb 14, 20260.34B tokens
from $0.03 per call
QWEN
alibaba/wan2.7-image-pro

Text-to-image model. Generates original images from natural-language prompts.

by alibabaMay 8, 20260.44B tokens
from $0.07 per call
QWEN
alibaba/wan2.7

Video generation model. Produces video clips from text or images.

by alibabaMay 22, 20260.51B tokens
from $0.09 per second
TENCENT
tencent/glm-5.1
Thinking

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencentApr 6, 2026200K context0.42B tokens
$0.89/M tokens in$3.54/M tokens out
QWEN
alibaba/glm-5.1
Thinking

Text generation model. Compatible with the OpenAI Chat Completions API.

by alibabaMay 14, 2026200K context0.45B tokens
$0.89/M tokens in$3.54/M tokens out
ZHIPU
zhipu/glm-5.1
Thinkingchinesebilingual

Zhipu GLM — Chinese LLM from Tsinghua. Solid bilingual support with academic training roots.

by zhipu aiFeb 8, 2026200K context0.50B tokens
$0.89/M tokens in$3.54/M tokens out
QWEN
alibaba/MiniMax-M2.7

Text generation model. Compatible with the OpenAI Chat Completions API.

by alibabaMay 21, 2026246K context0.09B tokens
$0.31/M tokens in$1.24/M tokens out
MINIMAX
minimax/MiniMax-M2.7
long-contextchinese

MiniMax — Chinese LLM family with hybrid attention for extreme-length contexts.

by minimaxMar 23, 2026246K context0.06B tokens
$0.3/M tokens in$1.20/M tokens out
TENCENT
tencent/MiniMax-M2.7

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencentJan 17, 2026246K context0.08B tokens
$0.31/M tokens in$1.24/M tokens out
BEDROCK
aws/claude-sonnet-4-6
Thinkingcodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by awsMar 3, 2026200K context34.4B tokens
$3.00/M tokens in$15.00/M tokens out
BEDROCK
aws/claude-sonnet-4-6-openai
Thinkingcodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by awsApr 12, 2026200K context23.3B tokens
$3.00/M tokens in$15.00/M tokens out
AZURE
azure/gpt-5.4
frontierflagship

Latest-generation frontier model with expanded reasoning and faster tool execution. Top choice when quality trumps cost.

by azureJan 10, 20261M context28.1B tokens
$5.00/M tokens in$15.00/M tokens out
GEMINI
google/gemini-3.1-flash-lite
cheapmultimodal

Gemini Flash — fast Google multimodal model with long context. Best value for volume tasks.

by googleFeb 15, 20261M context4.32B tokens
$0.25/M tokens in$1.50/M tokens out
GEMINI
google/gemini-3.1-flash-tts
cheapmultimodal

Gemini Flash — fast Google multimodal model with long context. Best value for volume tasks.

by googleMay 14, 20263.82B tokens
$0.5/M tokens in$10.00/M tokens out
QWEN
alibaba/qwen-image-2.0-pro
chinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaMar 5, 20263.20B tokens
from $0.07 per call
GEMINI
google/gemini-3.1-flash-image
cheapmultimodal

Gemini Flash — fast Google multimodal model with long context. Best value for volume tasks.

by googleFeb 26, 20263.42B tokens
$0.5/M tokens in$60.00/M tokens out
GEMINI
google/gemini-3.1-pro
multimodallong-context

Gemini Pro — Google's higher-quality Gemini tier. Strong reasoning with large context windows.

by googleMar 23, 20262M context5.17B tokens
$2.00/M tokens in$12.00/M tokens out
QWEN
alibaba/qwen3.5-27b
Thinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaApr 4, 2026262K context4.78B tokens
$0.09/M tokens in$0.71/M tokens out
QWEN
alibaba/qwen3.5-plus
chinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaMay 5, 20261M context3.84B tokens
$0.12/M tokens in$0.71/M tokens out
QWEN
alibaba/qwen3.5-397b-a17b
Thinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaJan 14, 2026262K context5.36B tokens
$0.18/M tokens in$1.06/M tokens out
QWEN
alibaba/qwen3.5-35b-a3b
Thinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaJan 6, 2026262K context4.04B tokens
$0.06/M tokens in$0.47/M tokens out
QWEN
alibaba/qwen3.5-omni
chinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaMar 12, 20264.76B tokens
$1.03/M tokens in$5.91/M tokens out
QWEN
alibaba/qwen3.5-122b-a10b
Thinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaMar 13, 2026262K context4.58B tokens
$0.12/M tokens in$0.94/M tokens out
QWEN
alibaba/qwen3.5-flash
chinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaJan 19, 20261M context3.38B tokens
$0.03/M tokens in$0.3/M tokens out
TENCENT
tencent/gpt-image-2

Text-to-image model. Generates original images from natural-language prompts.

by tencentApr 21, 20260.08B tokens
from $0.12 per call
KIMI
moonshot/Kimi-K2.5
Thinkingchineselong-context

Moonshot's Kimi K2.5 — Chinese-first model with exceptional long-context ability. Known for strong reading comprehension.

by moonshotMay 26, 2026262K context34.8B tokens
$0.6/M tokens in$3.00/M tokens out
AZURE
azure/gpt-image-2

Text-to-image model. Generates original images from natural-language prompts.

by azureMay 7, 20260.09B tokens
$5.00/M tokens in$30.00/M tokens out
TENCENT
tencent/Kimi-K2.5

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencentMar 12, 2026262K context35.3B tokens
$0.59/M tokens in$3.10/M tokens out
QWEN
alibaba/Kimi-K2.5

Text generation model. Compatible with the OpenAI Chat Completions API.

by alibabaFeb 10, 2026262K context31.2B tokens
$0.59/M tokens in$3.10/M tokens out
DUBAO
bytedance/doubao-seed-2.0-code
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceJan 16, 2026256K context0.40B tokens
$0.47/M tokens in$2.36/M tokens out
DUBAO
bytedance/doubao-seed-2.0-lite
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceMay 1, 2026262K context0.09B tokens
$0.09/M tokens in$0.53/M tokens out
DUBAO
bytedance/doubao-seed-2.0-pro
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceMay 20, 2026256K context0.36B tokens
$0.47/M tokens in$2.36/M tokens out
DUBAO
bytedance/doubao-seedance-2-0
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceApr 10, 20260.37B tokens
$6.79/M tokens out
DUBAO
bytedance/doubao-seedance-2-0-fast
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceMar 15, 20260.06B tokens
$5.46/M tokens out
DUBAO
bytedance/doubao-seed-2.0-mini
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceApr 14, 2026262K context0.43B tokens
$0.03/M tokens in$0.3/M tokens out
DUBAO
bytedance/doubao-seedream-5.0
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceMar 8, 20260.41B tokens
from $0.03 per call
QWEN
alibaba/glm-5
Thinking

Text generation model. Compatible with the OpenAI Chat Completions API.

by alibabaMay 19, 2026200K context0.07B tokens
$0.59/M tokens in$2.66/M tokens out
TENCENT
tencent/glm-5v-turbo

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencentMar 22, 2026200K context0.58B tokens
$0.74/M tokens in$3.25/M tokens out
ZHIPU
zhipu/glm-5
Thinkingchinesebilingual

Zhipu GLM — Chinese LLM from Tsinghua. Solid bilingual support with academic training roots.

by zhipu aiApr 25, 2026200K context0.08B tokens
$0.59/M tokens in$2.66/M tokens out
TENCENT
tencent/glm-5-turbo

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencentJan 22, 2026200K context0.53B tokens
$0.74/M tokens in$3.25/M tokens out
TENCENT
tencent/glm-5
Thinking

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencentFeb 23, 2026200K context0.08B tokens
$0.59/M tokens in$2.66/M tokens out
ZHIPU
zhipu/glm-5-turbo
Thinkingchinesebilingual

Zhipu GLM — Chinese LLM from Tsinghua. Solid bilingual support with academic training roots.

by zhipu aiFeb 8, 2026200K context0.37B tokens
$0.74/M tokens in$3.25/M tokens out
BEDROCK
aws/claude-opus-4-6
Thinkingfrontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by awsApr 7, 2026200K context33.0B tokens
$5.00/M tokens in$25.00/M tokens out
BEDROCK
aws/claude-opus-4-6-openai
Thinkingfrontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by awsMar 2, 2026200K context30.8B tokens
$5.00/M tokens in$25.00/M tokens out
QWEN
alibaba/kling-v3-video

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by alibabaApr 20, 20260.51B tokens
from $0.09 per second
QWEN
alibaba/kling-v3-omni

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by alibabaMay 8, 20260.42B tokens
from $0.03 per call
QWEN
alibaba/kling-v3-omni-video

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by alibabaMar 12, 20260.50B tokens
from $0.09 per second
VIDU
vidu/viduq3-mix

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by viduFeb 28, 20260.48B tokens
from $0.75 per call
VIDU
vidu/viduq3-pro

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by viduApr 11, 20260.06B tokens
from $0.75 per call
VIDU
vidu/viduq3

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by viduMar 11, 20260.07B tokens
from $0.75 per call
VIDU
vidu/viduq3-turbo

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by viduJan 6, 20260.45B tokens
from $0.35 per call
VIDU
vidu/viduq3-pro-fast

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by viduMay 6, 20260.47B tokens
from $0.12 per second
QWEN
alibaba/MiniMax-M2.5

Text generation model. Compatible with the OpenAI Chat Completions API.

by alibabaFeb 23, 2026246K context0.07B tokens
$0.31/M tokens in$1.24/M tokens out
MINIMAX
minimax/MiniMax-M2.5
long-contextchinese

MiniMax — Chinese LLM family with hybrid attention for extreme-length contexts.

by minimaxJan 21, 2026246K context0.05B tokens
$0.3/M tokens in$1.20/M tokens out
TENCENT
tencent/MiniMax-M2.5

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencentApr 15, 2026246K context0.08B tokens
$0.31/M tokens in$1.24/M tokens out
QWEN
alibaba/wan2.6-image

Text-to-image model. Generates original images from natural-language prompts.

by alibabaJan 13, 20260.06B tokens
from $0.03 per call
QWEN
alibaba/wan2.6

Video generation model. Produces video clips from text or images.

by alibabaJan 23, 20260.09B tokens
from $0.09 per second
QWEN
alibaba/wan2.6-flash

Video generation model. Produces video clips from text or images.

by alibabaFeb 6, 20260.39B tokens
from $0.04 per second
AZURE
azure/gpt-5.2
frontieragent

Upgraded GPT-5 with longer context and improved latency. Production default for demanding agentic workloads.

by azureApr 8, 2026400K context25.4B tokens
$1.75/M tokens in$14.00/M tokens out
AZURE
azure/gpt-5.2-codex
codingagent

Codex variant of GPT-5.2 tuned for software engineering. Specialized for repo-aware coding agents.

by azureApr 28, 2026400K context33.8B tokens
$1.75/M tokens in$14.00/M tokens out
DUBAO
bytedance/doubao-seedream-4.5
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceApr 12, 20260.51B tokens
from $0.04 per call
BEDROCK
aws/claude-opus-4-5
Thinkingfrontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by awsMay 8, 2026200K context22.1B tokens
$5.00/M tokens in$25.00/M tokens out
BEDROCK
aws/claude-opus-4-5-openai
Thinkingfrontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by awsMar 23, 2026200K context30.0B tokens
$5.00/M tokens in$25.00/M tokens out
GEMINI
google/gemini-3-pro-image
multimodallong-context

Gemini Pro — Google's higher-quality Gemini tier. Strong reasoning with large context windows.

by googleMay 14, 20264.02B tokens
$2.00/M tokens in$120.00/M tokens out
GEMINI
google/gemini-3-pro
multimodallong-context

Gemini Pro — Google's higher-quality Gemini tier. Strong reasoning with large context windows.

by googleApr 28, 20261M context4.03B tokens
$2.00/M tokens in$12.00/M tokens out
MINIMAX
minimax/MiniMax-M2.1
long-contextchinese

MiniMax — Chinese LLM family with hybrid attention for extreme-length contexts.

by minimaxFeb 17, 2026246K context0.09B tokens
$0.3/M tokens in$1.20/M tokens out
ZHIPU
zhipu/glm-4.7
Thinkingchinesebilingual

Zhipu GLM — Chinese LLM from Tsinghua. Solid bilingual support with academic training roots.

by zhipu aiFeb 5, 2026200K context0.07B tokens
$0.3/M tokens in$1.18/M tokens out
DUBAO
bytedance/doubao-seed-1.8
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceApr 7, 2026128K context0.08B tokens
$0.12/M tokens in$1.18/M tokens out
BEDROCK
aws/claude-haiku-4-5
fastcheap

Claude Haiku — fast, affordable AWS model. Best for high-volume real-time tasks.

by awsJan 7, 2026200K context0.06B tokens
$1.00/M tokens in$5.00/M tokens out
BEDROCK
aws/claude-haiku-4-5-openai
fastcheap

Claude Haiku — fast, affordable AWS model. Best for high-volume real-time tasks.

by awsFeb 18, 2026200K context0.42B tokens
$1.00/M tokens in$5.00/M tokens out
GEMINI
google/veo3.1

Video generation model. Produces video clips from text or images.

by googleJan 23, 20260.09B tokens
from $0.4 per second
GEMINI
google/veo3.1-fast

Video generation model. Produces video clips from text or images.

by googleJan 14, 20260.57B tokens
from $0.15 per second
MINIMAX
minimax/MiniMax-M2
long-contextchinese

MiniMax — Chinese LLM family with hybrid attention for extreme-length contexts.

by minimaxMar 18, 2026246K context0.37B tokens
$0.3/M tokens in$1.20/M tokens out
GOOGLE
google/virtual-try-on-images

Text-to-image model. Generates original images from natural-language prompts.

by googleFeb 27, 20260.05B tokens
from $0.06 per call
QWEN
alibaba/aitryon-plus

Image generation model. Creates or edits images from text prompts.

by alibabaFeb 18, 20260.38B tokens
from $0.07 per call
QWEN
alibaba/happyhouse-1.0

Video generation model. Produces video clips from text or images.

by alibabaMar 11, 20260.08B tokens
from $0.13 per second
QWEN
alibaba/happyhouse-1.0-video-edit

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by alibabaMar 2, 20260.45B tokens
from $0.13 per second
ZHIPU
zhipu/glm-4.6
Thinkingchinesebilingual

Zhipu GLM — Chinese LLM from Tsinghua. Solid bilingual support with academic training roots.

by zhipu aiJan 4, 2026128K context0.41B tokens
$0.44/M tokens in$2.07/M tokens out
ZHIPU
zhipu/glm-4.6v
chinesebilingual

Zhipu GLM — Chinese LLM from Tsinghua. Solid bilingual support with academic training roots.

by zhipu aiApr 18, 2026128K context0.32B tokens
$0.15/M tokens in$0.44/M tokens out
AZURE
azure/sora-video-generate-2.0

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by azureFeb 13, 20260.06B tokens
from $0.1 per second
BEDROCK
aws/claude-sonnet-4-5
Thinkingcodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by awsFeb 2, 2026200K context33.1B tokens
$3.00/M tokens in$15.00/M tokens out
TENCENT
tencent/deepseek-v3.2

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencentMar 21, 2026128K context35.8B tokens
$0.3/M tokens in$0.44/M tokens out
QWEN
alibaba/deepseek-v3.2
open-weightcheap

DeepSeek — open-weight Chinese LLM family. Strong cost-to-quality ratio and good code generation.

by alibabaJan 27, 2026128K context24.2B tokens
$0.3/M tokens in$0.44/M tokens out
BEDROCK
aws/claude-sonnet-4-5-openai
Thinkingcodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by awsApr 15, 2026200K context24.2B tokens
$3.00/M tokens in$15.00/M tokens out
QWEN
alibaba/wan2.5-image

Text-to-image model. Generates original images from natural-language prompts.

by alibabaFeb 18, 20260.55B tokens
from $0.03 per call
QWEN
alibaba/wan2.5

Video generation model. Produces video clips from text or images.

by alibabaFeb 24, 20260.55B tokens
from $0.04 per second
DUBAO
bytedance/doubao-seedance-1-5-pro
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceMay 2, 20260.48B tokens
$2.36/M tokens out
TENCENT
tencent/tencent-video-edit-1.0

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by tencentMay 3, 20260.08B tokens
from $0.000039 per second
TENCENT
tencent/tencent-video-enhance-1.0

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by tencentFeb 3, 20260.07B tokens
from $0.0045 per second
DUBAO
bytedance/doubao-seedream-4.0
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceApr 17, 20260.07B tokens
from $0.03 per call
QWEN
alibaba/video-style-transform

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by alibabaMay 27, 20260.09B tokens
from $0.03 per second
MINIMAX
minimax/minimax-music-2.0
long-contextchinese

MiniMax — Chinese LLM family with hybrid attention for extreme-length contexts.

by minimaxMar 25, 20260.06B tokens
from $0.03 per call
QWEN
alibaba/qwen-flash
chinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaJan 11, 20261M context4.17B tokens
$0.02/M tokens in$0.22/M tokens out
DUBAO
bytedance/deepseek-v3.1
open-weightcheapreasoning

Upgraded DeepSeek V3.1 with improved reasoning and better tool calling. Pareto-optimal on cost vs quality.

by bytedanceMay 26, 2026128K context31.4B tokens
$0.59/M tokens in$1.77/M tokens out
AZURE
azure/gpt5

Text generation model. Compatible with the OpenAI Chat Completions API.

by azureMay 19, 2026128K context0.08B tokens
$1.25/M tokens in$10.00/M tokens out
AZURE
azure/gpt5-chat

Text generation model. Compatible with the OpenAI Chat Completions API.

by azureJan 28, 2026200K context0.40B tokens
$1.25/M tokens in$10.00/M tokens out
AZURE
azure/gpt5-nano

Text generation model. Compatible with the OpenAI Chat Completions API.

by azureMar 16, 2026128K context0.44B tokens
$0.05/M tokens in$0.4/M tokens out
AZURE
azure/gpt5-mini

Text generation model. Compatible with the OpenAI Chat Completions API.

by azureApr 27, 2026128K context0.08B tokens
$0.25/M tokens in$2.00/M tokens out
DUBAO
bytedance/ByteDance-Seed-SC

Text generation model. Compatible with the OpenAI Chat Completions API.

by bytedanceMay 19, 2026128K context0.08B tokens
$0.12/M tokens in$0.3/M tokens out
DUBAO
bytedance/doubao-witty-remark
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceFeb 14, 20260.48B tokens
from $0.07 per call
QWEN
alibaba/qwen-image
chinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaFeb 4, 20263.42B tokens
from $0.03 per call
TENCENTIMG
tencent/tencent-image-translate-1.0

Text-to-image model. Generates original images from natural-language prompts.

by tencentJan 3, 20260.06B tokens
from $0.03 per call
GEMINI
google/gemini-image-2.0-flash
cheapmultimodal

Gemini Flash — fast Google multimodal model with long context. Best value for volume tasks.

by googleApr 18, 20264.78B tokens
$0.15/M tokens in$30.00/M tokens out
QWEN
alibaba/wan2.2-animate-mix

Video generation model. Produces video clips from text or images.

by alibabaMar 26, 20260.46B tokens
from $0.13 per second
QWEN
alibaba/wan2.2

Video generation model. Produces video clips from text or images.

by alibabaMay 27, 20260.07B tokens
from $0.02 per second
DUBAO
bytedance/doubao-seed-1.6
Thinkingchinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceJan 9, 2026128K context0.07B tokens
$0.12/M tokens in$1.18/M tokens out
DUBAO
bytedance/doubao-seedance
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceMar 6, 20260.46B tokens
$1.48/M tokens in$1.48/M tokens out
DUBAO
bytedance/doubao-seedance-pro
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceMay 22, 20260.52B tokens
$2.21/M tokens out
DUBAO
bytedance/doubao-realtime-audio-transcription
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceApr 9, 20260.08B tokens
from $0.07 per call
DUBAO
bytedance/doubao-embedding
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceApr 13, 2026128K context0.08B tokens
$0.07/M tokens in
DUBAO
bytedance/doubao-seed-1.6-flash
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceMay 14, 2026128K context0.43B tokens
$0.02/M tokens in$0.22/M tokens out
VIDU
vidu/vidu-tts

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by viduMay 23, 20260.09B tokens
$100.00/M tokens out
VIDU
vidu/vidu-voice-clone

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by viduFeb 22, 20260.52B tokens
from $1.50 per call
QWEN
alibaba/qwen-text-embedding-v4
chinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaJan 19, 2026128K context3.35B tokens
$0.07/M tokens in
QWEN
alibaba/qwen-multimodal-embedding-v1
chinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaMar 1, 2026128K context3.56B tokens
$0.1/M tokens in
QWEN
alibaba/qwen-gte-rerank-v2
chinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaMar 2, 2026128K context4.76B tokens
$0.12/M tokens in
QWEN
alibaba/deepseek-r1-0528
open-weightcheap

DeepSeek — open-weight Chinese LLM family. Strong cost-to-quality ratio and good code generation.

by alibabaApr 21, 2026128K context26.3B tokens
$0.59/M tokens in$2.36/M tokens out
BEDROCK
aws/claude-4-sonnet
Thinkingagentcoding

Claude 4 Sonnet — balance of speed, quality, and cost for agentic workflows and production coding.

by awsMar 10, 2026200K context0.54B tokens
$3.00/M tokens in$15.00/M tokens out
BEDROCK
aws/claude-4-sonnet-openai
Thinkingcodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by awsMar 9, 2026200K context0.08B tokens
$3.00/M tokens in$15.00/M tokens out
GEMINI
google/veo3

Video generation model. Produces video clips from text or images.

by googleFeb 28, 20260.47B tokens
from $0.4 per second
QWEN
alibaba/qwen3-32b
Thinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaJan 27, 2026128K context3.76B tokens
$0.3/M tokens in$1.18/M tokens out
QWEN
alibaba/qwen3-30b
Thinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaApr 21, 2026128K context5.51B tokens
$0.11/M tokens in$0.44/M tokens out
QWEN
alibaba/qwen3-235b
Thinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaApr 22, 2026128K context4.42B tokens
$0.3/M tokens in$1.18/M tokens out
QWEN
alibaba/qwen3-coder-plus
chinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaJan 6, 20261M context5.33B tokens
$0.59/M tokens in$2.36/M tokens out
QWEN
alibaba/qwen3-coder-flash
chinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibabaFeb 16, 20261M context4.25B tokens
$0.15/M tokens in$0.59/M tokens out
AZURE
azure/gpt-image-1

Latest Azure image model with improved realism and editing. Supports inpainting, outpainting, and mask-guided edits.

by azureJan 8, 20260.57B tokens
$5.00/M tokens in$40.00/M tokens out
VIDU
vidu/vidu-2.0

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by viduFeb 20, 20260.41B tokens
from $0.1 per call
VIDU
vidu/viduq2-pro

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by viduMay 8, 20260.39B tokens
from $1.30 per call
VIDU
vidu/viduq2

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by viduApr 12, 20260.42B tokens
from $0.35 per call
VIDU
vidu/viduq2-turbo

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by viduJan 27, 20260.08B tokens
from $0.43 per call
VIDU
vidu/viduq2-pro-fast

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by viduMay 21, 20260.09B tokens
from $0.18 per call
DUBAO
bytedance/doubao-seededit-3.0
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceJan 23, 20260.09B tokens
from $0.03 per call
DUBAO
bytedance/doubao-seedream-3.0
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceMay 26, 20260.61B tokens
from $0.04 per call
AZURE
azure/gpt-o3
reasoningfrontier

Next-generation reasoning model succeeding o1. Solves problems that stumped previous models, at a reasonable cost.

by azureApr 7, 2026200K context0.09B tokens
$2.00/M tokens in$8.00/M tokens out
AZURE
azure/gpt-o3-mini
reasoningthinking

Azure o-series — reasoning-first models that think before answering. Best for hard math, science, and code.

by azureMay 27, 2026200K context0.08B tokens
$1.10/M tokens in$4.40/M tokens out
AZURE
azure/gpt-4.1
codinglong-contexttools

Code-focused GPT-4 successor with stronger instruction following and 1M+ context. Great for long-document analysis and agentic coding.

by azureFeb 26, 20261M context24.9B tokens
$2.00/M tokens in$8.00/M tokens out
AZURE
azure/gpt-4.1-nano
cheapestfast

Tiny model optimized for classification and structured output. Cheapest in the GPT-4 family.

by azureFeb 9, 20261M context27.2B tokens
$0.1/M tokens in$0.4/M tokens out
AZURE
azure/gpt-4.1-mini
cheaplong-context

Smaller GPT-4.1 with the same 1M context at a fraction of the cost. The new default for long-context RAG and bulk processing.

by azureJan 26, 20261M context36.9B tokens
$0.4/M tokens in$1.60/M tokens out
MINIMAX
minimax/minimax-video-1.0
long-contextchinese

MiniMax — Chinese LLM family with hybrid attention for extreme-length contexts.

by minimaxMar 12, 20260.48B tokens
from $0.28 per call
DUBAO
bytedance/doubao-seaweed
chinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedanceJan 16, 20260.36B tokens
$1.48/M tokens in$1.48/M tokens out
GEMINI
google/gemini-2.5-pro
Thinkingreasoningfrontierlong-context

Gemini 2.5 Pro — Google's top reasoning model with thinking mode. Frontier performance on coding and math.

by googleJan 10, 20262M context27.5B tokens
$1.25/M tokens in$10.00/M tokens out
GEMINI
google/gemini-2.5-pro-tts
multimodallong-context

Gemini Pro — Google's higher-quality Gemini tier. Strong reasoning with large context windows.

by googleJan 26, 202639.0B tokens
$1.00/M tokens in$20.00/M tokens out
GEMINI
google/gemini-2.5-flash
Thinkingcheapmultimodal

Gemini Flash — fast Google multimodal model with long context. Best value for volume tasks.

by googleApr 5, 20261M context40.1B tokens
$0.3/M tokens in$2.50/M tokens out
GEMINI
google/gemini-2.5-flash-image
cheapmultimodal

Gemini Flash — fast Google multimodal model with long context. Best value for volume tasks.

by googleApr 27, 202629.4B tokens
$0.3/M tokens in$30.00/M tokens out
GEMINI
google/gemini-2.5-flash-tts
cheapmultimodal

Gemini Flash — fast Google multimodal model with long context. Best value for volume tasks.

by googleMay 19, 202631.6B tokens
$0.5/M tokens in$10.00/M tokens out
BEDROCK
aws/claude-3-7-sonnet
Thinkingcodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by awsJan 3, 2026200K context3.86B tokens
$3.00/M tokens in$15.00/M tokens out
BEDROCK
aws/claude-3-7-sonnet-openai
Thinkingcodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by awsMay 14, 2026200K context5.01B tokens
$3.00/M tokens in$15.00/M tokens out
VIDU
vidu/vidu-q1

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by viduMar 10, 20260.41B tokens
from $0.4 per call
VIDU
vidu/viduq1-classic

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by viduMay 20, 20260.38B tokens
from $0.4 per call
QWEN
alibaba/deepseek-r1
reasoningopen-weightthinking

DeepSeek R1 — a reasoning-first model trained with reinforcement learning. Competes with o1-class models at much lower cost.

by alibabaMar 17, 2026128K context31.6B tokens
$0.59/M tokens in$2.36/M tokens out
QWEN
alibaba/wanx2.1-plus

Text-to-image model. Generates original images from natural-language prompts.

by alibabaApr 19, 20260.08B tokens
from $0.1 per second
QWEN
alibaba/wanx2.1-turbo

Text-to-image model. Generates original images from natural-language prompts.

by alibabaMay 11, 20260.09B tokens
from $0.04 per second
QWEN
alibaba/deepseek-v3
open-weightcheapcoding

Open-weight DeepSeek V3 — MoE architecture delivering frontier-adjacent quality at a fraction of the cost.

by alibabaFeb 3, 2026128K context33.8B tokens
$0.3/M tokens in$1.18/M tokens out
GEMINI
google/gemini-2.0-flash
multimodaltoolssearch

Next-gen Gemini Flash with improved reasoning and native tool use. Drop-in upgrade to 1.5 Flash.

by googleMay 4, 20261M context22.1B tokens
$0.15/M tokens in$0.6/M tokens out
GEMINI
google/gemini-2.0-flash-lite
cheapmultimodal

Gemini Flash — fast Google multimodal model with long context. Best value for volume tasks.

by googleJan 13, 20261M context28.9B tokens
$0.08/M tokens in$0.3/M tokens out
SORA
azure/sora-video-generate-1.0

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by azureMar 22, 20260.36B tokens
from $0.1 per second
DUBAO
bytedance/dubao_reasoning_lite_32k

Text generation model. Compatible with the OpenAI Chat Completions API.

by bytedanceMar 13, 202632K context0.08B tokens
$0.04/M tokens in$0.09/M tokens out
VOLCENGINE
volcengine/volcengine-veImageX

Text-to-image model. Generates original images from natural-language prompts.

by volcengineJan 22, 20260.35B tokens
from $0.01 per call
VIDU
vidu/vidu-1.5

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by viduJan 16, 20260.33B tokens
from $0.07 per second
QWEN
alibaba/qwen-plus
chinesebalanced

Balanced Qwen tier with strong Chinese + reasonable cost. The pragmatic default for production Chinese apps.

by alibabaMar 9, 20261M context5.70B tokens
$0.12/M tokens in$0.3/M tokens out
QWEN
alibaba/qwen-max
chinesebilingualflagship

Alibaba's flagship Qwen model. Strong bilingual (Chinese / English) performance, especially tuned for enterprise scenarios.

by alibabaJan 7, 2026262K context24.7B tokens
$0.35/M tokens in$1.42/M tokens out
VIDU
vidu/vidu-t1

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by viduJan 19, 20260.07B tokens
from $0.22 per call
AZURE
azure/gpt-o1
reasoningthinking

Reasoning-first model that thinks before answering. Best for math, science, and multi-step problem solving.

by azureJan 9, 2026128K context0.06B tokens
$15.00/M tokens in$60.00/M tokens out
ZHIPU
zhipu/charglm-4

Text generation model. Compatible with the OpenAI Chat Completions API.

by zhipu aiJan 2, 202632K context0.47B tokens
$0.15/M tokens in$0.15/M tokens out
DUBAO
bytedance/dubao_pro_32k

Text generation model. Compatible with the OpenAI Chat Completions API.

by bytedanceFeb 3, 202632K context0.09B tokens
$0.12/M tokens in$0.3/M tokens out
DUBAO
bytedance/dubao_pro_32k_init

Text generation model. Compatible with the OpenAI Chat Completions API.

by bytedanceApr 20, 202632K context0.42B tokens
$0.12/M tokens in$0.3/M tokens out
BEDROCK
aws/claude3-5-sonnet
codingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by awsMay 2, 2026200K context0.49B tokens
$3.00/M tokens in$15.00/M tokens out
ZHIPU
zhipu/zhipu-vidu2-image

Text-to-image model. Generates original images from natural-language prompts.

by zhipu aiJan 14, 20260.37B tokens
from $0.15 per call
ZHIPU
zhipu/zhipu-viduq1

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by zhipu aiApr 8, 20260.40B tokens
from $0.15 per call
ZHIPU
zhipu/zhipu-viduq2-reference

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by zhipu aiMay 17, 20260.07B tokens
from $0.15 per call
ZHIPU
zhipu/zhipu-doubao-seedance

Video generation model. Produces video clips from text or images.

by zhipu aiMay 11, 20260.06B tokens
$1.48/M tokens in$1.48/M tokens out
ZHIPU
zhipu/zhipu-doubao-seedance-pro

Video generation model. Produces video clips from text or images.

by zhipu aiApr 3, 20260.07B tokens
$2.21/M tokens in$2.21/M tokens out
ZHIPU
zhipu/zhipu-doubao-seedream-4.0

Video generation model. Produces video clips from text or images.

by zhipu aiMay 8, 20260.55B tokens
from $0.03 per call
DUBAO
bytedance/dubao-pro-32k-large

Text generation model. Compatible with the OpenAI Chat Completions API.

by bytedanceJan 5, 202632K context0.08B tokens
$0.12/M tokens in$0.3/M tokens out
AZURE
azure/gpt-4o
visiontoolsstreaming

Flagship multimodal model from Azure with native text, vision, and voice understanding. Strong at general-purpose reasoning and instruction following.

by azureMar 12, 2026128K context24.0B tokens
$2.50/M tokens in$10.00/M tokens out
AZURE
azure/gpt-4o-mini
cheapfastvision

Cheap and fast sibling of GPT-4o. Best value for high-volume classification, extraction, and routing tasks.

by azureFeb 18, 2026128K context27.9B tokens
$0.15/M tokens in$0.6/M tokens out
QWEN
alibaba/qwen-long
long-contextchinese

Qwen variant with very long context (10M+ tokens). Purpose-built for long-document analysis and codebase-level tasks.

by alibabaMar 7, 202610M context4.11B tokens
$0.07/M tokens in$0.3/M tokens out
QWEN
alibaba/qwen-turbo
chinesecheapfast

Smallest, cheapest Qwen. Good for classification, routing, and high-volume light tasks in Chinese.

by alibabaJan 9, 20261M context5.59B tokens
$0.04/M tokens in$0.09/M tokens out
BEDROCK
aws/claude3-haiku
fastcheap

Claude Haiku — fast, affordable AWS model. Best for high-volume real-time tasks.

by awsMar 3, 2026200K context0.08B tokens
$0.25/M tokens in$1.25/M tokens out
BEDROCK
aws/claude-3-sonnet
codingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by awsMay 17, 2026200K context3.67B tokens
$3.00/M tokens in$15.00/M tokens out
GEMINI
google/gemini-1.5-pro
multimodallong-context

Gemini Pro — Google's higher-quality Gemini tier. Strong reasoning with large context windows.

by googleFeb 21, 202631.8B tokens
$0.31/M tokens in$2.50/M tokens out
GEMINI
google/gemini-1.5-flash
long-contextmultimodalcheap

Google's fast multimodal model with 1M context. Remarkable value for long-document and video input tasks.

by googleJan 26, 20263.67B tokens
$0.02/M tokens in$0.15/M tokens out
AZURE
azure/dall-e-3

Azure's production image generator. Known for strong prompt adherence and coherent in-image text rendering.

by azureMay 28, 20260.42B tokens
from $0.04 per call
AZURE
azure/gpt4

Text generation model. Compatible with the OpenAI Chat Completions API.

by azureApr 18, 202632K context0.46B tokens
$30.00/M tokens in$60.00/M tokens out