Models

235 models · 13 providers

Showing 235 of 235 models

alibaba/glm-5.2

0.06B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by alibaba·Jan 15, 2026·128K context0.06B tokens

Text

$0.89/M tokens in$3.55/M tokens out

alibaba/Kimi-K2.7-Code

26.0B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by alibaba·Feb 22, 2026·128K context26.0B tokens

Text

$0.96/M tokens in$3.99/M tokens out

aws/claude-fable-5

0.37B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by aws·Apr 20, 2026·128K context0.37B tokens

Text

$10.00/M tokens in$50.00/M tokens out

openrouter/claude-fable-5

0.33B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by openrouter·May 6, 2026·128K context0.33B tokens

Text

$10.00/M tokens in$50.00/M tokens out

aws/claude-sonnet-5

0.08B tokenscodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by aws·Apr 3, 2026·128K context0.08B tokens

Text

$2.00/M tokens in$10.00/M tokens out

openrouter/claude-sonnet-5

0.08B tokenscodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by openrouter·Apr 9, 2026·128K context0.08B tokens

Text

$2.00/M tokens in$10.00/M tokens out

deepseek/deepseek-v4-flash

5.06B tokensThinkingopen-weightcheap

DeepSeek — open-weight Chinese LLM family. Strong cost-to-quality ratio and good code generation.

by deepseek·Jan 23, 2026·1M context5.06B tokens

Text

$0.15/M tokens in$0.3/M tokens out

deepseek/deepseek-v4-pro

4.42B tokensThinkingopen-weightcheap

DeepSeek — open-weight Chinese LLM family. Strong cost-to-quality ratio and good code generation.

by deepseek·Apr 10, 2026·1M context4.42B tokens

Text

$0.44/M tokens in$0.89/M tokens out

bytedance/doubao-seed-2.1-pro

0.06B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Jan 17, 2026·128K context0.06B tokens

Text

$0.89/M tokens in$4.44/M tokens out

bytedance/doubao-seed-2.1-turbo

0.45B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Feb 16, 2026·128K context0.45B tokens

Text

$0.44/M tokens in$2.22/M tokens out

bytedance/doubao-seed-evolving

0.56B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Mar 8, 2026·128K context0.56B tokens

Text

$0.89/M tokens in$4.44/M tokens out

bytedance/doubao-seed3d-2.0

0.06B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Mar 5, 20260.06B tokens

Image

—$11.83/M tokens out

bytedance/doubao-seedance-2-0-mini

0.38B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Apr 4, 20260.38B tokens

Video

—$3.40/M tokens out

bytedance/doubao-seedream-5.0-lite

0.07B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Mar 3, 20260.07B tokens

Image

from $0.03 per call

bytedance/doubao-seedream-5.0-pro

0.45B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Feb 24, 20260.45B tokens

Image

from $0.09 per call

wavespeed/eleven-v3

0.41B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by wavespeed·Mar 8, 2026·128K context0.41B tokens

Text

$100.00/M tokens in—

google/gemini-embedding-001

4.70B tokens

Embedding / reranking model. Maps text into semantic vectors for search, retrieval, and classification.

by google·May 12, 2026·128K context4.70B tokens

Embedding

$0.15/M tokens in—

google/gemini-embedding-2

3.07B tokens

Embedding / reranking model. Maps text into semantic vectors for search, retrieval, and classification.

by google·Mar 5, 2026·128K context3.07B tokens

Embedding

$0.2/M tokens in—

zhipu/glm-5.2

0.09B tokenschinesebilingual

Zhipu GLM — Chinese LLM from Tsinghua. Solid bilingual support with academic training roots.

by zhipu ai·Mar 9, 2026·128K context0.09B tokens

Text

$1.18/M tokens in$4.14/M tokens out

zhipu/glm-5v-turbo

0.38B tokenschinesebilingual

Zhipu GLM — Chinese LLM from Tsinghua. Solid bilingual support with academic training roots.

by zhipu ai·May 28, 2026·128K context0.38B tokens

Text

$0.74/M tokens in$3.25/M tokens out

azure/gpt-5.6-luna

27.4B tokensfrontierreasoning

Azure GPT-5 family — next-gen flagship with extended reasoning, stronger tools, and safer behavior.

by azure·Apr 11, 2026·128K context27.4B tokens

Text

$5.00/M tokens in$30.00/M tokens out

azure/gpt-5.6-sol

20.8B tokensfrontierreasoning

Azure GPT-5 family — next-gen flagship with extended reasoning, stronger tools, and safer behavior.

by azure·Jan 7, 2026·128K context20.8B tokens

Text

$5.00/M tokens in$30.00/M tokens out

azure/gpt-5.6-terra

30.2B tokensfrontierreasoning

Azure GPT-5 family — next-gen flagship with extended reasoning, stronger tools, and safer behavior.

by azure·May 11, 2026·128K context30.2B tokens

Text

$5.00/M tokens in$30.00/M tokens out

alibaba/happyhouse-1.1

0.51B tokens

Video generation model. Produces video clips from text or images.

by alibaba·Apr 12, 20260.51B tokens

Video

from $0.13 per second

alibaba/happyoyster

0.06B tokens

Video generation model. Produces video clips from text or images.

by alibaba·Feb 5, 20260.06B tokens

Video

from $0.74 per call

bytedance/hitem3d-2.0

0.35B tokens

Image generation model. Creates or edits images from text prompts.

by bytedance·Mar 10, 20260.35B tokens

Image

—$11.83/M tokens out

bytedance/hyper3d-gen2

0.08B tokens

Image generation model. Creates or edits images from text prompts.

by bytedance·Mar 7, 20260.08B tokens

Image

—$11.83/M tokens out

moonshot/Kimi-K3

0.49B tokenschineselong-context

Moonshot Kimi — long-context Chinese model known for strong document reading and comprehension.

by moonshot·May 16, 2026·128K context0.49B tokens

Text

$2.96/M tokens in$14.79/M tokens out

minimax/MiniMax-M3

0.06B tokenslong-contextchinese

MiniMax — Chinese LLM family with hybrid attention for extreme-length contexts.

by minimax·Apr 19, 2026·128K context0.06B tokens

Text

$0.9/M tokens in$3.60/M tokens out

wavespeed/openai-whisper

0.46B tokens

Audio model. Handles transcription or speech synthesis.

by wavespeed·May 20, 20260.46B tokens

Audio

from $0.0010 per second

wavespeed/qwen-image-layered

5.54B tokenschinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by wavespeed·Mar 25, 20265.54B tokens

Image

from $0.1 per call

tencent/gemini-2.5-flash-image

30.4B tokens

Text-to-image model. Generates original images from natural-language prompts.

by tencent·Apr 1, 202630.4B tokens

Image

from $0.04 per call

tencent/gemini-3-pro-image

3.98B tokens

Text-to-image model. Generates original images from natural-language prompts.

by tencent·Jan 20, 20263.98B tokens

Image

from $0.15 per call

tencent/gemini-3.1-flash-image

4.44B tokens

Text-to-image model. Generates original images from natural-language prompts.

by tencent·May 10, 20264.44B tokens

Image

from $0.05 per call

tencent/glm-5.2

0.07B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencent·May 7, 2026·128K context0.07B tokens

Text

$0.89/M tokens in$3.55/M tokens out

tencent/Kimi-K2.7-Code

32.5B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencent·Jan 10, 2026·128K context32.5B tokens

Text

$0.96/M tokens in$3.99/M tokens out

tripo/tripo-h3.1

0.32B tokens

Image generation model. Creates or edits images from text prompts.

by alibaba·Apr 6, 20260.32B tokens

Image

from $0.1 per call

tripo/tripo-p1.0

0.06B tokens

Image generation model. Creates or edits images from text prompts.

by alibaba·Apr 7, 20260.06B tokens

Image

from $0.31 per call

alibaba/wan2.2-animate-move

0.07B tokens

Video generation model. Produces video clips from text or images.

by alibaba·Feb 5, 20260.07B tokens

Video

from $0.0030 per second

alibaba/wan2.2-s2v

0.06B tokens

Video generation model. Produces video clips from text or images.

by alibaba·May 5, 20260.06B tokens

Video

from $0.07 per second

alibaba/wan2.7-image-realtime

0.35B tokens

Text-to-image model. Generates original images from natural-language prompts.

by alibaba·Apr 28, 20260.35B tokens

Image

from $0.15 per call

moonshot/Kimi-K2.7-Code

26.5B tokenschineselong-context

Moonshot Kimi — long-context Chinese model known for strong document reading and comprehension.

by moonshot·May 10, 2026·128K context26.5B tokens

Text

$0.96/M tokens in$3.99/M tokens out

openrouter/claude-opus-4.8-fast

26.7B tokensfrontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by openrouter·Mar 3, 2026·128K context26.7B tokens

Text

$10.00/M tokens in$50.00/M tokens out

openrouter/claude-opus-4.8

25.6B tokensfrontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by openrouter·Jan 2, 2026·128K context25.6B tokens

Text

$5.00/M tokens in$25.00/M tokens out

aws/claude-opus-4-8

30.5B tokensfrontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by aws·Feb 5, 2026·128K context30.5B tokens

Text

$5.00/M tokens in$25.00/M tokens out

aws/claude-opus-4-8-openai

26.2B tokensfrontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by aws·Feb 12, 2026·128K context26.2B tokens

Text

$5.00/M tokens in$25.00/M tokens out

alibaba/qwen3.7-max

6.02B tokensThinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Apr 5, 2026·1M context6.02B tokens

Text

$1.77/M tokens in$5.32/M tokens out

tencent/deepseek-v4-pro

3.60B tokensThinking

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencent·Mar 28, 2026·128K context3.60B tokens

Text

$1.77/M tokens in$3.55/M tokens out

tencent/deepseek-v4-flash

4.87B tokensThinking

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencent·Apr 1, 2026·128K context4.87B tokens

Text

$0.15/M tokens in$0.3/M tokens out

moonshot/Kimi-K2.6

26.8B tokensThinkingchineselong-context

Moonshot Kimi — long-context Chinese model known for strong document reading and comprehension.

by moonshot·Jan 27, 2026·262K context26.8B tokens

Text

$0.95/M tokens in$4.00/M tokens out

tencent/Kimi-K2.6

36.7B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencent·Apr 13, 2026·262K context36.7B tokens

Text

$0.96/M tokens in$3.99/M tokens out

alibaba/Kimi-K2.6

29.9B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by alibaba·Jan 9, 2026·262K context29.9B tokens

Text

$0.96/M tokens in$3.99/M tokens out

aws/claude-opus-4-7

31.7B tokensThinkingfrontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by aws·Mar 6, 2026·200K context31.7B tokens

Text

$5.00/M tokens in$25.00/M tokens out

aws/claude-opus-4-7-openai

38.7B tokensThinkingfrontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by aws·May 25, 2026·200K context38.7B tokens

Text

$5.00/M tokens in$25.00/M tokens out

alibaba/qwen3.6-plus

4.70B tokensThinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Jan 24, 2026·1M context4.70B tokens

Text

$0.3/M tokens in$1.77/M tokens out

alibaba/qwen3.6-flash

3.29B tokenschinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Feb 20, 2026·1M context3.29B tokens

Text

$0.18/M tokens in$1.06/M tokens out

alibaba/wan2.7-image

0.34B tokens

Text-to-image model. Generates original images from natural-language prompts.

by alibaba·Feb 14, 20260.34B tokens

Image

from $0.03 per call

alibaba/wan2.7-image-pro

0.44B tokens

Text-to-image model. Generates original images from natural-language prompts.

by alibaba·May 8, 20260.44B tokens

Image

from $0.07 per call

alibaba/wan2.7

0.51B tokens

Video generation model. Produces video clips from text or images.

by alibaba·May 22, 20260.51B tokens

Video

from $0.09 per second

tencent/glm-5.1

0.42B tokensThinking

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencent·Apr 6, 2026·200K context0.42B tokens

Text

$0.89/M tokens in$3.55/M tokens out

alibaba/glm-5.1

0.45B tokensThinking

Text generation model. Compatible with the OpenAI Chat Completions API.

by alibaba·May 14, 2026·200K context0.45B tokens

Text

$0.89/M tokens in$3.55/M tokens out

zhipu/glm-5.1

0.50B tokensThinkingchinesebilingual

Zhipu GLM — Chinese LLM from Tsinghua. Solid bilingual support with academic training roots.

by zhipu ai·Feb 8, 2026·200K context0.50B tokens

Text

$0.89/M tokens in$3.55/M tokens out

alibaba/MiniMax-M2.7

0.09B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by alibaba·May 21, 2026·246K context0.09B tokens

Text

$0.31/M tokens in$1.24/M tokens out

minimax/MiniMax-M2.7

0.06B tokenslong-contextchinese

MiniMax — Chinese LLM family with hybrid attention for extreme-length contexts.

by minimax·Mar 23, 2026·246K context0.06B tokens

Text

$0.3/M tokens in$1.20/M tokens out

tencent/MiniMax-M2.7

0.08B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencent·Jan 17, 2026·246K context0.08B tokens

Text

$0.31/M tokens in$1.24/M tokens out

aws/claude-sonnet-4-6

34.4B tokensThinkingcodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by aws·Mar 3, 2026·200K context34.4B tokens

Text

$3.00/M tokens in$15.00/M tokens out

aws/claude-sonnet-4-6-openai

23.3B tokensThinkingcodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by aws·Apr 12, 2026·200K context23.3B tokens

Text

$3.00/M tokens in$15.00/M tokens out

azure/gpt-5.4

28.1B tokensfrontierflagship

Latest-generation frontier model with expanded reasoning and faster tool execution. Top choice when quality trumps cost.

by azure·Jan 10, 2026·1M context28.1B tokens

Text

$5.00/M tokens in$15.00/M tokens out

google/gemini-3.1-flash-lite

4.32B tokenscheapmultimodal

Gemini Flash — fast Google multimodal model with long context. Best value for volume tasks.

by google·Feb 15, 2026·1M context4.32B tokens

Text

$0.25/M tokens in$1.50/M tokens out

google/gemini-3.1-flash-tts

3.82B tokenscheapmultimodal

Gemini Flash — fast Google multimodal model with long context. Best value for volume tasks.

by google·May 14, 20263.82B tokens

Audio

$0.5/M tokens in$10.00/M tokens out

alibaba/qwen-image-2.0-pro

3.20B tokenschinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Mar 5, 20263.20B tokens

Image

from $0.07 per call

google/gemini-3.1-flash-image

3.42B tokenscheapmultimodal

Gemini Flash — fast Google multimodal model with long context. Best value for volume tasks.

by google·Feb 26, 20263.42B tokens

Image

$0.5/M tokens in$60.00/M tokens out

google/gemini-3.1-pro

5.17B tokensmultimodallong-context

Gemini Pro — Google's higher-quality Gemini tier. Strong reasoning with large context windows.

by google·Mar 23, 2026·2M context5.17B tokens

Text

$2.00/M tokens in$12.00/M tokens out

alibaba/qwen3.5-27b

4.78B tokensThinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Apr 4, 2026·262K context4.78B tokens

Text

$0.09/M tokens in$0.71/M tokens out

alibaba/qwen3.5-plus

3.84B tokenschinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·May 5, 2026·1M context3.84B tokens

Text

$0.12/M tokens in$0.71/M tokens out

alibaba/qwen3.5-397b-a17b

5.36B tokensThinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Jan 14, 2026·262K context5.36B tokens

Text

$0.18/M tokens in$1.06/M tokens out

alibaba/qwen3.5-35b-a3b

4.04B tokensThinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Jan 6, 2026·262K context4.04B tokens

Text

$0.06/M tokens in$0.47/M tokens out

alibaba/qwen3.5-omni

4.76B tokenschinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Mar 12, 20264.76B tokens

Audio

$1.04/M tokens in$5.91/M tokens out

alibaba/qwen3.5-122b-a10b

4.58B tokensThinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Mar 13, 2026·262K context4.58B tokens

Text

$0.12/M tokens in$0.95/M tokens out

alibaba/qwen3.5-flash

3.38B tokenschinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Jan 19, 2026·1M context3.38B tokens

Text

$0.03/M tokens in$0.3/M tokens out

tencent/gpt-image-2

0.08B tokens

Text-to-image model. Generates original images from natural-language prompts.

by tencent·Apr 21, 20260.08B tokens

Image

from $0.12 per call

moonshot/Kimi-K2.5

34.8B tokensThinkingchineselong-context

Moonshot's Kimi K2.5 — Chinese-first model with exceptional long-context ability. Known for strong reading comprehension.

by moonshot·May 26, 2026·262K context34.8B tokens

Text

$0.6/M tokens in$3.00/M tokens out

azure/gpt-image-2

0.09B tokens

Text-to-image model. Generates original images from natural-language prompts.

by azure·May 7, 20260.09B tokens

Image

$5.00/M tokens in$30.00/M tokens out

tencent/Kimi-K2.5

35.3B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencent·Mar 12, 2026·262K context35.3B tokens

Text

$0.59/M tokens in$3.11/M tokens out

alibaba/Kimi-K2.5

31.2B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by alibaba·Feb 10, 2026·262K context31.2B tokens

Text

$0.59/M tokens in$3.11/M tokens out

bytedance/doubao-seed-2.0-code

0.40B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Jan 16, 2026·256K context0.40B tokens

Text

$0.47/M tokens in$2.37/M tokens out

bytedance/doubao-seed-2.0-lite

0.09B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·May 1, 2026·262K context0.09B tokens

Text

$0.09/M tokens in$0.53/M tokens out

bytedance/doubao-seed-2.0-pro

0.36B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·May 20, 2026·256K context0.36B tokens

Text

$0.47/M tokens in$2.37/M tokens out

bytedance/doubao-seedance-2-0

0.37B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Apr 10, 20260.37B tokens

Video

—$6.80/M tokens out

bytedance/doubao-seedance-2-0-fast

0.06B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Mar 15, 20260.06B tokens

Video

—$5.47/M tokens out

bytedance/doubao-seed-2.0-mini

0.43B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Apr 14, 2026·262K context0.43B tokens

Text

$0.03/M tokens in$0.3/M tokens out

bytedance/doubao-seedream-5.0

0.41B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Mar 8, 20260.41B tokens

Image

from $0.03 per call

alibaba/glm-5

0.07B tokensThinking

Text generation model. Compatible with the OpenAI Chat Completions API.

by alibaba·May 19, 2026·200K context0.07B tokens

Text

$0.59/M tokens in$2.66/M tokens out

tencent/glm-5v-turbo

0.58B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencent·Mar 22, 2026·200K context0.58B tokens

Text

$0.74/M tokens in$3.25/M tokens out

zhipu/glm-5

0.08B tokensThinkingchinesebilingual

Zhipu GLM — Chinese LLM from Tsinghua. Solid bilingual support with academic training roots.

by zhipu ai·Apr 25, 2026·200K context0.08B tokens

Text

$0.59/M tokens in$2.66/M tokens out

tencent/glm-5-turbo

0.53B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencent·Jan 22, 2026·200K context0.53B tokens

Text

$0.74/M tokens in$3.25/M tokens out

tencent/glm-5

0.08B tokensThinking

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencent·Feb 23, 2026·200K context0.08B tokens

Text

$0.59/M tokens in$2.66/M tokens out

zhipu/glm-5-turbo

0.37B tokensThinkingchinesebilingual

Zhipu GLM — Chinese LLM from Tsinghua. Solid bilingual support with academic training roots.

by zhipu ai·Feb 8, 2026·200K context0.37B tokens

Text

$0.74/M tokens in$3.25/M tokens out

aws/claude-opus-4-6

33.0B tokensThinkingfrontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by aws·Apr 7, 2026·200K context33.0B tokens

Text

$5.00/M tokens in$25.00/M tokens out

aws/claude-opus-4-6-openai

30.8B tokensThinkingfrontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by aws·Mar 2, 2026·200K context30.8B tokens

Text

$5.00/M tokens in$25.00/M tokens out

alibaba/kling-v3-video

0.51B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by alibaba·Apr 20, 20260.51B tokens

Video

from $0.09 per second

alibaba/kling-v3-omni

0.42B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by alibaba·May 8, 20260.42B tokens

Image

from $0.03 per call

alibaba/kling-v3-omni-video

0.50B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by alibaba·Mar 12, 20260.50B tokens

Video

from $0.09 per second

vidu/viduq3-mix

0.48B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by vidu·Feb 28, 20260.48B tokens

Video

from $0.75 per call

vidu/viduq3-pro

0.06B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by vidu·Apr 11, 20260.06B tokens

Video

from $0.75 per call

vidu/viduq3

0.07B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by vidu·Mar 11, 20260.07B tokens

Video

from $0.75 per call

vidu/viduq3-turbo

0.45B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by vidu·Jan 6, 20260.45B tokens

Video

from $0.35 per call

vidu/viduq3-pro-fast

0.47B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by vidu·May 6, 20260.47B tokens

Video

from $0.12 per second

alibaba/MiniMax-M2.5

0.07B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by alibaba·Feb 23, 2026·246K context0.07B tokens

Text

$0.31/M tokens in$1.24/M tokens out

minimax/MiniMax-M2.5

0.05B tokenslong-contextchinese

MiniMax — Chinese LLM family with hybrid attention for extreme-length contexts.

by minimax·Jan 21, 2026·246K context0.05B tokens

Text

$0.3/M tokens in$1.20/M tokens out

tencent/MiniMax-M2.5

0.08B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencent·Apr 15, 2026·246K context0.08B tokens

Text

$0.31/M tokens in$1.24/M tokens out

alibaba/wan2.6-image

0.06B tokens

Text-to-image model. Generates original images from natural-language prompts.

by alibaba·Jan 13, 20260.06B tokens

Image

from $0.03 per call

alibaba/wan2.6

0.09B tokens

Video generation model. Produces video clips from text or images.

by alibaba·Jan 23, 20260.09B tokens

Video

from $0.09 per second

alibaba/wan2.6-flash

0.39B tokens

Video generation model. Produces video clips from text or images.

by alibaba·Feb 6, 20260.39B tokens

Video

from $0.04 per second

azure/gpt-5.2

25.4B tokensfrontieragent

Upgraded GPT-5 with longer context and improved latency. Production default for demanding agentic workloads.

by azure·Apr 8, 2026·400K context25.4B tokens

Text

$1.75/M tokens in$14.00/M tokens out

azure/gpt-5.2-codex

33.8B tokenscodingagent

Codex variant of GPT-5.2 tuned for software engineering. Specialized for repo-aware coding agents.

by azure·Apr 28, 2026·400K context33.8B tokens

Text

$1.75/M tokens in$14.00/M tokens out

bytedance/doubao-seedream-4.5

0.51B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Apr 12, 20260.51B tokens

Image

from $0.04 per call

aws/claude-opus-4-5

22.1B tokensThinkingfrontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by aws·May 8, 2026·200K context22.1B tokens

Text

$5.00/M tokens in$25.00/M tokens out

aws/claude-opus-4-5-openai

30.0B tokensThinkingfrontierflagship

Claude Opus — AWS's most capable (and expensive) tier. Reserved for the hardest problems.

by aws·Mar 23, 2026·200K context30.0B tokens

Text

$5.00/M tokens in$25.00/M tokens out

google/gemini-3-pro-image

4.02B tokensmultimodallong-context

Gemini Pro — Google's higher-quality Gemini tier. Strong reasoning with large context windows.

by google·May 14, 20264.02B tokens

Image

$2.00/M tokens in$120.00/M tokens out

google/gemini-3-pro

4.03B tokensmultimodallong-context

Gemini Pro — Google's higher-quality Gemini tier. Strong reasoning with large context windows.

by google·Apr 28, 2026·1M context4.03B tokens

Text

$2.00/M tokens in$12.00/M tokens out

minimax/MiniMax-M2.1

0.09B tokenslong-contextchinese

MiniMax — Chinese LLM family with hybrid attention for extreme-length contexts.

by minimax·Feb 17, 2026·246K context0.09B tokens

Text

$0.3/M tokens in$1.20/M tokens out

zhipu/glm-4.7

0.07B tokensThinkingchinesebilingual

Zhipu GLM — Chinese LLM from Tsinghua. Solid bilingual support with academic training roots.

by zhipu ai·Feb 5, 2026·200K context0.07B tokens

Text

$0.3/M tokens in$1.18/M tokens out

bytedance/doubao-seed-1.8

0.08B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Apr 7, 2026·128K context0.08B tokens

Text

$0.12/M tokens in$1.18/M tokens out

aws/claude-haiku-4-5

0.06B tokensfastcheap

Claude Haiku — fast, affordable AWS model. Best for high-volume real-time tasks.

by aws·Jan 7, 2026·200K context0.06B tokens

Text

$1.00/M tokens in$5.00/M tokens out

aws/claude-haiku-4-5-openai

0.42B tokensfastcheap

Claude Haiku — fast, affordable AWS model. Best for high-volume real-time tasks.

by aws·Feb 18, 2026·200K context0.42B tokens

Text

$1.00/M tokens in$5.00/M tokens out

google/veo3.1

0.09B tokens

Video generation model. Produces video clips from text or images.

by google·Jan 23, 20260.09B tokens

Video

from $0.4 per second

google/veo3.1-fast

0.57B tokens

Video generation model. Produces video clips from text or images.

by google·Jan 14, 20260.57B tokens

Video

from $0.15 per second

minimax/MiniMax-M2

0.37B tokenslong-contextchinese

MiniMax — Chinese LLM family with hybrid attention for extreme-length contexts.

by minimax·Mar 18, 2026·246K context0.37B tokens

Text

$0.3/M tokens in$1.20/M tokens out

google/virtual-try-on-images

0.05B tokens

Text-to-image model. Generates original images from natural-language prompts.

by google·Feb 27, 20260.05B tokens

Image

from $0.06 per call

alibaba/aitryon-plus

0.38B tokens

Image generation model. Creates or edits images from text prompts.

by alibaba·Feb 18, 20260.38B tokens

Image

from $0.07 per call

alibaba/happyhouse-1.0

0.08B tokens

Video generation model. Produces video clips from text or images.

by alibaba·Mar 11, 20260.08B tokens

Video

from $0.13 per second

alibaba/happyhouse-1.0-video-edit

0.45B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by alibaba·Mar 2, 20260.45B tokens

Video

from $0.13 per second

zhipu/glm-4.6

0.41B tokensThinkingchinesebilingual

Zhipu GLM — Chinese LLM from Tsinghua. Solid bilingual support with academic training roots.

by zhipu ai·Jan 4, 2026·128K context0.41B tokens

Text

$0.44/M tokens in$2.07/M tokens out

zhipu/glm-4.6v

0.32B tokenschinesebilingual

Zhipu GLM — Chinese LLM from Tsinghua. Solid bilingual support with academic training roots.

by zhipu ai·Apr 18, 2026·128K context0.32B tokens

Text

$0.15/M tokens in$0.44/M tokens out

azure/sora-video-generate-2.0

0.06B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by azure·Feb 13, 20260.06B tokens

Video

from $0.1 per second

aws/claude-sonnet-4-5

33.1B tokensThinkingcodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by aws·Feb 2, 2026·200K context33.1B tokens

Text

$3.00/M tokens in$15.00/M tokens out

tencent/deepseek-v3.2

35.8B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by tencent·Mar 21, 2026·128K context35.8B tokens

Text

$0.3/M tokens in$0.44/M tokens out

alibaba/deepseek-v3.2

24.2B tokensopen-weightcheap

DeepSeek — open-weight Chinese LLM family. Strong cost-to-quality ratio and good code generation.

by alibaba·Jan 27, 2026·128K context24.2B tokens

Text

$0.3/M tokens in$0.44/M tokens out

aws/claude-sonnet-4-5-openai

24.2B tokensThinkingcodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by aws·Apr 15, 2026·200K context24.2B tokens

Text

$3.00/M tokens in$15.00/M tokens out

alibaba/wan2.5-image

0.55B tokens

Text-to-image model. Generates original images from natural-language prompts.

by alibaba·Feb 18, 20260.55B tokens

Image

from $0.03 per call

alibaba/wan2.5

0.55B tokens

Video generation model. Produces video clips from text or images.

by alibaba·Feb 24, 20260.55B tokens

Video

from $0.04 per second

bytedance/doubao-seedance-1-5-pro

0.48B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·May 2, 20260.48B tokens

Video

—$2.37/M tokens out

tencent/tencent-video-edit-1.0

0.08B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by tencent·May 3, 20260.08B tokens

Video

from $0.000039 per second

tencent/tencent-video-enhance-1.0

0.07B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by tencent·Feb 3, 20260.07B tokens

Video

from $0.0045 per second

bytedance/doubao-seedream-4.0

0.07B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Apr 17, 20260.07B tokens

Image

from $0.03 per call

alibaba/video-style-transform

0.09B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by alibaba·May 27, 20260.09B tokens

Video

from $0.03 per second

minimax/minimax-music-2.0

0.06B tokenslong-contextchinese

MiniMax — Chinese LLM family with hybrid attention for extreme-length contexts.

by minimax·Mar 25, 20260.06B tokens

Audio

from $0.03 per call

alibaba/qwen-flash

4.17B tokenschinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Jan 11, 2026·1M context4.17B tokens

Text

$0.02/M tokens in$0.22/M tokens out

bytedance/deepseek-v3.1

31.4B tokensopen-weightcheapreasoning

Upgraded DeepSeek V3.1 with improved reasoning and better tool calling. Pareto-optimal on cost vs quality.

by bytedance·May 26, 2026·128K context31.4B tokens

Text

$0.59/M tokens in$1.77/M tokens out

azure/gpt5

0.08B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by azure·May 19, 2026·128K context0.08B tokens

Text

$1.25/M tokens in$10.00/M tokens out

azure/gpt5-chat

0.40B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by azure·Jan 28, 2026·200K context0.40B tokens

Text

$1.25/M tokens in$10.00/M tokens out

azure/gpt5-nano

0.44B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by azure·Mar 16, 2026·128K context0.44B tokens

Text

$0.05/M tokens in$0.4/M tokens out

azure/gpt5-mini

0.08B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by azure·Apr 27, 2026·128K context0.08B tokens

Text

$0.25/M tokens in$2.00/M tokens out

bytedance/ByteDance-Seed-SC

0.08B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by bytedance·May 19, 2026·128K context0.08B tokens

Text

$0.12/M tokens in$0.3/M tokens out

bytedance/doubao-witty-remark

0.48B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Feb 14, 20260.48B tokens

Audio

from $0.07 per call

alibaba/qwen-image

3.42B tokenschinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Feb 4, 20263.42B tokens

Image

from $0.03 per call

tencent/tencent-image-translate-1.0

0.06B tokens

Text-to-image model. Generates original images from natural-language prompts.

by tencent·Jan 3, 20260.06B tokens

Image

from $0.03 per call

google/gemini-image-2.0-flash

4.78B tokenscheapmultimodal

Gemini Flash — fast Google multimodal model with long context. Best value for volume tasks.

by google·Apr 18, 20264.78B tokens

Image

$0.15/M tokens in$30.00/M tokens out

alibaba/wan2.2-animate-mix

0.46B tokens

Video generation model. Produces video clips from text or images.

by alibaba·Mar 26, 20260.46B tokens

Video

from $0.13 per second

alibaba/wan2.2

0.07B tokens

Video generation model. Produces video clips from text or images.

by alibaba·May 27, 20260.07B tokens

Video

from $0.02 per second

bytedance/doubao-seed-1.6

0.07B tokensThinkingchinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Jan 9, 2026·128K context0.07B tokens

Text

$0.12/M tokens in$1.18/M tokens out

bytedance/doubao-seedance

0.46B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Mar 6, 20260.46B tokens

Video

$1.48/M tokens in$1.48/M tokens out

bytedance/doubao-seedance-pro

0.52B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·May 22, 20260.52B tokens

Video

—$2.22/M tokens out

bytedance/doubao-realtime-audio-transcription

0.08B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Apr 9, 20260.08B tokens

Audio

from $0.07 per call

bytedance/doubao-embedding

0.08B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Apr 13, 2026·128K context0.08B tokens

Embedding

$0.07/M tokens in—

bytedance/doubao-seed-1.6-flash

0.43B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·May 14, 2026·128K context0.43B tokens

Text

$0.02/M tokens in$0.22/M tokens out

vidu/vidu-tts

0.09B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by vidu·May 23, 20260.09B tokens

Speech

—$100.00/M tokens out

vidu/vidu-voice-clone

0.52B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by vidu·Feb 22, 20260.52B tokens

Audio

from $1.50 per call

alibaba/qwen-text-embedding-v4

3.35B tokenschinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Jan 19, 2026·128K context3.35B tokens

Embedding

$0.07/M tokens in—

alibaba/qwen-multimodal-embedding-v1

3.56B tokenschinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Mar 1, 2026·128K context3.56B tokens

Embedding

$0.1/M tokens in—

alibaba/qwen-gte-rerank-v2

4.76B tokenschinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Mar 2, 2026·128K context4.76B tokens

Rerank

$0.12/M tokens in—

alibaba/deepseek-r1-0528

26.3B tokensopen-weightcheap

DeepSeek — open-weight Chinese LLM family. Strong cost-to-quality ratio and good code generation.

by alibaba·Apr 21, 2026·128K context26.3B tokens

Text

$0.59/M tokens in$2.37/M tokens out

aws/claude-4-sonnet

0.54B tokensThinkingagentcoding

Claude 4 Sonnet — balance of speed, quality, and cost for agentic workflows and production coding.

by aws·Mar 10, 2026·200K context0.54B tokens

Text

$3.00/M tokens in$15.00/M tokens out

aws/claude-4-sonnet-openai

0.08B tokensThinkingcodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by aws·Mar 9, 2026·200K context0.08B tokens

Text

$3.00/M tokens in$15.00/M tokens out

google/veo3

0.47B tokens

Video generation model. Produces video clips from text or images.

by google·Feb 28, 20260.47B tokens

Video

from $0.4 per second

alibaba/qwen3-32b

3.76B tokensThinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Jan 27, 2026·128K context3.76B tokens

Text

$0.3/M tokens in$1.18/M tokens out

alibaba/qwen3-30b

5.51B tokensThinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Apr 21, 2026·128K context5.51B tokens

Text

$0.11/M tokens in$0.44/M tokens out

alibaba/qwen3-235b

4.42B tokensThinkingchinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Apr 22, 2026·128K context4.42B tokens

Text

$0.3/M tokens in$1.18/M tokens out

alibaba/qwen3-coder-plus

5.33B tokenschinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Jan 6, 2026·1M context5.33B tokens

Text

$0.59/M tokens in$2.37/M tokens out

alibaba/qwen3-coder-flash

4.25B tokenschinesebilingual

Alibaba Qwen series — Chinese-first LLMs with strong bilingual support. Wide range from turbo to max tiers.

by alibaba·Feb 16, 2026·1M context4.25B tokens

Text

$0.15/M tokens in$0.59/M tokens out

azure/gpt-image-1

0.57B tokens

Latest Azure image model with improved realism and editing. Supports inpainting, outpainting, and mask-guided edits.

by azure·Jan 8, 20260.57B tokens

Image

$5.00/M tokens in$40.00/M tokens out

vidu/vidu-2.0

0.41B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by vidu·Feb 20, 20260.41B tokens

Video

from $0.1 per call

vidu/viduq2-pro

0.39B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by vidu·May 8, 20260.39B tokens

Video

from $1.30 per call

vidu/viduq2

0.42B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by vidu·Apr 12, 20260.42B tokens

Video

from $0.35 per call

vidu/viduq2-turbo

0.08B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by vidu·Jan 27, 20260.08B tokens

Video

from $0.43 per call

vidu/viduq2-pro-fast

0.09B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by vidu·May 21, 20260.09B tokens

Video

from $0.18 per call

azure/gpt-o3

0.09B tokensreasoningfrontier

Next-generation reasoning model succeeding o1. Solves problems that stumped previous models, at a reasonable cost.

by azure·Apr 7, 2026·200K context0.09B tokens

Text

$2.00/M tokens in$8.00/M tokens out

azure/gpt-o3-mini

0.08B tokensreasoningthinking

Azure o-series — reasoning-first models that think before answering. Best for hard math, science, and code.

by azure·May 27, 2026·200K context0.08B tokens

Text

$1.10/M tokens in$4.40/M tokens out

azure/gpt-4.1

24.9B tokenscodinglong-contexttools

Code-focused GPT-4 successor with stronger instruction following and 1M+ context. Great for long-document analysis and agentic coding.

by azure·Feb 26, 2026·1M context24.9B tokens

Text

$2.00/M tokens in$8.00/M tokens out

azure/gpt-4.1-nano

27.2B tokenscheapestfast

Tiny model optimized for classification and structured output. Cheapest in the GPT-4 family.

by azure·Feb 9, 2026·1M context27.2B tokens

Text

$0.1/M tokens in$0.4/M tokens out

azure/gpt-4.1-mini

36.9B tokenscheaplong-context

Smaller GPT-4.1 with the same 1M context at a fraction of the cost. The new default for long-context RAG and bulk processing.

by azure·Jan 26, 2026·1M context36.9B tokens

Text

$0.4/M tokens in$1.60/M tokens out

minimax/minimax-video-1.0

0.48B tokenslong-contextchinese

MiniMax — Chinese LLM family with hybrid attention for extreme-length contexts.

by minimax·Mar 12, 20260.48B tokens

Video

from $0.28 per call

bytedance/doubao-seaweed

0.36B tokenschinesebytedance

ByteDance Doubao — Chinese LLM family tuned for the Volcano Engine cloud and ByteDance ecosystem.

by bytedance·Jan 16, 20260.36B tokens

Video

$1.48/M tokens in$1.48/M tokens out

google/gemini-2.5-pro

27.5B tokensThinkingreasoningfrontierlong-context

Gemini 2.5 Pro — Google's top reasoning model with thinking mode. Frontier performance on coding and math.

by google·Jan 10, 2026·2M context27.5B tokens

Text

$1.25/M tokens in$10.00/M tokens out

google/gemini-2.5-pro-tts

39.0B tokensmultimodallong-context

Gemini Pro — Google's higher-quality Gemini tier. Strong reasoning with large context windows.

by google·Jan 26, 202639.0B tokens

Audio

$1.00/M tokens in$20.00/M tokens out

google/gemini-2.5-flash

40.1B tokensThinkingcheapmultimodal

Gemini Flash — fast Google multimodal model with long context. Best value for volume tasks.

by google·Apr 5, 2026·1M context40.1B tokens

Text

$0.3/M tokens in$2.50/M tokens out

google/gemini-2.5-flash-image

29.4B tokenscheapmultimodal

Gemini Flash — fast Google multimodal model with long context. Best value for volume tasks.

by google·Apr 27, 202629.4B tokens

Image

$0.3/M tokens in$30.00/M tokens out

google/gemini-2.5-flash-tts

31.6B tokenscheapmultimodal

Gemini Flash — fast Google multimodal model with long context. Best value for volume tasks.

by google·May 19, 202631.6B tokens

Audio

$0.5/M tokens in$10.00/M tokens out

aws/claude-3-7-sonnet

3.86B tokensThinkingcodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by aws·Jan 3, 2026·200K context3.86B tokens

Text

$3.00/M tokens in$15.00/M tokens out

aws/claude-3-7-sonnet-openai

5.01B tokensThinkingcodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by aws·May 14, 2026·200K context5.01B tokens

Text

$3.00/M tokens in$15.00/M tokens out

vidu/vidu-q1

0.41B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by vidu·Mar 10, 20260.41B tokens

Video

from $0.4 per call

vidu/viduq1-classic

0.38B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by vidu·May 20, 20260.38B tokens

Video

from $0.4 per call

alibaba/deepseek-r1

31.6B tokensreasoningopen-weightthinking

DeepSeek R1 — a reasoning-first model trained with reinforcement learning. Competes with o1-class models at much lower cost.

by alibaba·Mar 17, 2026·128K context31.6B tokens

Text

$0.59/M tokens in$2.37/M tokens out

alibaba/wanx2.1-plus

0.08B tokens

Text-to-image model. Generates original images from natural-language prompts.

by alibaba·Apr 19, 20260.08B tokens

Video

from $0.1 per second

alibaba/wanx2.1-turbo

0.09B tokens

Text-to-image model. Generates original images from natural-language prompts.

by alibaba·May 11, 20260.09B tokens

Video

from $0.04 per second

alibaba/deepseek-v3

33.8B tokensopen-weightcheapcoding

Open-weight DeepSeek V3 — MoE architecture delivering frontier-adjacent quality at a fraction of the cost.

by alibaba·Feb 3, 2026·128K context33.8B tokens

Text

$0.3/M tokens in$1.18/M tokens out

google/gemini-2.0-flash

22.1B tokensmultimodaltoolssearch

Next-gen Gemini Flash with improved reasoning and native tool use. Drop-in upgrade to 1.5 Flash.

by google·May 4, 2026·1M context22.1B tokens

Text

$0.15/M tokens in$0.6/M tokens out

google/gemini-2.0-flash-lite

28.9B tokenscheapmultimodal

Gemini Flash — fast Google multimodal model with long context. Best value for volume tasks.

by google·Jan 13, 2026·1M context28.9B tokens

Text

$0.08/M tokens in$0.3/M tokens out

azure/sora-video-generate-1.0

0.36B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by azure·Mar 22, 20260.36B tokens

Video

from $0.1 per second

bytedance/dubao_reasoning_lite_32k

0.08B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by bytedance·Mar 13, 2026·32K context0.08B tokens

Text

$0.04/M tokens in$0.09/M tokens out

vidu/vidu-1.5

0.33B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by vidu·Jan 16, 20260.33B tokens

Video

from $0.07 per second

alibaba/qwen-plus

5.70B tokenschinesebalanced

Balanced Qwen tier with strong Chinese + reasonable cost. The pragmatic default for production Chinese apps.

by alibaba·Mar 9, 2026·1M context5.70B tokens

Text

$0.12/M tokens in$0.3/M tokens out

alibaba/qwen-max

24.7B tokenschinesebilingualflagship

Alibaba's flagship Qwen model. Strong bilingual (Chinese / English) performance, especially tuned for enterprise scenarios.

by alibaba·Jan 7, 2026·262K context24.7B tokens

Text

$0.35/M tokens in$1.42/M tokens out

vidu/vidu-t1

0.07B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by vidu·Jan 19, 20260.07B tokens

Video

from $0.22 per call

azure/gpt-o1

0.06B tokensreasoningthinking

Reasoning-first model that thinks before answering. Best for math, science, and multi-step problem solving.

by azure·Jan 9, 2026·128K context0.06B tokens

Text

$15.00/M tokens in$60.00/M tokens out

zhipu/charglm-4

0.47B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by zhipu ai·Jan 2, 2026·32K context0.47B tokens

Text

$0.15/M tokens in$0.15/M tokens out

bytedance/dubao_pro_32k

0.09B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by bytedance·Feb 3, 2026·32K context0.09B tokens

Text

$0.12/M tokens in$0.3/M tokens out

bytedance/dubao_pro_32k_init

0.42B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by bytedance·Apr 20, 2026·32K context0.42B tokens

Text

$0.12/M tokens in$0.3/M tokens out

aws/claude3-5-sonnet

0.49B tokenscodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by aws·May 2, 2026·200K context0.49B tokens

Text

$3.00/M tokens in$15.00/M tokens out

zhipu/zhipu-vidu2-image

0.37B tokens

Text-to-image model. Generates original images from natural-language prompts.

by zhipu ai·Jan 14, 20260.37B tokens

Video

from $0.15 per call

zhipu/zhipu-viduq1

0.40B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by zhipu ai·Apr 8, 20260.40B tokens

Video

from $0.15 per call

zhipu/zhipu-viduq2-reference

0.07B tokens

Text-to-video or image-to-video model. Generates short video clips with configurable duration and resolution.

by zhipu ai·May 17, 20260.07B tokens

Video

from $0.15 per call

zhipu/zhipu-doubao-seedance

0.06B tokens

Video generation model. Produces video clips from text or images.

by zhipu ai·May 11, 20260.06B tokens

Video

$1.48/M tokens in$1.48/M tokens out

zhipu/zhipu-doubao-seedance-pro

0.07B tokens

Video generation model. Produces video clips from text or images.

by zhipu ai·Apr 3, 20260.07B tokens

Video

$2.22/M tokens in$2.22/M tokens out

zhipu/zhipu-doubao-seedream-4.0

0.55B tokens

Video generation model. Produces video clips from text or images.

by zhipu ai·May 8, 20260.55B tokens

Video

from $0.03 per call

bytedance/dubao-pro-32k-large

0.08B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by bytedance·Jan 5, 2026·32K context0.08B tokens

Text

$0.12/M tokens in$0.3/M tokens out

azure/gpt-4o

24.0B tokensvisiontoolsstreaming

Flagship multimodal model from Azure with native text, vision, and voice understanding. Strong at general-purpose reasoning and instruction following.

by azure·Mar 12, 2026·128K context24.0B tokens

Text

$2.50/M tokens in$10.00/M tokens out

azure/gpt-4o-mini

27.9B tokenscheapfastvision

Cheap and fast sibling of GPT-4o. Best value for high-volume classification, extraction, and routing tasks.

by azure·Feb 18, 2026·128K context27.9B tokens

Text

$0.15/M tokens in$0.6/M tokens out

alibaba/qwen-long

4.11B tokenslong-contextchinese

Qwen variant with very long context (10M+ tokens). Purpose-built for long-document analysis and codebase-level tasks.

by alibaba·Mar 7, 2026·10M context4.11B tokens

Text

$0.07/M tokens in$0.3/M tokens out

alibaba/qwen-turbo

5.59B tokenschinesecheapfast

Smallest, cheapest Qwen. Good for classification, routing, and high-volume light tasks in Chinese.

by alibaba·Jan 9, 2026·1M context5.59B tokens

Text

$0.04/M tokens in$0.09/M tokens out

aws/claude3-haiku

0.08B tokensfastcheap

Claude Haiku — fast, affordable AWS model. Best for high-volume real-time tasks.

by aws·Mar 3, 2026·200K context0.08B tokens

Text

$0.25/M tokens in$1.25/M tokens out

aws/claude-3-sonnet

3.67B tokenscodingwritingtools

Claude Sonnet — AWS's balanced model. Strong coding, writing, and tool use with 200K context.

by aws·May 17, 2026·200K context3.67B tokens

Text

$3.00/M tokens in$15.00/M tokens out

azure/dall-e-3

0.42B tokens

Azure's production image generator. Known for strong prompt adherence and coherent in-image text rendering.

by azure·May 28, 20260.42B tokens

Image

from $0.04 per call

azure/gpt4

0.46B tokens

Text generation model. Compatible with the OpenAI Chat Completions API.

by azure·Apr 18, 2026·32K context0.46B tokens

Text

$30.00/M tokens in$60.00/M tokens out