Claude Haiku — fast, affordable AWS model. Best for high-volume real-time tasks.
Key strengths
- Sub-second latency
- Low cost
- Tool calling
- Reliable at scale
Use cases
- Real-time chat
- Bulk classification
- Inline suggestions
- Agent sub-steps
aws/claude3-haiku is Anthropic's Claude 3 Haiku, a text generation model in the Claude family, offered here through AWS. It is the fastest and most compact model in the Claude 3 lineup, best suited to quick-turnaround work such as chat, classification, inline suggestions, and agent sub-steps, with support for streaming, tool calling, JSON mode, and multi-turn conversations.
The model handles long-context inputs gracefully. In software engineering and multi-step research workflows it works well as a fast sub-step model. Its tokenizer and pricing suit high-throughput production workloads, with a competitive cost profile relative to other models in its tier.
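To make the tool calling mentioned above concrete, here is a minimal sketch of the client-side half of a tool call. It assumes TokenLX passes the Anthropic Messages tool format through unchanged (`tools` entries with `name`, `description`, and `input_schema`; `tool_use` blocks in the reply; `tool_result` blocks sent back). The `get_order_status` tool and its return value are hypothetical and exist only for illustration.

```python
import json


# Hypothetical tool for illustration only; not part of TokenLX or the model.
def get_order_status(order_id: str) -> dict:
    return {"order_id": order_id, "status": "shipped"}


# Tool definition in the Anthropic Messages format (assumed to be passed through).
TOOLS = [
    {
        "name": "get_order_status",
        "description": "Look up the shipping status of an order.",
        "input_schema": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    }
]

# Maps tool names the model may request to local Python callables.
HANDLERS = {"get_order_status": get_order_status}


def run_tool_calls(content_blocks: list[dict]) -> list[dict]:
    """Execute every tool_use block and return the matching tool_result blocks."""
    results = []
    for block in content_blocks:
        if block.get("type") != "tool_use":
            continue  # skip plain text blocks
        handler = HANDLERS[block["name"]]
        output = handler(**block["input"])
        results.append(
            {
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": json.dumps(output),
            }
        )
    return results
```

In practice you would pass `tools=TOOLS` to `client.messages.create(...)`, run `run_tool_calls` on the reply's content, and send the returned blocks back as the `content` of a follow-up `user` message. The SDK returns typed objects rather than dicts, so convert them first or read the same fields as attributes.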
aws/claude3-haiku is fully OpenAI-compatible: keep your existing OpenAI Python or Node SDK and set the base URL (`base_url` in Python, `baseURL` in Node) to `https://api.tokenlx.ai`. TokenLX transparently routes your requests to the optimal provider endpoint while preserving streaming, function-calling, and structured-output semantics.
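As a rough illustration of what OpenAI compatibility means on the wire, the sketch below builds (but does not send) an OpenAI-style chat completion request using only the standard library. The `/chat/completions` path is what OpenAI SDKs append to the configured base URL; that TokenLX serves it at exactly this path is an assumption, as are the helper names `build_chat_request` and `extract_reply`. The model id follows the sample code on this page.

```python
import json
import urllib.request

BASE_URL = "https://api.tokenlx.ai"  # base URL given in the text above
CHAT_PATH = "/chat/completions"      # assumption: path OpenAI SDKs append to the base URL


def build_chat_request(
    api_key: str, prompt: str, model: str = "claude3-haiku"
) -> urllib.request.Request:
    """Build, without sending, an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "max_tokens": 256,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=BASE_URL + CHAT_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(response_body: dict) -> str:
    """Pull the assistant text out of an OpenAI-style response body."""
    return response_body["choices"][0]["message"]["content"]
```

To send the request, pass it to `urllib.request.urlopen`, decode the body with `json.load`, and hand the result to `extract_reply`. With an OpenAI SDK the same effect comes from changing only the base URL and API key on the client.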
Performance
Compare different providers across TokenLX.
Effective Pricing
Actual cost per million tokens across providers over the past 7 days.
Recent activity
Total usage per day on TokenLX (last 30 days).
Sample code & API
TokenLX normalizes requests and responses across providers. Use any OpenAI SDK or our native SDK. The Python sample on this page uses the Anthropic SDK pointed at the same base URL.
import anthropic

# Point the Anthropic SDK at TokenLX instead of its default endpoint.
client = anthropic.Anthropic(
    base_url="https://api.tokenlx.ai",
    api_key="sk-tokenlx-...",
)

# Non-streaming
message = client.messages.create(
    model="claude3-haiku",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"},
    ],
)
print(message.content[0].text)

# Streaming
with client.messages.stream(
    model="claude3-haiku",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Tell me a story"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Replace sk-tokenlx-... with your key from the dashboard.