TokenLX — One API for 235+ AI Models

Claude Haiku: a fast, affordable model served through AWS. Best for high-volume, real-time tasks.

Key strengths

Sub-second latency
Low cost
Tool calling
Reliable at scale

Use cases

Real-time chat
Bulk classification
Inline suggestions
Agent sub-steps

fast · cheap

aws/claude-haiku-4-5 is a text generation model in Anthropic's Claude family, offered here through AWS. It handles complex reasoning, agentic workflows, code generation, and long-form writing, with native support for streaming, tool calling, JSON mode, and multi-turn conversations.

The model handles long-context inputs gracefully and is particularly effective for software engineering, multi-step research, and end-to-end project execution. Its tokenizer and pricing are optimized for high-throughput production workloads, with a competitive cost profile relative to other models in its tier.

aws/claude-haiku-4-5 is available through a fully OpenAI-compatible API: keep your existing OpenAI Python or Node SDK and set the base URL (`base_url` in Python, `baseURL` in Node) to `https://api.tokenlx.ai`. TokenLX routes your requests to the optimal provider endpoint while preserving streaming, function-calling, and structured-output semantics.

Performance

Compare different providers across TokenLX · All locations.

Throughput (tok/s)
Latency
E2E Latency: 166
Tool Call Errors: 0.07
Output Errors: 0.38
Time to First Token

Effective Pricing

Actual cost per million tokens across providers over the past 7 days.

Input: $1.00 per 1M tokens
Output: $5.00 per 1M tokens
Cache read: $0.10 per 1M tokens
Cache write: $1.25 per 1M tokens
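As a rough illustration of how the per-million prices listed above translate into a per-request cost, here is a minimal sketch. The function name `estimate_cost` and its parameters are hypothetical, and the sketch assumes cached tokens are billed separately from regular input tokens, which this page does not confirm.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the prices listed above.
PRICES_PER_MILLION = {
    "input": Decimal("1.00"),
    "output": Decimal("5.00"),
    "cache_read": Decimal("0.10"),
    "cache_write": Decimal("1.25"),
}


def estimate_cost(input_tokens=0, output_tokens=0,
                  cache_read_tokens=0, cache_write_tokens=0):
    """Return the estimated USD cost of one request as a Decimal.

    Assumption: each token category is billed independently at its own rate.
    """
    usage = {
        "input": input_tokens,
        "output": output_tokens,
        "cache_read": cache_read_tokens,
        "cache_write": cache_write_tokens,
    }
    total = sum(
        (PRICES_PER_MILLION[kind] * count for kind, count in usage.items()),
        Decimal(0),
    )
    return total / Decimal(1_000_000)


# 2,000 input tokens and 500 output tokens:
# (2000 * 1.00 + 500 * 5.00) / 1,000,000 = 0.0045 USD
cost = estimate_cost(input_tokens=2_000, output_tokens=500)
print(f"${cost:.4f}")  # prints $0.0045
```

Decimal is used instead of float so that small per-request amounts add up exactly when summed over many requests.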

Recent activity

Total usage per day on TokenLX (last 30 days).

Prompt: 23.75M
Completion: 32.06M

Sample code & API

TokenLX normalizes requests and responses across providers. Use any OpenAI SDK, the Anthropic SDK, or our native SDK. The example below uses the Anthropic Python SDK.

import anthropic

client = anthropic.Anthropic(
    base_url="https://api.tokenlx.ai",
    api_key="sk-tokenlx-...",
)

# Non-streaming
message = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"},
    ],
)
print(message.content[0].text)

# Streaming
with client.messages.stream(
    model="claude-haiku-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Tell me a story"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Replace sk-tokenlx-... with your key from the dashboard.

API Endpoints

Chat Completions

Sends a request for a model response for the given chat conversation. Supports both streaming and non-streaming modes.

POST https://api.tokenlx.ai/v1/chat/completions
Authorization: Bearer $TOKENLX_API_KEY
Content-Type: application/json
Model: claude-haiku-4-5
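The endpoint and headers above can be exercised without any SDK. The following is a minimal sketch using only the Python standard library; the helper names `build_chat_request` and `send` are hypothetical, and the response shape (`choices[0].message.content`) is assumed to follow the OpenAI Chat Completions format, as the compatibility claim on this page suggests.

```python
import json
import os
import urllib.request

CHAT_URL = "https://api.tokenlx.ai/v1/chat/completions"


def build_chat_request(prompt, api_key, model="claude-haiku-4-5", stream=False):
    """Build (but do not send) a Chat Completions request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send a non-streaming request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only calls the API when a key is present in the environment.
    key = os.environ.get("TOKENLX_API_KEY")
    if key:
        reply = send(build_chat_request("Hello!", key))
        # Assumed OpenAI-style response shape.
        print(reply["choices"][0]["message"]["content"])
```

When using an OpenAI SDK instead, note that those SDKs append `/chat/completions` to the configured base URL, so the base URL may need to include the `/v1` prefix shown in the endpoint above; check this against your dashboard settings.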

Responses API

Creates a streaming or non-streaming response using the OpenAI Responses API format.

POST https://api.tokenlx.ai/v1/responses
Authorization: Bearer $TOKENLX_API_KEY
Content-Type: application/json
Model: claude-haiku-4-5

Anthropic Messages

Creates a message using the Anthropic Messages API format. Supports text, images, tools, and extended thinking.

POST https://api.tokenlx.ai/v1/messages
Authorization: header `x-api-key: $TOKENLX_API_KEY`
Content-Type: application/json
Model: claude-haiku-4-5
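Since this endpoint lists tool support, here is a minimal sketch of a Messages request that declares one tool, plus a helper that pulls tool calls out of a response. The `get_weather` tool and the helper names are hypothetical. The `anthropic-version` header is what Anthropic's own API expects; whether TokenLX requires it is an assumption.

```python
import json
import urllib.request

MESSAGES_URL = "https://api.tokenlx.ai/v1/messages"

# Hypothetical tool definition in the Anthropic Messages tool format.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def build_messages_request(prompt, api_key, tools=None,
                           model="claude-haiku-4-5", max_tokens=1024):
    """Build (but do not send) an Anthropic-format Messages request."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    if tools:
        body["tools"] = tools
    return urllib.request.Request(
        MESSAGES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            # Assumption: forwarded as on Anthropic's own API.
            "anthropic-version": "2023-06-01",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def tool_calls(response_json):
    """Return (name, input) pairs for each tool_use block in a response."""
    return [
        (block["name"], block["input"])
        for block in response_json.get("content", [])
        if block.get("type") == "tool_use"
    ]
```

Unlike the two OpenAI-format endpoints, this one authenticates with `x-api-key` rather than a Bearer token, so the two request builders should not share a header set.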

