ethereal

Anthropic-compatible API. Drop-in replacement for api.anthropic.com — same wire format, same models, plug it into any agent in under a minute.

What you get

  • Anthropic Messages API on POST /v1/messages — verbatim wire format, including streaming, tool use, vision, prompt caching, batches.
  • OpenAI Chat Completions on POST /v1/chat/completions — same key, same models, for tools that don't speak Anthropic.
  • Claude family of models — Opus, Sonnet, Haiku, current and recent generations.
  • Per-key rate limits — sustained RPM, ISO-8601 reset, retry-after hints.
  • 32 MB request body cap — same as upstream Anthropic.

Where to start

Quickstart

Three steps. Get a key, set the base URL, send a request.

1. Get a key

Keys look like sk-ant-api03-…. Issue them via your dashboard or ask the operator. View live usage at /dashboard.

2. Point your client at ethereal

Base URL: https://api.ethereal.llc

Anthropic SDK / curl: pass the key as x-api-key. OpenAI SDK / curl: pass it as Authorization: Bearer …. Both work, both authenticate the same key.

3. First request

Anthropic-style:

curl https://api.ethereal.llc/v1/messages \
  -H "x-api-key: $ETHEREAL_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "ping"}]
  }'

OpenAI-style:

curl https://api.ethereal.llc/v1/chat/completions \
  -H "Authorization: Bearer $ETHEREAL_KEY" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [{"role": "user", "content": "ping"}]
  }'

Next steps

Authentication

One key. Two header conventions. Same identity.

Header precedence

If both are set, x-api-key wins. Anthropic SDK uses x-api-key; the OpenAI SDK / OpenAI-style clients use Authorization: Bearer. Pick one per request.

# either
x-api-key: sk-ant-api03-…

# or
Authorization: Bearer sk-ant-api03-…
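The precedence rule can be sketched as a small resolver (illustrative client-side logic, not the server's actual code):

```python
def resolve_key(headers):
    """Pick the API key the way the docs describe: if both header
    conventions are present, x-api-key wins over Authorization: Bearer."""
    # HTTP header names are case-insensitive; normalise before lookup.
    h = {k.lower(): v for k, v in headers.items()}
    if "x-api-key" in h:
        return h["x-api-key"]
    auth = h.get("authorization", "")
    if auth.startswith("Bearer "):
        return auth[len("Bearer "):]
    return None
```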

Anthropic version

Send anthropic-version: 2023-06-01 on every Anthropic-style request. Anything older is rejected; anything newer is accepted as long as the wire format is compatible.
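Because the version string is an ISO date, "anything older is rejected" can be checked with a plain string comparison — ISO-8601 dates order correctly lexicographically. A sketch of client-side validation, not the server's implementation:

```python
MIN_VERSION = "2023-06-01"  # oldest accepted anthropic-version

def version_accepted(version):
    # ISO-8601 dates compare correctly as strings:
    # "2023-01-01" < "2023-06-01" < "2024-10-22"
    return version >= MIN_VERSION
```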

Where to put the key

  • Never in URLs / query strings — they leak via referrer headers and proxy logs.
  • Never in client-side bundles — same key issued to your CI is the same key serving your support team.
  • Server-side env is fine. Rotate on suspicion.

Self-introspect: GET /v1/key/info with your own key returns the full plan, usage, rate limits, and team membership. The dashboard is just a visual layer over this endpoint.
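A client-side check against the key-info payload might look like this (the `models` field semantics follow the FAQ: empty or missing means no restriction; the sample payloads in the test are hypothetical):

```python
def model_allowed(key_info, model):
    """Check a model id against the whitelist in GET /v1/key/info.
    An empty or missing `models` list means no restriction."""
    whitelist = key_info.get("models") or []
    return not whitelist or model in whitelist
```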

Endpoints

Anthropic + OpenAI surfaces, side by side.

method  path                         shape      note
POST    /v1/messages                 Anthropic  Messages API, stream + non-stream
POST    /v1/messages/count_tokens    Anthropic  Token estimator
POST    /v1/messages/batches         Anthropic  Async batch submit
GET     /v1/messages/batches         Anthropic  List batches
GET     /v1/messages/batches/{id}    Anthropic  Batch status
POST    /v1/chat/completions         OpenAI     Chat completions, stream + non-stream
GET     /v1/models                   both       Live model catalogue
GET     /v1/files · POST /v1/files   Anthropic  Files API (upload, list)
GET     /v1/files/{id} · /content    Anthropic  Get metadata / blob
GET     /v1/key/info                            Self-introspect: plan, usage, limits

Base URL

https://api.ethereal.llc — Anthropic SDK doesn't want a /v1 suffix (the SDK adds it). OpenAI SDK does — set base to https://api.ethereal.llc/v1.
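The /v1 rule means both SDKs end up requesting the same absolute paths; a sketch under the assumption that each SDK simply appends its route to the configured base:

```python
# Anthropic SDK: base WITHOUT /v1 — the SDK appends /v1/messages itself.
ANTHROPIC_BASE = "https://api.ethereal.llc"
# OpenAI SDK: base WITH /v1 — the SDK appends /chat/completions.
OPENAI_BASE = "https://api.ethereal.llc/v1"

anthropic_url = ANTHROPIC_BASE + "/v1/messages"
openai_url = OPENAI_BASE + "/chat/completions"
```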

Models

Latest Claude families. Full versioned ids preferred.

Available ids

  • claude-opus-4-7 · claude-opus-4-6 · claude-opus-4-5
  • claude-sonnet-4-6 · claude-sonnet-4-5
  • claude-haiku-4-5
  • claude-3-7-sonnet-20250219
  • claude-3-5-sonnet-20241022 · claude-3-5-haiku-20241022
  • claude-3-opus-20240229

Live catalogue

curl https://api.ethereal.llc/v1/models \
  -H "x-api-key: $ETHEREAL_KEY"

Returns the full list with display names and creation dates. The dot-form (claude-opus-4.7) and dash-form (claude-opus-4-7) both resolve to the same model.

Aliases: claude-sonnet-4 (without the minor version) is not in the catalogue and may route to a guard-rail default. Use the full versioned id.
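Since dot-form and dash-form resolve to the same model, a client that wants canonical catalogue ids could normalise with something like this (an illustrative helper, not part of the API):

```python
import re

def canonical_model_id(model):
    """Rewrite dot-form version suffixes (claude-opus-4.7) to the
    dash-form listed in the catalogue (claude-opus-4-7)."""
    return re.sub(r"(\d)\.(\d)", r"\1-\2", model)
```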

Streaming

Server-Sent Events on both transports. Identical to upstream wire format.

Anthropic SSE

curl -N https://api.ethereal.llc/v1/messages \
  -H "x-api-key: $ETHEREAL_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 256,
    "stream": true,
    "messages": [{"role": "user", "content": "count to 5"}]
  }'

Events fired: message_start, content_block_start, content_block_delta, content_block_stop, message_delta, message_stop. Stop reason in message_delta.
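Assembling the reply text mostly means concatenating content_block_delta payloads. A minimal sketch over pre-parsed (event name, data) pairs — a real client would also read the raw `event:`/`data:` lines off the socket:

```python
import json

def collect_text(events):
    """Concatenate text_delta fragments from Anthropic SSE events.
    `events` is an iterable of (event_name, json_str) pairs."""
    out = []
    for name, data in events:
        if name == "content_block_delta":
            delta = json.loads(data)["delta"]
            if delta.get("type") == "text_delta":
                out.append(delta["text"])
    return "".join(out)
```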

OpenAI SSE

curl -N https://api.ethereal.llc/v1/chat/completions \
  -H "Authorization: Bearer $ETHEREAL_KEY" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "stream": true,
    "messages": [{"role": "user", "content": "count to 5"}]
  }'

Events come as data: {…} chunks ending with data: [DONE]. Identical to OpenAI's own format.
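The data: framing can be consumed in a few lines — a sketch over raw SSE lines, ignoring keep-alives and assuming well-formed chunks:

```python
import json

def collect_openai_text(lines):
    """Join delta.content from OpenAI-style `data: {...}` lines,
    stopping at `data: [DONE]`."""
    out = []
    for line in lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        out.append(delta.get("content") or "")  # first chunk has role only
    return "".join(out)
```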

Tool use

Function calling on both transports.

Anthropic shape

{
  "model": "claude-sonnet-4-5",
  "max_tokens": 1024,
  "tools": [{
    "name": "get_weather",
    "description": "Look up current weather for a city",
    "input_schema": {
      "type": "object",
      "properties": {"city": {"type": "string"}},
      "required": ["city"]
    }
  }],
  "messages": [{"role": "user", "content": "weather in Berlin?"}]
}

OpenAI shape

{
  "model": "claude-sonnet-4-5",
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Look up current weather for a city",
      "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"]
      }
    }
  }],
  "messages": [{"role": "user", "content": "weather in Berlin?"}]
}

The model emits a tool_use content block (Anthropic) or a tool_calls message (OpenAI). Round-trip the result back as tool_result / tool role on the next turn.
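On the Anthropic side, the round-trip means appending the assistant turn and then answering with a tool_result block; a sketch (the tool_use id and weather result in the test are hypothetical):

```python
def append_tool_result(messages, assistant_content, tool_use_id, result):
    """Extend the conversation after a tool_use stop: echo the
    assistant's content back, then reply with a tool_result block."""
    messages = list(messages)
    messages.append({"role": "assistant", "content": assistant_content})
    messages.append({
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": result,
        }],
    })
    return messages
```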

Vision

Pass images inline as base64 or by URL.

{
  "model": "claude-sonnet-4-5",
  "max_tokens": 1024,
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "What's in this picture?"},
      {"type": "image", "source": {
        "type": "base64",
        "media_type": "image/png",
        "data": "iVBORw0KGgoAAA..."
      }}
    ]
  }]
}

URL form — {"type": "image", "source": {"type": "url", "url": "https://…"}} — fetches the image server-side. PNG, JPEG, GIF, WEBP supported.

Token budget: Each image is billed at ~1200 input tokens regardless of size. Large images are downscaled before going to the model.
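Building the base64 image block from raw bytes is straightforward; a sketch (the media type must match the actual bytes you pass in):

```python
import base64

def image_block(data, media_type="image/png"):
    """Wrap raw image bytes as an Anthropic-style image content block."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            # base64-encode, then decode to str so it JSON-serialises
            "data": base64.b64encode(data).decode("ascii"),
        },
    }
```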

Prompt caching

Mark a block with cache_control; subsequent requests with the same prefix replay it cheaply.

{
  "model": "claude-sonnet-4-5",
  "max_tokens": 512,
  "system": [
    {
      "type": "text",
      "text": "Long context document goes here…",
      "cache_control": {"type": "ephemeral"}
    }
  ],
  "messages": [{"role": "user", "content": "summarise the doc"}]
}

Cached prefixes are scoped to the requesting key. They expire ~5 minutes after the last hit. The response's usage.cache_creation_input_tokens / cache_read_input_tokens fields tell you what was written to the cache vs what was read back from it.
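Reading those usage fields, a cache hit rate can be computed like this (an illustrative helper; field names as in the Anthropic usage object, sample numbers hypothetical):

```python
def cache_read_fraction(usage):
    """Fraction of input tokens served from cache, out of all input
    tokens the request touched (fresh + cache writes + cache reads)."""
    fresh = usage.get("input_tokens", 0)
    created = usage.get("cache_creation_input_tokens", 0)
    read = usage.get("cache_read_input_tokens", 0)
    total = fresh + created + read
    return read / total if total else 0.0
```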

Batches

Async submit, poll, fetch results. For when you don't need a synchronous answer.

Submit

POST /v1/messages/batches
{
  "requests": [
    {"custom_id": "q1", "params": {…}},
    {"custom_id": "q2", "params": {…}}
  ]
}

Poll

GET /v1/messages/batches/{id}
{
  "id": "msgbatch_…",
  "processing_status": "in_progress" | "ended",
  "request_counts": {"processing":1, "succeeded":1, "errored":0, ...}
}

Results

GET /v1/messages/batches/{id}/results returns NDJSON: one line per request.

{"custom_id":"q1","result":{"type":"succeeded","message":{...}}}
{"custom_id":"q2","result":{"type":"succeeded","message":{...}}}
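The NDJSON results map naturally onto a dict keyed by custom_id (a sketch; one JSON object per line, as shown above):

```python
import json

def parse_batch_results(ndjson):
    """Index batch results by custom_id."""
    results = {}
    for line in ndjson.splitlines():
        if not line.strip():  # tolerate trailing blank lines
            continue
        row = json.loads(line)
        results[row["custom_id"]] = row["result"]
    return results
```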

Rate limits

Per-key cap, with retry-after. Two header sets to read.

Headers

header                                              meaning
x-ratelimit-remaining                               requests left in the current window
x-ratelimit-reset                                   seconds until the window rolls over
anthropic-ratelimit-requests-limit                  per-key RPM cap
anthropic-ratelimit-requests-remaining              same as x-ratelimit-remaining
anthropic-ratelimit-requests-reset                  ISO-8601 reset timestamp
anthropic-ratelimit-tokens-{limit,remaining,reset}  token bucket mirror
retry-after                                         on 429 — seconds to wait

Backoff

On 429, read retry-after and sleep at least that long. Exponential backoff with jitter on top is fine. Don't hammer — repeated 429s extend the cooldown window.
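One way to honour retry-after while still backing off exponentially — a sketch; tune the cap and base for your workload:

```python
import random

def backoff_delay(attempt, retry_after=None, cap=60.0):
    """Delay before the next attempt: never less than retry-after,
    plus exponential backoff with full jitter on repeated failures."""
    base = min(cap, 2.0 ** attempt)      # 1, 2, 4, 8, ... capped
    jitter = random.uniform(0.0, base)   # full jitter de-syncs clients
    return max(retry_after or 0.0, jitter)
```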

Pool exhaustion: A 429 with a long retry-after (minutes to hours) means the upstream pool is saturated, not just your key. Drop your request rate or fall back to your own provider until it clears.

Errors

Anthropic-shape error envelopes. Standard HTTP status codes.

{
  "type": "error",
  "error": {
    "type": "rate_limit_error",
    "message": "Per-key rate limit exceeded"
  }
}

status  type                   meaning
400     invalid_request_error  malformed body or bad parameter
401     authentication_error   key missing or invalid
403     permission_error       key disabled, model not on plan
404     not_found_error        unknown resource
413     request_too_large      body over 32 MB or context over model window
429     rate_limit_error       per-key or upstream throttle
500     api_error              internal failure; retry
529     overloaded_error       upstream overloaded; retry with backoff
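A client-side retry policy derived from this table might look like the following sketch (429 should additionally respect retry-after, as the rate-limits section describes):

```python
RETRYABLE = {429, 500, 529}  # throttle, internal failure, overloaded

def should_retry(status, attempt, max_attempts=5):
    """Retry only transient statuses, and only within a retry budget."""
    return status in RETRYABLE and attempt < max_attempts
```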

Request id

Every response carries request-id: req_01…. Quote it in support tickets — it's how we find your request in logs.

Integrations

If your tool speaks Anthropic or OpenAI wire format, point its base URL at https://api.ethereal.llc and use $ETHEREAL_KEY.

Coding agents (terminal)

IDEs & editors

VS Code extensions

Other

FAQ

Why is my key starting with sk-ant-api03- rejected on the Anthropic SDK?

Anthropic SDK passes the key under x-api-key; that's what we accept by default. If you're going through the OpenAI adapter, use Authorization: Bearer. Both work, just pick one per request.

Does prompt caching work?

Yes. Mark blocks with cache_control: {"type":"ephemeral"}. Cache is scoped to your key.

Can I count tokens before sending?

POST /v1/messages/count_tokens with the same body as /v1/messages. Returns {"input_tokens": N}.

Streaming?

Set stream: true. Anthropic SSE on /v1/messages, OpenAI SSE on /v1/chat/completions. Identical to upstream wire format.

What's the request body limit?

32 MB, matching Anthropic. If you need more, batch it or use the Files API.

Tool use across both transports?

Yes — Anthropic tool_use on /v1/messages, OpenAI tool_calls on /v1/chat/completions. The model and underlying call are the same.

Vision?

Inline base64 or URL. ~1200 input tokens per image. PNG / JPEG / GIF / WEBP.

Can my key access every model?

Depends on plan. GET /v1/key/info returns models: [...] — that's the actual whitelist. Empty/missing = no restriction.

Support

First check the dashboard, then reach out.

Self-serve

  • Usage / billing / plan: /dashboard
  • Key health: GET /v1/key/info with the key in question
  • Model availability: GET /v1/models
  • Live status: GET /healthz on the API host

What to include in a ticket

  • request-id from the failing response (response header)
  • Key prefix (first 12 chars; never the whole key)
  • UTC time window (≤30 minutes if possible)
  • Endpoint + method, model, body size approximation
  • Response body if it's an error envelope

Reach the operator via the channel printed in your invite. Keep tickets short and quote logs verbatim — paraphrasing eats half the debug data.
