Epithre AI Platform · API Documentation

Indonesian-tuned AI inference, OpenAI-compatible.

Chat completions, embeddings, reranking, and image generation — all served from Jakarta, all callable with the same SDK you already use for OpenAI. Six models across three tiers, one API contract.

Drop-in compatible with the OpenAI Python and JavaScript SDKs: just change the base URL. Tuned for Bahasa Indonesia. Self-hosted (your prompts never leave Indonesia).

6 models · 200K max context (PRME) · $0.04 / 1M embedding tokens · $5 free credit on signup
5-minute quickstart

Make your first request

  1. Sign up at platform.epithre.com and verify your email (unlocks $5 free credit).
  2. In the dashboard, create an API key. Copy the esk_live_… token immediately — the full value is shown only once.
  3. Replace $EPITHRE_KEY below with your key.
curl https://api.epithre.com/v1/chat/completions \
  -H "Authorization: Bearer $EPITHRE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "epithre-omni",
    "messages": [
      {"role": "system", "content": "Kamu asisten yang menjawab singkat dalam bahasa Indonesia."},
      {"role": "user", "content": "Apa ibu kota Jepang dan apa mata uangnya?"}
    ]
  }'
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["EPITHRE_KEY"],
    base_url="https://api.epithre.com/v1",  # ← only line that changes vs OpenAI
)

resp = client.chat.completions.create(
    model="epithre-omni",
    messages=[
        {"role": "system", "content": "Kamu asisten yang menjawab singkat dalam bahasa Indonesia."},
        {"role": "user", "content": "Apa ibu kota Jepang dan apa mata uangnya?"},
    ],
)
print(resp.choices[0].message.content)
# → "Ibu kota Jepang adalah Tokyo. Mata uangnya yen Jepang (JPY)."
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.EPITHRE_KEY,
  baseURL: "https://api.epithre.com/v1",  // ← only line that changes vs OpenAI
});

const resp = await client.chat.completions.create({
  model: "epithre-omni",
  messages: [
    { role: "system", content: "Kamu asisten yang menjawab singkat dalam bahasa Indonesia." },
    { role: "user", content: "Apa ibu kota Jepang dan apa mata uangnya?" },
  ],
});
console.log(resp.choices[0].message.content);
package main

import (
    "context"
    "fmt"
    "os"

    "github.com/sashabaranov/go-openai"
)

func main() {
    config := openai.DefaultConfig(os.Getenv("EPITHRE_KEY"))
    config.BaseURL = "https://api.epithre.com/v1"
    client := openai.NewClientWithConfig(config)

    resp, err := client.CreateChatCompletion(context.Background(), openai.ChatCompletionRequest{
        Model: "epithre-omni",
        Messages: []openai.ChatCompletionMessage{
            {Role: "user", Content: "Apa ibu kota Jepang?"},
        },
    })
    if err != nil {
        panic(err)
    }
    fmt.Println(resp.Choices[0].Message.Content)
}
SDK compat: the OpenAI SDK works as-is. Only the base_url changes. All standard parameters (temperature, tools, stream, response_format, vision image_url) work the same way.
Coming from another provider

Migrating from OpenAI in 5 lines

Already using OpenAI? Switch to Epithre by swapping the API key, the base URL, and the model name:

# Before
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
resp = client.chat.completions.create(model="gpt-4o", ...)

# After
client = OpenAI(
    api_key=os.environ["EPITHRE_KEY"],
    base_url="https://api.epithre.com/v1",
)
resp = client.chat.completions.create(model="epithre-omni", ...)

Model mapping

| OpenAI model | Epithre equivalent | Notes |
|---|---|---|
| gpt-4o / gpt-5 | epithre-omni | Multimodal flagship, similar capability tier |
| o1-pro / long-context | epithre-prme | 200K context, reasoning + coding |
| gpt-4o-mini / gpt-3.5 | epithre-lyt | Fast, cheap, multimodal incl. audio + video |
| text-embedding-3-large | epithre-embed | 4000-dim, Indonesian-tuned, MRL-truncatable |
| dall-e-3 | epithre-iris | FLUX-based, supports multi-ref editing |


Required for every call

Authentication

Send your API key in the Authorization header on every request to /v1/*:

Authorization: Bearer esk_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Key types

| Prefix | Use for |
|---|---|
| esk_live_* | Production workloads. Real billing applies. |
| esk_test_* | Sandbox / CI. Same endpoints; separate billing for clarity. |


Reference

Endpoints

Base URL: https://api.epithre.com

Models

GET /v1/models

List all available models with their tiers, contexts, and capabilities. Public — no auth required.

Response shape

{
  "object": "list",
  "data": [
    {
      "id": "epithre-omni",
      "object": "model",
      "owned_by": "epithre",
      "tier": "flagship",
      "context_window": 49152,
      "max_output_tokens": 16384,
      "modalities": ["text", "image"],
      "capabilities": ["chat", "tool_use", "vision", "thinking"],
      "description": "...",
      "pricing_model": "per_token"
    },
    ...
  ]
}
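Since the endpoint is public, model discovery can be scripted without a key. Below is a minimal sketch using only the standard library; the helper names (filter_by_capability, chat_model_ids) are illustrative, not part of any SDK, while the field names come from the response shape above.

```python
import json
import urllib.request

def filter_by_capability(models, capability):
    """Keep IDs of models whose 'capabilities' list includes the capability."""
    return [m["id"] for m in models if capability in m.get("capabilities", [])]

def chat_model_ids(base_url="https://api.epithre.com"):
    """Fetch /v1/models (no auth required) and return the chat-capable IDs."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=10) as r:
        return filter_by_capability(json.load(r)["data"], "chat")
```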

Available models

| ID | Tier | Context | Best for |
|---|---|---|---|
| epithre-prme | Premium | 200,000 | Long-context reasoning, full-codebase analysis, document review |
| epithre-omni | Flagship | 49,152 | General chat with vision, agentic tool-use, extended thinking |
| epithre-lyt | Compact | 32,768 | High-throughput cheap chat, image + audio + video input |
| epithre-embed | Embedding | 4,096 | Semantic search, RAG retrieval (4000-dim) |
| epithre-rerank | Reranker | 2,048 | Boosting retrieval quality after embed search |
| epithre-iris | Image | — | Text-to-image, multi-reference image editing |

Chat Completions

POST /v1/chat/completions

The workhorse endpoint. Conversational text generation with optional streaming, tool calling, and vision. Compatible with OpenAI's chat-completions SDK.

When to use which model

epithre-omni

Default choice for chat. Handles text + images, supports tool calling, multilingual including strong Bahasa Indonesia.

epithre-prme

When prompts exceed 32K tokens. Full-document analysis, multi-file code review, long agentic chains.

epithre-lyt

High-volume cheap chat. Use when latency matters and you don't need flagship reasoning. Also: audio + video inputs.

Request body

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | yes | epithre-prme, epithre-omni, or epithre-lyt |
| messages | array | yes | Standard chat messages. Roles: system, user, assistant, tool. Each message has content (string or vision array). |
| max_tokens | int | no | Max output tokens. Default model-dependent; max 16384 (omni/prme), 4096 (lyt) |
| temperature | float | no | 0.0-2.0, default 1.0. Lower = more deterministic |
| top_p | float | no | 0.0-1.0 nucleus sampling, default 1.0 |
| stream | bool | no | If true, returns SSE chunks. Final chunk includes usage. |
| tools | array | no | Function definitions. Standard OpenAI tools schema. |
| tool_choice | string or object | no | "auto" (default), "none", or {"type":"function","function":{"name":"..."}} |
| response_format | object | no | {"type":"json_object"} for guaranteed JSON output |
| chat_template_kwargs | object | no | e.g. {"enable_thinking": true} for extended thinking mode on Omni/PRME |
| seed | int | no | For reproducibility (best-effort) |
| stop | string or array | no | Stop sequences |

Response shape (non-streaming)

{
  "id": "chatcmpl-xxxxxx",
  "object": "chat.completion",
  "created": 1778455870,
  "model": "epithre-omni",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Ibu kota Jepang adalah Tokyo."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 18,
    "completion_tokens": 9,
    "total_tokens": 27
  }
}

Streaming response (SSE)

data: {"id":"chatcmpl-x","choices":[{"index":0,"delta":{"role":"assistant","content":""}}]}

data: {"id":"chatcmpl-x","choices":[{"index":0,"delta":{"content":"Ibu kota "}}]}

data: {"id":"chatcmpl-x","choices":[{"index":0,"delta":{"content":"Jepang adalah "}}]}

data: {"id":"chatcmpl-x","choices":[{"index":0,"delta":{"content":"Tokyo."}}]}

data: {"id":"chatcmpl-x","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: {"id":"chatcmpl-x","choices":[],"usage":{"prompt_tokens":18,"completion_tokens":9,"total_tokens":27}}

data: [DONE]
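If you consume the stream without an SDK, each SSE line carries a JSON chunk under a "data: " prefix and the stream ends with a [DONE] sentinel. A minimal parser sketch (iter_sse is an illustrative name, not an SDK function):

```python
import json

def iter_sse(lines):
    """Yield parsed chunk dicts from 'data: ...' lines; stop at [DONE]."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":
            return
        yield json.loads(payload)
```

Concatenate each chunk's choices[0].delta.content to rebuild the text; the final chunk before [DONE] carries usage instead of choices.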

Examples

Streaming chat

curl https://api.epithre.com/v1/chat/completions \
  -H "Authorization: Bearer $EPITHRE_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "epithre-omni",
    "messages": [{"role": "user", "content": "Tulis 3 fakta tentang Jakarta"}],
    "stream": true,
    "max_tokens": 200
  }'
stream = client.chat.completions.create(
    model="epithre-omni",
    messages=[{"role": "user", "content": "Tulis 3 fakta tentang Jakarta"}],
    stream=True,
    max_tokens=200,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
const stream = await client.chat.completions.create({
  model: "epithre-omni",
  messages: [{ role: "user", content: "Tulis 3 fakta tentang Jakarta" }],
  stream: true,
  max_tokens: 200,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}

Vision (omni/lyt)

import base64
img_b64 = base64.b64encode(open("invoice.jpg", "rb").read()).decode()

resp = client.chat.completions.create(
    model="epithre-omni",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Ekstrak total pembayaran dari invoice ini sebagai angka."},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{img_b64}"}},
        ]
    }],
)
print(resp.choices[0].message.content)
IMG=$(base64 -w0 invoice.jpg)
curl https://api.epithre.com/v1/chat/completions \
  -H "Authorization: Bearer $EPITHRE_KEY" \
  -H "Content-Type: application/json" \
  -d "$(jq -n --arg img "$IMG" '{
    "model": "epithre-omni",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Ekstrak total pembayaran sebagai angka."},
        {"type": "image_url", "image_url": {"url": ("data:image/jpeg;base64," + $img)}}
      ]
    }]
  }')"

Tool calling

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city in Indonesia",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name in Indonesian"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["city"]
        }
    }
}]

resp = client.chat.completions.create(
    model="epithre-omni",
    messages=[{"role": "user", "content": "Cuaca Jakarta sekarang gimana?"}],
    tools=tools,
)

# Model decides to call the tool:
tool_call = resp.choices[0].message.tool_calls[0]
print(tool_call.function.name)       # → "get_weather"
print(tool_call.function.arguments)  # → '{"city": "Jakarta", "unit": "celsius"}'

# Execute your function, return result:
followup = client.chat.completions.create(
    model="epithre-omni",
    messages=[
        {"role": "user", "content": "Cuaca Jakarta sekarang gimana?"},
        resp.choices[0].message,
        {"role": "tool", "tool_call_id": tool_call.id, "content": '{"temp": 31, "unit": "C", "condition": "cerah berawan"}'},
    ],
    tools=tools,
)
print(followup.choices[0].message.content)
# → "Jakarta sekarang 31°C, cerah berawan."

JSON mode

resp = client.chat.completions.create(
    model="epithre-omni",
    messages=[
        {"role": "system", "content": "Selalu balas dalam JSON."},
        {"role": "user", "content": "Ekstrak nama, umur, pekerjaan dari: 'Pak Budi, 45 tahun, dokter di RSCM'"},
    ],
    response_format={"type": "json_object"},
)
import json
data = json.loads(resp.choices[0].message.content)
# → {"nama": "Pak Budi", "umur": 45, "pekerjaan": "dokter"}

Common errors

| HTTP | Code | Cause | Fix |
|---|---|---|---|
| 400 | model_not_found | Bad model value | Use one of the 3 chat models |
| 400 | invalid_request_error | Empty messages array | Send at least one message |
| 429 | backend_busy | Aggregate Epithre traffic at backend cap | Retry in ~1 second (exponential backoff) |
| 429 | concurrency_exceeded | Your key's concurrent cap hit | Reduce parallelism or raise cap in dashboard |

Embeddings

POST /v1/embeddings

Convert text into 4000-dim L2-normalized vectors for semantic search, RAG, clustering, and classification. Native model: Qwen3-Embed-8B (Indonesian-optimized, MRL-truncatable).

Semantic search

Embed your document corpus once, store in a vector DB (pgvector, Pinecone, Qdrant). Embed each user query and find nearest neighbors.

RAG retrieval

Pre-step before chat. Pair with rerank to push relevance over 95%. See RAG recipe.

Classification

Compute centroid embeddings per class, then assign new texts to nearest centroid. Strong baseline before training a model.
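The centroid approach can be sketched in a few lines of numpy; centroid_classify is an illustrative helper, and the vectors are assumed to come from epithre-embed (already L2-normalized):

```python
import numpy as np

def centroid_classify(class_vecs, query_vec):
    """class_vecs: {label: (n_i, d) array of that class's embeddings}.
    Returns the label whose centroid is most cosine-similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)

    def sim(label):
        c = class_vecs[label].mean(axis=0)      # class centroid
        return (c / np.linalg.norm(c)) @ q      # cosine similarity

    return max(class_vecs, key=sim)
```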

Request body

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | yes | Must be epithre-embed |
| input | string or array | yes | Single text or 1-64 strings per request |
| dimensions | int | no | 1-4000. If < 4000, MRL truncates & re-L2-normalizes server-side. Lossless prefix. |
| instruction | string | no | Qwen3 native task instruction prefix (e.g., "Retrieve relevant passages about Indonesian forestry law") |
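The dimensions behavior is reproducible client-side, which is handy if you store full 4000-dim vectors and want to truncate later. A sketch (mrl_truncate is an illustrative name):

```python
import numpy as np

def mrl_truncate(embedding, dims):
    """Keep the first `dims` components, then re-L2-normalize,
    matching what the server does when `dimensions` < 4000."""
    v = np.asarray(embedding, dtype=np.float64)[:dims]
    return v / np.linalg.norm(v)
```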

Response shape

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0107, -0.0022, 0.0152, ...],  // 4000 floats (or `dimensions` if set)
      "index": 0
    },
    {
      "object": "embedding",
      "embedding": [...],
      "index": 1
    }
  ],
  "model": "epithre-embed",
  "usage": {"prompt_tokens": 24, "total_tokens": 24}
}

Examples

Embed a batch with MRL truncation

resp = client.embeddings.create(
    model="epithre-embed",
    input=[
        "penebangan liar di hutan lindung",
        "perlindungan satwa dilindungi UU 5/1990",
        "kebijakan ekspor batubara 2024",
    ],
    dimensions=1024,  # Truncate for smaller storage; still good quality
)
import numpy as np
vecs = np.array([e.embedding for e in resp.data])
print(vecs.shape)  # → (3, 1024)
print("cost:", resp.usage.prompt_tokens, "input tokens")
curl https://api.epithre.com/v1/embeddings \
  -H "Authorization: Bearer $EPITHRE_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "epithre-embed",
    "input": ["penebangan liar di hutan lindung", "perlindungan satwa dilindungi UU 5/1990"],
    "dimensions": 1024
  }'

With instruction prefix (Qwen3 native)

resp = client.embeddings.create(
    model="epithre-embed",
    input=["UU 41/1999 tentang Kehutanan pasal 50"],
    instruction="Given a legal query, retrieve relevant Indonesian regulations",
)
Performance tip: Batch up to 64 inputs per request. One batch is roughly 10× faster and cheaper than 64 individual calls. Embeddings are deterministic, so results are safe to cache by input hash.
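Because embeddings are deterministic, a content-hash cache avoids paying twice for repeated inputs. A sketch; cached_embed and its embed_fn callback (a thin wrapper you would write around client.embeddings.create) are illustrative:

```python
import hashlib

_cache = {}

def _key(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def cached_embed(embed_fn, texts):
    """embed_fn: callable taking a list of texts, returning a list of vectors.
    Only texts not already cached are sent to the API."""
    missing = [t for t in texts if _key(t) not in _cache]
    if missing:
        for t, vec in zip(missing, embed_fn(missing)):
            _cache[_key(t)] = vec
    return [_cache[_key(t)] for t in texts]
```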

Rerank

POST /v1/rerank

Cohere-compatible reranking endpoint. After vector search returns top-K candidates, rerank narrows to the most relevant ones using a heavier cross-encoder model. Boosts Indonesian retrieval quality from ~85% to ~98% in our benchmarks.

Request body

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | yes | Must be epithre-rerank |
| query | string | yes | The user search query |
| documents | array | yes | 1-64 candidate documents (typically top-K from embed search) |
| top_n | int | no | Return top N after sorting (default: all, sorted descending) |
| return_documents | bool | no | Include document text in response (default false = just indices + scores) |
| instruction | string | no | Custom reranking instruction (overrides default) |

Response shape

{
  "id": "2b475d38-c17c-43b9-ba9a-7f485af38e0f",
  "results": [
    {"index": 3, "relevance_score": 0.1480, "document": {"text": "..."}},
    {"index": 0, "relevance_score": 0.0567, "document": {"text": "..."}},
    {"index": 1, "relevance_score": 0.0000}
  ],
  "meta": {"billed_units": {"search_units": 1}}
}

Example

import os

import httpx

EPITHRE_KEY = os.environ["EPITHRE_KEY"]

resp = httpx.post(
    "https://api.epithre.com/v1/rerank",
    headers={"Authorization": f"Bearer {EPITHRE_KEY}"},
    json={
        "model": "epithre-rerank",
        "query": "perlindungan hutan lindung dari penebangan",
        "documents": [
            "UU 41/1999 pasal 50 — perusakan hutan",
            "Permen LHK satwa dilindungi",
            "Keppres pemilu 2024",
            "Pasal pengelolaan hutan lindung",
        ],
        "top_n": 3,
        "return_documents": True,
    },
).json()

for r in resp["results"]:
    print(f'{r["relevance_score"]:.3f}  {r["document"]["text"]}')

# → 0.148  Pasal pengelolaan hutan lindung
# → 0.057  UU 41/1999 pasal 50 — perusakan hutan
# → 0.000  Permen LHK satwa dilindungi   ← irrelevant, low score
Score semantics: Scores are P(yes) / (P(yes)+P(no)) from the model. Range [0, 1]. Indonesian queries often produce low absolute values (0.05-0.30 for true matches). Use rank order, not absolute threshold.

Image Generation

POST /v1/images/generations

Text-to-image using FLUX-Klein. Returns inline base64-encoded PNG. Optional style LoRAs for "dark" or "anime" aesthetics.

Request body

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | yes | Must be epithre-iris |
| prompt | string | yes | Max 2000 chars. English works best; Indonesian also supported. |
| size | string | no | "WxH", max 960×960. Default 768×768. Width & height rounded down to nearest multiple of 16. |
| n | int | no | Currently fixed at 1 (batch not yet supported) |
| response_format | string | no | "b64_json" (default and only supported) |
| num_steps | int | no | 1-50, default 4. More steps = slower but slightly higher quality. |
| seed | int | no | Seed for reproducibility. -1 = random. |
| guidance_scale | float | no | Default 1.0. Higher = more literal prompt adherence. |
| lora | string | no | "none" (default), "dark" (moody/cinematic), "anime" |
| lora_strength | float | no | 0.0-1.5, default 0.6 (for non-none LoRAs) |
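Since width and height are rounded down to multiples of 16 and capped at 960, you can normalize sizes client-side to avoid surprises; snap_size is an illustrative helper based on the rules above:

```python
def snap_size(width, height, max_side=960):
    """Clamp each side to max_side and round down to a multiple of 16 (min 16)."""
    def snap(x):
        return max(16, min(x, max_side) // 16 * 16)
    return f"{snap(width)}x{snap(height)}"
```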

Response shape

{
  "created": 1778455000,
  "data": [
    {"b64_json": "iVBORw0KGgoAAAANSUhEUgAA..."}  // base64-encoded PNG
  ]
}

Examples

Basic generation

resp = client.images.generate(
    model="epithre-iris",
    prompt="a serene Indonesian beach at sunset, photorealistic, golden hour",
    size="768x768",
)

import base64
img_bytes = base64.b64decode(resp.data[0].b64_json)
with open("output.png", "wb") as f:
    f.write(img_bytes)

With anime LoRA

resp = httpx.post(
    "https://api.epithre.com/v1/images/generations",
    headers={"Authorization": f"Bearer {EPITHRE_KEY}"},
    json={
        "model": "epithre-iris",
        "prompt": "a young samurai in a bamboo forest, cherry blossoms falling",
        "size": "768x768",
        "lora": "anime",
        "lora_strength": 0.8,
        "seed": 42,
    },
).json()
Latency: ~12-19 seconds for 768×768 @ 4 steps. Anime LoRA adds ~5s. For preview/iteration, use 4 steps; for final renders, 20-30 steps.

Image Edit

POST /v1/images/edits

Edit an existing image with a text prompt. Supports single source or up to 5 reference images for compositional editing (e.g., "put the product from img1 in the setting from img2").

Request body

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | yes | epithre-iris |
| prompt | string | yes | Edit instruction (e.g., "change background to sunset", "make the dress red") |
| image | string (base64) | conditional | Single source image. PNG/JPEG/WebP/GIF, max 20 MB decoded. Mutually exclusive with images. |
| images | array | conditional | 1-5 reference images for compositional editing. Mutually exclusive with image. |
| size | string | no | Max 704×704 for edit. Default matches input. |
| strength | float | no | 0.0-1.0, default 0.75. Higher = bigger change from source. |
| num_inference_steps | int | no | 1-50, default 4 |
| lora | string | no | Same options as generation |

Response shape

Same as /v1/images/generations: {"created", "data": [{"b64_json"}]}

Example: single-image edit

import base64
src = base64.b64encode(open("original.png", "rb").read()).decode()

resp = httpx.post(
    "https://api.epithre.com/v1/images/edits",
    headers={"Authorization": f"Bearer {EPITHRE_KEY}"},
    json={
        "model": "epithre-iris",
        "prompt": "change the sky to dramatic stormy clouds with lightning",
        "image": src,
        "size": "512x512",
        "strength": 0.7,
    },
).json()

Example: multi-reference compositing

imgs = [base64.b64encode(open(f"ref_{i}.png", "rb").read()).decode()
        for i in range(3)]

resp = httpx.post(
    "https://api.epithre.com/v1/images/edits",
    headers={"Authorization": f"Bearer {EPITHRE_KEY}"},
    json={
        "model": "epithre-iris",
        "prompt": "the product from image 1, displayed in the studio setting from image 2, in the photography style of image 3",
        "images": imgs,
        "size": "640x640",
    },
).json()
Image format check: The API validates magic bytes — only PNG, JPEG, WebP, GIF accepted. Data URIs like data:image/png;base64,... are auto-stripped. URL inputs are rejected (SSRF guard).
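You can mirror the server's magic-byte validation before uploading and fail fast on unsupported formats. A sketch; sniff_image is an illustrative helper covering the four accepted formats:

```python
from typing import Optional

def sniff_image(data: bytes) -> Optional[str]:
    """Detect PNG/JPEG/WebP/GIF by magic bytes; None means the API will reject it."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None
```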
End-to-end patterns

Recipes

RAG: embed + rerank + chat

The canonical pattern: index your knowledge base with embed, retrieve top-K with vector search, narrow with rerank, then synthesize an answer with chat.

import os, httpx, numpy as np
from openai import OpenAI

EK = os.environ["EPITHRE_KEY"]
client = OpenAI(api_key=EK, base_url="https://api.epithre.com/v1")

# 1) ONE-TIME: embed your corpus
corpus = [
    "UU 41/1999 pasal 50 — barangsiapa merusak hutan akan dipidana...",
    "Permen LHK No. 92/2018 tentang pengelolaan satwa liar...",
    "PP No. 23/2021 tentang penyelenggaraan kehutanan...",
    # ... thousands of docs
]
e = client.embeddings.create(model="epithre-embed", input=corpus, dimensions=1024)
corpus_vecs = np.array([row.embedding for row in e.data])  # (N, 1024)

# 2) AT QUERY TIME: embed the question, find top-K nearest
question = "Apa hukuman untuk perusakan hutan lindung?"
qe = client.embeddings.create(model="epithre-embed", input=question, dimensions=1024)
qv = np.array(qe.data[0].embedding)
scores = corpus_vecs @ qv  # cosine sim (already L2-normalized)
top_k_idx = np.argsort(-scores)[:10]
candidates = [corpus[i] for i in top_k_idx]

# 3) Rerank to push the most relevant to the top
r = httpx.post("https://api.epithre.com/v1/rerank",
    headers={"Authorization": f"Bearer {EK}"},
    json={"model": "epithre-rerank", "query": question, "documents": candidates, "top_n": 3, "return_documents": True},
).json()
context = "\n\n".join(item["document"]["text"] for item in r["results"])

# 4) Generate answer with retrieved context
resp = client.chat.completions.create(
    model="epithre-omni",
    messages=[
        {"role": "system", "content": "Jawab berdasarkan konteks yang diberikan. Sebutkan pasal/peraturan yang relevan."},
        {"role": "user", "content": f"Konteks:\n{context}\n\nPertanyaan: {question}"},
    ],
)
print(resp.choices[0].message.content)

Vision QA on documents

Extract structured data from invoices, receipts, KTP, or any document image:

import base64, json
img = base64.b64encode(open("invoice.jpg", "rb").read()).decode()

resp = client.chat.completions.create(
    model="epithre-omni",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Ekstrak data invoice ini sebagai JSON dengan field: vendor, tanggal, nomor_invoice, item (list), subtotal, ppn, total."},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{img}"}}
        ]
    }],
    response_format={"type": "json_object"},
)
data = json.loads(resp.choices[0].message.content)
print(data["total"])

Build an image generation app

import base64, os

import httpx

EPITHRE_KEY = os.environ["EPITHRE_KEY"]

def generate_thumbnail(topic: str) -> bytes:
    """Generate a marketing thumbnail for a given topic."""
    resp = httpx.post("https://api.epithre.com/v1/images/generations",
        headers={"Authorization": f"Bearer {EPITHRE_KEY}"},
        json={
            "model": "epithre-iris",
            "prompt": f"professional marketing thumbnail for: {topic}, vibrant colors, no text",
            "size": "768x768",
            "num_steps": 8,  # higher quality
            "guidance_scale": 1.5,
        },
        timeout=60,
    ).json()
    return base64.b64decode(resp["data"][0]["b64_json"])

thumb = generate_thumbnail("Indonesian food blog — rendang recipe")
open("thumb.png", "wb").write(thumb)

Function calling for agents

Build an agent that can call functions. Loop until finish_reason == "stop" or no tool calls remain:

import json

def get_weather(city: str) -> dict:
    # your real implementation
    return {"city": city, "temp_c": 31, "condition": "cerah berawan"}

def search_news(query: str) -> list:
    return [...]

tool_handlers = {"get_weather": get_weather, "search_news": search_news}
tool_defs = [...]  # OpenAI tools schema

messages = [{"role": "user", "content": "Cuaca Jakarta + berita terbaru tentang banjir"}]

for _ in range(5):  # max 5 rounds
    resp = client.chat.completions.create(model="epithre-omni", messages=messages, tools=tool_defs)
    msg = resp.choices[0].message
    messages.append(msg)

    if not msg.tool_calls:
        print(msg.content)
        break

    for tc in msg.tool_calls:
        args = json.loads(tc.function.arguments)
        result = tool_handlers[tc.function.name](**args)
        messages.append({
            "role": "tool",
            "tool_call_id": tc.id,
            "content": json.dumps(result),
        })
Operational reference

Error handling

All errors return JSON with this envelope (same as OpenAI):

{
  "error": {
    "message": "Human-readable description",
    "type": "error_category",
    "code": "specific_error_code"
  }
}

Full error catalog

| HTTP | Type | Common codes | What it means | Recovery |
|---|---|---|---|---|
| 400 | invalid_request_error | model_not_found, tos_required, email_not_verified | Bad parameters in request body | Fix client code, don't retry |
| 401 | authentication_error | authentication_error | Missing/invalid/revoked API key | Check Authorization header; create new key if revoked |
| 402 | insufficient_quota | insufficient_quota | API key has $0 balance | Top up via dashboard (email hello@epithre.com) |
| 403 | permission_error | account_suspended, email_not_verified, permission_error | Account suspended OR endpoint needs admin OR email not verified | Check email inbox or contact support |
| 413 | request_too_large | request_too_large | Body exceeds 1 MB (text) or 50 MB (images) | Reduce payload size; truncate prompts; resize images |
| 429 | rate_limit_error | rpm_exceeded, rpd_exceeded, concurrency_exceeded, backend_busy | Various rate limits hit | Exponential backoff (1s, 2s, 4s, …). Increase limits in dashboard if persistent. |
| 502 | backend_error | backend_error | Inference backend returned 5xx | Retry once with backoff |
| 503 | backend_unavailable | backend_unavailable | Backend not reachable (planned maintenance or outage) | Retry with longer backoff; check dashboard health |
| 504 | backend_timeout | backend_timeout | Backend slow / hung | Retry with shorter prompt or simpler request |

Recommended retry pattern

import random
import time

import openai

def call_with_retry(call_fn, max_retries=4):
    for attempt in range(max_retries):
        try:
            return call_fn()
        except openai.APIError as e:
            if e.status_code in (429, 502, 503, 504):
                if attempt == max_retries - 1: raise
                wait = (2 ** attempt) + random.uniform(0, 1)  # jitter
                time.sleep(wait)
                continue
            raise  # don't retry 400/401/402/403

Rate limits

Two layers of limits apply to every request:

Per-key limits (yours to control)

| Limit | Default | Where to change |
|---|---|---|
| Requests per minute (RPM) | 60 | Dashboard → Keys → edit |
| Requests per day (RPD) | 10,000 | Dashboard → Keys → edit |
| Concurrent requests | 10 | Dashboard → Keys → edit |
| Monthly spend cap | $100 | Dashboard → Keys → edit |

Exceeding any of these returns HTTP 429 with a code indicating which limit was hit.
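To stay under the concurrent-request cap from the client side, gate calls with a semaphore sized to your key's limit. A sketch; bounded_call is an illustrative wrapper, not an SDK feature:

```python
import threading

MAX_CONCURRENT = 10  # match your key's concurrent-request limit
_slots = threading.BoundedSemaphore(MAX_CONCURRENT)

def bounded_call(fn, *args, **kwargs):
    """Block until a slot is free, so bursts never trip concurrency_exceeded."""
    with _slots:
        return fn(*args, **kwargs)
```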

Backend capacity (shared)

Each chat model has an aggregate concurrent cap across all Epithre customers, to reserve capacity for other services. When hit, you get HTTP 429 with code: "backend_busy". Wait ~1 second and retry — these clear quickly.

Pricing

All prices in USD. Billed per request, deducted from your active API key's credit balance immediately.

| Model | Input | Output | Unit price |
|---|---|---|---|
| epithre-prme | $0.40 / 1M tok | $1.60 / 1M tok | token |
| epithre-omni | $0.30 / 1M tok | $1.20 / 1M tok | token |
| epithre-lyt | $0.05 / 1M tok | $0.20 / 1M tok | token |
| epithre-embed | $0.04 / 1M tok | — | input token only |
| epithre-rerank | — | — | $0.000002 / document |
| epithre-iris | — | — | $0.005 / image |
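Token-billed chat costs can be estimated directly from a response's usage block; chat_cost is an illustrative helper using the per-token prices above:

```python
PRICES_PER_M = {  # USD per 1M tokens (input, output), from the table above
    "epithre-prme": (0.40, 1.60),
    "epithre-omni": (0.30, 1.20),
    "epithre-lyt": (0.05, 0.20),
}

def chat_cost(model, prompt_tokens, completion_tokens):
    """Estimated USD cost of one chat completion."""
    p_in, p_out = PRICES_PER_M[model]
    return (prompt_tokens * p_in + completion_tokens * p_out) / 1_000_000
```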

Topping up

During alpha, top-up is handled by email: send to hello@epithre.com with the amount and we'll reply with payment details (BCA/Wise/etc). Credit usually applied within 1 business day after payment clears. Stripe/Midtrans integration is planned for general availability.

SDKs & clients

Epithre is OpenAI-wire-compatible. Use any OpenAI SDK and override base_url:

LangChain

import os

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="epithre-omni",
    api_key=os.environ["EPITHRE_KEY"],
    base_url="https://api.epithre.com/v1",
)
llm.invoke("Halo!")

LlamaIndex

import os

from llama_index.core import Settings
from llama_index.llms.openai_like import OpenAILike
from llama_index.embeddings.openai_like import OpenAILikeEmbedding

Settings.llm = OpenAILike(
    model="epithre-omni",
    api_key=os.environ["EPITHRE_KEY"],
    api_base="https://api.epithre.com/v1",
)
Settings.embed_model = OpenAILikeEmbedding(
    model_name="epithre-embed",
    api_key=os.environ["EPITHRE_KEY"],
    api_base="https://api.epithre.com/v1",
)

For endpoints not in OpenAI SDK (rerank, image edit)

Use plain HTTP via your language's standard client (Python httpx, JS fetch, Go net/http). See examples above.

Data policy (summary)

Full policy: privacy.html.

FAQ

Can I use Epithre to power a customer-facing product?

Yes. The standard ToS allows commercial use. Don't resell raw API access as your own product without a separate reseller agreement (contact us). See Terms.

How does Indonesian quality compare to OpenAI?

Our models are specifically fine-tuned on Indonesian data and outperform comparable OpenAI tier models on Indonesian benchmarks (semantic similarity, instruction following, cultural context). Run your own A/B on real production traffic — that's the only honest benchmark.

What's your uptime SLA?

Best-effort during alpha (no contractual SLA). We monitor 24/7 via Uptime Kuma. In practice we target 99.5%+. For mission-critical workloads with hard SLA, contact us for enterprise terms.

Can I run Epithre on-prem?

Not yet — the platform is currently cloud-only. On-prem licensing is on the roadmap for enterprise.

How do I report abuse or a security issue?

Email hello@epithre.com. For security issues, use subject "Security disclosure". We'll respond within 1 business day.

What happens if I hit my monthly spend cap?

Your key returns 402 insufficient_quota for the rest of the month. Raise the cap in the dashboard or top up to continue.

Can I get my data exported / account deleted?

Yes. Email hello@epithre.com. Export typically delivered as CSV within 7 days. Account deletion erases all linked data within 30 days, except billing records (retained 7 years per Indonesian tax law).