Indonesian-tuned AI inference, OpenAI-compatible.
Chat completions, embeddings, reranking, and image generation — all served from Jakarta, all callable with the same SDK you already use for OpenAI. Six models across three tiers, one API contract.
Drop-in compatible with the OpenAI Python and JavaScript SDKs: just change the base URL. Tuned for Bahasa Indonesia. Self-hosted (your prompts never leave Indonesia).
Make your first request
- Sign up at platform.epithre.com and verify your email (unlocks $5 free credit).
- In the dashboard, create an API key. Copy the esk_live_… token immediately — the full value is shown only once.
- Replace $EPITHRE_KEY below with your key.
curl https://api.epithre.com/v1/chat/completions \
-H "Authorization: Bearer $EPITHRE_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "epithre-omni",
"messages": [
{"role": "system", "content": "Kamu asisten yang menjawab singkat dalam bahasa Indonesia."},
{"role": "user", "content": "Apa ibu kota Jepang dan apa mata uangnya?"}
]
}'
import os
from openai import OpenAI
client = OpenAI(
api_key=os.environ["EPITHRE_KEY"],
base_url="https://api.epithre.com/v1", # ← only line that changes vs OpenAI
)
resp = client.chat.completions.create(
model="epithre-omni",
messages=[
{"role": "system", "content": "Kamu asisten yang menjawab singkat dalam bahasa Indonesia."},
{"role": "user", "content": "Apa ibu kota Jepang dan apa mata uangnya?"},
],
)
print(resp.choices[0].message.content)
# → "Ibu kota Jepang adalah Tokyo. Mata uangnya yen Jepang (JPY)."
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.EPITHRE_KEY,
baseURL: "https://api.epithre.com/v1", // ← only line that changes vs OpenAI
});
const resp = await client.chat.completions.create({
model: "epithre-omni",
messages: [
{ role: "system", content: "Kamu asisten yang menjawab singkat dalam bahasa Indonesia." },
{ role: "user", content: "Apa ibu kota Jepang dan apa mata uangnya?" },
],
});
console.log(resp.choices[0].message.content);
package main
import (
"context"
"fmt"
"os"
"github.com/sashabaranov/go-openai"
)
func main() {
config := openai.DefaultConfig(os.Getenv("EPITHRE_KEY"))
config.BaseURL = "https://api.epithre.com/v1"
client := openai.NewClientWithConfig(config)
resp, err := client.CreateChatCompletion(context.Background(), openai.ChatCompletionRequest{
Model: "epithre-omni",
Messages: []openai.ChatCompletionMessage{
{Role: "user", Content: "Apa ibu kota Jepang?"},
},
})
if err != nil { panic(err) }
fmt.Println(resp.Choices[0].Message.Content)
}
Only the base_url changes. All standard parameters (temperature, tools, stream, response_format, vision image_url) work the same way.
Migrating from OpenAI in 5 lines
Already using OpenAI? Switch to Epithre by changing the base URL and the model name:
# Before
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
resp = client.chat.completions.create(model="gpt-4o", ...)

# After
client = OpenAI(
    api_key=os.environ["EPITHRE_KEY"],
    base_url="https://api.epithre.com/v1",
)
resp = client.chat.completions.create(model="epithre-omni", ...)
Model mapping
| OpenAI model | Epithre equivalent | Notes |
|---|---|---|
| gpt-4o / gpt-5 | epithre-omni | Multimodal flagship, similar capability tier |
| o1-pro / long-context | epithre-prme | 200K context, reasoning + coding |
| gpt-4o-mini / gpt-3.5 | epithre-lyt | Fast, cheap, multimodal incl. audio + video |
| text-embedding-3-large | epithre-embed | 4000-dim, Indonesian-tuned, MRL-truncatable |
| dall-e-3 | epithre-iris | FLUX-based, supports multi-ref editing |
What's the same
- Wire format: identical JSON request and response shapes
- Streaming: SSE format, same delta chunks
- Tool/function calling: same tools + tool_choice contract
- Vision: same image_url content type
- Errors: same {"error": {"message", "type"}} envelope
What's different
- Indonesian quality: trained on Indonesian data, with fluency on par with a native speaker
- Data residency: all inference happens in Jakarta data center — no data leaves Indonesia
- Pricing: ~30-50% cheaper across the board (see Pricing)
- Rerank: Cohere-compatible /v1/rerank endpoint (OpenAI doesn't have this)
Authentication
Send your API key in the Authorization header on every request to /v1/*:
Authorization: Bearer esk_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Key types
| Prefix | Use for |
|---|---|
| esk_live_* | Production workloads. Real billing applies. |
| esk_test_* | Sandbox / CI. Same endpoints; separate billing for clarity. |
Best practices
- Store keys in environment variables; never hardcode them in source
- Use a separate key per service/environment; one app, one key
- Revoke immediately on suspected leak (one click in dashboard)
- Set per-key spending caps in the dashboard (monthly USD cap)
Endpoints
Base URL: https://api.epithre.com
Models
List all available models with their tiers, context windows, and capabilities. Public — no auth required.
Response shape
{
"object": "list",
"data": [
{
"id": "epithre-omni",
"object": "model",
"owned_by": "epithre",
"tier": "flagship",
"context_window": 49152,
"max_output_tokens": 16384,
"modalities": ["text", "image"],
"capabilities": ["chat", "tool_use", "vision", "thinking"],
"description": "...",
"pricing_model": "per_token"
},
...
]
}
Available models
| ID | Tier | Context (tokens) | Best for |
|---|---|---|---|
| epithre-prme | Premium | 200,000 | Long-context reasoning, full-codebase analysis, document review |
| epithre-omni | Flagship | 49,152 | General chat with vision, agentic tool-use, extended thinking |
| epithre-lyt | Compact | 32,768 | High-throughput cheap chat, image + audio + video input |
| epithre-embed | Embedding | 4,096 | Semantic search, RAG retrieval (4000-dim) |
| epithre-rerank | Reranker | 2,048 | Boosting retrieval quality after embed search |
| epithre-iris | Image | — | Text-to-image, multi-reference image editing |
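To inspect the catalog programmatically, here is a minimal sketch, assuming the list lives at the standard OpenAI-compatible path GET /v1/models (consistent with the response shape above):
import httpx

# Public endpoint: no Authorization header required (see above)
models = httpx.get("https://api.epithre.com/v1/models").json()
for m in models["data"]:
    print(m["id"], m.get("tier"), m.get("context_window"))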
Chat Completions
The workhorse endpoint. Conversational text generation with optional streaming, tool calling, and vision. Compatible with OpenAI's chat-completions SDK.
When to use which model
epithre-omni
Default choice for chat. Handles text + images, supports tool calling, multilingual including strong Bahasa Indonesia.
epithre-prme
When prompts exceed 32K tokens. Full-document analysis, multi-file code review, long agentic chains.
epithre-lyt
High-volume cheap chat. Use when latency matters and you don't need flagship reasoning. Also: audio + video inputs.
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | yes | epithre-prme, epithre-omni, or epithre-lyt |
| messages | array | yes | Standard chat messages. Roles: system, user, assistant, tool. Each message has content (string or vision array). |
| max_tokens | int | no | Max output tokens. Default model-dependent; max 16384 (omni/prme), 4096 (lyt) |
| temperature | float | no | 0.0-2.0, default 1.0. Lower = more deterministic |
| top_p | float | no | 0.0-1.0 nucleus sampling, default 1.0 |
| stream | bool | no | If true, returns SSE chunks. Final chunk includes usage. |
| tools | array | no | Function definitions. Standard OpenAI tools schema. |
| tool_choice | string or object | no | "auto" (default), "none", or {"type":"function","function":{"name":"..."}} |
| response_format | object | no | {"type":"json_object"} for guaranteed JSON output |
| chat_template_kwargs | object | no | e.g. {"enable_thinking": true} for extended thinking mode on Omni/PRME |
| seed | int | no | For reproducibility (best-effort) |
| stop | string or array | no | Stop sequences |
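The chat_template_kwargs field has no dedicated keyword in the OpenAI SDKs. A minimal sketch passing it through the Python SDK's extra_body escape hatch, assuming the server reads it from the request body as the table documents (the prompt is illustrative):
resp = client.chat.completions.create(
    model="epithre-omni",
    messages=[{"role": "user", "content": "Jelaskan langkah demi langkah: 17 * 23 = ?"}],
    extra_body={"chat_template_kwargs": {"enable_thinking": True}},  # extended thinking (Omni/PRME)
)
print(resp.choices[0].message.content)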
Response shape (non-streaming)
{
"id": "chatcmpl-xxxxxx",
"object": "chat.completion",
"created": 1778455870,
"model": "epithre-omni",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Ibu kota Jepang adalah Tokyo."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 18,
"completion_tokens": 9,
"total_tokens": 27
}
}
Streaming response (SSE)
data: {"id":"chatcmpl-x","choices":[{"index":0,"delta":{"role":"assistant","content":""}}]}
data: {"id":"chatcmpl-x","choices":[{"index":0,"delta":{"content":"Ibu kota "}}]}
data: {"id":"chatcmpl-x","choices":[{"index":0,"delta":{"content":"Jepang adalah "}}]}
data: {"id":"chatcmpl-x","choices":[{"index":0,"delta":{"content":"Tokyo."}}]}
data: {"id":"chatcmpl-x","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: {"id":"chatcmpl-x","choices":[],"usage":{"prompt_tokens":18,"completion_tokens":9,"total_tokens":27}}
data: [DONE]
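If you aren't using an OpenAI SDK, the stream can be consumed with any HTTP client. A minimal sketch with httpx that parses exactly the wire format shown above:
import json
import os

import httpx

payload = {
    "model": "epithre-omni",
    "messages": [{"role": "user", "content": "Tulis 3 fakta tentang Jakarta"}],
    "stream": True,
}
with httpx.stream(
    "POST",
    "https://api.epithre.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['EPITHRE_KEY']}"},
    json=payload,
    timeout=60,
) as r:
    for line in r.iter_lines():
        if not line.startswith("data: "):
            continue                      # skip blank keep-alive lines
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):  # final usage chunk has no choices
            print(choice.get("delta", {}).get("content") or "", end="", flush=True)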
Examples
Streaming chat
curl https://api.epithre.com/v1/chat/completions \
-H "Authorization: Bearer $EPITHRE_KEY" \
-H "Content-Type: application/json" \
-N \
-d '{
"model": "epithre-omni",
"messages": [{"role": "user", "content": "Tulis 3 fakta tentang Jakarta"}],
"stream": true,
"max_tokens": 200
}'
stream = client.chat.completions.create(
model="epithre-omni",
messages=[{"role": "user", "content": "Tulis 3 fakta tentang Jakarta"}],
stream=True,
max_tokens=200,
)
for chunk in stream:
delta = chunk.choices[0].delta.content
if delta:
print(delta, end="", flush=True)
const stream = await client.chat.completions.create({
model: "epithre-omni",
messages: [{ role: "user", content: "Tulis 3 fakta tentang Jakarta" }],
stream: true,
max_tokens: 200,
});
for await (const chunk of stream) {
process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
Vision (omni/lyt)
import base64
img_b64 = base64.b64encode(open("invoice.jpg", "rb").read()).decode()
resp = client.chat.completions.create(
model="epithre-omni",
messages=[{
"role": "user",
"content": [
{"type": "text", "text": "Ekstrak total pembayaran dari invoice ini sebagai angka."},
{"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{img_b64}"}},
]
}],
)
print(resp.choices[0].message.content)
IMG=$(base64 -w0 invoice.jpg)
curl https://api.epithre.com/v1/chat/completions \
-H "Authorization: Bearer $EPITHRE_KEY" \
-H "Content-Type: application/json" \
-d "$(jq -n --arg img "$IMG" '{
"model": "epithre-omni",
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": "Ekstrak total pembayaran sebagai angka."},
{"type": "image_url", "image_url": {"url": ("data:image/jpeg;base64," + $img)}}
]
}]
}')"
Tool calling
tools = [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a city in Indonesia",
"parameters": {
"type": "object",
"properties": {
"city": {"type": "string", "description": "City name in Indonesian"},
"unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
},
"required": ["city"]
}
}
}]
resp = client.chat.completions.create(
model="epithre-omni",
messages=[{"role": "user", "content": "Cuaca Jakarta sekarang gimana?"}],
tools=tools,
)
# Model decides to call the tool:
tool_call = resp.choices[0].message.tool_calls[0]
print(tool_call.function.name) # → "get_weather"
print(tool_call.function.arguments) # → '{"city": "Jakarta", "unit": "celsius"}'
# Execute your function, return result:
followup = client.chat.completions.create(
model="epithre-omni",
messages=[
{"role": "user", "content": "Cuaca Jakarta sekarang gimana?"},
resp.choices[0].message,
{"role": "tool", "tool_call_id": tool_call.id, "content": '{"temp": 31, "unit": "C", "condition": "cerah berawan"}'},
],
tools=tools,
)
print(followup.choices[0].message.content)
# → "Jakarta sekarang 31°C, cerah berawan."
JSON mode
resp = client.chat.completions.create(
model="epithre-omni",
messages=[
{"role": "system", "content": "Selalu balas dalam JSON."},
{"role": "user", "content": "Ekstrak nama, umur, pekerjaan dari: 'Pak Budi, 45 tahun, dokter di RSCM'"},
],
response_format={"type": "json_object"},
)
import json
data = json.loads(resp.choices[0].message.content)
# → {"nama": "Pak Budi", "umur": 45, "pekerjaan": "dokter"}
Common errors
| HTTP | Code | Cause | Fix |
|---|---|---|---|
| 400 | model_not_found | Bad model value | Use one of the 3 chat models |
| 400 | invalid_request_error | Empty messages array | Send at least one message |
| 429 | backend_busy | Aggregate Epithre traffic hit the shared backend cap | Retry after ~1 second (exponential backoff) |
| 429 | concurrency_exceeded | Your key's concurrent cap hit | Reduce parallelism or raise cap in dashboard |
Embeddings
Convert text into 4000-dim L2-normalized vectors for semantic search, RAG, clustering, and classification. Native model: Qwen3-Embed-8B (Indonesian-optimized, MRL-truncatable).
Semantic search
Embed your document corpus once, store in a vector DB (pgvector, Pinecone, Qdrant). Embed each user query and find nearest neighbors.
RAG retrieval
Pre-step before chat. Pair with rerank to push relevance over 95%. See RAG recipe.
Classification
Compute centroid embeddings per class, then assign new texts to nearest centroid. Strong baseline before training a model.
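A minimal nearest-centroid sketch; the class labels and example texts below are illustrative, not from a real dataset:
import os

import numpy as np
from openai import OpenAI

client = OpenAI(api_key=os.environ["EPITHRE_KEY"], base_url="https://api.epithre.com/v1")

def embed(texts):
    r = client.embeddings.create(model="epithre-embed", input=texts, dimensions=1024)
    return np.array([row.embedding for row in r.data])

classes = {
    "hukum": ["putusan pengadilan negeri", "pasal pidana korupsi", "gugatan perdata"],
    "kuliner": ["resep rendang daging", "warung bakso terenak", "sambal terasi pedas"],
}
# Mean of L2-normalized vectors, re-normalized so a dot product is cosine similarity
centroids = {}
for label, texts in classes.items():
    c = embed(texts).mean(axis=0)
    centroids[label] = c / np.linalg.norm(c)

def classify(text: str) -> str:
    v = embed([text])[0]
    return max(centroids, key=lambda label: float(v @ centroids[label]))

print(classify("UU perlindungan konsumen"))  # expected: "hukum"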
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | yes | Must be epithre-embed |
| input | string or array | yes | Single text or 1-64 strings per request |
| dimensions | int | no | 1-4000. If < 4000, MRL truncates & re-L2-normalizes server-side. Lossless prefix. |
| instruction | string | no | Qwen3 native task instruction prefix (e.g., "Retrieve relevant passages about Indonesian forestry law") |
Response shape
{
"object": "list",
"data": [
{
"object": "embedding",
"embedding": [0.0107, -0.0022, 0.0152, ...], // 4000 floats (or `dimensions` if set)
"index": 0
},
{
"object": "embedding",
"embedding": [...],
"index": 1
}
],
"model": "epithre-embed",
"usage": {"prompt_tokens": 24, "total_tokens": 24}
}
Examples
Embed a batch with MRL truncation
resp = client.embeddings.create(
model="epithre-embed",
input=[
"penebangan liar di hutan lindung",
"perlindungan satwa dilindungi UU 5/1990",
"kebijakan ekspor batubara 2024",
],
dimensions=1024, # Truncate for smaller storage; still good quality
)
import numpy as np
vecs = np.array([e.embedding for e in resp.data])
print(vecs.shape) # → (3, 1024)
print("cost:", resp.usage.prompt_tokens, "input tokens")
curl https://api.epithre.com/v1/embeddings \
-H "Authorization: Bearer $EPITHRE_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "epithre-embed",
"input": ["penebangan liar di hutan lindung", "perlindungan satwa dilindungi UU 5/1990"],
"dimensions": 1024
}'
With instruction prefix (Qwen3 native)
resp = client.embeddings.create(
model="epithre-embed",
input=["UU 41/1999 tentang Kehutanan pasal 50"],
instruction="Given a legal query, retrieve relevant Indonesian regulations",
)
Rerank
Cohere-compatible reranking endpoint. After vector search returns top-K candidates, rerank narrows to the most relevant ones using a heavier cross-encoder model. Boosts Indonesian retrieval quality from ~85% to ~98% in our benchmarks.
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | yes | Must be epithre-rerank |
| query | string | yes | The user search query |
| documents | array | yes | 1-64 candidate documents (typically top-K from embed search) |
| top_n | int | no | Return top N after sorting (default: all, sorted descending) |
| return_documents | bool | no | Include document text in response (default false = just indices + scores) |
| instruction | string | no | Custom reranking instruction (overrides default) |
Response shape
{
"id": "2b475d38-c17c-43b9-ba9a-7f485af38e0f",
"results": [
{"index": 3, "relevance_score": 0.1480, "document": {"text": "..."}},
{"index": 0, "relevance_score": 0.0567, "document": {"text": "..."}},
{"index": 1, "relevance_score": 0.0000}
],
"meta": {"billed_units": {"search_units": 1}}
}
Example
import os

import httpx

EPITHRE_KEY = os.environ["EPITHRE_KEY"]

resp = httpx.post(
"https://api.epithre.com/v1/rerank",
headers={"Authorization": f"Bearer {EPITHRE_KEY}"},
json={
"model": "epithre-rerank",
"query": "perlindungan hutan lindung dari penebangan",
"documents": [
"UU 41/1999 pasal 50 — perusakan hutan",
"Permen LHK satwa dilindungi",
"Keppres pemilu 2024",
"Pasal pengelolaan hutan lindung",
],
"top_n": 3,
"return_documents": True,
},
).json()
for r in resp["results"]:
print(f'{r["relevance_score"]:.3f} {r["document"]["text"]}')
# → 0.148 Pasal pengelolaan hutan lindung
# → 0.057 UU 41/1999 pasal 50 — perusakan hutan
# → 0.000 Permen LHK satwa dilindungi ← irrelevant, low score
Image Generation
Text-to-image using FLUX-Klein. Returns inline base64-encoded PNG. Optional style LoRAs for "dark" or "anime" aesthetics.
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | yes | Must be epithre-iris |
| prompt | string | yes | Max 2000 chars. English works best; Indonesian also supported. |
| size | string | no | "WxH", max 960×960. Default 768×768. Width & height rounded down to nearest multiple of 16. |
| n | int | no | Currently fixed at 1 (batch not yet supported) |
| response_format | string | no | "b64_json" (default and only supported) |
| num_steps | int | no | 1-50, default 4. More steps = slower but slightly higher quality. |
| seed | int | no | Seed for reproducibility. -1 = random. |
| guidance_scale | float | no | Default 1.0. Higher = more literal prompt adherence. |
| lora | string | no | "none" (default), "dark" (moody/cinematic), "anime" |
| lora_strength | float | no | 0.0-1.5, default 0.6 (for non-none LoRAs) |
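The size rounding rule as a quick sketch (the server applies this for you; this just predicts the output dimensions):
def snap16(px: int) -> int:
    # Round down to the nearest multiple of 16, per the size rule above
    return (px // 16) * 16

print(snap16(900))  # → 896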
Response shape
{
"created": 1778455000,
"data": [
{"b64_json": "iVBORw0KGgoAAAANSUhEUgAA..."} // base64-encoded PNG
]
}
Examples
Basic generation
resp = client.images.generate(
model="epithre-iris",
prompt="a serene Indonesian beach at sunset, photorealistic, golden hour",
size="768x768",
)
import base64
img_bytes = base64.b64decode(resp.data[0].b64_json)
with open("output.png", "wb") as f:
f.write(img_bytes)
With anime LoRA
resp = httpx.post(
"https://api.epithre.com/v1/images/generations",
headers={"Authorization": f"Bearer {EPITHRE_KEY}"},
json={
"model": "epithre-iris",
"prompt": "a young samurai in a bamboo forest, cherry blossoms falling",
"size": "768x768",
"lora": "anime",
"lora_strength": 0.8,
"seed": 42,
},
).json()
Image Edit
Edit an existing image with a text prompt. Supports single source or up to 5 reference images for compositional editing (e.g., "put the product from img1 in the setting from img2").
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| model | string | yes | epithre-iris |
| prompt | string | yes | Edit instruction (e.g., "change background to sunset", "make the dress red") |
| image | string (base64) | conditional | Single source image. PNG/JPEG/WebP/GIF, max 20 MB decoded. Mutually exclusive with images. |
| images | array | conditional | 1-5 reference images for compositional editing. Mutually exclusive with image. |
| size | string | no | Max 704×704 for edit. Default matches input. |
| strength | float | no | 0.0-1.0, default 0.75. Higher = bigger change from source. |
| num_inference_steps | int | no | 1-50, default 4 |
| lora | string | no | Same options as generation |
Response shape
Same as /v1/images/generations: {"created", "data": [{"b64_json"}]}
Example: single-image edit
import base64
src = base64.b64encode(open("original.png", "rb").read()).decode()
resp = httpx.post(
"https://api.epithre.com/v1/images/edits",
headers={"Authorization": f"Bearer {EPITHRE_KEY}"},
json={
"model": "epithre-iris",
"prompt": "change the sky to dramatic stormy clouds with lightning",
"image": src,
"size": "512x512",
"strength": 0.7,
},
).json()
Example: multi-reference compositing
imgs = [base64.b64encode(open(f"ref_{i}.png", "rb").read()).decode()
for i in range(3)]
resp = httpx.post(
"https://api.epithre.com/v1/images/edits",
headers={"Authorization": f"Bearer {EPITHRE_KEY}"},
json={
"model": "epithre-iris",
"prompt": "the product from image 1, displayed in the studio setting from image 2, in the photography style of image 3",
"images": imgs,
"size": "640x640",
},
).json()
Data-URI prefixes (data:image/png;base64,...) are auto-stripped from image inputs. URL inputs are rejected (SSRF guard).
Recipes
RAG: embed + rerank + chat
The canonical pattern: index your knowledge base with embed, retrieve top-K with vector search, narrow with rerank, then synthesize an answer with chat.
import os

import httpx
import numpy as np
from openai import OpenAI
EK = os.environ["EPITHRE_KEY"]
client = OpenAI(api_key=EK, base_url="https://api.epithre.com/v1")
# 1) ONE-TIME: embed your corpus
corpus = [
"UU 41/1999 pasal 50 — barangsiapa merusak hutan akan dipidana...",
"Permen LHK No. 92/2018 tentang pengelolaan satwa liar...",
"PP No. 23/2021 tentang penyelenggaraan kehutanan...",
# ... thousands of docs
]
e = client.embeddings.create(model="epithre-embed", input=corpus, dimensions=1024)
corpus_vecs = np.array([row.embedding for row in e.data]) # (N, 1024)
# 2) AT QUERY TIME: embed the question, find top-K nearest
question = "Apa hukuman untuk perusakan hutan lindung?"
qe = client.embeddings.create(model="epithre-embed", input=question, dimensions=1024)
qv = np.array(qe.data[0].embedding)
scores = corpus_vecs @ qv # cosine sim (already L2-normalized)
top_k_idx = np.argsort(-scores)[:10]
candidates = [corpus[i] for i in top_k_idx]
# 3) Rerank to push the most relevant to the top
r = httpx.post("https://api.epithre.com/v1/rerank",
headers={"Authorization": f"Bearer {EK}"},
json={"model": "epithre-rerank", "query": question, "documents": candidates, "top_n": 3, "return_documents": True},
).json()
context = "\n\n".join(item["document"]["text"] for item in r["results"])
# 4) Generate answer with retrieved context
resp = client.chat.completions.create(
model="epithre-omni",
messages=[
{"role": "system", "content": "Jawab berdasarkan konteks yang diberikan. Sebutkan pasal/peraturan yang relevan."},
{"role": "user", "content": f"Konteks:\n{context}\n\nPertanyaan: {question}"},
],
)
print(resp.choices[0].message.content)
Vision QA on documents
Extract structured data from invoices, receipts, KTP, or any document image:
import base64, json
img = base64.b64encode(open("invoice.jpg", "rb").read()).decode()
resp = client.chat.completions.create(
model="epithre-omni",
messages=[{
"role": "user",
"content": [
{"type": "text", "text": "Ekstrak data invoice ini sebagai JSON dengan field: vendor, tanggal, nomor_invoice, item (list), subtotal, ppn, total."},
{"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{img}"}}
]
}],
response_format={"type": "json_object"},
)
data = json.loads(resp.choices[0].message.content)
print(data["total"])
Build an image generation app
def generate_thumbnail(topic: str) -> bytes:
"""Generate a marketing thumbnail for a given topic."""
resp = httpx.post("https://api.epithre.com/v1/images/generations",
headers={"Authorization": f"Bearer {EPITHRE_KEY}"},
json={
"model": "epithre-iris",
"prompt": f"professional marketing thumbnail for: {topic}, vibrant colors, no text",
"size": "768x768",
"num_steps": 8, # higher quality
"guidance_scale": 1.5,
},
timeout=60,
).json()
return base64.b64decode(resp["data"][0]["b64_json"])
thumb = generate_thumbnail("Indonesian food blog — rendang recipe")
open("thumb.png", "wb").write(thumb)
Function calling for agents
Build an agent that can call functions. Loop until finish_reason == "stop" or no tool calls remain:
import json

def get_weather(city: str) -> dict:
# your real implementation
return {"city": city, "temp_c": 31, "condition": "cerah berawan"}
def search_news(query: str) -> list:
return [...]
tool_handlers = {"get_weather": get_weather, "search_news": search_news}
tool_defs = [...] # OpenAI tools schema
messages = [{"role": "user", "content": "Cuaca Jakarta + berita terbaru tentang banjir"}]
for _ in range(5): # max 5 rounds
resp = client.chat.completions.create(model="epithre-omni", messages=messages, tools=tool_defs)
msg = resp.choices[0].message
messages.append(msg)
if not msg.tool_calls:
print(msg.content)
break
for tc in msg.tool_calls:
args = json.loads(tc.function.arguments)
result = tool_handlers[tc.function.name](**args)
messages.append({
"role": "tool",
"tool_call_id": tc.id,
"content": json.dumps(result),
})
Error handling
All errors return JSON with this envelope (same as OpenAI):
{
"error": {
"message": "Human-readable description",
"type": "error_category",
"code": "specific_error_code"
}
}
Full error catalog
| HTTP | Type | Common codes | What it means | Recovery |
|---|---|---|---|---|
| 400 | invalid_request_error | model_not_found, tos_required, email_not_verified | Bad parameters in request body | Fix client code, don't retry |
| 401 | authentication_error | authentication_error | Missing/invalid/revoked API key | Check Authorization header; create new key if revoked |
| 402 | insufficient_quota | insufficient_quota | API key has $0 balance | Top up via dashboard (email hello@epithre.com) |
| 403 | permission_error | account_suspended, email_not_verified, permission_error | Account suspended OR endpoint needs admin OR email not verified | Check email inbox or contact support |
| 413 | request_too_large | request_too_large | Body exceeds 1 MB (text) or 50 MB (images) | Reduce payload size; truncate prompts; resize images |
| 429 | rate_limit_error | rpm_exceeded, rpd_exceeded, concurrency_exceeded, backend_busy | Various rate limits hit | Exponential backoff (1s, 2s, 4s, …). Increase limits in dashboard if persistent. |
| 502 | backend_error | backend_error | Inference backend returned 5xx | Retry once with backoff |
| 503 | backend_unavailable | backend_unavailable | Backend not reachable (planned maintenance or outage) | Retry with longer backoff; check dashboard health |
| 504 | backend_timeout | backend_timeout | Backend slow / hung | Retry with shorter prompt or simpler request |
Recommended retry pattern
import random
import time

import openai

def call_with_retry(call_fn, max_retries=4):
    for attempt in range(max_retries):
        try:
            return call_fn()
        except openai.APIStatusError as e:  # carries .status_code
            if e.status_code in (429, 502, 503, 504):
                if attempt == max_retries - 1:
                    raise
                wait = (2 ** attempt) + random.uniform(0, 1)  # exponential backoff with jitter
                time.sleep(wait)
                continue
            raise  # don't retry 400/401/402/403
Rate limits
Two layers of limits apply to every request:
Per-key limits (yours to control)
| Limit | Default | Where to change |
|---|---|---|
| Requests per minute (RPM) | 60 | Dashboard → Keys → edit |
| Requests per day (RPD) | 10,000 | Dashboard → Keys → edit |
| Concurrent requests | 10 | Dashboard → Keys → edit |
| Monthly spend cap | $100 | Dashboard → Keys → edit |
Exceeding any of these returns HTTP 429 with a code indicating which limit was hit.
Backend capacity (shared)
Each chat model has an aggregate concurrent cap across all Epithre customers, to reserve capacity for other services. When hit, you get HTTP 429 with code: "backend_busy". Wait ~1 second and retry — these clear quickly.
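A sketch that tells the two 429 flavors apart via the error code, using the envelope documented under Error handling above (the payload and timeouts are illustrative):
import time

import httpx

def post_chat(payload: dict, key: str) -> dict:
    wait = 1.0
    while True:
        r = httpx.post(
            "https://api.epithre.com/v1/chat/completions",
            headers={"Authorization": f"Bearer {key}"},
            json=payload,
            timeout=60,
        )
        if r.status_code != 429:
            r.raise_for_status()
            return r.json()
        if r.json()["error"]["code"] == "backend_busy":
            time.sleep(1.0)              # shared cap: clears quickly
        else:
            time.sleep(wait)             # your per-key limit: back off harder
            wait = min(wait * 2, 8.0)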
Pricing
All prices in USD. Billed per request, deducted from your active API key's credit balance immediately.
| Model | Input | Output | Unit price |
|---|---|---|---|
| epithre-prme | $0.40 / 1M tok | $1.60 / 1M tok | token |
| epithre-omni | $0.30 / 1M tok | $1.20 / 1M tok | token |
| epithre-lyt | $0.05 / 1M tok | $0.20 / 1M tok | token |
| epithre-embed | $0.04 / 1M tok | — | input token only |
| epithre-rerank | — | — | $0.000002 / document |
| epithre-iris | — | — | $0.005 / image |
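As a back-of-envelope check, the cost of the example usage block shown earlier (18 prompt + 9 completion tokens on epithre-omni) works out from the table above:
prompt_tokens, completion_tokens = 18, 9
cost_usd = prompt_tokens * 0.30 / 1_000_000 + completion_tokens * 1.20 / 1_000_000
print(f"${cost_usd:.8f}")  # → $0.00001620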
Topping up
During alpha, top-up is handled by email: write to hello@epithre.com with the amount, and we'll reply with payment details (BCA/Wise/etc.). Credit is usually applied within 1 business day after payment clears. Stripe/Midtrans integration is planned for general availability.
SDKs & clients
Epithre is OpenAI-wire-compatible. Use any OpenAI SDK and override base_url:
- Python — pip install openai (v1.0+)
- Node.js / TypeScript — npm install openai
- Go — go get github.com/sashabaranov/go-openai
- Java / Kotlin — com.openai:openai-java or Spring AI
- Ruby — gem install ruby-openai
- PHP — openai-php/client on Composer
- Rust — async-openai crate
LangChain
import os

from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
model="epithre-omni",
api_key=os.environ["EPITHRE_KEY"],
base_url="https://api.epithre.com/v1",
)
llm.invoke("Halo!")
LlamaIndex
import os

from llama_index.core import Settings
from llama_index.embeddings.openai_like import OpenAILikeEmbedding
from llama_index.llms.openai_like import OpenAILike

Settings.llm = OpenAILike(
    model="epithre-omni",
    api_key=os.environ["EPITHRE_KEY"],
    api_base="https://api.epithre.com/v1",
    is_chat_model=True,  # route through chat completions, not legacy completions
)
Settings.embed_model = OpenAILikeEmbedding(
model_name="epithre-embed",
api_key=os.environ["EPITHRE_KEY"],
api_base="https://api.epithre.com/v1",
)
For endpoints not in the OpenAI SDK (rerank, image edit)
Use plain HTTP via your language's standard client (Python httpx, JS fetch, Go net/http). See examples above.
Data policy (summary)
- We do not log prompts or responses. Only metadata: token counts, timestamps, costs, HTTP status.
- Images submitted to /v1/images/edits are discarded after the response is returned.
- Account data (email, hashed password, API key hashes) is retained for the account lifetime + 30 days.
- Usage metadata retained 90 days for billing & abuse detection.
- All infrastructure physically in Jakarta, Indonesia — your data never leaves the country.
- TLS 1.2+ enforced on all endpoints.
Full policy: privacy.html.
FAQ
Can I use Epithre to power a customer-facing product?
Yes. The standard ToS allows commercial use. Don't resell raw API access as your own product without a separate reseller agreement (contact us). See Terms.
How does Indonesian quality compare to OpenAI?
Our models are fine-tuned on Indonesian data and outperform OpenAI models of a comparable tier on Indonesian benchmarks (semantic similarity, instruction following, cultural context). Run your own A/B on real production traffic — that's the only honest benchmark.
What's your uptime SLA?
Best-effort during alpha (no contractual SLA). We monitor 24/7 via Uptime Kuma. In practice we target 99.5%+. For mission-critical workloads with hard SLA, contact us for enterprise terms.
Can I run Epithre on-prem?
Not yet — the platform is currently cloud-only. On-prem licensing is on the roadmap for enterprise.
How do I report abuse or a security issue?
Email hello@epithre.com. For security issues, use subject "Security disclosure". We'll respond within 1 business day.
What happens if I hit my monthly spend cap?
Your key returns 402 insufficient_quota for the rest of the month. Raise the cap in the dashboard or top up to continue.
Can I get my data exported / account deleted?
Yes. Email hello@epithre.com. Export typically delivered as CSV within 7 days. Account deletion erases all linked data within 30 days, except billing records (retained 7 years per Indonesian tax law).