Migrate from Cohere

Currency note: Epithre prices are in IDR (Rupiah). Competitor prices are left in USD for comparison.

Migrating from Cohere splits into two parts. Our rerank endpoint is wire-compatible with Cohere's, so that part is just a base-URL change. Chat and embed require switching to the OpenAI request shape, which means more code changes, but they are mechanical.

Model mapping

Cohere	Epithre	Notes
`command-r-plus`, `command-r`	`epithre-omni`	General chat / RAG / tool use.
`command-light`	`epithre-lyt`	Fast cheap chat.
`embed-multilingual-v3`, `embed-english-v3`	`epithre-embed`	4000-dim, multimodal.
`rerank-multilingual-v3`, `rerank-english-v3`	`epithre-rerank`	Cohere-wire-compatible.
`classify-*`	(no direct equivalent)	Use chat-completion with few-shot examples; see classification cookbook.

Rerank (drop-in)

Cohere customers who use /rerank get the easiest migration: same request shape, same response shape, just a base URL change.

Cohere:

import cohere
co = cohere.Client(api_key="...")
r = co.rerank(
    model="rerank-multilingual-v3.0",
    query="hutan lindung",
    documents=["UU 41/1999 ...", "Permen LHK ...", "resep nasi goreng"],
    top_n=2,
    return_documents=True,
)

Epithre:

import os
import httpx
r = httpx.post(
    "https://api.epithre.com/v1/rerank",
    headers={"Authorization": f"Bearer {os.environ['EPITHRE_KEY']}"},
    json={
        "model": "epithre-rerank",
        "query": "hutan lindung",
        "documents": ["UU 41/1999 ...", "Permen LHK ...", "resep nasi goreng"],
        "top_n": 2,
        "return_documents": True,
    },
).json()
# Same response shape: r["results"] = [{"index", "relevance_score", "document"?}, ...]
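Since each result carries an index into the input list, scores can be paired with their source text even when return_documents is off. A minimal sketch, assuming the response shape noted above; the helper name and the sample response are made up for illustration and are not real API output:

```python
def ranked_documents(response, documents):
    """Pair each rerank result with the original document text via its index."""
    return [
        (documents[item["index"]], item["relevance_score"])
        for item in response["results"]
    ]


# Hand-written sample in the documented shape (hypothetical scores).
docs = ["UU 41/1999 ...", "Permen LHK ...", "resep nasi goreng"]
sample = {
    "results": [
        {"index": 0, "relevance_score": 0.91},
        {"index": 1, "relevance_score": 0.74},
    ]
}
top = ranked_documents(sample, docs)
```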

Optional instruction field for custom reranking criteria:

{"instruction": "Given a query about Indonesian environmental law, rank the most legally-authoritative documents first."}
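One way to keep the Cohere-compatible body and the optional instruction in one place is a small payload builder. This is a sketch, assuming the instruction field sits at the top level of the rerank request body next to query and documents; the function name is hypothetical:

```python
def build_rerank_payload(query, documents, top_n=None, instruction=None):
    """Build a rerank request body; instruction is only added when given."""
    payload = {
        "model": "epithre-rerank",
        "query": query,
        "documents": documents,
        "return_documents": True,
    }
    if top_n is not None:
        payload["top_n"] = top_n
    if instruction:
        # Assumption: top-level field, as the JSON fragment above suggests.
        payload["instruction"] = instruction
    return payload


payload = build_rerank_payload(
    "hutan lindung",
    ["UU 41/1999 ...", "Permen LHK ...", "resep nasi goreng"],
    top_n=2,
    instruction="Given a query about Indonesian environmental law, "
                "rank the most legally-authoritative documents first.",
)
```

The resulting dict can be passed as the json argument of the httpx.post call shown earlier.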

Embed

Cohere's embed accepts input_type ("search_document", "search_query", "classification", "clustering"). Epithre's epithre-embed uses a free-form instruction field instead.

Cohere:

r = co.embed(
    model="embed-multilingual-v3.0",
    texts=texts,
    input_type="search_document",
    truncate="END",
)
vectors = r.embeddings

Epithre:

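import os
from openai import OpenAI
client = OpenAI(api_key=os.environ["EPITHRE_KEY"],
                base_url="https://api.epithre.com/v1")
r = client.embeddings.create(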
    model="epithre-embed",
    input=texts,
    extra_body={"instruction": "Represent this document for retrieval:"},
)
vectors = [d.embedding for d in r.data]

Mapping input_type to instruction:

Cohere `input_type`	Epithre instruction
`search_document`	`"Represent this document for retrieval:"`
`search_query`	`"Represent this query for retrieving relevant documents:"`
`classification`	`"Represent this text for classification:"`
`clustering`	`"Represent this text for clustering similar items:"`
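In code, the table above reduces to a lookup. A minimal sketch; the dictionary and function names are illustrative, and the instruction strings are copied from the table:

```python
INSTRUCTION_FOR_INPUT_TYPE = {
    "search_document": "Represent this document for retrieval:",
    "search_query": "Represent this query for retrieving relevant documents:",
    "classification": "Represent this text for classification:",
    "clustering": "Represent this text for clustering similar items:",
}


def instruction_for(input_type):
    """Translate a Cohere input_type into the matching Epithre instruction."""
    try:
        return INSTRUCTION_FOR_INPUT_TYPE[input_type]
    except KeyError:
        raise ValueError(f"unsupported input_type: {input_type!r}") from None


query_instruction = instruction_for("search_query")
```

The returned string goes into extra_body={"instruction": ...} on the embeddings call.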

Epithre also accepts a truncate field, with these values: "END" (the default, keeps the head), "START" (keeps the tail), and "NONE" (returns 422 if the input is too long). Truncation is character-based, not token-based.
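To preview what character-based truncation would keep before sending text, the three modes can be mimicked locally. This is only a sketch of the behaviour described above: the actual character limit is not stated here, so max_chars is a hypothetical parameter, and the ValueError stands in for the server's 422.

```python
def truncate_chars(text, max_chars, mode="END"):
    """Mimic character-based truncation: END keeps the head, START keeps the tail."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if len(text) <= max_chars:
        return text
    if mode == "END":
        return text[:max_chars]
    if mode == "START":
        return text[len(text) - max_chars:]
    if mode == "NONE":
        # Stand-in for the 422 the API returns when the input is too long.
        raise ValueError("input too long and truncate is NONE")
    raise ValueError(f"unknown truncate mode: {mode!r}")


head = truncate_chars("hutan lindung", 5, "END")
tail = truncate_chars("hutan lindung", 7, "START")
```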

Dimensions

Cohere's v3 embed models return 1024-dim vectors by default. epithre-embed returns 4000-dim vectors. To get 1024-dim vectors for compatibility:

r = client.embeddings.create(model="epithre-embed", input=texts, dimensions=1024)

The 1024-dim output is exactly the first 1024 components of the native 4000-dim vector, re-normalized: a Matryoshka-style prefix rather than a separate projection.
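If you already store the native 4000-dim vectors, the same reduction can in principle be done locally. A sketch of slice-then-renormalize, assuming the API output matches this description exactly (worth verifying against a real dimensions=1024 response); the function name is hypothetical:

```python
import math


def shrink_embedding(vector, dims=1024):
    """Keep the first `dims` components and rescale them to unit length."""
    if not 0 < dims <= len(vector):
        raise ValueError("dims must be between 1 and len(vector)")
    head = list(vector[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero prefix: nothing to normalize
    return [x / norm for x in head]


# Toy 3-dim vector standing in for a 4000-dim embedding.
small = shrink_embedding([3.0, 4.0, 12.0], dims=2)
```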

Chat

Cohere's chat request differs significantly from the OpenAI shape. Migrate to the OpenAI shape using the openai SDK.

Cohere:

r = co.chat(
    model="command-r-plus",
    message="Apa hukuman buat penebangan ilegal?",
    chat_history=[{"role": "USER", "message": "..."}, {"role": "CHATBOT", "message": "..."}],
    preamble="You are a legal assistant.",
    documents=[{"title": "UU 41/1999", "snippet": "..."}],
)

Epithre:

r = client.chat.completions.create(
    model="epithre-omni",
    messages=[
        {"role": "system", "content": "You are a legal assistant.\n\nContext:\n[UU 41/1999] ..."},
        {"role": "user", "content": "..."},
        {"role": "assistant", "content": "..."},
        {"role": "user", "content": "Apa hukuman buat penebangan ilegal?"},
    ],
)

Cohere's documents parameter (with automatic grounding and citations) has no direct Epithre equivalent. Instead, include the retrieved context inline in the system or user message, and either ask the model to cite explicitly or fall back on standard retrieval-augmented patterns.
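The translation from Cohere chat arguments to OpenAI-shape messages is mechanical enough to wrap in a helper. A sketch under a few assumptions: documents are inlined into the system message in the "[title] snippet" layout used in the example above, and only the USER, CHATBOT, and SYSTEM history roles are handled. The function name is hypothetical.

```python
ROLE_MAP = {"USER": "user", "CHATBOT": "assistant", "SYSTEM": "system"}


def cohere_chat_to_messages(message, chat_history=(), preamble=None, documents=()):
    """Convert Cohere-style chat arguments into an OpenAI-shape messages list."""
    system_parts = []
    if preamble:
        system_parts.append(preamble)
    if documents:
        lines = [f"[{d['title']}] {d['snippet']}" for d in documents]
        system_parts.append("Context:\n" + "\n".join(lines))

    messages = []
    if system_parts:
        messages.append({"role": "system", "content": "\n\n".join(system_parts)})
    for turn in chat_history:
        messages.append({"role": ROLE_MAP[turn["role"]], "content": turn["message"]})
    messages.append({"role": "user", "content": message})
    return messages


messages = cohere_chat_to_messages(
    message="Apa hukuman buat penebangan ilegal?",
    chat_history=[{"role": "USER", "message": "..."},
                  {"role": "CHATBOT", "message": "..."}],
    preamble="You are a legal assistant.",
    documents=[{"title": "UU 41/1999", "snippet": "..."}],
)
```

The result can be passed directly as messages to client.chat.completions.create.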

Pricing comparison

Approximate, 2026-05:

Workload	Cohere typical	Epithre
Command-R+ input	$2.50 / 1M tok	`epithre-omni` Rp7,000 / 1M tok
Command-R+ output	$10 / 1M tok	`epithre-omni` Rp25,000 / 1M tok
Embed v3	$0.10 / 1M tok	`epithre-embed` Rp1,500 / 1M tok
Rerank v3	$2 / 1000 searches	`epithre-rerank` Rp5,000 / 1000 docs

The rerank pricing schemes differ: Cohere charges per "search unit" (a single query), while Epithre charges per document. At a typical 10 docs per query, Cohere's $2 per 1,000 searches works out to about $200 per 1M docs, versus Epithre's Rp5,000,000 per 1M docs (about $280 per 1M). Per-doc cost is roughly comparable, but rerank in Epithre is structurally simpler since you can pre-batch hundreds of docs against one query.
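The per-document comparison above can be reproduced with a few lines of arithmetic. The exchange rate below is an assumption, back-solved from the roughly $280 figure rather than taken from any official source; substitute a current rate and your own docs-per-query number.

```python
def cohere_usd_per_million_docs(usd_per_1000_searches, docs_per_query):
    """Per-search pricing expressed per 1M documents reranked."""
    searches_per_million_docs = 1_000_000 / docs_per_query
    return usd_per_1000_searches * searches_per_million_docs / 1000


def epithre_idr_per_million_docs(idr_per_1000_docs):
    """Per-document pricing scaled from 1,000 docs to 1M docs."""
    return idr_per_1000_docs * 1000


IDR_PER_USD = 17_850  # assumed rate, implied by the ~$280 figure above

cohere_usd = cohere_usd_per_million_docs(2.0, docs_per_query=10)
epithre_idr = epithre_idr_per_million_docs(5_000)
epithre_usd = epithre_idr / IDR_PER_USD
```

Because Cohere's cost per document falls as docs per query rises, the comparison shifts with batch size; rerun the numbers with your real average.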

What Cohere has that Epithre doesn't

Classify endpoint as a first-class API. Use Epithre chat + few-shot for the same result; slightly more tokens but typically equivalent quality.
Compass (multimodal embed + LLM combined product). Use epithre-embed + epithre-omni separately for the same flow.
Fine-tuning UI in their dashboard. Epithre offers fine-tuning by email request for now.
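For the classify replacement mentioned above, few-shot examples can be encoded as alternating user and assistant turns. This is a generic sketch, not the classification cookbook's method; the function name, prompt wording, labels, and examples are all illustrative.

```python
def build_classify_messages(examples, labels, text):
    """Build few-shot chat messages that ask for exactly one label."""
    system = (
        "Classify the user's text into exactly one of these labels: "
        + ", ".join(labels)
        + ". Reply with the label only."
    )
    messages = [{"role": "system", "content": system}]
    for example_text, label in examples:
        messages.append({"role": "user", "content": example_text})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": text})
    return messages


# Hypothetical labels and examples.
classify_messages = build_classify_messages(
    examples=[("UU 41/1999 tentang Kehutanan", "legal"),
              ("resep nasi goreng", "other")],
    labels=["legal", "other"],
    text="Permen LHK tentang hutan lindung",
)
```

Pass the list as messages to client.chat.completions.create and check that the reply is one of the expected labels before using it.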

What Epithre has that Cohere doesn't

Image generation + editing (epithre-iris). Cohere doesn't offer image generation.
Indonesian-first models. Cohere's embed-multilingual is good, but Epithre's models are purpose-built for Indonesian.
Prompt caching with a 90% read discount.
Data residency in Jakarta.

Working example: RAG pipeline migration

# BEFORE (Cohere)
import os
import cohere
co = cohere.Client(os.environ["COHERE_API_KEY"])

def index_docs(docs):
    return co.embed(model="embed-multilingual-v3.0", texts=docs,
                    input_type="search_document").embeddings

def query(question, indexed_docs, doc_vectors):
    qv = co.embed(model="embed-multilingual-v3.0", texts=[question],
                  input_type="search_query").embeddings[0]
    # ... cosine search ...
    candidates = [indexed_docs[i] for i in top_k_idx]
    reranked = co.rerank(model="rerank-multilingual-v3.0",
                         query=question, documents=candidates,
                         top_n=3, return_documents=True)
    context = [r.document.text for r in reranked.results]
    return co.chat(model="command-r-plus",
                   message=question,
                   preamble="Answer based on context.",
                   documents=[{"title": f"src_{i}", "snippet": c}
                              for i, c in enumerate(context)]).text

# AFTER (Epithre)
import os
import httpx
from openai import OpenAI
client = OpenAI(api_key=os.environ["EPITHRE_KEY"],
                base_url="https://api.epithre.com/v1")

def index_docs(docs):
    r = client.embeddings.create(
        model="epithre-embed",
        input=docs,
        extra_body={"instruction": "Represent this document for retrieval:"},
    )
    return [d.embedding for d in r.data]

def query(question, indexed_docs, doc_vectors):
    qv = client.embeddings.create(
        model="epithre-embed", input=[question],
        extra_body={"instruction": "Represent this query for retrieving relevant documents:"},
    ).data[0].embedding
    # ... cosine search ...
    candidates = [indexed_docs[i] for i in top_k_idx]
    rerank_r = httpx.post(
        "https://api.epithre.com/v1/rerank",
        headers={"Authorization": f"Bearer {os.environ['EPITHRE_KEY']}"},
        json={"model": "epithre-rerank", "query": question, "documents": candidates,
              "top_n": 3, "return_documents": True},
    ).json()
    context = [r["document"]["text"] for r in rerank_r["results"]]
    return client.chat.completions.create(
        model="epithre-omni",
        messages=[
            {"role": "system", "content": "Answer based on the provided context."},
            {"role": "user", "content": f"Context:\n{chr(10).join(context)}\n\nQ: {question}"},
        ],
    ).choices[0].message.content
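Both versions leave the cosine search step as a placeholder. One possible way to produce top_k_idx is a brute-force scan, sketched here in pure Python. The function name and default k are illustrative; for a large corpus a vectorized library or a vector database would be the more usual choice.

```python
import math


def cosine_top_k(query_vec, doc_vectors, k=20):
    """Return indices of the k vectors most similar to query_vec by cosine."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        if norm_a == 0.0 or norm_b == 0.0:
            return 0.0  # treat zero vectors as unrelated
        return dot / (norm_a * norm_b)

    order = sorted(
        range(len(doc_vectors)),
        key=lambda i: cosine(query_vec, doc_vectors[i]),
        reverse=True,
    )
    return order[:k]


# Toy 2-dim vectors standing in for real embeddings.
top_k_idx = cosine_top_k([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], k=2)
```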

Migration checklist

[ ] Rerank: change URL and headers, keep the same body shape. No logic changes.
[ ] Embed: switch SDK or HTTP shape, map input_type to instruction text.
[ ] Chat: switch to OpenAI shape, replace preamble with system message, replace documents with inline context.
[ ] If using classify: re-implement as few-shot chat. See classification cookbook.
[ ] Re-embed corpus with epithre-embed for native-quality vectors.
[ ] Review rate-limit retry handling: Epithre uses the same 429 pattern, so existing retry logic should carry over.
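For the rate-limit item, a generic retry loop around 429 responses might look like the sketch below. It works with any client whose response exposes status_code and headers (httpx responses do). Whether Epithre sends a Retry-After header is an assumption; the code falls back to exponential backoff when the header is missing or not a number. The function name is hypothetical.

```python
import time


def call_with_429_retry(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 response or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt == max_attempts - 1:
            break  # out of attempts: hand the last 429 back to the caller
        try:
            # Assumption: the server may send Retry-After in seconds.
            delay = float(response.headers.get("Retry-After"))
        except (TypeError, ValueError):
            delay = base_delay * 2 ** attempt
        sleep(delay)
    return response
```

Usage would be along the lines of call_with_429_retry(lambda: httpx.post(url, headers=headers, json=body)), with the caller checking the final status code.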

Email hello@epithre.com with subject "Cohere migration" if you hit unexpected differences.