`epithre-rerank`

Cross-encoder reranker. Use after embedding-based retrieval to boost top-K precision.

Capabilities

Capability	Notes
Tier	Reranker
Max input	2,048 tokens (query + doc combined)
Modalities	Text only
Wire format	Industry-standard rerank shape (`query`, `documents`, `top_n`, `return_documents`)
Instruction-aware	Yes; `instruction` field for custom rerank criteria
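A request body using the fields above can be assembled as in the sketch below. The helper name `build_rerank_request` is hypothetical, and the endpoint URL, authentication, and any other required fields are not specified in this section, so the sketch stops at building the JSON payload.

```python
import json

MAX_DOCS_PER_REQUEST = 64  # per-request document limit stated under Limits


def build_rerank_request(query, documents, top_n=3,
                         return_documents=False, instruction=None):
    """Build a rerank request body (hypothetical helper, not an official client)."""
    if not documents:
        raise ValueError("documents must not be empty")
    if len(documents) > MAX_DOCS_PER_REQUEST:
        raise ValueError(f"at most {MAX_DOCS_PER_REQUEST} documents per request")
    body = {
        "query": query,
        "documents": list(documents),
        "top_n": min(top_n, len(documents)),  # cannot return more than was sent
        "return_documents": return_documents,
    }
    if instruction is not None:
        # Optional field; omitted so the server-side default instruction applies.
        body["instruction"] = instruction
    return body


payload = build_rerank_request(
    "syarat izin lingkungan",
    ["Dokumen pertama.", "Dokumen kedua."],
    top_n=5,
)
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Clamping `top_n` to the number of documents is a design choice of this sketch, not documented server behavior.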

When to use

After embedding-based search returns top-K candidates
When you need higher precision than embedding-only retrieval (~85% -> ~98% in our benchmarks)
Indonesian-language retrieval where dense embeddings alone struggle

Pricing

Rp5 per document
Batch: 0.5x
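As a rough cost estimate, the sketch below multiplies the number of documents sent by the per-document price. It assumes "Batch: 0.5x" means batch-mode requests are billed at half the per-document price; the function name is illustrative.

```python
PRICE_PER_DOCUMENT_RP = 5   # from the pricing above
BATCH_MULTIPLIER = 0.5      # assumed to scale the per-document price in batch mode


def rerank_cost_rp(num_documents, batch=False):
    """Estimated cost in rupiah for reranking num_documents documents."""
    multiplier = BATCH_MULTIPLIER if batch else 1.0
    return num_documents * PRICE_PER_DOCUMENT_RP * multiplier


# One query reranking 50 candidates:
print(rerank_cost_rp(50))              # 250.0
print(rerank_cost_rp(50, batch=True))  # 125.0

# 1,000 queries at 50 candidates each:
print(rerank_cost_rp(1_000 * 50))      # 250000.0
```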

Quality

On our 25-question Indonesian legal gold set:

- Embed-only top-1 recall: 64%
- Embed + rerank top-1 recall: 100%

The pattern: embedding retrieval casts a wide net (recall@50 ~95%), then the reranker narrows it down precisely (top 3 from those 50 candidates).
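The retrieve-then-rerank pattern can be sketched as a small function that takes the two stages as callables. `embed_search` and `rerank` here are stand-ins: the toy versions below use word overlap so the example runs offline, and in practice they would call your embedding index and the reranker.

```python
def retrieve_then_rerank(query, embed_search, rerank, k=50, top_n=3):
    """Stage 1: broad retrieval of k candidates. Stage 2: rerank and keep top_n."""
    candidates = embed_search(query, k)
    scores = rerank(query, candidates)
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


# Toy stand-ins (assumptions for illustration only).
CORPUS = ["pajak daerah", "izin lingkungan", "hukum lingkungan hidup"]


def toy_embed_search(query, k):
    # Pretend retrieval: return up to k documents without careful ordering.
    return CORPUS[:k]


def toy_rerank(query, documents):
    # Pretend reranker: fraction of query words that appear in each document.
    query_words = set(query.split())
    return [len(query_words & set(doc.split())) / len(query_words)
            for doc in documents]


results = retrieve_then_rerank("hukum lingkungan", toy_embed_search, toy_rerank,
                               k=50, top_n=2)
for doc, score in results:
    print(f"{score:.2f}  {doc}")
```

Passing the stages in as callables keeps the pipeline logic testable without network access.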

Score interpretation

Scores are P(yes) / (P(yes) + P(no)) from the cross-encoder. Range [0, 1].
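The formula can be written out directly. The sketch below also shows the equivalent form on raw logits: normalizing over just the two tokens is a two-way softmax, which equals a sigmoid of the logit difference. The function names and the idea that you have access to the logits are assumptions for illustration; the API returns the final score.

```python
import math


def score_from_probs(p_yes, p_no):
    """P(yes) / (P(yes) + P(no)): renormalize over the two answer tokens."""
    return p_yes / (p_yes + p_no)


def score_from_logits(logit_yes, logit_no):
    """Same quantity from raw logits: sigmoid(logit_yes - logit_no)."""
    return 1.0 / (1.0 + math.exp(logit_no - logit_yes))


# "no" is nine times as likely as "yes", so the score is about 0.10.
print(round(score_from_probs(0.02, 0.18), 4))

# Equal logits mean the model is undecided.
print(score_from_logits(0.0, 0.0))  # 0.5
```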

Indonesian queries often produce low absolute values: a true match might score 0.10-0.30. Select results by rank order (for example, the top 3) and do not filter on an absolute score threshold.
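To make the difference concrete, the sketch below compares a fixed threshold with rank-based selection on a list of scores. The scores are made-up illustrative values in the range described above, and `top_n_by_rank` is a hypothetical helper.

```python
def top_n_by_rank(scores, n):
    """Return the indices of the n highest-scoring documents, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:n]


scores = [0.04, 0.22, 0.11, 0.02]  # illustrative values, not real model output

# A fixed cutoff such as 0.5 discards every candidate, including the best one.
kept_by_threshold = [i for i, s in enumerate(scores) if s >= 0.5]
print(kept_by_threshold)           # []

# Rank order still surfaces the strongest candidates.
print(top_n_by_rank(scores, 2))    # [1, 2]
```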

Custom instructions

The default instruction is tuned for general Indonesian retrieval. For specialized domains, pass your own:

{"instruction": "Given a query about Indonesian environmental law, rank the most legally-authoritative documents first."}

Custom instructions adjust how the reranker weighs aspects such as recency, authority, and specificity.

Limits

Max 64 documents per request
Each query+document pair must fit in the 2,048-token context window. Longer documents are truncated server-side.
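If a candidate list exceeds the per-request limit, one option is to split it into several requests on the client side. The sketch below shows only the chunking step; `chunk_documents` is a hypothetical helper, and whether scores from separate requests can be merged into one ranking is not stated here, so treat that as something to verify.

```python
MAX_DOCS_PER_REQUEST = 64  # per-request limit stated above


def chunk_documents(documents, size=MAX_DOCS_PER_REQUEST):
    """Split a candidate list into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [documents[i:i + size] for i in range(0, len(documents), size)]


candidates = [f"doc-{i}" for i in range(150)]
chunks = chunk_documents(candidates)
print([len(chunk) for chunk in chunks])  # [64, 64, 22]
```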

Capacity

Stress-tested at a sustained 27 req/s with a p95 latency of 1.7 s.