epithre-rerank
Cross-encoder reranker. Use after embedding-based retrieval to boost top-K precision.
Capabilities
| Capability | Notes |
|---|---|
| Tier | Reranker |
| Max input | 2,048 tokens (query + doc combined) |
| Modalities | Text only |
| Wire format | Industry-standard rerank shape (query, documents, top_n, return_documents) |
| Instruction-aware | Yes; instruction field for custom rerank criteria |
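
As a rough illustration of the wire format in the table, the sketch below assembles a request body from the listed fields (`query`, `documents`, `top_n`, `return_documents`, and the optional `instruction`). The `model` key, the helper name `build_rerank_request`, and the placeholder documents are assumptions made for this example; the endpoint URL and authentication are not covered here, so the sketch stops at building the JSON.

```python
import json
from typing import Optional


def build_rerank_request(
    query: str,
    documents: list[str],
    top_n: int = 3,
    return_documents: bool = False,
    instruction: Optional[str] = None,
) -> dict:
    """Assemble a rerank request body (hypothetical helper, not an official client)."""
    if not documents:
        raise ValueError("documents must not be empty")
    body = {
        "model": "epithre-rerank",  # assumed key; check your API reference
        "query": query,
        "documents": documents,
        # Asking for more results than there are documents makes no sense.
        "top_n": min(top_n, len(documents)),
        "return_documents": return_documents,
    }
    # Only send the instruction when overriding the default.
    if instruction is not None:
        body["instruction"] = instruction
    return body


payload = build_rerank_request(
    query="sanksi pencemaran lingkungan",
    documents=["placeholder document A", "placeholder document B"],
    top_n=1,
)
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Leaving `instruction` out of the body keeps the default instruction in effect.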
When to use
- After embedding-based search returns top-K candidates
- When you need higher precision than embedding-only retrieval (~85% -> ~98% in our benchmarks)
- Indonesian-language retrieval where dense embeddings alone struggle
Pricing
- Rp5 per document
- Batch: 0.5x the standard per-document price
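
To make the arithmetic concrete, here is a small cost estimate. It assumes that billing counts every document submitted for reranking and that the batch multiplier applies to the per-document price; the function name `rerank_cost_rp` is made up for this sketch.

```python
PRICE_PER_DOCUMENT_RP = 5  # from the pricing list above
BATCH_MULTIPLIER = 0.5     # assumed to scale the per-document price


def rerank_cost_rp(num_documents: int, batch: bool = False) -> float:
    """Estimate the cost in rupiah of reranking num_documents documents."""
    if num_documents < 0:
        raise ValueError("num_documents must be non-negative")
    cost = float(num_documents * PRICE_PER_DOCUMENT_RP)
    return cost * BATCH_MULTIPLIER if batch else cost


print(rerank_cost_rp(50))              # 250.0
print(rerank_cost_rp(50, batch=True))  # 125.0
```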
Quality
On our 25-question Indonesian legal gold set:
- Embed-only top-1 recall: 64%
- Embed + rerank top-1 recall: 100%
The pattern: embedding search retrieves broadly (recall@50 ~95%), then the reranker narrows precisely (top 3 from those 50).
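
The two-stage pattern can be sketched as a function that takes the two stages as callables. Everything here is illustrative: `retrieve_then_rerank`, `toy_search`, and `toy_rerank` are hypothetical names, and the toy scorer uses word overlap only so the example runs without any service; in practice the second callable would wrap a call to the reranker.

```python
from typing import Callable, Sequence


def retrieve_then_rerank(
    query: str,
    embed_search: Callable[[str, int], list[str]],
    rerank: Callable[[str, Sequence[str]], list[float]],
    recall_k: int = 50,
    top_n: int = 3,
) -> list[tuple[str, float]]:
    """Stage 1 fetches recall_k candidates; stage 2 keeps the top_n by rerank score."""
    candidates = embed_search(query, recall_k)
    scores = rerank(query, candidates)
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


# Toy stand-ins so the sketch is runnable offline.
TOY_CORPUS = ["alpha beta", "beta gamma", "gamma delta"]


def toy_search(query: str, k: int) -> list[str]:
    return TOY_CORPUS[:k]


def toy_rerank(query: str, docs: Sequence[str]) -> list[float]:
    query_words = set(query.split())
    return [len(query_words & set(doc.split())) / len(query_words) for doc in docs]


best = retrieve_then_rerank("gamma delta", toy_search, toy_rerank, recall_k=50, top_n=2)
print(best)
```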
Score interpretation
Scores are P(yes) / (P(yes) + P(no)) from the cross-encoder. Range [0, 1].
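
For readers who want to see why the score lands in [0, 1], the sketch below computes the same ratio from two raw logits. It assumes the cross-encoder exposes a logit for a "yes" token and one for a "no" token, which is not stated above. Under that assumption the softmax normalizer cancels in the ratio, so the score reduces to a sigmoid of the logit difference.

```python
import math


def yes_no_score(logit_yes: float, logit_no: float) -> float:
    """Return P(yes) / (P(yes) + P(no)) given the two logits (assumed inputs)."""
    # exp(a) / (exp(a) + exp(b)) == sigmoid(a - b); branch to avoid overflow.
    diff = logit_yes - logit_no
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    exp_diff = math.exp(diff)
    return exp_diff / (1.0 + exp_diff)


print(yes_no_score(0.0, 0.0))  # 0.5
```

Equal logits give 0.5, and a large gap in either direction pushes the score toward 0 or 1.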
Indonesian queries often produce low absolute values: a true match might score only 0.10-0.30. Select results by rank order, not by an absolute score threshold; a fixed cutoff would discard true matches.
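
The difference between the two selection strategies can be shown with a few made-up scores. The result shape (`index`, `score`) and the numbers are illustrative assumptions, not real output.

```python
def top_n_by_rank(results: list[dict], n: int) -> list[dict]:
    """Keep the n highest-scoring results regardless of absolute value."""
    return sorted(results, key=lambda r: r["score"], reverse=True)[:n]


# Illustrative scores only; the true match here scores well below 0.5.
results = [
    {"index": 0, "score": 0.04},
    {"index": 1, "score": 0.27},
    {"index": 2, "score": 0.12},
]

kept_by_threshold = [r for r in results if r["score"] >= 0.5]
kept_by_rank = top_n_by_rank(results, 2)

print(kept_by_threshold)                  # []
print([r["index"] for r in kept_by_rank])  # [1, 2]
```

A 0.5 cutoff drops every candidate, while rank-based selection still returns the best two.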
Custom instructions
Default instruction is tuned for Indonesian general retrieval. For specialized domains:
{"instruction": "Given a query about Indonesian environmental law, rank the most legally-authoritative documents first."}
Custom instructions adjust how the reranker weighs aspects such as recency, authority, and specificity.
Limits
- Max 64 documents per request
- Each query + document pair must fit in the 2,048-token context window. Longer documents are truncated server-side.
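
When the candidate set exceeds the per-request document limit, one option is to split it client-side. The sketch below shows only the chunking step; `chunk_documents` is a hypothetical helper, and it assumes you then send one request per chunk and merge the results by score.

```python
MAX_DOCS_PER_REQUEST = 64  # from the limits above


def chunk_documents(documents: list[str], size: int = MAX_DOCS_PER_REQUEST) -> list[list[str]]:
    """Split documents into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [documents[i:i + size] for i in range(0, len(documents), size)]


chunks = chunk_documents([f"doc-{i}" for i in range(150)])
print([len(chunk) for chunk in chunks])  # [64, 64, 22]
```

Truncation is handled server-side, so this sketch does not count tokens; if the tail of a long document matters, consider splitting it into passages before reranking.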
Capacity
Stress-tested at a sustained 27 req/s with a p95 latency of 1.7 s.