epithre-lyt
Fast, cheap, multimodal. The "do it in volume" model.
Capabilities
| Capability | Notes |
|---|---|
| Tier | Compact |
| Context window | 32,768 tokens |
| Max output | 4,096 tokens |
| Modalities | Text, image, audio, video |
| Tool use | Basic (less reliable for multi-step than omni/prme) |
| Extended thinking | No |
| Structured output | json_object + json_schema |
| Prompt caching | Yes |
| Streaming | Yes |
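The two token limits in the table interact, so a quick pre-flight check can catch oversized requests before they are sent. The following is a minimal sketch: the function name `fits_budget` is hypothetical, and it assumes that input and output tokens share the 32,768-token window, which the table does not state explicitly.

```python
# Limits copied from the capabilities table.
CONTEXT_WINDOW = 32_768  # total tokens
MAX_OUTPUT = 4_096       # maximum tokens per response


def fits_budget(input_tokens: int, requested_output: int) -> bool:
    """Return True if a request stays under both limits.

    Assumption: input and output tokens count against the same window.
    """
    if input_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    if requested_output > MAX_OUTPUT:
        return False
    return input_tokens + requested_output <= CONTEXT_WINDOW
```

Under that assumption, a 28,000-token prompt still leaves room for a full-length reply, while a 30,000-token prompt does not.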
When to use
- High-volume classification, sentiment, tagging
- Translation pipelines (batch)
- Quick chat where flagship quality isn't needed
- Audio understanding and rough transcription (if you have audio input); not for high-fidelity transcripts
- Cost-sensitive features
When NOT to use
- Complex agentic tool loops - reliability degrades beyond 2-3 steps.
- Deep reasoning - use epithre-omni or epithre-prme.
- High-precision Indonesian legal/medical - use epithre-omni.
- Long-document analysis (>16K input) - use epithre-omni or epithre-prme.
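The guidance in the two lists above can be expressed as a small routing function. This is a sketch under stated assumptions, not an official selection rule: the `Task` fields and `pick_model` are hypothetical names, "16K" is read as 16,000 tokens, the tool-step cutoff is set conservatively at 2, and `epithre-omni` is returned wherever the text allows either omni or prme.

```python
from dataclasses import dataclass


@dataclass
class Task:
    input_tokens: int
    tool_steps: int = 0                 # expected chained tool calls
    deep_reasoning: bool = False
    precise_legal_medical: bool = False  # high-precision Indonesian legal/medical


LONG_INPUT_TOKENS = 16_000  # assumption: ">16K input" read as 16,000 tokens
MAX_LYT_TOOL_STEPS = 2      # assumption: conservative end of "2-3 steps"


def pick_model(task: Task) -> str:
    """Route to epithre-lyt unless a 'When NOT to use' condition applies."""
    if task.precise_legal_medical:
        return "epithre-omni"
    if (
        task.deep_reasoning
        or task.input_tokens > LONG_INPUT_TOKENS
        or task.tool_steps > MAX_LYT_TOOL_STEPS
    ):
        # The text allows omni or prme here; omni is an arbitrary default.
        return "epithre-omni"
    return "epithre-lyt"
```

A short tagging request such as `pick_model(Task(input_tokens=500))` stays on epithre-lyt, while a 20,000-token document or a five-step tool loop is escalated.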
Pricing
- Input: Rp1,000 / 1M tokens
- Output: Rp4,000 / 1M tokens
- Cache: same multipliers
- Batch: 0.5x
6x cheaper than epithre-omni on output, comparable on input. At list price, Rp20,000 covers 20M input tokens, so high-volume classification with very short prompts and one-word labels can reach millions of calls for roughly that amount.
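To see how the list prices translate into a budget, the arithmetic can be sketched as below. The rates and the 0.5x batch multiplier come from the pricing list; the function name `estimate_cost_rp` and the per-request token counts in the example are hypothetical, and cache discounts are left out because their multipliers are not given here.

```python
INPUT_RP_PER_MTOK = 1_000   # Rp per 1M input tokens
OUTPUT_RP_PER_MTOK = 4_000  # Rp per 1M output tokens
BATCH_MULTIPLIER = 0.5


def estimate_cost_rp(
    requests: int,
    input_tokens_each: int,
    output_tokens_each: int,
    batch: bool = False,
) -> float:
    """Estimate total cost in rupiah, ignoring any cache discount."""
    input_cost = requests * input_tokens_each * INPUT_RP_PER_MTOK / 1_000_000
    output_cost = requests * output_tokens_each * OUTPUT_RP_PER_MTOK / 1_000_000
    total = input_cost + output_cost
    return total * BATCH_MULTIPLIER if batch else total


# Hypothetical workload: 1M classifications, 15 input tokens and
# 1 output token each.
print(estimate_cost_rp(1_000_000, 15, 1))              # 19000.0
print(estimate_cost_rp(1_000_000, 15, 1, batch=True))  # 9500.0
```

Longer prompts scale the input term linearly, so a 150-token prompt would cost ten times as much on the input side for the same number of calls.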
Performance characteristics
- Near-instant latency: ~0.2-0.5s first-token, ~30-40 tokens/sec generation.
- Solid Indonesian fluency at the casual register; sometimes weaker on formal/legal register vs omni.
- Reliable for simple JSON / enum classification.
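The latency figures above combine into a rough end-to-end estimate: first-token delay plus output length divided by generation speed. The sketch below uses only the ranges quoted in the list; the function name `estimate_latency_s` is hypothetical, and real latency will also depend on network and load, which are not modelled.

```python
def estimate_latency_s(output_tokens: int) -> tuple[float, float]:
    """Return (best_case, worst_case) seconds for a streamed reply.

    Best case: 0.2 s to first token at 40 tokens/sec.
    Worst case: 0.5 s to first token at 30 tokens/sec.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    best = 0.2 + output_tokens / 40
    worst = 0.5 + output_tokens / 30
    return best, worst
```

By this estimate a 20-token classification label completes in roughly one second, whereas a reply that uses the full 4,096-token output cap takes on the order of two minutes, so "near-instant" applies to short outputs.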
Caveats
- Output cap is 4,096 tokens (vs 16,384 for omni/prme). Plan for short replies.
- Audio + video input is supported but the model summarizes rather than transcribes. For high-fidelity transcription, look elsewhere.
- Tool calling works for single calls but multi-step chains are unreliable - escalate to omni.