Troubleshooting

"I keep getting 401 even though my key is correct"

"My streaming output arrives in chunks of 50+ tokens, not per-token"

A buffering proxy sits between your client and the API. The buffering happens upstream, so there is little you can do from the client side. Try:

We set X-Accel-Buffering: no on the server side, which most modern proxies respect.

"I see 429 backend_busy frequently"

This is shared-pool back-pressure across all Epithre customers, not your per-key limit.
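Because this back-pressure is transient, a common client-side response is to retry with exponential backoff and jitter. The following is a minimal sketch, not an official client: `call_with_backoff` and its `send` callable are hypothetical names, and it assumes the error code is exposed as `body["error"]["code"]`, which you should verify against a real response.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Retry `send` while it returns 429 backend_busy.

    `send` is a caller-supplied function (hypothetical) that performs one
    request and returns (http_status, parsed_json_body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        # Assumption: the error code lives at body["error"]["code"].
        error = body.get("error") if isinstance(body, dict) else None
        code = error.get("code") if isinstance(error, dict) else None
        if not (status == 429 and code == "backend_busy"):
            # Success, or an error that retrying will not fix
            # (for example concurrency_exceeded).
            return status, body
        if attempt < max_attempts - 1:
            # Full jitter: wait a random time up to the capped exponential delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, delay))
    # Out of attempts: hand the last 429 back to the caller.
    return status, body
```

Only `backend_busy` is retried here; other 429 codes are returned immediately so the caller can handle them differently.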

"I see 429 concurrency_exceeded"

You have hit your per-key concurrency cap (default 10). Either:

"Latency is suddenly slow"

Possible causes:

  1. Long prompt: processing time grows superlinearly with context length. A 50K-token prompt takes much longer than a 5K-token one.
  2. Long output: generation runs at ~30-50 tokens per second on Omni, so a 4000-token reply takes ~80-130s.
  3. Backend transient slowness: rare, but it happens under high load. Email us with your request ID; we can correlate.
  4. tool_choice="required" slowness on long prompts (updated May 2026): the prior long-prompt stall is fixed across all backends. "required" now works reliably on epithre-omni and epithre-prme. For low-latency paths, "auto" and named tool choice still have slightly lower TTFT.
  5. response_format json_schema on long prompts: works on all backends post-fix. Fall back to json_object only if you hit a rare edge case at very long inputs (>10K tokens combined with a deeply nested schema).
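The long-output estimate in item 2 is plain division, and it can help to run it for your own reply sizes before deciding that a request is abnormally slow. The sketch below uses the ~30-50 tokens-per-second range quoted above; the function name is hypothetical, and the result covers generation only, not prompt processing or network time.

```python
from typing import Tuple


def generation_time_range(
    output_tokens: int,
    tps_low: float = 30.0,
    tps_high: float = 50.0,
) -> Tuple[float, float]:
    """Return (fastest, slowest) expected generation time in seconds."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tps_low <= 0 or tps_high < tps_low:
        raise ValueError("expected 0 < tps_low <= tps_high")
    # Higher throughput gives the shorter time, lower throughput the longer one.
    return output_tokens / tps_high, output_tokens / tps_low


fastest, slowest = generation_time_range(4000)
print(f"{fastest:.0f}s to {slowest:.0f}s")  # prints "80s to 133s"
```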

"JSON output is malformed or has finish_reason: length"

"Embeddings I created yesterday don't match similar embeddings I create today"

"Files API upload returns 413"

You're over a body-size or quota limit:

Delete old files first via DELETE /v1/files/{id}, or raise the quota by emailing us.
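As a rough illustration of the delete call, the sketch below only builds the request with Python's standard library; it does not send it. The path DELETE /v1/files/{id} comes from the text above. The base URL, the Bearer authorization scheme, and the function name are assumptions for illustration, so check them against your account's actual settings.

```python
import urllib.parse
import urllib.request


def build_delete_file_request(
    base_url: str, api_key: str, file_id: str
) -> urllib.request.Request:
    """Build (but do not send) a DELETE /v1/files/{id} request."""
    # Percent-encode the ID so unexpected characters cannot change the path.
    safe_id = urllib.parse.quote(file_id, safe="")
    url = f"{base_url.rstrip('/')}/v1/files/{safe_id}"
    return urllib.request.Request(
        url,
        method="DELETE",
        # Assumption: the API accepts the key as a Bearer token.
        headers={"Authorization": f"Bearer {api_key}"},
    )


# Placeholder host and key, not real values.
request = build_delete_file_request("https://api.example.invalid", "YOUR_KEY", "file_123")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call.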

"Batch is stuck in validating"

The worker checks for new batches every 5 seconds. If you create a batch and fetch it immediately, you'll see validating. Wait 5-10 seconds; it should transition to in_progress.
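Rather than fetching once and reading validating as a failure, a client can poll at roughly the worker's 5-second cadence and give up after about a minute. This is a minimal sketch: `fetch_status` is a hypothetical callable you supply that returns the batch's current status string, since the exact fetch call is not shown in this section.

```python
import time
from typing import Callable


def wait_until_past_validating(
    fetch_status: Callable[[], str],
    timeout: float = 60.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the status is no longer "validating" or the timeout passes.

    Returns the last status seen, so a return value of "validating" means
    the batch is still stuck after `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status != "validating" or clock() >= deadline:
            return status
        sleep(interval)
```

If the function returns validating, the batch has been stuck for the full timeout and the input file is worth inspecting.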

If it is still validating after a minute, check the input file for lines containing malformed JSON. The worker rejects the whole batch if it can't parse the input.
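Since one bad line causes the whole batch to be rejected, it can be worth checking the file locally before uploading. The sketch below assumes the input holds one JSON value per line and reports every line that fails to parse. It also flags blank lines, which is a cautious assumption: this section does not say whether the worker tolerates them. The function name is hypothetical.

```python
import json
from typing import List, Tuple


def find_malformed_lines(path: str) -> List[Tuple[int, str]]:
    """Return (line_number, reason) for each line that is not valid JSON."""
    problems: List[Tuple[int, str]] = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                # Assumption: treat blank lines as suspect rather than skip them.
                problems.append((line_number, "blank line"))
                continue
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((line_number, exc.msg))
    return problems
```

An empty result means every line parsed as JSON; it does not prove that each line matches the request shape the batch worker expects.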

"Webhook isn't firing"

"Cache hit rate is lower than expected"

"Iris keeps returning content_policy_violation"

epithre-iris has safety filters:

If you believe the rejection is a false positive on your legitimate use case, email us with the prompt. We refine the filters periodically.

"I'm reaching my Rp50,000 free credit, can I get more for testing?"

Email hello@epithre.com with subject "Extended trial". Include a description of your use case. We routinely grant additional credit for serious evaluation.

How to report a bug

Include:

Email: hello@epithre.com with subject "Bug report".

See also