Image generation app
epithre-iris is our image generation model with a 3-LoRA style registry (none / dark / anime).
Basic generation
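import base64

# Assumes `client` is an already-configured Epithre SDK client.
resp = client.images.generate(
    model="epithre-iris",
    prompt="warung kopi pinggir jalan Jakarta malam, cinematic, golden hour, wide shot",
    size="768x768",
    num_steps=20,
)
# The image is returned base64-encoded; decode it and write the PNG to disk.
with open("warkop.png", "wb") as f:
    f.write(base64.b64decode(resp.data[0].b64_json))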
- size: max 960x960, default 768x768. Each dimension rounds down to a multiple of 16.
- num_steps: 4 (fast preview) to 30 (final render). 20 is the typical production setting.
- seed: -1 for a random seed, or an integer for reproducibility.
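To see what the size rule does to a requested resolution before sending it, the rounding can be reproduced client-side. This is a minimal sketch: `normalize_size` is a hypothetical helper, and clamping oversize values to 960 (rather than having the API reject them) is an assumption, since the text above only states the maximum.

```python
MAX_SIDE = 960   # documented maximum per side
MULTIPLE = 16    # documented rounding granularity


def normalize_size(width: int, height: int) -> str:
    """Return a 'WxH' string clamped to MAX_SIDE and rounded down to a multiple of 16."""
    def fit(side: int) -> int:
        side = min(side, MAX_SIDE)                 # assumption: oversize is clamped, not rejected
        return max(MULTIPLE, side - side % MULTIPLE)  # round down, but never below 16

    return f"{fit(width)}x{fit(height)}"


print(normalize_size(770, 500))    # 768x496
print(normalize_size(1000, 1000))  # 960x960
```

Passing the normalized string as `size` makes the output dimensions predictable when laying out a UI.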
Style LoRAs
# Anime style
resp = client.images.generate(
    model="epithre-iris",
    prompt="a young samurai in a bamboo forest, cherry blossoms falling",
    lora="anime",
    lora_strength=0.8,  # 0-1.5, default 0.6
)

# Dark / cinematic style
resp = client.images.generate(
    model="epithre-iris",
    prompt="abandoned colonial-era building in Bandung, mist, dramatic shadows",
    lora="dark",
    lora_strength=0.7,
)
LoRAs are triggerless: just set lora and the model picks up the style. No keyword-stuffing your prompt.
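If the style comes from user input, it can help to validate it against the registry and the documented strength range before calling the API. The sketch below uses a hypothetical helper, `lora_kwargs`. It assumes that the "none" style is expressed by simply omitting the `lora` parameter, which this page does not state explicitly.

```python
STYLES = {"none", "dark", "anime"}  # registry names listed on this page
DEFAULT_STRENGTH = 0.6              # documented default
MAX_STRENGTH = 1.5                  # documented upper bound


def lora_kwargs(style: str = "none", strength: float = DEFAULT_STRENGTH) -> dict:
    """Build the LoRA-related keyword arguments for an images.generate call."""
    if style not in STYLES:
        raise ValueError(f"unknown style {style!r}, expected one of {sorted(STYLES)}")
    if style == "none":
        return {}  # assumption: no LoRA means the parameters are left out
    if not 0.0 <= strength <= MAX_STRENGTH:
        raise ValueError(f"lora_strength must be within 0-{MAX_STRENGTH}, got {strength}")
    return {"lora": style, "lora_strength": strength}
```

The result can be splatted into the call, for example `client.images.generate(model="epithre-iris", prompt=prompt, **lora_kwargs("anime", 0.8))`.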
Multi-reference editing
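import base64, httpx, os

# Base64-encode the three reference images (ref_0.png, ref_1.png, ref_2.png).
refs = []
for i in range(3):
    with open(f"ref_{i}.png", "rb") as f:
        refs.append(base64.b64encode(f.read()).decode())

http_resp = httpx.post(
    "https://api.epithre.com/v1/images/edits",
    headers={"Authorization": f"Bearer {os.environ['EPITHRE_KEY']}"},
    json={
        "model": "epithre-iris",
        "prompt": "the product from image 1, in the studio setting from image 2, "
                  "shot in the style of image 3",
        "images": refs,
        "size": "640x640",
    },
    timeout=60,
)
http_resp.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
resp = http_resp.json()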
Useful for: product-in-context shots, style transfer, A/B compositions.
Common prompt patterns
| Goal | Prompt pattern |
|---|---|
| Photorealistic | "photorealistic, [subject], [lighting], shot on [camera/lens], [setting]" |
| Illustration | "illustration, [subject], [art style], [palette]" + lora="anime" for anime |
| Marketing thumbnail | "professional marketing thumbnail for: [topic], vibrant colors, no text" |
| Product shot | "studio product photo of [product], white background, soft lighting" |
| Concept art | "concept art, [subject], cinematic, [mood], [color scheme]" |
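The patterns in the table can be kept as format strings so that application code fills in only the bracketed slots. This is a minimal sketch; `PATTERNS` and `build_prompt` are hypothetical names, and the slot names are an illustrative rendering of the bracketed placeholders above.

```python
# Prompt patterns from the table, with bracketed slots turned into format fields.
PATTERNS = {
    "photorealistic": "photorealistic, {subject}, {lighting}, shot on {camera}, {setting}",
    "illustration": "illustration, {subject}, {art_style}, {palette}",
    "marketing_thumbnail": "professional marketing thumbnail for: {topic}, vibrant colors, no text",
    "product_shot": "studio product photo of {product}, white background, soft lighting",
    "concept_art": "concept art, {subject}, cinematic, {mood}, {color_scheme}",
}


def build_prompt(goal: str, **fields: str) -> str:
    """Fill the pattern for `goal`; raises KeyError if the goal or a slot is missing."""
    return PATTERNS[goal].format(**fields)


print(build_prompt("product_shot", product="ceramic mug"))
# studio product photo of ceramic mug, white background, soft lighting
```

A missing slot raises `KeyError` rather than sending a prompt with a literal placeholder in it.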
What works well
- Indonesian scenes (warung, traditional architecture, batik patterns)
- Photorealistic landscapes
- Stylized portraits (especially with anime LoRA)
- Product shots
- Concept art
What works less well
- Text in images: don't expect readable text on signs/billboards. The model will produce text-like shapes but rarely legible words.
- Exact logos: it'll produce plausible-but-wrong company logos. Not for licensed brand work.
- Exact faces of real people: refused for real public figures (IRIS_SAFETY_V2 filter).
- Counting: "5 cats" reliably gets 3-7 cats. The model isn't a precise counter.
Latency
- 4 steps @ 768x768: 12-19s
- 20 steps @ 768x768: 45-60s
- Anime LoRA adds ~5s
For interactive UX, show a stream of quick 4-step previews, then render the final image at 20 steps once the user confirms.
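The preview-then-final flow might be sketched as follows. `preview_then_final` is a hypothetical helper: `generate` stands in for a wrapper around `client.images.generate`, and `confirm` for whatever UI callback reports the user's choice. Reusing the preview's seed for the final render is an assumption; the page documents `seed` for reproducibility but does not say how closely a 20-step render matches a 4-step preview with the same seed.

```python
from typing import Callable, Iterable, Optional


def preview_then_final(
    generate: Callable[..., object],
    prompt: str,
    seeds: Iterable[int],
    confirm: Callable[[int, object], bool],
) -> Optional[object]:
    """Render 4-step previews one seed at a time; on confirmation, re-render at 20 steps."""
    for seed in seeds:
        preview = generate(prompt=prompt, num_steps=4, seed=seed)  # fast preview
        if confirm(seed, preview):
            # Assumption: the same seed keeps the final close to the approved preview.
            return generate(prompt=prompt, num_steps=20, seed=seed)
    return None  # the user rejected every preview
```

With the SDK, `generate` could be `lambda **kw: client.images.generate(model="epithre-iris", size="768x768", **kw)`.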
Cost
Rp750 per image. A typical app generating 10 previews + 1 final per user session produces 11 images, so 11 × Rp750 = Rp8,250 per session.
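The session arithmetic is easy to parameterize for budgeting. This sketch assumes, as the example above implies, that previews and finals are billed at the same flat per-image price; `session_cost` is a hypothetical helper.

```python
PRICE_PER_IMAGE_RP = 750  # flat per-image price from this page


def session_cost(previews: int, finals: int = 1) -> int:
    """Cost in rupiah for one user session, billing every image at the flat price."""
    return (previews + finals) * PRICE_PER_IMAGE_RP


print(session_cost(10))  # 8250
```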
Batch generation
For nightly poster/thumbnail/asset generation jobs, you can fire many requests in parallel within your concurrency cap. Image generation isn't yet a batch-API target (the input format is different from chat/embed). Use direct realtime calls with a semaphore.
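import asyncio

# Assumes `async_client` is a configured async SDK client and `prompts` is a list of strings.
async def generate_one(sem, prompt):
    async with sem:  # wait for a free slot under the parallel cap
        return await async_client.images.generate(
            model="epithre-iris", prompt=prompt, size="768x768")

async def main():
    sem = asyncio.Semaphore(3)  # Iris parallel cap
    return await asyncio.gather(*[generate_one(sem, p) for p in prompts])

results = asyncio.run(main())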
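For a nightly job, one failed request should usually not discard the rest of the batch. The sketch below is one way to keep the semaphore pattern while collecting failures separately. `run_batch` is a hypothetical helper and `generate` is any coroutine function taking a prompt, for example a wrapper around the async client call.

```python
import asyncio


async def run_batch(prompts, generate, cap=3):
    """Run generate(prompt) for every prompt with at most `cap` calls in flight.

    Returns (succeeded, failed) as lists of (prompt, result_or_exception) pairs.
    """
    sem = asyncio.Semaphore(cap)

    async def one(prompt):
        async with sem:
            return await generate(prompt)

    # return_exceptions=True keeps one failure from cancelling the whole batch.
    results = await asyncio.gather(*(one(p) for p in prompts), return_exceptions=True)
    pairs = list(zip(prompts, results))
    succeeded = [(p, r) for p, r in pairs if not isinstance(r, Exception)]
    failed = [(p, r) for p, r in pairs if isinstance(r, Exception)]
    return succeeded, failed
```

The `failed` list can then be logged or retried on the next run.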