Summarization with citations
Two summarization patterns: simple (just compress) and grounded (with citations). For production, you usually want grounded.
Simple summarization
# System prompt (Indonesian): "Summarize the following article in 3 paragraphs. Formal Indonesian."
resp = client.chat.completions.create(
    model="epithre-omni",
    messages=[
        {"role": "system", "content":
            "Ringkas artikel berikut dalam 3 paragraf. Bahasa Indonesia formal."},
        {"role": "user", "content": article_text},
    ],
)
print(resp.choices[0].message.content)
For long articles (>30K tokens), use epithre-prme:
# System prompt (Indonesian): "Summarize the long document in 5 key points."
resp = client.chat.completions.create(
    model="epithre-prme",
    messages=[
        {"role": "system", "content": "Ringkas dokumen panjang dalam 5 poin kunci."},
        {"role": "user", "content": long_document},
    ],
)
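The 30K-token threshold can be turned into a small routing helper. This is a minimal sketch: `estimate_tokens` and `pick_model` are hypothetical names, and the ratio of 4 characters per token is a rough assumption that varies by language and tokenizer, so swap in a real token count if one is available.

```python
LONG_DOC_TOKENS = 30_000  # threshold from the guidance above
CHARS_PER_TOKEN = 4       # assumption: rough average, not a tokenizer count


def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count from the character count."""
    return len(text) // CHARS_PER_TOKEN


def pick_model(text: str) -> str:
    """Return the long-document model for articles above the threshold."""
    if estimate_tokens(text) > LONG_DOC_TOKENS:
        return "epithre-prme"
    return "epithre-omni"
```

The result can be passed as the `model` argument of the calls shown above, for example `model=pick_model(article_text)`.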
Summarization with citations
The pattern: chunk the doc, label chunks, ask the model to cite which chunk(s) each sentence comes from.
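# 1. Split into labeled chunks
chunks = split_into_chunks(article, max_chars=1500)
labeled = "\n\n".join(f"[{i+1}] {c}" for i, c in enumerate(chunks))

# 2. Ask for cited summary
# System prompt (Indonesian): "Summarize this document. Every summary sentence must be
# followed by a citation in square brackets, e.g. [1] or [2,3]. Use only information
# from the document; do not add outside knowledge."
resp = client.chat.completions.create(
    model="epithre-omni",
    messages=[
        {"role": "system", "content":
            "Ringkas dokumen ini. Setiap kalimat ringkasan harus diikuti "
            "dengan kutipan dalam kurung siku, contoh [1] atau [2,3]. "
            "Hanya gunakan informasi dari dokumen, jangan tambah pengetahuan luar."},
        {"role": "user", "content": labeled},
    ],
)
print(resp.choices[0].message.content)
# Example output:
# "Pemerintah menetapkan kebijakan baru terkait pajak [1]. Berdasarkan
# data Q3, penerimaan pajak naik 12% YoY [2,4]. ..."
# (English: "The government set a new tax policy [1]. Based on Q3 data,
# tax revenue rose 12% YoY [2,4]. ...")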
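`split_into_chunks` is not defined in this section. One possible version, as a minimal sketch, packs blank-line-separated paragraphs into chunks of at most `max_chars` characters and hard-splits any single paragraph that is too long. The paragraph-based strategy is an assumption; sentence-based or token-based splitting may suit your documents better.

```python
def split_into_chunks(text: str, max_chars: int = 1500) -> list[str]:
    """Greedily pack blank-line-separated paragraphs into chunks <= max_chars."""
    chunks: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        # Hard-split a paragraph that cannot fit in one chunk by itself.
        pieces = [para[i:i + max_chars] for i in range(0, len(para), max_chars)]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                # The current chunk is full: close it and start a new one.
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Keeping paragraphs intact where possible means a citation such as `[2]` points at a readable unit of text rather than an arbitrary character window.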
Citations as structured output
For machine parsing, structured output is cleaner:
schema = {
    "type": "object",
    "properties": {
        "summary_sentences": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "text": {"type": "string"},
                    "citations": {"type": "array", "items": {"type": "integer"}},
                },
                "required": ["text", "citations"],
                "additionalProperties": False,
            },
        },
        "key_topics": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["summary_sentences", "key_topics"],
    "additionalProperties": False,
}
import json

# System prompt (Indonesian): "Summarize the document as an array of sentences.
# Each sentence has a citations array containing chunk indexes."
resp = client.chat.completions.create(
    model="epithre-omni",
    messages=[
        {"role": "system", "content":
            "Ringkas dokumen sebagai array kalimat. Tiap kalimat punya array citations berisi index chunk."},
        {"role": "user", "content": labeled},
    ],
    response_format={"type": "json_schema",
                     "json_schema": {"name": "summary", "strict": True, "schema": schema}},
)

# The message content is a JSON string that conforms to the schema
data = json.loads(resp.choices[0].message.content)
for s in data["summary_sentences"]:
    print(s["text"])
    print(f" citing chunks: {s['citations']}")
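The schema only guarantees that each citation is an integer, so it can help to check the values against the chunks that were actually sent. A minimal sketch, assuming the 1-based chunk labels used in the chunking step; `find_invalid_citations` is a hypothetical helper name:

```python
def find_invalid_citations(data: dict, num_chunks: int) -> list[tuple[int, list[int]]]:
    """Return (sentence_index, bad_citations) pairs for sentences that need review."""
    problems = []
    for idx, sentence in enumerate(data["summary_sentences"]):
        cited = sentence["citations"]
        # Chunk labels are 1-based, so valid values are 1..num_chunks.
        bad = [c for c in cited if not 1 <= c <= num_chunks]
        # An empty citation list is also flagged: the sentence is ungrounded.
        if bad or not cited:
            problems.append((idx, bad))
    return problems
```

Calling `find_invalid_citations(data, len(chunks))` returns an empty list when every sentence cites at least one existing chunk.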
Multi-doc summarization
When you have several related docs (e.g. news articles on the same event):
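docs = [news1_text, news2_text, news3_text]
labeled = "\n\n".join(f"=== Dokumen {i+1} ===\n{d}" for i, d in enumerate(docs))

# System prompt (Indonesian): "Summarize several documents on the same topic. Include:
# (1) facts that are consistent across all sources, (2) significant differences
# between sources, (3) points mentioned by only one source.
# Include references to document numbers."
resp = client.chat.completions.create(
    model="epithre-omni",
    messages=[
        {"role": "system", "content":
            "Ringkas beberapa dokumen tentang topik yang sama. Sertakan: "
            "(1) fakta yang konsisten di semua sumber, "
            "(2) perbedaan signifikan antar sumber, "
            "(3) hal yang hanya disebut satu sumber. "
            "Sertakan rujukan ke nomor dokumen."},
        {"role": "user", "content": labeled},
    ],
)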
Periodic summarization (e.g. daily newsletter)
Use Batch API for cost savings:
import json
from datetime import date

today = date.today()
articles = fetch_today_news()  # your news source

# Write one request per line (JSONL); custom_id lets you match results to articles
with open("daily.jsonl", "w") as f:
    for i, article in enumerate(articles):
        f.write(json.dumps({
            "custom_id": f"news-{today}-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": "epithre-lyt",
                "messages": [
                    # System prompt (Indonesian): "Summarize the article in 2 sentences."
                    {"role": "system", "content": "Ringkas artikel dalam 2 kalimat."},
                    {"role": "user", "content": article},
                ],
            },
        }) + "\n")

# Submit and poll, or use webhook for batch.completed
Batch requests cost 50% less than realtime requests. Most batches complete in minutes.
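Once a batch finishes, the results need to be matched back to articles by `custom_id`. The sketch below assumes the output is JSONL in which each line carries the `custom_id` and the chat completion under `response.body`, mirroring the request format above. That output shape is an assumption, not something this section specifies, so check it against the Batch API reference. `load_batch_summaries` is a hypothetical helper name.

```python
import json


def load_batch_summaries(lines):
    """Map custom_id -> summary text from batch output lines (assumed shape)."""
    summaries = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        body = (record.get("response") or {}).get("body")
        if not body:  # assumption: failed requests carry no response body
            continue
        summaries[record["custom_id"]] = body["choices"][0]["message"]["content"]
    return summaries
```

Because the function accepts any iterable of lines, it works with an open file handle as well as a list of strings.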
Common pitfalls
- Hallucination of facts not in source: tell the model explicitly "Hanya gunakan informasi dari dokumen." ("Use only information from the document.") For high-stakes use, also pass temperature=0.1.
- Length runaway: set max_tokens. A 1500-word article should summarize to 300-500 tokens, not 2000.
- Citation drift: the model occasionally cites the wrong chunk number. For high-precision use, validate citations programmatically (post-process: check that each cited chunk supports the sentence).
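The citation check in the last pitfall could start as something like the sketch below. It parses inline markers such as `[1]` or `[2,3]`, reports chunk numbers that do not exist, and uses word overlap as a crude stand-in for "supports the sentence". The overlap heuristic and the 0.5 threshold are assumptions rather than a recommended method; an entailment model or a second model call would be stricter. `check_citations` is a hypothetical name.

```python
import re

# Matches inline markers such as [1] or [2,3]
CITATION = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]")


def check_citations(summary: str, chunks: list[str], min_overlap: float = 0.5) -> list[str]:
    """Return human-readable problems found in a summary with inline [n] citations."""
    problems = []
    start = 0
    for match in CITATION.finditer(summary):
        # The text between the previous marker and this one is the cited claim.
        sentence = summary[start:match.start()].strip(" .\n")
        start = match.end()
        ids = [int(n) for n in match.group(1).split(",")]
        valid = [i for i in ids if 1 <= i <= len(chunks)]
        if len(valid) < len(ids):
            problems.append(f"unknown chunk in {match.group(0)}: {sentence!r}")
        if not valid:
            continue
        # Crude support check: share of the sentence's words found in the cited chunks.
        words = set(re.findall(r"\w+", sentence.lower()))
        cited = set(re.findall(r"\w+", " ".join(chunks[i - 1] for i in valid).lower()))
        if words and len(words & cited) / len(words) < min_overlap:
            problems.append(f"weak support for {match.group(0)}: {sentence!r}")
    return problems
```

An empty result means every marker points at an existing chunk and passes the overlap heuristic; it does not prove the summary is faithful, so treat flagged sentences as candidates for manual review.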