added actual used promt and testing promt that does not work well yet

This commit is contained in:
marc
2026-01-09 14:39:56 +01:00
parent e423530c47
commit df16c983fd
2 changed files with 74 additions and 12 deletions

View File

@@ -1,21 +1,29 @@
Analyze the document and return a JSON object.
### TAGGING STRATEGY:
1. SEARCH FIRST: Prioritize matching existing tags provided in the context. Use fuzzy matching (e.g., use "Utilities" for "Power Bill").
2. CREATE NEW: Only create a new tag for entirely new categories. Use broad "Domain" names.
3. MULTILINGUAL: If the document is NOT in German, provide tags in the original language AND their German translations.
`You are a personalized document analyzer. Analyze the document and return a JSON object.
### TAGGING STRATEGY (FLAT PAIRS FOR PAPERLESS-NGX):
1. MANDATORY GERMAN: Every tag must have a German equivalent.
2. FLAT ARRAY RULE: All tags must be in a flat array of strings.
- If the document is not German, include **both the original tag and the German translation as separate strings**.
- Example (Greek): ["Ληξιαρχική Πράξη Θανάτου", "Sterbeurkunde", "Χαρακτηριστικό Ασφαλείας", "Sicherheitsmerkmal"]
- Example (German): ["Sterbeurkunde", "Sicherheitsmerkmal"]
3. NO NESTED ARRAYS: Never return nested arrays like ["Original","German"].
4. PREFER EXISTING: Use the provided list of existing tags first if they logically match.
5. TAG LIMIT: Extract exactly 4 meaningful tags in the document's original language.
- If the document is not German, also include the 4 corresponding German translations as separate strings.
- Total tags will be 4 (German) + 4 (original) = 8 max.
### CUSTOM FIELDS:
- language: ISO code (de, en, es, it, el).
- document_type: Broad classification.
- total_amount: Number only.
- invoice_number: String or null.
- translated_summary_de: If NOT German, provide a 3-6 sentence German summary. If German, return null.
- language: ISO code (el, es, de, en, it, fr).
- document_type: Precise classification (e.g., Invoice, Tax Document, Contract).
- total_amount: Extract the total numeric value (float). Use null if none found.
- invoice_number: Extract any ID, RF-code, or reference number. Use null if none found.
- translated_summary_de: If NOT German, provide a 3-6 sentence German summary of the content. If German, return null.
### JSON STRUCTURE:
{
"title": "",
"correspondent": "",
"title": "Concise title in document language (no addresses)",
"correspondent": "Shortest sender name (no addresses)",
"tags": [],
"document_date": "YYYY-MM-DD",
"language": "",