BelegLotse
Extracts receipts and invoices into DATEV structure — §14 UStG-validated, optionally 100% local.
Problem & context
Typing in receipts is slow and error-prone
Every receipt must be captured, checked and exported to DATEV. An extractor that recognises fields, validates against §14 UStG and exports cleanly — optionally cloud-free — speeds up bookkeeping.
Solution
OCR + LLM + validation instead of typing
Schema-bound structured output, mandatory-field validation, DATEV export.
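As a rough sketch of what schema-bound output plus mandatory-field validation could look like, the following uses only the Python standard library. The `Receipt` fields, the `validate` function and the one-cent tolerance are illustrative assumptions, not the project's actual schema. The field list is a subset loosely modelled on §14 UStG; the statute itself remains the authoritative source.

```python
from dataclasses import dataclass, fields
from datetime import date
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional

CENT = Decimal("0.01")


@dataclass
class Receipt:
    """Hypothetical extraction schema; any field may be missing after OCR/LLM."""
    supplier_name: Optional[str] = None
    supplier_tax_id: Optional[str] = None
    invoice_number: Optional[str] = None
    issue_date: Optional[date] = None
    net_amount: Optional[Decimal] = None
    vat_rate: Optional[Decimal] = None  # fraction, e.g. Decimal("0.19")
    vat_amount: Optional[Decimal] = None
    gross_amount: Optional[Decimal] = None


def validate(receipt: Receipt, tolerance: Decimal = CENT) -> list[str]:
    """Return a list of problems; an empty list means the receipt passed."""
    errors = []
    for f in fields(receipt):
        value = getattr(receipt, f.name)
        if value is None or value == "":
            errors.append(f"missing: {f.name}")
    if errors:
        # The totals check needs all amounts, so stop here if any are missing.
        return errors

    expected_vat = (receipt.net_amount * receipt.vat_rate).quantize(
        CENT, rounding=ROUND_HALF_UP
    )
    if abs(expected_vat - receipt.vat_amount) > tolerance:
        errors.append("vat_amount does not match net_amount * vat_rate")
    if abs(receipt.net_amount + receipt.vat_amount - receipt.gross_amount) > tolerance:
        errors.append("gross_amount does not match net_amount + vat_amount")
    return errors


# Example data is made up for illustration.
receipt = Receipt(
    supplier_name="Muster GmbH",
    supplier_tax_id="DE123456789",
    invoice_number="2024-0042",
    issue_date=date(2024, 1, 5),
    net_amount=Decimal("100.00"),
    vat_rate=Decimal("0.19"),
    vat_amount=Decimal("19.00"),
    gross_amount=Decimal("119.00"),
)
problems = validate(receipt)
```

Amounts are kept as `Decimal` rather than `float` so that cent-level comparisons do not suffer from binary rounding.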
Architecture
Clean Architecture, four layers
Receipt fields & §14 UStG rules
Extract → validate → export
OCR, Mistral/Ollama, DATEV mapper
FastAPI + HTMX upload
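One way the extract → validate → export use case could stay independent of its OCR, LLM and export adapters is sketched below. All names (`OcrEngine`, `FieldExtractor`, `Exporter`, `process_receipt`, and the stand-in adapters) are hypothetical, and the semicolon-joined output is a placeholder, not the real DATEV format.

```python
from typing import Callable, Protocol

Fields = dict[str, str]


class OcrEngine(Protocol):
    def read_text(self, document: bytes) -> str: ...


class FieldExtractor(Protocol):
    def extract(self, text: str) -> Fields: ...


class Exporter(Protocol):
    def export(self, fields: Fields) -> str: ...


def process_receipt(
    document: bytes,
    ocr: OcrEngine,
    extractor: FieldExtractor,
    validate: Callable[[Fields], list[str]],
    exporter: Exporter,
) -> str:
    """Application-layer use case: depends only on the ports above."""
    text = ocr.read_text(document)
    fields = extractor.extract(text)
    problems = validate(fields)
    if problems:
        raise ValueError("; ".join(problems))
    return exporter.export(fields)


# Stand-in adapters for illustration; real ones would wrap OCR and an LLM.
class PlainTextOcr:
    def read_text(self, document: bytes) -> str:
        return document.decode("utf-8")


class KeyValueExtractor:
    """Parses 'key: value' lines instead of calling Mistral or Ollama."""
    def extract(self, text: str) -> Fields:
        pairs = (line.split(":", 1) for line in text.splitlines() if ":" in line)
        return {key.strip(): value.strip() for key, value in pairs}


class SemicolonExporter:
    """Placeholder export; this is not the real DATEV format."""
    def export(self, fields: Fields) -> str:
        return ";".join(fields[name] for name in sorted(fields))


def require(*names: str) -> Callable[[Fields], list[str]]:
    """Build a validator that reports absent or empty fields."""
    def check(fields: Fields) -> list[str]:
        return [f"missing: {name}" for name in names if not fields.get(name)]
    return check


row = process_receipt(
    b"supplier: Muster GmbH\ngross: 119.00",
    PlainTextOcr(),
    KeyValueExtractor(),
    require("supplier", "gross"),
    SemicolonExporter(),
)
```

Because `process_receipt` only sees the protocols, swapping a cloud model for a local one would mean passing a different `FieldExtractor`, with no change to the use case itself.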
Process history
From plan to deploy — six phases
- 01
Setup & architecture
IN PROGRESS: Clean-arch skeleton, Docker, CI. ADR-0001: local option (Ollama) for sensitive receipts.
- 02
OCR & preprocessing
PLANNED: Scan/PDF → text, layout detection.
- 03
Field extraction
PLANNED: Structured output (amount, VAT, date, supplier) via LLM.
- 04
Validation
PLANNED: §14 UStG mandatory fields, totals check, BGB §288.
- 05
DATEV export & eval
PLANNED: DATEV format, measure field precision/recall.
- 06
Deploy & docs
PLANNED: Docker deploy, GoBD-compliant storage, README, ADRs.
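For the eval step in phase 05, field-level precision and recall could be computed roughly as follows. This is a minimal sketch under one common convention, stated here as an assumption: a field counts as a true positive only on an exact match, a wrong value counts as both a false positive and a false negative, and a missing prediction counts as a false negative. The function name and the sample data are made up.

```python
from typing import Optional

FieldMap = dict[str, Optional[str]]


def field_precision_recall(
    predicted: list[FieldMap], gold: list[FieldMap]
) -> tuple[float, float]:
    """Micro-averaged precision and recall over all fields of all documents."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")

    tp = fp = fn = 0
    for pred, truth in zip(predicted, gold):
        for name in set(pred) | set(truth):
            p, t = pred.get(name), truth.get(name)
            if p is not None and p == t:
                tp += 1
                continue
            if p is not None:
                fp += 1  # extracted something that is wrong or spurious
            if t is not None:
                fn += 1  # failed to recover the labelled value

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# One document: amount correct, date wrong, supplier not extracted.
predicted = [{"amount": "119.00", "date": "2024-01-05", "supplier": None}]
gold = [{"amount": "119.00", "date": "2024-01-06", "supplier": "Muster GmbH"}]
precision, recall = field_precision_recall(predicted, gold)
```

Exact string matching is strict; a real eval would likely normalise dates and amounts before comparing.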
Results
Made measurable
To be filled in with real numbers after the eval phase; these will then feed into the CV.
Stack & compliance
Receipts contain personal & tax-relevant data → optionally 100% local (Ollama), GoBD-compliant storage, retention per AO §147. Disclaimer: no tax advice (StBerG).