Receipt photo = recorded: how AI processes documents

You take a photo of a receipt, send it, and within seconds you have a complete accounting record with all the details. It sounds like magic, but behind it lies a sophisticated combination of technologies that have been evolving for decades. OCR (optical character recognition), machine learning, computer vision, and natural language processing work together to create a system that can "read" a paper document and turn it into structured data.
In this article, we'll look under the hood of the entire process. We'll explain exactly how AI "sees" and "reads" your receipt, what information it extracts, what level of accuracy you can expect, and what to watch out for. All technically precise, yet easy to understand — no academic jargon, just a practical explanation of what happens from the moment you press the shutter button to the moment the record is saved in your books.
Step 1: Taking the photo — what affects processing quality
Everything starts with a photograph. And while modern AI systems can handle even fairly poor-quality images, following a few simple guidelines helps achieve better results.
What AI needs to see
From an accounting perspective, the following details matter on a receipt or invoice:
- Supplier (company name, registration number, VAT number)
- Issue date and, where applicable, the tax point date
- Total amount
- VAT rate and amount (if you are VAT-registered)
- Tax base
- Description of goods or services (for correct categorization)
- Document number (for verification purposes)
Tips for better document photos
- Straight-on view from above — photograph the document head-on, not at an angle
- Good lighting — natural daylight works best; avoid harsh shadows
- Full document in frame — make sure all edges are visible
- Steady hands — hold your phone with both hands or rest it against a surface
- Flat document — if the receipt is crumpled, smooth it out before taking the photo
- Watch out for glare — glossy thermal paper can reflect light
Even if you can't tick every box, modern AI handles most situations well. But the better the input, the faster and more accurate the result.
What happens on your phone before sending
Even before you send the photo, your phone performs basic optimizations: autofocus, exposure correction, and image compression. WhatsApp then compresses the photo slightly for faster delivery, but retains enough quality for text recognition. The resolution of a typical WhatsApp photo (approximately 1600 x 1200 pixels) is more than sufficient for OCR processing.
Step 2: Image pre-processing — preparing for recognition
When the AI system receives your image, it doesn't immediately start "reading" the text. First, it performs a series of adjustments that dramatically improve recognition accuracy.
Geometric correction
If you photographed the document at an angle, the system detects the perspective distortion and "straightens" the image. Imagine photographing a receipt lying on a table at a slight angle — AI can calculate what the document would look like viewed directly from above, and transforms the image accordingly.
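To make the "straightening" step more concrete, here is a minimal sketch in plain Python. It assumes the four corners of the receipt have already been located in the photo (corner detection is not shown), and the function names and coordinates are illustrative, not taken from any particular product. It computes the perspective transform that maps those corners onto an upright rectangle.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest pivot to keep the elimination stable.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def homography(src, dst):
    """Return the 8 coefficients of the perspective transform src -> dst.

    src and dst are lists of four (x, y) points; the ninth coefficient is fixed to 1.
    """
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b)


def warp_point(h, x, y):
    """Map one photo coordinate to its position in the straightened image."""
    w = h[6] * x + h[7] * y + 1
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


# Hypothetical corner positions of a receipt photographed at an angle (pixels).
corners = [(120, 80), (980, 140), (1040, 1500), (60, 1420)]
# Target: an upright 800 x 1400 rectangle.
target = [(0, 0), (800, 0), (800, 1400), (0, 1400)]
h = homography(corners, target)
```

Applying `warp_point` to every pixel (with interpolation) would produce the straightened image. Production systems typically rely on an image-processing library for this rather than hand-written code.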
Lighting correction
Uneven lighting (such as a shadow from your hand) can cause part of the text to appear dark and another part bright. The system applies adaptive brightness equalization to give the entire document consistent contrast.
Conversion to black and white
For OCR purposes, color information is irrelevant — what matters is only the contrast between text and background. The system converts the image to black and white (binary) form, where text is black on a white background. This process is called binarization, and there are sophisticated algorithms that adapt to varying photographic conditions.
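One classic binarization algorithm is Otsu's method, which picks the grey level that best separates dark pixels from bright ones. The sketch below is a plain-Python illustration on a handful of made-up pixel values. It uses a single global threshold, whereas the adaptive algorithms mentioned above compute thresholds per region, and nothing here is claimed to be what a specific service runs.

```python
def otsu_threshold(pixels):
    """Return the grey level (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0      # number of pixels at or below the candidate threshold
    sum_dark = 0    # sum of their grey levels
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        w_bright = total - w_dark
        if w_dark == 0:
            continue
        if w_bright == 0:
            break
        mean_dark = sum_dark / w_dark
        mean_bright = (sum_all - sum_dark) / w_bright
        between = w_dark * w_bright * (mean_dark - mean_bright) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def binarize(pixels, threshold):
    """Dark pixels become 0 (text), bright pixels become 255 (background)."""
    return [0 if p <= threshold else 255 for p in pixels]


# Illustrative greyscale values: four dark "ink" pixels, four bright "paper" pixels.
sample = [30, 40, 35, 200, 210, 220, 205, 45]
t = otsu_threshold(sample)
black_and_white = binarize(sample, t)
```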
Orientation detection
If you photographed the document upside down or rotated 90 degrees, the system automatically detects the correct text orientation and rotates the image accordingly.
Noise removal
Paper texture, small spots, table surfaces in the background — all of this is "noise" that AI must filter out in order to focus solely on the relevant text and numbers.
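A simple illustration of noise removal is a 3 x 3 median filter: each pixel is replaced by the median of its neighbourhood, so isolated specks disappear while larger shapes survive. This is a sketch on a tiny made-up image, and the function name is hypothetical.

```python
def median_filter(img):
    """Apply a 3x3 median filter to a 2D list of grey levels (borders left unchanged)."""
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = sorted(window)[4]  # middle of the 9 sorted values
    return out


# A white 5x5 patch with one dark speck in the middle.
speck = [[255] * 5 for _ in range(5)]
speck[2][2] = 0
cleaned = median_filter(speck)
```

The trade-off is that very thin strokes can be eroded too, which is why filter size has to be balanced against text size.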
Why pre-processing matters so much
Without pre-processing, OCR accuracy for typical smartphone photos would reach only 70–80%. Thanks to automatic corrections, accuracy increases to 95–99%+. To put that in perspective: on a receipt with 10 fields, 80% accuracy means roughly 2 fields read incorrectly, while 99% means practically none.
Step 3: OCR — recognizing text from an image
Now comes the heart of the entire process: optical character recognition (OCR). Simply put, the system "reads" text from an image. But modern OCR is far more sophisticated than most people imagine.
How OCR "sees" letters
Traditional OCR systems compared shapes in an image against a pre-stored database of fonts. Modern systems based on deep learning work differently — they have learned to recognize characters from hundreds of millions of examples, much like a child learning to read.
The process unfolds in several stages:
Text region detection — A neural network first identifies where text appears in the image. It distinguishes text from logos, barcodes, images, and decorative elements.
Line segmentation — The text is split into individual lines. On receipts this is usually straightforward (lines are clearly separated); on invoices with tables it becomes more complex.
Character recognition — Each line is analyzed and individual characters (letters, digits, punctuation) are identified. Modern systems process entire lines at once, which improves accuracy — the context of surrounding characters helps identify even less legible ones.
Language correction — The recognized text is compared against a language model. If OCR "reads" a word incorrectly, the language model suggests a correction. This post-processing stage significantly reduces error rates.
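The language correction stage can be pictured with a very small example. The sketch below uses Python's standard `difflib` module to snap a misread keyword to the closest known word, and a character map to repair look-alike characters in numeric fields. The vocabulary, the substitution table, and the cutoff are assumptions chosen for illustration; real systems use far richer language models.

```python
import difflib

# Assumed vocabulary of keywords expected on a receipt.
KEYWORDS = ["TOTAL", "SUBTOTAL", "VAT", "DATE", "RECEIPT", "BASE"]

# Assumed look-alike substitutions, applied only to fields known to be numeric.
DIGIT_FIXES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"})


def correct_keyword(word, cutoff=0.7):
    """Return the closest known keyword, or the word unchanged if nothing is close."""
    matches = difflib.get_close_matches(word.upper(), KEYWORDS, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_amount(text):
    """Replace letters that OCR commonly confuses with digits."""
    return text.translate(DIGIT_FIXES)
```

For example, `correct_keyword("T0TAL")` yields `"TOTAL"`, and `correct_amount("l,2S4.5O")` yields `"1,254.50"`.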
Modern OCR accuracy in numbers
| Document type | Character-level accuracy | Field-level accuracy |
|--------------|--------------------------|----------------------|
| Printed invoice (PDF quality) | 99.5%+ | 99%+ |
| Printed receipt (thermal paper, good quality) | 98–99% | 95–99% |
| Printed receipt (faded, crumpled) | 93–97% | 85–95% |
| Handwritten document | 85–92% | 70–85% |
| Angled document (correction enabled) | 96–99% | 93–98% |
| Poorly lit document | 90–96% | 82–93% |
Character-level accuracy = percentage of individual characters correctly recognized. Field-level accuracy = percentage of fields (date, amount, supplier) where the entire value is recognized correctly.
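The gap between the two measures follows from simple arithmetic: a field is only correct if every one of its characters is. As a rough sketch, assuming character errors are independent and ignoring any language correction (both simplifications), field accuracy is character accuracy raised to the power of the field length.

```python
def naive_field_accuracy(char_accuracy, field_length):
    """Probability that every character in a field is read correctly,
    assuming independent errors and no post-correction."""
    return char_accuracy ** field_length


def expected_wrong_fields(field_accuracy, field_count):
    """Average number of incorrect fields per document."""
    return field_count * (1 - field_accuracy)


ten_char_field = naive_field_accuracy(0.99, 10)   # roughly 0.90
errors_per_receipt = expected_wrong_fields(0.80, 10)
```

With 99% character accuracy, a 10-character field comes out right only about 90% of the time under these assumptions. Real field-level figures tend to be better than this naive estimate because context and post-processing catch many single-character slips.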
For comparison: manual data entry by humans has an average error rate of 1–4% at the field level. With repetitive transcription (such as 50 receipts in a row), error rates increase with fatigue.
Step 4: Intelligent data extraction — from text to structured data
OCR gives you "raw" text. But the text "Total: 1,234.50 CZK" is still just a string of characters to a computer. The next step is understanding what that text means — assigning the correct values to the correct fields.
How AI understands the structure of a document
Every shop, restaurant, or supplier uses a different receipt format. Yet AI can extract the same information from all of them. How?
Layout analysis — AI analyzes the spatial arrangement of text on the document. It recognizes the header (seller information), the body (list of items), and the footer (totals, VAT, payment).
Contextual recognition — The system looks for keywords and patterns. "Total", "Sum", "Amount due" signal the total amount. "VAT", "21%", "12%" signal tax information. The word "Date" followed by a value in DD.MM.YYYY format signals the issue date.
Relational mapping — AI understands relationships between data points. The number after "Total" is the total amount. The percentage after "VAT" is the tax rate. The amount on a line containing "VAT 21%" is the tax amount at that rate.
A concrete extraction example
Imagine this electronics store receipt:
DATART International, a.s.
Company reg. no.: 64828824
VAT no.: CZ64828824
Store: OC Chodov, Praha 4
Date: 15.02.2026 Time: 14:23
Register: 3 Receipt no.: 2026-00847
USB-C cable 2m 249.00
Wireless mouse 599.00
Mouse pad 149.00
----------
Subtotal: 997.00
VAT 21%: 172.89
Base 21%: 824.11
TOTAL: 997.00
Card payment: 997.00
Card: **** **** **** 4521
From this text, AI extracts:
- Supplier: DATART International, a.s. (recognized from the header)
- Company reg. no.: 64828824 (pattern of 8 digits after "Company reg. no.:" recognized)
- VAT no.: CZ64828824 (pattern of CZ + digits after "VAT no.:" recognized)
- Date: 15.02.2026 (date format recognized)
- Document number: 2026-00847 (recognized from the "Receipt no.:" line)
- Items: USB-C cable (249 CZK), Wireless mouse (599 CZK), Mouse pad (149 CZK)
- Total amount: 997.00 CZK (recognized from the "TOTAL:" line)
- VAT rate: 21% (recognized from the "VAT 21%:" line)
- VAT amount: 172.89 CZK
- Tax base: 824.11 CZK
- Payment method: Card (recognized from the "Card payment" text)
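The keyword-and-pattern part of this extraction can be approximated with regular expressions. The sketch below runs on an abridged copy of the receipt text from the example. The pattern set and function name are illustrative assumptions; a production system would combine such rules with layout analysis and learned models rather than rely on fixed patterns alone.

```python
import re

RECEIPT = """DATART International, a.s.
Company reg. no.: 64828824
VAT no.: CZ64828824
Date: 15.02.2026 Time: 14:23
Register: 3 Receipt no.: 2026-00847
Subtotal: 997.00
VAT 21%: 172.89
Base 21%: 824.11
TOTAL: 997.00
Card payment: 997.00"""

# One illustrative pattern per field; group 1 captures the value.
PATTERNS = {
    "reg_no": r"Company reg\. no\.:\s*(\d{8})\b",
    "vat_no": r"VAT no\.:\s*(CZ\d{8,10})\b",
    "date": r"Date:\s*(\d{2}\.\d{2}\.\d{4})",
    "document_no": r"Receipt no\.:\s*(\S+)",
    "vat_rate": r"^VAT (\d{1,2})%:",
    "vat_amount": r"^VAT \d{1,2}%:\s*([\d.,]+)",
    "tax_base": r"^Base \d{1,2}%:\s*([\d.,]+)",
    "total": r"^TOTAL:\s*([\d.,]+)",
}


def extract_fields(text):
    """Return a dict of field name -> extracted string (None if not found)."""
    # Simplifying assumption: the supplier name is the first line of the header.
    fields = {"supplier": text.splitlines()[0].strip()}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text, flags=re.MULTILINE)
        fields[name] = match.group(1) if match else None
    return fields


fields = extract_fields(RECEIPT)
```

Anchoring `TOTAL:` to the start of a line keeps the pattern from matching the "Subtotal:" line by accident.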
Data validation — checking for correctness
After extraction, AI performs automatic checks:
- Mathematical check: Base (824.11) + VAT (172.89) = Total (997.00)? Yes, correct.
- VAT rate check: 21% of 824.11 = 173.06, against 172.89 printed on the receipt. A minor difference due to rounding, within tolerance.
- Company reg. no. check: 64828824 — matches the format (8 digits), can be verified in the ARES business register.
- Date check: 15.02.2026 — valid date, not in the future, not too old.
- VAT no. check: CZ64828824 — matches the format, corresponds to the company registration number.
If any check fails, the system alerts you and asks for manual verification.
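These checks translate into a few short functions. The sketch below is one possible shape, with tolerances chosen for illustration (the article does not state the actual thresholds). It also goes one step beyond the format check described above by verifying the check digit of the Czech company registration number, which is computed with a weighted modulo-11 rule.

```python
from datetime import date, datetime
from decimal import Decimal


def totals_add_up(base, vat, total, tolerance=Decimal("0.01")):
    """Mathematical check: base + VAT should equal the total."""
    return abs(base + vat - total) <= tolerance


def vat_matches_rate(base, vat, rate_percent, tolerance=Decimal("0.20")):
    """VAT rate check with an assumed rounding tolerance."""
    expected = (base * rate_percent / 100).quantize(Decimal("0.01"))
    return abs(expected - vat) <= tolerance


def valid_reg_no(reg_no):
    """Format check (8 digits) plus the modulo-11 check digit."""
    if len(reg_no) != 8 or not reg_no.isdigit():
        return False
    weighted = sum(int(d) * w for d, w in zip(reg_no[:7], range(8, 1, -1)))
    return (11 - weighted % 11) % 10 == int(reg_no[7])


def plausible_date(text, today, max_age_days=3650):
    """Date check: valid DD.MM.YYYY, not in the future, not too old (assumed limit)."""
    try:
        issued = datetime.strptime(text, "%d.%m.%Y").date()
    except ValueError:
        return False
    return 0 <= (today - issued).days <= max_age_days


checks = {
    "math": totals_add_up(Decimal("824.11"), Decimal("172.89"), Decimal("997.00")),
    "vat_rate": vat_matches_rate(Decimal("824.11"), Decimal("172.89"), 21),
    "reg_no": valid_reg_no("64828824"),
    "date": plausible_date("15.02.2026", today=date(2026, 2, 20)),
}
```

Using `Decimal` rather than floating-point numbers avoids spurious mismatches of a fraction of a cent in the comparisons.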
Step 5: Intelligent categorization — where the document belongs
Extracting data alone is not enough. For correct tax records, every document must be assigned to the right category. And this is where AI really shows its strength over basic OCR.
How AI selects a category
Categorization is based on multiple signals at once:
Supplier — If the supplier is a petrol station chain, there's a high probability it's fuel. Electronics retailers signal IT equipment. Wholesale suppliers signal bulk purchasing.
Items on the document — Recognized items ("USB cable", "mouse", "mouse pad") allow more precise categorization: IT equipment / office supplies.
Amount and context — A small amount at a food supplier (roughly £8–25) suggests a meal. A large amount at the same supplier (£250+) might be corporate catering.
User history — If you've previously assigned documents from the same supplier to a specific category, AI respects that preference.
Time and frequency — A receipt from a petrol station every Monday morning is probably a regular fill-up of a company vehicle.
| Signal on document | Suggested category | Confidence |
|-------------------|-------------------|------------|
| Supplier: petrol station, product: unleaded fuel | Fuel | 99% |
| Supplier: electronics retailer, product: 27" monitor | IT equipment | 97% |
| Supplier: wholesaler, items: office paper | Office supplies | 95% |
| Supplier: restaurant | Meals / Entertainment | 85% (requires clarification) |
| Supplier: unknown, items: unclear | Other expenses | 60% (requires manual assignment) |
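Combining several signals into one suggestion can be sketched as a weighted vote. The hint tables and weights below are invented for illustration, and the resulting score is only a relative ranking, not a calibrated probability like the confidence figures in the table. Real systems learn these relationships from data instead of hard-coding them.

```python
from collections import defaultdict

# Hypothetical hint tables and weights, for illustration only.
SUPPLIER_HINTS = {
    "petrol station": "Fuel",
    "electronics retailer": "IT equipment",
    "restaurant": "Meals / Entertainment",
}
ITEM_HINTS = {
    "monitor": "IT equipment",
    "mouse": "IT equipment",
    "usb": "IT equipment",
    "paper": "Office supplies",
    "fuel": "Fuel",
}
WEIGHTS = {"supplier": 0.5, "items": 0.3, "history": 0.2}


def suggest_category(supplier_type, items, history_category=None):
    """Return (category, score) from supplier, item, and history signals."""
    scores = defaultdict(float)
    if supplier_type in SUPPLIER_HINTS:
        scores[SUPPLIER_HINTS[supplier_type]] += WEIGHTS["supplier"]
    hits = [cat for item in items
            for keyword, cat in ITEM_HINTS.items() if keyword in item.lower()]
    for cat in hits:
        # Split the item weight evenly across all keyword hits.
        scores[cat] += WEIGHTS["items"] / len(hits)
    if history_category:
        scores[history_category] += WEIGHTS["history"]
    if not scores:
        return "Other expenses", 0.0
    best = max(scores, key=scores.get)
    return best, round(scores[best], 2)


suggestion = suggest_category(
    "electronics retailer",
    ["USB-C cable 2m", "Wireless mouse", "Mouse pad"],
)
```

A low score, or no signal at all, is the natural trigger for asking the user to confirm the category manually.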
Learning from your corrections
A key feature of modern AI systems is the ability to learn. When you correct a category — for example, recategorizing a restaurant receipt from "Meals" to "Client entertainment — business lunch" — the system remembers the correction. Next time it encounters a similar document, it will suggest the right category with greater confidence.
This kind of feedback loop is often described as human-in-the-loop learning, and it is the reason why an AI assistant gets better the longer you use it.
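One simple way to picture how a correction is "remembered" is a per-supplier tally of the categories the user has confirmed. The class below is a hypothetical, minimal sketch; actual systems may retrain models or use more elaborate personalization.

```python
from collections import Counter, defaultdict


class CorrectionMemory:
    """Remembers which category the user chose for each supplier."""

    def __init__(self):
        self._counts = defaultdict(Counter)

    def record(self, supplier, category):
        """Store one user confirmation or correction."""
        self._counts[supplier][category] += 1

    def suggest(self, supplier, default):
        """Return (category, share of past choices), or the default if unseen."""
        seen = self._counts.get(supplier)
        if not seen:
            return default, 0.0
        category, count = seen.most_common(1)[0]
        return category, count / sum(seen.values())


memory = CorrectionMemory()
for _ in range(3):
    memory.record("Bistro U Karla", "Client entertainment")  # made-up supplier name
memory.record("Bistro U Karla", "Meals")
```

After these four entries, `memory.suggest("Bistro U Karla", "Meals")` returns the user's usual choice with a share of 0.75, while an unseen supplier falls back to the default.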
Step 6: Saving and archiving
The final step is saving the structured data and the original image.
What gets stored
- Structured data: All extracted fields in database format (date, amount, supplier, category, VAT…)
- Original photograph: An archived copy at full resolution as proof of the document's existence
- Metadata: Processing time, AI model version, confidence scores for individual fields
- Change history: If you made any corrections, the edit history is also stored
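As an illustration of what such a stored record might look like, here is a small sketch using a Python dataclass. The field names, the file path, and the version string are assumptions made for the example, not the schema of any actual service.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DocumentRecord:
    # Structured data
    supplier: str
    issue_date: str
    total: str
    currency: str
    category: str
    # Link to the archived original photograph
    image_path: str
    # Metadata
    model_version: str
    confidence: dict = field(default_factory=dict)
    # Change history
    history: list = field(default_factory=list)

    def correct(self, name, new_value):
        """Apply a user correction and keep the previous value in the history."""
        self.history.append({"field": name, "old": getattr(self, name), "new": new_value})
        setattr(self, name, new_value)


record = DocumentRecord(
    supplier="DATART International, a.s.",
    issue_date="2026-02-15",
    total="997.00",
    currency="CZK",
    category="Other expenses",
    image_path="archive/2026/02/receipt-0001.jpg",  # hypothetical path
    model_version="example-1.0",                    # hypothetical version label
    confidence={"total": 0.99, "category": 0.60},
)
record.correct("category", "IT equipment")
as_json = json.dumps(asdict(record), ensure_ascii=False)
```

Keeping the amount as a string (or a decimal type) rather than a float avoids rounding surprises when the record is serialized and read back.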
Legal archiving requirements
How long you must keep documents
Under Czech accounting law and the Tax Code:
- Tax documents (invoices, receipts): at least 10 years from the end of the tax period in which the VAT liability arose
- Records kept under the tax records regime: at least 5 years in practice (the general period for tax obligations is 3 years, running from the end of the year of filing)
- Payroll records: up to 30 years (for pension insurance purposes)
Digital archiving is fully compliant with the law, provided that the legibility, authenticity, and durability of the record are ensured. A quality photo taken with a modern smartphone meets these requirements.
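The 10-year rule for tax documents is easy to turn into a date calculation. The sketch below assumes a monthly VAT period, so the tax period ends on the last day of the month in which the document was issued; quarterly filers would need a different period end. It is an arithmetic illustration, not legal advice.

```python
import calendar
from datetime import date


def end_of_month(d):
    """Last day of the month containing d."""
    return d.replace(day=calendar.monthrange(d.year, d.month)[1])


def keep_until(issue_date, years=10):
    """Earliest date after which the document may be discarded,
    assuming a monthly tax period (an assumption of this sketch)."""
    period_end = end_of_month(issue_date)
    year = period_end.year + years
    last_day = calendar.monthrange(year, period_end.month)[1]
    return date(year, period_end.month, last_day)


retention_deadline = keep_until(date(2026, 2, 15))
```

Recomputing the last day of the month in the target year handles leap years, so a period ending on 29 February never produces an invalid date.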
Special cases: what AI handles well and where it struggles
Thermal paper receipts
Thermal paper (the kind used for standard shop and restaurant receipts) fades over time. AI can process even a partially faded receipt, but the sooner you photograph it, the better the result. For very old, heavily faded receipts, accuracy may be lower.
Recommendation: Photograph receipts as soon as possible after receiving them — ideally right at the till.
Multilingual documents
If you shop abroad or deal with foreign suppliers, AI systems handle text recognition in most European languages. Modern OCR models are trained on dozens of languages simultaneously.
Documents with multiple VAT rates
Some purchases include items at different VAT rates (for example, food at a reduced rate and non-food items at the standard rate). AI can recognize and correctly distinguish the individual rates.
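Once each item has been matched to its rate, splitting the receipt into a tax base and VAT amount per rate is straightforward arithmetic. The sketch below assumes gross (VAT-inclusive) item prices and uses 21% and 12% purely as example rates. Actual receipts may round differently, so small deviations from printed values are to be expected.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP


def vat_breakdown(items):
    """Group gross prices by VAT rate and derive base and VAT for each rate.

    items: iterable of (gross_price_as_string, rate_percent) pairs.
    """
    gross_by_rate = defaultdict(Decimal)
    for price, rate in items:
        gross_by_rate[rate] += Decimal(price)

    breakdown = {}
    for rate, gross in gross_by_rate.items():
        base = (gross * 100 / (100 + rate)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
        breakdown[rate] = {"gross": gross, "base": base, "vat": gross - base}
    return breakdown


# Illustrative basket: two items at 21%, one at 12%.
basket = [("100.00", 21), ("21.00", 21), ("112.00", 12)]
summary = vat_breakdown(basket)
```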
Credit notes and corrective documents
AI recognizes when a document is a credit note (negative amount, text such as "credit note" or "corrective document") and correctly records it as a reduction in expenses rather than a new expense.
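The sign handling can be sketched in a few lines. The marker phrases below are the ones mentioned above; the function name and the idea of returning a signed amount are assumptions for illustration.

```python
from decimal import Decimal

CREDIT_MARKERS = ("credit note", "corrective document")


def signed_expense(document_text, amount):
    """Return the amount as a negative number for credit notes, unchanged otherwise."""
    value = Decimal(amount)
    text = document_text.lower()
    is_credit = value < 0 or any(marker in text for marker in CREDIT_MARKERS)
    return -abs(value) if is_credit else value
```

For example, `signed_expense("CREDIT NOTE no. 12", "500.00")` gives `Decimal("-500.00")`, so the entry reduces expenses instead of adding to them.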
PDF invoices
Electronic invoices in PDF format are paradoxically easier for AI than photos of paper documents — the text is directly machine-readable, so OCR in the traditional sense isn't required. Extraction accuracy from PDF files approaches 100%.
PDF vs. photo: accuracy comparison
| Document source | Extraction accuracy | Processing speed |
|----------------|---------------------|-----------------|
| PDF invoice (structured) | 99.5%+ | Under 2 seconds |
| PDF invoice (scanned image) | 97–99% | 3–5 seconds |
| Document photo (good quality) | 95–99% | 3–8 seconds |
| Document photo (lower quality) | 85–95% | 5–15 seconds |
PDF invoices are processed fastest and most accurately, because the text does not need to be recognized from an image.
How AI keeps improving
One of the greatest benefits of AI-based document processing is continuous improvement. Every document processed contributes to making the system more accurate.
Global learning
When thousands of users send receipts from the same chain, AI learns to recognize that chain's specific receipt format. A new user immediately benefits from the fact that the system already knows that format.
Personalized learning
Your corrections and confirmations help AI better understand your specific needs. If you're an IT consultant and always categorize restaurant receipts as "Client entertainment — business lunch", the system adapts to your profile.
Model updates
Developers regularly retrain AI models on new data, add support for new document formats, and improve accuracy based on anonymized user feedback.
Practical test: processing 5 different documents
To keep this article grounded in reality, let's look at a typical set of documents a self-employed person might collect in a single day:
| Document | Type | Expected AI result | Potential issue |
|----------|------|--------------------|----------------|
| Petrol station receipt for fuel | Thermal paper, printed | Fuel, 1,580 CZK, VAT 21% | None (standard format) |
| Web hosting invoice (PDF) | Electronic PDF | IT services, 290 CZK/month, VAT 21% | None (PDF is the ideal input) |
| Stationery shop receipt | Thermal paper, small | Office supplies, 347 CZK | Small format, possible cropping |
| Restaurant bill (business lunch) | Printed on plain paper | Meals/Entertainment, 890 CZK | Category clarification required |
| Graphic designer invoice for logo (handwritten) | Semi-handwritten, non-standard | Services/Marketing, 5,000 CZK | Handwriting (lower accuracy) |
Out of five documents, 3–4 will likely be processed fully automatically with no intervention needed. For 1–2 you may need to confirm or refine the category. In practice, that's a matter of seconds, not minutes.
Frequently asked questions
How accurate is receipt recognition?
For standard printed receipts from retail chains and petrol stations, accuracy reaches 97–99%. Language-specific characters (accents, special letters), date formats like DD.MM.YYYY, and amount formats using a decimal comma are all fully supported in modern AI models.
Will AI recognize a receipt where the paper has been double-printed?
Partially. If the overprint is minor and the key details (amount, date) are legible, AI will manage. For heavily overprinted or folded documents, accuracy may be lower; in such cases the system will ask you to fill in the details manually.
What if a receipt contains a mix of personal and business purchases?
AI extracts all items. You can then mark which items are business-related and which are personal. The system separates the accounting-relevant portion accordingly.
Does it work for foreign documents?
Yes, modern OCR systems support dozens of languages and currencies. A document in German, English, or Slovak will be processed just as reliably as a local one.
What happens if AI recognizes something incorrectly?
You receive a summary of the recognized data for confirmation. If any detail is wrong, you simply correct it. AI remembers the correction for next time. Every document processed contributes to greater accuracy in the future.
Will AI document scanning replace a traditional scanner?
For self-employed individuals and tax record purposes, yes. The camera on a modern smartphone captures images at sufficient quality for archiving. Professional accounting firms processing thousands of documents daily may still prefer dedicated scanners, but for an individual sole trader, a phone is more than adequate.
Stop transcribing — start photographing
In 2026, the technology for automatic document processing is mature enough to reliably replace manual data entry for everyday use by self-employed individuals. Accuracy of 97–99% for standard printed receipts and above 99% for printed invoices, processing in seconds, and continuous improvement through machine learning: all of this makes AI document processing a practical tool, not a futuristic vision.
DokladBot brings this technology directly into WhatsApp. No special app, no scanner, no software. Just your phone, its camera, and the WhatsApp you already have.
Send your first receipt and see for yourself how quickly and accurately AI processing works.
Try DokladBot — from photo to accounting record in 5 seconds.
Official sources
- Czech Financial Administration — document archiving — rules for retaining accounting and tax documents
- Act No. 563/1991 Coll., on Accounting — legislative framework for keeping accounting records
- ARES — business entities register — verifying company registration numbers and business details
- Czech Statistical Office — statistics on the Czech business environment
This article is for informational purposes only and does not constitute accounting or tax advice. AI processing accuracy may vary depending on the quality of the source documents and the specific service provider. Information current as of February 2026.