What Vision Models See_
LLMs and OCR engines parse text at face value. If a receipt says $9,420.00, they believe it. Zero pixel inspection occurs.
STAGE 02 // TARGET ACQUISITION Isolating The Anomaly_
DocVerify locks onto the mathematically suspicious total. The digits exhibit anti-aliasing patterns inconsistent with the surrounding font — a telltale sign of post-production editing.
Into The Pixel Grid_
As we zoom past the character boundary, the smooth typography dissolves into raw raster data. Each cell is a single pixel with an RGB color value captured from the receipt surface.
STAGE 04 // RASTER ANALYSIS Reading The Raw Data_
R, G, B triplets reveal the true color of each pixel. Normal receipt pixels show uniform paper tones, with every channel at 230 or higher on the 0-255 scale. The highlighted anomaly region shows a shift toward warmer values, evidence of re-rendered glyphs.
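As a rough illustration of reading the triplets, the sketch below checks paper tone against the 230 floor and measures warmth as red minus blue. The warmth measure, the function names, and the sample triplets are assumptions made for this example, not DocVerify's actual method or data.

```python
from statistics import mean

PAPER_FLOOR = 230  # from the text: paper tones read 230 or higher per channel

def is_paper_tone(pixel, floor=PAPER_FLOOR):
    """True when every channel of an (R, G, B) triplet is at or above the floor."""
    return all(channel >= floor for channel in pixel)

def warmth(pixel):
    """Red minus blue; larger values mean a warmer, yellower pixel (assumed proxy)."""
    r, _, b = pixel
    return r - b

def warmth_shift(background, region):
    """Mean warmth of the region minus mean warmth of the background."""
    return mean(warmth(p) for p in region) - mean(warmth(p) for p in background)

# Made-up sample triplets, not measurements from a real receipt.
paper = [(238, 236, 233), (240, 238, 234), (236, 235, 231)]
suspect = [(244, 236, 222), (246, 237, 224), (243, 235, 221)]

shift = warmth_shift(paper, suspect)
print(f"warmth shift of suspect region: {shift:.1f}")
```

A large positive shift only says the region is tinted differently from its surroundings; lighting gradients can do the same, so it is one signal to be combined with others.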
STAGE 05 // SENSOR FINGERPRINT Cluster Mismatch_
Real photos carry a coherent noise fingerprint — every pixel matches the camera sensor that captured it. Our model groups pixels by their statistical signature. The anomaly region clusters separately, exposing itself as foreign content welded onto the original.
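One simple way to picture the grouping step: score each image block by the variance of its neighbor-to-neighbor differences, then split the scores into two clusters. This is a toy stand-in under stated assumptions (grayscale blocks, a variance-based noise proxy, 1-D k-means with k=2); the text does not say which statistics or clustering method the model uses.

```python
from statistics import pvariance

def block_noise(block):
    """Noise proxy: variance of differences between horizontally adjacent pixels."""
    diffs = [row[i + 1] - row[i] for row in block for i in range(len(row) - 1)]
    return pvariance(diffs)

def two_means(values, iters=20):
    """1-D k-means with k=2; returns a 0/1 label for each value."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        if group0:
            lo = sum(group0) / len(group0)
        if group1:
            hi = sum(group1) / len(group1)
    return labels

# Made-up grayscale blocks: three with sensor-like grain, one unnaturally smooth.
blocks = [
    [[200, 204, 199, 203], [202, 198, 203, 199]],
    [[201, 197, 202, 198], [199, 203, 198, 202]],
    [[200, 205, 200, 204], [203, 199, 204, 199]],
    [[200, 200, 201, 200], [201, 200, 200, 201]],
]

scores = [block_noise(b) for b in blocks]
labels = two_means(scores)
minority = min((0, 1), key=labels.count)
outliers = [i for i, lab in enumerate(labels) if lab == minority]
print("blocks that cluster apart:", outliers)
```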
STAGE 06 // 3D ARCHITECTURE Dimensional Proof_
Each pixel’s height now represents how far it sits from its local cluster. The anomaly region literally rises above the baseline — a 3D topographic map of manipulation that no human eye could detect.
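The height map can be sketched as a distance-from-typical score per cell. The version below is an assumption-laden simplification: it measures distance from the global median in units of the median absolute deviation, rather than from a local cluster, and the grid values are invented noise scores.

```python
from statistics import median

def height_map(grid):
    """Height of each cell = distance from the median, in MAD units."""
    flat = [v for row in grid for v in row]
    center = median(flat)
    # Fall back to 1.0 so a perfectly uniform grid does not divide by zero.
    spread = median([abs(v - center) for v in flat]) or 1.0
    return [[abs(v - center) / spread for v in row] for row in grid]

# Made-up per-cell noise scores; the center cell is the odd one out.
noise_scores = [
    [19, 20, 21],
    [20, 1, 19],
    [21, 20, 20],
]

heights = height_map(noise_scores)
for row in heights:
    print(" ".join(f"{h:5.1f}" for h in row))
```

Plotting `heights` as a surface gives the kind of raised region the stage describes.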
Proof Of Manipulation_
Cluster outliers, font edge anti-alias mismatch, and DCT block boundary shifts converge — three independent forensic signals pointing to the same region. But edited photos are only one threat. Every AI image generation model leaves its own fingerprint.
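To make the DCT block boundary signal concrete: JPEG compresses in 8x8 blocks, so faint edges tend to line up on a grid, and content pasted from another image can carry a grid with a different offset. The sketch below estimates that horizontal offset from edge energy. The function names and the synthetic rows are assumptions for illustration only.

```python
def grid_phase(rows, block=8):
    """Column offset (0..block-1) where horizontal edge energy is strongest."""
    energy = [0.0] * block
    for row in rows:
        for x in range(1, len(row)):
            energy[x % block] += abs(row[x] - row[x - 1])
    return max(range(block), key=lambda phase: energy[phase])

def blocky_rows(phase, width=32, height=4, block=8):
    """Synthetic rows whose brightness steps at every block boundary."""
    return [
        [100 + 10 * (((x - phase) // block) % 2) for x in range(width)]
        for _ in range(height)
    ]

original_phase = grid_phase(blocky_rows(phase=0))
pasted_phase = grid_phase(blocky_rows(phase=3))
print("original grid offset:", original_phase)
print("pasted region grid offset:", pasted_phase)
```

A region whose offset disagrees with the rest of the image is a candidate for having been compressed separately.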
Threats Beyond Editing_
After we catch the edit, we step back. Generative AI is a different kind of attack — entire documents conjured from nothing. Two families. Two distinct fingerprints.
Refined From Noise_
Imagen 4 Ultra, Midjourney v7, and Stable Diffusion 3.5 start from pure noise and iteratively refine. The trajectory leaves frequency residue — peaks in the spectrum no real camera produces.
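A spectral peak of this kind can be illustrated in one dimension: take a row of pixel values, compute its discrete Fourier transform, and look for a bin that towers over the rest. The signal below is synthetic (a planted periodic component plus a small deterministic texture), and the plain DFT is a teaching stand-in, not the detector described here.

```python
import cmath
import math
from statistics import median

def dft_magnitudes(signal):
    """Magnitudes of DFT bins 1..n/2 of a mean-centered 1-D signal."""
    n = len(signal)
    avg = sum(signal) / n
    centered = [s - avg for s in signal]
    return [
        abs(sum(c * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, c in enumerate(centered)))
        for k in range(1, n // 2 + 1)
    ]

# Synthetic row: flat gray, a planted period-4 ripple, and small texture.
row = [
    128 + 5 * math.cos(2 * math.pi * 8 * t / 32) + ((2 * t) % 5 - 2)
    for t in range(32)
]

mags = dft_magnitudes(row)
peak_bin = max(range(len(mags)), key=lambda i: mags[i]) + 1  # bins start at 1
ratio = mags[peak_bin - 1] / max(median(mags), 1e-9)
print(f"strongest bin: {peak_bin}, peak-to-median ratio: {ratio:.1f}")
```

On real images the same idea is applied in two dimensions, and the threshold for what counts as an unnatural peak has to be calibrated against genuine camera output.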
STAGE 10 // AUTOREGRESSIVE Token By Token_
GPT Image 2 and Gemini Nano Banana 2 generate one token at a time, scanning left-to-right. Every token boundary is a tiny seam. Stitched together they form a predictable raster pattern.
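If seams recur at a fixed spacing, autocorrelation of an edge profile should peak at that spacing. The sketch below assumes a hypothetical 16-pixel token width and an invented per-column seam-strength profile; real token sizes and seam visibility vary by model and are not stated in the text.

```python
def autocorrelation(profile, lag):
    """Unnormalized autocorrelation of a mean-centered profile at one lag."""
    avg = sum(profile) / len(profile)
    return sum(
        (profile[i] - avg) * (profile[i + lag] - avg)
        for i in range(len(profile) - lag)
    )

def seam_period(profile, min_lag=2, max_lag=None):
    """Lag with the strongest self-similarity, taken as the seam spacing."""
    if max_lag is None:
        max_lag = len(profile) // 2
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: autocorrelation(profile, lag))

# Invented profile: a faint seam every 16 columns (hypothetical token width).
profile = [1.0 if x % 16 == 0 else 0.1 for x in range(64)]

period = seam_period(profile)
print("estimated seam spacing:", period)
```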
STAGE 11 // BEYOND PIXELS Document Forensics_
Even untouched photos may sit inside a doctored container. A PDF saved with incremental updates appends each edit as a new layer instead of overwriting the old bytes, and each layer is a forensic trail.
What A PDF Really Is_
Five revisions are stacked in the file: the original document, then a font subset swap, a content rewrite, an annotation overlay, and a metadata rewrite. We read every layer.
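A first pass at separating the layers can be sketched by splitting the file at its `%%EOF` markers, since each incrementally saved revision ends with one. This is a heuristic under assumptions: the bytes below are a toy stand-in rather than a valid PDF, and some untouched files (linearized PDFs, for example) can contain more than one marker.

```python
EOF_MARKER = b"%%EOF"

def split_revisions(pdf_bytes):
    """Slice the file into the chunk of bytes each revision appended."""
    parts, start = [], 0
    while True:
        idx = pdf_bytes.find(EOF_MARKER, start)
        if idx == -1:
            break
        end = idx + len(EOF_MARKER)
        parts.append(pdf_bytes[start:end])
        start = end
    return parts

# Toy bytes shaped like an original plus two appended updates (not a real PDF).
toy_pdf = (
    b"%PDF-1.7\n1 0 obj\n(row A)\nendobj\nstartxref\n0\n%%EOF\n"
    b"1 0 obj\n(row A, edited)\nendobj\nstartxref\n0\n%%EOF\n"
    b"2 0 obj\n(note)\nendobj\nstartxref\n0\n%%EOF\n"
)

revisions = split_revisions(toy_pdf)
for number, chunk in enumerate(revisions, start=1):
    print(f"revision {number}: {len(chunk)} bytes")
```

Each chunk can then be compared with the one before it to see which objects were redefined.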
STAGE 13 // REWRITTEN ROW Two Thousand Becomes Thirty_
One transaction was re-stroked between revisions 2 and 3. The original glyph metrics still live inside the file — and the new metrics don’t match.
STAGE 14 // ASSEMBLED PROOF Every Trace Examined_
Font subset hash, annotation rectangles, /ModDate vs /CreationDate gap, /Producer string. Each line helps establish when and how the file was tampered with.
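The date gap is the easiest of these traces to show. The sketch below pulls `/CreationDate` and `/ModDate` out of an information dictionary and subtracts them. It handles only fully specified PDF date strings, and the dictionary text, including the `/Producer` value, is invented for the example.

```python
import re
from datetime import datetime, timedelta, timezone

INFO_DATE = re.compile(r"/(CreationDate|ModDate)\s*\((D:[^)]*)\)")
PDF_DATE = re.compile(
    r"D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"
    r"(?:([+\-Z])(?:(\d{2})'?(\d{2})?'?)?)?"
)

def parse_pdf_date(text):
    """Parse a full-length PDF date such as D:20240105093000+02'00'."""
    match = PDF_DATE.match(text)
    if match is None:
        raise ValueError(f"unsupported PDF date: {text!r}")
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    sign, off_h, off_m = match.group(7), match.group(8), match.group(9)
    offset = timedelta(hours=int(off_h or 0), minutes=int(off_m or 0))
    if sign == "-":
        offset = -offset
    return datetime(year, month, day, hour, minute, second,
                    tzinfo=timezone(offset))

# Invented information dictionary for illustration.
info = ("<< /Producer (ExampleWriter 1.0) "
        "/CreationDate (D:20240105093000+02'00') "
        "/ModDate (D:20240212181500+02'00') >>")

dates = dict(INFO_DATE.findall(info))
gap = parse_pdf_date(dates["ModDate"]) - parse_pdf_date(dates["CreationDate"])
print("modified after creation by:", gap)
```

A gap on its own is not proof of fraud, since legitimate files are also edited after creation; it matters when it lines up with the other traces.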
STAGE 15 // FRAUD SIGNALS Caught Before Trust_
Cluster mismatch. Anti-alias break. DCT shift. Diffusion FFT peak. Autoregressive seam. PDF revision delta. Every signal collected. Every agent informed before a fake gets believed.