First Look: A Pathology Lab

I don't know pathology. I got curious about where AI is in the field and used radiology as a comparison point because AI is further along there. Here's what I found. I may have things wrong.


The Images Are Different

A chest X-ray is a few megabytes. A whole-slide pathology image at 40x magnification can exceed 10 gigapixels and occupy 1–30 GB uncompressed. A prostatectomy case can involve 25–100 slides. These images don't fit in GPU memory — they have to be split into patches, sometimes 100,000+ per slide, processed individually, and reassembled.
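
To make the tiling concrete, here's a minimal sketch of patch extraction, assuming the open-source OpenSlide library. The file name, patch size, and background threshold are all illustrative, not from any paper.

```python
# Minimal patch-extraction sketch using OpenSlide. The file name,
# patch size, and background threshold are illustrative assumptions.
import numpy as np
import openslide

PATCH = 256  # pixels per side, a common tile size

slide = openslide.OpenSlide("case_001.svs")  # hypothetical slide file
width, height = slide.dimensions             # full-resolution (level 0) size

coords = []
for y in range(0, height - PATCH + 1, PATCH):
    for x in range(0, width - PATCH + 1, PATCH):
        tile = slide.read_region((x, y), 0, (PATCH, PATCH)).convert("RGB")
        rgb = np.asarray(tile)
        if rgb.mean() < 220:       # skip mostly-white background
            coords.append((x, y))  # keep tissue tiles for inference

print(f"{len(coords)} tissue patches from a {width}x{height} slide")
```

In practice the tiles would be streamed to disk or a work queue rather than held in memory, since a single slide can yield the 100,000+ patches mentioned above.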

A slide scanner costs $150,000–$300,000. A workstation that can run patch-level inference costs $5,000–$10,000. Storage is a separate problem — large centers may produce a petabyte of image data per year.
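
A back-of-envelope version of that storage math, with inputs I made up rather than measured:

```python
# Back-of-envelope storage estimate. All inputs are assumptions,
# not figures from any specific lab.
slides_per_day = 500    # mid-size lab throughput
gb_per_slide = 1.5      # compressed WSI at 40x
working_days = 260

tb_per_year = slides_per_day * gb_per_slide * working_days / 1_000
print(f"~{tb_per_year:,.0f} TB/year")  # ~195 TB/year at these inputs
```

Scale the same arithmetic to a large center scanning several thousand slides a day and you land in petabyte-per-year territory, consistent with the figure above.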


The Staining Problem

Pathology images are full color, and their appearance depends on chemical dyes, fixation time, staining protocols, tissue thickness, and which scanner digitized the slide. Different manufacturers — Leica, Hamamatsu, Philips, 3DHISTECH — produce visibly different color profiles from the same tissue.

A 2023 review of stain normalization methods found that variations from "different scanning equipment, staining methods, and tissue reactivity" decrease the accuracy of computer-aided diagnosis systems (Salvi et al., Information Fusion, 2023). A 2025 study proposed StainLUT, a self-supervised normalization model that supports cross-center tumor classification without requiring institutions to share data (He et al., npj Digital Medicine, 2025).
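
StainLUT's internals are beyond a blog sketch, but the classical baselines that reviews like the 2023 one cover are simple. Here's a Reinhard-style normalization, which matches per-channel statistics in LAB space (the original method used the similar lαβ space), assuming scikit-image:

```python
# Reinhard-style color normalization: match a source patch's per-channel
# LAB mean and std to a reference patch. A classical baseline, not the
# 2025 self-supervised method described above.
import numpy as np
from skimage import color

def reinhard_normalize(src_rgb: np.ndarray, ref_rgb: np.ndarray) -> np.ndarray:
    """Both inputs are uint8 RGB patches; returns a normalized uint8 patch."""
    src = color.rgb2lab(src_rgb)
    ref = color.rgb2lab(ref_rgb)
    # Standardize the source channels, then rescale to the reference stats.
    out = (src - src.mean(axis=(0, 1))) / (src.std(axis=(0, 1)) + 1e-8)
    out = out * ref.std(axis=(0, 1)) + ref.mean(axis=(0, 1))
    return (color.lab2rgb(out) * 255).clip(0, 255).astype(np.uint8)
```

The appeal of learned methods is exactly what this baseline lacks: plain statistics matching can distort stains when the tissue composition of the source and reference patches differs.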

Radiology has domain shift too, from different CT manufacturers, reconstruction kernels, and MRI field strengths. But pathology adds a source radiology doesn't have: color variability from the staining itself.


What the Research Has Produced

Archive and Visual Search

SISH (Self-Supervised Image Search for Histology) encodes whole-slide images into compact representations and retrieves similar slides at constant speed regardless of database size — O(1) search across 22,000+ patient cases and 56 disease subtypes. The code is open source (Chen et al., Nature Biomedical Engineering, 2022).
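
SISH's actual pipeline (discrete codes from a VQ-VAE plus tree-guided search) is more involved than I can do justice to here, but the underlying intuition behind constant-time retrieval can be shown with an ordinary hash table: map slides to discrete codes so lookup cost doesn't grow with archive size. A toy illustration:

```python
# Not SISH itself (its pipeline builds discrete codes with a VQ-VAE and
# searches a tree structure). This toy shows only the core intuition:
# discrete codes + hash lookup = query cost independent of archive size.
from collections import defaultdict

index: dict[int, list[str]] = defaultdict(list)

def encode(embedding: list[float]) -> int:
    # Toy binary code: one bit per embedding dimension (sign threshold).
    return sum(1 << i for i, v in enumerate(embedding) if v > 0)

def add_slide(slide_id: str, embedding: list[float]) -> None:
    index[encode(embedding)].append(slide_id)

def query(embedding: list[float]) -> list[str]:
    return index[encode(embedding)]  # average O(1) dict lookup

add_slide("case-001", [0.3, -1.2, 0.8, 0.1])
add_slide("case-002", [0.4, -0.9, 0.7, 0.2])  # same code bucket
print(query([0.2, -0.5, 0.9, 0.05]))          # -> ['case-001', 'case-002']
```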

Google's SMILY retrieves histopathology patches by visual similarity using deep-learning embeddings. In blinded studies, pathologists confirmed that retrieved results matched the query on histologic features, organ site, and Gleason grade. The team found that interactive refinement tools (cropping, selecting examples, adjusting sliders) significantly increased clinical usefulness and trust (Hegde et al., npj Digital Medicine, 2019).
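
The retrieval step behind a tool like SMILY is worth seeing in miniature. This sketch uses random placeholder vectors where a real system would use a deep network's patch embeddings:

```python
# Cosine-similarity retrieval over patch embeddings. Random vectors
# stand in for real deep-network embeddings here.
import numpy as np

rng = np.random.default_rng(0)
db = rng.normal(size=(100_000, 128))             # one row per patch
db /= np.linalg.norm(db, axis=1, keepdims=True)  # unit-normalize rows

def most_similar(query_vec: np.ndarray, k: int = 5) -> np.ndarray:
    """Indices of the k patches most similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    return np.argsort(db @ q)[-k:][::-1]  # rank by cosine similarity

print(most_similar(rng.normal(size=128)))
```

The refinement tools the paper describes would sit on top of this: cropping changes the query vector, and sliders could re-weight embedding dimensions before the ranking runs.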

False Negative Detection

At Yale, researchers deployed a system combining AI and NLP across 19,246 chest and abdominal CT exams. The AI flagged findings; NLP parsed radiology reports; discrepancies were reviewed. Of flagged cases, 0.26% had clinically significant discrepancies, and 68% of those resulted in addenda to the original reports (Cavallo et al., Academic Radiology, 2023).
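
The paper doesn't publish its code, but the reconciliation logic is easy to caricature: compare the findings an image model flags against what a crude report parser can see. Everything below (finding names, report text, the negation check) is invented for illustration:

```python
# Toy reconciliation in the spirit of the Yale QA system: AI findings
# vs. report mentions. Real clinical NLP handles negation, synonyms,
# and uncertainty far more carefully than this.
import re

def report_mentions(report: str, finding: str) -> bool:
    # Crude check: the term appears and isn't immediately negated.
    pattern = rf"(?<!no )(?<!without ){re.escape(finding)}"
    return re.search(pattern, report.lower()) is not None

def discrepancies(ai_findings: set[str], report: str) -> set[str]:
    return {f for f in ai_findings if not report_mentions(report, f)}

report = "Lungs clear. No pneumothorax. Stable 4 mm hepatic cyst."
print(discrepancies({"pulmonary nodule", "hepatic cyst"}, report))
# -> {'pulmonary nodule'}: flagged for human review
```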


Where the Radiology Comparison Holds and Breaks

The Yale system targeted detection errors: something is on the image and wasn't mentioned in the report. Binary problem, clean ground truth.

Pathology has detection errors too — cancer present but not called in the report. For that case, the same approach could apply.

But pathology also has error types where the comparison breaks down:

Grading disagreements. Gleason 3+4 versus 4+3 prostate cancer determines whether a patient gets active surveillance or surgery. Pathologists disagree at meaningful rates: one study found concordance of 64% with κ = 0.34 (Ozkan et al., 2016); another measured κ = 0.69 with a 46% discrepancy in annotation area (Duarte et al., 2023); a central review audit found complete agreement in only 72% of cases, with most discrepancies at the 3/4 boundary (Salmo, 2015). If experts disagree on the ground truth, training AI to catch grading errors is a different problem from catching missed lesions (a sketch of how those κ values are computed follows below).

Subtyping, interpretation, and ancillary test errors require context beyond the slide — clinical history, immunohistochemistry, molecular data — and involve reasoning, not just perception. I haven't found research that addresses these in a deployable way.
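
To make the grading numbers above concrete, here is how an agreement statistic like those κ values is computed: Cohen's kappa over paired grades, shown with made-up calls from two hypothetical readers, assuming scikit-learn:

```python
# Cohen's kappa on paired Gleason grades from two readers. The grades
# are invented; the point is that kappa discounts chance agreement.
from sklearn.metrics import cohen_kappa_score

reader_a = ["3+4", "4+3", "3+3", "4+4", "3+4", "4+3", "3+4", "4+4"]
reader_b = ["3+4", "3+4", "3+3", "4+4", "4+3", "4+3", "3+4", "4+3"]

raw = sum(a == b for a, b in zip(reader_a, reader_b)) / len(reader_a)
kappa = cohen_kappa_score(reader_a, reader_b)
print(f"raw agreement {raw:.0%}, kappa {kappa:.2f}")
# -> raw agreement 62%, kappa 0.48: kappa is lower because some
#    agreement would happen by chance even with random grading
```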


Malpractice Data

An analysis of 378 pathology malpractice claims (1998–2003) found that 63% involved failure to diagnose cancer. False-negative melanoma was the single most common reason for a claim against a pathologist. Breast specimens were second; false-negative Pap smears were third (Troxel, Archives of Pathology & Laboratory Medicine, 2006).


What's Happening Commercially

Paige received the first FDA authorization for AI in pathology in 2021 (Paige Prostate) and Breakthrough Device designation for PanCancer Detect in 2025. PathAI received 510(k) clearance for AISight Dx in 2025. Proscia secured FDA clearance for Concentriq AP-Dx in 2024. The pathology AI market was valued at roughly $135 million in 2024 and is projected to reach $1.15 billion by 2033.


What I Don't Know

Whether pathologists would actually want any of this. A tool can work technically and still fail because it interrupts workflows, creates extra clicks, or feels like surveillance. I have no idea how pathologists would react to an AI flagging their cases. After that: whether a lab's own archive could be used to fine-tune models for that specific environment. Whether pathology reports are structured enough to extract labels from programmatically. What the storage economics look like. How LIS integration works.
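
On the label-extraction question specifically, here's roughly what I'd try first: a regex pull of a structured field (a Gleason score) from free-text report lines. The snippets are invented, and real reports are certainly messier:

```python
# Toy label extraction from free-text pathology reports. The report
# snippets and the regex are illustrative; real reports vary widely.
import re

GLEASON = re.compile(r"gleason\s+(?:score\s+)?(\d)\s*\+\s*(\d)", re.IGNORECASE)

def extract_gleason(report: str) -> tuple[int, int] | None:
    m = GLEASON.search(report)
    return (int(m.group(1)), int(m.group(2))) if m else None

print(extract_gleason("Prostate, biopsy: adenocarcinoma, Gleason score 3+4=7."))
# -> (3, 4)
print(extract_gleason("Benign prostatic tissue. No carcinoma identified."))
# -> None
```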

If you work in a lab and any of this seems right or wrong, I'd like to hear about it.



References

  • Chen, C. et al. "Fast and scalable search of whole-slide images via self-supervised deep learning." Nature Biomedical Engineering 6, 1420–1434 (2022).
  • Hegde, N. et al. "Similar image search for histopathology: SMILY." npj Digital Medicine 2, 56 (2019).
  • Cavallo, J. et al. "Clinical Implementation of a Combined Artificial Intelligence and Natural Language Processing Quality Assurance Program." Academic Radiology (2023).
  • Ozkan, T.A. et al. "Interobserver variability in Gleason histological grading of prostate cancer." Scandinavian Journal of Urology 50(6), 420–424 (2016).
  • Duarte, M.B.O. et al. "A comparative study of the inter-observer variability on Gleason grading." Computers in Biology and Medicine 159, 106895 (2023).
  • Salmo, E.N. "An audit of inter-observer variability in Gleason grading of prostate cancer biopsies." Integrative Cancer Science and Therapeutics 2 (2015).
  • Salvi, M. et al. "Stain normalization methods for histopathology image analysis." Information Fusion 100, 101922 (2023).
  • He, K. et al. "Self-supervised stain normalization empowers privacy-preserving and model generalization in digital pathology." npj Digital Medicine (2025).
  • Troxel, D.B. "Medicolegal Aspects of Error in Pathology." Archives of Pathology & Laboratory Medicine 130(5), 617–625 (2006).

Research and drafting assisted by AI. Assumptions are mine.
