The FDA granted Breakthrough Device Designation to two independent companies, Mosaic Clinical Technologies (Cognita CXR) and Aidoc (First Read), for foundation model systems that do essentially the same thing: analyze a chest radiograph, generate a draft radiology report, and hand it to a radiologist for review, editing, and sign-off.

Two separate teams and the same workflow.

That convergence is interesting.

Whether these specific products reach widespread adoption is a separate question. What the convergence tells us is that multiple independent groups, including regulators, have concluded that having radiologists review foundation model-generated draft reports is a credible clinical architecture worth fast-tracking. That is no longer a research hypothesis.

I have been writing about this transition for a while because I think it changes what it means to practice radiology in a specific and underappreciated way.

The task is shifting from generation to evaluation. When a VLM drafts the report across every organ system in the study, the radiologist who reviews it is performing a categorically different cognitive task than the one who interpreted the same study from scratch five years ago.

The 2025 task was generation.

The 2026 task is evaluation.

Those two tasks have different skill profiles, different liability structures, and different implications for how we train residents and supervise AI.

What no one has solved yet is the VLM monitoring problem. Breakthrough Device Designation tells you a product was worth developing. It does not tell you whether the model, six months after go-live at your institution, is still performing the way it performed during validation. Vendor dashboards do not answer that question.

Most practice contracts do not even ask it.

That gap is where I have been focused, and it is why I wrote my recent articles and book.

If you are thinking about how your practice should evaluate, deploy, and monitor these tools, take a look.

Link to Article 1: https://lnkd.in/gNdRNZv9

Link to book on Amazon: https://a.co/d/0cn7zUuM

Clinical AI does not fail at deployment. It fails quietly after deployment.

If you are building or deploying radiology AI, especially vision-language models or automated reporting systems, post-deployment monitoring is no longer optional. Regulators, including the FDA, increasingly expect continuous assessment of real-world performance, including drift, site variability, and human-AI disagreement.

Veriloop is a vendor-agnostic clinical AI observability layer designed to measure how models behave in real workflows. We track agreement, error patterns, and trust over time, helping teams understand where AI is reliable, where it breaks, and how much workload it can safely absorb.

We do not replace regulatory approval or pre-deployment validation. We provide the missing layer after deployment, where safety, performance, and trust actually evolve.

If you are building, deploying, or evaluating clinical AI systems and need real-world monitoring, reach out at ty@orainformatics.com
