In retinal drug development, the data that drives trial decisions and the data that drives day-to-day care have always been two different species.
Trial datasets are dense, high-fidelity, and centrally graded; clinic datasets are irregular, multimodal, and shaped by the quirks of individual physicians and devices.
AI is starting to make those differences irrelevant. Validated algorithms can now read OCT volumes, segment fluid across all retinal layers, parse physician notes, cross-reference claims and genetics, and do it all fast enough to make a retreatment decision in the time it takes a patient to walk from imaging to the exam lane.
The same system can run inside a phase III trial and at a follow-up visit six months later, producing the same measurement in both contexts. And for the first time, evidence can move with the patient instead of staying trapped in the environment where it was collected.
At a recent meeting of trialists and practicing specialists, Daniela Ferrara, MD, PhD, FASRS, Chief Medical Officer at Topcon, did not bother with hypotheticals. “Anyone in this room that’s not using AI for clinical trials is falling behind,” she said.
The panel she led was made up of people who are already deploying AI at scale inside phase III trials, in clinics, and in some cases as the regulated product itself.
Ehsan Vaghefi, PhD, CEO of Toku, described putting the first AI diagnostic through the FDA's paces. Their cardiovascular risk model must match American Heart Association standards while being developed entirely from ophthalmic data. That means defining and agreeing on gold standards with cardiology before training, building a system that no human could replicate manually, and then defending it under a de novo submission.
He noted that while the ophthalmology and cardiology divisions at the FDA are relatively progressive on AI, other specialties such as nephrology are not, which changes the regulatory strategy for their chronic kidney disease model.
The core difficulties, he said, are not the algorithms but cybersecurity and bias mitigation, both of which the FDA audits closely.
For Ben Toker, MBA, Co-Founder and President of Amaros, the technical barrier is data fragmentation. Amaros integrates five data types (EHR, provider notes, claims, multimodal imaging, and genetics) into a single patient-matching system for trial recruitment. Natural language models parse unstructured notes to capture physician treatment patterns and tendencies, while computer vision processes imaging for biomarker analysis. The goal is to lower screen-failure rates without pulling coordinators into endless manual checks, but Toker stressed that site workflows must adapt to use the AI output effectively or the benefit is lost.
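As a concrete, entirely hypothetical sketch of what automated pre-screening across fragmented sources can look like, the snippet below merges structured flags drawn from EHR, notes, claims, and imaging into a single eligibility pre-check. Every field name and criterion is invented for illustration; none of this reflects Amaros's actual schema or matching logic.

```python
# Hypothetical multi-source trial pre-screen: each data source contributes
# one or more structured fields, and candidates are filtered automatically
# before any coordinator review. All thresholds are illustrative.
from dataclasses import dataclass

@dataclass
class PatientRecord:
    age: int                               # from EHR demographics
    note_mentions_subretinal_fluid: bool   # NLP flag from provider notes
    anti_vegf_claims_12mo: int             # from claims history
    central_subfield_thickness_um: float   # from OCT imaging analysis

def prescreen(p: PatientRecord) -> bool:
    """Return True if the patient passes the automated eligibility pre-check."""
    return (p.age >= 50
            and p.note_mentions_subretinal_fluid
            and p.anti_vegf_claims_12mo >= 3
            and p.central_subfield_thickness_um >= 300)

candidates = [
    PatientRecord(68, True, 5, 342.0),   # passes all four criteria
    PatientRecord(44, True, 6, 410.0),   # fails the age criterion
]
eligible = [p for p in candidates if prescreen(p)]
```

In a real system the flags themselves would come from models (note parsing, image segmentation), and the ranked output would be handed to coordinators for confirmation rather than acted on directly.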
Shriji Patel, MD, MBA, Lead Medical Director at Genentech, described the company’s in-house AI development program, built on decades of trial imaging and clinical data. Algorithms are trained and validated against reading center outputs to achieve high sensitivity and specificity, which allows early-phase anatomical findings to be linked more directly to efficacy signals. This is what enables “failing faster” in drug development, freeing resources for assets that can compete in crowded indications. Patel’s group is now working on refining those models with real-world datasets to close the gap between trial results and clinic outcomes.
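Validating an algorithm against reading-center outputs amounts, at its simplest, to an agreement calculation: with the reading center treated as ground truth, sensitivity and specificity fall out of a confusion matrix over matched cases. A minimal sketch with made-up labels (not Genentech's data or code):

```python
# Agreement between per-case algorithm calls and reading-center grades,
# with the reading center treated as ground truth. Data are invented.
def sensitivity_specificity(model_flags, center_flags):
    tp = sum(m and c for m, c in zip(model_flags, center_flags))
    fn = sum((not m) and c for m, c in zip(model_flags, center_flags))
    tn = sum((not m) and (not c) for m, c in zip(model_flags, center_flags))
    fp = sum(m and (not c) for m, c in zip(model_flags, center_flags))
    return tp / (tp + fn), tn / (tn + fp)

# 1 = finding present per the reading center / the model, 0 = absent
center = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model  = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(model, center)
# one missed case and one over-call: sensitivity 0.75, specificity ~0.83
```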
Dolly Chang, MD, PhD, Chief Scientific Officer at Kodiak Sciences, focused on a problem every retinal trial with extended-durability molecules faces: standardizing retreatment criteria so they match real practice. Kodiak validated an OCT fluid-detection algorithm on over 200,000 scans from more than a thousand patients, tuning cutoffs to catch every patient who should be treated while reducing noise. Only after that retrospective phase did they integrate the algorithm into live trial workflows. Ultimately, AI could make real-time retreatment calls without adding latency to site workflows.
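Tuning a cutoff so that no patient who needs treatment is missed is an operating-point choice: among thresholds that hold sensitivity at 100% on retrospective data, take the one leaving the fewest false positives. The sketch below illustrates the idea with invented scores and labels; it is not Kodiak's pipeline.

```python
# Illustrative operating-point selection for a retreatment cutoff.
def tune_cutoff(scores, needs_treatment):
    """Pick the highest cutoff that still flags every patient who needed
    treatment, then report sensitivity and residual false positives."""
    positives = [s for s, y in zip(scores, needs_treatment) if y]
    negatives = [s for s, y in zip(scores, needs_treatment) if not y]
    cutoff = min(positives)  # any higher cutoff would miss a true case
    sensitivity = sum(s >= cutoff for s in positives) / len(positives)
    false_positives = sum(s >= cutoff for s in negatives)
    return cutoff, sensitivity, false_positives

# Invented per-eye fluid scores and grader retreat labels (1 = treat)
scores = [0.92, 0.81, 0.40, 0.75, 0.35, 0.20, 0.10]
labels = [1,    1,    1,    0,    0,    0,    0]
cutoff, sens, fp = tune_cutoff(scores, labels)
# cutoff 0.4 keeps sensitivity at 1.0 with one false positive (the 0.75)
```

At scale (200,000+ scans), the same logic runs over a full score distribution rather than a handful of cases, and the residual false-positive rate is what the article calls "noise."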
Charles Wykoff, MD, PhD, of Retina Consultants of Texas, Retina Consultants of America, Houston Methodist Hospital, and the Blanton Eye Institute, runs a network of more than 300 retina specialists across 23 states, logging 2.5 million patient visits a year. His group built an internal system to collate EMR and imaging data after finding no vendor could bridge them reliably. The system is designed not just for trial screening but for longitudinal real-world analysis.
Wykoff emphasized that the high-density volumetric scan protocols used in trials rarely exist in routine clinical practice, which limits the portability of algorithms unless image acquisition is standardized. He also noted that projecting disease progression (overlaying structural changes with expected functional loss) could help patients understand the value of current therapies, something AI could deliver but industry has yet to quantify well.
What linked all of these perspectives was the shift from AI as a trial-only tool to AI as a continuous system that spans trial and clinic.
Retinal imaging analytics, when validated and deployed in both environments, make it possible to standardize retreatment decisions, define inclusion criteria, and set endpoints that can be measured identically whether a patient is in a phase III study or at a follow-up visit in a local practice.
For the first time, the evidence generated in trials can follow patients into real care without translation loss, and the data generated in real care can inform the next trial without starting from scratch. Not building toward that capability, in Ferrara's terms, is what falling behind now means.