AI research agents are no longer theoretical in rare disease and oncology—they’re starting to reshape how teams find patients, understand journeys, and generate evidence. That reality came through clearly in our recent joint webinar with Rare Patient Voice, “AI Agents and Patient-Level Data: From Confirmation of Diagnosis to Insights.”
Setting the stage: disruption, data, and AI agents
Clinakos CEO Inder Jaggi opened by framing the moment we’re in: AI is already disrupting entire sectors, including software and services, and biopharma is next. He noted that while “AI” has become a buzzword, the real value for rare diseases and oncology sits where medically smart agents meet deeply curated patient-level data.
Rare Patient Voice’s Wes Michael shared timely news that Rare Patient Voice has been acquired by Konovo, reinforcing their long-term commitment to patient engagement and recruitment. Since 2013, Clinakos and Rare Patient Voice have been working independently and together on projects that sit at the intersection of patient-level data, AI, and rare disease research.
The fraud problem in rare disease research
The first big theme of the webinar was fraud—both old and new. Wes described how fraud rates in market research can reach 10–15% in general, and up to 25% in some rare disease studies, with one extreme case reported at 94%. The rise of AI-generated responses makes it even harder to distinguish real patients from sophisticated fraudsters, especially when incentives are high in rare disease research.
The panel emphasized three key risks:
- Financial waste, when fraudulent data must be discarded or, worse, drives wrong decisions.
- Regulatory risk, since regulators and payers require verified outcomes data and cannot accept dubious evidence.
- Scientific distortion, since a small rare disease population magnifies the impact of each fraudulent respondent.
Wes contrasted self-reported data—valuable for experiences and emotions but prone to recall bias and inaccuracies—with patient-level medical records that capture timestamps, diagnoses, lab results, prescriptions, genetic tests, and clinical assessments. By linking these records (with patient consent) to survey responses, Clinakos and Rare Patient Voice are building a foundation where AI agents can help eliminate fraud rather than fuel it.
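To make the linking idea concrete, here is a minimal sketch of checking a self-reported diagnosis against a consented medical record. It is an illustration only, not the Clinakos or Rare Patient Voice implementation: the class names, field names, status labels, and sample data are all hypothetical, and real matching would need far more than the exact string comparison used here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SurveyResponse:
    """Hypothetical survey answer from one respondent."""
    respondent_id: str
    self_reported_diagnosis: str


@dataclass(frozen=True)
class PatientRecord:
    """Hypothetical summary of one respondent's medical records."""
    respondent_id: str
    consented: bool                      # patient agreed to record linkage
    confirmed_diagnoses: frozenset[str]  # lowercase diagnoses found in records


def verification_status(response: SurveyResponse,
                        records: dict[str, PatientRecord]) -> str:
    """Classify a response as 'verified', 'mismatch', or 'unlinked'."""
    record = records.get(response.respondent_id)
    if record is None or not record.consented:
        return "unlinked"   # no consented record to verify against
    if response.self_reported_diagnosis.lower() in record.confirmed_diagnoses:
        return "verified"
    return "mismatch"       # candidate for manual review


# Made-up example data, for illustration only.
records = {
    "r1": PatientRecord("r1", True, frozenset({"spinal muscular atrophy"})),
    "r2": PatientRecord("r2", True, frozenset({"asthma"})),
}
responses = [
    SurveyResponse("r1", "Spinal Muscular Atrophy"),
    SurveyResponse("r2", "Spinal Muscular Atrophy"),
    SurveyResponse("r3", "Spinal Muscular Atrophy"),
]
statuses = {r.respondent_id: verification_status(r, records) for r in responses}
print(statuses)
```

In this toy setup, a respondent is only treated as verified when a consented record exists and contains the diagnosis they reported; everything else is surfaced for review rather than silently accepted.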
The invisible 80–90%: why integrated patient data matters
Inder then walked the audience through an uncomfortable truth: in rare diseases, traditional syndicated and tokenized datasets can miss 80–90% of relevant patients. Many of the ~7,000 rare diseases lack ICD-9/ICD-10 codes, and definitive diagnosis often sits in unstructured physician notes, genomic reports, and imaging, not in structured claims fields.
He highlighted several structural challenges:
- Long diagnostic delays of 5–7 years, leaving fragmented, error-prone records across multiple providers.
- Data blocking by specialty pharmacies that withhold data from syndicated feeds, particularly for competitive rare and oncology therapies.
- Claims data lag of 3–6 months, which is increasingly misaligned with the pace of decision-making in oncology and rare disease.
The solution they presented is a per-patient, consented data curation model: curating EMR notes, labs, imaging, prescriptions, genomics, PROs, and device data (e.g., wearables) into a linked, longitudinal dataset. Historically, nurses or clinicians did this manually at great cost and over many months; now, AI agents are automating the heavy lifting.
Enter AI research agents: speed, depth, and continuous intelligence
From there, the session shifted to AI research agents themselves. Inder drew a sharp distinction between generic large language models and disease-trained research agents grounded in patient-level data.
In traditional studies, teams might track ~100 variables; with AI research agents trained on specific diseases, they’re now monitoring thousands of data points per patient, including biomarkers and nuanced clinical events. This unlocks:
- Study time reductions of up to 85% in some cases, compressing projects that once took 8–10 months down to weeks.
- A step-change in quality, as teams “spend” the time savings on richer analyses, going from ~50 data points to hundreds or thousands.
- Continuous insights, where patient journeys and treatment patterns can be refreshed monthly or quarterly instead of only at Phase 2, Phase 3, and pre-launch.
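As a quick sanity check, the 85% figure and the "8–10 months down to weeks" claim can be compared with a few lines of arithmetic. This is a rough sketch: the function name is made up, and the months-to-weeks conversion assumes an average calendar month of 52/12 weeks.

```python
WEEKS_PER_MONTH = 52 / 12  # assumption: average calendar month


def remaining_weeks(baseline_months: float, reduction: float) -> float:
    """Study duration in weeks after cutting the baseline by `reduction`."""
    return baseline_months * (1 - reduction) * WEEKS_PER_MONTH


low = remaining_weeks(8, 0.85)    # 8-month baseline
high = remaining_weeks(10, 0.85)  # 10-month baseline
print(f"{low:.1f} to {high:.1f} weeks")
```

At the full 85% reduction, an 8–10 month project works out to roughly 5 to 6.5 weeks, which is consistent with the "weeks" framing.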
A particularly compelling segment compared a generic medical agent with a disease-trained agent (Clinakos’ AI research agent, Clarion™) on spinal muscular atrophy (SMA) data. The generic agent returned high-level statements about “many patients achieving motor milestones,” while the SMA-trained agent produced precise, cohort-based probabilities (e.g., a 64% probability of exceptional response based on a specific genotype and outcome threshold). This illustrated how disease-specific training on real patient-level data can make research agents 1.8–2 times more accurate than general-purpose models.
Use cases: from confirmation of diagnosis to launch readiness
The webinar then mapped these capabilities to concrete use cases across the product lifecycle. Among those discussed:
- Confirmation of diagnosis: Using Clinakos’ Confirmis™ agent, trained on rare disease patient data, to verify diagnoses from unstructured records, which is crucial where codes are missing and fraud risk is high.
- Natural history studies: Replacing multi-year, multimillion-dollar registries with AI-enabled, consented patient-level datasets that can start generating insights in 6–8 weeks at a fraction of the cost.
- Launch readiness: Compressing the time to build robust patient journeys and treatment-flow maps from six months to a few weeks, enabling faster and more accurate launch planning in small populations.
- Real-world effectiveness and HEOR: Supporting payer dossiers and outcomes work with verified, longitudinal data rather than relying solely on claims or self-report.
- Competitive intelligence and patient finding: Overcoming data blocking through patient consent, and using AI agents to identify eligible patients for trials and therapies across fragmented systems.
The ROI examples shared were eye-catching. For one program, natural history work that might have cost USD 1.5–3 million and taken two years was modeled down to under USD 500,000, with initial readouts in as little as six to eight weeks. In another case, better patient identification and persistence monitoring translated into an estimated USD 50–225 million in incremental revenue by keeping 1,150 patients on therapy.
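The arithmetic behind these figures can be sketched in a few lines. Only the dollar amounts and the patient count come from the webinar; the variable and function names are illustrative, and USD 500,000 is treated as an upper bound on the modeled cost, so the savings shown are minimums.

```python
# Figures quoted in the webinar (USD); names are illustrative.
TRADITIONAL_COST = (1_500_000, 3_000_000)        # natural history work, low/high
MODELED_COST_CEILING = 500_000                   # "under USD 500,000"
INCREMENTAL_REVENUE = (50_000_000, 225_000_000)  # estimated range
PATIENTS_RETAINED = 1_150


def minimum_savings_fraction(traditional: float, modeled_ceiling: float) -> float:
    """Smallest fractional saving, assuming the modeled cost hits its ceiling."""
    return (traditional - modeled_ceiling) / traditional


def revenue_per_patient(revenue: float, patients: int) -> float:
    """Implied incremental revenue per patient kept on therapy."""
    return revenue / patients


savings = [minimum_savings_fraction(c, MODELED_COST_CEILING) for c in TRADITIONAL_COST]
per_patient = [revenue_per_patient(r, PATIENTS_RETAINED) for r in INCREMENTAL_REVENUE]

print(f"Savings: at least {savings[0]:.0%} to {savings[1]:.0%}")
print(f"Revenue per patient: USD {per_patient[0]:,.0f} to {per_patient[1]:,.0f}")
```

On these numbers, the modeled cost is at least 67% to 83% lower than the traditional range, and the revenue estimate implies roughly USD 43,000 to USD 196,000 per retained patient.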
A tipping point for AI research agents
The webinar closed with a live poll that mirrored what Clinakos is seeing across its client base: a majority of organizations, more than 80%, are either actively using AI research agents today or planning pilots.
For rare disease and oncology teams, the message was clear:
- AI agents are coming to regulatory review, payer evidence, and internal insights, whether we’re ready or not.
- The differentiator will be medically smart, disease-trained agents built on integrated, consented patient-level data rather than generic models on shallow datasets.
If you missed the live session but are interested in how integrated patient data plus AI research agents can change confirmation of diagnosis, natural history, launch readiness, and real-world effectiveness in your therapeutic area, this is the moment to lean in.
You can watch the full session on demand at https://clinakos.com/resources/ (email info@clinakos.com for the password).
