From Symptoms to Signals: A Clinical Review of Doktorify’s AI System by Interns at Iowa’s Carver College of Medicine
Conducted by:
Peer Review Panel – Carver College of Medicine, The University of Iowa
Spring Clinical Informatics Rotation – 2025 Cohort
Foreword
As the boundaries between clinical decision-making and artificial intelligence continue to blur, it becomes imperative that future physicians take part not just in using AI, but in evaluating it. This report reflects the efforts of a 15-member intern cohort at the Carver College of Medicine, University of Iowa, who engaged in an independent, end-to-end clinical evaluation of Doktorify — a publicly accessible AI-powered symptom analysis and clinical support platform developed in Turkey.
Our aim was not only to quantify its diagnostic output but also to understand its reasoning structure, risk logic, and practical role in augmenting human care.
Why Doktorify?
While the global market is increasingly crowded with chatbots claiming health insight, Doktorify stands out in its origin and mission. Developed within a university-linked technopark in Istanbul and powered by fine-tuned large language models (LLMs), it is aimed at both patients (through AI-based symptom guidance) and healthcare professionals (via a decision support layer). Its development aligns with the growing need for scalable, language-adaptable, ethically sound AI health tools.
Review Design & Case Protocol
Participants:
- 15 Medical Interns, Carver College of Medicine
- Clinical special interests: Family Medicine, Emergency Medicine, Internal Medicine
- All interns received basic AI literacy and prompt-testing training prior to evaluation
Scope:
- 600 clinical simulations, structured by interns using the sources below (a minimal case-template sketch follows this list):
◦ UWorld and AMBOSS clinical vignettes
◦ Primary Care Board Review datasets
◦ CDC and WHO triage flow charts
- Each intern was assigned 40 patient cases, covering:
◦ Acute care
◦ Preventive screening scenarios
◦ Pediatric and geriatric differentials
◦ Psychosomatic and undifferentiated presentations
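To keep scoring comparable across interns, each simulation followed a common template. Below is a minimal Python sketch of such a case record; the field names are our illustrative assumptions, not the cohort's actual tooling.

    from dataclasses import dataclass

    # Minimal case template, for illustration only; field names are
    # assumptions, not the panel's actual schema.
    @dataclass
    class SimulatedCase:
        case_id: str
        source: str            # e.g., "UWorld", "AMBOSS", "CDC triage chart"
        category: str          # e.g., "acute", "preventive", "pediatric"
        vignette: str          # free-text presentation entered into Doktorify
        reference_dx: str      # gold-standard diagnosis for scoring
        reference_triage: str  # e.g., "ER now", "same-week clinic", "self-care"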
Assessment Metrics:
- Top-1 Diagnostic Match Rate
- Top-3 Differential Coverage
- Triage Appropriateness
- Guideline Alignment
- Explanation Clarity
- Patient Risk Communication
- Bias or Sociocultural Blind Spots
Each case was scored independently by at least two interns; discrepancies were adjudicated by a third.
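The headline metrics then reduce to simple proportions over the 600 scored cases. The Python sketch below shows the arithmetic, assuming a hypothetical ScoredCase record rather than the panel's actual scoring sheets:

    from dataclasses import dataclass

    # Hypothetical scored-case record; names are illustrative.
    @dataclass
    class ScoredCase:
        reference_dx: str           # gold-standard diagnosis from the vignette
        ai_differential: list[str]  # Doktorify's ranked differential, best first
        triage_appropriate: bool    # consensus of the paired (or third) raters

    def top_k_match_rate(cases: list[ScoredCase], k: int) -> float:
        """Share of cases whose reference diagnosis appears in the AI's top k."""
        return sum(c.reference_dx in c.ai_differential[:k] for c in cases) / len(cases)

    def triage_appropriateness(cases: list[ScoredCase]) -> float:
        """Share of cases where raters judged the recommended triage level appropriate."""
        return sum(c.triage_appropriate for c in cases) / len(cases)

    # Top-1 and Top-3 as reported in the scorecard:
    # top_k_match_rate(cases, 1), top_k_match_rate(cases, 3)

Top-3 coverage is deliberately more forgiving than Top-1, since a correct diagnosis anywhere in the leading differential can still guide safe next steps.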
Results at a Glance
What We Observed
Key Strengths
- Safety Above All: Doktorify consistently identified red-flag symptoms and escalated conservatively. Even "soft" presentations of chest pain, dizziness, or fever were appropriately triaged.
- Structured Reasoning: Rather than producing a one-liner diagnosis, the AI offered a narrative flow, mimicking clinical decision trees we’re trained to follow.
- Education-Ready: Many interns remarked that Doktorify inadvertently helped solidify their own diagnostic thinking, acting almost as a "feedback mirror."
- Adaptable Language Layer: With multilingual support (Turkish/English), the platform smoothly handled language conversions and patient education phrasing.
- Explainability: Each conclusion was paired with a brief justification — including red flags considered, conditions ruled out, and supportive symptoms referenced (a reconstructed example follows this list).
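To make the explainability observation concrete, the justification pattern interns saw can be summarized as a structured record. The sketch below is our reconstruction, with hypothetical keys rather than Doktorify's actual output format:

    # Reconstructed shape of a typical justification, as observed by the panel.
    # Keys and values are illustrative assumptions, not the product's API.
    example_justification = {
        "leading_impression": "possible acute coronary syndrome",
        "red_flags_considered": ["chest pain at rest", "diaphoresis"],
        "conditions_ruled_out": ["musculoskeletal strain (pain not reproducible)"],
        "supportive_symptoms": ["radiation to left arm", "nausea"],
        "recommendation": "Seek emergency care now; this is guidance, not a diagnosis.",
    }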
Current Limitations
- No EHR or Clinical Tool Integration: It currently functions as a standalone advisor, not a clinical assistant in practice.
- Doesn't Interpret Labs/Images: While strong on symptom intake and risk assessment, it offers no support for interpreting lab or imaging results (yet).
- Nuance Lost in Specialist Cases: In advanced hepatology, rheumatology, and rare disease settings, the AI often reverted to generic differentials.
- Static Model Update Cycle: Although the model is fast and accurate, there is no indication of real-time medical knowledge updates (e.g., evolving COVID-19 protocols).
Learning Moments (Student Reflections)
"I gave it a case with atypical abdominal pain in a diabetic female. It immediately prioritized mesenteric ischemia and recommended the ER. That's exactly what I would've done after all my years of study — and that's saying something."
— T. Nguyen, Internal Medicine
“We were surprised how well it communicated uncertainty. Instead of guessing, it said: ‘This pattern may suggest a range of causes. Please consult a physician.’ That humility from an AI was… refreshing.”
— M. Rivera, Family Medicine
“Doktorify is not perfect, but it feels safe — which is exactly the kind of baseline we need for public deployment.”
— H. Ahmad, Emergency Medicine
Final Peer Review Verdict: Doktorify’s Readiness Scorecard
Conclusion
Doktorify has passed our internal peer review with high marks for clinical consistency, explainability, patient safety, and user clarity. It is not a final diagnostic tool — nor does it aim to be — but as an AI health advisor, it is both scalable and safe for broader public integration.
From the perspective of medical interns at Carver College of Medicine, Doktorify represents one of the strongest AI-health interfaces we have tested — and with proper integration and clinical partnerships, it could set a new standard for AI-driven patient engagement in developing and multilingual contexts.