xRead - Incorporating Artificial Intelligence into Clinical Practice (March 2026)
Ng et al. BMC Medical Informatics and Decision Making
(2025) 25:236
Page 11 of 24
Table 2 (continued)

Study Reference: Almario et al., 2015 [13]
Standard: Physician-generated History of Presenting Illness (HPI)
Comparator Type: Blinded comparison
Performance Metric (F1 score, Precision, Recall, WER): NR
AI Transcription Proficiency (paper specific outcomes), by subcategory, given as mean of physician HPI ratings (SD) vs. mean of computer-generated HPI ratings (SD):
- Overall impression: 2.80 (0.75) vs. 3.68 (0.61)*
- Completeness: 2.73 (0.75) vs. 3.70 (0.59)*
- Relevance: 3.04 (0.68) vs. 3.82 (0.54)*
- Organisation: 2.80 (0.80) vs. 3.66 (0.63)*
- Succinctness: 3.17 (0.60) vs. 3.55 (0.69)*
- Comprehensibility: 2.97 (0.79) vs. 3.66 (0.66)*
- Number of Medicare-recommended elements present in HPI: 5.27 (1.52) vs. 6.05 (0.98)*
Key Findings: Blinded raters deemed the computer-generated HPIs to be of higher quality, more comprehensive, better organized, and of greater relevance than physician-documented HPIs. These results offer initial proof-of-principle that a computer can create a meaningful and clinically relevant HPI.
Novel Features: The study noted that the computer-generated HPIs were more complete and organized but still required improvements in accuracy and consistency, especially for detailed patient histories. Further development was needed for seamless integration of AEGIS with EHRs to ensure that computer-generated HPIs align well with real-time physician documentation workflows, which are sensitive to time constraints and accuracy demands.

Study Reference: Almario et al., 2015 [14]
Standard: Number of positive alarm features documented in physician-generated History of Presenting Illness (HPI)
Comparator Type: Blinded comparison
Performance Metric (F1 score, Precision, Recall, WER): NR
AI Transcription Proficiency (paper specific outcomes), by subcategory, given as median number of positive alarm features (interquartile range) in physician HPIs vs. AEGIS HPIs:
- All patients: 0 (0–1) vs. 1 (0–2)*
- Patients presenting for an initial visit: 0 (0–1) vs. 1 (0–2)*
- Patients who completed AEGIS within 1 week of their clinic visit: 0 (0–1) vs. 1 (0–2)*
Key Findings: Computer-generated HPIs documented more alarm features than physician-generated HPIs. Physicians may be underreporting alarm features in GI clinics. Yet greater documentation of red flags has not been shown to improve patient outcomes.
Novel Features: More advanced algorithms are necessary to improve the accuracy of detecting and documenting GI symptoms. Specifically, machine learning models could be further developed to ensure the automatic detection of all relevant clinical features. Current systems may miss certain symptoms due to variability in patient language; the study suggests enhancing systems to handle diverse phrasing, which could prevent underreporting by AEGIS.