Assessing the Ability of Artificial Intelligence-Driven Language Processing Frameworks to Create Patient-Oriented Medical Education Material on Hypothermia

Adam Schwartz; Alfred Urba

doi:10.62573/knxh0m41

Authors

Adam Schwartz, Rosalind Franklin University of Medicine and Science (US), Author, https://orcid.org/0009-0003-8095-4639
Alfred Urba, Illinois Farm Bureau (US), Author, https://orcid.org/0009-0006-6360-0048

DOI:

https://doi.org/10.62573/knxh0m41

Keywords:

artificial intelligence, hypothermia, patient education

Abstract

Introduction: Artificial Intelligence-Driven Language Processing Frameworks (AI-LPFs) such as ChatGPT, Grok, and Gemini are increasingly being explored for their ability to generate patient-oriented medical education material (PEM). While prior studies have assessed AI-generated PEM in various medical fields, the applicability of these tools to operational medicine remains understudied. Given the significance of hypothermia in both operational and civilian settings, this study evaluates the quality and readability of AI-generated PEM on hypothermia.

Methods: Three AI-LPFs (ChatGPT-4, Grok-3, and Gemini 2.0 Flash) were prompted to generate PEM on hypothermia. Readability was assessed using the Flesch-Kincaid reading grade level and Flesch Reading Ease Score (FRE). Additional text metrics included PEM length, the proportion of complex words and sentences, and average sentence and word length. The quality of AI-generated PEM was scored using the CDC Clear Communication Index (CCI), and content accuracy was assessed through fact-checking against the Wilderness Medical Society guidelines. A benchmark PEM from the American Red Cross was included for comparison.

Results: Readability analysis showed that the PEM from Gemini and the American Red Cross met NIH recommendations for an 8th-grade reading level, whereas the PEM from ChatGPT and Grok were slightly above this threshold. Grok generated the most comprehensive PEM and was the only framework to categorize hypothermia as mild, moderate, or severe, in line with Wilderness Medical Society guidelines. Unlike the other AI-generated PEM, it also addressed both EMS activation and CPR. The PEM from Grok scored the highest on the CDC CCI, outperforming the other AI-generated PEM and the benchmark from the American Red Cross. A manual review confirmed that all AI-generated PEM were factually accurate.

Conclusion: AI-LPFs successfully produced factually accurate PEM on hypothermia, with Grok generating the most comprehensive material. These findings suggest AI-LPFs have potential for enhancing public education on operational medicine topics. Further refinement of AI-generated PEM to improve readability and adherence to established guidelines may enhance their utility as reliable educational tools.
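
As an illustration of the readability metrics named in the Methods, the sketch below computes the Flesch Reading Ease score and the Flesch-Kincaid grade level from raw counts using the standard published formulas. The abstract does not state which tool the authors used or how syllables were counted, so the function names, the count-based interface, and the `meets_grade_target` helper are assumptions made for this example, not the study's implementation.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def meets_grade_target(grade: float, target: float = 8.0) -> bool:
    """Hypothetical helper: True if the grade level is at or below the target.

    The default of 8.0 mirrors the 8th-grade level cited in the Results.
    """
    return grade <= target


if __name__ == "__main__":
    # Illustrative counts only; these are not data from the study.
    words, sentences, syllables = 100, 5, 150
    fre = flesch_reading_ease(words, sentences, syllables)
    grade = flesch_kincaid_grade(words, sentences, syllables)
    print(f"FRE: {fre:.1f}, grade level: {grade:.1f}, "
          f"meets target: {meets_grade_target(grade)}")
```

For a hypothetical passage of 100 words, 5 sentences, and 150 syllables, these formulas give a reading-ease score of about 59.6 and a grade level of about 9.9, which would sit above the 8th-grade target.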

Author Biographies

Adam Schwartz, Rosalind Franklin University of Medicine and Science (US)

Student, Doctor of Medicine

Alfred Urba, Illinois Farm Bureau (US)

Member
