Evaluation High-Quality of Information from ChatGPT (Artificial Intelligence-Large Language Model) Artificial Intelligence on Shoulder Stabilization Surgery.

Abstract

Purpose

To analyze the quality and readability of information on shoulder stabilization surgery provided by an online AI software (ChatGPT) using standardized scoring systems, and to report on the answers given by the AI.

Methods

An open AI model (ChatGPT) was used to answer 23 questions commonly asked by patients about shoulder stabilization surgery. These answers were evaluated for medical accuracy, quality, and readability using the JAMA Benchmark criteria, the DISCERN score, and the Flesch-Kincaid Reading Ease Score (FRES) and Grade Level (FKGL).

Results

The JAMA Benchmark criteria score was 0, the lowest possible score, indicating that no reliable sources were cited. The DISCERN score was 60, which is considered a good score. The areas in which the open AI model did not achieve full marks were also related to the lack of source material cited for the answers and to some information not being fully supported by the literature. The FRES was 26.2, and the FKGL was considered to be that of a college graduate.
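
For readers unfamiliar with the readability measures, the following is a minimal sketch of the standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas, together with the conventional Flesch interpretation bands. It is not taken from the study: the function names and the word, sentence, and syllable counts are hypothetical and chosen only for illustration, and the study's own tooling for obtaining these counts is not described here.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula (higher score = easier text)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula (approximate US school grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def fres_band(score: float) -> str:
    """Map a Reading Ease score to the conventional Flesch reading-level bands."""
    bands = [
        (90, "5th grade"),
        (80, "6th grade"),
        (70, "7th grade"),
        (60, "8th-9th grade"),
        (50, "10th-12th grade"),
        (30, "college"),
    ]
    for lower_bound, label in bands:
        if score >= lower_bound:
            return label
    return "college graduate"  # scores below 30


# Hypothetical counts for illustration only (not data from the study).
words, sentences, syllables = 100, 5, 150
print(round(flesch_reading_ease(words, sentences, syllables), 1))
print(round(flesch_kincaid_grade(words, sentences, syllables), 1))

# The reported FRES of 26.2 falls in the lowest band.
print(fres_band(26.2))  # college graduate
```

Under these conventional bands, a score below 30 corresponds to text best understood by college graduates, which is consistent with the reading level reported above.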

Conclusions

The answers given to questions on shoulder stabilization surgery were generally of high quality, but a high reading level was required to comprehend the information presented. However, it is unclear where the answers came from, as no source material was cited. It is important to note that the ChatGPT software repeatedly referenced the need to discuss these questions with an orthopaedic surgeon and the importance of shared decision making, as well as compliance with surgeon treatment recommendations.

Clinical relevance

As shoulder instability is an injury that predominantly affects younger individuals who may use the Internet for information, this study shows what information patients may be getting online.

Published Version (Please cite this version)

10.1016/j.arthro.2023.07.048

Publication Info

Hurley, Eoghan T, Bryan S Crook, Samuel G Lorentz, Richard M Danilkowicz, Brian C Lau, Dean C Taylor, Jonathan F Dickens, Oke Anakwenze, et al. (2024). Evaluation High-Quality of Information from ChatGPT (Artificial Intelligence-Large Language Model) Artificial Intelligence on Shoulder Stabilization Surgery. Arthroscopy : the journal of arthroscopic & related surgery : official publication of the Arthroscopy Association of North America and the International Arthroscopy Association, 40(3). pp. 726–731.e6. 10.1016/j.arthro.2023.07.048 Retrieved from https://hdl.handle.net/10161/30373.

This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.

Scholars@Duke

Dean Curtis Taylor

Professor of Orthopaedic Surgery

Dr. Dean Taylor is a Sports Medicine Orthopaedic Surgeon whose practice and research interests include shoulder instability, shoulder arthroscopy, knee ligament injuries, meniscus injuries, knee cartilage injuries, and ACL injuries in adults and children. He attended the United States Military Academy at West Point and completed his medical training and residency at Duke University. Dr. Taylor went on to be a part of the John Feagin West Point Sports Medicine Fellowship, retired from the United States Army at the rank of Colonel, and returned to Duke in 2006.

Jonathan F Dickens

Professor of Orthopaedic Surgery

Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.