When researchers looked at how often the three tests gave the same result for each participant, GPT-4 was consistent for 76% of participants on the number of brain lesions, for 83% on the side of the brain, and for 87% on the brain regions. However, when its responses to all three questions across all three repetitions were combined, GPT-4 provided accurate answers for 41% of participants.
“While not yet ready for use in the clinic, large language models such as generative pre-trained transformers have the potential not only to assist in locating lesions after stroke, they may also reduce health care disparities because they can function across different languages,” said Lee. “The potential for use is encouraging, especially due to the great need for improved health care in underserved areas across multiple countries where access to neurologic care is limited.”
A limitation of the study is that the accuracy of GPT-4 depends on the quality of the information it is given. While researchers had detailed health histories and neurologic exam information for each participant, such information is not always available for people who have a stroke.
Source: American Academy of Neurology