Large Language Models in Surgical Escape Room Simulations: A Randomized Controlled Trial Showing No Improvement in Performance or Teamwork
Gustafsson, Joona (2025-03-26)
This publication is subject to copyright regulations. The work may be read and printed for personal use. Use for commercial purposes is prohibited.
Closed access
The permanent address of this publication is:
https://urn.fi/URN:NBN:fi-fe2025040824949
Abstract
Large language models (LLMs) are increasingly considered for integration into medical education, yet their impact on real-time decision-making and teamwork in trauma scenarios remains uncertain. The objective of this study was to evaluate whether LLM assistance improves clinical decision-making, teamwork, or learning outcomes in a structured trauma simulation. This randomized controlled trial was conducted from January 14 to March 11, 2025, at Vaasa Central Hospital and the University of Turku, Finland, with 40 final-year medical students. Participants were randomly assigned to LLM-assisted or non-LLM groups in an escape-room-style trauma simulation consisting of 18 structured clinical scenarios. The LLM-assisted groups could consult the free version of ChatGPT-4o mini as needed. Primary outcomes included response times, accuracy, teamwork scores, and changes in confidence from before to after the simulation. Secondary outcomes included student feedback on LLM use and long-term knowledge retention. The trauma simulations significantly improved confidence levels across all participants (p < 0.05). However, LLM assistance did not significantly improve response times or accuracy. In some cases, LLM-assisted groups took longer to reach decisions and were less likely to engage in team discussion. Confidence gains were observed in both groups but did not differ significantly between LLM and non-LLM participants (p = 0.210). In conclusion, LLMs did not improve decision-making efficiency or teamwork in trauma simulations. Although students found consulting artificial intelligence (AI) useful, over-reliance on AI may reduce active discussion and team-based problem-solving. Future research should refine the integration of AI in clinical training to optimize its educational potential.