Performance and reliability of large language models on the European Board of Hand Surgery examination: a multi-model evaluation study

Ibrahim Güler, Lindsay Muir, Gerrit Grieb, Philipp Moog, Armin Kraus, Henrik Stelling

Journal of Hand Surgery (European Volume): Journal of the British Society for Surgery of the Hand & Official Journal of the Federation of European Societies for Surgery of the Hand

Published online on April 22, 2026

Abstract

Journal of Hand Surgery (European Volume), Ahead of Print.
Introduction: Artificial intelligence (AI) has demonstrated transformative potential in medical education and assessment, with large language models achieving competitive results across multiple high-stakes examinations. In this study, we evaluated the ...