A study of 21 AI chatbots found that they failed to generate appropriate differential diagnoses more than 80% of the time when given only basic patient information such as age, gender, and symptoms. The chatbots, which included the latest versions of ChatGPT, DeepSeek, Claude, and Gemini, reached a correct final diagnosis more than 90% of the time once provided with comprehensive clinical data.