%0 Report
%A Si, Yafei
%A Yang, Yuyi
%A Wang, Xi
%A An, Ruopeng
%A Zu, Jiaqi
%A Chen, Xi
%A Fan, Xiaojing
%A Gong, Sen
%T Quality and Accountability of Large Language Models (LLMs) in Healthcare in Low- And Middle-Income Countries (LMIC): A Simulated Patient Study Using ChatGPT
%D 2024
%8 2024 Aug
%I Institute of Labor Economics (IZA)
%C Bonn
%7 IZA Discussion Paper
%N 17204
%U https://www.iza.org/publications/dp17204
%X Using simulated patients to mimic nine established non-communicable and infectious diseases over 27 trials, we assess ChatGPT's effectiveness and reliability in diagnosing and treating common diseases in low- and middle-income countries. We find ChatGPT's performance varied within a single disease, despite a high level of accuracy in both correct diagnosis (74.1%) and medication prescription (84.5%). Additionally, ChatGPT recommended a concerning level of unnecessary or harmful medications (85.2%) even with correct diagnoses. Finally, ChatGPT performed better in managing non-communicable diseases compared to infectious ones. These results highlight the need for cautious AI integration in healthcare systems to ensure quality and safety.
%K safety
%K quality
%K ChatGPT
%K Large Language Models
%K generative AI
%K simulated patient
%K healthcare
%K low- and middle-income countries