TY  - RPRT
AU  - Si, Yafei
AU  - Yang, Yuyi
AU  - Wang, Xi
AU  - An, Ruopeng
AU  - Zu, Jiaqi
AU  - Chen, Xi
AU  - Fan, Xiaojing
AU  - Gong, Sen
TI  - Quality and Accountability of Large Language Models (LLMs) in Healthcare in Low- And Middle-Income Countries (LMIC): A Simulated Patient Study Using ChatGPT
PY  - 2024/Aug/
PB  - Institute of Labor Economics (IZA)
CY  - Bonn
T2  - IZA Discussion Paper
IS  - 17204
UR  - https://www.iza.org/index.php/publications/dp17204
AB  - Using simulated patients to mimic nine established non-communicable and infectious diseases over 27 trials, we assess ChatGPT's effectiveness and reliability in diagnosing and treating common diseases in low- and middle-income countries. We find ChatGPT's performance varied within a single disease, despite a high level of accuracy in both correct diagnosis (74.1%) and medication prescription (84.5%). Additionally, ChatGPT recommended a concerning level of unnecessary or harmful medications (85.2%) even with correct diagnoses. Finally, ChatGPT performed better in managing non-communicable diseases compared to infectious ones. These results highlight the need for cautious AI integration in healthcare systems to ensure quality and safety.
KW  - safety
KW  - quality
KW  - ChatGPT
KW  - Large Language Models
KW  - generative AI
KW  - simulated patient
KW  - healthcare
KW  - low- and middle-income countries
ER  - 