Gumilar Khanisyah Erza, Wardhana Manggala Pasca, Akbar Muhammad Ilham Aldika, Putra Agung Sunarko, Banjarnahor Dharma Putra Perjuangan, Mulyana Ryan Saktika, Fatati Ita, Yu Zih-Ying, Hsu Yu-Cheng, Dachlan Erry Gumilar, Lu Chien-Hsing, Liao Li-Na, Tan Ming

Graduate Institute of Biomedical Science, China Medical University, Taichung, Taiwan.

Department of Obstetrics and Gynecology, Universitas Airlangga Hospital - Faculty of Medicine, Universitas Airlangga, Surabaya, Indonesia.

Comput Struct Biotechnol J. 2025 Mar 18;27:1140-1147. doi: 10.1016/j.csbj.2025.03.026. eCollection 2025.

BACKGROUND

Accurate cardiotocography (CTG) interpretation is vital for the monitoring of fetal well-being during pregnancy and labor. Advanced artificial intelligence (AI) tools such as AI-large language models (AI-LLMs) may enhance the accuracy of CTG interpretation, but their potential has not been extensively evaluated.

OBJECTIVE

This study aimed to assess the performance of three AI-LLMs (ChatGPT-4o, Gemini Advanced, and Copilot) in CTG image interpretation, compare their results with those of junior human doctors (JHDs) and senior human doctors (SHDs), and evaluate their reliability in clinical decision-making.

STUDY DESIGN

Seven CTG images were interpreted by the three AI-LLMs, five SHDs, and five JHDs. Five blinded maternal-fetal medicine experts then scored each interpretation on a Likert scale for five parameters (relevance, clarity, depth, focus, and coherence). The homogeneity of the expert ratings and the performance of the groups were compared statistically.
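The abstract does not state how the Likert ratings were combined into the single score reported for each interpreter. As a minimal sketch only, assuming each image's total is the sum of every expert's rating over the five parameters and the interpreter's score is the mean of those totals across images, the aggregation could look like the following. The function names and the ratings are hypothetical and are not taken from the study.

```python
from statistics import mean

# The five rating parameters named in the study design.
PARAMETERS = ("relevance", "clarity", "depth", "focus", "coherence")


def image_total(ratings_by_expert):
    """Sum every expert's Likert rating over all parameters for one image.

    ratings_by_expert: list of dicts, one per expert, mapping parameter -> rating.
    (Assumed aggregation rule; the abstract does not specify it.)
    """
    return sum(ratings[p] for ratings in ratings_by_expert for p in PARAMETERS)


def interpreter_score(images):
    """Mean of the per-image totals across all images rated for one interpreter."""
    return mean(image_total(ratings_by_expert) for ratings_by_expert in images)


# Made-up ratings for illustration: 2 images, 2 experts per image.
example = [
    [{p: 4 for p in PARAMETERS}, {p: 3 for p in PARAMETERS}],  # total 35
    [{p: 5 for p in PARAMETERS}, {p: 4 for p in PARAMETERS}],  # total 45
]

print(interpreter_score(example))  # mean of 35 and 45
```

With the study's actual design, each interpreter would have seven images and five experts per image rather than the two-by-two toy input shown here.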

RESULTS

ChatGPT-4o scored 77.86, outperforming Gemini Advanced (57.14), Copilot (47.29), and the JHDs (61.57). Its performance closely approached that of the SHDs (80.43), with no statistically significant difference between the two (p > 0.05). ChatGPT-4o excelled in the depth parameter and was only marginally inferior to the SHDs on the other parameters.

CONCLUSION

ChatGPT-4o performed best among the AI-LLMs, surpassed the JHDs in CTG interpretation, and closely matched the performance of the SHDs. AI-LLMs, particularly ChatGPT-4o, are promising tools for assisting obstetricians, improving diagnostic accuracy, and enhancing obstetric patient care.


Artificial intelligence-large language models (AI-LLMs) for reliable and accurate cardiotocography (CTG) interpretation in obstetric practice.
