Yang Xiongwen, Xiao Yi, Liu Di, Deng Huiyin, Huang Jian, Zhou Yubin, Dai Chuanzhou, Wu Jun, Liu Dan, Liang Maoli, Xu Chuan
Department of Thoracic Surgery, Guizhou Provincial People's Hospital, No. 83, Zhongshan East Road, Guiyang, Guizhou, China.
NHC Key Laboratory of Pulmonary Immunological Diseases, Guizhou Provincial People's Hospital, Guiyang, Guizhou, China.
Sci Rep. 2025 May 2;15(1):15417. doi: 10.1038/s41598-025-97500-7.
This study evaluated how effectively GPT-4 Omni (Generative Pre-trained Transformer 4 Omni, GPT-4o) transforms lobectomy surgical records into structured data across multiple languages, with the aim of improving both efficiency and accuracy in documenting thoracic surgical oncology procedures. A total of 466 records from seven specialized hospitals were processed, beginning with optical character recognition (OCR) and text normalization. Records manually restructured by thoracic oncologists served as the benchmark for fine-tuning GPT-4o. Experts then reviewed the model's output and assessed it by accuracy, precision, recall, and F1-score. GPT-4o performed well on both Chinese and English records, achieving an accuracy of 0.966, precision of 0.981, recall of 0.982, and an F1-score of 0.982 in both language settings. It also significantly sped up documentation compared with traditional methods and reduced review times. The most common error types were terminology misinterpretations (2.82%), procedural sequence errors (1.41%), and omissions of key details (0.47%). These limitations highlight areas for further refinement, particularly in enhancing contextual understanding and mitigating minor errors. Nonetheless, GPT-4o shows great potential for standardizing surgical records, streamlining workflows, and supporting care and research in thoracic oncology.
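For readers less familiar with the evaluation metrics named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, and F1-score are conventionally computed from confusion counts. It is not the authors' evaluation code. The `ReviewCounts` class, the function names, and the example counts are hypothetical and purely illustrative; the abstract does not report raw counts or state at what granularity (field, record, or item) the metrics were computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewCounts:
    """Hypothetical confusion counts from an expert review (not from the paper)."""
    tp: int  # items the model extracted correctly
    fp: int  # items the model extracted that were wrong or spurious
    fn: int  # items the model missed
    tn: int  # items correctly left empty


def accuracy(c: ReviewCounts) -> float:
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


def precision(c: ReviewCounts) -> float:
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: ReviewCounts) -> float:
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def f1_from(p: float, r: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    # Made-up counts, chosen only to exercise the functions.
    counts = ReviewCounts(tp=90, fp=10, fn=5, tn=95)
    p, r = precision(counts), recall(counts)
    print(f"accuracy={accuracy(counts):.3f} precision={p:.3f} "
          f"recall={r:.3f} f1={f1_from(p, r):.3f}")

    # Harmonic mean of the rounded precision and recall quoted in the abstract.
    print(f"f1 from rounded P and R: {f1_from(0.981, 0.982):.4f}")
```

Applying the harmonic mean to the rounded precision (0.981) and recall (0.982) gives approximately 0.9815, which agrees with the reported F1-score of 0.982 to within the rounding of the inputs.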