Semantic Evaluation of Nursing Assessment Scales Translations by ChatGPT 4.0: A Lexicometric Analysis.

Authors

Parozzi Mauro, Bozzetti Mattia, Lo Cascio Alessio, Napolitano Daniele, Pendoni Roberta, Marcomini Ilaria, Cangelosi Giovanni, Mancin Stefano, Bonacaro Antonio

Affiliations

Medicine and Surgery Department, University of Parma, Via Gramsci 14, 43126 Parma, Italy.

Direction of Health Professions, ASST Cremona, 26100 Cremona, Italy.

Publication information

Nurs Rep. 2025 Jun 11;15(6):211. doi: 10.3390/nursrep15060211.

DOI:10.3390/nursrep15060211

PMID:40559502

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12196477/

Abstract

Background/Objectives: The use of standardized assessment tools within the nursing care process is a globally established practice, widely recognized as a foundation for evidence-based evaluation. Accurate translation is essential to ensure their correct and consistent clinical use. Traditional translation procedures are effective but time-consuming and resource-intensive, which has led to growing interest in whether artificial intelligence can assist or streamline this process for nursing researchers. This study therefore aimed to assess the quality of translations of nursing assessment scales produced by ChatGPT 4.0.

Methods: A total of 31 nursing rating scales comprising 772 items were translated from English to Italian using two different prompts and then underwent a deep lexicometric analysis. Semantic accuracy of the translations was assessed using Sentence-BERT, Jaccard similarity, TF-IDF cosine similarity, and overlap ratio. Sensitivity, specificity, AUC, and AUROC were calculated to assess the quality of the translation classification. Paired-sample t-tests were conducted to compare the similarity scores.

Results: The Maastricht prompt produced translations that were marginally but consistently more semantically and lexically faithful to the original. Although all differences were statistically significant, the corresponding effect sizes indicate that the advantage of the Maastricht prompt is slight but consistent across all measures. Sensitivity was 0.929 (92.9%) for the York prompt and 0.932 (93.2%) for the Maastricht prompt. Specificity and precision were 1.000 for both prompts.

Conclusions: The findings highlight the potential of prompt engineering as a low-cost, effective method to enhance translation outcomes. Nonetheless, because translation is only a preliminary step in the full validation process, further studies should investigate the integration of AI-assisted translation within the broader framework of instrument adaptation and validation.
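The lexical measures named in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the tokenizer, the overlap-ratio definition (overlap coefficient), the smoothed IDF, and the two Italian sentences are all assumptions made for illustration, and the abstract does not state which texts the study compared. Sentence-BERT is omitted because it requires a pretrained model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (a simplifying assumption)."""
    return re.findall(r"\w+", text.lower())


def jaccard(a, b):
    """Jaccard similarity: |A ∩ B| / |A ∪ B| over the token sets."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def overlap_ratio(a, b):
    """Overlap coefficient: |A ∩ B| / min(|A|, |B|) (assumed definition)."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / min(len(sa), len(sb))


def tfidf_cosine(a, b, corpus):
    """Cosine similarity of TF-IDF vectors; IDF is estimated from `corpus`."""
    docs = [set(tokenize(d)) for d in corpus]
    n_docs = len(docs)

    def idf(term):
        df = sum(term in d for d in docs)
        return math.log((1 + n_docs) / (1 + df)) + 1  # smoothed IDF

    def vectorize(text):
        return {t: c * idf(t) for t, c in Counter(tokenize(text)).items()}

    va, vb = vectorize(a), vectorize(b)
    dot = sum(w * vb.get(t, 0.0) for t, w in va.items())
    norm_a = math.sqrt(sum(w * w for w in va.values()))
    norm_b = math.sqrt(sum(w * w for w in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical reference and candidate translations of one scale item.
reference = "il paziente riesce a camminare da solo"
candidate = "il paziente riesce a camminare senza aiuto"

scores = {
    "jaccard": jaccard(reference, candidate),
    "overlap": overlap_ratio(reference, candidate),
    "tfidf_cosine": tfidf_cosine(reference, candidate, [reference, candidate]),
}
print(scores)
```

All three scores fall between 0 and 1. Jaccard penalizes any non-shared token, the overlap coefficient is more forgiving when one text is shorter, and TF-IDF cosine down-weights tokens that appear in many documents of the corpus.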

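The sensitivity, specificity, and precision values reported in the Results are standard confusion-matrix ratios. The sketch below shows how they are computed; the counts are hypothetical and are not the study's data, and the function name is chosen here for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, and precision from confusion counts."""
    nan = float("nan")
    return {
        # Share of truly acceptable translations that were flagged acceptable.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Share of truly unacceptable translations that were flagged unacceptable.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        # Share of translations flagged acceptable that truly were acceptable.
        "precision": tp / (tp + fp) if (tp + fp) else nan,
    }


# Hypothetical counts: no false positives, a few false negatives.
metrics = classification_metrics(tp=93, fp=0, tn=50, fn=7)
print(metrics)
```

With zero false positives, specificity and precision both equal 1.0 regardless of how many false negatives occur, which matches the pattern in the abstract: perfect specificity and precision alongside a sensitivity slightly below 1.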

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/886a/12196477/533862584f85/nursrep-15-00211-g001.jpg

Cited by

Correction: Parozzi et al. Semantic Evaluation of Nursing Assessment Scales Translations by ChatGPT 4.0: A Lexicometric Analysis. Nurs. Rep. 2025, 15, 211.

Nurs Rep. 2025 Jul 10;15(7):251. doi: 10.3390/nursrep15070251.
