Improving long COVID-related text classification: a novel end-to-end domain-adaptive paraphrasing framework.

Affiliations

Department of Electrical and Computer Engineering, University of California, La Jolla, San Diego, USA.

Division of Biomedical Informatics, University of California, La Jolla, San Diego, USA.

Publication information

Sci Rep. 2024 Jan 2;14(1):85. doi: 10.1038/s41598-023-48594-4.

Abstract

The emergence of long COVID during the ongoing COVID-19 pandemic has presented considerable challenges for healthcare professionals and researchers. The task of identifying relevant literature is particularly daunting due to the rapidly evolving scientific landscape, inconsistent definitions, and a lack of standardized nomenclature. This paper proposes a novel solution to this challenge by employing machine learning techniques to classify long COVID literature. However, the scarcity of annotated data for machine learning poses a significant obstacle. To overcome this, we introduce a strategy called medical paraphrasing, which diversifies the training data while maintaining the original content. Additionally, we propose a Data-Reweighting-Based Multi-Level Optimization Framework for Domain Adaptive Paraphrasing, supported by a Meta-Weight-Network (MWN). This innovative approach incorporates feedback from the downstream text classification model to influence the training of the paraphrasing model. During the training process, the framework assigns higher weights to the training examples that contribute more effectively to the downstream task of long COVID text classification. Our findings demonstrate that this method substantially improves the accuracy and efficiency of long COVID literature classification, offering a valuable tool for physicians and researchers navigating this complex and ever-evolving field.
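
The reweighting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a Meta-Weight-Net-style design in which a tiny network maps each training example's loss to a weight in (0, 1), and it shows only the forward pass. The multi-level optimization, in which feedback from the downstream classifier updates the weighting network, is omitted. All names (`weight_net`, `reweighted_loss`, `hidden_size`) and the loss values are hypothetical.

```python
import numpy as np

def weight_net(losses, w1, b1, w2, b2):
    """Hypothetical tiny MLP: maps each per-example loss to a weight in (0, 1)."""
    hidden = np.maximum(0.0, losses[:, None] * w1 + b1)  # shape (n, h), ReLU
    logits = hidden @ w2 + b2                             # shape (n,)
    return 1.0 / (1.0 + np.exp(-logits))                  # sigmoid

def reweighted_loss(losses, weights):
    """Weighted average of per-example losses."""
    return float(np.sum(weights * losses) / np.sum(weights))

# Assumed setup: random, untrained weight-network parameters.
rng = np.random.default_rng(0)
hidden_size = 8
w1 = rng.normal(size=hidden_size)
b1 = np.zeros(hidden_size)
w2 = rng.normal(size=hidden_size)
b2 = 0.0

# Illustrative per-example paraphrasing losses (made-up values).
per_example_losses = np.array([0.2, 1.5, 0.7, 3.0])
weights = weight_net(per_example_losses, w1, b1, w2, b2)
total = reweighted_loss(per_example_losses, weights)
print(weights, total)
```

In a full framework of this kind, the parameters of the weighting network would themselves be trained so that examples which help the downstream long COVID classifier receive higher weights; here they are random and serve only to show the data flow.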

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c683/10761882/d438d9e8986d/41598_2023_48594_Fig1_HTML.jpg
