Validating GAN-BioBERT: A Methodology for Assessing Reporting Trends in Clinical Trials.

Author information

Myszewski Joshua J, Klossowski Emily, Meyer Patrick, Bevil Kristin, Klesius Lisa, Schroeder Kristopher M

Affiliations

School of Medicine and Public Health, University of Wisconsin, Madison, WI, United States.

University of Wisconsin-Milwaukee, Milwaukee, WI, United States.

Publication information

Front Digit Health. 2022 May 24;4:878369. doi: 10.3389/fdgth.2022.878369. eCollection 2022.

DOI:10.3389/fdgth.2022.878369

PMID:35685304

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9170913/

Abstract

BACKGROUND

The aim of this study was to validate a three-class sentiment classification model for clinical trial abstracts combining adversarial learning and the BioBERT language processing model as a tool to assess trends in biomedical literature in a clearly reproducible manner. We then assessed the model's performance for this application and compared it to previous models used for this task.

METHODS

Using 108 expert-annotated clinical trial abstracts and 2,000 unlabeled abstracts, this study develops a three-class sentiment classification algorithm for clinical trial abstracts. The algorithm is semi-supervised and based on the Bidirectional Encoder Representations from Transformers (BERT) model, a much more advanced and accurate approach than the previously used models based on traditional machine learning methods. Its prediction performance was compared with that reported in those previous studies.

RESULTS

The algorithm achieved a classification accuracy of 91.3% and a macro F1-score of 0.92, significantly outperforming previous models used to classify sentiment in clinical trial literature, while also making the sentiment classification finer grained and more reproducible.
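As an illustration of how the two reported metrics are defined for a three-class task, the following is a minimal sketch in plain Python. The label names and the toy predictions are assumptions made for this example only; they are not the study's data, and the abstract does not describe the authors' evaluation code.

```python
# Hypothetical label names for a three-class sentiment task (an assumption).
LABELS = ("negative", "neutral", "positive")


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of the per-class F1 scores."""
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy data, invented purely to exercise the functions.
y_true = ["positive", "positive", "neutral", "negative", "negative", "neutral"]
y_pred = ["positive", "neutral", "neutral", "negative", "negative", "neutral"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

Because macro F1 averages the per-class scores without weighting by class frequency, it penalizes a model that performs poorly on a rare class even when overall accuracy is high.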

CONCLUSION

We demonstrate an easily applied sentiment classification model for clinical trial abstracts that significantly outperforms previous models and offers greater reproducibility and applicability to large-scale studies of reporting trends.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cace/9170913/96501322c3a3/fdgth-04-878369-g0001.jpg

Similar articles

Relation Classification for Bleeding Events From Electronic Health Records Using Deep Learning Systems: An Empirical Study.

JMIR Med Inform. 2021 Jul 2;9(7):e27527. doi: 10.2196/27527.

BioBERT: a pre-trained biomedical language representation model for biomedical text mining.

Bioinformatics. 2020 Feb 15;36(4):1234-1240. doi: 10.1093/bioinformatics/btz682.

Utilization of sentiment analysis to assess and compare negative finding reporting in veterinary and human literature.

Res Vet Sci. 2022 Nov;148:27-32. doi: 10.1016/j.rvsc.2022.04.010. Epub 2022 May 23.

Multi-Label Classification in Patient-Doctor Dialogues With the RoBERTa-WWM-ext + CNN (Robustly Optimized Bidirectional Encoder Representations From Transformers Pretraining Approach With Whole Word Masking Extended Combining a Convolutional Neural Network) Model: Named Entity Study.

JMIR Med Inform. 2022 Apr 21;10(4):e35606. doi: 10.2196/35606.

Training a Deep Contextualized Language Model for International Classification of Diseases, 10th Revision Classification via Federated Learning: Model Development and Validation Study.

JMIR Med Inform. 2022 Nov 10;10(11):e41342. doi: 10.2196/41342.

Adversarial active learning for the identification of medical concepts and annotation inconsistency.

J Biomed Inform. 2020 Aug;108:103481. doi: 10.1016/j.jbi.2020.103481. Epub 2020 Jul 18.

Confirm or refute?: A comparative study on citation sentiment classification in clinical research publications.

J Biomed Inform. 2019 Mar;91:103123. doi: 10.1016/j.jbi.2019.103123. Epub 2019 Feb 10.

Transfer Learning for Sentiment Classification Using Bidirectional Encoder Representations from Transformers (BERT) Model.

Sensors (Basel). 2023 May 31;23(11):5232. doi: 10.3390/s23115232.

When BERT meets Bilbo: a learning curve analysis of pretrained language model on disease classification.

BMC Med Inform Decis Mak. 2022 Apr 5;21(Suppl 9):377. doi: 10.1186/s12911-022-01829-2.

Cited by

GPT meets PubMed: a novel approach to literature review using a large language model to crowdsource migraine medication reviews.

BMC Neurol. 2025 Feb 19;25(1):69. doi: 10.1186/s12883-025-04071-1.

Contextual Word Embedding for Biomedical Knowledge Extraction: a Rapid Review and Case Study.

J Healthc Inform Res. 2024 Jan 3;8(1):158-179. doi: 10.1007/s41666-023-00157-y. eCollection 2024 Mar.

Improving text mining in plant health domain with GAN and/or pre-trained language model.

Front Artif Intell. 2023 Feb 21;6:1072329. doi: 10.3389/frai.2023.1072329. eCollection 2023.

References

Toward automatic evaluation of medical abstracts: The current value of sentiment analysis and machine learning for classification of the importance of PubMed abstracts of randomized trials for stroke.

J Stroke Cerebrovasc Dis. 2020 Sep;29(9):105042. doi: 10.1016/j.jstrokecerebrovasdis.2020.105042. Epub 2020 Jun 23.

BioBERT: a pre-trained biomedical language representation model for biomedical text mining.

Bioinformatics. 2020 Feb 15;36(4):1234-1240. doi: 10.1093/bioinformatics/btz682.

Confirm or refute?: A comparative study on citation sentiment classification in clinical research publications.

J Biomed Inform. 2019 Mar;91:103123. doi: 10.1016/j.jbi.2019.103123. Epub 2019 Feb 10.

Construct validity of six sentiment analysis methods in the text of encounter notes of patients with critical illness.

J Biomed Inform. 2019 Jan;89:114-121. doi: 10.1016/j.jbi.2018.12.001. Epub 2018 Dec 14.

Extracting the Population, Intervention, Comparison and Sentiment from Randomized Controlled Trials.

Stud Health Technol Inform. 2018;247:146-150.

Quantifying publication bias in meta-analysis.

Biometrics. 2018 Sep;74(3):785-794. doi: 10.1111/biom.12817. Epub 2017 Nov 15.

A Visualization of Evolving Clinical Sentiment Using Vector Representations of Clinical Notes.

Comput Cardiol (2010). 2015 Sep;2015:629-632. doi: 10.1109/CIC.2015.7410989. Epub 2016 Feb 18.

Publication Bias and Nonreporting Found in Majority of Systematic Reviews and Meta-analyses in Anesthesiology Journals.

Anesth Analg. 2016 Oct;123(4):1018-25. doi: 10.1213/ANE.0000000000001452.

Citation Sentiment Analysis in Clinical Trial Papers.

AMIA Annu Symp Proc. 2015 Nov 5;2015:1334-41. eCollection 2015.

The role of balanced training and testing data sets for binary classifiers in bioinformatics.

PLoS One. 2013 Jul 9;8(7):e67863. doi: 10.1371/journal.pone.0067863. Print 2013.
