
Toward assessing clinical trial publications for reporting transparency.

Author information

Kilicoglu Halil, Rosemblat Graciela, Hoang Linh, Wadhwa Sahil, Peng Zeshan, Malički Mario, Schneider Jodi, Ter Riet Gerben

Affiliations

School of Information Sciences, University of Illinois at Urbana-Champaign, Champaign, IL, USA; U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

U.S. National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Publication information

J Biomed Inform. 2021 Apr;116:103717. doi: 10.1016/j.jbi.2021.103717. Epub 2021 Feb 26.

Abstract

OBJECTIVE

To annotate a corpus of randomized controlled trial (RCT) publications with the checklist items of the CONSORT reporting guidelines, and to use the corpus to develop text mining methods for RCT appraisal.

METHODS

We annotated a corpus of 50 RCT articles at the sentence level using 37 fine-grained CONSORT checklist items. A subset (31 articles) was double-annotated and adjudicated, while the remaining 19 were annotated by a single annotator and reconciled by another. We calculated inter-annotator agreement at the article and section levels using MASI (Measuring Agreement on Set-Valued Items) and at the CONSORT item level using Krippendorff's α. We experimented with two rule-based methods (phrase-based and section header-based) and two supervised learning approaches (support vector machine and BioBERT-based neural network classifiers) for recognizing 17 methodology-related items in RCT Methods sections.
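As an illustration of the set-valued agreement measure named above, here is a minimal pure-Python sketch of MASI (Jaccard overlap scaled by a monotonicity factor). This is not the authors' code; it assumes sentence-level annotations are represented as Python sets of label strings, and the item codes in the test values are illustrative, not drawn from the corpus.

```python
def masi_similarity(a, b):
    """MASI similarity between two label sets: Jaccard overlap
    scaled by a monotonicity factor (1 if equal, 2/3 if one set
    contains the other, 1/3 if they merely overlap, 0 if disjoint)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    jaccard = len(a & b) / len(a | b)
    if a == b:
        m = 1.0
    elif a <= b or b <= a:
        m = 2 / 3
    elif a & b:
        m = 1 / 3
    else:
        m = 0.0
    return jaccard * m

def average_masi(annotator1, annotator2):
    """Mean MASI similarity over two annotators' parallel
    sentence-level label sets."""
    pairs = list(zip(annotator1, annotator2))
    return sum(masi_similarity(a, b) for a, b in pairs) / len(pairs)
```

Averaging these per-sentence similarities over an article (or a section) yields the article- and section-level agreement figures of the kind reported in the Results.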

RESULTS

We created CONSORT-TM, consisting of 10,709 sentences, 4,845 (45%) of which were annotated with 5,246 labels. A median of 28 CONSORT items (out of a possible 37) was annotated per article. Agreement was moderate at the article and section levels (average MASI: 0.60 and 0.64, respectively) but varied considerably among individual checklist items (Krippendorff's α = 0.06–0.96). The BioBERT-based model performed best overall at recognizing methodology-related items (micro-precision: 0.82, micro-recall: 0.63, micro-F1: 0.71). Combining models by majority vote and by label aggregation further improved precision and recall, respectively.
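The micro-averaged scores and the majority-vote combination above can be sketched as follows. This is an illustrative implementation under the assumption that each model's output is a list of per-sentence label sets, not the authors' evaluation code.

```python
from collections import Counter

def micro_prf(gold, pred):
    """Micro-averaged precision/recall/F1 for multi-label sentence
    classification: TP/FP/FN counts are pooled across all labels
    and sentences before dividing."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        g, p = set(g), set(p)
        tp += len(g & p)
        fp += len(p - g)
        fn += len(g - p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def majority_vote(predictions):
    """Combine per-sentence label sets from several models,
    keeping a label only when more than half of the models
    predict it (raises precision at the cost of recall)."""
    n = len(predictions)
    combined = []
    for sent_preds in zip(*predictions):
        counts = Counter(lbl for labels in sent_preds for lbl in labels)
        combined.append({lbl for lbl, c in counts.items() if c > n / 2})
    return combined
```

A union-style label aggregation (keeping every label any model predicts) would instead raise recall, which matches the precision/recall trade-off described above.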

CONCLUSION

Our annotated corpus, CONSORT-TM, contains more fine-grained information than earlier RCT corpora. Low frequency of some CONSORT items made it difficult to train effective text mining models to recognize them. For the items commonly reported, CONSORT-TM can serve as a testbed for text mining methods that assess RCT transparency, rigor, and reliability, and support methods for peer review and authoring assistance. Minor modifications to the annotation scheme and a larger corpus could facilitate improved text mining models. CONSORT-TM is publicly available at https://github.com/kilicogluh/CONSORT-TM.


Similar articles

1. Toward assessing clinical trial publications for reporting transparency.
   J Biomed Inform. 2021 Apr;116:103717. doi: 10.1016/j.jbi.2021.103717. Epub 2021 Feb 26.
4. SPIRIT-CONSORT-TM: a corpus for assessing transparency of clinical trial protocol and results publications.
   medRxiv. 2025 Jan 15:2025.01.14.25320543. doi: 10.1101/2025.01.14.25320543.
6. Reporting Quality of Randomized Controlled Trials of Periodontal Diseases in Journal Abstracts-A Cross-sectional Survey and Bibliometric Analysis.
   J Evid Based Dent Pract. 2018 Jun;18(2):130-141.e22. doi: 10.1016/j.jebdp.2017.08.005. Epub 2017 Sep 21.
10. Methodology reporting improved over time in 176,469 randomized controlled trials.
   J Clin Epidemiol. 2023 Oct;162:19-28. doi: 10.1016/j.jclinepi.2023.08.004. Epub 2023 Aug 9.

Cited by

1. Large Language Model Analysis of Reporting Quality of Randomized Clinical Trial Articles: A Systematic Review.
   JAMA Netw Open. 2025 Aug 1;8(8):e2529418. doi: 10.1001/jamanetworkopen.2025.29418.
3. The Maastricht Intensive Care COVID Cohort: A Critical Appraisal of the Predefined Research Questions.
   Crit Care Explor. 2025 Feb 3;7(2):e1211. doi: 10.1097/CCE.0000000000001211. eCollection 2025 Feb 1.
4. SPIRIT-CONSORT-TM: a corpus for assessing transparency of clinical trial protocol and results publications.
   medRxiv. 2025 Jan 15:2025.01.14.25320543. doi: 10.1101/2025.01.14.25320543.
5. The Impact of Temperature on Extracting Information From Clinical Trial Publications Using Large Language Models.
   Cureus. 2024 Dec 15;16(12):e75748. doi: 10.7759/cureus.75748. eCollection 2024 Dec.
6. Predicting the sample size of randomized controlled trials using natural language processing.
   JAMIA Open. 2024 Oct 25;7(4):ooae116. doi: 10.1093/jamiaopen/ooae116. eCollection 2024 Dec.
9. Automatic categorization of self-acknowledged limitations in randomized controlled trial publications.
   J Biomed Inform. 2024 Apr;152:104628. doi: 10.1016/j.jbi.2024.104628. Epub 2024 Mar 26.
10. Retrieval augmented scientific claim verification.
   JAMIA Open. 2024 Feb 21;7(1):ooae021. doi: 10.1093/jamiaopen/ooae021. eCollection 2024 Apr.

References cited in this article

1. The past, present and future of Registered Reports.
   Nat Hum Behav. 2022 Jan;6(1):29-42. doi: 10.1038/s41562-021-01193-7. Epub 2021 Nov 15.
2. The Rigor and Transparency Index Quality Metric for Assessing Biological and Medical Science Methods.
   iScience. 2020 Oct 20;23(11):101698. doi: 10.1016/j.isci.2020.101698. eCollection 2020 Nov 20.
3. Menagerie: A text-mining tool to support animal-human translation in neurodegeneration research.
   PLoS One. 2019 Dec 17;14(12):e0226176. doi: 10.1371/journal.pone.0226176. eCollection 2019.
4. Improving reference prioritisation with PICO recognition.
   BMC Med Inform Decis Mak. 2019 Dec 5;19(1):256. doi: 10.1186/s12911-019-0992-8.
5. BioBERT: a pre-trained biomedical language representation model for biomedical text mining.
   Bioinformatics. 2020 Feb 15;36(4):1234-1240. doi: 10.1093/bioinformatics/btz682.
6. Checklists work to improve science.
   Nature. 2018 Apr;556(7701):273-274. doi: 10.1038/d41586-018-04590-7.
7. A manual corpus of annotated main findings of clinical case reports.
   Database (Oxford). 2019 Jan 1;2019:bay143. doi: 10.1093/database/bay143.
9. Automatic recognition of self-acknowledged limitations in clinical research literature.
   J Am Med Inform Assoc. 2018 Jul 1;25(7):855-861. doi: 10.1093/jamia/ocy038.
10. Biomedical text mining for research rigor and integrity: tasks, challenges, directions.
   Brief Bioinform. 2018 Nov 27;19(6):1400-1414. doi: 10.1093/bib/bbx057.
