Automatically Summarizing Evidence from Clinical Trials: A Prototype Highlighting Current Challenges

Authors

Ramprasad Sanjana, Marshall Iain J, McInerney Denis Jered, Wallace Byron C

Affiliations

Northeastern University.

King's College London.

Publication information

Proc Conf Assoc Comput Linguist Meet. 2023 May;2023:236-247.

PMID:37483390

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10361334/

Abstract

We present TrialsSummarizer, a system that aims to automatically summarize evidence presented in the set of randomized controlled trials most relevant to a given query. Building on prior work (Marshall et al., 2020), the system retrieves trial publications matching a query specifying a combination of condition, intervention(s), and outcome(s), and ranks these according to sample size and estimated study quality. The top-k such studies are passed through a neural multi-document summarization system, yielding a synopsis of these trials. We consider two architectures: a standard sequence-to-sequence model based on BART (Lewis et al., 2019), and a multi-headed architecture intended to provide greater transparency to end-users. Both models produce fluent and relevant summaries of evidence retrieved for queries, but their tendency to introduce unsupported statements renders them inappropriate for use in this domain at present. The proposed architecture may help users verify outputs by allowing them to trace generated tokens back to inputs. The demonstration video is available at: https://vimeo.com/735605060. The prototype, source code, and model weights are available at: https://sanjanaramprasad.github.io/trials-summarizer/.
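The retrieve-then-rank step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how sample size and estimated study quality are combined, so the ordering below (higher quality first, ties broken by larger sample size) is an assumption, as are the `Trial` record, its field names, the exact-match filter, and the toy data. It is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """Hypothetical record for one retrieved trial report."""
    trial_id: str
    condition: str
    interventions: frozenset
    outcomes: frozenset
    sample_size: int
    quality: float  # assumed score in [0, 1]; higher means better estimated quality


def matches(trial, condition, intervention, outcome):
    """Exact-match filter on condition, intervention, and outcome (an assumption)."""
    return (
        trial.condition == condition
        and intervention in trial.interventions
        and outcome in trial.outcomes
    )


def top_k(trials, condition, intervention, outcome, k=5):
    """Return up to k matching trials, best first.

    Assumed ordering: higher quality first, ties broken by larger sample size.
    """
    hits = [t for t in trials if matches(t, condition, intervention, outcome)]
    hits.sort(key=lambda t: (t.quality, t.sample_size), reverse=True)
    return hits[:k]


# Made-up toy records, for illustration only.
trials = [
    Trial("A", "migraine", frozenset({"sumatriptan"}), frozenset({"pain relief"}), 120, 0.9),
    Trial("B", "migraine", frozenset({"sumatriptan"}), frozenset({"pain relief"}), 800, 0.9),
    Trial("C", "migraine", frozenset({"sumatriptan"}), frozenset({"pain relief"}), 5000, 0.4),
    Trial("D", "asthma", frozenset({"salbutamol"}), frozenset({"FEV1"}), 300, 0.8),
]

selected = top_k(trials, "migraine", "sumatriptan", "pain relief", k=2)
print([t.trial_id for t in selected])  # prints ['B', 'A']
```

In a pipeline of this shape, the selected records would then be handed to the multi-document summarizer. Sorting on a tuple is only one way to combine the two ranking signals; a weighted score would be an equally plausible reading of the abstract.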

Similar articles

Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

The future of Cochrane Neonatal.

Early Hum Dev. 2020 Nov;150:105191. doi: 10.1016/j.earlhumdev.2020.105191. Epub 2020 Sep 12.

Generating (Factual?) Narrative Summaries of RCTs: Experiments with Neural Multi-Document Summarization.

AMIA Jt Summits Transl Sci Proc. 2021 May 17;2021:605-614. eCollection 2021.

Literature Retrieval for Precision Medicine with Neural Matching and Faceted Summarization.

Proc Conf Empir Methods Nat Lang Process. 2020 Nov;2020:3389-3399. doi: 10.18653/v1/2020.findings-emnlp.304.

Scientific basis of the OCRA method for risk assessment of biomechanical overload of upper limb, as preferred method in ISO standards on biomechanical risk factors.

Scand J Work Environ Health. 2018 Jul 1;44(4):436-438. doi: 10.5271/sjweh.3746.

Quantifying the informativeness for biomedical literature summarization: An itemset mining method.

Comput Methods Programs Biomed. 2017 Jul;146:77-89. doi: 10.1016/j.cmpb.2017.05.011. Epub 2017 May 27.

Clinical Context-Aware Biomedical Text Summarization Using Deep Neural Network: Model Development and Validation.

J Med Internet Res. 2020 Oct 23;22(10):e19810. doi: 10.2196/19810.

Letter to the Editor: CONVERGENCES AND DIVERGENCES IN THE ICD-11 VS. DSM-5 CLASSIFICATION OF MOOD DISORDERS.

Turk Psikiyatri Derg. 2021;32(4):293-295. doi: 10.5080/u26899.

The effectiveness of health literacy interventions on the informed consent process of health care users: a systematic review protocol.

JBI Database System Rev Implement Rep. 2015 Oct;13(10):82-94. doi: 10.11124/jbisrir-2015-2304.

Cited by

Scalable Scientific Interest Profiling Using Large Language Models.

ArXiv. 2025 Aug 19:arXiv:2508.15834v1.

Accelerating clinical evidence synthesis with large language models.

NPJ Digit Med. 2025 Aug 8;8(1):509. doi: 10.1038/s41746-025-01840-7.

Artificial intelligence in food and nutrition evidence: The challenges and opportunities.

PNAS Nexus. 2024 Oct 15;3(12):pgae461. doi: 10.1093/pnasnexus/pgae461. eCollection 2024 Dec.

Leveraging generative AI for clinical evidence synthesis needs to ensure trustworthiness.

J Biomed Inform. 2024 May;153:104640. doi: 10.1016/j.jbi.2024.104640. Epub 2024 Apr 10.

Constructing a finer-grained representation of clinical trial results from ClinicalTrials.gov.

Sci Data. 2024 Jan 6;11(1):41. doi: 10.1038/s41597-023-02869-7.

Opportunities and challenges for ChatGPT and large language models in biomedicine and health.

Brief Bioinform. 2023 Nov 22;25(1). doi: 10.1093/bib/bbad493.

Opportunities and Challenges for ChatGPT and Large Language Models in Biomedicine and Health.

ArXiv. 2023 Oct 17:arXiv:2306.10070v2.

References

Generating (Factual?) Narrative Summaries of RCTs: Experiments with Neural Multi-Document Summarization.

AMIA Jt Summits Transl Sci Proc. 2021 May 17;2021:605-614. eCollection 2021.

Trialstreamer: Mapping and Browsing Medical Evidence in Real-Time.

Proc Conf. 2020 Jul;2020:63-69. doi: 10.18653/v1/2020.acl-demos.9.

State of the evidence: a survey of global disparities in clinical trials.

BMJ Glob Health. 2021 Jan;6(1). doi: 10.1136/bmjgh-2020-004145.

Trialstreamer: A living, automatically updated database of clinical trial reports.

J Am Med Inform Assoc. 2020 Dec 9;27(12):1903-1912. doi: 10.1093/jamia/ocaa163.

SEQ2SEQ-VIS: A Visual Debugging Tool for Sequence-to-Sequence Models.

IEEE Trans Vis Comput Graph. 2018 Oct 17. doi: 10.1109/TVCG.2018.2865044.

Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide.

Res Synth Methods. 2018 Dec;9(4):602-614. doi: 10.1002/jrsm.1287. Epub 2018 Feb 7.

MetaMap Lite: an evaluation of a new Java implementation of MetaMap.

J Am Med Inform Assoc. 2017 Jul 1;24(4):841-844. doi: 10.1093/jamia/ocw177.

Evaluating the use of different positional strategies for sentence selection in biomedical literature summarization.

BMC Bioinformatics. 2013 Feb 27;14:71. doi: 10.1186/1471-2105-14-71.

Seventy-five trials and eleven systematic reviews a day: how will we ever keep up?

PLoS Med. 2010 Sep 21;7(9):e1000326. doi: 10.1371/journal.pmed.1000326.
