CBAG: Conditional biomedical abstract generation.

Affiliations

School of Computing, Clemson University, Clemson, SC, United States of America.

Department of Computer and Information Sciences, University of Delaware, Newark, DE, United States of America.

Publication information

PLoS One. 2021 Jul 6;16(7):e0253905. doi: 10.1371/journal.pone.0253905. eCollection 2021.

DOI:10.1371/journal.pone.0253905
PMID:34228754
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8259990/
Abstract

Biomedical research papers often combine disjoint concepts in novel ways, such as when describing a newly discovered relationship between an understudied gene and an important disease. These concepts are often explicitly encoded as metadata keywords, such as the author-provided terms included with many documents in the MEDLINE database. While substantial recent work has addressed the problem of text generation in a more general context, applications such as scientific writing assistants or hypothesis generation systems could benefit from the capacity to select the specific set of concepts that underpin a generated biomedical text. We propose a conditional language model following the transformer architecture. This model uses the "encoder stack" to encode concepts that a user wishes to discuss in the generated text. The "decoder stack" then follows the masked self-attention pattern to perform text generation, using both the prior tokens and the encoded condition. We demonstrate that this approach provides significant control, while still producing reasonable biomedical text.
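
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain Python, a single attention head, no learned projections, no layer normalization, and no feed-forward layers. The names `attend`, `decoder_step`, `condition_vecs`, and `token_vecs` are hypothetical, and random vectors stand in for the encoder output and the token embeddings. The sketch only shows the two attention patterns the abstract names: masked self-attention over prior tokens, followed by attention over the encoded condition.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(queries, keys, values, causal=False):
    """Scaled dot-product attention.

    With causal=True, query i may only look at keys 0..i, which is the
    masked self-attention pattern used for left-to-right generation.
    """
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        visible = i + 1 if causal else len(keys)
        weights = softmax([dot(q, k) / scale for k in keys[:visible]])
        weights += [0.0] * (len(keys) - visible)  # masked positions get zero weight
        out = [sum(w * v[d] for w, v in zip(weights, values))
               for d in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


def decoder_step(token_vecs, condition_vecs):
    """One simplified decoder layer (assumption: no projections, norms, or FFN)."""
    # 1) Masked self-attention: each position sees itself and earlier tokens only.
    self_out, self_w = attend(token_vecs, token_vecs, token_vecs, causal=True)
    # 2) Cross-attention: each position may read every encoded condition vector.
    cross_out, cross_w = attend(self_out, condition_vecs, condition_vecs)
    # Residual-style sum of the two signals.
    hidden = [[s + c for s, c in zip(sv, cv)]
              for sv, cv in zip(self_out, cross_out)]
    return hidden, self_w, cross_w


rng = random.Random(0)
DIM = 4


def rand_vec():
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


# Stand-ins: two encoded keyword vectors and three already-generated tokens.
condition_vecs = [rand_vec() for _ in range(2)]
token_vecs = [rand_vec() for _ in range(3)]

hidden, self_w, cross_w = decoder_step(token_vecs, condition_vecs)
print(self_w[0])  # [1.0, 0.0, 0.0]: the first token can only attend to itself
print(cross_w[0])  # two positive weights, one per condition vector
```

In this sketch the condition reaches every decoder position through the second `attend` call, while the zero weights above the diagonal of `self_w` keep each position from seeing later tokens.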

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff66/8259990/92a1be045087/pone.0253905.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff66/8259990/2af89eb1611f/pone.0253905.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff66/8259990/4c5a7c4c0c20/pone.0253905.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff66/8259990/ab3966397987/pone.0253905.g004.jpg

Cited by

1
Dyport: dynamic importance-based biomedical hypothesis generation benchmarking technique.
BMC Bioinformatics. 2024 Jun 13;25(1):213. doi: 10.1186/s12859-024-05812-8.
