Connecting the dots between PubMed abstracts.

Affiliation

Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America.

Publication information

PLoS One. 2012;7(1):e29509. doi: 10.1371/journal.pone.0029509. Epub 2012 Jan 3.

DOI:10.1371/journal.pone.0029509
PMID:22235301
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3250456/
Abstract

BACKGROUND

There are now a multitude of articles published in a diversity of journals providing information about genes, proteins, pathways, and diseases. Each article investigates subsets of a biological process, but to gain insight into the functioning of a system as a whole, we must integrate information from multiple publications. Particularly, unraveling relationships between extra-cellular inputs and downstream molecular response mechanisms requires integrating conclusions from diverse publications.

METHODOLOGY

We present an automated approach to biological knowledge discovery from PubMed abstracts, suitable for "connecting the dots" across the literature. We describe a storytelling algorithm that, given a start and end publication, typically with little or no overlap in content, identifies a chain of intermediate publications from one to the other, such that neighboring publications have significant content similarity. The quality of discovered stories is measured using local criteria such as the size of supporting neighborhoods for each link and the strength of individual links connecting publications, as well as global metrics of dispersion. To ensure that the story stays coherent as it meanders from one publication to another, we demonstrate the design of novel coherence and overlap filters for use as post-processing steps.
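The chaining idea described above can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a toy similarity measure (Jaccard overlap of word sets), a hypothetical threshold `min_sim`, and a plain breadth-first search, and the four short "abstracts" are invented for illustration. The function names `tokenize`, `jaccard`, `find_story`, and `weakest_link` are likewise hypothetical.

```python
from collections import deque


def tokenize(text):
    """Lowercase a text and return its set of alphabetic word tokens."""
    words = (w.strip(".,;:()\"'") for w in text.lower().split())
    return {w for w in words if w.isalpha()}


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_story(docs, start, end, min_sim=0.3):
    """Breadth-first search for the shortest chain of document ids from
    start to end in which every pair of neighbours has similarity of at
    least min_sim. Returns None when no such chain exists."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    parent = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == end:
            chain = []
            while current is not None:
                chain.append(current)
                current = parent[current]
            return chain[::-1]
        for other in docs:
            if other not in parent and jaccard(tokens[current], tokens[other]) >= min_sim:
                parent[other] = current
                queue.append(other)
    return None


def weakest_link(docs, chain):
    """Smallest neighbour-to-neighbour similarity along a chain, one simple
    local quality measure. Returns None for chains with fewer than two ids."""
    tokens = [tokenize(docs[doc_id]) for doc_id in chain]
    return min((jaccard(a, b) for a, b in zip(tokens, tokens[1:])), default=None)


# Invented toy corpus: "A" and "D" share no words, but neighbours overlap.
docs = {
    "A": "glucose uptake insulin signaling",
    "B": "insulin signaling kinase activation",
    "C": "kinase activation transcription factor",
    "D": "transcription factor apoptosis",
}

if __name__ == "__main__":
    story = find_story(docs, "A", "D")
    print(story)  # ['A', 'B', 'C', 'D']
    print(weakest_link(docs, story))
```

Raising `min_sim` makes each link stronger but can leave the start and end publications unconnected, so the threshold trades link strength against the existence of a story. A realistic system would need a better document representation and a scalable neighbour search.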

CONCLUSIONS

We demonstrate the application of our storytelling algorithm to three case studies: i) a many-one study exploring relationships between multiple cellular inputs and a molecule responsible for cell-fate decisions, ii) a many-many study exploring the relationships between multiple cytokines and multiple downstream transcription factors, and iii) a one-to-one study to showcase the ability to recover a cancer-related association, viz. the Warburg effect, from past literature. The storytelling pipeline helps narrow down a scientist's focus from several hundred thousand relevant documents to only around a hundred stories. We argue that our approach can serve as a valuable discovery aid for hypothesis generation and connection exploration in large unstructured biological knowledge bases.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a96d/3250456/1732053909f2/pone.0029509.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a96d/3250456/a73cc07d8faa/pone.0029509.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a96d/3250456/3241825adde6/pone.0029509.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a96d/3250456/19259a1b27a4/pone.0029509.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a96d/3250456/6cbb1cc4562f/pone.0029509.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a96d/3250456/e364821d3a41/pone.0029509.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a96d/3250456/a144464b296d/pone.0029509.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a96d/3250456/1db41f1548de/pone.0029509.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a96d/3250456/f31c65b08d13/pone.0029509.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a96d/3250456/b10002d8f091/pone.0029509.g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a96d/3250456/764e055d6386/pone.0029509.g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a96d/3250456/43f3d58215ed/pone.0029509.g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a96d/3250456/23c8d78f1f3e/pone.0029509.g013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a96d/3250456/fd876af6eeaf/pone.0029509.g014.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a96d/3250456/f13e32b1387c/pone.0029509.g015.jpg
