Unsupervised text mining for assessing and augmenting GWAS results.

Authors

Ailem Melissa, Role François, Nadif Mohamed, Demenais Florence

Affiliations

LIPADE, Université Paris Descartes, Sorbonne Paris Cité, Paris F-75006, France.

INSERM, Genetic Variation and Human Diseases Unit, UMR-946, Paris F-75010, France; Institut Universitaire d'Hématologie, Université Paris Diderot, Sorbonne Paris Cité, Paris F-75010, France.

Publication

J Biomed Inform. 2016 Apr;60:252-9. doi: 10.1016/j.jbi.2016.02.008. Epub 2016 Feb 19.

DOI: 10.1016/j.jbi.2016.02.008
PMID: 26911523

Abstract

Text mining can assist in the analysis and interpretation of large-scale biomedical data, helping biologists to quickly and cheaply gain confirmation of hypothesized relationships between biological entities. We set this question in the context of genome-wide association studies (GWAS), an actively emerging field that has contributed to the identification of many genes associated with multifactorial diseases. These studies make it possible to identify groups of genes associated with the same phenotype, but provide no information about the relationships between these genes. Our objective is therefore to augment GWAS results with unsupervised text mining techniques, namely text-based cosine similarity comparisons and clustering applied to candidate and random gene vectors. We propose a generic framework, which we used to characterize the relationships between 10 genes reported to be associated with asthma by a previous GWAS. The results of this experiment showed that the similarities between these 10 genes were significantly stronger than would be expected by chance (one-sided p-value < 0.01). The clustering of observed and randomly selected genes also made it possible to generate hypotheses about potential functional relationships between these genes, and thus contributed to the discovery of new candidate genes for asthma.
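
The comparison described in the abstract (cosine similarity between text-derived gene vectors, judged against randomly selected genes) can be illustrated with a minimal sketch. The abstract does not specify how the vectors are built, which test statistic is used, or how random genes are sampled, so everything below is an assumption made for illustration: raw term counts as vectors, the mean pairwise cosine similarity as the statistic, and an empirical one-sided p-value estimated from random gene sets of the same size. The gene names and texts are hypothetical toy data, not results from the paper.

```python
import itertools
import math
import random
from collections import Counter


def text_to_vector(text):
    """Turn a text into a raw term-frequency vector (toy whitespace tokenizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(vectors):
    """Average cosine similarity over all unordered pairs of vectors."""
    pairs = list(itertools.combinations(vectors, 2))
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


def empirical_p_value(candidates, background, n_draws=1000, seed=0):
    """One-sided empirical p-value for the candidates' mean similarity.

    Assumption: the null distribution comes from random gene sets of the
    same size drawn without replacement from a background pool.
    """
    rng = random.Random(seed)
    observed = mean_pairwise_similarity(candidates)
    at_least = 0
    for _ in range(n_draws):
        sample = rng.sample(background, len(candidates))
        if mean_pairwise_similarity(sample) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least + 1) / (n_draws + 1)


# Hypothetical toy texts standing in for the literature on each gene.
candidate_texts = {
    "GENE_A": "airway inflammation cytokine signalling",
    "GENE_B": "cytokine receptor airway inflammation",
    "GENE_C": "inflammation airway epithelium cytokine",
}
# Hypothetical background pool of unrelated genes with little shared vocabulary.
background_texts = [f"topic{i} term{i} word{i % 4}" for i in range(12)]

candidates = [text_to_vector(t) for t in candidate_texts.values()]
background = [text_to_vector(t) for t in background_texts]

observed, p_value = empirical_p_value(candidates, background)
print(f"mean pairwise cosine = {observed:.3f}, one-sided p = {p_value:.4f}")
```

In practice the vectors would more plausibly be weighted (for example with TF-IDF) and built from a large literature corpus, and the background pool would exclude the candidate genes; those choices are left open here because the abstract does not state them.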
