Gene Prioritization by Compressive Data Fusion and Chaining.

Authors

Žitnik Marinka, Nam Edward A, Dinh Christopher, Kuspa Adam, Shaulsky Gad, Zupan Blaž

Affiliations

Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia.

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America.

Publication information

PLoS Comput Biol. 2015 Oct 14;11(10):e1004552. doi: 10.1371/journal.pcbi.1004552. eCollection 2015 Oct.

Abstract

Data integration procedures combine heterogeneous data sets into predictive models, but they are limited to data explicitly related to the target object type, such as genes. Collage is a new data fusion approach to gene prioritization. It considers data sets of various association levels with the prediction task, utilizes collective matrix factorization to compress the data, and chaining to relate different object types contained in a data compendium. Collage prioritizes genes based on their similarity to several seed genes. We tested Collage by prioritizing bacterial response genes in Dictyostelium as a novel model system for prokaryote-eukaryote interactions. Using 4 seed genes and 14 data sets, only one of which was directly related to the bacterial response, Collage proposed 8 candidate genes that were readily validated as necessary for the response of Dictyostelium to Gram-negative bacteria. These findings establish Collage as a method for inferring biological knowledge from the integration of heterogeneous and coarsely related data sets.
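
The pipeline outlined in the abstract (compress the data, chain across object types, rank genes by similarity to seed genes) can be illustrated with a small sketch. This is not the authors' implementation: it swaps in an independent truncated SVD per matrix for collective matrix factorization, which would factorize all matrices jointly. The function names `low_rank`, `chain`, and `prioritize`, the toy matrices, and the cosine-similarity scoring are all assumptions made for illustration.

```python
import numpy as np


def low_rank(matrix, rank):
    """Compress one relation matrix with truncated SVD.

    Assumption: a simple stand-in for collective matrix factorization.
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


def chain(relations, rank):
    """Multiply compressed relation matrices along a path of object types
    (e.g. genes -> terms -> pathways) to get gene profiles over the last type."""
    profile = low_rank(relations[0], rank)
    for relation in relations[1:]:
        profile = profile @ low_rank(relation, rank)
    return profile


def prioritize(profiles, seeds):
    """Rank non-seed genes by mean cosine similarity to the seed genes."""
    norms = np.linalg.norm(profiles, axis=1, keepdims=True)
    unit = profiles / np.where(norms == 0, 1.0, norms)
    scores = (unit @ unit[seeds].T).mean(axis=1)
    order = np.argsort(-scores)
    candidates = [int(g) for g in order if int(g) not in seeds]
    return candidates, scores


# Hypothetical toy data: 5 genes x 3 terms, and 3 terms x 2 pathways.
# Both matrices have rank 2, so rank-2 compression is lossless here.
gene_term = np.array([
    [1, 0, 0],  # gene 0 (seed)
    [1, 0, 0],  # gene 1: same annotation as the seed
    [0, 1, 1],
    [0, 1, 1],
    [1, 1, 1],  # gene 4: partially overlaps the seed
], dtype=float)
term_pathway = np.array([
    [1, 0],
    [0, 1],
    [0, 1],
], dtype=float)

profiles = chain([gene_term, term_pathway], rank=2)
candidates, scores = prioritize(profiles, seeds=[0])
print(candidates[:2])
```

In this toy setting the genes are never compared on a data set about the seed phenotype itself. They are compared on pathway profiles reached by chaining through terms, which mirrors the idea of using data only indirectly related to the prediction task.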

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8c4b/4605714/0bfee9451a6c/pcbi.1004552.g001.jpg
