Gene Prioritization by Compressive Data Fusion and Chaining.

Author information

Žitnik Marinka, Nam Edward A, Dinh Christopher, Kuspa Adam, Shaulsky Gad, Zupan Blaž

Affiliations

Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia.

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America.

Publication information

PLoS Comput Biol. 2015 Oct 14;11(10):e1004552. doi: 10.1371/journal.pcbi.1004552. eCollection 2015 Oct.

DOI:10.1371/journal.pcbi.1004552

PMID:26465776

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4605714/

Abstract

Data integration procedures combine heterogeneous data sets into predictive models, but they are limited to data explicitly related to the target object type, such as genes. Collage is a new data fusion approach to gene prioritization. It considers data sets of various association levels with the prediction task, utilizes collective matrix factorization to compress the data, and chaining to relate different object types contained in a data compendium. Collage prioritizes genes based on their similarity to several seed genes. We tested Collage by prioritizing bacterial response genes in Dictyostelium as a novel model system for prokaryote-eukaryote interactions. Using 4 seed genes and 14 data sets, only one of which was directly related to the bacterial response, Collage proposed 8 candidate genes that were readily validated as necessary for the response of Dictyostelium to Gram-negative bacteria. These findings establish Collage as a method for inferring biological knowledge from the integration of heterogeneous and coarsely related data sets.
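
The three ideas named in the abstract (compression by factorization, chaining across object types, and ranking by similarity to seed genes) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a truncated SVD of concatenated matrices as a simple stand-in for collective matrix factorization, and the toy matrices, the function names (`chain`, `latent_profiles`, `prioritize`), and the choice of cosine similarity are all assumptions made for illustration.

```python
import numpy as np


def chain(*relations):
    """Multiply relation matrices along a path, e.g. genes -> terms -> pathways."""
    out = relations[0]
    for rel in relations[1:]:
        out = out @ rel
    return out


def latent_profiles(matrix, rank):
    """Compress a gene-by-feature matrix to `rank` dimensions.

    Truncated SVD is used here as a stand-in for collective matrix
    factorization (an assumption of this sketch, not the paper's method).
    """
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank] * s[:rank]


def prioritize(profiles, seed_idx):
    """Rank non-seed genes by mean cosine similarity to the seed genes."""
    norms = np.linalg.norm(profiles, axis=1, keepdims=True)
    unit = profiles / np.where(norms == 0, 1.0, norms)
    scores = (unit @ unit[seed_idx].T).mean(axis=1)
    seeds = set(seed_idx)
    ranking = [int(i) for i in np.argsort(-scores) if int(i) not in seeds]
    return ranking, scores


# Hypothetical toy data: 6 genes, 5 annotation terms, 3 pathways.
gene_term = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 0, 1],
], dtype=float)
term_pathway = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
    [0, 1, 1],
], dtype=float)

# Chaining: pathways are only indirectly related to genes (via terms),
# so a gene-by-pathway profile is obtained by multiplying along the path.
gene_pathway = chain(gene_term, term_pathway)

# Fuse the direct and the chained views, then compress to a latent space.
fused = np.hstack([gene_term, gene_pathway])
profiles = latent_profiles(fused, rank=2)

# Genes 0 and 1 play the role of seed genes.
ranking, scores = prioritize(profiles, seed_idx=[0, 1])
print("candidate ranking:", ranking)
```

In this toy setup, gene 2 shares an annotation term (and, through chaining, a pathway) with the seeds, so it is the gene the sketch is expected to place at the top of the candidate list.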

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8c4b/4605714/0bfee9451a6c/pcbi.1004552.g001.jpg
