A biological network-based regularized artificial neural network model for robust phenotype prediction from gene expression data.

Authors

Kang Tianyu, Ding Wei, Zhang Luoyan, Ziemek Daniel, Zarringhalam Kourosh

Affiliations

Department of Computer Science, University of Massachusetts Boston, 100 Morrissey Boulevard, Boston, 02125, MA, USA.

Inflammation and Immunology, Pfizer Worldwide Research & Development, Berlin, Germany.

Publication information

BMC Bioinformatics. 2017 Dec 19;18(1):565. doi: 10.1186/s12859-017-1984-2.

Abstract

BACKGROUND

Stratification of patient subpopulations that respond favorably to treatment or experience an adverse reaction is an essential step toward the development of new personalized therapies and diagnostics. It is currently feasible to generate omic-scale biological measurements for all patients in a study, providing an opportunity for machine learning models to identify molecular markers for disease diagnosis and progression. However, the high variability of genetic background in human populations hampers the reproducibility of omic-scale markers. In this paper, we develop a biological network-based regularized artificial neural network model for prediction of phenotype from transcriptomic measurements in clinical trials. To improve the sparsity and overall reproducibility of the model, we incorporate regularization that simultaneously shrinks sets of genes grouped by their active upstream regulatory mechanisms.
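The group-wise shrinkage described above can be illustrated with a group lasso penalty on the first-layer weights of a network. The following is a minimal sketch and not the authors' Java implementation: it assumes non-overlapping gene groups, a weight matrix `W` with one row per gene and one column per hidden unit, and a penalty weighted by the square root of the group size. The function names, the toy weights, and the values of `lam` and `step` are hypothetical.

```python
import numpy as np

def group_lasso_penalty(W, groups, lam):
    """Sum over groups of lam * sqrt(group size) * Frobenius norm of the group's rows."""
    total = 0.0
    for idx in groups:
        block = W[idx, :]
        total += lam * np.sqrt(len(idx)) * np.linalg.norm(block)
    return total

def group_soft_threshold(W, groups, step):
    """Proximal step for the penalty above (assumes non-overlapping groups).

    Every row in a group is scaled by the same factor, so a weak group is
    set to zero as a whole while a strong group is only shrunk.
    """
    W_new = W.copy()
    for idx in groups:
        block = W[idx, :]
        norm = np.linalg.norm(block)
        thresh = step * np.sqrt(len(idx))
        scale = max(0.0, 1.0 - thresh / norm) if norm > 0 else 0.0
        W_new[idx, :] = scale * block
    return W_new

# Toy example: 4 genes, 2 hidden units, 2 hypothetical regulator-based groups.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [0.1, 0.0],
              [0.0, 0.1]])
groups = [[0, 1], [2, 3]]

penalty = group_lasso_penalty(W, groups, lam=1.0)
W_shrunk = group_soft_threshold(W, groups, step=0.5)
```

In this toy case the second group has a small joint norm, so all of its weights are zeroed together, while the first group is kept and scaled down. This is the sense in which a group penalty selects whole gene sets rather than individual genes. Gene sets derived from a regulatory network usually overlap, which would require an overlapping-group variant that this sketch does not cover.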

RESULTS

We benchmark our method against various regression, support vector machine, and artificial neural network models, and demonstrate its ability to predict clinical outcomes using clinical trial data on acute rejection in kidney transplantation and response to Infliximab in ulcerative colitis. We show that integrating prior biological knowledge into the classification, as developed in this paper, significantly improves the robustness and generalizability of predictions to independent datasets. We provide Java code for our algorithm along with a parsed version of the STRING DB database.

CONCLUSION

In summary, we present a method for prediction of clinical phenotypes using baseline genome-wide expression data that makes use of prior biological knowledge on gene-regulatory interactions in order to increase the robustness and reproducibility of omic-scale markers. The integrated group-wise regularization method increases the interpretability of biological signatures and gives stable performance estimates across independent test sets.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/15d0/5735940/9efad12e5cbc/12859_2017_1984_Fig1_HTML.jpg
