A deep auto-encoder model for gene expression prediction.

Affiliations

Department of Computer Science, University of Missouri at Columbia, Columbia, MO, USA.

Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, University City Blvd, Charlotte, NC, USA.

Publication information

BMC Genomics. 2017 Nov 17;18(Suppl 9):845. doi: 10.1186/s12864-017-4226-0.

Abstract

BACKGROUND

Gene expression is a key intermediate step through which genotypes lead to a particular trait. It is affected by various factors, including the genotypes of genetic variants. To delineate the genetic impact on gene expression, we build a deep auto-encoder model to assess how well genetic variants contribute to gene expression changes. This new deep learning model is a regression-based predictive model built on a MultiLayer Perceptron and a Stacked Denoising Auto-encoder (MLP-SAE). The model is trained using a stacked denoising auto-encoder for feature selection and a multilayer perceptron framework for backpropagation. We further improve the model by introducing dropout to prevent overfitting and improve performance.
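The training scheme described above (denoising auto-encoder layers stacked first, then a multilayer perceptron trained by backpropagation with dropout) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the sigmoid activations, tied weights, masking noise, squared-error loss, layer sizes, learning rates, and the names `pretrain_dae`, `train_mlp_sae`, and `predict`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pretrain_dae(X, n_hidden, rng, corruption=0.2, lr=0.1, epochs=50):
    """Train one denoising auto-encoder layer (tied weights, squared error)."""
    n, d = X.shape
    W = rng.normal(0.0, 0.1, size=(d, n_hidden))
    b_h, b_v = np.zeros(n_hidden), np.zeros(d)
    for _ in range(epochs):
        X_tilde = X * (rng.random(X.shape) > corruption)  # masking noise
        H = sigmoid(X_tilde @ W + b_h)                    # encode corrupted input
        R = H @ W.T + b_v                                 # linear reconstruction
        dR = (R - X) / n                                  # compare with the clean input
        dZ = (dR @ W) * H * (1.0 - H)
        W -= lr * (X_tilde.T @ dZ + dR.T @ H)             # encoder + decoder terms
        b_h -= lr * dZ.sum(axis=0)
        b_v -= lr * dR.sum(axis=0)
    return W, b_h

def train_mlp_sae(X, y, hidden_sizes=(16, 8), dropout=0.2, lr=0.1,
                  epochs=500, seed=0):
    """Greedy layer-wise pretraining, then supervised fine-tuning with dropout."""
    rng = np.random.default_rng(seed)
    layers, A = [], X
    for h in hidden_sizes:                 # stack the auto-encoder layers
        W, b = pretrain_dae(A, h, rng)
        layers.append([W, b])
        A = sigmoid(A @ W + b)
    w_out = rng.normal(0.0, 0.1, size=hidden_sizes[-1])  # linear regression head
    b_out = 0.0
    n = X.shape[0]
    for _ in range(epochs):
        acts, hs, masks = [X], [], []
        for W, b in layers:                # forward pass with inverted dropout
            H = sigmoid(acts[-1] @ W + b)
            m = (rng.random(H.shape) > dropout) / (1.0 - dropout)
            hs.append(H)
            masks.append(m)
            acts.append(H * m)
        err = (acts[-1] @ w_out + b_out - y) / n   # gradient of 0.5 * MSE
        dA = np.outer(err, w_out)                  # uses w_out before its update
        w_out -= lr * (acts[-1].T @ err)
        b_out -= lr * err.sum()
        for i in reversed(range(len(layers))):     # backpropagate through the stack
            W, b = layers[i]
            dZ = dA * masks[i] * hs[i] * (1.0 - hs[i])
            dA = dZ @ W.T                          # uses W before its update
            W -= lr * (acts[i].T @ dZ)             # in-place updates
            b -= lr * dZ.sum(axis=0)
    return layers, w_out, b_out

def predict(model, X):
    """Predict expression for genotype matrix X (no dropout at inference)."""
    layers, w_out, b_out = model
    A = X
    for W, b in layers:
        A = sigmoid(A @ W + b)
    return A @ w_out + b_out
```

In this sketch `X` would be a samples-by-variants genotype matrix and `y` the expression level of a single gene, so one model is fitted per gene; that per-gene setup is also an assumption rather than something the abstract specifies.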

RESULTS

To demonstrate the use of this model, we apply MLP-SAE to a real genomic dataset with genotypes and gene expression profiles measured in yeast. Our results show that the MLP-SAE model with dropout outperforms other models, including Lasso, Random Forests, and the MLP-SAE model without dropout. Using the MLP-SAE model with dropout, we show that gene expression quantifications predicted by the model solely from genotypes align well with true gene expression patterns.
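The abstract does not say which measure was used to compare models or to judge how well predictions align with true expression. As a hedged illustration only, the following sketch computes two common choices for a regression task, mean squared error and the Pearson correlation, for one gene across samples. The function names `mse` and `pearson_r` are hypothetical.

```python
from math import sqrt

def mse(y_true, y_pred):
    """Mean squared error between measured and predicted expression values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def pearson_r(y_true, y_pred):
    """Pearson correlation; returns NaN if either vector is constant."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        return float("nan")
    return cov / sqrt(var_t * var_p)
```

A lower `mse` and a `pearson_r` closer to 1 on held-out samples would both indicate closer agreement between predicted and measured expression.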

CONCLUSION

We provide a deep auto-encoder model for predicting gene expression from SNP genotypes. This study demonstrates that deep learning is appropriate for tackling another genomic problem, i.e., building predictive models to understand genotypes' contribution to gene expression. With the emerging availability of richer genomic data, we anticipate that deep learning models will play a bigger role in modeling and interpreting genomics.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb7e/5773895/138f08dc54f5/12864_2017_4226_Fig1_HTML.jpg
