Prediction of mutation effects using a deep temporal convolutional network.

Affiliation

Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea.

Publication information

Bioinformatics. 2020 Apr 1;36(7):2047-2052. doi: 10.1093/bioinformatics/btz873.

DOI:10.1093/bioinformatics/btz873

PMID:31746978

Abstract

MOTIVATION

Accurate prediction of the effects of genetic variation is a major goal in biological research. Towards this goal, numerous machine learning models have been developed to learn information from evolutionary sequence data. The most effective method so far is a deep generative model based on the variational autoencoder (VAE), which models the sequence distributions using a latent variable. In this study, we propose a deep autoregressive generative model named mutationTCN, which employs dilated causal convolutions and an attention mechanism to model inter-residue correlations in a biological sequence.
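As an illustration of the dilated causal convolution named above, the following is a minimal pure-Python sketch. It is not taken from the mutationTCN source code: the function names `dilated_causal_conv1d` and `receptive_field`, the zero left-padding, and the toy inputs are all assumptions made for this example. The point it shows is that each output position sees only the current and earlier positions, which is what makes an autoregressive factorization possible.

```python
def dilated_causal_conv1d(x, weights, dilation):
    """Apply a 1-D dilated causal convolution to a list of floats.

    Output position t depends only on x[t], x[t - dilation],
    x[t - 2 * dilation], ... Positions before the start of the
    sequence are treated as zeros (assumed left padding).
    """
    out = []
    for t in range(len(x)):
        total = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation  # look back only, never ahead
            if idx >= 0:
                total += w * x[idx]
        out.append(total)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input positions visible to one output of stacked layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy input, not real sequence data
    print(dilated_causal_conv1d(x, [1.0, 1.0], dilation=2))
    print(receptive_field(2, [1, 2, 4, 8]))
```

Stacking layers with growing dilations widens the receptive field quickly while keeping the number of layers small, which is the usual reason for choosing dilated convolutions over plain ones for long sequences.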

RESULTS

We show that this model is competitive with the VAE model when tested against a set of 42 high-throughput mutation scan experiments, with a mean improvement in Spearman rank correlation of ∼0.023. In particular, compared with the latent variable model, our model can more efficiently capture information from multiple sequence alignments with a lower effective number of sequences, such as those of viral sequence families. We also extend this architecture to a semi-supervised learning framework, which shows high prediction accuracy. Finally, we show that our model enables direct optimization of the data likelihood and allows for a simple and stable training process.
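The evaluation metric mentioned above, Spearman rank correlation, can be sketched as follows. This is a generic pure-Python implementation (Pearson correlation of average ranks), not the authors' evaluation script; the function names and the toy scores are hypothetical and serve only to show how predicted scores would be compared against measured mutation effects.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_a * var_b) ** 0.5


if __name__ == "__main__":
    predicted = [0.1, 0.4, 0.35, 0.8]  # made-up model scores
    measured = [1.0, 2.0, 3.0, 4.0]    # made-up assay values
    print(spearman(predicted, measured))
```

Because the metric depends only on ranks, a predictor is rewarded for ordering variants correctly even when its scores are on a different scale from the experimental measurements.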

AVAILABILITY AND IMPLEMENTATION

Source code is available at https://github.com/ha01994/mutationTCN.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

Dr.VAE: improving drug response prediction via modeling of drug perturbation effects.

Bioinformatics. 2019 Oct 1;35(19):3743-3751. doi: 10.1093/bioinformatics/btz158.

A deep dilated convolutional residual network for predicting interchain contacts of protein homodimers.

Bioinformatics. 2022 Mar 28;38(7):1904-1910. doi: 10.1093/bioinformatics/btac063.

DeepPhos: prediction of protein phosphorylation sites with deep learning.

Bioinformatics. 2019 Aug 15;35(16):2766-2773. doi: 10.1093/bioinformatics/bty1051.

ResPRE: high-accuracy protein contact prediction by coupling precision matrix with deep residual neural networks.

Bioinformatics. 2019 Nov 1;35(22):4647-4655. doi: 10.1093/bioinformatics/btz291.

DNCON2: improved protein contact prediction using two-level deep convolutional neural networks.

Bioinformatics. 2018 May 1;34(9):1466-1472. doi: 10.1093/bioinformatics/btx781.

Sequence alignment using machine learning for accurate template-based protein structure prediction.

Bioinformatics. 2020 Jan 1;36(1):104-111. doi: 10.1093/bioinformatics/btz483.

Modeling positional effects of regulatory sequences with spline transformations increases prediction accuracy of deep neural networks.

Bioinformatics. 2018 Apr 15;34(8):1261-1269. doi: 10.1093/bioinformatics/btx727.

A deep learning architecture for metabolic pathway prediction.

Bioinformatics. 2020 Apr 15;36(8):2547-2553. doi: 10.1093/bioinformatics/btz954.

DeepECA: an end-to-end learning framework for protein contact prediction from a multiple sequence alignment.

BMC Bioinformatics. 2020 Jan 9;21(1):10. doi: 10.1186/s12859-019-3190-x.

Cited by

Variant effect predictor correlation with functional assays is reflective of clinical classification performance.

Genome Biol. 2025 Apr 22;26(1):104. doi: 10.1186/s13059-025-03575-w.

QAFI: a novel method for quantitative estimation of missense variant impact using protein-specific predictors and ensemble learning.

Hum Genet. 2025 Mar;144(2-3):191-208. doi: 10.1007/s00439-024-02692-z. Epub 2024 Jul 24.

Searching for protein variants with desired properties using deep generative models.

BMC Bioinformatics. 2023 Jul 21;24(1):297. doi: 10.1186/s12859-023-05415-9.

Updated benchmarking of variant effect predictors using deep mutational scanning.

Mol Syst Biol. 2023 Aug 8;19(8):e11474. doi: 10.15252/msb.202211474. Epub 2023 Jun 13.

Characterization of RNA polymerase II trigger loop mutations using molecular dynamics simulations and machine learning.

PLoS Comput Biol. 2023 Mar 22;19(3):e1010999. doi: 10.1371/journal.pcbi.1010999. eCollection 2023 Mar.

Recent Advances in Machine Learning Variant Effect Prediction Tools for Protein Engineering.

Ind Eng Chem Res. 2022 May 18;61(19):6235-6245. doi: 10.1021/acs.iecr.1c04943. Epub 2022 Apr 6.

An enhanced variant effect predictor based on a deep generative model and the Born-Again Networks.

Sci Rep. 2021 Sep 27;11(1):19127. doi: 10.1038/s41598-021-98693-3.

Machine and Deep Learning in Molecular and Genetic Aspects of Sleep Research.

Neurotherapeutics. 2021 Jan;18(1):228-243. doi: 10.1007/s13311-021-01014-9. Epub 2021 Apr 7.

Deep Learning in Proteomics.

Proteomics. 2020 Nov;20(21-22):e1900335. doi: 10.1002/pmic.201900335. Epub 2020 Oct 30.
