EvoLSTM: context-dependent models of sequence evolution using a sequence-to-sequence LSTM.

Author information

School of Computer Science, McGill University, Montreal, Quebec H3A 0G4, Canada.

Publication information

Bioinformatics. 2020 Jul 1;36(Suppl_1):i353-i361. doi: 10.1093/bioinformatics/btaa447.

Abstract

MOTIVATION

Accurate probabilistic models of sequence evolution are essential for a wide variety of bioinformatics tasks, including sequence alignment and phylogenetic inference. The ability to realistically simulate sequence evolution is also at the core of many benchmarking strategies. Yet, mutational processes have complex context dependencies that remain poorly modeled and understood.

RESULTS

We introduce EvoLSTM, a recurrent neural network-based evolution simulator that captures mutational context dependencies. EvoLSTM uses a sequence-to-sequence long short-term memory model trained to predict mutation probabilities at each position of a given sequence, taking into consideration the 14 flanking nucleotides. EvoLSTM can realistically simulate mammalian and plant DNA sequence evolution and reveals unexpectedly strong long-range context dependencies in mutation probabilities. EvoLSTM brings modern machine-learning approaches to bear on sequence evolution. It will serve as a useful tool to study and simulate complex mutational processes.
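As an illustration of what context-dependent simulation means here, the following minimal Python sketch samples a descendant sequence one position at a time, conditioning each draw on a window of flanking nucleotides. It is not the EvoLSTM implementation: `toy_context_model` is a hypothetical stand-in for a trained network, its rates (0.01 baseline, 0.10 for a C followed by G) are invented for the example, and reading "14 flanking nucleotides" as 7 on each side is an assumption. The sketch models substitutions only and ignores insertions and deletions.

```python
import random

ALPHABET = "ACGT"


def toy_context_model(context, center):
    """Hypothetical stand-in for a trained model.

    Returns a dict mapping each descendant base to its probability,
    given the ancestral window `context` and the index `center` of the
    position being mutated. The rates are invented for illustration.
    """
    base = context[center]
    rate = 0.01  # assumed baseline substitution rate
    # Toy context effect: a C immediately followed by G mutates more often.
    if base == "C" and center + 1 < len(context) and context[center + 1] == "G":
        rate = 0.10
    probs = {b: rate / 3 for b in ALPHABET}
    probs[base] = 1.0 - rate
    return probs


def simulate(seq, model=toy_context_model, flank=7, rng=None):
    """Sample a descendant of `seq` (A/C/G/T only), position by position.

    `flank` is the number of neighbours considered on each side
    (an assumption; 7 + 7 = 14 flanking nucleotides).
    """
    rng = rng or random.Random()
    # Pad with N so that positions near the ends still get a full window.
    padded = "N" * flank + seq + "N" * flank
    out = []
    for i in range(len(seq)):
        window = padded[i:i + 2 * flank + 1]
        probs = model(window, flank)
        bases = list(probs)
        weights = [probs[b] for b in bases]
        out.append(rng.choices(bases, weights=weights, k=1)[0])
    return "".join(out)


if __name__ == "__main__":
    ancestor = "ACGTTACGGATCCGTA"
    print(ancestor)
    print(simulate(ancestor, rng=random.Random(0)))
```

Swapping `toy_context_model` for a function that queries a trained network would leave `simulate` unchanged, which is the main reason for passing the model in as a parameter.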

AVAILABILITY AND IMPLEMENTATION

Code and dataset are available at https://github.com/DongjoonLim/EvoLSTM.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
