Effective gene expression prediction from sequence by integrating long-range interactions.

Affiliations

DeepMind, London, UK.

Calico Life Sciences, South San Francisco, CA, USA.

Publication information

Nat Methods. 2021 Oct;18(10):1196-1203. doi: 10.1038/s41592-021-01252-x. Epub 2021 Oct 4.

Abstract

How noncoding DNA determines gene expression in different cell types is a major unsolved problem, and critical downstream applications in human genetics depend on improved solutions. Here, we report substantially improved gene expression prediction accuracy from DNA sequences through the use of a deep learning architecture, called Enformer, that is able to integrate information from long-range interactions (up to 100 kb away) in the genome. This improvement yielded more accurate variant effect predictions on gene expression for both natural genetic variants and saturation mutagenesis measured by massively parallel reporter assays. Furthermore, Enformer learned to predict enhancer-promoter interactions directly from the DNA sequence competitively with methods that take direct experimental data as input. We expect that these advances will enable more effective fine-mapping of human disease associations and provide a framework to interpret cis-regulatory evolution.
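The variant effect predictions mentioned above are commonly obtained by scoring the reference and alternate alleles with the same sequence model and taking the difference. The following is a minimal sketch of that idea, not Enformer's actual interface: `toy_model` is a hypothetical stand-in for a trained predictor (it simply returns GC fraction), and the helper names `one_hot`, `apply_variant`, and `variant_effect` are assumptions introduced for illustration.

```python
from typing import Callable, List

BASES = "ACGT"


def one_hot(seq: str) -> List[List[float]]:
    """Encode a DNA string as rows of 4 floats in A, C, G, T order."""
    rows = []
    for base in seq.upper():
        row = [0.0] * 4
        if base in BASES:  # unknown bases such as N stay all-zero
            row[BASES.index(base)] = 1.0
        rows.append(row)
    return rows


def apply_variant(seq: str, pos: int, ref: str, alt: str) -> str:
    """Return seq with the single-base substitution ref>alt at 0-based pos."""
    if seq[pos].upper() != ref.upper():
        raise ValueError(f"reference mismatch at {pos}: {seq[pos]} != {ref}")
    return seq[:pos] + alt + seq[pos + 1:]


def toy_model(x: List[List[float]]) -> float:
    """Hypothetical stand-in for a trained model: returns GC fraction."""
    gc = sum(row[1] + row[2] for row in x)
    return gc / len(x)


def variant_effect(
    model: Callable[[List[List[float]]], float],
    seq: str,
    pos: int,
    ref: str,
    alt: str,
) -> float:
    """Score a variant as prediction(alt sequence) - prediction(ref sequence)."""
    ref_pred = model(one_hot(seq))
    alt_pred = model(one_hot(apply_variant(seq, pos, ref, alt)))
    return alt_pred - ref_pred
```

With a real model, `seq` would be a long genomic window centered on the variant so that distal regulatory elements fall inside the input; the toy model here ignores position entirely and only illustrates the ref-versus-alt comparison.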

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/39ba/8490152/8206b186dc69/41592_2021_1252_Fig1_HTML.jpg
