Inferring mammalian tissue-specific regulatory conservation by predicting tissue-specific differences in open chromatin.

Affiliations

Department of Computational Biology, Carnegie Mellon University, Pittsburgh, PA, USA.

Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA, USA.

Publication information

BMC Genomics. 2022 Apr 11;23(1):291. doi: 10.1186/s12864-022-08450-7.

Abstract

BACKGROUND

Evolutionary conservation is an invaluable tool for inferring functional significance in the genome, including regions that are crucial across many species and those that have undergone convergent evolution. Computational methods to test for sequence conservation are dominated by algorithms that examine the ability of one or more nucleotides to align across large evolutionary distances. While these nucleotide alignment-based approaches have proven powerful for protein-coding genes and some non-coding elements, they fail to capture conservation of many enhancers, distal regulatory elements that control spatial and temporal patterns of gene expression. The function of enhancers is governed by a complex, often tissue- and cell type-specific code that links combinations of transcription factor binding sites and other regulation-related sequence patterns to regulatory activity. Thus, the function of orthologous enhancer regions can be conserved across large evolutionary distances, even when nucleotide turnover is high.

RESULTS

We present a new machine learning-based approach for evaluating enhancer conservation that leverages the combinatorial sequence code of enhancer activity rather than relying on the alignment of individual nucleotides. We first train a convolutional neural network model that can predict tissue-specific open chromatin, a proxy for enhancer activity, across mammals. Next, we apply that model to distinguish instances where the genome sequence would predict conserved function versus a loss of regulatory activity in that tissue. We present criteria for systematically evaluating model performance for this task and use them to demonstrate that our models accurately predict tissue-specific conservation and divergence in open chromatin between primate and rodent species, vastly outperforming leading nucleotide alignment-based approaches. We then apply our models to predict open chromatin at orthologs of brain and liver open chromatin regions across hundreds of mammals and find that brain enhancers associated with neuron activity have a stronger tendency than the general population to have predicted lineage-specific open chromatin.
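
To make the contrast with alignment-based scoring concrete, the following is a minimal sketch of the basic operation inside a convolutional sequence model: a single filter slides along a one-hot encoded sequence and the best match is kept (max pooling). It is not the authors' model. The motif `TGAC`, the filter `TOY_KERNEL`, the threshold of 4.0, and all function names are hypothetical choices made for illustration only. A trained network would learn many filters and combine them in later layers.

```python
from typing import List

# Channel order for the one-hot encoding (an assumption of this sketch).
BASES = "ACGT"


def one_hot(seq: str) -> List[List[float]]:
    """Encode DNA as rows of 4 values; unknown bases such as N become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def conv_max_score(seq: str, kernel: List[List[float]]) -> float:
    """Slide one convolutional filter along the sequence and max-pool the scores."""
    x = one_hot(seq)
    width = len(kernel)
    if len(x) < width:
        raise ValueError("sequence is shorter than the filter")
    scores = []
    for start in range(len(x) - width + 1):
        score = sum(
            x[start + i][j] * kernel[i][j]
            for i in range(width)
            for j in range(len(BASES))
        )
        scores.append(score)
    return max(scores)


# Hypothetical filter that rewards the made-up motif "TGAC" (not a real learned filter).
TOY_KERNEL = one_hot("TGAC")


def predicted_status(ref_seq: str, ortholog_seq: str,
                     kernel: List[List[float]] = TOY_KERNEL,
                     threshold: float = 4.0) -> str:
    """Compare toy 'open chromatin' calls for a reference region and its ortholog."""
    ref_open = conv_max_score(ref_seq, kernel) >= threshold
    orth_open = conv_max_score(ortholog_seq, kernel) >= threshold
    if ref_open and orth_open:
        return "conserved"
    if ref_open:
        return "lost"
    if orth_open:
        return "gained"
    return "closed in both"


# The motif has moved in the first ortholog, so position-by-position alignment
# of the motif fails, yet the pooled filter score is unchanged.
print(predicted_status("AATGACGGCC", "GGCCAATGAC"))
# A single substitution inside the motif removes the match in the second ortholog.
print(predicted_status("AATGACGGCC", "AATGTCGGCC"))
```

Because the score is pooled over positions, the call depends on whether the sequence pattern is present rather than on where it aligns, which is the property that lets a sequence model recognize conserved function despite nucleotide turnover.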

CONCLUSION

The framework presented here provides a mechanism to annotate tissue-specific regulatory function across hundreds of genomes and to study enhancer evolution using predicted regulatory differences rather than nucleotide-level conservation measurements.
