Deciphering the impact of genetic variation on human polyadenylation using APARENT2.

Affiliations

Department of Genetics, Stanford University, Stanford, USA.

Department of Bioengineering, University of Washington, Seattle, USA.

Publication information

Genome Biol. 2022 Nov 5;23(1):232. doi: 10.1186/s13059-022-02799-4.

DOI:10.1186/s13059-022-02799-4

PMID:36335397

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9636789/

Abstract

BACKGROUND

3'-end processing by cleavage and polyadenylation is an important and finely tuned regulatory process during mRNA maturation. Numerous genetic variants are known to cause or contribute to human disorders by disrupting the cis-regulatory code of polyadenylation signals. Yet, due to the complexity of this code, variant interpretation remains challenging.

RESULTS

We introduce a residual neural network model, APARENT2, that can infer 3'-cleavage and polyadenylation from DNA sequence more accurately than any previous model. This model generalizes to the case of alternative polyadenylation (APA) for a variable number of polyadenylation signals. We demonstrate APARENT2's performance on several variant datasets, including functional reporter data and human 3' aQTLs from GTEx. We apply neural network interpretation methods to gain insights into disrupted or protective higher-order features of polyadenylation. We fine-tune APARENT2 on human tissue-resolved transcriptomic data to elucidate tissue-specific variant effects. By combining APARENT2 with models of mRNA stability, we extend aQTL effect size predictions to the entire 3' untranslated region. Finally, we perform in silico saturation mutagenesis of all human polyadenylation signals and compare the predicted effects of [Formula: see text] million variants against gnomAD. While loss-of-function variants were generally selected against, we also find specific clinical conditions linked to gain-of-function mutations. For example, we detect an association between gain-of-function mutations in the 3'-end and autism spectrum disorder. To experimentally validate APARENT2's predictions, we assayed clinically relevant variants in multiple cell lines, including microglia-derived cells.

CONCLUSIONS

A sequence-to-function model based on deep residual learning enables accurate functional interpretation of genetic variants in polyadenylation signals and, when coupled with large human variation databases, elucidates the link between functional 3'-end mutations and human health.
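The in silico saturation mutagenesis mentioned in the abstract can be illustrated with a minimal sketch. This is not APARENT2 itself: `toy_pas_score` is a hypothetical hand-written scorer standing in for a trained sequence-to-function model, and the hexamer weights are invented for illustration. The variant effect is expressed here as a log odds ratio between variant and reference predictions, which is an assumption about the scoring metric rather than something stated in the abstract.

```python
import math
from typing import Callable, Dict, Tuple

BASES = "ACGT"


def toy_pas_score(seq: str) -> float:
    """Hypothetical stand-in for a trained model.

    Returns a made-up polyadenylation signal usage probability in (0, 1).
    The weights below are illustrative assumptions, not learned values.
    """
    logit = -2.0
    if "AATAAA" in seq:        # canonical hexamer
        logit += 4.0
    elif "ATTAAA" in seq:      # common weaker hexamer variant
        logit += 2.5
    logit += 0.5 * seq.count("TGT")  # crude proxy for a GU-rich element
    return 1.0 / (1.0 + math.exp(-logit))


def to_logit(p: float) -> float:
    """Convert a probability to log odds."""
    return math.log(p / (1.0 - p))


def saturation_mutagenesis(
    ref_seq: str,
    score_fn: Callable[[str], float] = toy_pas_score,
) -> Dict[Tuple[int, str, str], float]:
    """Score every single-nucleotide substitution of ref_seq.

    Returns a dict keyed by (position, ref_base, alt_base) whose value is
    the log odds ratio of the variant prediction versus the reference.
    Negative values indicate predicted loss of function, positive values
    predicted gain of function.
    """
    ref_logit = to_logit(score_fn(ref_seq))
    effects: Dict[Tuple[int, str, str], float] = {}
    for pos, ref_base in enumerate(ref_seq):
        for alt in BASES:
            if alt == ref_base:
                continue
            var_seq = ref_seq[:pos] + alt + ref_seq[pos + 1:]
            effects[(pos, ref_base, alt)] = to_logit(score_fn(var_seq)) - ref_logit
    return effects


if __name__ == "__main__":
    effects = saturation_mutagenesis("CCGGAATAAACCGGTGTCC")
    worst = min(effects, key=effects.get)
    print("most disruptive variant:", worst, round(effects[worst], 3))
```

Swapping `score_fn` for a real model's prediction function would leave the mutagenesis loop unchanged; only the scorer carries the biology.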

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2360/9636789/0413375fd310/13059_2022_2799_Fig1_HTML.jpg
