Known sequence features explain half of all human gene ends.

Authors

Shkurin Aleksei, Pour Sara E, Hughes Timothy R

Affiliations

Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada.

Terrence Donnelly Centre for Cellular & Biomolecular Research, Toronto, ON M5S 3E1, Canada.

Publication information

NAR Genom Bioinform. 2023 Apr 5;5(2):lqad031. doi: 10.1093/nargab/lqad031. eCollection 2023 Jun.

DOI: 10.1093/nargab/lqad031
PMID: 37035540
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10072996/

Abstract

Cleavage and polyadenylation (CPA) sites define eukaryotic gene ends. CPA sites are associated with five key sequence recognition elements: the upstream UGUA, the polyadenylation signal (PAS), and U-rich sequences; the CA/UA dinucleotide where cleavage occurs; and GU-rich downstream elements (DSEs). Currently, it is not clear whether these sequences are sufficient to delineate CPA sites. Additionally, numerous other sequences and factors have been described, often in the context of promoting alternative CPA sites and preventing cryptic CPA site usage. Here, we dissect the contributions of individual sequence features to CPA using standard discriminative models. We show that models comprised only of the five primary CPA sequence features give highest probability scores to constitutive CPA sites at the ends of coding genes, relative to the entire pre-mRNA sequence, for 59% of all human genes. U1-hybridizing sequences provide a small boost in performance. The addition of all known RBP RNA binding motifs to the model increases this figure to only 61%, suggesting that additional factors beyond the core CPA machinery have a minimal role in delineating real from cryptic sites. To our knowledge, this high effectiveness of established features to predict human gene ends has not previously been documented.
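
The per-gene evaluation described in the abstract (does the true gene-end CPA site receive the highest score anywhere along the pre-mRNA?) can be illustrated with a toy sketch. This is not the authors' model: the logistic form, the weights, the window sizes, and the motif strings (AAUAAA/AUUAAA for the PAS, UUUU for U-rich sequence, GUGU/UGUG for the DSE) are assumptions chosen only for illustration, and all function names are hypothetical.

```python
import math

# Hypothetical weights and bias: illustrative values, NOT fitted parameters.
WEIGHTS = {"ugua": 1.0, "pas": 3.0, "u_rich": 0.5, "ca": 0.5, "dse": 1.0}
BIAS = -4.0


def features(rna, pos):
    """Binary presence of the five core elements around candidate cleavage
    position `pos` (0-based index of the first nucleotide after the cut)."""
    upstream = rna[max(0, pos - 60):pos]   # assumed 60-nt upstream window
    pas_zone = rna[max(0, pos - 35):pos]   # assumed 35-nt PAS window
    downstream = rna[pos:pos + 40]         # assumed 40-nt downstream window
    return {
        "ugua": "UGUA" in upstream,
        "pas": "AAUAAA" in pas_zone or "AUUAAA" in pas_zone,
        "u_rich": "UUUU" in upstream,
        "ca": rna[max(0, pos - 2):pos] in ("CA", "UA"),
        "dse": "GUGU" in downstream or "UGUG" in downstream,
    }


def score(rna, pos):
    """Logistic score in (0, 1) for a candidate position."""
    z = BIAS + sum(WEIGHTS[name] for name, hit in features(rna, pos).items() if hit)
    return 1.0 / (1.0 + math.exp(-z))


def best_site(rna):
    """Position with the highest score over the whole sequence."""
    return max(range(2, len(rna)), key=lambda p: score(rna, p))


def top1_fraction(genes, tolerance=0):
    """Fraction of (rna, true_pos) pairs whose top-scoring position lies
    within `tolerance` nucleotides of the annotated site."""
    hits = sum(abs(best_site(rna) - true_pos) <= tolerance for rna, true_pos in genes)
    return hits / len(genes)


# Toy sequence with every element placed around a single cut after "CA".
toy = ("G" * 10 + "UGUA" + "CCCC" + "UUUU" + "CC" + "AAUAAA"
       + "C" * 15 + "CA" + "GG" + "GUGU" + "C" * 20)
print(best_site(toy), top1_fraction([(toy, 47)]))
```

In this framing, the 59% and 61% figures reported in the abstract would correspond to `top1_fraction` computed over all genes, with a fitted model in place of the hand-set weights.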


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f6bd/10072996/1269e3b70136/lqad031fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f6bd/10072996/8e32de8985af/lqad031fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f6bd/10072996/880b313213e5/lqad031fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f6bd/10072996/9927451de82e/lqad031fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f6bd/10072996/30118bcb479b/lqad031fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f6bd/10072996/87fca4dfc190/lqad031fig6.jpg
