Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence.

Authors

Boas Pucker, Daniela Holtgräwe, Bernd Weisshaar

Affiliation

Faculty of Biology & Center for Biotechnology, Bielefeld University, Bielefeld, Germany.

Publication

BMC Res Notes. 2017 Dec 4;10(1):667. doi: 10.1186/s13104-017-2985-y.

PMID: 29202864
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5716242/

Abstract

OBJECTIVE

The Arabidopsis thaliana Niederzenz-1 (Nd-1) genome sequence was recently published with an ab initio gene prediction. In-depth analysis of the predicted gene set revealed errors in genes whose introns have non-canonical splice sites. Since non-canonical splice sites are difficult to predict ab initio, we explored ways to improve the annotation by transferring annotation information from Araport11, the recently released annotation of the Columbia-0 (Col-0) reference genome sequence.
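
To make the term concrete: an intron is usually called canonical when it starts with GT and ends with AG on the coding strand, while other combinations such as GC-AG or AT-AC are non-canonical. The Python sketch below is illustrative only and is not the authors' pipeline. The function names and the toy sequences are hypothetical, and the code assumes 0-based, end-exclusive intron coordinates on the forward strand.

```python
# Canonical donor/acceptor dinucleotides on the coding strand.
CANONICAL = ("GT", "AG")


def splice_site_pair(genome_seq, intron_start, intron_end):
    """Return the (donor, acceptor) dinucleotides of one intron.

    Assumes 0-based, end-exclusive coordinates on the forward strand.
    """
    intron = genome_seq[intron_start:intron_end].upper()
    if len(intron) < 4:
        raise ValueError("intron too short to carry two splice site dinucleotides")
    return intron[:2], intron[-2:]


def is_canonical(genome_seq, intron_start, intron_end):
    """True if the intron borders are GT...AG, False otherwise."""
    return splice_site_pair(genome_seq, intron_start, intron_end) == CANONICAL


# Toy sequences (hypothetical): exon + intron + exon.
gt_ag_gene = "ATGG" + "GTAAGTTTCAG" + "GCC"
gc_ag_gene = "ATGG" + "GCAAGTTTCAG" + "GCC"

print(splice_site_pair(gt_ag_gene, 4, 15), is_canonical(gt_ag_gene, 4, 15))
print(splice_site_pair(gc_ag_gene, 4, 15), is_canonical(gc_ag_gene, 4, 15))
```

A gene predictor that only accepts the GT-AG pattern cannot place the second intron correctly, which is the kind of error that external hints are meant to repair.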

RESULTS

Incorporation of hints generated from Araport11 enabled the precise prediction of non-canonical splice sites. Manual inspection of RNA-Seq read mappings and RT-PCR were used to validate the structural annotations of non-canonical splice sites. Predictions of untranslated regions were also updated using Araport11, which was generated from high-coverage RNA-Seq data. The improved gene set of the Nd-1 genome assembly (GeneSet_Nd-1_v1.1) was evaluated by comparison to the initial gene prediction (GeneSet_Nd-1_v1.0) and to Araport11 for the Col-0 reference genome sequence. GeneSet_Nd-1_v1.1 contains previously missed non-canonical splice sites in 1256 genes. Reciprocal best hits for 24,527 (89.4%) of all nuclear Col-0 genes against GeneSet_Nd-1_v1.1 indicate a high gene prediction quality.
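
The reciprocal best hit criterion used for the evaluation can be sketched as follows: a Col-0 gene and an Nd-1 gene form a pair only if each is the other's top-scoring match. This is a minimal Python sketch under stated assumptions, not the authors' implementation. The input format (tuples of query, subject, score, as could be parsed from a tabular sequence similarity search) and all gene identifiers are hypothetical.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, score) tuples.
    On ties, the first hit seen is kept.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical identifiers and scores, for illustration only.
col_vs_nd = [
    ("col_gene_1", "nd_gene_1", 200.0),
    ("col_gene_1", "nd_gene_2", 150.0),
    ("col_gene_2", "nd_gene_2", 90.0),
]
nd_vs_col = [
    ("nd_gene_1", "col_gene_1", 198.0),
    ("nd_gene_2", "col_gene_1", 140.0),
    ("nd_gene_2", "col_gene_2", 80.0),
]

pairs = reciprocal_best_hits(col_vs_nd, nd_vs_col)
print(pairs)
print(f"{len(pairs)} of 2 query genes have a reciprocal best hit")
```

In this toy input, col_gene_2 has no reciprocal partner because its best hit, nd_gene_2, matches col_gene_1 better. The share of reference genes that do find a partner is the kind of figure the abstract reports as 89.4%.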

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3636/5716242/dca04c38e8bc/13104_2017_2985_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3636/5716242/12cc6a36c1a8/13104_2017_2985_Fig2_HTML.jpg

Similar articles

1
Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence.
BMC Res Notes. 2017 Dec 4;10(1):667. doi: 10.1186/s13104-017-2985-y.
2
A De Novo Genome Sequence Assembly of the Arabidopsis thaliana Accession Niederzenz-1 Displays Presence/Absence Variation and Strong Synteny.
PLoS One. 2016 Oct 6;11(10):e0164321. doi: 10.1371/journal.pone.0164321. eCollection 2016.
3
Incorporation of splice site probability models for non-canonical introns improves gene structure prediction in plants.
Bioinformatics. 2005 Nov 1;21 Suppl 3:iii20-30. doi: 10.1093/bioinformatics/bti1205.
4
Read-Split-Run: an improved bioinformatics pipeline for identification of genome-wide non-canonical spliced regions using RNA-Seq data.
BMC Genomics. 2016 Aug 22;17 Suppl 7(Suppl 7):503. doi: 10.1186/s12864-016-2896-7.
5
Prediction of locally optimal splice sites in plant pre-mRNA with applications to gene identification in Arabidopsis thaliana genomic DNA.
Nucleic Acids Res. 1998 Oct 15;26(20):4748-57. doi: 10.1093/nar/26.20.4748.
6
Long-Read Annotation: Automated Eukaryotic Genome Annotation Based on Long-Read cDNA Sequencing.
Plant Physiol. 2019 Jan;179(1):38-54. doi: 10.1104/pp.18.00848. Epub 2018 Nov 6.
7
Genome-wide analyses supported by RNA-Seq reveal non-canonical splice sites in plant genomes.
BMC Genomics. 2018 Dec 29;19(1):980. doi: 10.1186/s12864-018-5360-z.
8
Mining Arabidopsis thaliana RNA-seq data with Integrated Genome Browser reveals stress-induced alternative splicing of the putative splicing regulator SR45a.
Am J Bot. 2012 Feb;99(2):219-31. doi: 10.3732/ajb.1100355. Epub 2012 Jan 30.
9
CodingQuarry: highly accurate hidden Markov model gene prediction in fungal genomes using RNA-seq transcripts.
BMC Genomics. 2015 Mar 11;16(1):170. doi: 10.1186/s12864-015-1344-4.
10
Araport11: a complete reannotation of the Arabidopsis thaliana reference genome.
Plant J. 2017 Feb;89(4):789-804. doi: 10.1111/tpj.13415. Epub 2017 Feb 10.

Cited by

1
NAVIP: Unraveling the influence of neighboring small sequence variants on functional impact prediction.
PLoS Comput Biol. 2025 Feb 18;21(2):e1012732. doi: 10.1371/journal.pcbi.1012732. eCollection 2025 Feb.
2
DRANetSplicer: A Splice Site Prediction Model Based on Deep Residual Attention Networks.
Genes (Basel). 2024 Mar 26;15(4):404. doi: 10.3390/genes15040404.
3
Identification of annotation artifacts concerning the chalcone synthase (CHS).
BMC Res Notes. 2023 Jun 20;16(1):109. doi: 10.1186/s13104-023-06386-z.
4
Apiaceae FNS I originated from F3H through tandem gene duplication.
PLoS One. 2023 Jan 19;18(1):e0280155. doi: 10.1371/journal.pone.0280155. eCollection 2023.
5
Mapping-by-Sequencing Reveals Genomic Regions Associated with Seed Quality Parameters in .
Genes (Basel). 2022 Jun 23;13(7):1131. doi: 10.3390/genes13071131.
6
Spliceator: multi-species splice site prediction using convolutional neural networks.
BMC Bioinformatics. 2021 Nov 23;22(1):561. doi: 10.1186/s12859-021-04471-3.
7
Characterization of the Flavonol Synthase Gene Family Reveals Bifunctional Flavonol Synthases.
Front Plant Sci. 2021 Oct 13;12:733762. doi: 10.3389/fpls.2021.733762. eCollection 2021.
8
Rapid protein evolution, organellar reductions, and invasive intronic elements in the marine aerobic parasite dinoflagellate Amoebophrya spp.
BMC Biol. 2021 Jan 6;19(1):1. doi: 10.1186/s12915-020-00927-9.
9
The reuse of public datasets in the life sciences: potential risks and rewards.
PeerJ. 2020 Sep 22;8:e9954. doi: 10.7717/peerj.9954. eCollection 2020.
10
High Contiguity De Novo Genome Sequence Assembly of Trifoliate Yam () Using Long Read Sequencing.
Genes (Basel). 2020 Mar 4;11(3):274. doi: 10.3390/genes11030274.

References

1
Araport11: a complete reannotation of the Arabidopsis thaliana reference genome.
Plant J. 2017 Feb;89(4):789-804. doi: 10.1111/tpj.13415. Epub 2017 Feb 10.
2
A De Novo Genome Sequence Assembly of the Arabidopsis thaliana Accession Niederzenz-1 Displays Presence/Absence Variation and Strong Synteny.
PLoS One. 2016 Oct 6;11(10):e0164321. doi: 10.1371/journal.pone.0164321. eCollection 2016.
3
FORGETTER1 mediates stress-induced chromatin memory through nucleosome remodeling.
Elife. 2016 Sep 28;5:e17061. doi: 10.7554/eLife.17061.
4
Well-characterized sequence features of eukaryote genomes and implications for ab initio gene prediction.
Comput Struct Biotechnol J. 2016 Jul 27;14:298-303. doi: 10.1016/j.csbj.2016.07.002. eCollection 2016.
5
Endogenous Arabidopsis messenger RNAs transported to distant tissues.
Nat Plants. 2015 Mar 23;1(4):15025. doi: 10.1038/nplants.2015.25.
6
Lessons from non-canonical splicing.
Nat Rev Genet. 2016 Jul;17(7):407-421. doi: 10.1038/nrg.2016.46. Epub 2016 May 31.
7
Using intron position conservation for homology-based gene prediction.
Nucleic Acids Res. 2016 May 19;44(9):e89. doi: 10.1093/nar/gkw092. Epub 2016 Feb 17.
8
BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS.
Bioinformatics. 2016 Mar 1;32(5):767-9. doi: 10.1093/bioinformatics/btv661. Epub 2015 Nov 11.
9
OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy.
Genome Biol. 2015 Aug 6;16(1):157. doi: 10.1186/s13059-015-0721-2.
10
Araport: the Arabidopsis information portal.
Nucleic Acids Res. 2015 Jan;43(Database issue):D1003-9. doi: 10.1093/nar/gku1200. Epub 2014 Nov 20.