Misannotations of rRNA can now generate 90% false positive protein matches in metatranscriptomic studies.

Affiliation

Department of Ocean Sciences, University of California, Santa Cruz, CA 95064, USA.

Publication information

Nucleic Acids Res. 2011 Nov 1;39(20):8792-802. doi: 10.1093/nar/gkr576. Epub 2011 Jul 19.

DOI:10.1093/nar/gkr576
PMID:21771858
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3203614/
Abstract

In the course of analyzing 9,522,746 pyrosequencing reads from 23 stations in the Southwestern Pacific and equatorial Atlantic oceans, it came to our attention that misannotations of rRNA as proteins is now so widespread that false positive matching of rRNA pyrosequencing reads to the National Center for Biotechnology Information (NCBI) non-redundant protein database approaches 90%. One conserved portion of 23S rRNA was consistently misannotated often enough to prompt curators at Pfam to create a spurious protein family. Detailed examination of the annotation history of each seed sequence in the spurious Pfam protein family (PF10695, 'Cw-hydrolase') uncovered issues in the standard operating procedures and quality assurance programs of major sequencing centers, and other issues relating to the curation practices of those managing public databases such as GenBank and SwissProt. We offer recommendations for all these issues, and recommend as well that workers in the field of metatranscriptomics take extra care to avoid including false positive matches in their datasets.
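
The precaution recommended in the abstract can be illustrated with a minimal sketch. It assumes that reads have already been screened against an rRNA reference by some separate nucleotide-level step, and that protein database hits are available as simple tuples. The record layout and the function names `split_hits_by_rrna` and `rrna_false_positive_fraction` are hypothetical and are not taken from the paper; this is not the authors' pipeline.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical record layout: (read_id, protein_subject_id, e_value)
ProteinHit = Tuple[str, str, float]


def split_hits_by_rrna(
    protein_hits: Iterable[ProteinHit], rrna_read_ids: Set[str]
) -> Tuple[List[ProteinHit], List[ProteinHit]]:
    """Split protein hits into (kept, suspect).

    A hit is 'suspect' when its read was independently flagged as rRNA,
    so its protein match is likely a false positive.
    """
    kept: List[ProteinHit] = []
    suspect: List[ProteinHit] = []
    for hit in protein_hits:
        read_id = hit[0]
        if read_id in rrna_read_ids:
            suspect.append(hit)
        else:
            kept.append(hit)
    return kept, suspect


def rrna_false_positive_fraction(
    protein_hits: Iterable[ProteinHit], rrna_read_ids: Set[str]
) -> float:
    """Fraction of rRNA-flagged reads that also received a protein match."""
    if not rrna_read_ids:
        return 0.0
    matched = {hit[0] for hit in protein_hits if hit[0] in rrna_read_ids}
    return len(matched) / len(rrna_read_ids)


# Toy data, invented for illustration only.
hits = [
    ("r1", "protA", 1e-10),
    ("r3", "protB", 1e-20),
    ("r1", "protC", 1e-5),
]
rrna_ids = {"r1", "r2"}

kept, suspect = split_hits_by_rrna(hits, rrna_ids)
fraction = rrna_false_positive_fraction(hits, rrna_ids)
```

In this toy input, one of the two rRNA-flagged reads has protein matches, so both of its hits land in `suspect` and would be excluded from the protein dataset.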


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e45f/3203614/5685217e0fb0/gkr576f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e45f/3203614/28f78b452964/gkr576f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e45f/3203614/e53a392cf0d9/gkr576f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e45f/3203614/043e6fc42fa2/gkr576f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e45f/3203614/750e2bc95f61/gkr576f4.jpg
