Mixing genome annotation methods in a comparative analysis inflates the apparent number of lineage-specific genes.

Affiliations

Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, South Drive, Princeton, NJ 08540, USA.

Department of Molecular & Cellular Biology, Harvard University, Divinity Avenue, Cambridge, MA 02138, USA.

Publication information

Curr Biol. 2022 Jun 20;32(12):2632-2639.e2. doi: 10.1016/j.cub.2022.04.085. Epub 2022 May 18.

DOI:10.1016/j.cub.2022.04.085
PMID:35588743
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9346927/
Abstract

Comparisons of genomes of different species are used to identify lineage-specific genes, those genes that appear unique to one species or clade. Lineage-specific genes are often thought to represent genetic novelty that underlies unique adaptations. Identification of these genes depends not only on genome sequences, but also on inferred gene annotations. Comparative analyses typically use available genomes that have been annotated using different methods, increasing the risk that orthologous DNA sequences may be erroneously annotated as a gene in one species but not another, appearing lineage specific as a result. To evaluate the impact of such "annotation heterogeneity," we identified four clades of species with sequenced genomes with more than one publicly available gene annotation, allowing us to compare the number of lineage-specific genes inferred when differing annotation methods are used to those resulting when annotation method is uniform across the clade. In these case studies, annotation heterogeneity increases the apparent number of lineage-specific genes by up to 15-fold, suggesting that annotation heterogeneity is a substantial source of potential artifact.
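The comparison described in the abstract can be illustrated with a toy sketch. It is not the authors' pipeline: the gene and locus names, the `lineage_specific` helper, and both annotation sets are hypothetical, and homology is assumed to be already known. A focal gene counts as lineage specific when none of its homologous outgroup loci is annotated as a gene, so an annotation method that skips some loci inflates the count.

```python
from typing import Dict, Set


def lineage_specific(focal_genes: Set[str],
                     homolog_loci: Dict[str, Set[str]],
                     outgroup_annotated: Set[str]) -> Set[str]:
    """Return focal genes with no homologous locus annotated as a gene in the outgroup."""
    return {
        gene for gene in focal_genes
        if not (homolog_loci.get(gene, set()) & outgroup_annotated)
    }


# Hypothetical focal-species genes and their homologous loci in an outgroup genome.
focal_genes = {"g1", "g2", "g3", "g4", "g5", "g6"}
homolog_loci = {
    "g1": {"o1"}, "g2": {"o2"}, "g3": {"o3"},
    "g4": {"o4"}, "g5": {"o5"},
    # g6 has no homologous locus at all in the outgroup.
}

# Uniform annotation: the outgroup is annotated with the same method as the focal species.
uniform_annotation = {"o1", "o2", "o3", "o4", "o5"}
# Heterogeneous annotation: a different method fails to call loci o3 and o4 as genes.
mixed_annotation = {"o1", "o2", "o5"}

ls_uniform = lineage_specific(focal_genes, homolog_loci, uniform_annotation)
ls_mixed = lineage_specific(focal_genes, homolog_loci, mixed_annotation)

# Fold increase in apparent lineage-specific genes due to annotation heterogeneity.
fold = len(ls_mixed) / len(ls_uniform)
print(sorted(ls_uniform), sorted(ls_mixed), fold)
```

In this toy case only `g6` is lineage specific under uniform annotation, while `g3` and `g4` also appear lineage specific under mixed annotation, although their orthologous sequences exist in the outgroup.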

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/512c/9346927/8d41a7ef53fa/nihms-1824146-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/512c/9346927/81051a5bdc4b/nihms-1824146-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/512c/9346927/a27c665ac485/nihms-1824146-f0002.jpg

Similar articles

1
Mixing genome annotation methods in a comparative analysis inflates the apparent number of lineage-specific genes.
Curr Biol. 2022 Jun 20;32(12):2632-2639.e2. doi: 10.1016/j.cub.2022.04.085. Epub 2022 May 18.
2
Comparative Genome Annotation.
Methods Mol Biol. 2024;2802:165-187. doi: 10.1007/978-1-0716-3838-5_7.
3
Repertoire-wide gene structure analyses: a case study comparing automatically predicted and manually annotated gene models.
BMC Genomics. 2019 Oct 17;20(1):753. doi: 10.1186/s12864-019-6064-8.
4
Comparative Genome Annotation.
Methods Mol Biol. 2018;1704:189-212. doi: 10.1007/978-1-4939-7463-4_6.
5
Comparison of RefSeq protein-coding regions in human and vertebrate genomes.
BMC Genomics. 2013 Sep 25;14:654. doi: 10.1186/1471-2164-14-654.
6
7
OGS2: genome re-annotation of the jewel wasp Nasonia vitripennis.
BMC Genomics. 2016 Aug 25;17(1):678. doi: 10.1186/s12864-016-2886-9.
8
Gene annotation errors are common in the mammalian mitochondrial genomes database.
BMC Genomics. 2019 Jan 22;20(1):73. doi: 10.1186/s12864-019-5447-1.
9
Reannotation and extended community resources for the genome of the non-seed plant Physcomitrella patens provide insights into the evolution of plant gene structures and functions.
BMC Genomics. 2013 Jul 23;14:498. doi: 10.1186/1471-2164-14-498.
10
RNA-seq-Based Gene Annotation and Comparative Genomics of Four Fungal Grass Pathogens in the Genus Zymoseptoria Identify Novel Orphan Genes and Species-Specific Invasions of Transposable Elements.
G3 (Bethesda). 2015 Apr 27;5(7):1323-33. doi: 10.1534/g3.115.017731.

Cited by

1
EASYstrata: an all-in-one workflow for genome annotation and genomic divergence analysis.
NAR Genom Bioinform. 2025 Aug 27;7(3):lqaf110. doi: 10.1093/nargab/lqaf110. eCollection 2025 Sep.
2
Genome of the Myiasis-Causing Fly Chrysomya bezziana, the Old-World Screwworm.
Genome Biol Evol. 2025 Jul 30;17(8). doi: 10.1093/gbe/evaf121.
3
Annotation matters: the effect of structural gene annotation on orthology inference.
Bioinformatics. 2025 Jul 1;41(7). doi: 10.1093/bioinformatics/btaf365.
4
Predicting Protein Function in the AI and Big Data Era.
Biochemistry. 2025 Jun 3;64(11):2345-2352. doi: 10.1021/acs.biochem.5c00186. Epub 2025 May 17.
5
Gene novelty and gene family expansion in the early evolution of Lepidoptera.
BMC Genomics. 2025 Feb 19;26(1):161. doi: 10.1186/s12864-025-11338-x.
6
Convergent Evolution and Predictability of Gene Copy Numbers Associated with Diets in Mammals.
Genome Biol Evol. 2025 Feb 3;17(2). doi: 10.1093/gbe/evaf008.
7
MATEdb2, a Collection of High-Quality Metazoan Proteomes across the Animal Tree of Life to Speed Up Phylogenomic Studies.
Genome Biol Evol. 2024 Nov 1;16(11). doi: 10.1093/gbe/evae235.
8
Orphan genes are not a distinct biological entity.
Bioessays. 2025 Jan;47(1):e2400146. doi: 10.1002/bies.202400146. Epub 2024 Nov 3.
9
The Highly Repetitive Genome of Myxobolus rasmusseni, an Emerging Myxozoan Parasite of Fathead Minnows.
Genome Biol Evol. 2024 Nov 1;16(11). doi: 10.1093/gbe/evae220.
10
Regulatory genome annotation of 33 insect species.
Elife. 2024 Oct 11;13:RP96738. doi: 10.7554/eLife.96738.

References

1
Universal and taxon-specific trends in protein sequences as a function of age.
Elife. 2021 Jan 8;10:e57347. doi: 10.7554/eLife.57347.
2
Ensembl 2021.
Nucleic Acids Res. 2021 Jan 8;49(D1):D884-D891. doi: 10.1093/nar/gkaa942.
3
Many, but not all, lineage-specific genes can be explained by homology detection failure.
PLoS Biol. 2020 Nov 2;18(11):e3000862. doi: 10.1371/journal.pbio.3000862. eCollection 2020 Nov.
4
Only a Single Taxonomically Restricted Gene Family in the Drosophila melanogaster Subgroup Can Be Identified with High Confidence.
Genome Biol Evol. 2020 Aug 1;12(8):1355-1366. doi: 10.1093/gbe/evaa127.
5
De novo gene birth.
PLoS Genet. 2019 May 23;15(5):e1008160. doi: 10.1371/journal.pgen.1008160. eCollection 2019 May.
6
A Shift in Aggregation Avoidance Strategy Marks a Long-Term Direction to Protein Evolution.
Genetics. 2019 Apr;211(4):1345-1355. doi: 10.1534/genetics.118.301719. Epub 2019 Jan 28.
7
From De Novo to "De Nono": The Majority of Novel Protein-Coding Genes Identified with Phylostratigraphy Are Old Genes or Recent Duplicates.
Genome Biol Evol. 2018 Nov 1;10(11):2906-2918. doi: 10.1093/gbe/evy231.
8
Gene Birth Contributes to Structural Disorder Encoded by Overlapping Genes.
Genetics. 2018 Sep;210(1):303-313. doi: 10.1534/genetics.118.301249. Epub 2018 Jul 19.
9
A Molecular Portrait of De Novo Genes in Yeasts.
Mol Biol Evol. 2018 Mar 1;35(3):631-645. doi: 10.1093/molbev/msx315.
10
Young Genes are Highly Disordered as Predicted by the Preadaptation Hypothesis of Gene Birth.
Nat Ecol Evol. 2017 Jun;1(6):0146-146. doi: 10.1038/s41559-017-0146. Epub 2017 Apr 24.