Genome annotation: From human genetics to biodiversity genomics.

Author information

Roderic Guigó

Affiliations

Bioinformatics and Genomics, Center for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology (BIST), Dr. Aiguader 88, 08003 Barcelona, Catalonia.

Universitat Pompeu Fabra (UPF), Barcelona, Catalonia.

Publication information

Cell Genom. 2023 Aug 1;3(8):100375. doi: 10.1016/j.xgen.2023.100375. eCollection 2023 Aug 9.

DOI: 10.1016/j.xgen.2023.100375
PMID: 37601977
Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10435374/

Abstract

Within the next decade, the genomes of 1.8 million eukaryotic species will be sequenced. Identifying genes in these sequences is essential to understand the biology of the species. This is challenging due to the transcriptional complexity of eukaryotic genomes, which encode hundreds of thousands of transcripts of multiple types. Among these, a small set of protein-coding mRNAs play a disproportionately large role in defining phenotypes. Due to their sequence conservation, orthology can be established, making it possible to define the universal catalog of eukaryotic protein-coding genes. This catalog should substantially contribute to uncovering the genomic events underlying the emergence of eukaryotic phenotypes. This piece briefly reviews the basics of protein-coding gene prediction, discusses challenges in finalizing annotation of the human genome, and proposes strategies for producing annotations across the eukaryotic Tree of Life. This lays the groundwork for obtaining the catalog of all genes: the Earth's code of life.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/434c/10435374/90af7bea09c3/fx1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/434c/10435374/4dac3001f094/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/434c/10435374/1bbd97f986bc/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/434c/10435374/5ccf3fc0bbad/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/434c/10435374/6b5f7d49c706/gr4.jpg
