
BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database.

Authors

Brůna Tomáš, Hoff Katharina J, Lomsadze Alexandre, Stanke Mario, Borodovsky Mark

Affiliations

School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA.

Institute of Mathematics and Computer Science, University of Greifswald, 17489 Greifswald, Germany.

Publication information

NAR Genom Bioinform. 2021 Jan 6;3(1):lqaa108. doi: 10.1093/nargab/lqaa108. eCollection 2021 Mar.

DOI:10.1093/nargab/lqaa108
PMID:33575650
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7787252/
Abstract

The task of eukaryotic genome annotation remains challenging. Only a few genomes could serve as standards of annotation achieved through a tremendous investment of human curation efforts. Still, the correctness of all alternative isoforms, even in the best-annotated genomes, could be a good subject for further investigation. The new BRAKER2 pipeline generates and integrates external protein support into the iterative process of training and gene prediction by GeneMark-EP+ and AUGUSTUS. BRAKER2 continues the line started by BRAKER1 where self-training GeneMark-ET and AUGUSTUS made gene predictions supported by transcriptomic data. Among the challenges addressed by the new pipeline was a generation of reliable hints to protein-coding exon boundaries from likely homologous but evolutionarily distant proteins. In comparison with other pipelines for eukaryotic genome annotation, BRAKER2 is fully automatic. It is favorably compared under equal conditions with other pipelines, e.g. MAKER2, in terms of accuracy and performance. Development of BRAKER2 should facilitate solving the task of harmonization of annotation of protein-coding genes in genomes of different eukaryotic species. However, we fully understand that several more innovations are needed in transcriptomic and proteomic technologies as well as in algorithmic development to reach the goal of highly accurate annotation of eukaryotic genomes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba93/7787252/9f11e185090c/lqaa108fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba93/7787252/53af2c7917cf/lqaa108fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba93/7787252/28417e40b2fb/lqaa108fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba93/7787252/24082f3dfd31/lqaa108fig4.jpg

Similar articles

1
BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database.
NAR Genom Bioinform. 2021 Jan 6;3(1):lqaa108. doi: 10.1093/nargab/lqaa108. eCollection 2021 Mar.
2
BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS.
Bioinformatics. 2016 Mar 1;32(5):767-9. doi: 10.1093/bioinformatics/btv661. Epub 2015 Nov 11.
3
BRAKER3: Fully automated genome annotation using RNA-seq and protein evidence with GeneMark-ETP, AUGUSTUS and TSEBRA.
bioRxiv. 2024 Feb 29:2023.06.10.544449. doi: 10.1101/2023.06.10.544449.
4
BRAKER3: Fully automated genome annotation using RNA-seq and protein evidence with GeneMark-ETP, AUGUSTUS, and TSEBRA.
Genome Res. 2024 Jun 25;34(5):769-777. doi: 10.1101/gr.278090.123.
5
TSEBRA: transcript selector for BRAKER.
BMC Bioinformatics. 2021 Nov 25;22(1):566. doi: 10.1186/s12859-021-04482-0.
6
Whole-Genome Annotation with BRAKER.
Methods Mol Biol. 2019;1962:65-95. doi: 10.1007/978-1-4939-9173-0_5.
7
GeneMark-EP+: eukaryotic gene prediction with self-training in the space of genes and proteins.
NAR Genom Bioinform. 2020 Jun;2(2):lqaa026. doi: 10.1093/nargab/lqaa026. Epub 2020 May 13.
8
Comparative Analysis of Annotation Pipelines Using the First Japanese White-Eye (Zosterops japonicus) Genome.
Genome Biol Evol. 2021 May 7;13(5). doi: 10.1093/gbe/evab063.
9
GeneMark-ETP significantly improves the accuracy of automatic annotation of large eukaryotic genomes.
Genome Res. 2024 Jun 25;34(5):757-768. doi: 10.1101/gr.278373.123.
10
A new gene finding tool GeneMark-ETP significantly improves the accuracy of automatic annotation of large eukaryotic genomes.
bioRxiv. 2024 Apr 17:2023.01.13.524024. doi: 10.1101/2023.01.13.524024.

Cited by

1
A High-Quality Chromosome-Level Genome Assembly and Comparative Analyses Provide Insights into the Adaptation of (Fabricius, 1794) (Diptera: Calliphoridae).
Biology (Basel). 2025 Jul 22;14(8):913. doi: 10.3390/biology14080913.
2
Chromosome-level assembly of cv. 'Tokiwa' as a reference genome of Japanese cucumber.
Breed Sci. 2025 Apr;75(2):85-92. doi: 10.1270/jsbbs.24066. Epub 2025 Mar 27.
3
1-Aminocyclopropane-1-carboxylic acid induces resource reallocation in sporophytes.
Front Plant Sci. 2025 Aug 15;16:1632530. doi: 10.3389/fpls.2025.1632530. eCollection 2025.
4
The chromosomal genome sequence of the giant barrel sponge, Schmidt 1870 and its associated microbial metagenome sequences.
Wellcome Open Res. 2025 Jul 8;10:336. doi: 10.12688/wellcomeopenres.24173.1. eCollection 2025.
5
A chromosomal-level genome assembly of Omiodes indicata Fabricius (Lepidoptera: Crambidae).
Sci Data. 2025 Aug 29;12(1):1514. doi: 10.1038/s41597-025-05644-y.
6
Gap-free comparative genomics uncover virulence factors for Fusarium wilt of watermelons.
PLoS Pathog. 2025 Aug 25;21(8):e1013455. doi: 10.1371/journal.ppat.1013455. eCollection 2025 Aug.
7
Chromosome-level haplotype-resolved genome assembly provides insights into the highly heterozygous genome of Italian ryegrass (Lolium multiflorum Lam.).
Plant Genome. 2025 Sep;18(3):e70079. doi: 10.1002/tpg2.70079.
8
Genetic basis for broad interspecific compatibility in Solanum verrucosum.
Plant J. 2025 Aug;123(4):e70426. doi: 10.1111/tpj.70426.
9
Better together: Subgenomes for allotetraploid potato wild relative Solanum acaule Bitt. reveal origins in Petota Clade 3 and 4.
Plant Genome. 2025 Sep;18(3):e70095. doi: 10.1002/tpg2.70095.
10
SeqForge: A scalable platform for alignment-based searches, motif detection, and sequence curation across meta/genomic datasets.
bioRxiv. 2025 Aug 15:2025.08.12.669971. doi: 10.1101/2025.08.12.669971.

References

1
GeneMark-EP+: eukaryotic gene prediction with self-training in the space of genes and proteins.
NAR Genom Bioinform. 2020 Jun;2(2):lqaa026. doi: 10.1093/nargab/lqaa026. Epub 2020 May 13.
2
VARUS: sampling complementary RNA reads from the sequence read archive.
BMC Bioinformatics. 2019 Nov 8;20(1):558. doi: 10.1186/s12859-019-3182-x.
3
Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype.
Nat Biotechnol. 2019 Aug;37(8):907-915. doi: 10.1038/s41587-019-0201-4. Epub 2019 Aug 2.
4
BUSCO: Assessing Genome Assembly and Annotation Completeness.
Methods Mol Biol. 2019;1962:227-245. doi: 10.1007/978-1-4939-9173-0_14.
5
EuGene: An Automated Integrative Gene Finder for Eukaryotes and Prokaryotes.
Methods Mol Biol. 2019;1962:97-120. doi: 10.1007/978-1-4939-9173-0_6.
6
Genomic insights into multidrug-resistance, mating and virulence in Candida auris and related emerging species.
Nat Commun. 2018 Dec 17;9(1):5346. doi: 10.1038/s41467-018-07779-6.
7
OrthoDB v10: sampling the diversity of animal, plant, fungal, protist, bacterial and viral genomes for evolutionary and functional annotations of orthologs.
Nucleic Acids Res. 2019 Jan 8;47(D1):D807-D811. doi: 10.1093/nar/gky1053.
8
Combining RNA-seq data and homology-based gene prediction for plants, animals and fungi.
BMC Bioinformatics. 2018 May 30;19(1):189. doi: 10.1186/s12859-018-2203-5.
9
Earth BioGenome Project: Sequencing life for the future of life.
Proc Natl Acad Sci U S A. 2018 Apr 24;115(17):4325-4333. doi: 10.1073/pnas.1720115115.
10
BUSCO Applications from Quality Assessments to Gene Prediction and Phylogenomics.
Mol Biol Evol. 2018 Mar 1;35(3):543-548. doi: 10.1093/molbev/msx319.