RGAAT: A Reference-based Genome Assembly and Annotation Tool for New Genomes and Upgrade of Known Genomes.

Affiliations

CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; Joint Center for Genomics Research (JCGR), King Abdulaziz City for Science and Technology and Chinese Academy of Sciences, Riyadh 11442, Saudi Arabia; Grail Scientific Co. Ltd., Shenyang 110000, China.

CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China.

Publication information

Genomics Proteomics Bioinformatics. 2018 Oct;16(5):373-381. doi: 10.1016/j.gpb.2018.03.006. Epub 2018 Dec 21.


DOI:10.1016/j.gpb.2018.03.006
PMID:30583062
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6364042/
Abstract

The rapid development of high-throughput sequencing technologies has dramatically reduced the money and time required for de novo genome sequencing and genome resequencing projects, and new genome sequences are released every week. The resulting plethora of updated genome assemblies creates a need for version-specific annotation files and other compatible public datasets for downstream analysis. To handle these tasks efficiently, we developed the reference-based genome assembly and annotation tool (RGAAT), a flexible toolkit for resequencing-based consensus building and annotation update. On the four DNA-seq datasets tested in this study, RGAAT detects sequence variants with precision, specificity, and sensitivity comparable to GATK, and with higher precision and specificity than Freebayes and SAMtools. RGAAT can also identify sequence variants from cross-cultivar or cross-version genomic alignments. Unlike GATK and SAMtools/BCFtools, RGAAT builds the consensus sequence by taking the true allele frequency into account. Finally, RGAAT uses the sequence variants to generate a coordinate conversion file between the reference and query genomes, and supports annotation file transfer. Compared to the rapid annotation transfer tool (RATT), RGAAT performs better at annotation transfer between different genome assemblies, strains, and species. In addition, RGAAT can be used for genome modification, genome comparison, and coordinate conversion. RGAAT is available at https://sourceforge.net/projects/rgaat/ and https://github.com/wushyer/RGAAT_v2 at no cost.
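The abstract says a coordinate conversion file can be derived from sequence variants. The sketch below illustrates the general idea only and is not RGAAT's implementation or file format: it assumes VCF-style variants (1-based position, REF, ALT) that do not overlap and lie on a single sequence. The names `Variant`, `apply_variants`, and `ref_to_query` are hypothetical, introduced for this example.

```python
from typing import List, NamedTuple, Optional


class Variant(NamedTuple):
    """Hypothetical VCF-style record: 1-based position of the first REF base."""
    pos: int
    ref: str
    alt: str


def apply_variants(ref_seq: str, variants: List[Variant]) -> str:
    """Build the query (consensus) sequence by substituting ALT for REF."""
    pieces = []
    cursor = 0  # 0-based index of the next unconsumed reference base
    for v in sorted(variants):
        start = v.pos - 1
        pieces.append(ref_seq[cursor:start])  # unchanged stretch before the variant
        pieces.append(v.alt)                  # variant allele
        cursor = start + len(v.ref)           # skip the replaced REF bases
    pieces.append(ref_seq[cursor:])
    return "".join(pieces)


def ref_to_query(pos: int, variants: List[Variant]) -> Optional[int]:
    """Map a 1-based reference position to the query; None if the base was deleted."""
    shift = 0  # cumulative length change from variants fully upstream of pos
    for v in sorted(variants):
        if pos < v.pos:
            break
        offset = pos - v.pos
        if offset < len(v.ref):
            # Position lies inside this variant's REF span: it survives only
            # if ALT still has a base at the same offset.
            return v.pos + shift + offset if offset < len(v.alt) else None
        shift += len(v.alt) - len(v.ref)
    return pos + shift


# Assumed toy data: a 2-bp deletion at position 5 and a 2-bp insertion at position 10.
reference = "TTTTACGTTATT"
variants = [Variant(5, "ACG", "A"), Variant(10, "A", "ATT")]
query = apply_variants(reference, variants)
mapping = {p: ref_to_query(p, variants) for p in range(1, len(reference) + 1)}
```

In this toy case the deletion shifts downstream coordinates by 2 and the insertion shifts them back, so positions after both variants map to themselves. An annotation feature could be transferred by mapping its start and end through `ref_to_query`; a real tool would also need a policy for features whose boundaries fall inside deleted bases.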


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6993/6364042/9308157ef635/fx3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6993/6364042/6861624621b3/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6993/6364042/b3a9101e85f3/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6993/6364042/6c461ee10ab5/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6993/6364042/a852b0d17812/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6993/6364042/31b0a5fa3a36/fx1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6993/6364042/97d9b14e820d/fx2.jpg
