BioKIT: a versatile toolkit for processing and analyzing diverse types of sequence data.

Authors

Steenwyk Jacob L, Buida Thomas J, Gonçalves Carla, Goltz Dayna C, Morales Grace, Mead Matthew E, LaBella Abigail L, Chavez Christina M, Schmitz Jonathan E, Hadjifrangiskou Maria, Li Yuanning, Rokas Antonis

Affiliations

Department of Biological Sciences, Vanderbilt University, Nashville, TN 37235, USA.

Evolutionary Studies Initiative, Vanderbilt University, Nashville, TN 37235, USA.

Publication information

Genetics. 2022 Jul 4;221(3). doi: 10.1093/genetics/iyac079.

DOI:10.1093/genetics/iyac079
PMID:35536198
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9252278/
Abstract

Bioinformatic analysis (such as genome assembly quality assessment, alignment summary statistics, relative synonymous codon usage, file format conversion, and processing and analysis) is integrated into diverse disciplines in the biological sciences. Several command-line tools have been developed to conduct some of these individual analyses, but unified toolkits that conduct all of them are lacking. To address this gap, we introduce BioKIT, a versatile command-line toolkit that has, upon publication, 42 functions, several of which were community-sourced, that conduct routine and novel processing and analysis of genome assemblies, multiple sequence alignments, coding sequences, sequencing data, and more. To demonstrate the utility of BioKIT, we conducted a comprehensive examination of relative synonymous codon usage across 171 fungal genomes that use alternative genetic codes, showed that the novel metric of gene-wise relative synonymous codon usage can accurately estimate gene-wise codon optimization, evaluated the quality and characteristics of 901 eukaryotic genome assemblies, and calculated alignment summary statistics for 10 phylogenomic data matrices. BioKIT will be helpful in facilitating and streamlining sequence analysis workflows. BioKIT is freely available under the MIT license from GitHub (https://github.com/JLSteenwyk/BioKIT), PyPI (https://pypi.org/project/jlsteenwyk-biokit/), and the Anaconda Cloud (https://anaconda.org/jlsteenwyk/jlsteenwyk-biokit). Documentation, user tutorials, and instructions for requesting new features are available online (https://jlsteenwyk.com/BioKIT).
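
For readers unfamiliar with relative synonymous codon usage (RSCU), the following is a minimal Python sketch of the textbook definition: a codon's observed count divided by the count expected if all synonymous codons for that amino acid were used equally. It is not BioKIT's implementation. The function name `rscu`, the use of the standard genetic code only, and the choice to skip stop codons are assumptions made for illustration. The genomes examined in the abstract use alternative genetic codes, which this sketch does not handle.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
# Assumption: alternative genetic codes are out of scope for this sketch.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds_sequences):
    """Return {codon: RSCU} for in-frame coding sequences (hypothetical helper).

    RSCU = observed codon count / (amino acid total / number of synonymous codons).
    Stop codons and amino acids absent from the input are omitted.
    """
    counts = Counter()
    for seq in cds_sequences:
        seq = seq.upper().replace("U", "T")
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1

    # Group codons into synonymous families by encoded amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid not observed; RSCU undefined
        expected = total / len(codons)
        for c in codons:
            values[c] = counts[c] / expected
    return values


if __name__ == "__main__":
    # Toy input: four alanine codons (3x GCT, 1x GCC) and one methionine codon.
    for codon, value in sorted(rscu(["GCTGCTGCTGCC", "ATG"]).items()):
        print(codon, round(value, 2))
```

Under this definition, a value above 1 means the codon is used more often than expected under equal usage, and a value below 1 means it is used less often.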
