Maździarz Mateusz, Zając Sebastian, Paukszto Łukasz, Sawicki Jakub
Department of Botany and Evolutionary Ecology, University of Warmia and Mazury in Olsztyn, Plac Łódzki 1, 10-719, Olsztyn, Poland.
Decision Analysis and Support Unit, SGH Warsaw School of Economics, Warsaw, Poland.
BMC Bioinformatics. 2025 May 29;26(1):141. doi: 10.1186/s12859-025-06166-5.
Synonymous codon usage bias, a significant factor in gene expression and genome evolution, has been extensively studied in genomics and molecular biology. Although the genetic code is nearly universal, significant variation in synonymous codon usage has been observed among and within organisms. This bias has been linked to various factors, including gene expression levels, tRNA abundance, protein structure, and environmental adaptation. Relative Synonymous Codon Usage (RSCU), a normalized measure, is used to quantify this bias. By analyzing RSCU values, researchers can uncover patterns and trends related to the mechanisms that drive codon usage bias.
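As a rough illustration of the measure itself, RSCU for a codon is commonly defined as its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below computes it for a single coding sequence under stated assumptions: the standard genetic code, stop codons excluded, and amino acids absent from the sequence skipped. The function name `rscu` is hypothetical and is not part of RSCUcaller, whose own conventions may differ.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds):
    """Return {codon: RSCU} for one coding sequence (illustrative helper).

    RSCU = observed count * number of synonymous codons / total count
    of that amino acid. Stop codons are excluded, and amino acids that
    do not occur in the sequence are skipped (both are assumptions).
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    values = {}
    for aa in sorted(set(AMINO_ACIDS) - {"*"}):
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid not observed; RSCU undefined
        for c in family:
            values[c] = counts[c] * len(family) / total
    return values


example = rscu("GCTGCTGCTGCCAAA")  # Ala x4 (GCT x3, GCC x1), Lys x1
```

Here alanine has four synonymous codons and occurs four times, so equal usage would give one occurrence each; GCT, seen three times, is over-represented relative to that expectation, while the unused GCA and GCG score zero.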
We present RSCUcaller, an R package for analyzing coding nucleotide sequences at the level of relative synonymous codon usage (RSCU). The package supports both data visualization and advanced statistical analysis. As input, RSCUcaller accepts a multi-fasta file containing coding sequences (CDS) together with an accompanying description table. Alternatively, the user may provide a separate fasta file for each sequence along with the corresponding table. The program merges the provided sequences and calculates RSCU values for each. Implemented visualization features include heatmaps and dendrograms derived from those heatmaps, as well as histograms. The calculated RSCU values can also be exported as matrices for further analysis by the user. RSCUcaller supports correlation analysis between any two organisms. In addition, statistical tests have been implemented to compare the frequency of amino acid occurrence between different groups of sequences.
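To make the idea of a correlation analysis between two organisms concrete, the sketch below compares two RSCU profiles over the codons they share. It assumes a Pearson coefficient, which the abstract does not specify, and the function name `rscu_correlation` and the toy values are hypothetical; none of this reflects the RSCUcaller API.

```python
import math


def rscu_correlation(rscu_a, rscu_b):
    """Pearson correlation of two {codon: RSCU} profiles (illustrative).

    Only codons present in both profiles are compared.
    """
    shared = sorted(set(rscu_a) & set(rscu_b))
    if len(shared) < 2:
        raise ValueError("need at least two shared codons")
    x = [rscu_a[c] for c in shared]
    y = [rscu_b[c] for c in shared]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


# Toy alanine-codon profiles for two hypothetical organisms.
organism_a = {"GCT": 2.0, "GCC": 1.0, "GCA": 0.5, "GCG": 0.5}
organism_b = {"GCT": 0.5, "GCC": 0.5, "GCA": 1.0, "GCG": 2.0}

r_same = rscu_correlation(organism_a, organism_a)
r_diff = rscu_correlation(organism_a, organism_b)
```

A coefficient near 1 indicates that the two organisms prefer the same synonymous codons, while a negative value, as for the two toy profiles here, indicates opposing preferences.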
RSCUcaller enables comparative RSCU analysis between coding sequences of different organisms or of individuals of the same species. It facilitates visualization and statistical analysis across codons and user-defined groups. The RSCUcaller package is available at https://github.com/Mordziarz/RSCUcaller under the GPL-3 license.