Quadrini Michela, Canchari Piero Hierro, Rosati Piermichele, Tesei Luca
School of Sciences and Technology, University of Camerino, Via Madonna delle Carceri 7, 62032 Camerino, Italy.
Int J Mol Sci. 2025 Jun 15;26(12):5728. doi: 10.3390/ijms26125728.
Ribonucleic acids (RNAs) fold into complex structures that are strongly associated with their biological functions. These can be abstracted into secondary structures, represented as nucleotide sequences annotated with base-pairing information. This abstraction is both biologically relevant and computationally manageable. Comparing and classifying RNA molecules typically relies on these secondary structure representations, which exist in multiple formats. In this work, we introduce TARNAS 1.0, a software tool designed to convert RNA secondary structure representations across multiple formats, including Base Pair Sequence (BPSEQ), Connect Table (CT), dot-bracket, Arc-Annotated Sequence (AAS), Fast-All (FASTA), and RNA Markup Language (RNAML). The tool offers options for retaining or removing comments, blank lines, and headers during the conversion process. These format translation and preprocessing capabilities are specifically designed to support the batch handling of large collections of RNA molecules, making TARNAS well suited for large dataset construction and database curation. Beyond format translation, TARNAS computes three levels of abstraction for RNA secondary structures, namely core, core plus, and shape, as well as a set of statistical descriptors for both primary and secondary structure. These abstraction and analysis features are intended to facilitate the comparison of molecules and the identification of recurring structural patterns, which are essential steps for associating structural motifs with molecular function. TARNAS is available as both a standalone desktop application and a web-based tool. The desktop version supports batch processing of large datasets, while the web version is optimized for the analysis of single molecules.
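To make the idea of format translation concrete, the following is a minimal sketch of one such conversion, from dot-bracket notation to BPSEQ. It is not TARNAS code and does not reflect its implementation or interface. The function names `dot_bracket_to_partners` and `to_bpseq` are hypothetical, and the sketch assumes the common conventions that BPSEQ lines hold a 1-based position, the nucleotide, and the partner position (0 if unpaired), and that the extra bracket types `[]`, `{}`, and `<>` denote pseudoknotted pairs.

```python
def dot_bracket_to_partners(structure: str) -> list[int]:
    """Return a list where entry i-1 is the 1-based partner of position i (0 = unpaired)."""
    # Assumption: each bracket family is matched independently,
    # which allows crossing (pseudoknotted) pairs across families.
    openers = {"(": ")", "[": "]", "{": "}", "<": ">"}
    closers = {close: open_ for open_, close in openers.items()}
    stacks = {open_: [] for open_ in openers}
    partners = [0] * len(structure)

    for i, symbol in enumerate(structure, start=1):
        if symbol in openers:
            stacks[symbol].append(i)
        elif symbol in closers:
            stack = stacks[closers[symbol]]
            if not stack:
                raise ValueError(f"unmatched '{symbol}' at position {i}")
            j = stack.pop()
            partners[i - 1] = j
            partners[j - 1] = i
        elif symbol != ".":
            raise ValueError(f"unexpected symbol '{symbol}' at position {i}")

    for open_, stack in stacks.items():
        if stack:
            raise ValueError(f"unmatched '{open_}' at position {stack[-1]}")
    return partners


def to_bpseq(sequence: str, structure: str) -> str:
    """Render a sequence and its dot-bracket structure as BPSEQ lines."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    partners = dot_bracket_to_partners(structure)
    return "\n".join(
        f"{i} {base} {partner}"
        for i, (base, partner) in enumerate(zip(sequence, partners), start=1)
    )


if __name__ == "__main__":
    # A small hairpin: three base pairs closing a three-nucleotide loop.
    print(to_bpseq("GGGAAACCC", "(((...)))"))
```

Keeping one stack per bracket family is what lets the sketch accept crossing pairs such as `([)]`, which a single shared stack would reject.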