Diroma Maria Angela, Lubisco Paolo, Attimonelli Marcella
Department of Biosciences, Biotechnologies and Biopharmaceutics, University of Bari, Bari, 70126, Italy.
BMC Bioinformatics. 2016 Nov 8;17(Suppl 12):338. doi: 10.1186/s12859-016-1193-4.
The abundance of biological data characterizing the genomics era is contributing to a comprehensive understanding of human mitochondrial genetics. Nevertheless, many aspects remain unclear, in particular the variability of the 22 human mitochondrial transfer RNA (tRNA) genes and their involvement in disease. Because enriching and isolating tRNAs in vitro is complex, knowledge of their post-transcriptional modifications and three-dimensional folding, both essential for correct tRNA function, is incomplete. An accurate annotation of mitochondrial tRNA variants would be useful to mitochondrial researchers and clinicians, since most of the bioinformatics tools for variant annotation and prioritization available so far cannot shed light on the functional role of tRNA variants.
To this end, we updated MToolBox, our pipeline for mitochondrial DNA analysis of high-throughput and Sanger sequencing data, by integrating tRNA variant annotations so that relevant variants can be identified and characterized not only in protein-coding regions but also in tRNA genes. The annotation step of the pipeline now provides detailed information for variants mapping onto the 22 mitochondrial tRNAs. For each mt-tRNA position in the genome, the relative tRNA numbering, tRNA type, cloverleaf secondary-structure domain (loop or stem), mature nucleotide and interactions in the three-dimensional folding are reported. Moreover, pathogenicity predictions for tRNA and rRNA variants were retrieved from the literature and integrated into the annotations provided by MToolBox, both in the stand-alone version and in the web-based tool at the Mitochondrial Disease Sequence Data Resource (MSeqDR) website. All the information available in the annotation step of MToolBox was used to generate custom tracks that can be displayed in the GBrowse instance at the MSeqDR website.
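As a rough illustration of what a per-position tRNA annotation involves, the following minimal Python sketch maps a genomic position to a tRNA gene and a position within that gene. It is not MToolBox code. The function `annotate_position`, the `TrnaGene` type and the small lookup table are hypothetical, the three gene intervals are illustrative coordinates that should be checked against the reference sequence annotation in use, and the plain offset computed here is a stand-in for the relative tRNA numbering described above, not a reproduction of it. Only genes read in the forward orientation are included; genes on the opposite strand would need the offset counted from the other end.

```python
from typing import NamedTuple, Optional


class TrnaGene(NamedTuple):
    """Hypothetical record for one mitochondrial tRNA gene (1-based, inclusive)."""
    name: str
    start: int
    end: int


# Illustrative subset of tRNA gene intervals (assumed coordinates; verify
# against the reference annotation before any real use).
TRNA_GENES = [
    TrnaGene("MT-TF", 577, 647),
    TrnaGene("MT-TL1", 3230, 3304),
    TrnaGene("MT-TK", 8295, 8364),
]


def annotate_position(pos: int) -> Optional[dict]:
    """Return a simple annotation if `pos` falls inside a listed tRNA gene.

    The returned offset is a plain 1-based index within the gene interval,
    an assumption made for this sketch rather than a structural numbering.
    """
    for gene in TRNA_GENES:
        if gene.start <= pos <= gene.end:
            return {
                "gene": gene.name,
                "genomic_pos": pos,
                "offset_in_gene": pos - gene.start + 1,
            }
    return None  # position is outside every listed tRNA gene


if __name__ == "__main__":
    for position in (3243, 8344, 100):
        print(position, annotate_position(position))
```

A real annotation table would also carry the structural domain, the mature nucleotide and any tertiary interactions for each position, which a simple interval lookup like this one cannot derive.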
To the best of our knowledge, this is the first time that specific data on mitochondrial variants in tRNA genes have been introduced into a tool for mitochondrial genome analysis, supporting the interpretation of genetic variants in specific genomic contexts.