Laboratório de Evolução, IECOS, Universidade Federal do Pará, Campus de Bragança, Bragança, Brazil.
Universidade Federal Rural da Amazônia (UFRA), Campus de Capitão Poço, Capitão Poço, Brazil.
BMC Bioinformatics. 2024 Apr 22;25(1):160. doi: 10.1186/s12859-024-05781-y.
The advent of molecular techniques has greatly influenced the reconstruction of the evolutionary history of organisms, leading to a significant increase in studies that use genomic data from different species. However, the lack of standardization in gene nomenclature complicates database searches and evolutionary analyses and reduces the accuracy of the results obtained.
To address this issue, SynGenes, a Python class for standardizing gene nomenclature, has been developed. It automatically recognizes different nomenclature variations and converts them into a standardized form, facilitating comprehensive and accurate searches. Additionally, SynGenes offers a web form for individual searches using any of the different names associated with the same gene. The SynGenes database contains 545 gene name variations for mitochondrial genes and 2485 for chloroplast genes, providing a valuable resource for researchers.
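The recognize-and-convert step described above can be illustrated with a minimal sketch. This is not the SynGenes API: the `SYNONYMS` table, the `standardize` function, and the normalization rule are hypothetical, and the table holds only a handful of commonly seen mitochondrial gene synonyms rather than the SynGenes database itself.

```python
import re
from typing import Optional

# Hypothetical synonym table: a tiny illustrative subset,
# NOT the contents of the SynGenes database.
SYNONYMS = {
    "COI": ["COI", "COX1", "CO1", "cytochrome c oxidase subunit I"],
    "CYTB": ["CYTB", "cob", "cytochrome b"],
    "ND1": ["ND1", "NAD1", "NADH dehydrogenase subunit 1"],
}


def _key(name: str) -> str:
    """Lowercase the name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


# Reverse index: normalized variant -> standardized gene name.
LOOKUP = {
    _key(variant): standard
    for standard, variants in SYNONYMS.items()
    for variant in variants
}


def standardize(name: str) -> Optional[str]:
    """Return the standardized name, or None if the variant is unknown."""
    return LOOKUP.get(_key(name))
```

Under these assumptions, `standardize("cox1")` and `standardize("Cytochrome c oxidase subunit I")` both map to the same standardized name, while a name absent from the table returns `None` instead of a guess.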
The SynGenes platform standardizes the nomenclature of mitochondrial and chloroplast genes and provides a standardized way to search for specific markers in GenBank. An evaluation of SynGenes' effectiveness, based on searches conducted on GenBank and PubMed Central, showed that it yields a greater number of results than conventional searches, ensuring more comprehensive and accurate results. By addressing the challenges posed by non-standardized gene nomenclature, this tool supports accurate database searches and, consequently, evolutionary analyses.
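One way to see why a synonym-aware search returns more records is to expand a single marker into a query that matches any of its known names. The sketch below only builds the query string and performs no network access. The `build_query` helper is hypothetical, and the `[Gene Name]` and `[Organism]` field tags are assumed to follow NCBI Entrez search syntax; they should be checked against the NCBI documentation before use.

```python
from typing import Iterable, Optional


def build_query(variants: Iterable[str], organism: Optional[str] = None) -> str:
    """Join all name variants of one gene with OR, optionally limited to a taxon.

    The field tags are an assumption based on Entrez search syntax.
    """
    terms = " OR ".join(f'"{variant}"[Gene Name]' for variant in variants)
    query = f"({terms})"
    if organism:
        query += f' AND "{organism}"[Organism]'
    return query


# Example: one marker, three name variants, restricted to birds.
query = build_query(["COI", "COX1", "CO1"], organism="Aves")
```

A conventional search would use only one of the three names, so records annotated with the other two would be missed.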