Breidenbach Joshua D, Begue III E Francis, Kennedy David J, Haller Steven T
Department of Medicine, College of Medicine and Life Sciences, University of Toledo, Toledo, OH 43614, USA.
Biology (Basel). 2022 Jul 26;11(8):1113. doi: 10.3390/biology11081113.
The increasing incorporation of omics technologies into biomedical research and translational medicine presents challenges to end users of the large and complex datasets that these methods generate. A particular challenge in genomics is that gene nomenclature is not uniform across large genomic databases or across commonly used genetic analysis tools. Furthermore, outdated gene nomenclature still appears in scientific communications, including peer-reviewed manuscripts. A web application, GeneToList, was therefore developed to assist in gene ID conversion and alias matching, with a specific focus on a user-friendly interface for scientists without bioinformatics training. It currently includes gene information for over 38,000 taxa retrieved from the National Center for Biotechnology Information (NCBI) Gene resource. Supported gene ID types include NCBI Gene Symbols, NCBI Gene IDs (Entrez IDs), OMIM IDs, HGNC IDs, Ensembl IDs, and 28 other taxa-specific identifiers. GeneToList is available at genetolist.com and is compatible with many standard browsers. In testing, its gene ID conversion feature outperformed common gene ID conversion tools: it successfully converted all tested IDs, whereas the other tools did not recognize gene aliases. The gene ID disambiguation provided by this application should therefore benefit scientists working with gene data when uniform gene nomenclature is important for downstream analysis.
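The alias matching described in the abstract can be illustrated with a minimal sketch. This is not GeneToList's implementation, which the abstract does not describe; it only shows one plausible approach, assuming a lookup index in which every known identifier and alias of a gene points back to the same record. The names `GeneRecord`, `build_index`, and `convert` are hypothetical, and the two hand-entered human gene records stand in for data that would normally be loaded from the NCBI Gene resource and should be verified against it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneRecord:
    """One gene with its current symbol, database IDs, and known aliases."""
    symbol: str
    entrez_id: str
    ensembl_id: str
    aliases: tuple = ()


# Illustrative, hand-entered records (assumption: a real tool would load
# these from NCBI Gene rather than hard-code them).
RECORDS = [
    GeneRecord("TP53", "7157", "ENSG00000141510", ("P53", "LFS1")),
    GeneRecord("ERBB2", "2064", "ENSG00000141736", ("HER2", "NEU")),
]


def build_index(records):
    """Map every symbol, ID, and alias (case-insensitive) to its records."""
    index = {}
    for rec in records:
        keys = (rec.symbol, rec.entrez_id, rec.ensembl_id, *rec.aliases)
        for key in keys:
            index.setdefault(key.upper(), []).append(rec)
    return index


def convert(ids, index, target="symbol"):
    """Convert each input ID or alias to the requested field.

    Returns a dict mapping each input to a sorted list of matches. An empty
    list means the input was not recognized; more than one entry means the
    input is ambiguous and needs manual disambiguation.
    """
    result = {}
    for raw in ids:
        matches = index.get(raw.strip().upper(), [])
        result[raw] = sorted({getattr(m, target) for m in matches})
    return result


INDEX = build_index(RECORDS)
converted = convert(["HER2", "7157", "p53", "UNKNOWN1"], INDEX)
for query, symbols in converted.items():
    print(query, "->", symbols or "not found")
```

Returning a list rather than a single value keeps ambiguous aliases visible instead of silently choosing one gene, which is the situation where disambiguation matters most for downstream analysis.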