Novel methods included in SpolLineages tool for fast and precise prediction of Mycobacterium tuberculosis complex spoligotype families.

Affiliations

WHO Supranational TB Reference Laboratory, Tuberculosis and Mycobacteria Unit, Institut Pasteur de la Guadeloupe, F-97183, Abymes, Guadeloupe, France.

Laboratoire de Mathématiques Informatique et Applications (LAMIA), Université des Antilles, F-97154, Pointe-à-Pitre, Guadeloupe, France.

Publication information

Database (Oxford). 2020 Dec 15;2020. doi: 10.1093/database/baaa108.

Abstract

Bioinformatic tools are currently being developed to better understand the Mycobacterium tuberculosis complex (MTBC). Several approaches already exist for identifying MTBC lineages from classical genotyping data, such as mycobacterial interspersed repetitive units-variable number of tandem DNA repeats (MIRU-VNTR) and spoligotyping-based families. In the recently released SITVIT2 proprietary database of the Institut Pasteur de la Guadeloupe, a large number of spoligotype families were assigned either by manual expert curation or by an in-house algorithm. In this study, we present two complementary data-driven approaches that allow fast and precise family prediction from spoligotyping patterns. The first is based on data transformation and decision tree classifiers. The second, in contrast, searches for a set of simple rules using binary masks through a specifically designed evolutionary algorithm. A comparison with the three main approaches in the field highlighted the good performance of our methods and a significant runtime gain. Finally, we propose the 'SpolLineages' software tool (https://github.com/dcouvin/SpolLineages), which implements these approaches for identifying MTBC spoligotype families.
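To make the idea of a binary-mask rule more concrete, the following is a minimal sketch of how such a rule might be applied to a 43-spacer spoligotype pattern. It is not the SpolLineages implementation: the names `MaskRule`, `make_rule` and `predict`, the rule representation (a mask plus the expected values at the masked positions), and the single toy rule are all assumptions made for illustration. The evolutionary search that would discover such rules is not shown.

```python
from typing import Iterable, NamedTuple, Optional

N_SPACERS = 43  # classical spoligotyping scores 43 spacers as present (1) or absent (0)


def to_bits(pattern: str) -> int:
    """Convert a 43-character '0'/'1' spoligotype string to an integer."""
    if len(pattern) != N_SPACERS or set(pattern) - {"0", "1"}:
        raise ValueError("expected a 43-character string of 0s and 1s")
    return int(pattern, 2)


class MaskRule(NamedTuple):
    """Hypothetical rule format: not taken from the SpolLineages source."""
    family: str
    mask: int      # 1 bits mark the spacer positions the rule inspects
    expected: int  # required spacer values at the inspected positions

    def matches(self, bits: int) -> bool:
        return (bits & self.mask) == self.expected


def make_rule(family: str,
              absent: Iterable[int] = (),
              present: Iterable[int] = ()) -> MaskRule:
    """Build a rule from 1-based spacer positions (spacer 1 is the leftmost character)."""
    mask = expected = 0
    for pos in absent:
        mask |= 1 << (N_SPACERS - pos)
    for pos in present:
        bit = 1 << (N_SPACERS - pos)
        mask |= bit
        expected |= bit
    return MaskRule(family, mask, expected)


def predict(pattern: str, rules: Iterable[MaskRule]) -> Optional[str]:
    """Return the family of the first matching rule, or None if no rule matches."""
    bits = to_bits(pattern)
    for rule in rules:
        if rule.matches(bits):
            return rule.family
    return None


# Toy rule for illustration only: "spacers 1 to 34 absent", a simplified
# version of a signature often associated with the Beijing family.
TOY_RULES = [make_rule("Beijing-like (toy rule)", absent=range(1, 35))]

if __name__ == "__main__":
    print(predict("0" * 34 + "1" * 9, TOY_RULES))
    print(predict("1" * 43, TOY_RULES))
```

Encoding each pattern as an integer means a rule check is a single bitwise AND followed by a comparison, which is one plausible reason why rule sets of this kind can be evaluated quickly.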

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2bf3/7737520/69c92464012c/baaa108f1.jpg
