Aminian Minoo, Shabbeer Amina, Bennett Kristin P
Departments of Mathematical Science and Computer Science, Rensselaer Polytechnic Institute.
Proceedings (IEEE Int Conf Bioinformatics Biomed). 2009 Nov 1;2009:338-343. doi: 10.1109/BIBM.2009.86.
We present a novel Bayesian network (BN) that classifies strains of the Mycobacterium tuberculosis complex (MTBC) into six major genetic lineages using mycobacterial interspersed repetitive units (MIRUs), a high-throughput biomarker. MTBC is the causative agent of tuberculosis (TB), which remains one of the leading causes of disease and morbidity worldwide. DNA fingerprinting methods such as MIRU typing are key components of modern TB control and tracking. The BN achieves high accuracy on four large MTBC genotype collections comprising over 4700 distinct 12-locus MIRU genotypes. It captures the distinct MIRU signatures associated with each lineage, which explains its excellent performance. The BN's misclassifications support the need for additional biomarkers, such as the expanded 24-locus MIRU set used in CDC genotyping labs since May 2009. Because each locus is assumed conditionally independent of the others given the lineage, the BN is easily extended to additional MIRU loci and other biomarkers.
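The conditional independence of each locus given the lineage corresponds to a naive-Bayes-style factorization, P(lineage | loci) proportional to P(lineage) times the product over loci of P(locus | lineage). The following is a minimal sketch of that idea, not the authors' implementation. It assumes a categorical distribution over repeat counts at each locus with Laplace smoothing; the class name `LocusNaiveBayes`, the smoothing parameter `alpha`, the cap `max_repeats`, the lineage labels, and the 3-locus toy genotypes are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


class LocusNaiveBayes:
    """Illustrative classifier: loci are independent given the lineage."""

    def __init__(self, alpha=1.0, max_repeats=15):
        self.alpha = alpha                # Laplace smoothing (assumption)
        self.n_values = max_repeats + 1   # repeat counts 0..max_repeats (assumption)

    def fit(self, genotypes, lineages):
        self.n_loci = len(genotypes[0])
        self.class_counts = Counter(lineages)
        self.total = len(lineages)
        # counts[lineage][locus][repeat_count] -> number of training strains
        self.counts = defaultdict(
            lambda: [Counter() for _ in range(self.n_loci)]
        )
        for genotype, lineage in zip(genotypes, lineages):
            for locus, repeats in enumerate(genotype):
                self.counts[lineage][locus][repeats] += 1
        return self

    def log_posterior(self, genotype):
        scores = {}
        for lineage, n_lineage in self.class_counts.items():
            log_p = math.log(n_lineage / self.total)  # log prior
            for locus, repeats in enumerate(genotype):
                num = self.counts[lineage][locus][repeats] + self.alpha
                den = n_lineage + self.alpha * self.n_values
                log_p += math.log(num / den)          # one term per locus
            scores[lineage] = log_p
        # Normalize with log-sum-exp so the posteriors sum to one.
        top = max(scores.values())
        log_z = top + math.log(sum(math.exp(s - top) for s in scores.values()))
        return {lineage: s - log_z for lineage, s in scores.items()}

    def predict(self, genotype):
        log_post = self.log_posterior(genotype)
        return max(log_post, key=log_post.get)


# Invented toy data: 3 loci per genotype, two hypothetical lineages.
train_genotypes = [(2, 3, 5), (2, 3, 4), (2, 2, 5), (4, 1, 3), (4, 1, 2), (5, 1, 3)]
train_lineages = ["lineage_A"] * 3 + ["lineage_B"] * 3

model = LocusNaiveBayes().fit(train_genotypes, train_lineages)
print(model.predict((2, 3, 5)))
print(model.predict((4, 1, 3)))
```

Because every locus contributes its own independent term to the log posterior, extending this sketch from 12 to 24 loci, or adding another biomarker, only means using longer genotype tuples; nothing else in the model needs to change.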