Lupolova Nadejda, Dallman Timothy J, Matthews Louise, Bono James L, Gally David L
Division of Immunity and Infection, The Roslin Institute and The Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian EH25 9RG, United Kingdom.
Public Health England, National Infection Service, London NW9 5EQ, United Kingdom.
Proc Natl Acad Sci U S A. 2016 Oct 4;113(40):11312-11317. doi: 10.1073/pnas.1606567113. Epub 2016 Sep 19.
Sequence analyses of pathogen genomes facilitate the tracking of disease outbreaks and allow relationships between strains to be reconstructed and virulence factors to be identified. However, these methods are generally used after an outbreak has happened. Here, we show that support vector machine analysis of bovine E. coli O157 isolate sequences can be applied to predict their zoonotic potential, identifying cattle strains more likely to be a serious threat to human health. Notably, only a minor subset (less than 10%) of the bovine E. coli O157 isolates analyzed in our datasets were predicted to have the potential to cause human disease, even though the majority fall within the previously defined pathogenic lineages I or I/II and encode key virulence factors. The predictive capacity was retained when tested across datasets. The major differences between human and bovine E. coli O157 isolates were due to the relative abundances of hundreds of predicted prophage proteins. This finding has profound implications for public health management of disease because interventions in cattle, such as vaccination, can be targeted at herds carrying strains of high zoonotic potential. Machine-learning approaches should be applied broadly to further our understanding of pathogen biology.
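The classification idea in the abstract can be illustrated with a minimal sketch. It assumes each isolate is encoded as a presence/absence vector over predicted proteins and labeled by the host it was isolated from; the abstract does not specify the feature encoding, kernel, or software used, so the linear soft-margin SVM below (trained by plain sub-gradient descent on the hinge loss) is only a stand-in, and every profile, name, and parameter value is invented for illustration.

```python
# Toy sketch: linear soft-margin SVM trained by full-batch sub-gradient descent.
# All data are invented for illustration; none come from the study.

def decision_score(w, b, x):
    """Linear decision value; positive means 'human-associated' in this toy setup."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, lr=0.1, steps=500):
    """Minimize (lam/2)*||w||^2 + mean hinge loss; labels y must be +1 or -1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w = [lam * wj for wj in w]      # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * decision_score(w, b, xi) < 1:   # inside margin or misclassified
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict_host(w, b, x):
    """Return +1 (human-like profile) or -1 (bovine-like profile)."""
    return 1 if decision_score(w, b, x) > 0 else -1

# Hypothetical presence/absence profiles over five predicted proteins.
TRAIN_X = [
    [1, 1, 0, 0, 1], [1, 1, 0, 0, 0], [1, 0, 0, 0, 1], [1, 1, 0, 1, 0],  # human isolates
    [0, 0, 1, 1, 0], [0, 0, 1, 1, 1], [0, 0, 1, 0, 0], [0, 1, 1, 1, 1],  # bovine isolates
]
TRAIN_Y = [1, 1, 1, 1, -1, -1, -1, -1]

# Hypothetical cattle isolates to be scored after training.
CANDIDATES = {
    "cattle_isolate_A": [1, 1, 0, 0, 0],
    "cattle_isolate_B": [0, 0, 1, 1, 0],
}

W, B = train_linear_svm(TRAIN_X, TRAIN_Y)
for name, profile in CANDIDATES.items():
    if predict_host(W, B, profile) == 1:
        print(name, "resembles human isolates (higher predicted zoonotic potential)")
    else:
        print(name, "resembles other bovine isolates")
```

In this framing, a cattle isolate that the classifier places on the human side of the decision boundary is the kind of strain the abstract describes as having higher predicted zoonotic potential. A real analysis would need far more isolates and features, held-out validation across datasets, and an established SVM library rather than this hand-rolled trainer.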