Ayasdi Inc, Palo Alto, CA 94301, USA.
BMC Bioinformatics. 2012 Dec 2;13:321. doi: 10.1186/1471-2105-13-321.
Methods that weaken a pathogen's ability to infect and propagate in a host, allowing the natural immune system to clear invaders more easily, have gained attention as alternatives to broad-spectrum targeting approaches. This work describes a technique for identifying proteins involved in virulence by relying on latent information gathered computationally across biological repositories. The technique is applicable to both generic and specific virulence categories.
A lightweight method for data integration is used, which links information about a protein through a path-based query graph. A weighting method is then applied to the query graphs so that they can serve as input to various statistical classification methods for discrimination. The combined use of data integration and learning methods is tested on the problem of both generalized and specific virulence function prediction.
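The idea of a weighted, path-based query graph can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the link structure, the weighting scheme, or the feature encoding, so the `LINKS` records, the `decay` parameter, and the function names below are hypothetical. The sketch walks outward from a protein along cross-reference links, gives each reached record a weight that shrinks with path length, and flattens the result into a fixed-length vector of the kind a statistical classifier could consume.

```python
from collections import deque

# Hypothetical cross-reference links between records in different repositories.
LINKS = {
    "protA": ["db1:rec1", "db2:rec7"],
    "db1:rec1": ["term:adhesion"],
    "db2:rec7": ["term:adhesion", "term:toxin"],
    "protB": ["db1:rec2"],
    "db1:rec2": ["term:metabolism"],
}


def query_graph_weights(start, links, decay=0.5, max_depth=3):
    """Breadth-first walk from `start`; a node first reached at depth d
    gets weight decay**d (an assumed scheme, not the paper's)."""
    weights = {}
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand beyond the depth limit
        for nxt in links.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                weights[nxt] = decay ** (depth + 1)
                queue.append((nxt, depth + 1))
    return weights


def to_vector(weights, vocabulary):
    """Order the weights by a shared vocabulary; unreached nodes get 0.0."""
    return [weights.get(term, 0.0) for term in vocabulary]


graphs = {p: query_graph_weights(p, LINKS) for p in ("protA", "protB")}
vocabulary = sorted({node for g in graphs.values() for node in g})
features = {p: to_vector(g, vocabulary) for p, g in graphs.items()}
```

Each entry of `features` is a numeric vector over the same vocabulary, so the vectors for labeled proteins could be passed to any standard classifier. Using only the shortest path to each node is one simple choice; summing over all paths would be another.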
This approach improves the coverage of functional data for a protein. Moreover, although it depends largely on noisy and potentially non-curated data from public sources, we find that it outperforms other techniques for identifying general virulence factors, as well as baseline remote homology detection methods for specific virulence categories.