Raul Cruz-Cano, David S. H. Chew, Kwok-Pui Choi, Ming-Ying Leung
Department of Computer and Information Sciences, Texas A&M University-Texarkana, Texarkana, TX 75501, USA
INFORMS J Comput. 2010 Jun 1;22(3):457-470. doi: 10.1287/ijoc.1090.0360.
Replication of the DNA genome is a central step in the reproduction of many viruses. Procedures for finding replication origins, the sites at which DNA replication is initiated, are therefore of great importance for controlling the growth and spread of such viruses. Existing computational methods for predicting viral replication origins have mostly been tested within the herpesvirus family. This paper proposes a new approach based on least-squares support vector machines (LS-SVMs) and tests its performance not only on the herpesvirus family but also on a collection of caudoviruses from three viral families in the order Caudovirales. The LS-SVM approach provides sensitivities and positive predictive values superior or comparable to those of the previous methods. When suitably combined with previous methods, it further improves prediction accuracy for herpesvirus replication origins. Furthermore, through recursive feature elimination, the LS-SVM also helps identify the most significant features of the data sets. The results suggest that LS-SVMs will be a highly useful addition to the set of computational tools for viral replication origin prediction, and they illustrate the value of optimization-based computing techniques in biomedical applications.
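As a rough illustration of the kind of classifier the abstract refers to, the sketch below trains a standard LS-SVM classifier by solving its linear system and then reports sensitivity and positive predictive value. It is a minimal sketch under stated assumptions, not the authors' implementation: the RBF kernel, the values of `gamma` and `sigma`, the function names, and the two-dimensional toy points are all hypothetical, and the paper's actual sequence features and recursive feature elimination step are not reproduced here.

```python
import math

def rbf(u, v, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Solve the LS-SVM system [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1],
    where Omega[i][j] = y_i * y_j * K(x_i, x_j). Labels must be +1 or -1."""
    n = len(y)
    A = [[0.0] + [float(v) for v in y]]
    for i in range(n):
        row = [float(y[i])]
        for j in range(n):
            omega = y[i] * y[j] * rbf(X[i], X[j], sigma)
            row.append(omega + (1.0 / gamma if i == j else 0.0))
        A.append(row)
    z = solve(A, [0.0] + [1.0] * n)
    return z[0], z[1:]  # bias b, multipliers alpha

def lssvm_predict(X, y, b, alpha, x_new, sigma=1.0):
    """Classify x_new by the sign of sum_i alpha_i * y_i * K(x_new, x_i) + b."""
    s = b + sum(a * yi * rbf(x_new, xi, sigma) for a, yi, xi in zip(alpha, y, X))
    return 1 if s >= 0 else -1

def sensitivity_ppv(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); positive predictive value = TP / (TP + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    return tp / (tp + fn), tp / (tp + fp)

# Hypothetical toy data: +1 stands for "replication origin", -1 for "not an origin".
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = [-1, -1, -1, 1, 1, 1]
b, alpha = lssvm_train(X, y)
pred = [lssvm_predict(X, y, b, alpha, x) for x in X]
sens, ppv = sensitivity_ppv(y, pred)
```

Unlike a standard SVM, which requires solving a quadratic program, the LS-SVM reduces training to a single linear system, which is why a plain linear solver suffices here. For real data sets one would more likely use a numerical library's solver than the hand-written elimination shown above.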