Matthew Rabinowitz, Milena Banjevic, A. S. Chan, Lance Myers, Roland Wolkowicz, Jessica Haberer, Joshua Singer
Gene Security Network, Portola Valley, CA, USA.
AMIA Annu Symp Proc. 2005;2005:505-9.
We describe the use of the l1 norm to select a sparse set of model parameters for predicting viral drug response from genetic sequence data of the Human Immunodeficiency Virus (HIV) reverse-transcriptase enzyme. We discuss the use of the l1 norm in the Least Absolute Shrinkage and Selection Operator (LASSO) regression model and in the Support Vector Machine model. When tested by cross-validation against laboratory measurements, these models predict viral phenotype, that is, resistance in response to Reverse-Transcriptase Inhibitors (RTIs), more accurately than other known models. Among convex penalty functions, the l1 norm is the most selective: it sets a large proportion of the parameters to exactly zero, and because the resulting problem is convex, the optimization reaches a global optimum rather than a local one for a given model formulation and training data set. A statistical model that reliably predicts viral drug response is an important tool in the selection of Anti-Retroviral Therapy. These techniques apply generally to modeling phenotype from complex genetic data.
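The sparsity effect of the l1 penalty described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (random binary "mutation" indicators and a made-up response), the solver is plain cyclic coordinate descent chosen for brevity, and the names `soft_threshold`, `lasso_coordinate_descent`, and the penalty weight `lam` are assumptions introduced here for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||_2^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # per-feature mean squared value
    residual = y - X @ b
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                # constant feature carries no information
            # Covariance of feature j with the residual, adding back j's own contribution
            rho = X[:, j] @ residual / n + col_sq[j] * b[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            residual += X[:, j] * (b[j] - new_bj)   # keep residual in sync with b
            b[j] = new_bj
    return b

# Synthetic stand-in for sequence data: 200 samples, 50 binary mutation indicators.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 50
X = (rng.random((n_samples, n_features)) < 0.2).astype(float)

# Hypothetical ground truth: only three positions affect the response.
true_coef = np.zeros(n_features)
true_coef[[3, 17, 41]] = [2.0, -1.5, 1.0]
y = X @ true_coef + 0.1 * rng.standard_normal(n_samples)

# Center features and response so no intercept term is needed.
X = X - X.mean(axis=0)
y = y - y.mean()

coef = lasso_coordinate_descent(X, y, lam=0.05)   # lam is an arbitrary illustrative value
selected = np.flatnonzero(coef)
print("selected feature indices:", selected)
print("nonzero coefficients:", coef[selected])
```

The soft-thresholding step is what sets coefficients to exactly zero, which is the selection behavior the abstract attributes to the l1 norm. In practice `lam` would be chosen by cross-validation rather than fixed by hand.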