Luke Rayanne A, Kearsley Anthony J, Pisanic Nora, Manabe Yukari C, Thomas David L, Heaney Christopher D, Patrone Paul N
Johns Hopkins University, Whiting School of Engineering, Department of Applied Mathematics and Statistics, Baltimore, MD.
National Institute of Standards and Technology, Applied and Computational Mathematics Division, Gaithersburg, MD.
ArXiv. 2022 Jun 28:arXiv:2206.14316v2.
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic has emphasized the importance and challenges of correctly interpreting antibody test results. Identification of positive and negative samples requires a classification strategy with low error rates, which is hard to achieve when the corresponding measurement values overlap. Additional uncertainty arises when classification schemes fail to account for complicated structure in data. We address these problems through a mathematical framework that combines high dimensional data modeling and optimal decision theory. Specifically, we show that appropriately increasing the dimension of data better separates positive and negative populations and reveals nuanced structure that can be described in terms of mathematical models. We combine these models with optimal decision theory to yield a classification scheme that better separates positive and negative samples relative to traditional methods such as confidence intervals (CIs) and receiver operating characteristics. We validate the usefulness of this approach in the context of a multiplex salivary SARS-CoV-2 immunoglobulin G assay dataset. This example illustrates how our analysis: (i) improves the assay accuracy (e.g. lowers classification errors by up to 42 % compared to CI methods); (ii) reduces the number of indeterminate samples when an inconclusive class is permissible (e.g. by 40 % compared to the original analysis of the example multiplex dataset); and (iii) decreases the number of antigens needed to classify samples. Our work showcases the power of mathematical modeling in diagnostic classification and highlights a method that can be adopted broadly in public health and clinical settings.
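The abstract does not spell out the classification rule, so the following is only an illustrative sketch and not the authors' implementation. It assumes one common form of optimal decision theory: label a measurement r positive when q·P(r) > (1 − q)·N(r), where q is an assumed prevalence and P and N are the probability densities of the positive and negative populations. The data are synthetic Gaussians, and every name and parameter (`gaussian_density`, `classify`, `error_rate`, the means, the sample size) is hypothetical. The point is only to show how adding a second measurement dimension can lower the error of such a rule when the populations overlap in one dimension.

```python
import numpy as np


def gaussian_density(r, mean, std):
    """Product of independent 1-D normal densities, evaluated per row of r."""
    r = np.atleast_2d(r)
    z = (r - mean) / std
    norm = np.prod(std * np.sqrt(2.0 * np.pi))
    return np.exp(-0.5 * np.sum(z ** 2, axis=1)) / norm


def classify(r, q, pos_mean, neg_mean, std):
    """Return True where the prevalence-weighted positive density dominates."""
    p_pos = gaussian_density(r, pos_mean, std)
    p_neg = gaussian_density(r, neg_mean, std)
    return q * p_pos > (1.0 - q) * p_neg


def error_rate(dims, n=20000, q=0.5, seed=0):
    """Empirical misclassification rate on synthetic data with `dims` features."""
    rng = np.random.default_rng(seed)
    # Assumed (synthetic) population models: unit-variance Gaussians whose
    # means differ by 1.5 along every measured dimension.
    pos_mean = np.full(dims, 1.5)
    neg_mean = np.zeros(dims)
    std = np.ones(dims)

    n_pos = int(q * n)
    n_neg = n - n_pos
    pos = rng.normal(pos_mean, std, size=(n_pos, dims))
    neg = rng.normal(neg_mean, std, size=(n_neg, dims))

    false_neg = np.sum(~classify(pos, q, pos_mean, neg_mean, std))
    false_pos = np.sum(classify(neg, q, pos_mean, neg_mean, std))
    return (false_neg + false_pos) / n


if __name__ == "__main__":
    for dims in (1, 2):
        print(f"{dims} dimension(s): error rate = {error_rate(dims):.3f}")
```

In this toy setting the two-dimensional rule misclassifies fewer samples than the one-dimensional rule because the populations sit farther apart relative to their spread. The error reductions reported in the abstract come from the authors' own models and assay data, not from anything shown here.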