Department of Infectious Disease, Faculty of Medicine, Imperial College London, London W12 0NN, U.K.
Department of Electrical and Electronic Engineering, Faculty of Engineering, Imperial College London, London SW7 2AZ, U.K.
Anal Chem. 2022 Oct 18;94(41):14159-14168. doi: 10.1021/acs.analchem.2c01883. Epub 2022 Oct 3.
Real-time digital polymerase chain reaction (qdPCR) coupled with machine learning (ML) methods has shown the potential to unlock scientific breakthroughs, particularly in molecular diagnostics for infectious diseases. One promising application of this emerging field is single-fluorescent-channel PCR multiplexing, achieved by extracting the target-specific kinetic and thermodynamic information contained in amplification curves, an approach also known as data-driven multiplexing. However, accurate target classification is compromised by undesired amplification events and nonideal reaction conditions. Here, we propose a novel framework to identify and filter out nonspecific and low-efficiency reactions from qdPCR data using outlier detection algorithms based purely on the sigmoidal trends of amplification curves. As a proof of concept, this framework is implemented to improve the classification performance of the recently reported data-driven multiplexing method called amplification curve analysis (ACA), using available published data in which the ACA is demonstrated to screen carbapenemase-producing organisms in clinical isolates. Furthermore, we developed a novel strategy, named adaptive mapping filter (AMF), to adjust the percentage of outliers removed according to the number of positive counts in qdPCR. From an overall total of 152,000 amplification events, 116,222 positive amplification reactions were evaluated before and after filtering by comparing against the melting peak distribution, showing that abnormal amplification curves (outliers) are linked to a shifted melting distribution or decreased PCR efficiency. The ACA was applied to assess classification performance before and after AMF, showing a 1.2% improvement in sensitivity when using inliers compared to a 19.6% decrease when using outliers (p-value < 0.0001), and removing 53.5% of all wrong melting curves based only on the amplification shape.
This work explores the correlation between the kinetics of amplification curves and the thermodynamics of melting curves, and it demonstrates that filtering out nonspecific or low-efficiency reactions can significantly improve the classification accuracy of cutting-edge multiplexing methodologies.
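The filtering idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify which outlier detection algorithms or which mapping the AMF uses, so everything here is an assumption made for illustration. The outlier score (root-mean-square distance of each plateau-normalized curve from the pointwise median curve), the linear mapping in `adaptive_fraction`, its direction, and the bounds `f_min` and `f_max` are all hypothetical, as are the function names.

```python
import math
import statistics


def sigmoid_curve(n_cycles, midpoint, slope, plateau=1.0):
    """Synthetic amplification curve: a logistic function of cycle number."""
    return [plateau / (1.0 + math.exp(-slope * (c - midpoint)))
            for c in range(1, n_cycles + 1)]


def outlier_scores(curves):
    """Score each curve by its RMS distance from the pointwise median curve.

    Each curve is first divided by its final value, so only the shape
    (not the plateau height) contributes. Assumes positive reactions,
    i.e. a nonzero final fluorescence value.
    """
    scaled = [[v / curve[-1] for v in curve] for curve in curves]
    median_curve = [statistics.median(col) for col in zip(*scaled)]
    return [
        math.sqrt(sum((v - m) ** 2 for v, m in zip(curve, median_curve))
                  / len(curve))
        for curve in scaled
    ]


def adaptive_fraction(n_positive, n_partitions, f_min=0.05, f_max=0.30):
    """Hypothetical mapping from positive count to the fraction removed.

    Linear in the occupancy (positives / partitions); the real AMF
    mapping is not given in the abstract.
    """
    occupancy = n_positive / n_partitions
    return f_min + (f_max - f_min) * occupancy


def filter_curves(curves, n_partitions):
    """Split the positive curves of one panel into inlier and outlier indices."""
    scores = outlier_scores(curves)
    fraction = adaptive_fraction(len(curves), n_partitions)
    n_remove = int(fraction * len(curves))
    # Rank curves from most to least abnormal and drop the top n_remove.
    ranked = sorted(range(len(curves)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[n_remove:]), sorted(ranked[:n_remove])


# Usage: 18 well-behaved reactions plus 2 late, shallow (low-efficiency) ones
# on a toy panel of 40 partitions.
good = [sigmoid_curve(40, midpoint=20 + 0.1 * i, slope=0.6) for i in range(18)]
bad = [sigmoid_curve(40, midpoint=30, slope=0.2) for _ in range(2)]
inliers, outliers = filter_curves(good + bad, n_partitions=40)
print("removed curve indices:", outliers)
```

In this toy panel the two low-efficiency curves (indices 18 and 19) receive the highest scores and are removed first. A fixed removal fraction would discard the same share of curves regardless of how many partitions are positive, which is the limitation the adaptive mapping is meant to address.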