National Genomics Center for Wildlife and Fish Conservation, USFS Rocky Mountain Research Station, Missoula, Montana, USA.
Mol Ecol Resour. 2022 Nov;22(8):2994-3005. doi: 10.1111/1755-0998.13681. Epub 2022 Jul 12.
Environmental DNA (eDNA) sampling is a highly sensitive and cost-effective technique for wildlife monitoring, notably through the use of qPCR assays. However, it can be difficult to ensure assay specificity when many closely related species co-occur. In theory, specificity may be assessed in silico by determining whether assay oligonucleotides have enough base-pair mismatches with nontarget sequences to preclude amplification. However, the mismatch characteristics required to preclude amplification are poorly understood, which makes in silico assessments difficult and often necessitates extensive in vitro testing, typically the greatest bottleneck in assay development. Increasing the accuracy of in silico assessments would therefore streamline the assay development process. In this study, we paired 10 qPCR assays with 82 synthetic gene fragments for 530 specificity tests using SYBR Green intercalating dye (n = 262) and TaqMan hydrolysis probes (n = 268). Test results were used to train random forest classifiers to predict amplification. The primer-only model (SYBR Green results) and full-assay model (TaqMan probe-based results) were 99.6% and 100% accurate, respectively, in cross-validation. We further assessed model performance using six independent assays not used in model training. In these tests the primer-only model was 92.4% accurate (n = 119) and the full-assay model was 96.5% accurate (n = 144). The high performance achieved by these models makes it possible for eDNA practitioners to more quickly and confidently develop assays specific to the intended target. Practitioners can access the full-assay model online via eDNAssay (https://NationalGenomicsCenter.shinyapps.io/eDNAssay), a user-friendly tool for predicting qPCR cross-amplification.
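As an illustration of the kind of in silico mismatch assessment described above, the following minimal sketch counts mismatches between an oligonucleotide and a nontarget binding site. The function name `mismatch_features`, the `end_window` parameter, and the two summary counts are hypothetical choices made for this example; they are not the predictors used by the random forest models in the study or by eDNAssay. The sketch also assumes the binding site has already been aligned to the oligo without gaps and is written in the oligo's own 5'-to-3' orientation, so that a perfect match is an identical string.

```python
def mismatch_features(oligo: str, site: str, end_window: int = 5) -> dict:
    """Summarize mismatches between an oligo and a pre-aligned binding site.

    Assumptions (illustrative only):
      - `oligo` and `site` have equal length and are aligned without gaps;
      - both are written 5'->3' in the oligo's orientation, so identical
        characters mean a matched base pair;
      - `end_window` is the number of bases at the 3' end counted separately.
    """
    if len(oligo) != len(site):
        raise ValueError("oligo and site must be the same length (no gaps)")

    oligo, site = oligo.upper(), site.upper()

    # 0-based positions, counted from the 5' end, where the bases differ.
    mismatches = [i for i, (a, b) in enumerate(zip(oligo, site)) if a != b]

    # Positions at or beyond this index fall inside the 3' end window.
    cutoff = len(oligo) - end_window

    return {
        "total": len(mismatches),
        "three_prime": sum(1 for i in mismatches if i >= cutoff),
    }


# Hypothetical 20-base primer and a nontarget site differing at two positions.
primer = "ACGTACGTACGTACGTACGT"
nontarget_site = "ACCTACGTACGTACGTACAT"
print(mismatch_features(primer, nontarget_site))
```

Counts such as these could serve as inputs to a classifier, but which mismatch properties actually predict amplification is the question the study addresses empirically, so the sketch should not be read as a specificity test on its own.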