Department of Biology, University of Turku, 20014, Turku, Finland.
Department of Environmental Microbiology, Eawag, Swiss Federal Institute of Aquatic Science and Technology, 8600, Dübendorf, Switzerland.
BMC Bioinformatics. 2022 May 12;23(1):174. doi: 10.1186/s12859-022-04710-1.
Designing oligonucleotide primers and probes is a key step in many laboratory experiments, such as multiplexed PCR or digital multiplexed ligation assays. When designing multiplexed primers and probes for complex, heterogeneous DNA datasets, an optimization problem can arise: identifying the smallest number of oligonucleotides that covers the largest diversity of the input dataset. Tools that perform this optimization efficiently for large input data are currently lacking.
Here we present Prider, an R package for designing primers and probes with nearly optimal coverage for complex and large sequence sets. Prider first prepares a full primer coverage of the input sequences, then reduces its complexity by removing components of high redundancy or narrow coverage. The primers from the resulting near-optimal coverage are accessible as data frames, and their coverage across the input sequences can be visualised as heatmaps using Prider's plotting function. Prider permits efficient design of primers for large DNA datasets by scaling linearly with the amount of sequence data, regardless of the diversity of the dataset.
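The idea of building a full coverage and then pruning it can be illustrated with a small sketch. The Python code below is not Prider's implementation or API; it assumes exact-match k-mers as stand-ins for primer candidates and swaps in a plain greedy set-cover heuristic for the reduction step. The function names `kmer_coverage` and `greedy_primer_set` and the `min_coverage` parameter are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def kmer_coverage(sequences, k):
    """Map every k-mer to the set of indices of the sequences containing it."""
    coverage = defaultdict(set)
    for idx, seq in enumerate(sequences):
        for start in range(len(seq) - k + 1):
            coverage[seq[start:start + k]].add(idx)
    return coverage


def greedy_primer_set(sequences, k, min_coverage=1):
    """Pick a small set of k-mers that together cover many sequences.

    Illustrative greedy set cover (an assumption, not Prider's algorithm):
    candidates covering fewer than `min_coverage` sequences are discarded
    (narrow coverage), and a candidate is only chosen if it covers at least
    one sequence that is still uncovered (avoids redundant picks).
    """
    coverage = kmer_coverage(sequences, k)
    candidates = {p: c for p, c in coverage.items() if len(c) >= min_coverage}
    uncovered = set(range(len(sequences)))
    chosen = []
    while uncovered and candidates:
        # Sorting first makes tie-breaking deterministic (alphabetical).
        best = max(sorted(candidates),
                   key=lambda p: len(candidates[p] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            break  # remaining candidates add nothing new
        chosen.append(best)
        uncovered -= gained
        del candidates[best]
    return chosen, uncovered


if __name__ == "__main__":
    seqs = ["ACGTAC", "TTACGT", "GGGGGG"]
    primers, missed = greedy_primer_set(seqs, k=4)
    print(primers, missed)
```

In this toy input, one k-mer shared by the first two sequences is picked before the k-mer specific to the third. Raising `min_coverage` to 2 drops the sequence-specific candidate and leaves the third sequence uncovered, which mirrors the trade-off between the number of oligonucleotides and the diversity covered.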
Prider solves a recalcitrant problem in molecular diagnostics: how to cover a maximal sequence diversity with a minimal number of oligonucleotide primers or probes. The combination of Prider with highly scalable molecular quantification techniques will permit an unprecedented molecular screening capability with immediate applicability in fields such as clinical microbiology, epidemic virus surveillance or antimicrobial resistance surveillance.