Yang Guannan, Menkhorst Ellen, Dimitriadis Evdokia, Lê Cao Kim-Anh
Melbourne Integrative Genomics, School of Mathematics and Statistics, The University of Melbourne, Parkville, Victoria 3010, Australia.
Department of Obstetrics and Gynaecology, The University of Melbourne, Parkville, Victoria 3010, Australia.
Bioinformatics. 2025 Aug 29. doi: 10.1093/bioinformatics/btaf475.
Integrating the knockoff framework with any variable-selection method delivers stringent false discovery rate (FDR) control without recourse to p-values, offering a powerful alternative for differential expression analysis of high-throughput omics datasets. However, existing knockoff generators rely on restrictive modelling assumptions or coarse approximations that often inflate the FDR when applied to real-world data.
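To illustrate how a knockoff filter can control the FDR without p-values, the following is a minimal sketch of the standard knockoff+ selection step. It is not the PLSKO generator itself. It assumes that a feature statistic W_j has already been computed for each variable (for example, the difference in importance between a variable and its knockoff copy), so that large positive values suggest a true signal. The function name `knockoff_plus_select` and the toy values are hypothetical and chosen for illustration only.

```python
import numpy as np

def knockoff_plus_select(W, q=0.1):
    """Return indices selected by the knockoff+ threshold at target FDR q.

    W is assumed to hold one knockoff statistic per variable, where a
    large positive value favours the original variable over its knockoff.
    """
    W = np.asarray(W, dtype=float)
    # Candidate thresholds: the nonzero magnitudes of W, in ascending order.
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        # Estimated false discovery proportion at threshold t:
        # negatives beyond -t stand in for false positives beyond +t.
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return np.flatnonzero(W >= t)
    # No threshold meets the target, so nothing is selected.
    return np.array([], dtype=int)

# Toy statistics, made up for illustration.
W = np.array([5.0, 4.2, 3.9, 3.1, 2.8, 2.5, -0.4, 0.3, -0.2, 0.1, 2.2, 1.9])
selected = knockoff_plus_select(W, q=0.2)
print(selected)
```

The selection uses only the signs and magnitudes of W, which is why the validity of the generated knockoffs, rather than any p-value, determines whether the FDR guarantee holds.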
We introduce Partial Least Squares Knockoff (PLSKO), an efficient, assumption-free generator that remains robust across diverse omics platforms. In extensive simulations, PLSKO is the only method that maintains FDR control with sufficient power in complex non-linear settings. Semi-simulation studies based on RNA-seq, proteomics, metabolomics, and microbiome experiments confirm that PLSKO generates valid knockoff variables. In pre-eclampsia multi-omics case studies, we combine PLSKO with Aggregation Knockoff to address the randomness of knockoffs and improve power, and we demonstrate the method's ability to recover biologically meaningful features.
Our proposed algorithm is available on GitHub (https://github.com/guannan-yang/PLSKO) and Zenodo (https://doi.org/10.5281/zenodo.16879594).
Supplementary information is available online.