Kanduri Chakravarthi, Mamica Maria, Olstad Emilie Willoch, Zucknick Manuela, Li Jingyi Jessica, Sandve Geir Kjetil
Scientific Computing and Machine Learning Section, Department of Informatics, University of Oslo, Oslo, Norway.
UiORealArt Convergence Environment, University of Oslo, Oslo, Norway.
Genome Biol. 2025 Aug 18;26(1):249. doi: 10.1186/s13059-025-03734-z.
The Benjamini-Hochberg (BH) procedure for controlling the false discovery rate (FDR) is a popular choice in omics fields. Here, we demonstrate that in datasets with strong dependencies between features, FDR correction methods such as BH can sometimes, counter-intuitively, report very large numbers of false positives, potentially misleading researchers. We urge researchers to use suitable multiple testing strategies, together with approaches such as synthetic null data (negative controls), to identify and minimize problems related to false discoveries, because when false findings do occur, they may be numerous.
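The behaviour described in the abstract can be illustrated with a small synthetic null simulation. The sketch below is not the authors' analysis. It assumes a deliberately simple model in which every feature is null and the z-scores share one latent factor, giving a pairwise correlation `rho`. The function names, the correlation value, and the dataset sizes are all hypothetical choices made for illustration. Because no feature carries a true signal, every BH rejection is a false positive.

```python
import math
import random


def benjamini_hochberg(pvals, alpha=0.05):
    """Return a list of booleans, True where the BH step-up procedure rejects."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest 1-based rank k with p_(k) <= alpha * k / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= alpha * rank / m:
            k_max = rank
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


def null_pvalues(m, rho, rng):
    """Two-sided p-values for m null z-scores with pairwise correlation rho.

    Assumed model: z_j = sqrt(rho) * shared + sqrt(1 - rho) * noise_j,
    so each z_j is marginally standard normal.
    """
    shared = rng.gauss(0.0, 1.0)
    pvals = []
    for _ in range(m):
        z = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
        # erfc(|z| / sqrt(2)) is the two-sided normal tail probability.
        pvals.append(math.erfc(abs(z) / math.sqrt(2.0)))
    return pvals


def false_positive_counts(n_datasets, m, rho, seed=0):
    """Number of BH rejections in each of n_datasets all-null datasets."""
    rng = random.Random(seed)
    return [
        sum(benjamini_hochberg(null_pvalues(m, rho, rng)))
        for _ in range(n_datasets)
    ]


independent = false_positive_counts(1000, 200, rho=0.0)
dependent = false_positive_counts(1000, 200, rho=0.8)

for label, counts in (("independent", independent), ("dependent", dependent)):
    with_hits = sum(1 for c in counts if c > 0)
    print(
        f"{label}: {with_hits} of {len(counts)} datasets have any false positive; "
        f"largest count in a single dataset = {max(counts)}"
    )
```

Comparing the two printed lines shows how the number of false positives is distributed across datasets, not just how often any occur. Under this assumed model, the dependent setting would be expected to leave most datasets with no rejections while a few datasets accumulate many at once, which is the pattern a synthetic null (negative control) run is meant to expose.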