Torres Isaac, Zhang Shufan, Bouffier Amanda, Skaro Michael, Wu Yue, Stupp Lauren, Arnold Jonathan, Chung Y Anny, Schuttler H-Bernd
Institute of Bioinformatics, University of Georgia, Athens, GA 30602 USA.
Department of Genetics, Stanford University, Stanford, CA 94309 USA.
Brief Bioinform. 2025 Mar 4;26(2). doi: 10.1093/bib/bbaf167.
The Maximally Informative Next Experiment (MINE) is a new approach to experimental design for studies, such as those in omics, in which the number of effects or parameters p greatly exceeds the number of samples n (p > n). Classical experimental design presumes n > p for inference about parameters, and applying it when p > n can lead to over-fitting. To cope with p > n, MINE uses an ensemble method: it makes predictions about future experiments from an ensemble of models consistent with the available data and uses those predictions to select the most informative next experiment. Its advantages are that it supports exploration of the data for new relationships when n < p, and that it can replace one large classical experiment with smaller, more tractable experiments that are integrated adaptively as discoveries are made. The use of MINE in a large omics study is thus model-guided and adaptive over time. Here, MINE is illustrated in two distinct multiyear experiments, one involving genetic networks in Neurospora crassa and a second involving a genome-wide association study in Sorghum bicolor as a comparison with classical experimental design in an agricultural setting.
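The selection step described above can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: the models are toy linear models, the ensemble is a small hand-built grid rather than a sampled one, and the variance of the ensemble's predictions stands in for the informativeness criterion. All function and variable names are hypothetical.

```python
import itertools
import statistics


def predict(model, experiment):
    """Toy linear model (assumption): prediction = parameters . design vector."""
    return sum(w * x for w, x in zip(model, experiment))


def disagreement(ensemble, experiment):
    """Population variance of the ensemble's predictions for one candidate.

    Used here as a simple stand-in for an informativeness score.
    """
    return statistics.pvariance([predict(m, experiment) for m in ensemble])


def select_next_experiment(ensemble, candidates):
    """Return the candidate experiment on which the ensemble disagrees most."""
    return max(candidates, key=lambda e: disagreement(ensemble, e))


# p = 3 parameters, n = 1 observation: design (1, 0, 0) gave the response 2.0.
# Every model with w1 = 2.0 fits that observation; w2 and w3 are unconstrained,
# so the ensemble spans a grid of values for them.
ensemble = [
    (2.0, w2, w3)
    for w2, w3 in itertools.product([-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0])
]

candidates = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
best = select_next_experiment(ensemble, candidates)
print(best)
```

In this toy setting, repeating the experiment already performed scores zero because every model in the ensemble agrees on its outcome, while the chosen design probes the parameter the existing data constrain least. After the chosen experiment is run, the ensemble would be rebuilt from the enlarged data set and the selection repeated.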