Computational Biology Unit (CBU), Department of Informatics, University of Bergen, 5008 Bergen, Norway, Proteomics Unit (PROBE), Department of Biomedicine, University of Bergen, 5020 Bergen, Norway, and Department of Medical Genetics, Haukeland University Hospital, 5021 Bergen, Norway.
Department of Clinical Science, University of Bergen, 5020 Bergen, Norway.
Biostatistics. 2023 Oct 18;24(4):1031-1044. doi: 10.1093/biostatistics/kxac014.
Experimental design usually focuses on the setting where treatments and/or other aspects of interest can be manipulated. However, in observational biomedical studies with sequential processing, the set of available samples is often fixed, and the problem is thus rather the ordering and allocation of samples to batches such that comparisons between different treatments can be made with similar precision. In certain situations, this allocation can be done by hand, but this rapidly becomes impractical with more challenging cohort setups. Here, we present a fast and intuitive algorithm to generate balanced allocations of samples to batches for any single-variable model where the treatment variable is nominal. This greatly simplifies the grouping of samples into batches, makes the process reproducible, and provides a marked improvement over completely random allocations. The general challenges of allocation and why good solutions can be hard to find are also discussed, as well as potential extensions to multivariable settings.
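To make the allocation problem concrete, here is a minimal sketch of a generic stratified round-robin allocation for a single nominal treatment variable. It is not the authors' algorithm, which the abstract does not specify. The function name `allocate_balanced`, its parameters, and the example labels are all hypothetical and chosen for illustration only.

```python
import random
from collections import defaultdict


def allocate_balanced(labels, n_batches, seed=0):
    """Spread each treatment group as evenly as possible across batches.

    Illustrative stratified round-robin only; an assumption for this sketch,
    not the method published in the paper.
    """
    rng = random.Random(seed)  # fixed seed keeps the allocation reproducible

    # Group sample indices by their nominal treatment label.
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)

    batches = [[] for _ in range(n_batches)]
    position = 0  # running counter shared by all groups keeps batch sizes even
    for label in sorted(groups):
        members = groups[label]
        rng.shuffle(members)  # randomize which samples of a group go where
        for idx in members:
            batches[position % n_batches].append(idx)
            position += 1

    for batch in batches:
        rng.shuffle(batch)  # randomize processing order within each batch
    return batches


# Hypothetical cohort: 6 samples of treatment A, 4 of B, 2 of C, in 3 batches.
labels = ["A"] * 6 + ["B"] * 4 + ["C"] * 2
batches = allocate_balanced(labels, n_batches=3)
for number, batch in enumerate(batches, start=1):
    print(number, sorted(labels[i] for i in batch))
```

Because every group is dealt out over consecutive positions of one shared counter, the number of samples from any group differs by at most one between batches, and so do the overall batch sizes. A completely random allocation carries no such guarantee, which is the contrast the abstract draws.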