University of Missouri.
Florida State University.
Multivariate Behav Res. 2021 Jan-Feb;56(1):57-69. doi: 10.1080/00273171.2020.1717921. Epub 2020 Feb 13.
Using complete enumeration (e.g., generating all possible subsets of item combinations) to evaluate clustering problems has the benefit of locating globally optimal solutions automatically, without concern for sampling variability. The proposed method combines clustering variables so as to create groups that are maximally different on one or more theoretically sound derivation variables. Once the population of all unique item sets has been enumerated, optimization on a predefined, user-specified function can occur. We apply this technique to optimizing the diagnosis of Alcohol Use Disorder. From a clustering point of view this is a unique application, in that the decision rule for assigning observations to the "diagnosis" group relies on both the set of items being considered and a predefined threshold on the number of items that must be endorsed for the "diagnosis" to occur. In optimizing diagnostic rules, criteria set sizes can be reduced without a loss of significant information when compared with current and proposed alternative diagnostic schemes.
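The enumeration described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy data, and the objective (absolute difference in group means on a single derivation variable) are all assumptions chosen for illustration. A rule is taken to be a pair (item subset, threshold), and an observation is assigned to the "diagnosis" group when it endorses at least `threshold` items in the subset.

```python
from itertools import combinations
from statistics import mean


def enumerate_rules(n_items):
    """Yield every (item_subset, threshold) pair.

    Covers all non-empty subsets of item indices and, for each subset,
    every threshold from 1 up to the subset size.
    """
    for size in range(1, n_items + 1):
        for subset in combinations(range(n_items), size):
            for threshold in range(1, size + 1):
                yield subset, threshold


def diagnose(responses, subset, threshold):
    """Flag each observation that endorses at least `threshold` items in `subset`."""
    return [sum(row[i] for i in subset) >= threshold for row in responses]


def mean_difference(flags, outcome):
    """Hypothetical objective: absolute gap in mean outcome between the two groups."""
    pos = [y for f, y in zip(flags, outcome) if f]
    neg = [y for f, y in zip(flags, outcome) if not f]
    if not pos or not neg:
        # A rule that puts everyone in one group cannot be scored.
        return float("-inf")
    return abs(mean(pos) - mean(neg))


def best_rule(responses, outcome, objective=mean_difference):
    """Score every rule exhaustively and return the one with the highest objective."""
    n_items = len(responses[0])
    return max(
        enumerate_rules(n_items),
        key=lambda rule: objective(diagnose(responses, *rule), outcome),
    )


# Toy data (invented): 6 observations, 3 binary items, one derivation variable.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [0, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
]
outcome = [9.0, 8.0, 2.0, 1.0, 9.5, 1.5]

print(best_rule(responses, outcome))  # ((0,), 1)
```

Because every rule is scored, the returned rule is the global optimum for the chosen objective on these data, with no dependence on random starts or sampling. The cost is the size of the search space: for n items there are n · 2^(n-1) (subset, threshold) pairs, which is 12 for the three toy items above and grows exponentially with n.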