From the Division of Epidemiology and Biostatistics, University of California, Berkeley, CA.
Department of Neurological Surgery, University of California, San Francisco, CA.
Epidemiology. 2020 Mar;31(2):224-228. doi: 10.1097/EDE.0000000000001122.
Until recently, large individual-level longitudinal datasets were unavailable for investigating disease clusters; now that they exist, suitable statistical tools are needed. We introduce a robust, efficient, intuitive R package, ClustR, for space-time cluster analysis of individual-level data.
We developed ClustR and evaluated the tool using a simulated dataset that mirrors the population of California and contains constructed clusters. We assessed ClustR's performance under various conditions and compared it with another space-time clustering algorithm, SaTScan.
ClustR mostly exhibited high sensitivity for urban clusters and low sensitivity for rural clusters. Specificity was generally high. Compared with SaTScan, ClustR ran faster and demonstrated similar sensitivity, but had lower specificity. Some cluster types were detected better by ClustR and others by SaTScan.
ClustR is a user-friendly, publicly available tool designed to perform efficient cluster analysis on individual-level data, filling a gap among current tools. ClustR and SaTScan exhibited different strengths and may be useful in conjunction.
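The sensitivity and specificity figures reported above depend on how detections are scored against the constructed clusters, which the abstract does not spell out. As a minimal sketch, assuming scoring is done per individual (a case counts as a true positive if it belongs to a constructed cluster and is flagged by the algorithm), the two measures could be computed as follows. The function name, arguments, and toy data are hypothetical and are not part of ClustR or SaTScan.

```python
def sensitivity_specificity(all_ids, true_cluster_ids, flagged_ids):
    """Score flagged individuals against a known (simulated) cluster.

    Assumption: scoring is per individual. The paper may instead score
    per cluster or per region; this is only an illustration.
    """
    all_ids = set(all_ids)
    truth = set(true_cluster_ids) & all_ids
    flagged = set(flagged_ids) & all_ids

    tp = len(truth & flagged)          # in a cluster and flagged
    fn = len(truth - flagged)          # in a cluster but missed
    fp = len(flagged - truth)          # flagged but not in a cluster
    tn = len(all_ids) - tp - fn - fp   # correctly left unflagged

    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy data (hypothetical): 10 individuals, 4 in a constructed cluster,
# and an algorithm that flags 3 individuals, 2 of them correctly.
population = range(1, 11)
truth = {1, 2, 3, 4}
flagged = {3, 4, 5}

sens, spec = sensitivity_specificity(population, truth, flagged)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Under this per-individual definition, an algorithm that flags larger areas would tend to trade specificity for sensitivity, which is one way two tools can show similar sensitivity but different specificity.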