Zhang Zhongheng, Gayle Alberto Alexander, Wang Juan, Zhang Haoyang, Cardinal-Fernández Pablo
Department of Emergency Medicine, Sir Run-Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou 310016, China.
Department of Immunology, Mie University Graduate School of Medicine, Mie, Japan.
Ann Transl Med. 2017 Dec;5(24):484. doi: 10.21037/atm.2017.09.39.
A common practice in observational studies is to compare the baseline characteristics of participants between study groups. The overall population can be grouped by clinical outcome or by exposure status. The results are usually presented in a single table that reports baseline characteristics for the overall population and then separately for each group, with the last column giving the P value for the between-group comparison. In the conventional research model, data are collected for only a limited number of variables, so it is feasible to calculate descriptive statistics one by one and to create the table manually. The availability of electronic health records (EHR) and big data mining techniques makes it possible to explore a far larger number of variables. However, manual tabulation of big data is particularly error-prone, and creating and revising such tables by hand is exceedingly time-consuming. In this paper, we introduce an R package called CBCgrps, which is designed to automate and streamline the generation of such tables when working with big data. The package contains two functions, twogrps() and multigrps(), which are used for comparisons between two groups and among multiple groups, respectively.
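To make the idea of an automated baseline table concrete, the following is a minimal base R sketch of a two-group comparison. It is an illustration only and does not reproduce the CBCgrps implementation: the helper names compare_one() and baseline_table(), the simulated data frame cohort, and its columns (mort, age, lac, sex) are all hypothetical. The choice of tests is also an assumption made for brevity: a Welch t-test for every numeric variable and a chi-square test for every categorical variable.

```r
# Simulated cohort; every variable name here is hypothetical.
set.seed(42)
n <- 200
cohort <- data.frame(
  mort = factor(sample(c("alive", "dead"), n, replace = TRUE)),
  age  = rnorm(n, mean = 60, sd = 12),
  lac  = rexp(n, rate = 0.5),
  sex  = factor(sample(c("female", "male"), n, replace = TRUE))
)

# Summarise one variable per group and test the between-group difference.
# Returns a named character vector: label, one cell per group, P value.
compare_one <- function(name, x, group) {
  if (is.numeric(x)) {
    # Numeric: mean (SD) per group, Welch two-sample t-test.
    parts <- split(x, group)
    cells <- sapply(parts, function(v) sprintf("%.2f (%.2f)", mean(v), sd(v)))
    p     <- t.test(parts[[1]], parts[[2]])$p.value
    label <- name
  } else {
    # Categorical: n (%) of the last level per group, chi-square test
    # (chisq.test applies Yates' correction to 2 x 2 tables by default).
    x     <- factor(x)
    tab   <- table(x, group)
    cnt   <- tab[nlevels(x), ]
    cells <- setNames(sprintf("%d (%.1f%%)", cnt, 100 * cnt / colSums(tab)),
                      levels(group))
    p     <- chisq.test(tab)$p.value
    label <- paste0(name, " = ", levels(x)[nlevels(x)])
  }
  c(variable = label, cells, p = formatC(p, format = "f", digits = 3))
}

# Build the table for every column except the grouping variable.
baseline_table <- function(df, gvar) {
  group <- factor(df[[gvar]])
  stopifnot(nlevels(group) == 2)
  vars <- setdiff(names(df), gvar)
  rows <- lapply(vars, function(v) compare_one(v, df[[v]], group))
  as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
}

tab1 <- baseline_table(cohort, gvar = "mort")
print(tab1)
```

Even in this small sketch, each variable type needs its own summary and test, which is the bookkeeping that becomes error-prone when repeated by hand over many variables.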