Indiana University School of Social Work, 902 West New York Street, Indianapolis, IN 46202-5156, USA.
J Ment Health Policy Econ. 2023 Jun 1;26(2):63-76.
Human resources (HR) departments collect extensive employee data that can be useful for predicting turnover. However, these data are rarely used to address turnover because of the complexity of the forms in which they are recorded.
The goal of the current study was to predict community mental health center employees' turnover by applying machine learning (ML) methods to HR data and to evaluate the feasibility of the ML approaches.
Historical HR data were obtained from two community mental health centers, and two ML models, random forest and lasso regression, were trained on these data.
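To illustrate how a lasso penalty selects predictors, the following is a minimal sketch of L1-penalized logistic regression fitted by proximal gradient descent. It is not the study's implementation, which the abstract does not describe. The logistic form (chosen here because turnover is a binary outcome), the synthetic data, the feature names `tenure` and `noise`, and all function names are assumptions made for illustration only.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(value, threshold):
    """Shrink a value toward zero; values within the threshold become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(rows, labels, penalty, step=0.1, iterations=500):
    """Fit L1-penalized logistic regression by proximal gradient descent."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(iterations):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_b += error / n
            for j, v in enumerate(x):
                grad_w[j] += error * v / n
        bias -= step * grad_b  # the intercept is not penalized
        weights = [
            soft_threshold(w - step * g, step * penalty)
            for w, g in zip(weights, grad_w)
        ]
    return weights, bias


# Synthetic, hypothetical data: shorter (standardized) tenure raises the
# chance of leaving, while the second feature is pure noise.
rng = random.Random(0)
rows, labels = [], []
for _ in range(200):
    tenure = rng.gauss(0.0, 1.0)
    noise = rng.gauss(0.0, 1.0)
    rows.append([tenure, noise])
    labels.append(1 if rng.random() < sigmoid(-2.0 * tenure) else 0)

weak_weights, _ = fit_lasso_logistic(rows, labels, penalty=0.01)
strong_weights, _ = fit_lasso_logistic(rows, labels, penalty=10.0)
print("light penalty:", weak_weights)
print("heavy penalty:", strong_weights)
```

With a light penalty the tenure coefficient is clearly negative, and with a very heavy penalty every coefficient is shrunk to exactly zero. In practice the penalty strength would be tuned, for example by cross-validation, rather than fixed by hand as it is here.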
The results suggested good predictive accuracy for turnover, with the random forest model (e.g., area under the curve above .8) generally outperforming the lasso regression model. The ML methods also identified several important predictors of turnover in the historical HR data (e.g., past work years, wage, work hours, age, job position, training hours, and marital status). The HR data extraction processes for ML applications were also evaluated as feasible.
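For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen employee who left receives a higher predicted risk than a randomly chosen employee who stayed. The sketch below computes it directly from that definition. The labels and scores are made-up values for illustration, not data from the study, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for employees who left, 0 for those who stayed.
    scores: predicted turnover probabilities (higher means riskier).
    Ties between a leaver and a stayer count as half a win.
    """
    leavers = [s for y, s in zip(labels, scores) if y == 1]
    stayers = [s for y, s in zip(labels, scores) if y == 0]
    if not leavers or not stayers:
        raise ValueError("AUC needs at least one leaver and one stayer")
    wins = 0.0
    for leaver_score in leavers:
        for stayer_score in stayers:
            if leaver_score > stayer_score:
                wins += 1.0
            elif leaver_score == stayer_score:
                wins += 0.5
    return wins / (len(leavers) * len(stayers))


# Made-up example: three leavers and three stayers.
example_labels = [1, 0, 1, 0, 0, 1]
example_scores = [0.9, 0.2, 0.4, 0.5, 0.1, 0.7]
example_auc = roc_auc(example_labels, example_scores)
print(round(example_auc, 3))  # 8 of 9 leaver/stayer pairs are ordered correctly
```

An AUC of .5 corresponds to random ranking and 1.0 to perfect ranking, so a value above .8 indicates that the model orders most leaver/stayer pairs correctly.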
The current study confirmed the feasibility of ML approaches for predicting individual employees' turnover probabilities using HR data that the organizations had already collected in their routine management practice. The developed approaches can be used to identify employees who are at high risk for turnover. Because our primary purpose was to apply ML methods to estimate an individual employee's turnover probability given their available HR data (rather than to determine generalizable predictors at the wider population level), our findings are limited to the specific organizations studied. As ML applications accumulate across organizations, some findings may prove generalizable across organizations, while others may remain organization-specific (idiographic).
The organization-specific findings can help an organization's HR and leadership evaluate and address turnover in their specific organizational contexts. For many mental health organizations, preventing extensive turnover has been a significant priority in maintaining the quality of services for clients.
The generalizable findings may contribute to broader policy and workforce development efforts.
As part of our continuing research, it is important to study how ML methods and outputs can be used meaningfully in routine management and leadership practice in mental health settings, beyond identifying individuals at high risk of turnover. This includes how to develop organization-tailored intervention strategies to support and retain employees. Such organization-based intervention strategies with ML applications can be accumulated and shared among organizations, which would facilitate evidence-based learning communities that address turnover. This, in turn, may enhance the quality of care offered to clients. These continuing efforts should yield new insights into data-driven, evidence-based turnover prediction and prevention strategies that draw on often under-utilized HR data.