Shin Eun Kyong, Mahajan Ruhi, Akbilgic Oguz, Shaban-Nejad Arash
1Department of Pediatrics, University of Tennessee Health Science Center-Oak Ridge National Laboratory (UTHSC-ORNL) Center for Biomedical Informatics, Memphis, TN USA.
2Department of Preventive Medicine, UTHSC, Memphis, TN USA.
NPJ Digit Med. 2018 Oct 2;1:50. doi: 10.1038/s41746-018-0056-y. eCollection 2018.
The importance of social components of health has been emphasized in both epidemiology and public health. This paper highlights the significant impact of social components on health outcomes in a novel way. We introduce the concept of sociomarkers, measurable indicators of the social conditions in which a patient is embedded, and employ a machine learning approach that uses both biomarkers and sociomarkers to identify asthma patients at risk of a hospital revisit after an initial visit, with an accuracy of 66%. The analysis was performed on an integrated dataset consisting of individual-level patient information, such as gender, race, insurance type, and age, along with ZIP code-level sociomarkers, such as poverty level, blight prevalence, and housing quality. Using this uniquely integrated database, we then compare the traditional biomarker-based risk model with the sociomarker-based risk model. The biomarker-based predictive model yields an accuracy of 65%, and the sociomarker-based model yields an accuracy of 61%. Without knowing specific symptom-related features, the sociomarker-based model can correctly predict two out of three patients at risk. We systematically show that sociomarkers play an important role in predicting individual-level health outcomes in pediatric asthma cases. Additionally, by merging multiple data sources with detailed neighborhood-level data, we directly measure the importance of residential conditions for predicting individual health outcomes.