Tilmon Sandra, Nyenhuis Sharmilee, Solomonides Anthony, Barbarioli Bruno, Bhargava Ankur, Birz Suzi, Bouzein Kathryn, Cardenas Celine, Carlson Bradley, Cohen Ellen, Dillon Emily, Furner Brian, Huang Zhong, Johnson Julie, Krishnan Nivedha, Lazenby Kevin, Li Kaitlyn, Makhni Sonya, Miler Doriane, Ozik Jonathan, Santos Carlos, Sleiman Marc, Solway Julian, Krishnan Sanjay, Volchenboum Samuel
Pediatrics, University of Chicago, Chicago, IL, USA.
Medicine, University of Chicago, Chicago, IL, USA.
J Clin Transl Sci. 2023 Nov 7;7(1):e255. doi: 10.1017/cts.2023.670. eCollection 2023.
BACKGROUND/OBJECTIVE: Non-clinical aspects of life, such as social, environmental, behavioral, psychological, and economic factors, which we collectively call the sociome, play significant roles in shaping patient health and health outcomes. This paper introduces the Sociome Data Commons (SDC), a new research platform that enables large-scale data analysis for investigating such factors.
METHODS: This platform focuses on "hyper-local" data, i.e., data at the neighborhood or point level, a geospatial scale not adequately considered in existing tools and projects. We enumerate key insights gained regarding data quality standards, data governance, and organizational structure for long-term project sustainability. A pilot use case investigating sociome factors associated with asthma exacerbations in children residing on the South Side of Chicago used machine learning and six SDC datasets.
RESULTS: The pilot use case reveals one dominant spatial cluster for asthma exacerbations and important roles of housing conditions and cost, proximity to Superfund pollution sites, urban flooding, violent crime, lack of insurance, and a poverty index.
CONCLUSIONS: The SDC has been purposefully designed to support and encourage extension of the platform to new datasets, as well as the continued development, refinement, and adoption of standards for dataset quality, dataset inclusion, metadata annotation, and data access/governance. The asthma pilot has served as the first driver use case and demonstrates promise for future investigation into the sociome and clinical outcomes. Additional projects will be selected in part for their ability to exercise and grow the capacity of the SDC to meet its ambitious goals.