Sriram Vivek, Conard Ashley Mae, Rosenberg Ilyana, Kim Dokyoon, Saponas T Scott, Hall Amanda K
Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.
Genomics and Computational Biology Graduate Group, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.
Sci Rep. 2025 Feb 21;15(1):6291. doi: 10.1038/s41598-025-90453-x.
Biomedical discovery is fraught with challenges stemming from diverse data types and siloed analysis. In this study, we explored common biomedical data tasks and pain points that could be addressed to elevate data quality, enhance sharing, streamline analysis, and foster collaboration across stakeholders. We recruited fifteen professionals from various biomedical roles and industries to participate in sixty-minute semi-structured interviews, which involved an assessment of their challenges, needs, and tasks as well as a brainstorming exercise to validate each professional's research process. We analyzed the individual interviews qualitatively, using an inductive-deductive thematic coding approach to identify emerging themes. We identified a common set of challenges related to procuring and validating data, applying new analysis techniques and navigating varied computational environments, distributing results effectively and reproducibly, and managing the flow of data across phases of the data lifecycle. Our findings emphasize the importance of secure data sharing and facilities for collaboration throughout the discovery process. The pain points we identified give researchers an opportunity to align workstreams and improve the research data lifecycles that support biomedical discovery. We conclude with a summary of key actionable recommendations for tackling multiomic data challenges across the stages and phases of biomedical discovery.