文献检索，用中文搜 PubMed

BACKGROUND

Health care data are increasing in volume and complexity. Storing and analyzing these data to implement precision medicine initiatives and data-driven research has exceeded the capabilities of traditional computer systems. Modern big data platforms must be adapted to the specific demands of health care and designed for scalability and growth.

OBJECTIVE

The objectives of our study were to (1) demonstrate the implementation of a data science platform built on open source technology within a large, academic health care system and (2) describe 2 computational health care applications built on such a platform.

METHODS

We deployed a data science platform based on several open source technologies to support real-time, big data workloads. We developed data-acquisition workflows for Apache Storm and NiFi in Java and Python to capture patient monitoring and laboratory data for downstream analytics.

RESULTS

Emerging data management approaches, along with open source technologies such as Hadoop, can be used to create integrated data lakes to store large, real-time datasets. This infrastructure also provides a robust analytics platform where health care and biomedical research data can be analyzed in near real time for precision medicine and computational health care use cases.

CONCLUSIONS

The implementation and use of integrated data science platforms offer organizations the opportunity to combine traditional datasets, including data from the electronic health record, with emerging big data sources, such as continuous patient monitoring and real-time laboratory results. These platforms can enable cost-effective and scalable analytics for the information that will be key to the delivery of precision medicine initiatives. Organizations that can take advantage of the technical advances found in data science platforms will have the opportunity to provide comprehensive access to health care data for computational health care and precision medicine research.

BACKGROUND

OBJECTIVE

METHODS

RESULTS

CONCLUSIONS

背景

医疗保健数据的数量和复杂性正在不断增加。存储和分析这些数据以实施精准医疗计划和数据驱动型研究已超出传统计算机系统的能力范围。现代大数据平台必须适应医疗保健的特定需求，并设计为具有可扩展性和增长性。

目的

我们研究的目的是：（1）展示在大型学术医疗保健系统中基于开源技术构建的数据科学平台的实施情况；（2）描述基于此类平台构建的两个计算医疗保健应用程序。

方法

我们部署了一个基于多种开源技术的数据科学平台，以支持实时大数据工作负载。我们用Java和Python为Apache Storm和NiFi开发了数据采集工作流程，以捕获患者监测数据和实验室数据，用于下游分析。

结果

新兴的数据管理方法以及诸如Hadoop之类的开源技术可用于创建集成数据湖，以存储大型实时数据集。该基础设施还提供了一个强大的分析平台，可在该平台上对医疗保健和生物医学研究数据进行近实时分析，以用于精准医疗和计算医疗保健用例。

结论

集成数据科学平台的实施和使用为各组织提供了机会，可将包括电子健康记录数据在内的传统数据集与新兴大数据源（如持续的患者监测数据和实时实验室结果）相结合。这些平台可为对精准医疗计划的实施至关重要的信息提供具有成本效益且可扩展的分析。能够利用数据科学平台中技术进步的组织将有机会为计算医疗保健和精准医学研究提供全面的医疗保健数据访问。

Suppr 超能文献

文献检索

文件翻译

深度研究

Suppr 超能文献

文献检索

文件翻译

深度研究

医疗保健与精准医学研究：一个可扩展数据科学平台的分析

Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform.

作者信息

机构信息

出版信息

BACKGROUND

OBJECTIVE

METHODS

RESULTS

CONCLUSIONS

相似文献

引用本文的文献

本文引用的文献

医疗保健与精准医学研究：一个可扩展数据科学平台的分析

Health Care and Precision Medicine Research: Analysis of a Scalable Data Science Platform.

作者信息

机构信息

出版信息

BACKGROUND

OBJECTIVE

METHODS

RESULTS

CONCLUSIONS

背景

目的

方法

结果

结论

相似文献

引用本文的文献

本文引用的文献