Design and Development of a Big Data Platform for Disease Burden Based on the Spark Engine.

Author affiliations

School of Public Health and Management, Guangzhou University of Chinese Medicine, Guangzhou 510006, China.

College of Physical Education and Health, Guangxi Medical University, Nanning 530021, China.

Publication information

Comput Intell Neurosci. 2023 Feb 6;2023:8963053. doi: 10.1155/2023/8963053. eCollection 2023.

Abstract

OBJECTIVE

This study aims to build a big data platform for disease burden that deeply couples artificial intelligence with public health. The platform is intended to be open, shared, and intelligent, covering big data collection, analysis, and visualization of results.

METHODS

Drawing on data mining theory and technology, the study analyzed the current state of multisource data on disease burden. It proposes a management model, functional modules, and a technical framework for disease burden big data, and uses Kafka to improve the transmission efficiency of the underlying data. Embedding Spark MLlib in the Hadoop ecosystem is intended to make the platform efficient and highly scalable for data analysis.

RESULTS

Following the concept of "Internet + medical integration," the study proposes an overall architecture for the disease burden management big data platform, based on the Spark engine and the Python language. The main system components and application scenarios are described at four levels, chosen according to application scenarios and usage requirements: multisource data collection, data processing, data analysis, and the application layer.

CONCLUSION

The big data platform for disease burden management helps promote the convergence of multisource disease burden data and offers a new path toward a standardized paradigm for disease burden measurement. It also provides methods and ideas for the deep integration of medical big data and for the formation of a broader standard paradigm.
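
The four levels named in the results (multisource data collection, data processing, data analysis, and the application layer) can be pictured as a chain of stages. The following is a minimal sketch under stated assumptions, not the authors' implementation: it uses only the Python standard library, an in-memory `Queue` stands in for a message broker such as Kafka, a plain per-group sum stands in for whatever analysis the platform runs on Spark, and the record layout, function names, and sample values are all hypothetical.

```python
from collections import defaultdict
from queue import Queue

# Hypothetical, made-up sample records: (source, region, disease, cases).
# Neither the field names nor the values come from the paper.
RAW_RECORDS = [
    ("hospital", "region_a", "diabetes", "120"),
    ("registry", "region_a", "diabetes", "30"),
    ("hospital", "region_b", "stroke", "45"),
    ("survey", "region_b", "stroke", "n/a"),  # malformed count
]


def collect(records):
    """Collection layer: push records from several sources onto one queue.

    The in-memory Queue is only a stand-in for a broker such as Kafka.
    """
    q = Queue()
    for record in records:
        q.put(record)
    return q


def process(q):
    """Processing layer: drain the queue and keep rows with a numeric count."""
    cleaned = []
    while not q.empty():
        source, region, disease, cases = q.get()
        if cases.isdigit():
            cleaned.append(
                {"source": source, "region": region,
                 "disease": disease, "cases": int(cases)}
            )
    return cleaned


def analyze(rows):
    """Analysis layer: total cases per (region, disease) pair."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["region"], row["disease"])] += row["cases"]
    return dict(totals)


def present(totals):
    """Application layer: render the totals as sorted, tab-separated lines."""
    return [
        f"{region}\t{disease}\t{cases}"
        for (region, disease), cases in sorted(totals.items())
    ]


def run_pipeline(records):
    """Run the four layers in order and return the rendered lines."""
    return present(analyze(process(collect(records))))


if __name__ == "__main__":
    for line in run_pipeline(RAW_RECORDS):
        print(line)
```

Keeping each layer as a separate function with a plain data hand-off mirrors the layered design described in the abstract: in a real deployment the queue and the aggregation step could be swapped for a broker and a distributed engine without changing the other layers.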

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6c7e/9925246/f026d5ffc3b9/CIN2023-8963053.001.jpg
