
Using Distributed Data over HBase in Big Data Analytics Platform for Clinical Services.

Authors

Chrimes Dillon, Zamani Hamid

Affiliations

Database Integration and Management, IMIT Quality Systems, Vancouver Island Health Authority, Vancouver, BC, Canada V8R 1J8.

School of Health Information Science, Faculty of Human and Social Development, University of Victoria, Victoria, BC, Canada V8P 5C2.

Publication information

Comput Math Methods Med. 2017;2017:6120820. doi: 10.1155/2017/6120820. Epub 2017 Dec 11.

DOI:10.1155/2017/6120820
PMID:29375652
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5742497/
Abstract

Big data analytics (BDA) is important for reducing healthcare costs. However, data aggregation, maintenance, integration, translation, analysis, and security/privacy pose many challenges. The study objective was to establish an interactive BDA platform with simulated patient data using open-source software technologies; it was achieved by constructing a platform framework with Hadoop Distributed File System (HDFS) using HBase (a key-value NoSQL database). Distributed data structures were generated from benchmarked hospital-specific metadata of nine billion patient records. At optimized iteration, HDFS ingestion of HFiles to HBase store files showed sustained availability over hundreds of iterations; however, completing MapReduce to HBase required a week for 10 TB and a month for three billion (30 TB) indexed patient records. Inconsistencies found in MapReduce limited the capacity to generate and replicate data efficiently. Apache Spark and Drill showed high performance with high usability for technical support but poor usability for clinical services. Building a hospital system on patient-centric data was challenging with HBase, in that not all data profiles were fully integrated with the complex patient-to-hospital relationships. However, we recommend using HBase to achieve secured patient data while querying entire hospital volumes in a simplified clinical event model across clinical services.
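
To put the reported durations in perspective, the following is a rough back-of-envelope sketch of the average ingestion rate they imply. It is not taken from the paper: the function names are hypothetical, and it assumes decimal terabytes (1 TB = 10^12 bytes), a 7-day week, and a 30-day month.

```python
# Back-of-envelope rates implied by the durations quoted in the abstract.
# Assumptions (NOT from the source): decimal units (1 TB = 10**12 bytes),
# a 7-day week, and a 30-day month. Function names are illustrative only.

SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_TB = 10**12
BYTES_PER_MB = 10**6


def implied_throughput_mb_per_s(terabytes: float, days: float) -> float:
    """Average sustained rate (MB/s) needed to move `terabytes` in `days`."""
    total_bytes = terabytes * BYTES_PER_TB
    total_seconds = days * SECONDS_PER_DAY
    return total_bytes / total_seconds / BYTES_PER_MB


def implied_bytes_per_record(terabytes: float, records: int) -> float:
    """Average stored size per record if `records` rows occupy `terabytes`."""
    return terabytes * BYTES_PER_TB / records


# "a week (for 10 TB)"
week_rate = implied_throughput_mb_per_s(10, 7)
# "a month for three billion (30 TB) indexed patient records"
month_rate = implied_throughput_mb_per_s(30, 30)
record_size = implied_bytes_per_record(30, 3_000_000_000)

print(f"10 TB in one week   : {week_rate:.2f} MB/s")
print(f"30 TB in one month  : {month_rate:.2f} MB/s")
print(f"30 TB / 3e9 records : {record_size:.0f} bytes per record")
```

Under these assumptions the two runs average roughly 17 MB/s and 12 MB/s, and each indexed record occupies about 10 KB. These are averages over the whole run, so they say nothing about peak rates or about where the time was actually spent.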

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d368/5742497/405aa7ebe45e/CMMM2017-6120820.figbox.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d368/5742497/43ffbf3a8738/CMMM2017-6120820.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d368/5742497/3e590757f62b/CMMM2017-6120820.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d368/5742497/f5d553b34411/CMMM2017-6120820.003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d368/5742497/dc41b53451b1/CMMM2017-6120820.004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d368/5742497/09bb595662c7/CMMM2017-6120820.005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d368/5742497/c2a238c0f0d4/CMMM2017-6120820.006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d368/5742497/b31045441f65/CMMM2017-6120820.figbox.001.jpg
