
Phenotype Algorithm based Big Data Analytics for Cancer Diagnose.

Affiliation

Anna University, Chennai, India.

Publication information

J Med Syst. 2019 Jul 4;43(8):264. doi: 10.1007/s10916-019-1409-z.

DOI:10.1007/s10916-019-1409-z
PMID:31270694
Abstract

Cancer diagnosis remains one of the major challenges in treating cancer. Patient survival depends on diagnosing the cancer at an early stage (stage 1 or stage 2); if the cancer is diagnosed at stage 3 or later, the patient's survival prospects become more critical. A single patient's records generate a huge amount of data, and if that data can be managed and analyzed, the patterns identified in it can lead to a cancer diagnosis. Recent work has introduced several machine learning algorithms for cancer classification. However, the classification accuracy of these algorithms is still reduced by the huge number of samples. The proposed work therefore focuses on a new approach based on the Hadoop Distributed File System (HDFS). In this paper, the proposed phenotype techniques handle and classify raw EHR (Electronic Health Record) and EMR (Electronic Medical Record) data, and are based on HDFS and Two-Phase MapReduce. The phenotype algorithm uses an NLP (Natural Language Processing) tool to analyze and classify cancer patient data such as gene mapping, age-related data, image and ultrasonic frequency processing, identification and analysis of irregularities, and disease and personal histories. A three-factorized model calculates mean score values from factors such as disease stage and pain status. This paper focuses on big data analytics for cancer diagnosis, and the simulation results show that the proposed system produces the highest performance.
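The abstract does not spell out how the Two-Phase MapReduce step or the mean score calculation works, so the following is only a minimal sketch of one plausible reading: a first phase that aggregates partial sums per data split, and a second phase that merges those partials into mean scores per (disease stage, pain status) group. The record fields (`stage`, `pain`, `score`), the sample values, and all function names are hypothetical and do not come from the paper; the sketch runs in plain Python rather than on Hadoop.

```python
from collections import defaultdict

# Hypothetical patient records; field names and values are illustrative only.
records = [
    {"stage": 1, "pain": "mild", "score": 2.0},
    {"stage": 1, "pain": "mild", "score": 4.0},
    {"stage": 3, "pain": "severe", "score": 9.0},
    {"stage": 3, "pain": "severe", "score": 7.0},
    {"stage": 3, "pain": "mild", "score": 5.0},
]


def map_phase(record):
    """Emit a (key, (score, count)) pair keyed by disease stage and pain status."""
    key = (record["stage"], record["pain"])
    return key, (record["score"], 1)


def combine(pairs):
    """Sum scores and counts per key; used for both local and global aggregation."""
    partial = defaultdict(lambda: (0.0, 0))
    for key, (score_sum, count) in pairs:
        prev_sum, prev_count = partial[key]
        partial[key] = (prev_sum + score_sum, prev_count + count)
    return dict(partial)


def reduce_phase(partials):
    """Phase 2: merge the per-split partial sums and turn them into means."""
    totals = combine(item for partial in partials for item in partial.items())
    return {key: score_sum / count for key, (score_sum, count) in totals.items()}


def mean_scores(all_records, n_splits=2):
    """Phase 1 runs independently on each split; phase 2 merges the results."""
    splits = [all_records[i::n_splits] for i in range(n_splits)]
    partials = [combine(map(map_phase, split)) for split in splits]
    return reduce_phase(partials)


if __name__ == "__main__":
    for key, mean in sorted(mean_scores(records).items()):
        print(key, mean)
```

Carrying (sum, count) pairs through both phases, rather than averaging per split, keeps the final mean exact no matter how the records are divided among splits.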


