New Approach for Risk Estimation Algorithms of BRCA1/2 Negativeness Detection with Modelling Supervised Machine Learning Techniques.

Author information

Istanbul University, Oncology Institute, Department of Basic Oncology, Division of Cancer Genetics, 34093 Fatih, Istanbul, Turkey.

Istanbul University-Cerrahpasa, Engineering Faculty, Computer Engineering Department, 34320 Avcilar, Istanbul, Turkey.

Publication information

Dis Markers. 2020 Dec 9;2020:8594090. doi: 10.1155/2020/8594090. eCollection 2020.

DOI:10.1155/2020/8594090

PMID:33488844

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7787793/

Abstract

BRCA1/2 gene testing is difficult, expensive, and time-consuming, and it imposes a heavy workload. Identifying BRCA1/2 gene mutations is important both for selecting treatment and for assessing the risk of secondary cancer. In this study, we aimed to develop an algorithm that uses the clinical, demographic, and genetic features of patients to identify BRCA1/2 negativity. An experimental dataset was created from the clinical, demographic, and genetic features of breast cancer patients collected over 20 years. The dataset comprised 125 features for 2070 high-risk breast cancer patients. All data were converted to numeric form and normalized before being passed to the machine learning algorithm for negativity detection. The performance of the algorithm was assessed by evaluating the trained model on the test data. The k-nearest neighbours (KNN) and decision tree (DT) accuracy rates were highest for Dataset 2, which contained 9 features. Removing unnecessary data by reducing the number of features was shown to increase the accuracy rate of the algorithm compared with the DT. With this algorithm, BRCA1/2 negativity was identified in high-risk breast cancer patients with 92.88% accuracy within minutes and without performing the gene test, so the stress of waiting for results and the associated loss of time and money could be avoided. The algorithm is suggested to be useful for planning patient treatment quickly and accurately and for speeding up clinical practice.
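The pipeline summarized in the abstract (numeric encoding, normalization, a KNN classifier scored on held-out test data, and a comparison between a full and a reduced feature set) can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not give the feature names, the value of k, the train/test split, or the feature-selection method, so the synthetic patients, the labelling rule, `k=3`, the 70/30 split, and every function name below are assumptions made for the example.

```python
import random

def numeralize(rows):
    """Replace string values with integer codes, column by column."""
    encoded = [list(r) for r in rows]
    for j in range(len(rows[0])):
        if any(isinstance(r[j], str) for r in rows):
            codes = {v: i for i, v in enumerate(sorted({r[j] for r in rows}))}
            for r in encoded:
                r[j] = codes[r[j]]
    return [[float(v) for v in r] for r in encoded]

def min_max_normalize(rows):
    """Rescale every column to [0, 1]; constant columns become 0.0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(r, lo, hi)] for r in rows]

def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k training rows closest to x (squared Euclidean)."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(t, x)), y)
        for t, y in zip(train_x, train_y)
    )[:k]
    votes = [y for _, y in nearest]
    return max(set(votes), key=votes.count)

def accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y
               for x, y in zip(test_x, test_y))
    return hits / len(test_y)

def evaluate(rows, labels, columns, split=0.7, k=3):
    """Keep only `columns`, encode, normalize, then score KNN on a holdout split.

    Simplification: normalization uses the whole table before splitting.
    """
    x = min_max_normalize(numeralize([[r[c] for c in columns] for r in rows]))
    cut = int(len(x) * split)
    return accuracy(x[:cut], labels[:cut], x[cut:], labels[cut:], k)

# Synthetic stand-in data (hypothetical features and labelling rule, NOT study data).
rng = random.Random(0)
rows, labels = [], []
for _ in range(200):
    age = rng.randint(25, 80)
    family_history = rng.choice(["yes", "no"])
    noise_a, noise_b = rng.random(), rng.random()  # uninformative columns
    rows.append([age, family_history, noise_a, noise_b])
    labels.append(1 if family_history == "no" and age >= 45 else 0)  # 1 = "negative"

print("all 4 features:     ", evaluate(rows, labels, [0, 1, 2, 3]))
print("2 selected features:", evaluate(rows, labels, [0, 1]))
```

Comparing the two printed accuracies shows how dropping uninformative columns can change a distance-based classifier's score on the same data. A decision tree could be evaluated through the same `evaluate` structure by swapping out `knn_predict`.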

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1f6e/7787793/993a481cca34/DM2020-8594090.001.jpg
