Data Integration-Possibilities of Molecular and Clinical Data Fusion on the Example of Thyroid Cancer Diagnostics.

Affiliations

Department of Systems Biology and Engineering, Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland.

Department of Technology Development, Gabos Software Sp z o.o., Mikołowska 100, 40-065 Katowice, Poland.

Publication information

Int J Mol Sci. 2022 Oct 6;23(19):11880. doi: 10.3390/ijms231911880.

DOI: 10.3390/ijms231911880

PMID: 36233181

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9569592/

Abstract

(1) Background: The data from independent gene expression sources may be integrated for the purpose of molecular diagnostics of cancer. So far, multiple approaches have been described. Here, we investigated the impacts of different data fusion strategies on classification accuracy and feature selection stability, which allow the costs of diagnostic tests to be reduced. (2) Methods: We used molecular features (gene expression) combined with a feature extracted from the independent clinical data describing a patient's sample. We considered the dependencies between selected features in two data fusion strategies (early fusion and late fusion) compared to classification models based on molecular features only. We compared the best accuracy classification models in terms of the number of features, which is connected to the potential cost reduction of the diagnostic classifier. (3) Results: We show that for thyroid cancer, the extracted clinical feature is correlated with (but not redundant to) the molecular data. The usage of data fusion allows a model to be obtained with similar or even higher classification quality (with a statistically significant accuracy improvement, a p-value below 0.05) and with a reduction in molecular dimensionality of the feature space from 15 to 3-8 (depending on the feature selection method). (4) Conclusions: Both strategies give comparable quality results, but the early fusion method provides better feature selection stability.
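The difference between the two fusion strategies named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the synthetic data, the three gene-expression features, the single numeric clinical feature, and the nearest-centroid classifier are hypothetical stand-ins. Early fusion concatenates both modalities into one feature vector before training a single model; late fusion trains one model per modality and combines their outputs (here, by averaging class probabilities).

```python
import math
import random


def make_dataset(n, seed, n_genes=3, shift=1.5):
    """Synthetic patients (assumption): class 1 has all feature means shifted."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]
    molecular = [[rng.gauss(shift * y, 1.0) for _ in range(n_genes)] for y in labels]
    # One hypothetical clinical feature per patient, kept as a 1-element vector.
    clinical = [[rng.gauss(shift * y, 1.0)] for y in labels]
    return molecular, clinical, labels


def fit_centroids(X, y):
    """Nearest-centroid model: the mean feature vector of each class."""
    centroids = []
    for cls in (0, 1):
        rows = [x for x, label in zip(X, y) if label == cls]
        centroids.append([sum(col) / len(rows) for col in zip(*rows)])
    return centroids


def predict_proba(centroids, x):
    """P(class 1 | x), assuming unit-variance Gaussian classes and equal priors."""
    d0 = sum((a - b) ** 2 for a, b in zip(x, centroids[0]))
    d1 = sum((a - b) ** 2 for a, b in zip(x, centroids[1]))
    return 1.0 / (1.0 + math.exp(-0.5 * (d0 - d1)))


def early_fusion(train, test):
    """Concatenate modalities first, then train a single classifier."""
    mol_tr, clin_tr, y_tr = train
    mol_te, clin_te, _ = test
    fused_tr = [m + c for m, c in zip(mol_tr, clin_tr)]
    fused_te = [m + c for m, c in zip(mol_te, clin_te)]
    model = fit_centroids(fused_tr, y_tr)
    return [predict_proba(model, x) for x in fused_te]


def late_fusion(train, test):
    """Train one classifier per modality, then average their probabilities."""
    mol_tr, clin_tr, y_tr = train
    mol_te, clin_te, _ = test
    mol_model = fit_centroids(mol_tr, y_tr)
    clin_model = fit_centroids(clin_tr, y_tr)
    return [
        0.5 * (predict_proba(mol_model, m) + predict_proba(clin_model, c))
        for m, c in zip(mol_te, clin_te)
    ]


def accuracy(probs, y):
    """Fraction of samples whose thresholded probability matches the label."""
    return sum((p >= 0.5) == bool(label) for p, label in zip(probs, y)) / len(y)


train = make_dataset(200, seed=0)
test = make_dataset(200, seed=1)
early_acc = accuracy(early_fusion(train, test), test[2])
late_acc = accuracy(late_fusion(train, test), test[2])
print(f"early fusion accuracy: {early_acc:.3f}")
print(f"late fusion accuracy:  {late_acc:.3f}")
```

In this toy setting the two strategies differ only in where the modalities meet: in the feature space (early) or in the decision space (late). The sketch says nothing about which strategy is preferable on real thyroid data; that comparison is the subject of the study itself.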


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7175/9569592/f34264005be0/ijms-23-11880-g001.jpg



