Cancer gene expression profiles associated with clinical outcomes to chemotherapy treatments.

Affiliations

Department of Bioinformatics and Molecular Networks, OmicsWay Corporation, Walnut, CA, 91788, USA.

Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Oblast, 141701, Russia.

Publication information

BMC Med Genomics. 2020 Sep 18;13(Suppl 8):111. doi: 10.1186/s12920-020-00759-0.

Abstract

BACKGROUND

Machine learning (ML) methods still have limited applicability in personalized oncology because few clinically annotated molecular profiles are available. This shortage prevents sufficient training of ML classifiers that could improve molecular diagnostics.

METHODS

We reviewed published datasets of high-throughput gene expression profiles from cancer patients with known responses to chemotherapy treatments. We searched the Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA) and Tumor Alterations Relevant for GEnomics-driven Therapy (TARGET) repositories.

RESULTS

We identified data collections suitable for building ML models that predict responses to certain chemotherapeutic regimens. We identified 26 datasets, ranging from 41 to 508 cases per dataset. All identified datasets were checked for ML applicability and robustness using leave-one-out cross-validation. Twenty-three datasets had balanced numbers of treatment responder and non-responder cases and were found suitable for ML.
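The leave-one-out cross-validation check mentioned above can be illustrated with a minimal sketch. The abstract does not state which classifier or features were used, so the nearest-centroid classifier, the function names, and the synthetic "expression" values below are all assumptions made for illustration only; this is not the authors' pipeline.

```python
import random


def nearest_centroid_predict(train_X, train_y, x):
    """Return the label whose class centroid is closest to x (squared Euclidean distance).

    Nearest-centroid is an illustrative stand-in; the source does not name a classifier.
    """
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(X, y):
    """Hold out each case once, train on the rest, and return the fraction predicted correctly."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nearest_centroid_predict(train_X, train_y, X[i]) == y[i]
    return hits / len(X)


def make_toy_dataset(n_per_class=20, n_genes=5, shift=2.0, seed=0):
    """Build a balanced synthetic dataset: label 0 = non-responder, label 1 = responder (hypothetical)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, offset in ((0, 0.0), (1, shift)):
        for _ in range(n_per_class):
            X.append([rng.gauss(offset, 1.0) for _ in range(n_genes)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_dataset()
    print(f"Leave-one-out accuracy on toy data: {leave_one_out_accuracy(X, y):.2f}")
```

Leave-one-out is a natural choice for datasets as small as 41 cases, since each model is trained on all but one profile; the balance between responders and non-responders matters because a heavily skewed dataset can score well simply by predicting the majority class.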

CONCLUSIONS

We collected a database of gene expression profiles associated with clinical responses to chemotherapy for 2786 individual cancer cases. Seven of the datasets included RNA sequencing data (for 645 cases), and the others contained microarray expression profiles. The cases represented breast cancer, lung cancer, low-grade glioma, endothelial carcinoma, multiple myeloma, adult leukemia, pediatric leukemia and kidney tumors. Chemotherapeutics included taxanes, bortezomib, vincristine, trastuzumab, letrozole, tipifarnib, temozolomide, busulfan and cyclophosphamide.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7c56/7499993/5f0fedcc8b23/12920_2020_759_Fig1_HTML.jpg
