A novel framework for horizontal and vertical data integration in cancer studies with application to survival time prediction models.

Affiliations

Faculty of Mathematics and Informatics, Sofia University, "St. Kliment Ohridski", 5 James Bourchier Blvd., Sofia, 1164, Bulgaria.

Department of Biotechnology, Boku University, Vienna, 1180, Austria.

Publication information

Biol Direct. 2019 Nov 21;14(1):22. doi: 10.1186/s13062-019-0249-6.

DOI:10.1186/s13062-019-0249-6
PMID:31752974
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6868770/
Abstract

BACKGROUND

Recently, high-throughput technologies have been used extensively alongside clinical tests to study various types of cancer. Data generated in such large-scale studies are heterogeneous and come in different types and formats. Because effective integration strategies are lacking, novel models are needed for efficient, practical data integration in which clinical and molecular information can be joined for storage, access and ease of use. Such models, combined with machine learning methods for accurate prediction of survival time in cancer studies, can yield novel insights into disease development and lead to precise personalized therapies.

RESULTS

We developed an approach for intelligent data integration of two cancer datasets (breast cancer and neuroblastoma) provided in the CAMDA 2018 'Cancer Data Integration Challenge', and compared models for prediction of survival time. We developed a novel semantic network-based data integration framework that uses NoSQL databases, in which we combined clinical and expression profile data, using both raw data records and external knowledge sources. Using the integrated data, we introduced the Tumor Integrated Clinical Feature (TICF), a new feature for accurate prediction of patient survival time. Finally, we applied and validated several machine learning models for survival time prediction.
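
As a rough illustration of the kind of integration described above, the following minimal sketch joins clinical records and expression records into one nested document per patient, the shape a document-oriented NoSQL store might hold, and attaches an annotation from an external knowledge source to each gene. It is not the authors' framework: the field names, the toy values, the `gene_annotations` lookup and the `integrate` function are all assumptions made for illustration, and the abstract does not specify how TICF is computed, so no attempt is made to reproduce it.

```python
# Hypothetical toy records; field names and values are illustrative only.
clinical = [
    {"patient_id": "P1", "age": 54, "stage": "II"},
    {"patient_id": "P2", "age": 61, "stage": "III"},
]
expression = [
    {"patient_id": "P1", "gene": "TP53", "value": 2.1},
    {"patient_id": "P1", "gene": "MYCN", "value": 0.4},
    {"patient_id": "P2", "gene": "TP53", "value": 1.3},
    {"patient_id": "P9", "gene": "TP53", "value": 0.9},  # no clinical record
]
# Stand-in for an external knowledge source (gene -> annotation).
gene_annotations = {"TP53": "tumor suppressor", "MYCN": "oncogene"}


def integrate(clinical_rows, expression_rows, annotations):
    """Build one nested document per patient from the two record types."""
    docs = {}
    for row in clinical_rows:
        # Everything except the key becomes the patient's clinical sub-document.
        fields = {k: v for k, v in row.items() if k != "patient_id"}
        docs[row["patient_id"]] = {"clinical": fields, "expression": {}}
    for row in expression_rows:
        doc = docs.get(row["patient_id"])
        if doc is None:
            continue  # skip expression rows that have no clinical record
        doc["expression"][row["gene"]] = {
            "value": row["value"],
            # Enrich the raw measurement with external domain knowledge.
            "annotation": annotations.get(row["gene"], "unknown"),
        }
    return docs


patient_docs = integrate(clinical, expression, gene_annotations)
```

Each entry of `patient_docs` then carries clinical fields and annotated expression values side by side, so features for a downstream survival-time model could be read from a single record per patient.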

CONCLUSION

We developed a framework for semantic integration of clinical and omics data that can borrow information across multiple cancer studies. By linking data with external domain knowledge sources, our approach facilitates enrichment of the studied data through the discovery of internal relations. The proposed and validated machine learning models for survival time prediction yielded accurate results.

REVIEWERS

This article was reviewed by Eran Elhaik, Wenzhong Xiao and Carlos Loucera.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/69bf/6868770/59686cd0bc98/13062_2019_249_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/69bf/6868770/cc135556422a/13062_2019_249_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/69bf/6868770/a81424867fd6/13062_2019_249_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/69bf/6868770/5393bd54e1a5/13062_2019_249_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/69bf/6868770/d792480d08b6/13062_2019_249_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/69bf/6868770/3b055d04a795/13062_2019_249_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/69bf/6868770/8289a5047089/13062_2019_249_Fig7_HTML.jpg
