Unlocking the power of multi-institutional data: Integrating and harmonizing genomic data across institutions.

Authors

Chen Yuan, Shen Ronglai, Feng Xiwen, Panageas Katherine

Affiliations

Department of Epidemiology & Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY 10017, United States.

Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, United States.

Publication information

Biometrics. 2024 Oct 3;80(4). doi: 10.1093/biomtc/ujae146.

DOI:10.1093/biomtc/ujae146
PMID:39679742
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11647914/

Abstract

Cancer is a complex disease driven by genomic alterations, and tumor sequencing is becoming a mainstay of clinical care for cancer patients. The emergence of multi-institution sequencing data presents a powerful resource for learning real-world evidence to enhance precision oncology. GENIE BPC, led by the American Association for Cancer Research, establishes a unique database linking genomic data with clinical information for patients treated at multiple cancer centers. However, leveraging sequencing data from multiple institutions presents significant challenges. Variability in gene panels can lead to loss of information when analyses focus on genes common across panels. Additionally, differences in sequencing techniques and patient heterogeneity across institutions add complexity. High data dimensionality, sparse gene mutation patterns, and weak signals at the individual gene level further complicate matters. Motivated by these real-world challenges, we introduce the Bridge model. It uses a quantile-matched latent variable approach to derive integrated features that preserve information beyond the common genes and make full use of all available data, while leveraging information sharing to enhance both learning efficiency and the model's capacity to generalize. By extracting harmonized and noise-reduced lower-dimensional latent variables, the model captures the true mutation pattern unique to each individual. We assess the model's performance and parameter estimation through extensive simulation studies. The latent features extracted from the Bridge model consistently excel in predicting patient survival across six cancer types in GENIE BPC data.
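The abstract does not spell out the Bridge model itself, so the following is only a toy sketch of two ideas it mentions: how restricting an analysis to genes common across panels discards information, and what matching values by quantile means in its simplest empirical form. It is not the authors' method. The site names, gene panels, scores, and the helper functions `empirical_quantile`, `reference_value`, and `quantile_match` are all hypothetical and chosen purely for illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical gene panels for three institutions (illustrative, not GENIE BPC data).
panels = {
    "site_A": {"TP53", "KRAS", "EGFR", "PIK3CA", "BRAF"},
    "site_B": {"TP53", "KRAS", "EGFR", "STK11"},
    "site_C": {"TP53", "KRAS", "PIK3CA", "BRAF", "ATM"},
}

# Genes an analysis keeps if it only uses what every panel covers.
shared = set.intersection(*panels.values())
# Genes covered by at least one panel.
union = set.union(*panels.values())
# Genes whose information is lost by the common-gene restriction.
dropped = union - shared


def empirical_quantile(x, sorted_sample):
    """Mid-rank empirical quantile of x within a sorted sample."""
    lo = bisect_left(sorted_sample, x)
    hi = bisect_right(sorted_sample, x)
    return (lo + hi) / (2 * len(sorted_sample))


def reference_value(q, sorted_ref):
    """Value at quantile q of the reference sample, by linear interpolation."""
    pos = q * (len(sorted_ref) - 1)
    i = int(pos)
    if i + 1 >= len(sorted_ref):
        return float(sorted_ref[-1])
    frac = pos - i
    return sorted_ref[i] + frac * (sorted_ref[i + 1] - sorted_ref[i])


def quantile_match(values, reference):
    """Map each value to the reference value at the same empirical quantile."""
    sorted_vals = sorted(values)
    sorted_ref = sorted(reference)
    return [
        reference_value(empirical_quantile(v, sorted_vals), sorted_ref)
        for v in values
    ]


# Hypothetical per-patient scores measured on two different scales.
site_b_scores = [3, 1, 2]
reference_scores = [10, 20, 30]
matched = quantile_match(site_b_scores, reference_scores)

print(sorted(shared), len(dropped))
print(matched)
```

In this toy setup only two of the seven genes survive the common-gene restriction, which is the kind of information loss the abstract describes. The quantile mapping is monotone, so it preserves the ordering of patients within a site while placing their scores on the reference scale.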
