CYTO-SV-ML: A Machine Learning Tool for Cytogenetic Structural Variant Analysis in Somatic Cell Type Using Genome Sequences.

Authors

Zhang Tao, Auer Paul, Spellman Stephen R, Dong Jing, Saber Wael, Bolon Yung-Tsi

Affiliations

CIBMTR® (Center for International Blood and Marrow Transplant Research), NMDP (National Marrow Donor Program), Minneapolis, MN 55401, USA.

Division of Biostatistics, Institute for Health and Equity, Medical College of Wisconsin, Milwaukee, WI 53226, USA.

Publication information

Life (Basel). 2025 Jun 9;15(6):929. doi: 10.3390/life15060929.

DOI:10.3390/life15060929
PMID:40566581
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12194788/
Abstract

(1) Background: Although whole genome sequencing (WGS) has enabled comprehensive analysis of structural variants (SVs), more accurate and efficient methods are needed to distinguish large somatic SVs (SV size ≥ 1 Mb), which are traditionally detected through cytogenetic testing, from germline SVs. (2) Methods: A customized machine learning pipeline (CYTO-SV-ML), automated as a Snakemake workflow and equipped with a user interface, was developed to identify somatic cytogenetic SVs in WGS data. The tool was then applied to characterize structural variation profiles in the whole blood of patients with myelodysplastic syndromes (MDSs). Known SVs mapped from well-established open databases were split into training and validation subsets for an AUTO-ML machine learning model within the CYTO-SV-ML pipeline. (3) Results: In benchmarking of somatic cytogenetic SV classification, the CYTO-SV-ML pipeline achieved an area under the receiver operating characteristic curve (AUCROC) of 0.94 for translocations and 0.92 for non-translocations, a sensitivity of 0.83 for translocations and 0.85 for non-translocations, and a specificity of 0.96 for translocations and 0.82 for non-translocations. In an independent validation against clinical cytogenetic records, our method (207 somatic cytogenetic SVs) outperformed a conventional SV calling pipeline (143 somatic cytogenetic SVs). In addition, the CYTO-SV-ML pipeline uncovered novel somatic cytogenetic SVs in 49 (89%) of 55 patients without successful clinical cytogenetic results. (4) Conclusions: Our study demonstrates that the CYTO-SV-ML machine learning approach achieves high benchmarking performance for SV classification from genomic sequencing data. Further validation of novel anomalies by orthogonal methods will be essential to unlock its full clinical potential for cytogenetic diagnostics.
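For readers less familiar with the benchmarking metrics quoted in the abstract, the following is a minimal Python sketch of how sensitivity, specificity, and AUCROC are defined for a binary somatic-versus-germline classifier. It is not the CYTO-SV-ML implementation: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and only the final percentage check uses numbers taken from the abstract.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, where 1 = somatic SV."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUCROC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study): 1 = somatic, 0 = germline.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(sens, spec, auc)  # 0.75 0.75 0.9375

# The one figure taken from the abstract: 49 of 55 patients.
pct_novel = round(49 / 55 * 100)
print(pct_novel)  # 89
```

The pairwise definition of AUCROC used here is quadratic in the number of samples, which is fine for a small illustration; a rank-based computation or an established library routine would be the usual choice on real call sets.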

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5fe9/12194788/b0a739c16e57/life-15-00929-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5fe9/12194788/85390c216068/life-15-00929-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5fe9/12194788/73cdc65b7dac/life-15-00929-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5fe9/12194788/c276caaf8c2e/life-15-00929-g004.jpg