Phenotype Classification using Proteome Data in a Data-Independent Acquisition Tensor Format.

Affiliations

Zhejiang Provincial Laboratory of Life Sciences and Biomedicine, Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, Zhejiang Province, China.

Institute of Basic Medical Sciences, Westlake Institute for Advanced Study, Hangzhou 310024, Zhejiang Province, China.

Publication information

J Am Soc Mass Spectrom. 2020 Nov 4;31(11):2296-2304. doi: 10.1021/jasms.0c00254. Epub 2020 Oct 26.

DOI:10.1021/jasms.0c00254
PMID:33104352
Abstract

A novel approach for phenotype prediction is developed for data-independent acquisition (DIA) mass spectrometric (MS) data without the need for peptide precursor identification using existing DIA software tools. The first step converts the DIA-MS data file into a new file format called DIA tensor (DIAT), which can be used for the convenient visualization of all the ions from peptide precursors and fragments. DIAT files can be fed directly into a deep neural network to predict phenotypes such as appearances of cats, dogs, and microscopic images. As a proof of principle, we applied this approach to 102 hepatocellular carcinoma samples and achieved an accuracy of 96.8% in distinguishing malignant from benign samples. We further applied a refined model to classify thyroid nodules. Deep learning based on 492 training samples achieved an accuracy of 91.7% in an independent cohort of 216 test samples. This approach surpassed the deep-learning model based on peptide and protein matrices generated by OpenSWATH. In summary, we present a new strategy for DIA data analysis based on a novel data format called DIAT, which enables facile two-dimensional visualization of DIA proteomics data. DIAT files can be directly used for deep learning for biological and clinical phenotype classification. Future research will interpret the deep-learning models that emerge from DIAT analysis.
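The conversion step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the DIAT layout, so the version below is an assumption: fragment-ion peaks are binned into a grid of retention time by m/z for each isolation window, and intensities falling into the same cell are summed. The function name `build_dia_tensor`, the peak tuple layout, and all bin settings are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from typing import Iterable, List, Tuple

# Hypothetical peak record: (isolation window index, retention time in s, m/z, intensity)
Peak = Tuple[int, float, float, float]


def build_dia_tensor(
    peaks: Iterable[Peak],
    n_windows: int,
    rt_range: Tuple[float, float],
    mz_range: Tuple[float, float],
    rt_bins: int,
    mz_bins: int,
) -> List[List[List[float]]]:
    """Bin peaks into tensor[window][rt_bin][mz_bin], summing intensities per cell."""
    rt_min, rt_max = rt_range
    mz_min, mz_max = mz_range
    tensor = [
        [[0.0] * mz_bins for _ in range(rt_bins)] for _ in range(n_windows)
    ]
    for window, rt, mz, intensity in peaks:
        # Skip peaks outside the configured windows or axis ranges.
        if not 0 <= window < n_windows:
            continue
        if not (rt_min <= rt < rt_max and mz_min <= mz < mz_max):
            continue
        # Map each coordinate to a bin index; clamp to guard against float rounding.
        i = min(int((rt - rt_min) / (rt_max - rt_min) * rt_bins), rt_bins - 1)
        j = min(int((mz - mz_min) / (mz_max - mz_min) * mz_bins), mz_bins - 1)
        tensor[window][i][j] += intensity
    return tensor


# Toy input: two peaks share a cell in window 0, one lands in window 1,
# and the last is outside the retention-time range and is dropped.
peaks = [
    (0, 12.0, 450.0, 100.0),
    (0, 15.0, 460.0, 50.0),
    (1, 55.0, 750.0, 20.0),
    (1, 999.0, 500.0, 5.0),
]
tensor = build_dia_tensor(
    peaks,
    n_windows=2,
    rt_range=(0.0, 60.0),
    mz_range=(400.0, 800.0),
    rt_bins=6,
    mz_bins=4,
)
```

Each `tensor[window]` slice is a two-dimensional intensity grid that could be rendered as an image or stacked as an input channel for a convolutional network, which is one plausible reading of how such a tensor would be "fed directly" into a deep neural network.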

Similar articles

1. Phenotype Classification using Proteome Data in a Data-Independent Acquisition Tensor Format.
   J Am Soc Mass Spectrom. 2020 Nov 4;31(11):2296-2304. doi: 10.1021/jasms.0c00254. Epub 2020 Oct 26.
2. MSSort-DIA: A deep learning classification tool of the peptide precursors quantified by OpenSWATH.
   J Proteomics. 2022 May 15;259:104542. doi: 10.1016/j.jprot.2022.104542. Epub 2022 Feb 26.
3. In silico spectral libraries by deep learning facilitate data-independent acquisition proteomics.
   Nat Commun. 2020 Jan 9;11(1):146. doi: 10.1038/s41467-019-13866-z.
4. Removing the Hidden Data Dependency of DIA with Predicted Spectral Libraries.
   Proteomics. 2020 Feb;20(3-4):e1900306. doi: 10.1002/pmic.201900306. Epub 2020 Feb 5.
5. DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput.
   Nat Methods. 2020 Jan;17(1):41-44. doi: 10.1038/s41592-019-0638-x. Epub 2019 Nov 25.
6. SeFilter-DIA: Squeeze-and-Excitation Network for Filtering High-Confidence Peptides of Data-Independent Acquisition Proteomics.
   Interdiscip Sci. 2024 Sep;16(3):579-592. doi: 10.1007/s12539-024-00611-4. Epub 2024 Mar 12.
7. Hybrid data acquisition and processing strategies with increased throughput and selectivity: pSMART analysis for global qualitative and quantitative analysis.
   J Proteome Res. 2014 Dec 5;13(12):5415-30. doi: 10.1021/pr5003017. Epub 2014 Oct 14.
8. Avant-garde: an automated data-driven DIA data curation tool.
   Nat Methods. 2020 Dec;17(12):1237-1244. doi: 10.1038/s41592-020-00986-4. Epub 2020 Nov 16.
9. Computational Optimization of Spectral Library Size Improves DIA-MS Proteome Coverage and Applications to 15 Tumors.
   J Proteome Res. 2021 Dec 3;20(12):5392-5401. doi: 10.1021/acs.jproteome.1c00640. Epub 2021 Nov 8.
10. MSLibrarian: Optimized Predicted Spectral Libraries for Data-Independent Acquisition Proteomics.
    J Proteome Res. 2022 Feb 4;21(2):535-546. doi: 10.1021/acs.jproteome.1c00796. Epub 2022 Jan 19.

Cited by

1. Integrating multi-omics and machine learning for disease resistance prediction in legumes.
   Theor Appl Genet. 2025 Jun 27;138(7):163. doi: 10.1007/s00122-025-04948-2.
2. ProteoNet: A CNN-based framework for analyzing proteomics MS-RGB images.
   iScience. 2024 Nov 12;27(12):111362. doi: 10.1016/j.isci.2024.111362. eCollection 2024 Dec 20.
3. A deep learning framework for hepatocellular carcinoma diagnosis using MS1 data.
   Sci Rep. 2024 Nov 4;14(1):26705. doi: 10.1038/s41598-024-77494-4.
4. Artificial intelligence defines protein-based classification of thyroid nodules.
   Cell Discov. 2022 Sep 6;8(1):85. doi: 10.1038/s41421-022-00442-x.
5. Deep learning neural network tools for proteomics.
   Cell Rep Methods. 2021 May 17;1(2):100003. doi: 10.1016/j.crmeth.2021.100003. eCollection 2021 Jun 21.