Biomarker discovery and redundancy reduction towards classification using a multi-factorial MALDI-TOF MS T2DM mouse model dataset.

Affiliation

MicroDiscovery GmbH, Marienburger Str. 1, 10405 Berlin, Germany.

Publication information

BMC Bioinformatics. 2011 May 9;12:140. doi: 10.1186/1471-2105-12-140.

Abstract

BACKGROUND

Diabetes, like many diseases and biological processes, is not mono-causal. On the one hand, multi-factorial studies with complex experimental designs are required for its comprehensive analysis. On the other hand, the data from these studies often include substantial redundancy; for example, a protein is typically represented by a multitude of peptides. Coping with both complexities (experimental and technological) simultaneously makes data analysis a challenge for bioinformatics.

RESULTS

We present a comprehensive workflow tailored to the analysis of complex data, including data from multi-factorial studies. The approach aims at revealing effects caused by a distinct combination of experimental factors, in our case genotype and diet. Applying the workflow to an established polygenic mouse model for diet-induced type 2 diabetes, we found peptides with significant fold changes exclusively for the combination of a particular strain and diet. Exploiting redundancy enables the visualization of peptide correlation and provides a natural way of selecting features for classification and prediction. Classification based on the features selected using our approach performs similarly to classification based on more complex feature selection methods.
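To make the idea of an effect specific to one genotype and diet combination concrete, the following is a minimal sketch of the interaction term of a balanced two-way ANOVA for a single peptide. It is not the authors' R implementation: the function name `interaction_f`, the nested-list data layout, and the example intensities are assumptions made for illustration. The sketch returns only the F statistic; turning it into a p-value would require an F distribution from a statistics library.

```python
from statistics import mean


def interaction_f(cells):
    """F statistic of the interaction term in a balanced two-way ANOVA.

    cells[i][j] holds the replicate intensities of one peptide for
    level i of factor A (e.g. genotype) and level j of factor B (e.g. diet).
    Hypothetical helper for illustration; assumes equal replicates per cell
    and a non-zero within-cell variance.
    """
    a, b = len(cells), len(cells[0])
    n = len(cells[0][0])  # replicates per cell
    if n < 2 or any(len(row) != b for row in cells) or any(
        len(cell) != n for row in cells for cell in row
    ):
        raise ValueError("balanced design with at least 2 replicates required")

    cell_mean = [[mean(cell) for cell in row] for row in cells]
    row_mean = [mean(row) for row in cell_mean]
    col_mean = [mean(cell_mean[i][j] for i in range(a)) for j in range(b)]
    grand = mean(row_mean)

    # Interaction sum of squares: what remains of each cell mean after
    # removing the additive effects of both factors.
    ss_inter = n * sum(
        (cell_mean[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
        for i in range(a)
        for j in range(b)
    )
    # Error sum of squares: replicate scatter within each cell.
    ss_error = sum(
        (x - cell_mean[i][j]) ** 2
        for i in range(a)
        for j in range(b)
        for x in cells[i][j]
    )
    df_inter = (a - 1) * (b - 1)
    df_error = a * b * (n - 1)
    return (ss_inter / df_inter) / (ss_error / df_error)


# Made-up intensities: only the second genotype on the second diet responds.
specific = [[[1, 2, 3], [1, 2, 3]], [[1, 2, 3], [7, 8, 9]]]
# Made-up intensities: both factors act additively, so no interaction.
additive = [[[1, 2, 3], [4, 5, 6]], [[2, 3, 4], [5, 6, 7]]]
print(interaction_f(specific), interaction_f(additive))
```

A large interaction F, as in the first example, flags a peptide whose change cannot be explained by genotype or diet alone, which is the kind of combination-specific effect described above.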

CONCLUSIONS

The combination of ANOVA and redundancy exploitation allows the identification of biomarker candidates in multi-dimensional MALDI-TOF MS profiling studies with complex experimental designs. With respect to feature selection, our method provides a fast and intuitive alternative to global optimization strategies, with comparable performance. The method is implemented in R, and the scripts are available from the corresponding author on request.
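One way redundancy could drive feature selection is sketched below, under stated assumptions: peptides whose intensity profiles are highly correlated are treated as redundant, and only the best-scoring member of each correlated group is kept. The abstract does not specify the authors' procedure, so the greedy strategy, the 0.9 threshold, the function names, and the example data are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / sqrt(sxx * syy)


def select_representatives(profiles, scores, threshold=0.9):
    """Greedy redundancy reduction (illustrative, not the published method).

    profiles: peptide name -> intensities across samples
    scores:   peptide name -> relevance score (e.g. an ANOVA F statistic)
    A peptide is kept only if its correlation with every peptide kept so far
    stays below `threshold`; higher-scoring peptides are visited first.
    """
    kept = []
    for name in sorted(profiles, key=lambda k: scores[k], reverse=True):
        if all(pearson(profiles[name], profiles[k]) < threshold for k in kept):
            kept.append(name)
    return kept


# Made-up data: p1 and p2 behave like two peptides of the same protein.
profiles = {
    "p1": [1.0, 2.0, 3.0, 4.0],
    "p2": [2.0, 4.0, 6.0, 8.1],
    "p3": [4.0, 1.0, 3.0, 2.0],
}
scores = {"p1": 5.0, "p2": 9.0, "p3": 1.0}
selected = select_representatives(profiles, scores)
print(selected)
```

Because the pass is a single sweep over the ranked features, it avoids the search over feature subsets that global optimization strategies perform, which is consistent with the "fast and intuitive" characterization above.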
