Application of data augmentation techniques towards metabolomics.

Author information

Departamento de Lenguajes y Ciencias de la Computación, Escuela Técnica Superior de Ingeniería Informática, Universidad de Málaga, No. 35, Bulevar Louis Pasteur, Málaga, 29071, Spain.

School of Computer Science and Informatics, Faculty of Technology, De Montfort University, The Gateway, Leicester, LE1 9BH, United Kingdom.

Publication information

Comput Biol Med. 2022 Sep;148:105916. doi: 10.1016/j.compbiomed.2022.105916. Epub 2022 Jul 27.

Abstract

Niemann-Pick Class 1 (NPC1) disease is a rare and debilitating neurodegenerative lysosomal storage disease (LSD). Metabolomics datasets from NPC1 patients are often limited in the number of samples and severely imbalanced, which hampers predictive analysis. To improve predictive capability and identify new biomarkers in an NPC1 disease urinary dataset, data augmentation (DA) techniques based on computational intelligence were employed to create synthetic samples, namely the addition of noise, oversampling techniques and conditional generative adversarial networks. The predictive capacities of these techniques were evaluated on a set of urine samples donated by 13 untreated NPC1 disease patients and 47 heterozygous (parental) carrier control participants. Predictions were obtained using different machine learning classification models and partial least squares techniques. The results provide strong evidence for the ability of DA techniques to generate good-quality synthetic data, showing increases of 20%-50% in sensitivity, 6%-30% in F score, and 0.3 (out of 1) in predictive capacity. Additionally, more conventional forms of multivariate data analysis were employed. These allowed the detection of unusual urinary metabolite profiles and the identification of biomarkers through the use of synthetically augmented datasets. Results indicate that urinary branched-chain amino acids such as valine, 3-aminoisobutyrate and quinolinate may be employable as valuable biomarkers for the diagnosis and prognostic monitoring of NPC1 disease.
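
As a rough illustration of the simplest DA technique named in the abstract (the addition of noise, combined here with random oversampling), the sketch below balances a two-class dataset by adding Gaussian-jittered copies of minority-class samples. It is not the authors' implementation: the function name `augment_with_noise`, the `noise_scale` parameter, the per-feature scaling of the noise and the random placeholder data are all assumptions made for the example. Only the class sizes (13 and 47) come from the abstract.

```python
import numpy as np

def augment_with_noise(X, y, minority_label, noise_scale=0.05, seed=0):
    """Balance two classes by adding Gaussian-jittered copies of minority samples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    minority = X[y == minority_label]
    # Number of synthetic samples needed to match the majority class size.
    n_needed = int((y != minority_label).sum() - len(minority))
    if n_needed <= 0:
        return X, y
    # Random oversampling: draw minority rows with replacement.
    picks = minority[rng.integers(0, len(minority), size=n_needed)]
    # Jitter each feature in proportion to its spread within the minority class
    # (an assumed noise model; other choices are possible).
    sigma = minority.std(axis=0) * noise_scale
    synthetic = picks + rng.normal(0.0, 1.0, size=picks.shape) * sigma
    X_aug = np.vstack([X, synthetic])
    y_aug = np.concatenate([y, np.full(n_needed, minority_label, dtype=y.dtype)])
    return X_aug, y_aug

# Placeholder data: random values standing in for metabolite intensities,
# with 13 minority (label 1) and 47 majority (label 0) samples.
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 5))
y = np.array([1] * 13 + [0] * 47)

X_aug, y_aug = augment_with_noise(X, y, minority_label=1)
print(X_aug.shape, np.bincount(y_aug))
```

In practice, augmentation of this kind is normally applied to the training split only, so that synthetic samples never leak into the data used to estimate sensitivity or F score.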

