Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA.
Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA.
Sci Data. 2018 Apr 17;5:180060. doi: 10.1038/sdata.2018.60.
The analysis of bronchoalveolar lavage fluid (BALF) using mass spectrometry-based metabolomics can provide insight into lung diseases such as asthma. However, the important step of compound identification is hindered by the lack of a small molecule database specific to BALF. Here we describe prototypic small molecule databases derived from human BALF samples (n=117). Human BALF was extracted into lipid and aqueous fractions and analyzed using liquid chromatography-mass spectrometry. Following filtering to reduce contaminants and artifacts, the resulting BALF databases (BALF-DBs) contain 11,736 lipid and 658 aqueous compounds. Over 10% of these were found in 100% of samples. Testing the BALF-DBs using nested test sets produced a 99% match rate for lipids and a 47% match rate for aqueous molecules. Searching an independent dataset resulted in 45% matching to the lipid BALF-DB, compared to <25% when general databases were searched. The BALF-DBs are available for download from MetaboLights. Overall, the BALF-DBs can reduce false positives and improve confidence in compound identification compared with general databases.
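To make the "match rate" figures in the abstract concrete, the following is a minimal sketch of how such a rate might be computed. It is not the authors' method: the abstract does not state the matching criteria, so this example assumes matching on neutral mass alone within a parts-per-million tolerance. The function name `match_rate`, the parameter `ppm_tol`, and the 10 ppm default are all hypothetical choices made for illustration.

```python
from bisect import bisect_left


def match_rate(query_masses, db_masses, ppm_tol=10.0):
    """Return the fraction of query masses that have at least one database
    mass within +/- ppm_tol parts per million.

    Assumption: matching uses mass only. A real workflow would likely also
    use retention time, adducts, or fragmentation data.
    """
    if not query_masses:
        return 0.0

    db = sorted(db_masses)  # sorted once so each lookup is a binary search
    matched = 0
    for mass in query_masses:
        tol = mass * ppm_tol * 1e-6  # absolute tolerance for this mass
        # Index of the first database mass >= lower bound of the window.
        i = bisect_left(db, mass - tol)
        # It is a match if that mass also lies below the upper bound.
        if i < len(db) and db[i] <= mass + tol:
            matched += 1
    return matched / len(query_masses)


# Toy values, not data from the BALF-DBs.
toy_db = [100.0, 200.0, 300.0]
toy_query = [100.0005, 200.01, 300.0, 400.0]
rate = match_rate(toy_query, toy_db)
```

With these toy values, two of the four query masses fall inside the tolerance window of a database entry. A narrower, sample-specific database leaves fewer candidate entries in each window than a general one, which is one way to read the abstract's point about reducing false positives.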