Department of Computer Science and Engineering, University of Nebraska-Lincoln, 122E Avery Hall, 1144 T St., Lincoln, NE 68588, USA.
Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, 2699 Qianjin Street, Changchun 130012, China.
Database (Oxford). 2021 Oct 13;2021. doi: 10.1093/database/baab065.
The body fluid proteome has been intensively studied as a primary source for disease biomarker discovery. Advances in proteomics technologies have led to a steadily growing number of proteins detected in different body fluids, many of which are promising biomarkers. However, apart from a handful of small-scale, specialized data resources, there has been no concerted effort to compile published body fluid proteins into a centralized and sustainable repository that provides users with systematic analytic tools. In this study, we developed a new database, the human body fluid proteome (HBFP), that focuses on the experimentally validated proteome of 17 types of human body fluids. The current database archives 11 827 unique proteins reported in 164 scientific publications since 2001, with a maximal false discovery rate of 0.01 at both the peptide and protein levels, and enables users to query, analyze and download protein entries for each body fluid. The system has three distinguishing features: (i) the protein annotation page includes detailed abundance information based on relative qualitative measures of peptides reported in the original references, (ii) a new score is calculated for each reported protein to indicate discovery confidence and (iii) HBFP catalogs 7354 proteins with at least two non-nested uniquely mapping peptides of at least nine amino acids, in accordance with the Human Proteome Project Data Interpretation Guidelines, while the remaining 4473 proteins have more than two unique peptides but no reported sequence information. We anticipate that HBFP, as an important resource for the human protein secretome, will be a powerful tool that facilitates research in clinical proteomics and biomarker discovery. Database URL: https://bmbl.bmi.osumc.edu/HBFP/.
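The two-peptide criterion in feature (iii) can be illustrated with a minimal Python sketch. It is not HBFP's actual implementation: the function names are hypothetical, the nine-residue figure is treated as a minimum length, the input peptides are assumed to have already been confirmed as uniquely mapping to the protein, and nesting is approximated by a substring test on the sequences rather than by positions on the protein.

```python
from itertools import combinations

# Assumption: the nine-residue figure is read as a minimum peptide length.
MIN_PEPTIDE_LENGTH = 9


def is_nested(a: str, b: str) -> bool:
    """Return True if one peptide sequence is fully contained in the other.

    Simplification: a substring test stands in for a position-based check.
    """
    return a in b or b in a


def meets_two_peptide_rule(unique_peptides: list[str]) -> bool:
    """Check for at least two non-nested peptides of sufficient length.

    `unique_peptides` is assumed to hold only peptides that map uniquely
    to the protein in question (that filtering is not done here).
    """
    long_enough = {p for p in unique_peptides if len(p) >= MIN_PEPTIDE_LENGTH}
    # One qualifying pair is enough: both long enough, neither inside the other.
    return any(
        not is_nested(a, b) for a, b in combinations(sorted(long_enough), 2)
    )


if __name__ == "__main__":
    # Two distinct peptides, both at least nine residues long.
    print(meets_two_peptide_rule(["LVNEVTEFAK", "YLYEIARRHPYFYAPELLFFAK"]))
    # The shorter peptide is contained in the longer one, so they are nested.
    print(meets_two_peptide_rule(["LVNEVTEFAK", "LVNEVTEFAKTCVADESAENCDK"]))
```

Under these assumptions, a protein passing the check would fall in the 7354-protein group; one with peptide counts but no sequences could not be tested this way, which matches the separate treatment of the remaining 4473 proteins.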