Lam Maggie P Y, Venkatraman Vidya, Xing Yi, Lau Edward, Cao Quan, Ng Dominic C M, Su Andrew I, Ge Junbo, Van Eyk Jennifer E, Ping Peipei
Advanced Clinical Biosystems Research Institute, Department of Medicine and The Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States.
Department of Cardiology, Shanghai Institute of Cardiovascular Diseases, Zhongshan Hospital, Fudan University, Shanghai, 200433, China.
J Proteome Res. 2016 Nov 4;15(11):4126-4134. doi: 10.1021/acs.jproteome.6b00095. Epub 2016 Jul 19.
Amidst the proteomes of human tissues lie subsets of proteins that are closely involved in conserved pathophysiological processes. Much of biomedical research concerns interrogating disease signature proteins and defining their roles in disease mechanisms. With advances in proteomics technologies, it is now feasible to develop targeted proteomics assays that can accurately quantify protein abundance as well as post-translational modifications; however, with the rapidly accumulating number of studies implicating proteins in diseases, current resources are insufficient to target every protein, so proteins of high significance and impact must be judiciously prioritized for assay development. We describe here a data science method to prioritize and expedite assay development on high-impact proteins across research fields by leveraging the biomedical literature record to rank and normalize proteins that are popularly and preferentially published by biomedical researchers. We demonstrate this method by finding priority proteins across six major physiological systems (cardiovascular, cerebral, hepatic, renal, pulmonary, and intestinal). The described method is data-driven and builds upon the collective knowledge of previous publications referenced on PubMed to lend objectivity to target selection. In addition to benefiting ongoing efforts to facilitate the broad translation of proteomics technologies, the method and resulting popular protein lists may be useful for exploring biological processes associated with various physiological systems and research topics.
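The abstract describes ranking proteins by how popularly and preferentially they are published within a topic, but it does not state the scoring formula. The following is a minimal sketch of the general idea only, not the authors' method. It assumes a hypothetical input mapping each protein to the set of PubMed IDs that reference it, and a set of PubMed IDs retrieved for one topic (for example, a cardiovascular query). The score used here (topic count weighted by the fraction of a protein's papers that fall in the topic) is an illustrative assumption, and the protein names and PMIDs are made-up toy values.

```python
def rank_proteins(protein_pmids, topic_pmids):
    """Rank proteins by an illustrative topic-popularity score.

    protein_pmids: dict of protein name -> set of PMIDs (hypothetical input)
    topic_pmids:   set of PMIDs retrieved for one topic
    Returns a list of (protein, in_topic, total, score), best first.
    """
    rows = []
    for protein, pmids in protein_pmids.items():
        total = len(pmids)                    # all papers on this protein
        in_topic = len(pmids & topic_pmids)   # papers that are also in the topic
        if in_topic == 0:
            continue                          # never published within the topic
        # Assumed score, not the paper's formula: popularity (raw count)
        # weighted by preference (share of the protein's papers in the topic).
        score = in_topic * (in_topic / total)
        rows.append((protein, in_topic, total, score))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


# Toy data: placeholder protein names and PMIDs, not real records.
toy_protein_pmids = {
    "PROT_A": {1, 2, 3, 4},
    "PROT_B": {1, 2, 5, 6, 7, 8, 9, 10},
    "PROT_C": {11, 12},
}
toy_topic_pmids = {1, 2, 3, 4, 5}

ranked = rank_proteins(toy_protein_pmids, toy_topic_pmids)
for protein, in_topic, total, score in ranked:
    print(f"{protein}: {in_topic}/{total} papers in topic, score={score:.3f}")
```

In this toy example, a protein studied almost exclusively within the topic outranks one with a larger overall literature but a smaller topic share, which is the kind of normalization the abstract alludes to. A real pipeline would need protein-to-publication links from a curated source and a topic query against PubMed.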