Chou Kuo-Chen
Gordon Life Science Institute, Boston, MA 02478, United States.
Curr Med Chem. 2019;26(26):4918-4943. doi: 10.2174/0929867326666190507082559.
The smallest unit of life is the cell, which contains numerous protein molecules. Most of the functions critical to the cell's survival are performed by these proteins, which reside in its different organelles, usually called "subcellular locations". Information about a protein's subcellular localization can provide useful clues about its function. Knowledge of the subcellular localization of proteins is a prerequisite for revealing the intricate pathways at the cellular level. Therefore, one of the fundamental goals in molecular cell biology and proteomics is to determine the subcellular locations of proteins in an entire cell. This knowledge is also indispensable for prioritizing and selecting the right targets for drug development. Unfortunately, determining the subcellular locations of proteins purely by experiment is both time-consuming and costly. With the avalanche of protein sequences generated in the post-genomic age, it is highly desirable to develop computational methods that rapidly and effectively identify the subcellular locations of uncharacterized proteins from their sequence information alone. Considerable progress has been achieved in this regard. This review focuses on methods that can deal with multi-label proteins, which may simultaneously exist in two or more subcellular location sites. Such proteins are vitally important for finding multi-target drugs, a current hot trend in drug development. The review also focuses on methods for which user-friendly web servers have been established, so that the majority of experimental scientists can obtain the desired results without going through the detailed mathematics involved.