Sahin Mehmet Emre, Can Tolga, Son Cagdas Devrim
Department of Computer Engineering, Middle East Technical University, Ankara, Turkey.
OMICS. 2014 Oct;18(10):636-44. doi: 10.1089/omi.2014.0073. Epub 2014 Aug 18.
Next-generation sequencing (NGS) and the attendant data deluge are increasingly impacting molecular life sciences research. Chief among the resulting challenges and opportunities is enhancing our ability to classify molecular target data into a meaningful and cohesive systematic nomenclature. In this vein, the G protein-coupled receptors (GPCRs) are the largest and most divergent receptor family and play a crucial role in a host of pathophysiological pathways. For the pharmaceutical industry, GPCRs are a major drug target, and it is estimated that 60%-70% of all medicines in development today target GPCRs. Hence, they require efficient and rapid classification that groups the members according to their functions. In addition to NGS and the Big Data challenge we currently face, a growing number of orphan GPCRs further demands novel, rapid, and accurate classification of the receptors, since the current classification tools are inadequate and slow. This study presents the development of GPCRsort, a new classification tool for GPCRs that uses structural features derived from their primary sequences. Comparison experiments with currently known GPCR classification techniques showed that GPCRsort is able to rapidly (on the order of minutes) classify uncharacterized GPCRs with 97.3% accuracy, whereas the best available technique's accuracy is 90.7%. GPCRsort is available in the public domain for postgenomics life scientists engaged in GPCR research with NGS: http://bioserver.ceng.metu.edu.tr/GPCRSort