IBBT Future Health Department/ESAT-SCD, KU Leuven, Kasteelpark Arenberg 10, 3001, Heverlee-Leuven, Belgium.
Bioinformatics. 2012 Sep 15;28(18):i569-i574. doi: 10.1093/bioinformatics/bts391.
The prediction of receptor-ligand pairings is an important area of research, as intercellular communication is mediated by the successful interaction of these key proteins. Because exhaustively assaying receptor-ligand pairs is impractical, a computational approach to predicting pairings is necessary. We propose a workflow for this interaction prediction task that combines a text mining approach with a state-of-the-art prediction method and a widely accessible, comprehensive dataset. Among several modern classifiers, random forests were found to perform best at this prediction task. The classifier was trained on an experimentally validated dataset of receptor-ligand pairs from the Database of Ligand-Receptor Partners (DLRP). New examples, co-cited with the training receptors and ligands, are then classified using the trained classifier. Applying our method, we successfully predict receptor-ligand pairs within the GPCR family with a balanced accuracy of 0.96. Upon further inspection, we find several supported interactions that were not present in the Database of Interacting Proteins (DIP). We measured the balanced accuracy of our method, and the resulting high-quality predictions are stored in the available ReLiance database.
http://homes.esat.kuleuven.be/~bioiuser/ReLianceDB/index.php
yves.moreau@esat.kuleuven.be; ernesto.iacucci@gmail.com
Supplementary data are available at Bioinformatics online.
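The balanced accuracy reported in the abstract is commonly defined as the mean of sensitivity and specificity, which keeps the score informative when non-interacting pairs far outnumber interacting ones. The abstract does not state the exact formula used, so the following is a minimal sketch under that common definition. The function name and the toy labels are hypothetical and are not taken from the paper or from ReLiance.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels.

    Assumed encoding: 1 = interacting receptor-ligand pair, 0 = non-interacting.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)  # fraction of true pairs recovered
    specificity = tn / (tn + fp)  # fraction of non-pairs correctly rejected
    return (sensitivity + specificity) / 2


# Hypothetical toy labels, for illustration only (not data from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
score = balanced_accuracy(y_true, y_pred)
```

Unlike plain accuracy, this score cannot be inflated by a classifier that labels every candidate pair as non-interacting: such a classifier would have a specificity of 1 but a sensitivity of 0, giving a balanced accuracy of 0.5.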