Institute of Microbial Technology, Sector 39A, Chandigarh, India.
BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S48. doi: 10.1186/1471-2105-11-S1-S48.
Flavin-binding proteins (FBPs) play a critical role in several biological functions, such as the electron transport system (ETS). These flavoproteins contain very tightly bound, sometimes covalently bound, flavin adenine dinucleotide (FAD) or flavin mononucleotide (FMN). The interaction between the flavin nucleotide and the amino acids of a flavoprotein is essential for its functionality. Thus, identifying the FAD-interacting residues in an FBP is an important step toward understanding its function and mechanism.
In this study, we describe models developed for predicting FAD-interacting residues using sequence window patterns of 15, 17 and 19 residues. Support vector machine (SVM) based models were developed using the binary pattern of the protein's amino acid sequence and achieved a maximum accuracy of 69.65%, with a Matthews correlation coefficient (MCC) of 0.39 and an area under the curve (AUC) of 0.773. Performance improved significantly, from 69.65% to 82.86% accuracy with an MCC of 0.66 and an AUC of 0.904, when evolutionary information was used as input to the SVM. The evolutionary information was generated in the form of a position-specific scoring matrix (PSSM) profile using PSI-BLAST at an e-value of 0.001. All models were developed on 198 non-redundant FAD-binding protein chains containing 5172 FAD-interacting residues and evaluated using fivefold cross-validation.
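The window-based binary pattern described above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that each residue is one-hot encoded over the 20 standard amino acids plus one extra padding symbol (`X`, a hypothetical choice here) so that residues near the chain ends still get a full window; the function names `windows`, `binary_pattern` and `encode_sequence` are illustrative only.

```python
# Sketch of sliding-window binary (one-hot) encoding of a protein sequence.
# Assumption: a 21-symbol alphabet (20 amino acids + padding symbol "X").
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "X"  # hypothetical padding symbol for positions beyond the chain ends
ALPHABET = AMINO_ACIDS + PAD


def windows(sequence, size=17):
    """Return one window of `size` residues centred on each residue."""
    if size % 2 == 0:
        raise ValueError("window size must be odd so it has a central residue")
    half = size // 2
    padded = PAD * half + sequence + PAD * half
    return [padded[i:i + size] for i in range(len(sequence))]


def binary_pattern(window):
    """One-hot encode a window into a flat vector of len(window) * 21 values."""
    vector = []
    for residue in window:
        one_hot = [0] * len(ALPHABET)
        index = ALPHABET.find(residue)
        if index == -1:
            # Non-standard residues are mapped to the padding symbol (assumption).
            index = ALPHABET.index(PAD)
        one_hot[index] = 1
        vector.extend(one_hot)
    return vector


def encode_sequence(sequence, size=17):
    """Encode every residue of `sequence` as the binary pattern of its window."""
    return [binary_pattern(w) for w in windows(sequence.upper(), size)]


if __name__ == "__main__":
    features = encode_sequence("MKTAYIAKQRQISFVKSHFSRQ", size=17)
    print(len(features), len(features[0]))  # one vector per residue, 17 * 21 values each
```

Under these assumptions, each residue yields one feature vector that could be passed to an SVM together with a label saying whether the central residue interacts with FAD. A PSSM-based variant would replace each one-hot row with the profile scores for that position.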
This study suggests that evolutionary information from 17-residue patterns performs best for predicting FAD-interacting residues. We also developed a web server that predicts FAD-interacting residues in a protein; it is freely available for academic use.