York Centre for Complex Systems Analysis (YCCSA), University of York, Heslington, York, YO10 5DD, UK.
BMC Bioinformatics. 2010 Jun 9;11:310. doi: 10.1186/1471-2105-11-310.
Partitioning a protein into structural components, known as domains, is an important initial step in protein classification and in functional and evolutionary studies. While systematic domain assignments by human experts exist (CATH and SCOP), the introduction of high-throughput technologies for structure determination threatens to overwhelm expert approaches. A variety of algorithmic methods have been developed to expedite this process, allowing almost instant structural decomposition into domains. The performance of algorithmic methods can approach 85% agreement on the number of domains with the consensus reached by experts. However, each algorithm takes a somewhat different conceptual approach, and each has its own strengths and weaknesses. Currently there is no simple way to automatically compare assignments from different structure-based domain assignment methods. Such a comparison would provide a comprehensive understanding of possible structure partitioning as well as some insight into the tendencies of particular algorithms. Most importantly, a consensus assignment drawn from multiple assignment methods can provide a single and presumably more accurate view.
We introduce dConsensus (http://pdomains.sdsc.edu/dConsensus), a web resource that displays the results of calculations from multiple algorithmic methods and generates a consensus domain assignment with an associated reliability score. Domain assignments from seven structure-based algorithms (PDP, PUU, DomainParser2, the NCBI method, DHcL, DDomains and Dodis) are available for analysis and comparison alongside assignments made by expert methods. The assignments are available for all protein chains in the Protein Data Bank (PDB). A consensus domain assignment is built either by allowing each algorithm to contribute equally (simple approach) or by weighting the contribution of each method by its prior performance and observed tendencies. An analysis of secondary structure around domain and fragment boundaries is also available for display and further analysis.
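The two ways of building a consensus described above (equal contribution versus weighted contribution) can be illustrated with a minimal sketch. This is not the dConsensus implementation: the function name `consensus_domain_count`, the choice to vote only on the number of domains, the tie-breaking rule, the definition of reliability as the winning share of the total vote, and all example values and weights are assumptions made for illustration.

```python
from collections import defaultdict


def consensus_domain_count(assignments, weights=None):
    """Vote on the number of domains predicted for one protein chain.

    assignments: dict mapping method name -> predicted number of domains.
    weights:     optional dict mapping method name -> non-negative weight.
                 If omitted, every method contributes equally (simple approach).
                 Methods missing from `weights` default to weight 1.0.

    Returns (consensus_count, reliability), where reliability is the share
    of the total vote held by the winning count (an assumed definition).
    """
    if not assignments:
        raise ValueError("at least one assignment is required")

    votes = defaultdict(float)
    for method, n_domains in assignments.items():
        w = 1.0 if weights is None else weights.get(method, 1.0)
        if w < 0:
            raise ValueError(f"negative weight for {method}")
        votes[n_domains] += w

    total = sum(votes.values())
    if total == 0:
        raise ValueError("all weights are zero")

    # Highest vote wins; ties go to the smaller domain count (arbitrary choice).
    best = min(votes, key=lambda n: (-votes[n], n))
    return best, votes[best] / total


# Toy predictions for a single chain (invented values, not real results).
toy = {
    "PDP": 2, "PUU": 2, "DomainParser2": 1, "NCBI": 2,
    "DHcL": 3, "DDomains": 2, "Dodis": 1,
}

# Simple approach: each method counts once.
simple_count, simple_rel = consensus_domain_count(toy)

# Weighted approach with hypothetical weights standing in for prior performance.
toy_weights = {"DomainParser2": 3.0, "Dodis": 3.0}
weighted_count, weighted_rel = consensus_domain_count(toy, toy_weights)
```

With these toy inputs the equal-vote consensus is two domains, while the hypothetical weights shift it to one domain, which shows how weighting by prior performance can change the outcome when methods disagree.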
dConsensus provides a comprehensive assignment of protein domains. For the first time, seven algorithmic methods are brought together, with no need to access each method separately through a web server or a local copy of the software. This aggregation permits a consensus domain assignment to be computed. Viewing the consensus alongside selected methods gives the user insight into the fundamental units of protein structure, which are central to the study of evolutionary and functional relationships.