Department of Biomedical Engineering, Wroclaw University of Science and Technology, 50-370 Wroclaw, Poland.
Methods Mol Biol. 2022;2340:1-15. doi: 10.1007/978-1-0716-1546-1_1.
Several computational methods have been developed to predict the amyloid propensity of a protein or peptide. These bioinformatics tools are time- and cost-saving alternatives to the expensive and laborious experimental methods used to confirm self-aggregation of a protein. Computational approaches not only allow preselection of reliable amyloid candidates but, most importantly, can provide a thorough and informative analysis of a protein, indicating the sequence determinants of protein aggregation and identifying potential causal mutations and likely mechanisms. Bioinformatics modeling applies several different approaches, most typically physicochemical or structure-based modeling, machine learning, or statistics-based modeling. Bioinformatics methods typically take the amino acid sequence of a protein as input; some also use additional information, for example an available structure. This chapter describes the methods currently used to computationally predict the amyloid propensity of a protein or peptide. Since the accuracy of bioinformatics methods may depend heavily on the reference data used to develop and evaluate the predictors, we also briefly present the main amyloid databases used by the authors of bioinformatics tools.