Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, 40126 Bologna, Italy.
Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies (IBIOM), Italian National Research Council (CNR), 70126 Bari, Italy.
Int J Mol Sci. 2019 Mar 27;20(7):1530. doi: 10.3390/ijms20071530.
Modern sequencing technologies provide an unprecedented amount of data on single-nucleotide variations that occur in coding regions and change the expressed protein sequences. A significant fraction of these single-residue variations is linked to disease onset and collected in public databases. In recent years, many studies have dissected the salient features of disease-related variations from different perspectives. In this work, we complement previous analyses by updating a dataset of disease-related variations occurring in proteins with a known 3D structure. Within this dataset, we describe functional and structural features that can be of interest for characterizing disease-related variations, including their major physicochemical properties, the strength of association of variation types with disease, their effect on protein stability, their location in the protein structure, and their distribution across Pfam structural/functional protein models. Our results support previous findings obtained on different datasets and introduce Pfam models as possible fingerprints of patterns of disease-related single-nucleotide variations.