Department of Biochemistry, Vanderbilt University School of Medicine-Basic Sciences, Nashville, Tennessee, USA.
Quantitative Chemical and Physical Biology Graduate Program, Vanderbilt University School of Medicine-Basic Sciences, Nashville, Tennessee, USA.
Protein Sci. 2022 Sep;31(9):e4408. doi: 10.1002/pro.4408.
Genetic missense tolerance ratio (MTR) analysis systematically evaluates every possible segment of a given protein-encoding transcript against the variants found in the human population. The method scores each segment by comparing the number of observed missense variants with the number of silent (synonymous) variants in that same segment. An MTR score of 0 indicates that no missense variants are observed within a given segment. This is indicative of evolutionary purifying selection, which excludes missense mutations in that segment from the general human population. Here, we conducted MTR analysis on each of the roughly 20,000 protein-encoding human genes. We found 257 genes (1.3% of all human genes) with at least one 31-residue-encoding segment with MTR = 0. The proteins encoded by these 257 genes were tabulated along with the sequence location of each intolerant segment, the likely function of the protein, and related information. The most functionally enriched family among these proteins is a collection of several dozen proteins that are directly involved in RNA splicing. Some of the other proteins with zero-tolerance segments have thus far escaped significant characterization. Indeed, while a number of these proteins have previously been genetically linked to human disorders, many have not. We hypothesize that this compendium of human proteins with zero-tolerance segments can be used to complement disease mutation data as a pointer to genes and proteins that are associated with interesting and underexplored human biology.
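The segment scan described above can be illustrated with a minimal sketch. It assumes hypothetical per-residue counts of observed missense and synonymous variants, and it uses a simplified score (the observed missense fraction within each window) rather than the exact MTR statistic used in the study, whose normalization is not stated in this abstract. The function names and the toy data are illustrative assumptions only. A window with no observed missense variants scores 0 under this simplification, which is the case the abstract calls a zero-tolerance segment.

```python
from typing import List, Optional, Tuple


def window_missense_fraction(
    missense: List[int], synonymous: List[int], window: int = 31
) -> List[Optional[float]]:
    """Score every sliding window of `window` residues.

    Simplified stand-in for MTR (an assumption, not the study's formula):
    observed missense / (observed missense + observed synonymous).
    Windows with no observed variants at all are scored as None.
    """
    if len(missense) != len(synonymous):
        raise ValueError("count lists must have the same length")
    scores: List[Optional[float]] = []
    for start in range(len(missense) - window + 1):
        mis = sum(missense[start:start + window])
        syn = sum(synonymous[start:start + window])
        total = mis + syn
        scores.append(mis / total if total else None)
    return scores


def zero_tolerance_segments(
    scores: List[Optional[float]], window: int = 31
) -> List[Tuple[int, int]]:
    """Return 1-based (first, last) residue positions of windows scoring 0."""
    return [(i + 1, i + window) for i, s in enumerate(scores) if s == 0]


# Toy data (hypothetical): five residues, scanned with a 3-residue window.
toy_missense = [1, 0, 0, 0, 2]
toy_synonymous = [0, 1, 0, 1, 0]
toy_scores = window_missense_fraction(toy_missense, toy_synonymous, window=3)
toy_hits = zero_tolerance_segments(toy_scores, window=3)
```

In the toy data, only the window covering residues 2 to 4 contains synonymous variants but no missense variants, so it is the only segment reported. Windows with no variants of either kind are kept separate (`None`) because an absence of all variation is not evidence of selection against missense changes specifically.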