Mutation saturation for fitness effects at human CpG sites.

Affiliations

Department of Biological Sciences, Columbia University, New York, United States.

Department of Systems Biology, Columbia University, New York, United States.

Publication information

Elife. 2021 Nov 22;10:e71513. doi: 10.7554/eLife.71513.

Abstract

Whole exome sequences have now been collected for millions of humans, with the related goals of identifying pathogenic mutations in patients and establishing reference repositories of data from unaffected individuals. As a result, we are approaching an important limit, in which datasets are large enough that, in the absence of natural selection, every highly mutable site will have experienced at least one mutation in the genealogical history of the sample. Here, we focus on CpG sites that are methylated in the germline and experience mutations to T at an elevated rate of ~10⁻⁷ per site per generation; considering synonymous mutations in a sample of 390,000 individuals, ~99% of such CpG sites harbor a C/T polymorphism. Methylated CpG sites provide a natural mutation saturation experiment for fitness effects: as we show, at current sample sizes, not seeing a non-synonymous polymorphism is indicative of strong selection against that mutation. We rely on this idea in order to directly identify a subset of CpG transitions that are likely to be highly deleterious, including ~27% of possible loss-of-function mutations, and up to 20% of possible missense mutations, depending on the type of functional site in which they occur. Unlike methylated CpGs, most mutation types, with rates on the order of 10⁻⁸ or 10⁻⁹, remain very far from saturation. We discuss what these findings imply for interpreting the potential clinical relevance of mutations from their presence or absence in reference databases and for inferences about the fitness effects of new mutations.
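The saturation argument can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' method: it assumes a simple Poisson model in which a neutral site with per-generation mutation rate mu is polymorphic with probability 1 - exp(-mu * L), where L is the total branch length of the sample genealogy in generations. The function names are hypothetical, the rates 1e-7, 1e-8 and 1e-9 are taken as round orders of magnitude, and L is back-solved from the reported ~99% of synonymous methylated CpG sites being polymorphic. Mutation rate variation across sites is ignored.

```python
import math


def prob_polymorphic(mu: float, tree_length: float) -> float:
    """Probability of at least one mutation on the genealogy (Poisson model)."""
    return 1.0 - math.exp(-mu * tree_length)


def implied_tree_length(mu: float, frac_polymorphic: float) -> float:
    """Total branch length (in generations) implied by an observed
    fraction of polymorphic neutral sites with mutation rate mu."""
    return -math.log(1.0 - frac_polymorphic) / mu


# Assumed inputs: order-of-magnitude rate for methylated CpG transitions
# and the ~99% polymorphic fraction quoted in the abstract.
MU_CPG = 1e-7
FRAC_CPG_POLYMORPHIC = 0.99

tree_length = implied_tree_length(MU_CPG, FRAC_CPG_POLYMORPHIC)

for mu in (1e-7, 1e-8, 1e-9):
    p = prob_polymorphic(mu, tree_length)
    print(f"mu = {mu:.0e}: P(polymorphic | neutral) ~ {p:.3f}")
```

Under these assumptions, a neutral site mutating at 1e-8 would be polymorphic only about 37% of the time, and one mutating at 1e-9 about 4.5% of the time. This is why the absence of a variant is informative about selection at methylated CpG sites but much less so for other mutation types.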

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b93c/8683084/d95b7a8d5d1a/elife-71513-fig1.jpg
