Department of Molecular Biology and Biochemistry, University of Malaga (UMA), 29010 Malaga, Spain.
CIBER de Enfermedades Raras, ISCIII, Madrid, Spain and Department of Molecular Biology and Biochemistry, University of Malaga (UMA), 29010 Malaga, Spain.
Brief Bioinform. 2019 Sep 27;20(5):1639-1654. doi: 10.1093/bib/bby039.
Variants within non-coding genomic regions can greatly affect disease. In recent years, increasing focus has been given to these variants, and how they can alter regulatory elements, such as enhancers, transcription factor binding sites and DNA methylation regions. Such variants can be considered regulatory variants. Concurrently, much effort has been put into establishing international consortia to undertake large projects aimed at discovering regulatory elements in different tissues, cell lines and organisms, and probing the effects of genetic variants on regulation by measuring gene expression. Here, we describe methods and techniques for discovering disease-associated non-coding variants using sequencing technologies. We then explain the computational procedures that can be used for annotating these variants using the information from the aforementioned projects, and for predicting their putative effects, including potential pathogenicity, based on rule-based and machine learning approaches. We provide the details of techniques to validate these predictions, by mapping chromatin-chromatin and chromatin-protein interactions, and introduce Clustered Regularly Interspaced Short Palindromic Repeats-Associated Protein 9 (CRISPR-Cas9) technology, which has already been used in this field and is likely to have a big impact on its future evolution. We also give examples of regulatory variants associated with multiple complex diseases. This review is aimed at bioinformaticians interested in the characterization of regulatory variants, molecular biologists and geneticists interested in understanding more about the nature and potential role of such variants from a functional point of view, and clinicians who may wish to learn about variants in non-coding genomic regions associated with a given disease and find out what to do next to uncover how they impact the underlying mechanisms.