Prakash Anil, Banerjee Moinak
Human Molecular Genetics Lab, Neurobiology and Genetics Division, BRIC-Rajiv Gandhi Centre for Biotechnology, Thiruvananthapuram, Kerala 695014, India.
Department of Biotechnology, University of Kerala, Kariavattom, Thiruvananthapuram, Kerala 695581, India.
NAR Genom Bioinform. 2025 Jun 13;7(2):lqaf080. doi: 10.1093/nargab/lqaf080. eCollection 2025 Jun.
Large-scale quantitative studies have identified significant genetic associations for various neurological disorders. Expression quantitative trait locus (eQTL) studies have shown the effect of single-nucleotide polymorphisms (SNPs) on the differential expression of genes in brain tissues. However, the large majority of these associations come from SNPs in noncoding regions, which can have significant regulatory function but are often ignored. In addition, variants in high linkage disequilibrium with the actual regulatory SNPs will also show significant associations. It is therefore important to distinguish regulatory noncoding SNPs from nonregulatory ones. To address this, we developed a deep learning model named Neur-Ally, trained on epigenomic datasets from nervous tissue and cell line samples. Taking DNA sequence as input, the model predicts the differential occurrence of regulatory features such as chromatin accessibility, histone modifications, and transcription factor binding across genomic regions. We used the model to predict, by mutagenesis, the regulatory effect of noncoding SNPs specific to neurological conditions. Neur-Ally was applied to associated SNPs reported in genome-wide association studies of neurological conditions, brain eQTLs, and autism spectrum disorder, as well as to SNPs previously reported as probable regulatory variants in neurological conditions.
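The mutagenesis step described in the abstract can be illustrated with a minimal sketch. This is not Neur-Ally's code: the encoding, the function names (`one_hot`, `toy_model`, `snp_effect`), and the scoring function are all assumptions made for illustration, and `toy_model` is a hypothetical stand-in for a trained network. The idea shown is only the general one: score the reference sequence, substitute the alternate allele, score again, and take the per-feature difference.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) array; unknown bases such as N stay all-zero."""
    arr = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            arr[i, j] = 1.0
    return arr

def toy_model(x):
    """Hypothetical stand-in for a trained model (NOT Neur-Ally).

    Returns two illustrative 'feature' scores for an encoded sequence:
    the G/C fraction and the A fraction.
    """
    gc_fraction = (x[:, 1].sum() + x[:, 2].sum()) / x.shape[0]
    a_fraction = x[:, 0].sum() / x.shape[0]
    return np.array([gc_fraction, a_fraction])

def snp_effect(ref_seq, pos, alt, model):
    """Score a single-nucleotide substitution as model(alt) - model(ref).

    pos is a 0-based index into ref_seq; alt is the alternate base.
    """
    if alt not in BASES:
        raise ValueError(f"alternate allele must be one of {BASES}")
    if not 0 <= pos < len(ref_seq):
        raise IndexError("SNP position lies outside the sequence")
    alt_seq = ref_seq[:pos] + alt + ref_seq[pos + 1:]
    return model(one_hot(alt_seq)) - model(one_hot(ref_seq))

if __name__ == "__main__":
    # Per-feature change in predicted score caused by an A>G substitution.
    print(snp_effect("AAAA", 1, "G", toy_model))
```

In a real setting the stand-in scorer would be replaced by the trained model's forward pass over a fixed-length window centred on the SNP, and a large absolute difference for a given feature would flag the variant as a candidate regulatory SNP for that feature.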