The Gene Expression Landscape of Disease Genes.

Authors

García-González Judit, Garcia-Gonzalez Saul, Liou Lathan, O'Reilly Paul F

Affiliations

Department of Genetics and Genomic Sciences, Icahn School of Medicine, Mount Sinai, New York City, NY 10029, USA.

Center for Excellence in Youth Education, Icahn School of Medicine, Mount Sinai, New York City, NY 10029, USA.

Publication information

medRxiv. 2024 Jun 21:2024.06.20.24309121. doi: 10.1101/2024.06.20.24309121.

Abstract

Fine-mapping and gene-prioritisation techniques applied to the latest Genome-Wide Association Study (GWAS) results have prioritised hundreds of genes as causally associated with disease. Here we leverage these recently compiled lists of high-confidence causal genes to interrogate where in the body disease genes operate. Specifically, we combine GWAS summary statistics, gene prioritisation results and RNA-seq gene expression data from 46 tissues and 204 cell types in relation to 16 major diseases (including 8 cancers). In tissues and cell types with well-established relevance to the disease, the prioritised genes typically have higher absolute and relative (i.e. tissue/cell specific) expression compared to non-prioritised 'control' genes. Examples include brain tissues in psychiatric disorders (P-value < 1×10⁻⁶), microglia cells in Alzheimer's Disease (P-value = 9.8×10⁻⁴) and colon mucosa in colorectal cancer (P-value < 1×10⁻⁶). We also observe significantly higher expression for disease genes in multiple tissues and cell types with no established links to the corresponding disease. While some of these results may be explained by cell types that span multiple tissues, such as macrophages in brain, blood, lung and spleen in relation to Alzheimer's disease (P-values < 1×10⁻⁶), the cause for others is unclear and motivates further investigation that may provide novel insights into disease etiology. Examples include mammary tissue in Type 2 Diabetes (P-value < 1×10⁻⁶); reproductive tissues such as breast, uterus, vagina, and prostate in Coronary Artery Disease (P-value < 1×10⁻⁶); and motor neurons in psychiatric disorders (P-value < 3×10⁻⁶). In the GTEx dataset, tissue type is the major predictor of gene expression, but the contribution of each predictor (tissue, sample, subject, batch) varies widely among disease-associated genes. Finally, we highlight genes with the highest levels of gene expression in relevant tissues to guide functional follow-up studies. Our results could offer novel insights into the tissues and cells involved in disease initiation, inform drug target and delivery strategies, highlight potential off-target effects, and exemplify the relative performance of different statistical tests for linking disease genes with tissue and cell type gene expression.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/01a7/12234044/5cc805e7073d/nihpp-2024.06.20.24309121v2-f0001.jpg
