Long-read proteogenomics to connect disease-associated sQTLs to the protein isoform effectors of disease.

Author information

Center for Public Health Genomics, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA; Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA.

Center for Public Health Genomics, School of Medicine, University of Virginia, Charlottesville, VA 22908, USA; Department of Public Health Sciences, University of Virginia, Charlottesville, VA 22908, USA.

Publication information

Am J Hum Genet. 2024 Sep 5;111(9):1914-1931. doi: 10.1016/j.ajhg.2024.07.003. Epub 2024 Jul 29.

Abstract

A major fraction of loci identified by genome-wide association studies (GWASs) mediate alternative splicing, but mechanistic interpretation is hindered by the technical limitations of short-read RNA sequencing (RNA-seq), which cannot directly link splicing events to full-length protein isoforms. Long-read RNA-seq is a powerful tool to characterize transcript isoforms and, more recently, to infer the existence of protein isoforms. Here, we present an approach that integrates information from GWASs, splicing quantitative trait loci (sQTLs), and PacBio long-read RNA-seq in a disease-relevant model to infer the effects of sQTLs on the ultimate protein isoform products they encode. We demonstrate the utility of our approach using bone mineral density (BMD) GWAS data. We identified 1,863 sQTLs from the Genotype-Tissue Expression (GTEx) project in 732 protein-coding genes that colocalized with BMD associations (H4PP ≥ 0.75). We generated PacBio Iso-Seq data (N = ∼22 million full-length reads) on human osteoblasts, identifying 68,326 protein-coding isoforms, of which 17,375 (25%) were unannotated. By casting the sQTLs onto protein isoforms, we connected 809 sQTLs to 2,029 protein isoforms from 441 genes expressed in osteoblasts. Overall, we found 74 sQTLs that influenced isoforms likely impacted by nonsense-mediated decay and 190 that potentially resulted in the expression of unannotated protein isoforms. Finally, we functionally validated colocalizing sQTLs in TPM2: siRNA-mediated knockdown in osteoblasts showed that two TPM2 isoforms had opposing effects on mineralization, whereas knockdown of the entire gene had no effect. Our approach should be generalizable across diverse clinical traits and provide insights into protein isoform activities modulated by GWAS loci.
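
The abstract does not describe the pipeline itself, so the following is only a minimal sketch of the idea of filtering colocalized sQTLs and "casting" them onto long-read isoforms. It assumes that an sQTL can be represented by the splice junction it affects and that an isoform is linked to an sQTL when it contains that junction. The class names, function names, gene labels, and coordinates are all hypothetical; only the H4PP ≥ 0.75 cutoff comes from the text.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple

# Colocalization cutoff stated in the abstract (H4PP >= 0.75).
H4_THRESHOLD = 0.75

# A splice junction as (chromosome, intron start, intron end); a simplifying assumption.
Junction = Tuple[str, int, int]


@dataclass(frozen=True)
class SQTL:
    gene: str
    junction: Junction
    h4pp: float  # posterior probability of a shared causal variant


@dataclass(frozen=True)
class Isoform:
    isoform_id: str
    gene: str
    junctions: FrozenSet[Junction]
    annotated: bool  # False for isoforms absent from the reference annotation


def colocalized(sqtls: List[SQTL], threshold: float = H4_THRESHOLD) -> List[SQTL]:
    """Keep sQTLs whose colocalization posterior meets the cutoff."""
    return [s for s in sqtls if s.h4pp >= threshold]


def cast_onto_isoforms(sqtls: List[SQTL], isoforms: List[Isoform]) -> Dict[SQTL, List[str]]:
    """Link each sQTL to isoforms of the same gene that contain its junction."""
    by_gene: Dict[str, List[Isoform]] = {}
    for iso in isoforms:
        by_gene.setdefault(iso.gene, []).append(iso)
    links: Dict[SQTL, List[str]] = {}
    for s in sqtls:
        hits = [i.isoform_id for i in by_gene.get(s.gene, []) if s.junction in i.junctions]
        if hits:  # sQTLs with no matching isoform stay unlinked
            links[s] = hits
    return links


# Toy data, invented purely for illustration.
sqtls = [
    SQTL("GENE_A", ("chr1", 100, 200), 0.91),
    SQTL("GENE_A", ("chr1", 300, 400), 0.40),
    SQTL("GENE_B", ("chr2", 50, 80), 0.80),
]
isoforms = [
    Isoform("A.1", "GENE_A", frozenset({("chr1", 100, 200)}), True),
    Isoform("A.2", "GENE_A", frozenset({("chr1", 100, 200), ("chr1", 300, 400)}), False),
    Isoform("B.1", "GENE_B", frozenset({("chr2", 10, 20)}), True),
]

kept = colocalized(sqtls)
links = cast_onto_isoforms(kept, isoforms)
unannotated_ids = {i.isoform_id for i in isoforms if not i.annotated}
for sqtl, ids in links.items():
    novel = [x for x in ids if x in unannotated_ids]
    print(sqtl.gene, sqtl.junction, "->", ids, "unannotated:", novel)
```

In this toy setup, an sQTL that passes the cutoff but whose junction is missing from every observed isoform remains unlinked. That is one plausible reading of why the abstract reports fewer connected sQTLs (809) than colocalized sQTLs (1,863).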
