The Importance of Biologic Knowledge and Gene Expression Context for Genomic Data Interpretation.

Author information

Zimmermann Michael T

Affiliations

Bioinformatics Research and Development Laboratory, Genomic Sciences and Precision Medicine Center, Medical College of Wisconsin, Milwaukee, WI, United States.

Clinical and Translational Sciences Institute, Medical College of Wisconsin, Milwaukee, WI, United States.

Publication information

Front Genet. 2018 Dec 18;9:670. doi: 10.3389/fgene.2018.00670. eCollection 2018.

Abstract

Genomic sequencing, including whole exome sequencing (WES), is enabling a higher resolution for defining diseases, understanding mechanisms, and improving the practice of clinical care. However, WES routinely identifies genomic variants with uncertain functional effects. Adding to the uncertainty in WES data interpretation, many genes can express multiple transcripts, and their relative expression may differ by body tissue. In order to interpret WES data, we need to understand not only which transcript is most relevant, but also which tissue is most relevant. In this work, we quantify how frequently differences in transcript and tissue expression affect WES data interpretation at gene, pathway, disease, and biologic network levels. We combined and analyzed multiple large and publicly available datasets to inform genomic data interpretation. Across well-established biologic pathways and genes with pathogenic disease variants, 54 and 40% have a different protein coding effect by transcript selection for, respectively, 25 and 50% of the genes contained. Additionally, strong differences in human tissue expression levels affect 33 and 19% of the same set of pathways and diseases for, respectively, 25 and 50% of the genes contained. Whole exome sequencing identifies genomic variants, but to interpret the functional effects of those variants at high resolution, we recommend building transcript selection and cross-tissue gene expression levels into hypotheses and analyses. Using current large-scale data, we show how extensively interpretation of genomic variants may differ according to transcript and tissue, across most pathways and diseases. Thus, their inclusion is necessary for WES data interpretation.
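
The pathway-level percentages above can be read as "the share of gene sets in which a given fraction of member genes is affected." The following is a minimal Python sketch of that kind of tally, not the author's pipeline. It assumes the thresholds mean "at least 25%" or "at least 50%" of the genes in a set, which the abstract does not state explicitly. The function name, the pathway names, and the gene labels are hypothetical toy values chosen only for illustration.

```python
from typing import Dict, Set


def fraction_of_sets_affected(
    gene_sets: Dict[str, Set[str]],
    affected_genes: Set[str],
    min_fraction: float,
) -> float:
    """Return the share of gene sets in which at least `min_fraction`
    of the member genes appear in `affected_genes`.

    "Affected" is left abstract here: it could mean a gene whose variant
    has a different protein coding effect depending on the transcript,
    or a gene with strong differences in expression across tissues.
    """
    if not gene_sets:
        return 0.0
    hits = 0
    for genes in gene_sets.values():
        if not genes:
            continue  # an empty set cannot meet the threshold
        share = len(genes & affected_genes) / len(genes)
        if share >= min_fraction:  # "at least" is an assumption
            hits += 1
    return hits / len(gene_sets)


# Hypothetical toy data, not taken from the study.
toy_pathways = {
    "pathway_A": {"G1", "G2", "G3", "G4"},
    "pathway_B": {"G3", "G5"},
    "pathway_C": {"G6", "G7", "G8", "G9"},
}
toy_affected = {"G1", "G3", "G6"}

at_25 = fraction_of_sets_affected(toy_pathways, toy_affected, 0.25)
at_50 = fraction_of_sets_affected(toy_pathways, toy_affected, 0.50)
print(f"sets with >=25% of genes affected: {at_25:.2f}")
print(f"sets with >=50% of genes affected: {at_50:.2f}")
```

In this toy example all three sets meet the 25% threshold, while only two of three meet the 50% threshold. That mirrors the pattern in the abstract, where the reported percentage falls as the required fraction of genes rises.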

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/351a/6305277/9b90d1f337fb/fgene-09-00670-g001.jpg
