SVExpress: identifying gene features altered recurrently in expression with nearby structural variant breakpoints.

Author information

Zhang Yiqun, Chen Fengju, Creighton Chad J

Affiliations

Dan L. Duncan Comprehensive Cancer Center Division of Biostatistics, Baylor College of Medicine, Houston, TX, 77030, USA.

Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA.

Publication information

BMC Bioinformatics. 2021 Mar 21;22(1):135. doi: 10.1186/s12859-021-04072-0.

Abstract

BACKGROUND

Combined whole-genome sequencing (WGS) and RNA sequencing of cancers offers the opportunity to identify genes with altered expression due to genomic rearrangements. Somatic structural variants (SVs), as identified by WGS, can involve altered gene cis-regulation, gene fusions, copy number alterations, or gene disruption. The absence of computational tools that streamline the integrative analysis steps may be a barrier to identifying genes recurrently altered by genomic rearrangement.

RESULTS

Here, we introduce SVExpress, a set of tools for carrying out integrative analysis of SV and gene expression data. SVExpress enables systematic cataloging of genes that consistently show increased or decreased expression in conjunction with the presence of nearby SV breakpoints. SVExpress can evaluate breakpoints in proximity to genes for potential enhancer translocation events or disruption of topologically associated domains (TADs), two mechanisms by which SVs may deregulate genes. The output from any commonly used SV calling algorithm may be easily adapted for use with SVExpress. SVExpress can readily analyze genomic datasets involving hundreds of cancer sample profiles. We used SVExpress to analyze the 327 cancer cell lines in the Cancer Cell Line Encyclopedia (CCLE) for which both SV and expression data are available. In the CCLE dataset, hundreds of genes showed altered expression in relation to nearby SV breakpoints. Altered genes involved TAD disruption, enhancer hijacking, and gene fusions. We compared the top set of SV-altered genes from cancer cell lines with the top SV-altered genes previously reported for human tumors from The Cancer Genome Atlas and the Pan-Cancer Analysis of Whole Genomes datasets. A significant number of genes overlapped in the same direction for both cell lines and tumors, while some genes were significant for cell lines but not for human tumors, and vice versa.
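The core idea of relating expression to nearby breakpoints can be illustrated with a minimal Python sketch. It is not the SVExpress implementation (which is distributed as Excel macros and R code) and does not reproduce its statistical model. The function names, the 100 kb window, the gene coordinates, and the toy data are all assumptions chosen for illustration. The sketch flags, for each gene, the samples that carry a breakpoint within the window, then compares mean expression between flagged and unflagged samples.

```python
from statistics import mean


def breakpoint_flags(genes, breakpoints, window=100_000):
    """Map each gene to the set of samples with a breakpoint within `window` bp.

    genes: {gene_name: (chrom, start, end)}  (hypothetical layout)
    breakpoints: iterable of (sample, chrom, position)
    """
    flags = {name: set() for name in genes}
    for sample, chrom, pos in breakpoints:
        for name, (g_chrom, start, end) in genes.items():
            # A breakpoint counts as "nearby" if it falls inside the gene
            # or within `window` bp upstream or downstream of it.
            if chrom == g_chrom and start - window <= pos <= end + window:
                flags[name].add(sample)
    return flags


def expression_shift(expression, flagged_samples):
    """Mean expression of flagged samples minus that of unflagged samples.

    expression: {sample: value} for one gene.
    Returns None when either group is empty, since no comparison is possible.
    """
    with_sv = [v for s, v in expression.items() if s in flagged_samples]
    without_sv = [v for s, v in expression.items() if s not in flagged_samples]
    if not with_sv or not without_sv:
        return None
    return mean(with_sv) - mean(without_sv)


# Toy data (invented for illustration, not taken from CCLE).
genes = {
    "GENE_A": ("chr1", 1_000_000, 1_050_000),
    "GENE_B": ("chr2", 500_000, 520_000),
}
breakpoints = [
    ("s1", "chr1", 950_000),    # 50 kb upstream of GENE_A
    ("s2", "chr1", 1_120_000),  # 70 kb downstream of GENE_A
    ("s3", "chr1", 2_000_000),  # too far from GENE_A
    ("s1", "chr2", 900_000),    # too far from GENE_B
]
expression_gene_a = {"s1": 8.0, "s2": 9.0, "s3": 5.0, "s4": 6.0}

flags = breakpoint_flags(genes, breakpoints)
shift = expression_shift(expression_gene_a, flags["GENE_A"])
print(sorted(flags["GENE_A"]), shift)
```

A real analysis would replace the simple difference in means with a statistical test across many samples and would likely adjust for confounders such as copy number or cancer type; the sketch only shows how breakpoint proximity can be turned into a per-gene, per-sample flag that is then related to expression.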

CONCLUSION

Our SVExpress tools allow computational biologists with a working knowledge of R to integrate gene expression with SV breakpoint data to identify recurrently altered genes. SVExpress is freely available for academic or commercial use at https://github.com/chadcreighton/SVExpress. It is implemented as a set of Excel macros and R code, and all source code (R and Visual Basic for Applications) is available.


Fig. 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2df9/7981925/6c4975475c28/12859_2021_4072_Fig1_HTML.jpg
