FusionGDB: fusion gene annotation DataBase.

Affiliations

Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

Publication information

Nucleic Acids Res. 2019 Jan 8;47(D1):D994-D1004. doi: 10.1093/nar/gky1067.

Abstract

Gene fusion, which arises from chromosomal rearrangement initiated by DNA double-strand breakage, is one of the hallmarks of the cancer genome. To date, many fusion genes (FGs) have been established as important biomarkers and therapeutic targets in multiple cancer types. To better understand the function of FGs across cancer types and to promote the discovery of clinically relevant FGs, we built FusionGDB (Fusion Gene annotation DataBase), available at https://ccsm.uth.edu/FusionGDB. We collected 48 117 FGs across pan-cancer from three representative fusion gene resources: the improved database of chimeric transcripts and RNA-seq data (ChiTaRS 3.1), an integrative resource for cancer-associated transcript fusions (TumorFusions), and The Cancer Genome Atlas (TCGA) fusions by Gao et al. For these ∼48K FGs, we performed functional annotations including gene assessment across pan-cancer fusion genes, open reading frame (ORF) assignment, and retention search of 39 protein features based on gene structures of multiple isoforms with different breakpoints. We also provide the fusion transcript and amino acid sequences for multiple breakpoints and transcript isoforms. Our analyses identified 331, 303 and 667 in-frame FGs retaining kinase, DNA-binding and epigenetic factor domains, respectively, as well as 976 FGs that lost protein-protein interactions. FusionGDB provides six categories of annotations: FusionGeneSummary, FusionProtFeature, FusionGeneSequence, FusionGenePPI, RelatedDrug and RelatedDisease.
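The ORF assignment and protein-feature retention search mentioned in the abstract can be illustrated with a minimal sketch. This is not FusionGDB's actual pipeline. The names `Feature`, `is_in_frame` and `retained_features` are hypothetical, and the rules are simplifying assumptions: a fusion is called in-frame when the 5' partner's retained coding length and the 3' partner's skipped coding length agree modulo 3, and a feature counts as retained only when it lies entirely within the partner's surviving amino acid range.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Feature:
    """A protein feature with 1-based, inclusive amino acid coordinates."""
    name: str
    start: int
    end: int


def is_in_frame(cds_bases_kept_5p: int, cds_bases_skipped_3p: int) -> bool:
    """Assumed rule: the reading frame is preserved when the coding bases kept
    from the 5' partner and the coding bases skipped in the 3' partner leave
    the same remainder modulo 3."""
    return cds_bases_kept_5p % 3 == cds_bases_skipped_3p % 3


def retained_features(features: List[Feature], kept_start: int, kept_end: int) -> List[Feature]:
    """Return the features that lie entirely inside the surviving residue range.

    For a 5' partner the surviving range is (1, breakpoint residue);
    for a 3' partner it is (breakpoint residue, protein length).
    """
    return [f for f in features if f.start >= kept_start and f.end <= kept_end]


if __name__ == "__main__":
    # Hypothetical partner protein with one kinase domain.
    features = [Feature("kinase domain", 100, 350)]

    # As a 5' partner broken after residue 400, the domain survives.
    print(retained_features(features, 1, 400))

    # As a 3' partner resuming at residue 200 of 600, the domain is truncated.
    print(retained_features(features, 200, 600))

    print(is_in_frame(300, 150))
    print(is_in_frame(301, 150))
```

In practice such a check would be repeated for every transcript isoform and breakpoint pair, since the same genomic breakpoint can fall at different coding positions in different isoforms.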

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cfda/6323909/e1925a4ab9b2/gky1067fig1.jpg
