bamSliceR: a Bioconductor package for rapid, cross-cohort variant and allelic bias analysis.

Author information

Huang Yizhou Peter, Harmon Lauren, Deering-Gardner Eve, Ma Xiaotu, Harsh Josiah, Xue Zhaoyu, Wen Hong, Ramos Marcel, Davis Sean, Triche Timothy J

Affiliations

Michigan State University, East Lansing, MI, US.

Van Andel Institute, Grand Rapids, MI, US.

Publication information

bioRxiv. 2024 Nov 27:2023.09.15.558026. doi: 10.1101/2023.09.15.558026.

Abstract

The NCI Genomic Data Commons (GDC) provides controlled access to sequencing data from thousands of subjects, enabling large-scale study of impactful genetic alterations such as simple and complex germline and structural variants. However, efficient analysis requires significant computational resources and expertise, especially when recalling variants from raw sequence reads. We thus developed bamSliceR, an R/Bioconductor package that builds upon the GenomicDataCommons package to extract aligned sequence reads from cross-GDC meta-cohorts, followed by targeted analysis of variants and effects (including transcript-aware variant annotation from transcriptome-aligned GDC RNA data). Here we demonstrate population-scale genomic and transcriptomic analyses with minimal compute burden via bamSliceR, identifying recurrent, clinically relevant sequence and structural variants in the TARGET AML and BEAT-AML cohorts. We then validate results in the (non-GDC) Leucegene cohort, demonstrating how the bamSliceR pipeline can be seamlessly applied to replicate findings in non-GDC cohorts. These variants directly yield clinically impactful and biologically testable hypotheses for mechanistic investigation. bamSliceR has been submitted to the Bioconductor project, where it is presently under review, and is available on GitHub at https://github.com/trichelab/bamSliceR.
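To make the idea of extracting only targeted reads concrete, the following is a minimal sketch of how a region-restricted request to the GDC REST API's BAM slicing endpoint can be assembled. It is not bamSliceR's interface (bamSliceR is an R package and its functions are not shown in this abstract); the helper name `build_slice_request`, the file UUID, the region, and the token are all hypothetical placeholders. The sketch only builds the request and performs no network access.

```python
from urllib.parse import urlencode

# Public GDC REST endpoint for BAM slicing (controlled-access files need a token).
GDC_SLICING_ENDPOINT = "https://api.gdc.cancer.gov/slicing/view"


def build_slice_request(file_uuid, regions, token):
    """Return (url, headers) for a GDC BAM slicing request.

    Hypothetical helper for illustration only; it is not part of bamSliceR.
    file_uuid: GDC UUID of a BAM file.
    regions:   list of strings such as "chr1:100-200" or "chr2".
    token:     GDC authentication token, sent in the X-Auth-Token header.
    """
    if not regions:
        raise ValueError("at least one region is required")
    # Repeat the "region" parameter once per region; keep ':' readable.
    query = urlencode([("region", r) for r in regions], safe=":")
    url = f"{GDC_SLICING_ENDPOINT}/{file_uuid}?{query}"
    headers = {"X-Auth-Token": token}
    return url, headers


if __name__ == "__main__":
    # All values below are placeholders, not real identifiers or coordinates.
    url, headers = build_slice_request(
        "00000000-0000-0000-0000-000000000000",
        ["chr1:100-200"],
        token="PLACEHOLDER_TOKEN",
    )
    print(url)
```

Requesting only the regions of interest means that a small BAM slice, rather than the full alignment file, is transferred per subject, which is the kind of saving the abstract refers to as minimal compute burden.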

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/abc9/11639341/b961cd49742d/nihpp-2023.09.15.558026v3-f0001.jpg
