Romanel Alessandro, Lago Sara, Prandi Davide, Sboner Andrea, Demichelis Francesca
Centre for Integrative Biology (CIBIO), University of Trento, Trento, Italy.
Department of Pathology and Laboratory Medicine, Weill Cornell Medical College, New York, USA.
BMC Med Genomics. 2015 Mar 1;8:9. doi: 10.1186/s12920-015-0084-2.
Single-base-level information from next-generation sequencing (NGS) allows quantitative assessment of biological phenomena such as mosaicism or allele-specific features in healthy and diseased cells. Such studies are often computationally demanding, which hinders genome-wide investigations across the large datasets now becoming available through the 1000 Genomes Project and The Cancer Genome Atlas (TCGA) initiatives.
We present ASEQ, a tool to perform gene-level allele-specific expression (ASE) analysis from paired genomic and transcriptomic NGS data without requiring paternal and maternal genome data. ASEQ offers an easy-to-use set of modes that, transparently to the user, take full advantage of a built-in fast computational engine. We report its performance on a set of 20 individuals from the 1000 Genomes Project and show its detection power on imprinted genes. Next, we demonstrate a high level of concordance in ASE calls when comparing it with the AlleleSeq and MBASED tools. Finally, using a prostate cancer dataset, we report a higher fraction of ASE genes than in healthy individuals and show allele-specific events nominated by ASEQ in genes that are implicated in the disease.
ASEQ can be used to rapidly and reliably screen large NGS datasets for the identification of allele-specific features. It can be integrated into any NGS pipeline and runs on computer systems with multiple CPUs, on CPUs with multiple cores, or across clusters of machines.
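To make the notion of an allele-specific expression call concrete, the following is a minimal sketch of one common way to test for allelic imbalance at a single heterozygous site: an exact two-sided binomial test of the reference and alternative read counts against an expected 50:50 ratio. This is an illustrative assumption, not ASEQ's documented procedure, which the abstract does not describe. The function names, the 0.05 significance level, and the minimum coverage of 10 reads are all hypothetical choices made for the example.

```python
from math import exp, lgamma, log


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for observing ref_reads out of
    ref_reads + alt_reads, given an expected reference fraction p (0 < p < 1).

    Hypothetical helper for illustration; not part of ASEQ.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0

    def pmf(k: int) -> float:
        # Work in log space so that deep coverage does not overflow.
        log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        return exp(log_choose + k * log(p) + (n - k) * log(1.0 - p))

    observed = pmf(ref_reads)
    # Sum the probability of every outcome at most as likely as the observed
    # one; the small relative tolerance absorbs floating-point ties.
    total = sum(q for q in map(pmf, range(n + 1)) if q <= observed * (1 + 1e-9))
    return min(1.0, total)


def is_allelic_imbalance(ref_reads: int, alt_reads: int,
                         alpha: float = 0.05, min_coverage: int = 10) -> bool:
    """Flag a heterozygous site as imbalanced (assumed thresholds)."""
    if ref_reads + alt_reads < min_coverage:
        return False  # too few reads to make a call
    return binomial_two_sided_p(ref_reads, alt_reads) < alpha


if __name__ == "__main__":
    # Made-up read counts, purely for demonstration.
    for ref, alt in [(30, 5), (12, 10), (3, 1)]:
        print(ref, alt, is_allelic_imbalance(ref, alt))
```

In a gene-level analysis, per-site results like these would still need to be combined across the heterozygous sites of each gene and corrected for multiple testing; how that aggregation is done is a design choice this sketch leaves open.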