RVboost: RNA-seq variants prioritization using a boosting method.

Author information

Wang Chen, Davila Jaime I, Baheti Saurabh, Bhagwate Aditya V, Wang Xue, Kocher Jean-Pierre A, Slager Susan L, Feldman Andrew L, Novak Anne J, Cerhan James R, Thompson E Aubrey, Asmann Yan W

Affiliations

Division of Biomedical Statistics and Informatics, Mayo Clinic, 200 First Street SW, Rochester MN 55905, Department of Health Sciences Research, Mayo Clinic, 4500 San Pablo Road South, Jacksonville FL 32224, Department of Laboratory Medicine and Pathology, Division of Hematology, Department of Internal Medicine, Division of Epidemiology, Department of Health Sciences Research, Mayo Clinic, 200 First Street SW, Rochester MN 55905 and Department of Cancer Biology, Mayo Clinic, 4500 San Pablo Road South, Jacksonville FL 32224, USA.

Publication information

Bioinformatics. 2014 Dec 1;30(23):3414-6. doi: 10.1093/bioinformatics/btu577. Epub 2014 Aug 27.

DOI:10.1093/bioinformatics/btu577

PMID:25170027

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4296157/

Abstract

MOTIVATION

RNA-seq has become the method of choice to quantify genes and exons, discover novel transcripts and detect fusion genes. However, reliable variant identification from RNA-seq data remains challenging because of the complexities of the transcriptome, the challenges of accurately mapping exon boundary spanning reads and the bias introduced during the sequencing library preparation.

METHOD

We developed RVboost, a novel method specific for RNA variant prioritization. RVboost uses several attributes unique to the process of RNA library preparation, sequencing and RNA-seq data analyses. It uses a boosting method to train a model of 'good quality' variants using common variants from HapMap, and prioritizes and calls the RNA variants based on the trained model. We packaged RVboost in a comprehensive workflow, which integrates tools for variant calling, annotation and filtering.
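The train-then-prioritize idea described above can be sketched in a few lines. This is only an illustration, not the RVboost implementation: it uses a plain AdaBoost with one-feature threshold stumps, the two feature columns and all numbers are invented, and treating calls absent from HapMap as the contrast class is an assumption made here (the abstract does not say how the contrast set is built).

```python
import math

def stump(row, feature, threshold, polarity):
    """Weak learner: vote `polarity` if the feature is >= threshold, else the opposite."""
    return polarity if row[feature] >= threshold else -polarity

def train_adaboost(X, y, rounds=10):
    """Train AdaBoost on rows X with labels y in {+1, -1}; returns weighted stumps."""
    n = len(X)
    weights = [1.0 / n] * n
    model = []  # entries are (alpha, feature, threshold, polarity)
    for _ in range(rounds):
        best = None
        # Exhaustively pick the stump with the lowest weighted error.
        for feature in range(len(X[0])):
            for threshold in sorted({row[feature] for row in X}):
                for polarity in (1, -1):
                    err = sum(w for w, row, label in zip(weights, X, y)
                              if stump(row, feature, threshold, polarity) != label)
                    if best is None or err < best[0]:
                        best = (err, feature, threshold, polarity)
        err, feature, threshold, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, feature, threshold, polarity))
        # Up-weight misclassified rows, down-weight correct ones, then renormalize.
        weights = [w * math.exp(-alpha * label * stump(row, feature, threshold, polarity))
                   for w, row, label in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def score(model, row):
    """Weighted vote of all stumps; higher means more like the 'good quality' class."""
    return sum(alpha * stump(row, f, t, p) for alpha, f, t, p in model)

# Hypothetical features: [distance to exon boundary (bp), fraction of support in first 6 bases]
hapmap_like = [[50, 0.05], [120, 0.0], [35, 0.10], [80, 0.02]]   # label +1 (assumed good)
other_calls = [[2, 0.90], [1, 0.70], [3, 0.80], [60, 0.95]]      # label -1 (assumed contrast)
X = hapmap_like + other_calls
y = [1] * len(hapmap_like) + [-1] * len(other_calls)

model = train_adaboost(X, y)
candidates = {"varA": [70, 0.03], "varB": [2, 0.85]}  # invented candidate variants
ranked = sorted(candidates.items(), key=lambda kv: score(model, kv[1]), reverse=True)
```

Sorting candidates by the model score gives the prioritized list; a calling step would then apply a score cutoff, the choice of which is outside this sketch.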

RESULTS

RVboost consistently outperforms the variant quality score recalibration from the Genome Analysis Toolkit (GATK) and the RNA-seq variant-calling pipeline SNPiR in 12 RNA-seq samples using ground-truth variants from paired exome sequencing data. Several RNA-seq-specific attributes were identified as critical to differentiate true and false variants, including the distance of the variant positions to exon boundaries, and the percent of the reads supporting the variant in the first six base pairs. The latter identifies false variants introduced by the random hexamer priming during the library construction.
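As a rough illustration of the two attributes named above, the sketch below computes them from simplified inputs. The function names, the input layout (exons as coordinate pairs, supporting reads summarized by the 0-based offset of the variant from the read start) and the example numbers are all assumptions for illustration; RVboost's exact definitions, for example how strand or read orientation is handled, may differ.

```python
def distance_to_exon_boundary(pos, exons):
    """Smallest absolute distance from `pos` to any exon start or end coordinate."""
    return min(abs(pos - boundary) for start, end in exons for boundary in (start, end))

def pct_support_in_first_bases(variant_offsets, window=6):
    """Percent of variant-supporting reads whose variant base lies in the first `window` bases.

    `variant_offsets` holds, per supporting read, the 0-based offset of the
    variant from the start of that read.
    """
    if not variant_offsets:
        return 0.0
    near_start = sum(1 for offset in variant_offsets if offset < window)
    return 100.0 * near_start / len(variant_offsets)

# Invented example: two exons and a variant two bases from the end of the first one.
exons = [(100, 200), (300, 400)]
dist = distance_to_exon_boundary(198, exons)

# Invented example: three of four supporting reads carry the variant in their first six bases.
pct = pct_support_in_first_bases([0, 3, 5, 40])
```

A high percentage from the second function is the pattern the abstract associates with random hexamer priming artifacts, so a classifier can learn to down-weight such calls.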

AVAILABILITY AND IMPLEMENTATION

The RVboost package is implemented to readily run in Mac or Linux environments. The software and user manual are available at http://bioinformaticstools.mayo.edu/research/rvboost/.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b3e3/4296157/8bcc06d96537/btu577f1.jpg

Similar articles

RVboost: RNA-seq variants prioritization using a boosting method.

Bioinformatics. 2014 Dec 1;30(23):3414-6. doi: 10.1093/bioinformatics/btu577. Epub 2014 Aug 27.

MAP-RSeq: Mayo Analysis Pipeline for RNA sequencing.

BMC Bioinformatics. 2014 Jun 27;15:224. doi: 10.1186/1471-2105-15-224.

The eSNV-detect: a computational system to identify expressed single nucleotide variants from transcriptome sequencing data.

Nucleic Acids Res. 2014 Dec 16;42(22):e172. doi: 10.1093/nar/gku1005. Epub 2014 Oct 28.

Grape RNA-Seq analysis pipeline environment.

Bioinformatics. 2013 Mar 1;29(5):614-21. doi: 10.1093/bioinformatics/btt016. Epub 2013 Jan 17.

Indel sensitive and comprehensive variant/mutation detection from RNA sequencing data for precision medicine.

BMC Med Genomics. 2018 Sep 14;11(Suppl 3):67. doi: 10.1186/s12920-018-0391-5.

Reliable identification of genomic variants from RNA-seq data.

Am J Hum Genet. 2013 Oct 3;93(4):641-51. doi: 10.1016/j.ajhg.2013.08.008. Epub 2013 Sep 26.

SPARTA: Simple Program for Automated reference-based bacterial RNA-seq Transcriptome Analysis.

BMC Bioinformatics. 2016 Feb 4;17:66. doi: 10.1186/s12859-016-0923-y.

PASSion: a pattern growth algorithm-based pipeline for splice junction detection in paired-end RNA-Seq data.

Bioinformatics. 2012 Feb 15;28(4):479-86. doi: 10.1093/bioinformatics/btr712. Epub 2012 Jan 4.

VaDiR: an integrated approach to Variant Detection in RNA.

Gigascience. 2018 Feb 1;7(2):1-13. doi: 10.1093/gigascience/gix122.

tarSVM: Improving the accuracy of variant calls derived from microfluidic PCR-based targeted next generation sequencing using a support vector machine.

BMC Bioinformatics. 2016 Jun 10;17(1):233. doi: 10.1186/s12859-016-1108-4.

Cited by

Precise detection of differential RNA editing sites across varied biological conditions using the CADRES pipeline.

Sci Rep. 2025 Jun 4;15(1):19683. doi: 10.1038/s41598-025-04957-7.

Variant calling from RNA-Seq data reveals allele-specific differential expression of pathogenic cancer variants.

Commun Med (Lond). 2025 May 28;5(1):202. doi: 10.1038/s43856-025-00901-y.

Renal ischemia alters the mRNA and miRNA profile of vasculature-related genes in scattered tubular-like cells from female pigs.

Am J Physiol Renal Physiol. 2025 May 1;328(5):F724-F735. doi: 10.1152/ajprenal.00334.2024. Epub 2025 Apr 17.

DEMINING: A deep learning model embedded framework to distinguish RNA editing from DNA mutations in RNA sequencing data.

Genome Biol. 2024 Oct 8;25(1):258. doi: 10.1186/s13059-024-03397-2.

YAP-TEAD inhibition is associated with upregulation of an androgen receptor mediated transcription program providing therapeutic escape.

FEBS Open Bio. 2024 Nov;14(11):1873-1887. doi: 10.1002/2211-5463.13901. Epub 2024 Sep 19.

Sorbs2 Deficiency and Vascular BK Channelopathy in Diabetes.

Circ Res. 2024 Mar 29;134(7):858-871. doi: 10.1161/CIRCRESAHA.123.323538. Epub 2024 Feb 16.

Chronic kidney disease and left ventricular diastolic dysfunction (CKD-LVDD) alter cardiac expression of mitochondria-related genes in swine.

Transl Res. 2024 May;267:67-78. doi: 10.1016/j.trsl.2023.12.004. Epub 2024 Jan 22.

Renal ischemia alters the transcriptomic and epigenetic profile of inflammatory genes in swine scattered tubular-like cells.

Clin Sci (Lond). 2023 Aug 31;137(16):1265-1283. doi: 10.1042/CS20230555.

AT-101 Enhances the Antitumor Activity of Lenalidomide in Patients with Multiple Myeloma.

Cancers (Basel). 2023 Jan 12;15(2):477. doi: 10.3390/cancers15020477.

Predicting response to immune checkpoint blockade in NSCLC with tumour-only RNA-seq.

Br J Cancer. 2023 Apr;128(6):1148-1154. doi: 10.1038/s41416-022-02105-w. Epub 2022 Dec 26.

References

RADAR: a rigorously annotated database of A-to-I RNA editing.

Nucleic Acids Res. 2014 Jan;42(Database issue):D109-13. doi: 10.1093/nar/gkt996. Epub 2013 Oct 25.

Reliable identification of genomic variants from RNA-seq data.

Am J Hum Genet. 2013 Oct 3;93(4):641-51. doi: 10.1016/j.ajhg.2013.08.008. Epub 2013 Sep 26.

Lack of evidence for existence of noncanonical RNA editing.

Nat Biotechnol. 2013 Jan;31(1):19-20. doi: 10.1038/nbt.2472.

RNA-Seq and human complex diseases: recent accomplishments and future perspectives.

Eur J Hum Genet. 2013 Feb;21(2):134-42. doi: 10.1038/ejhg.2012.129. Epub 2012 Jun 27.

A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3.

Fly (Austin). 2012 Apr-Jun;6(2):80-92. doi: 10.4161/fly.19695.

A novel bioinformatics pipeline for identification and characterization of fusion transcripts in breast cancer and normal cell lines.

Nucleic Acids Res. 2011 Aug;39(15):e100. doi: 10.1093/nar/gkr362. Epub 2011 May 27.

A framework for variation discovery and genotyping using next-generation DNA sequencing data.

Nat Genet. 2011 May;43(5):491-8. doi: 10.1038/ng.806. Epub 2011 Apr 10.

TopHat: discovering splice junctions with RNA-Seq.

Bioinformatics. 2009 May 1;25(9):1105-11. doi: 10.1093/bioinformatics/btp120. Epub 2009 Mar 16.
