A fuzzy method for RNA-Seq differential expression analysis in presence of multireads.

Authors

Consiglio Arianna, Mencar Corrado, Grillo Giorgio, Marzano Flaviana, Caratozzolo Mariano Francesco, Liuni Sabino

Affiliations

Institute for Biomedical Technologies of Bari - ITB, National Research Council, Bari, 70126, Italy.

Department of Informatics, University of Bari Aldo Moro, Bari, 70121, Italy.

Publication information

BMC Bioinformatics. 2016 Nov 8;17(Suppl 12):345. doi: 10.1186/s12859-016-1195-2.

DOI:10.1186/s12859-016-1195-2
PMID:28185579
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5123383/
Abstract

BACKGROUND

When the reads obtained from high-throughput RNA sequencing are mapped against a reference database, a significant proportion of them - known as multireads - can map to more than one reference sequence. These multireads originate from gene duplications, repetitive regions or overlapping genes. Removing the multireads from the mapping results, in RNA-Seq analyses, causes an underestimation of the read counts, while estimating the real read count can lead to false positives during the detection of differentially expressed sequences.
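The trade-off described above can be made concrete with a small counting sketch. It is only an illustration, not the authors' pipeline: the function name `count_bounds`, the toy `alignments` dictionary, and the idea of reporting a (lower, upper) pair per gene are assumptions introduced here. The lower bound counts only uniquely mapped reads (what is left after discarding multireads), and the upper bound adds every multiread that could belong to the gene.

```python
from collections import defaultdict


def count_bounds(alignments):
    """Return {gene: (lower, upper)} read-count bounds.

    alignments: dict mapping a read id to the set of genes it aligns to
    (hypothetical input format, chosen for this illustration).
    lower = reads mapping only to this gene (multireads discarded)
    upper = lower + multireads that could originate from this gene
    """
    unique = defaultdict(int)
    multi = defaultdict(int)
    for genes in alignments.values():
        genes = set(genes)
        if len(genes) == 1:
            unique[next(iter(genes))] += 1
        else:
            # A multiread is a candidate for every gene it maps to.
            for gene in genes:
                multi[gene] += 1
    all_genes = set(unique) | set(multi)
    return {g: (unique[g], unique[g] + multi[g]) for g in sorted(all_genes)}


# Toy data: r3 and r5 are multireads.
alignments = {
    "r1": {"geneA"},
    "r2": {"geneA"},
    "r3": {"geneA", "geneB"},
    "r4": {"geneB"},
    "r5": {"geneB", "geneC"},
}
bounds = count_bounds(alignments)
for gene, (lower, upper) in bounds.items():
    print(gene, lower, upper)
```

In this toy data, geneC has no uniquely mapped read at all, so discarding multireads would report it as unexpressed, while its true count lies somewhere between the two bounds.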

RESULTS

We present an innovative approach to deal with multireads and evaluate differential expression events, entirely based on fuzzy set theory. Since multireads cause uncertainty in the estimation of read counts during gene expression computation, they can also influence the reliability of differential expression analysis results, by producing false positives. Our method manages the uncertainty in gene expression estimation by defining the fuzzy read counts and evaluates the possibility of a gene to be differentially expressed with three fuzzy concepts: over-expression, same-expression and under-expression. The output of the method is a list of differentially expressed genes enriched with information about the uncertainty of the results due to the multiread presence. We have tested the method on RNA-Seq data designed for case-control studies and we have compared the obtained results with other existing tools for read count estimation and differential expression analysis.
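One way to picture how uncertain counts propagate to the three fuzzy concepts is the sketch below. The abstract does not give the membership functions or the definition of a fuzzy read count, so everything here is an assumption made for illustration: each count is simplified to an interval (lower, upper), the log2 fold change is bounded with a pseudocount, and over-, same- and under-expression are modelled as simple ramp-shaped fuzzy sets with a hypothetical threshold `t`. The possibility of each concept is then the largest membership value reachable inside the fold-change interval.

```python
import math


def lfc_interval(case, control, pseudo=1.0):
    """Bounds of the log2 fold change given interval counts (lower, upper)."""
    (c_lo, c_hi), (k_lo, k_hi) = case, control
    lo = math.log2((c_lo + pseudo) / (k_hi + pseudo))
    hi = math.log2((c_hi + pseudo) / (k_lo + pseudo))
    return lo, hi


def ramp(x, t):
    """Membership that is 0 for x <= 0, 1 for x >= t, linear in between."""
    return min(1.0, max(0.0, x / t))


def possibilities(case, control, t=1.0):
    """Possibility of over-, same- and under-expression.

    Assumed memberships (not taken from the paper):
      over(x)  = ramp(x, t)
      under(x) = ramp(-x, t)
      same(x)  = 1 - ramp(|x|, t)
    Possibility = supremum of the membership over the interval [lo, hi].
    """
    lo, hi = lfc_interval(case, control)
    over = ramp(hi, t)      # increasing membership: maximum at the upper end
    under = ramp(-lo, t)    # decreasing membership: maximum at the lower end
    # 'same' peaks at 0, so use the point of the interval closest to 0.
    closest = 0.0 if lo <= 0.0 <= hi else min(abs(lo), abs(hi))
    same = 1.0 - ramp(closest, t)
    return {"over": over, "same": same, "under": under}


# No multireads: counts are crisp, and the call is unambiguous.
crisp = possibilities(case=(30, 30), control=(10, 10))
# Many multireads in the case sample: over-expression is possible,
# but so is same-expression, which flags a potential false positive.
uncertain = possibilities(case=(10, 40), control=(10, 10))
print(crisp)
print(uncertain)
```

Under these assumptions, a gene whose result is both possibly over-expressed and possibly same-expressed is exactly the kind of event the abstract suggests annotating with uncertainty rather than reporting as a plain positive.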

CONCLUSIONS

Managing multireads with fuzzy sets makes it possible to obtain a list of differential expression events that takes into account the uncertainty in the results caused by the presence of multireads. Biologists can use this additional information when selecting the most relevant differential expression events to validate with laboratory assays. Our method can be used to compute reliable differential expression events and to highlight possible false positives in the lists of differentially expressed genes computed with other tools.
