UMARS: Un-MAppable Reads Solution.

Affiliation

Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan.

Publication information

BMC Bioinformatics. 2011 Feb 15;12 Suppl 1(Suppl 1):S9. doi: 10.1186/1471-2105-12-S1-S9.

DOI:10.1186/1471-2105-12-S1-S9
PMID:21342592
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3044317/
Abstract

BACKGROUND

Un-MAppable Reads Solution (UMARS) is a user-friendly web service for retrieving valuable information from sequence reads that cannot be mapped back to reference genomes. Next-generation sequencing (NGS) technology has emerged as a powerful tool for generating high-throughput sequencing data and has been applied to many kinds of biological research. In a typical analysis, adaptor-trimmed NGS reads are first mapped back to reference sequences, such as genomes or transcripts. However, a fraction of NGS reads fail to map to the reference sequences. Such un-mappable reads are usually attributed to sequencing errors and discarded without further consideration.
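
As a rough illustration of where un-mappable reads come from in practice, the sketch below collects reads that an aligner reported as unmapped. It assumes the aligner output is in SAM text format, where bit 0x4 of the FLAG field marks an unmapped read; the abstract does not say which format or aligner UMARS itself consumes, and the function name `unmapped_reads` is hypothetical.

```python
def unmapped_reads(sam_lines):
    """Yield (read_name, sequence) for SAM records flagged as unmapped.

    Assumption: input is SAM text; this is not taken from the UMARS paper.
    """
    for line in sam_lines:
        # Skip blank lines and header lines (SAM headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM record has 11 mandatory fields
            continue
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        if flag & 0x4:  # bit 0x4 set: the read did not map
            yield name, seq


# Tiny made-up example: r1 is mapped, r2 is not.
sam = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGACCAA\tIIIIIIII",
]
result = list(unmapped_reads(sam))
```

Reads collected this way are the ones a typical workflow would discard, and they are the kind of input a service such as UMARS is meant to re-examine.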

METHODS

We investigated the possible biological relevance and sources of un-mappable reads. To that end, we developed UMARS to scan un-mappable reads for viral genomic fragments and for exon-exon junctions of novel alternative splicing isoforms. To provide alignment targets for these reads, we first collected viral genomes and exon-exon junction sequences. We then built the UMARS pipeline as an automatic alignment interface.
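
The idea of aligning un-mappable reads against exon-exon junction sequences can be sketched as follows. This is a minimal illustration and not the UMARS implementation: it assumes junction sequences are built by joining the tail of one exon to the head of a later exon of the same gene, and it uses exact substring matching in place of a real short-read aligner. The names `junction_library` and `scan`, the `flank` and `min_overhang` parameters, and the toy sequences are all hypothetical.

```python
from itertools import combinations


def junction_library(exons, flank):
    """Build candidate exon-exon junction sequences for one gene.

    exons: exon sequences in transcript order.
    Returns {(i, j): (junction_sequence, boundary_offset)} for every i < j,
    using 1-based exon numbers.
    """
    lib = {}
    for (i, a), (j, b) in combinations(enumerate(exons, start=1), 2):
        left, right = a[-flank:], b[:flank]
        lib[(i, j)] = (left + right, len(left))
    return lib


def scan(reads, lib, min_overhang=4):
    """Report reads that span a junction with enough bases on both sides.

    Exact matching only (first occurrence); a real pipeline would use an
    aligner that tolerates mismatches.
    """
    hits = {}
    for name, seq in reads:
        for key, (jseq, boundary) in lib.items():
            pos = jseq.find(seq)
            if pos == -1:
                continue
            left_overhang = boundary - pos
            right_overhang = pos + len(seq) - boundary
            if left_overhang >= min_overhang and right_overhang >= min_overhang:
                hits.setdefault(key, []).append(name)
    return hits


# Toy data: three short exons and three un-mappable reads.
exons = ["AAAACCCCGG", "TTTTGGGGAA", "CCGGTTAACC"]
lib = junction_library(exons, flank=6)
reads = [
    ("r1", "CCGGTTTT"),  # spans the exon 1 / exon 2 junction
    ("r2", "GGAACCGG"),  # spans the exon 2 / exon 3 junction
    ("r3", "CCCCGG"),    # lies entirely inside exon 1, so it is not a junction read
]
hits = scan(reads, lib)
```

The overhang check is what separates a junction-spanning read from one that merely falls inside a single exon. The same scanning loop could be pointed at a collection of viral genome sequences instead of junction sequences.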

RESULTS

We demonstrate the applicability of UMARS with the results of two alignment cases. First, UMARS detected the expected EBV genomic fragments. Second, it detected exon-exon junctions among un-mappable reads. Further experimental validation supported the reliability of the UMARS pipeline. The UMARS service is freely available to the academic community and can be accessed via http://musk.ibms.sinica.edu.tw/UMARS/.

CONCLUSIONS

In this study, we have shown that some un-mappable reads are not caused by sequencing errors; they can originate from viral infection or transcript splicing. The UMARS pipeline provides another way to examine and recycle un-mappable reads that are commonly discarded as garbage.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e26/3044317/256463985303/1471-2105-12-S1-S9-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e26/3044317/fb5cdafe5964/1471-2105-12-S1-S9-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e26/3044317/1943e221b8a8/1471-2105-12-S1-S9-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e26/3044317/b1ecd9ca35ab/1471-2105-12-S1-S9-4.jpg
