Profile hidden Markov models for the detection of viruses within metagenomic sequence data.

Authors

Skewes-Cox Peter, Sharpton Thomas J, Pollard Katherine S, DeRisi Joseph L

Affiliations

Biological and Medical Informatics Graduate Program, University of California San Francisco, San Francisco, California, United States of America; Departments of Medicine, Biochemistry and Biophysics, and Microbiology, University of California San Francisco, San Francisco, California, United States of America; Howard Hughes Medical Institute, Bethesda, Maryland, United States of America.

The J. David Gladstone Institutes, University of California San Francisco, San Francisco, California, United States of America.

Publication information

PLoS One. 2014 Aug 20;9(8):e105067. doi: 10.1371/journal.pone.0105067. eCollection 2014.

DOI:10.1371/journal.pone.0105067
PMID:25140992
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4139300/
Abstract

Rapid, sensitive, and specific virus detection is an important component of clinical diagnostics. Massively parallel sequencing enables new diagnostic opportunities that complement traditional serological and PCR based techniques. While massively parallel sequencing promises the benefits of being more comprehensive and less biased than traditional approaches, it presents new analytical challenges, especially with respect to detection of pathogen sequences in metagenomic contexts. To a first approximation, the initial detection of viruses can be achieved simply through alignment of sequence reads or assembled contigs to a reference database of pathogen genomes with tools such as BLAST. However, recognition of highly divergent viral sequences is problematic, and may be further complicated by the inherently high mutation rates of some viral types, especially RNA viruses. In these cases, increased sensitivity may be achieved by leveraging position-specific information during the alignment process. Here, we constructed HMMER3-compatible profile hidden Markov models (profile HMMs) from all the virally annotated proteins in RefSeq in an automated fashion using a custom-built bioinformatic pipeline. We then tested the ability of these viral profile HMMs ("vFams") to accurately classify sequences as viral or non-viral. Cross-validation experiments with full-length gene sequences showed that the vFams were able to recall 91% of left-out viral test sequences without erroneously classifying any non-viral sequences into viral protein clusters. Thorough reanalysis of previously published metagenomic datasets with a set of the best-performing vFams showed that they were more sensitive than BLAST for detecting sequences originating from more distant relatives of known viruses. To facilitate the use of the vFams for rapid detection of remote viral homologs in metagenomic data, we provide two sets of vFams, comprising more than 4,000 vFams each, in the HMMER3 format. We also provide the software necessary to build custom profile HMMs or update the vFams as more viruses are discovered (http://derisilab.ucsf.edu/software/vFam).
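
The classification step described in the abstract, in which sequences are searched against profile HMMs and then scored for recall on left-out viral sequences, can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes the hits come from HMMER3's `hmmsearch --tblout` table, where the first column is the target sequence name, the third is the query profile name, and the fifth is the full-sequence E-value. The E-value cutoff of 1e-5, the profile names (`vFam_1` and so on), and the sequence names are hypothetical placeholders, not values from the paper.

```python
from typing import Dict, Iterable, Set, Tuple


def best_hits(tblout_lines: Iterable[str],
              max_evalue: float = 1e-5) -> Dict[str, Tuple[str, float]]:
    """Return the lowest-E-value (profile, E-value) hit for each sequence.

    Assumes hmmsearch --tblout layout: column 1 = target sequence name,
    column 3 = query profile name, column 5 = full-sequence E-value.
    The default cutoff is an illustrative assumption.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in tblout_lines:
        # Skip blank lines and the '#' comment/header lines HMMER writes.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        seq_name, profile = fields[0], fields[2]
        evalue = float(fields[4])
        if evalue > max_evalue:
            continue  # too weak to call this sequence viral
        if seq_name not in best or evalue < best[seq_name][1]:
            best[seq_name] = (profile, evalue)
    return best


def recall(predicted_viral: Set[str], true_viral: Set[str]) -> float:
    """Fraction of truly viral sequences that were called viral."""
    if not true_viral:
        raise ValueError("true_viral must not be empty")
    return len(predicted_viral & true_viral) / len(true_viral)


# Hypothetical tblout excerpt: seqA hits two profiles, seqB hits weakly.
EXAMPLE = """\
# target name  accession  query name  accession  E-value  score  bias
seqA - vFam_1 - 1e-30 105.2 0.1 2e-30 104.0 0.1 1.0 1 0 0 1 1 1 1 -
seqB - vFam_2 - 0.5 12.0 0.0 0.6 11.5 0.0 1.0 1 0 0 1 1 0 0 -
seqA - vFam_7 - 1e-10 40.0 0.0 2e-10 39.1 0.0 1.0 1 0 0 1 1 1 1 -
"""

if __name__ == "__main__":
    hits = best_hits(EXAMPLE.splitlines())
    print(hits)
    print(recall(set(hits), {"seqA", "seqB"}))
```

In this toy input only `seqA` passes the cutoff, so one of the two truly viral sequences is recovered. A real evaluation would also count non-viral sequences assigned to viral profiles, which the abstract reports as zero in the cross-validation experiments.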
