Automated subtyping of HIV-1 genetic sequences for clinical and surveillance purposes: performance evaluation of the new REGA version 3 and seven other tools.

Affiliations

Laboratory for Clinical and Epidemiological Virology, Rega Institute for Medical Research, Department of Microbiology and Immunology, University of Leuven, Belgium; Clinical and Molecular Infectious Diseases Group, Faculty of Sciences and Mathematics, Universidad del Rosario, Bogotá, Colombia.

Publication information

Infect Genet Evol. 2013 Oct;19:337-48. doi: 10.1016/j.meegid.2013.04.032. Epub 2013 May 7.

Abstract

BACKGROUND

To investigate differences in pathogenesis, diagnosis and resistance pathways between HIV-1 subtypes, an accurate subtyping tool for large datasets is needed. We aimed to evaluate the performance of automated subtyping tools to classify the different subtypes and circulating recombinant forms using pol, the most sequenced region in clinical practice. We also present the upgraded version 3 of the Rega HIV subtyping tool (REGAv3).

METHODOLOGY

HIV-1 pol sequences (PR+RT) from 4674 patients, retrieved from the Portuguese HIV Drug Resistance Database, and 1872 pol sequences trimmed from full-length genomes retrieved from the Los Alamos database were classified with statistical-based tools such as COMET, jpHMM and STAR; similarity-based tools such as NCBI and Stanford; and phylogenetic-based tools such as REGA version 2 (REGAv2), REGAv3 and SCUEAL. The performance of these tools, for pol and for PR and RT separately, was compared in terms of reproducibility, sensitivity and specificity against the gold standard, which was manual phylogenetic analysis of the pol region.
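The comparison against a gold standard can be illustrated with a minimal sketch that computes per-subtype sensitivity and specificity in a one-versus-rest fashion. The function name `per_subtype_metrics` and the six example calls are hypothetical and are not data from the study; the abstract does not specify how the authors computed these metrics beyond naming them.

```python
def per_subtype_metrics(gold, predicted):
    """Return {subtype: (sensitivity, specificity)} for every subtype in `gold`.

    Each subtype is scored one-versus-rest against the gold-standard labels.
    A metric with an empty denominator is reported as NaN.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    metrics = {}
    for subtype in sorted(set(gold)):
        pairs = list(zip(gold, predicted))
        tp = sum(g == subtype and p == subtype for g, p in pairs)
        fn = sum(g == subtype and p != subtype for g, p in pairs)
        fp = sum(g != subtype and p == subtype for g, p in pairs)
        tn = sum(g != subtype and p != subtype for g, p in pairs)
        sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
        specificity = tn / (tn + fp) if (tn + fp) else float("nan")
        metrics[subtype] = (sensitivity, specificity)
    return metrics


# Illustrative labels only (assumed, not taken from the study).
gold_calls = ["B", "B", "C", "G", "CRF02_AG", "B"]
tool_calls = ["B", "B", "C", "CRF02_AG", "CRF02_AG", "C"]

metrics = per_subtype_metrics(gold_calls, tool_calls)
for subtype, (sens, spec) in metrics.items():
    print(f"{subtype}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy example the subtype G sequence is miscalled as CRF02_AG, so sensitivity for G drops to 0 while specificity for CRF02_AG falls below 1, which mirrors how a single misclassification affects two subtypes at once.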

RESULTS

The sensitivity and specificity for subtypes B and C were more than 96% for seven tools, but were variable for other subtypes such as A, D, F and G. With regard to the most common circulating recombinant forms (CRFs), the sensitivity and specificity for CRF01_AE were 99% with statistical-based tools, with phylogenetic-based tools and with Stanford, one of the similarity-based tools. CRF02_AG was correctly identified in more than 96% of cases by COMET, REGAv3, Stanford and STAR. All the tools reached a specificity of more than 97% for most of the subtypes and the two main CRFs (CRF01_AE and CRF02_AG). Other CRFs were identified only by COMET, REGAv2, REGAv3 and SCUEAL, and with variable sensitivity. When sequences were analyzed for PR and RT separately, the performance for PR was generally lower and varied between the tools. Similarity-based and statistical-based tools were 100% reproducible, whereas reproducibility was lower for phylogenetic-based tools such as REGA (99%) and SCUEAL (~96%).

CONCLUSIONS

REGAv3 showed improved performance for subtype B and CRF02_AG compared with REGAv2 and is now also able to identify all epidemiologically relevant CRFs. In general, the best-performing tools, in alphabetical order, were COMET, jpHMM, REGAv3 and SCUEAL when analyzing pure subtypes in the pol region, and COMET and REGAv3 when analyzing most of the CRFs. Based on this study, we recommend confirming subtyping with two well-performing tools and being cautious when interpreting short sequences.
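The recommendation to confirm subtyping with two tools could be applied along the following lines. This is a hedged sketch only: the function `confirm_with_two_tools`, the sequence identifiers and the example calls are assumptions made for illustration, and the abstract does not describe how discordant calls should be resolved.

```python
def confirm_with_two_tools(calls_a, calls_b):
    """Split sequences into confirmed calls and discordant identifiers.

    `calls_a` and `calls_b` map a sequence identifier to the subtype
    assigned by each tool. A sequence is confirmed only when both tools
    report the same subtype; a sequence missing from either tool is
    treated as discordant.
    """
    confirmed = {}
    discordant = []
    for seq_id in sorted(set(calls_a) | set(calls_b)):
        a = calls_a.get(seq_id)
        b = calls_b.get(seq_id)
        if a is not None and a == b:
            confirmed[seq_id] = a
        else:
            discordant.append(seq_id)
    return confirmed, discordant


# Hypothetical output from two subtyping tools (illustrative values only).
tool_one = {"seq1": "B", "seq2": "C", "seq3": "CRF02_AG"}
tool_two = {"seq1": "B", "seq2": "CRF02_AG", "seq3": "CRF02_AG"}

confirmed, discordant = confirm_with_two_tools(tool_one, tool_two)
print(confirmed)   # {'seq1': 'B', 'seq3': 'CRF02_AG'}
print(discordant)  # ['seq2']
```

Sequences flagged as discordant would then be candidates for manual review, for example by phylogenetic analysis.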
