Genetic source completeness of HIV-1 circulating recombinant forms (CRFs) predicted by multi-label learning.

Affiliations

Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Hunan 411105, China.

Advanced Analytics Institute, University of Technology Sydney, Sydney, NSW 2007, Australia.

Publication information

Bioinformatics. 2021 May 5;37(6):750-758. doi: 10.1093/bioinformatics/btaa887.

DOI:10.1093/bioinformatics/btaa887
PMID:33063094
Abstract

MOTIVATION

Infection with strains of different subtypes and the subsequent crossover reading between the two strands of genomic RNAs by host cells' reverse transcriptase are the main causes of the vast HIV-1 sequence diversity. Such inter-subtype genomic recombinants can become circulating recombinant forms (CRFs) after widespread transmissions in a population. Complete prediction of all the subtype sources of a CRF strain is a complicated machine learning problem. It is also difficult to understand whether a strain is an emerging new subtype and if so, how to accurately identify the new components of the genetic source.

RESULTS

We introduce a multi-label learning algorithm for the complete prediction of multiple sources of a CRF sequence as well as the prediction of its chronological number. The prediction is strengthened by a voting of various multi-label learning methods to avoid biased decisions. In our steps, frequency and position features of the sequences are both extracted to capture signature patterns of pure subtypes and CRFs. The method was applied to 7185 HIV-1 sequences, comprising 5530 pure subtype sequences and 1655 CRF sequences. Results have demonstrated that the method can achieve very high accuracy (reaching 99%) in the prediction of the complete set of labels of HIV-1 recombinant forms. A few wrong predictions are actually incomplete predictions, very close to the complete set of genuine labels.
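As an illustration of the ideas in this paragraph, the following is a minimal sketch, not the authors' implementation. It assumes k-mer relative frequencies as the "frequency" feature, the mean normalized position of each k-mer as the "position" feature, and a simple majority vote over the label sets returned by several multi-label predictors. The function names, the choice of k, the 0.5 threshold, and the example subtype labels are all hypothetical; the abstract does not specify any of them.

```python
from collections import Counter
from itertools import product


def _all_kmers(k):
    """All k-mers over the nucleotide alphabet, in a fixed order."""
    return ["".join(p) for p in product("ACGT", repeat=k)]


def kmer_frequencies(seq, k=2):
    """Frequency feature: relative frequency of each k-mer in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = _all_kmers(k)
    total = sum(counts[m] for m in kmers) or 1  # avoid division by zero
    return [counts[m] / total for m in kmers]


def kmer_mean_positions(seq, k=2):
    """Position feature: mean normalized position of each k-mer (0.0 if absent)."""
    n = max(len(seq) - k + 1, 1)
    sums, counts = Counter(), Counter()
    for i in range(len(seq) - k + 1):
        m = seq[i:i + k]
        sums[m] += (i + 1) / n
        counts[m] += 1
    return [sums[m] / counts[m] if counts[m] else 0.0 for m in _all_kmers(k)]


def vote_labels(predictions, threshold=0.5):
    """Keep a label if more than `threshold` of the predictors include it."""
    tally = Counter(label for labels in predictions for label in set(labels))
    n = len(predictions)
    return {label for label, c in tally.items() if c / n > threshold}


# Hypothetical usage: three predictors each return a set of source subtypes.
features = kmer_frequencies("ACGTACGT") + kmer_mean_positions("ACGTACGT")
consensus = vote_labels([{"B", "C"}, {"B"}, {"B", "C"}])
```

Voting over whole label sets, rather than trusting one model, reflects the abstract's stated aim of avoiding biased decisions; a stricter threshold would trade completeness of the predicted label set for fewer spurious labels.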

AVAILABILITY AND IMPLEMENTATION

https://github.com/Runbin-tang/The-source-of-HIV-CRFs-prediction.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

1
Genetic source completeness of HIV-1 circulating recombinant forms (CRFs) predicted by multi-label learning.
Bioinformatics. 2021 May 5;37(6):750-758. doi: 10.1093/bioinformatics/btaa887.
2
Identification of a New HIV-1 BC Intersubtype Circulating Recombinant Form (CRF108_BC) in Spain.
Viruses. 2021 Jan 12;13(1):93. doi: 10.3390/v13010093.
3
[Characteristic analysis of molecular subtypes and recombinant structure of HIV-1 infection in Zhejiang Province, 2015].
Zhonghua Yu Fang Yi Xue Za Zhi. 2018 Apr 6;52(4):409-414. doi: 10.3760/cma.j.issn.0253-9624.2018.04.014.
4
Sensitive Next-Generation Sequencing Method Reveals Deep Genetic Diversity of HIV-1 in the Democratic Republic of the Congo.
J Virol. 2017 Feb 28;91(6). doi: 10.1128/JVI.01841-16. Print 2017 Mar 15.
5
Identification of Two New HIV-1 Circulating Recombinant Forms (CRF87_cpx and CRF88_BC) from Reported Unique Recombinant Forms in Asia.
AIDS Res Hum Retroviruses. 2017 Apr;33(4):353-358. doi: 10.1089/AID.2016.0252. Epub 2016 Dec 13.
6
High frequency of HIV-1 infections with multiple HIV-1 strains in men having sex with men (MSM) in Senegal.
Infect Genet Evol. 2013 Dec;20:206-14. doi: 10.1016/j.meegid.2013.09.002. Epub 2013 Sep 11.
7
Automated subtyping of HIV-1 genetic sequences for clinical and surveillance purposes: performance evaluation of the new REGA version 3 and seven other tools.
Infect Genet Evol. 2013 Oct;19:337-48. doi: 10.1016/j.meegid.2013.04.032. Epub 2013 May 7.
8
Short communication: evidence of HIV type 1 complex and second generation recombinant strains among patients infected in 1997-2007 in France: ANRS CO06 PRIMO Cohort.
AIDS Res Hum Retroviruses. 2010 Jun;26(6):645-51. doi: 10.1089/aid.2009.0201.
9
Geographically-stratified HIV-1 group M pol subtype and circulating recombinant form sequences.
Sci Data. 2018 Jul 31;5:180148. doi: 10.1038/sdata.2018.148.
10
Near full-length genetic analysis of HIV sequences derived from Cyprus: evidence of a highly polyphyletic and evolving infection.
AIDS Res Hum Retroviruses. 2009 Aug;25(8):727-40. doi: 10.1089/aid.2008.0239.

Cited by

1
Investigating alignment-free machine learning methods for HIV-1 subtype classification.
Bioinform Adv. 2024 Jul 29;4(1):vbae108. doi: 10.1093/bioadv/vbae108. eCollection 2024.
2
CGRWDL: alignment-free phylogeny reconstruction method for viruses based on chaos game representation weighted by dynamical language model.
Front Microbiol. 2024 Mar 20;15:1339156. doi: 10.3389/fmicb.2024.1339156. eCollection 2024.