A novel method of characterizing genetic sequences: genome space with biological distance and applications.

Affiliation

Department of Mathematics, Statistics and Computer Science, University of Illinois at Chicago, Chicago, Illinois, United States of America.

Publication information

PLoS One. 2011 Mar 2;6(3):e17293. doi: 10.1371/journal.pone.0017293.

DOI: 10.1371/journal.pone.0017293
PMID: 21399690
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3047556/
Abstract

BACKGROUND

Most existing methods for phylogenetic analysis involve developing an evolutionary model and then using some type of computational algorithm to perform multiple sequence alignment. There are two problems with this approach: (1) different evolutionary models can lead to different results, and (2) the computation time required for multiple alignments makes it impossible to analyse the phylogeny of a whole genome. This motivates us to create a new approach to characterize genetic sequences.

METHODOLOGY

To each DNA sequence, we associate a natural vector based on the distributions of nucleotides. This produces a one-to-one correspondence between the DNA sequence and its natural vector. We define the distance between two DNA sequences to be the distance between their associated natural vectors. This creates a genome space with a biological distance, which makes global comparison of genomes with the same topology possible. We use our proposed method to analyze the genomes of the new influenza A (H1N1) virus, human rhinoviruses (HRV), and mammalian mitochondria. The results show that a triple-reassortant swine virus circulating in North America and the Eurasian swine virus belong to the lineage of the influenza A (H1N1) virus. For the HRV and mammalian mitochondrial genomes, the results coincide with biologists' analyses.
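The abstract does not spell out the vector itself, so the following is only a minimal sketch under stated assumptions. It assumes the commonly used 12-dimensional form of the natural vector: for each nucleotide, its count, the mean of its positions, and a normalized second central moment of those positions. It also assumes Euclidean distance between vectors. The function names are hypothetical. A vector truncated at the second moment is a summary and should not be expected to be one-to-one on its own.

```python
import math

NUCLEOTIDES = "ACGT"


def natural_vector(seq):
    """Return an assumed 12-dimensional natural vector for a DNA string.

    Layout: four counts n_k, four mean positions mu_k, and four normalized
    second central moments D2_k, for k in A, C, G, T (positions are 1-based).
    Characters other than A/C/G/T add to the length but to no component.
    """
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for k in NUCLEOTIDES:
        positions = [i + 1 for i, base in enumerate(seq) if base == k]
        n_k = len(positions)
        if n_k == 0:
            # Assumption: an absent nucleotide contributes zeros.
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu_k = sum(positions) / n_k
        # Assumed normalization of the second moment: divide by n_k * n.
        d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments


def biological_distance(seq_a, seq_b):
    """Euclidean distance between the natural vectors of two sequences."""
    return math.dist(natural_vector(seq_a), natural_vector(seq_b))
```

Because each sequence is reduced to a fixed-length vector independently of every other sequence, comparing two genomes requires no alignment step, and sequences of different lengths can be compared directly.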

CONCLUSIONS

Our approach provides a powerful new tool for analyzing and annotating genomes and their phylogenetic relationships. Whole or partial genomes can be handled more easily and more quickly than using multiple alignment methods. Once a genome space has been constructed, it can be stored in a database. There is no need to reconstruct the genome space for subsequent applications, whereas in multiple alignment methods, realignment is needed to add new sequences. Furthermore, one can make a global comparison of all genomes simultaneously, which no other existing method can achieve.
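The storage argument above can be illustrated with a small in-memory sketch. It is not the authors' database design: the class name `GenomeSpace` and its methods are hypothetical, and vectors are assumed to be precomputed, equal-length lists of numbers compared by Euclidean distance.

```python
import math


class GenomeSpace:
    """Hypothetical store mapping genome names to precomputed vectors."""

    def __init__(self):
        self.vectors = {}

    def add(self, name, vector):
        # Adding a genome touches only its own entry; vectors that are
        # already stored are never recomputed.
        self.vectors[name] = tuple(float(x) for x in vector)

    def nearest(self, vector, k=1):
        """Return the names of the k stored genomes closest to `vector`."""
        ranked = sorted(
            self.vectors.items(),
            key=lambda item: math.dist(item[1], vector),
        )
        return [name for name, _ in ranked[:k]]


space = GenomeSpace()
space.add("genome_a", [0.0, 0.0])
space.add("genome_b", [3.0, 4.0])
before = dict(space.vectors)
space.add("genome_c", [0.0, 1.5])  # a new sequence arrives later
```

In this sketch, inserting `genome_c` leaves the entries for `genome_a` and `genome_b` unchanged, which mirrors the contrast the authors draw with multiple alignment, where adding a sequence requires realignment.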

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/284f/3047556/828280fb6575/pone.0017293.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/284f/3047556/3b2883981ee0/pone.0017293.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/284f/3047556/ad2bd2a4e3d9/pone.0017293.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/284f/3047556/eab5c06392a1/pone.0017293.g004.jpg
