Distribution of Indel lengths.

Authors

Qian B, Goldstein R A

Affiliation

Biophysics Research Division, University of Michigan, Ann Arbor, USA.

Publication details

Proteins. 2001 Oct 1;45(1):102-4. doi: 10.1002/prot.1129.

DOI: 10.1002/prot.1129
PMID: 11536366

Abstract

Protein sequence alignment has become a widely used method in the study of newly sequenced proteins. Most sequence alignment methods use an affine gap penalty to assign scores to insertions and deletions. Although affine gap penalties represent the relative ease of extending a gap compared with initializing a gap, it is still an obvious oversimplification of the real processes that occur during sequence evolution. To improve the efficiency of sequence alignment methods and to obtain a better understanding of the process of sequence evolution, we wanted to find a more accurate model of insertions and deletions in homologous proteins. In this work, we extract the probability of a gap occurrence and the resulting gap length distribution in distantly related proteins (sequence identity < 25%) using alignments based on their common structures. We observe a distribution of gaps that can be fitted with a multiexponential with four distinct components. The results suggest new approaches to modeling insertions and deletions in sequence alignments.
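
The contrast the abstract draws between an affine gap penalty and a multiexponential gap-length distribution can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not give the fitted parameters or the exact functional form, so the four `(weight, decay_rate)` pairs, the gap-opening probability, and the affine costs below are all hypothetical placeholders, and the mixture of geometric terms is one plausible reading of "multiexponential with four distinct components". A single geometric component would give a penalty that is exactly affine in gap length, which is why several components are needed to depart from the affine model.

```python
import math


def affine_gap_penalty(length, gap_open=10.0, gap_extend=1.0):
    """Affine penalty: one opening cost plus a constant cost per extra residue.

    gap_open and gap_extend are arbitrary illustrative values.
    """
    if length < 1:
        raise ValueError("gap length must be >= 1")
    return gap_open + gap_extend * (length - 1)


def multiexp_gap_probability(length, components):
    """Probability of a gap of the given length under a mixture of
    geometric (discrete exponential) terms.

    components is a list of (weight, decay_rate) pairs whose weights sum to 1.
    """
    if length < 1:
        raise ValueError("gap length must be >= 1")
    total = 0.0
    for weight, rate in components:
        q = math.exp(-rate)
        # Each term is a geometric distribution normalized over lengths 1, 2, 3, ...
        total += weight * (1.0 - q) * q ** (length - 1)
    return total


def multiexp_gap_penalty(length, components, gap_probability=0.05):
    """Negative log-probability of opening a gap and observing this length.

    gap_probability is a hypothetical per-position chance of a gap starting.
    """
    return -math.log(gap_probability * multiexp_gap_probability(length, components))


# Hypothetical four-component mixture; NOT the values fitted in the paper.
HYPOTHETICAL_COMPONENTS = [(0.55, 1.2), (0.30, 0.4), (0.10, 0.1), (0.05, 0.02)]

for length in (1, 2, 5, 20, 50):
    affine = affine_gap_penalty(length)
    multiexp = multiexp_gap_penalty(length, HYPOTHETICAL_COMPONENTS)
    print(f"length={length:3d}  affine={affine:6.2f}  multiexponential={multiexp:6.2f}")
```

Under these assumed parameters the affine penalty grows by the same amount for every added residue, while the mixture-based penalty grows more and more slowly as the slowly decaying components take over, so long gaps are penalized less steeply than an affine model would suggest.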

Similar articles

1. Distribution of Indel lengths.
   Proteins. 2001 Oct 1;45(1):102-4. doi: 10.1002/prot.1129.
2. A generalized affine gap model significantly improves protein sequence alignment accuracy.
   Proteins. 2005 Feb 1;58(2):329-38. doi: 10.1002/prot.20299.
3. Comparison of linear gap penalties and profile-based variable gap penalties in profile-profile alignments.
   Comput Biol Chem. 2011 Oct 12;35(5):308-18. doi: 10.1016/j.compbiolchem.2011.07.006. Epub 2011 Jul 22.
4. Empirical and structural models for insertions and deletions in the divergent evolution of proteins.
   J Mol Biol. 1993 Feb 20;229(4):1065-82. doi: 10.1006/jmbi.1993.1105.
5. Frequency of gaps observed in a structurally aligned protein pair database suggests a simple gap penalty function.
   Nucleic Acids Res. 2004 May 20;32(9):2838-43. doi: 10.1093/nar/gkh610. Print 2004.
6. Using CLUSTAL for multiple sequence alignments.
   Methods Enzymol. 1996;266:383-402. doi: 10.1016/s0076-6879(96)66024-8.
7. A Shannon entropy-based filter detects high-quality profile-profile alignments in searches for remote homologues.
   Proteins. 2004 Feb 1;54(2):351-60. doi: 10.1002/prot.10564.
8. Empirical analysis of protein insertions and deletions determining parameters for the correct placement of gaps in protein sequence alignments.
   J Mol Biol. 2004 Aug 6;341(2):617-31. doi: 10.1016/j.jmb.2004.05.045.
9. Robust sequence alignment using evolutionary rates coupled with an amino acid substitution matrix.
   BMC Bioinformatics. 2015 Aug 14;16:255. doi: 10.1186/s12859-015-0688-8.
10. Local sequence alignments with monotonic gap penalties.
    Bioinformatics. 1999 Jun;15(6):455-62. doi: 10.1093/bioinformatics/15.6.455.

Cited by

1. Insertions and Deletions: Computational Methods, Evolutionary Dynamics, and Biological Applications.
   Mol Biol Evol. 2024 Sep 4;41(9). doi: 10.1093/molbev/msae177.
2. Statistical framework to determine indel-length distribution.
   Bioinformatics. 2024 Feb 1;40(2). doi: 10.1093/bioinformatics/btae043.
3. A cellular selection identifies elongated flavodoxins that support electron transfer to sulfite reductase.
   Protein Sci. 2023 Oct;32(10):e4746. doi: 10.1002/pro.4746.
4. High-resolution mapping reveals the mechanism and contribution of genome insertions and deletions to RNA virus evolution.
   Proc Natl Acad Sci U S A. 2023 Aug;120(31):e2304667120. doi: 10.1073/pnas.2304667120. Epub 2023 Jul 24.
5. Deep reinforcement learning-based pairwise DNA sequence alignment method compatible with embedded edge devices.
   Sci Rep. 2023 Feb 16;13(1):2773. doi: 10.1038/s41598-023-29277-6.
6. Pairwise Heuristic Sequence Alignment Algorithm Based on Deep Reinforcement Learning.
   IEEE Open J Eng Med Biol. 2021 Jan 29;2:36-43. doi: 10.1109/OJEMB.2021.3055424. eCollection 2021.
7. Local Alignment of DNA Sequence Based on Deep Reinforcement Learning.
   IEEE Open J Eng Med Biol. 2021 Apr 27;2:170-178. doi: 10.1109/OJEMB.2021.3076156. eCollection 2021.
8. The length scale of multivalent interactions is evolutionarily conserved in fungal and vertebrate phase-separating proteins.
   Genetics. 2022 Jan 4;220(1). doi: 10.1093/genetics/iyab184.
9. A Probabilistic Model for Indel Evolution: Differentiating Insertions from Deletions.
   Mol Biol Evol. 2021 Dec 9;38(12):5769-5781. doi: 10.1093/molbev/msab266.
10. Reproducible simulations of realistic samples for next-generation sequencing studies using Variant Simulation Tools.
    Genet Epidemiol. 2015 Jan;39(1):45-52. doi: 10.1002/gepi.21867. Epub 2014 Nov 13.