Center of Bioengineering, Research Center of Biotechnology RAS, 119071 Moscow, Russia.

Moscow Engineering Physics Institute, National Research Nuclear University MEPhI, 115409 Moscow, Russia.

Int J Mol Sci. 2021 Jul 1;22(13):7096. doi: 10.3390/ijms22137096.

We report a Method to Search for Highly Divergent Tandem Repeats (MSHDTR) in protein sequences that considers pairwise correlations between adjacent residues. We compared MSHDTR with previously developed methods for finding tandem repeats (TRs) in amino acid sequences, such as T-REKS and XSTREAM. Those methods focus on identifying TRs with significant sequence similarity, whereas MSHDTR detects repeats that have diverged significantly during evolution by accumulating deletions, insertions, and substitutions. Applying MSHDTR to the Swiss-Prot databank revealed more than 15,000 TR-containing amino acid sequences that were difficult to find with the other methods. Among the detected TRs, the most representative were those with consensus lengths of two and seven residues; these TRs were subjected to cluster analysis, and classes of patterns were identified. All TRs detected in this study have been combined into a databank accessible over the web.
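To make the notion of a TR "consensus length" concrete, here is a minimal Python sketch. It is not the MSHDTR algorithm, whose details the abstract does not give. It assumes a gap-free repeat, so it cannot handle the insertions and deletions that MSHDTR is designed for. The function name `period_consensus` and the example sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Tuple


def period_consensus(seq: str, period: int) -> Tuple[str, float]:
    """Column-wise consensus of `seq` for a given repeat period.

    Returns the consensus string (one residue per column) and the
    fraction of residues in `seq` that match it. Gap-free illustration
    only; this is NOT the MSHDTR scoring scheme.
    """
    if period < 1 or period > len(seq):
        raise ValueError("period must be between 1 and len(seq)")

    # Column i collects every residue at positions i, i + period, ...
    columns = [seq[i::period] for i in range(period)]

    # The most frequent residue in each column forms the consensus.
    consensus = "".join(Counter(col).most_common(1)[0][0] for col in columns)

    # Count residues that agree with the consensus at their column.
    matches = sum(res == consensus[i % period] for i, res in enumerate(seq))
    return consensus, matches / len(seq)


# A perfect dipeptide repeat and a copy with one substitution.
print(period_consensus("AGAGAGAG", 2))
print(period_consensus("AGAGTGAG", 2))
```

A substitution lowers the match fraction but leaves the consensus unchanged. An insertion or deletion would shift every later residue into the wrong column. That limitation is one reason similarity-based, gap-free scoring struggles with highly divergent repeats.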