ScaffoldSeq: Software for characterization of directed evolution populations.

Authors

Woldring Daniel R, Holec Patrick V, Hackel Benjamin J

Affiliation

Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, Minnesota, 55455.

Publication information

Proteins. 2016 Jul;84(7):869-74. doi: 10.1002/prot.25040. Epub 2016 Apr 16.

Abstract

ScaffoldSeq is software designed for the numerous applications, including directed evolution analysis, in which a user generates a population of DNA sequences encoding partially diverse proteins with related functions and would like to characterize the single-site and pairwise amino acid frequencies across the population. A common scenario for enzyme maturation, antibody screening, and alternative scaffold engineering involves naïve and evolved populations that contain diversified regions, varying in both sequence and length, within a conserved framework. Analyzing the diversified regions of such populations is facilitated by high-throughput sequencing platforms; however, length variability within these regions (e.g., antibody CDRs) encumbers the alignment process. To overcome this challenge, the ScaffoldSeq algorithm takes advantage of conserved framework sequences to quickly identify diverse regions. Beyond this, unintended biases in sequence frequency are generated throughout the experimental workflow required to evolve and isolate clones of interest prior to DNA sequencing. ScaffoldSeq software uniquely handles this issue by providing tools to quantify and remove background sequences, cluster similar protein families, and dampen the impact of dominant clones. The software produces graphical and tabular summaries for each region of interest, allowing users to evaluate diversity in a site-specific manner as well as identify epistatic pairwise interactions. The code and detailed information are freely available at http://research.cems.umn.edu/hackel. Proteins 2016; 84:869-874. © 2016 Wiley Periodicals, Inc.
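
The two ideas in the abstract, anchoring on conserved framework sequence to locate a diversified region and dampening dominant clones when tallying site-wise frequencies, can be illustrated with a minimal Python sketch. This is not the ScaffoldSeq implementation: the function names, the flank motifs, and the power-law weighting (`count ** damping`) are all assumptions chosen for illustration, and the published software may use a different scheme.

```python
from collections import Counter


def extract_region(protein, left_flank, right_flank):
    """Return the residues between two conserved framework motifs.

    Returns None when either flank is missing, so reads that do not
    match the framework can be skipped. (Hypothetical helper.)
    """
    start = protein.find(left_flank)
    if start == -1:
        return None
    start += len(left_flank)
    end = protein.find(right_flank, start)
    if end == -1:
        return None
    return protein[start:end]


def sitewise_frequencies(regions, damping=0.5):
    """Per-position amino acid frequencies for regions of one length.

    Each unique sequence contributes count ** damping (an assumed
    weighting): damping=1 weights by raw read count, damping=0 counts
    every unique clone once. Expects a non-empty list; regions whose
    length differs from the first one are ignored.
    """
    clone_counts = Counter(regions)
    length = len(regions[0])
    totals = [Counter() for _ in range(length)]
    for seq, count in clone_counts.items():
        if len(seq) != length:
            continue  # compare site by site only within one length class
        weight = count ** damping
        for pos, aa in enumerate(seq):
            totals[pos][aa] += weight
    freqs = []
    for site in totals:
        site_total = sum(site.values())
        freqs.append({aa: w / site_total for aa, w in site.items()})
    return freqs


# Toy population: one clone read four times, another read once.
reads = ["MKAAGSTWYQL"] * 4 + ["MKAAGDEWYQL"]
regions = [extract_region(r, "AAG", "WYQ") for r in reads]
regions = [r for r in regions if r is not None]
damped = sitewise_frequencies(regions, damping=0.5)
```

Because the region is found by searching for the flanks rather than by aligning whole reads, regions of different lengths are recovered the same way; in this sketch they would then be grouped by length before calling `sitewise_frequencies`.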

Similar articles

1
ScaffoldSeq: Software for characterization of directed evolution populations.
Proteins. 2016 Jul;84(7):869-74. doi: 10.1002/prot.25040. Epub 2016 Apr 16.
3
Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.
BMC Bioinformatics. 2013;14 Suppl 11(Suppl 11):S2. doi: 10.1186/1471-2105-14-S11-S2. Epub 2013 Sep 13.
4
Alignment-free clustering of large data sets of unannotated protein conserved regions using minhashing.
BMC Bioinformatics. 2018 Mar 5;19(1):83. doi: 10.1186/s12859-018-2080-y.
5
On the quality of tree-based protein classification.
Bioinformatics. 2005 May 1;21(9):1876-90. doi: 10.1093/bioinformatics/bti244. Epub 2005 Jan 12.
7
MBEToolbox: a MATLAB toolbox for sequence data analysis in molecular biology and evolution.
BMC Bioinformatics. 2005 Mar 22;6:64. doi: 10.1186/1471-2105-6-64.
8
SITEBLAST--rapid and sensitive local alignment of genomic sequences employing motif anchors.
Bioinformatics. 2005 May 1;21(9):2093-4. doi: 10.1093/bioinformatics/bti224. Epub 2004 Dec 14.
10
TruSPAdes: barcode assembly of TruSeq synthetic long reads.
Nat Methods. 2016 Mar;13(3):248-50. doi: 10.1038/nmeth.3737. Epub 2016 Feb 1.

Cited by

2
Hyperstable Synthetic Mini-Proteins as Effective Ligand Scaffolds.
ACS Synth Biol. 2023 Dec 15;12(12):3608-3622. doi: 10.1021/acssynbio.3c00409. Epub 2023 Nov 27.
3
Biophysical Characterization Platform Informs Protein Scaffold Evolvability.
ACS Comb Sci. 2019 Apr 8;21(4):323-335. doi: 10.1021/acscombsci.8b00182. Epub 2019 Feb 18.
4
Constrained Combinatorial Libraries of Gp2 Proteins Enhance Discovery of PD-L1 Binders.
ACS Comb Sci. 2018 Jul 9;20(7):423-435. doi: 10.1021/acscombsci.8b00010. Epub 2018 Jun 5.
5
A Gradient of Sitewise Diversity Promotes Evolutionary Fitness for Binder Discovery in a Three-Helix Bundle Protein Scaffold.
Biochemistry. 2017 Mar 21;56(11):1656-1671. doi: 10.1021/acs.biochem.6b01142. Epub 2017 Mar 9.
6
Deep sequencing methods for protein engineering and design.
Curr Opin Struct Biol. 2017 Aug;45:36-44. doi: 10.1016/j.sbi.2016.11.001. Epub 2016 Nov 22.
