School of Biological Sciences and Edinburgh Medical School: Biomedical Sciences, University of Edinburgh, The King's Buildings, Edinburgh, Scotland, United Kingdom.
PLoS One. 2018 Feb 23;13(2):e0193332. doi: 10.1371/journal.pone.0193332. eCollection 2018.
The design of highly diverse phage display libraries is based on the assumption that DNA bases are incorporated at similar rates within the randomized sequence. As library complexity increases and the expected copy number of each unique sequence decreases, the exploration of library space becomes sparser and the presence of truly random sequences becomes critical. We present the program PuLSE (Phage Library Sequence Evaluation) as a tool for assessing the randomness, and therefore the diversity, of phage display libraries. PuLSE runs on a collection of sequence reads in the fastq file format and generates tables profiling the library in terms of unique DNA sequence counts and positions, translated peptide sequences, and normalized 'expected' occurrences derived from base-to-residue codon frequencies. The output allows at-a-glance quantitative quality control of a phage library in terms of sequence coverage at both the DNA base and translated protein residue levels, a capability that has been missing from existing toolsets and the literature. The open source program PuLSE is available in two formats: a C++ source code package for compilation and integration into existing bioinformatics pipelines, and precompiled binaries for ease of use.
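The kind of profiling described above (unique sequence counts and per-position base usage from fastq reads) can be illustrated with a minimal C++ sketch. This is not PuLSE's own code or interface: the names `LibraryProfile`, `profile_fastq`, `base_index`, and `base_fraction` are hypothetical, and the sketch assumes four-line fastq records with upper-case bases and no trimming of flanking sequence.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Counts of A, C, G, T (in that order) at one read position.
using BaseCounts = std::array<std::size_t, 4>;

// Hypothetical container for the tallies; not a PuLSE data structure.
struct LibraryProfile {
    std::map<std::string, std::size_t> unique_counts;  // sequence -> copies seen
    std::vector<BaseCounts> per_position;              // index = read position
    std::size_t total_reads = 0;
};

// Map a base to an index 0..3, or -1 for anything else (e.g. N).
int base_index(char c) {
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return -1;
    }
}

// Read four-line fastq records and tally sequences and per-position bases.
LibraryProfile profile_fastq(std::istream& in) {
    LibraryProfile p;
    std::string header, seq, plus, qual;
    while (std::getline(in, header) && std::getline(in, seq) &&
           std::getline(in, plus) && std::getline(in, qual)) {
        // Assumption: a record whose header lacks '@' is skipped as malformed.
        if (header.empty() || header[0] != '@') continue;
        ++p.total_reads;
        ++p.unique_counts[seq];
        if (seq.size() > p.per_position.size())
            p.per_position.resize(seq.size(), BaseCounts{0, 0, 0, 0});
        for (std::size_t i = 0; i < seq.size(); ++i) {
            int b = base_index(seq[i]);
            if (b >= 0) ++p.per_position[i][b];  // uncalled bases are ignored
        }
    }
    return p;
}

// Fraction of called bases at `pos` that are base `b`; 0 when none were called.
double base_fraction(const LibraryProfile& p, std::size_t pos, int b) {
    assert(b >= 0 && b < 4);
    const BaseCounts& c = p.per_position.at(pos);
    std::size_t total = c[0] + c[1] + c[2] + c[3];
    return total == 0 ? 0.0 : static_cast<double>(c[b]) / total;
}
```

Under the uniform-incorporation assumption, each value returned by `base_fraction` at a randomized position would be expected to lie near 0.25, so a large deviation at any position points to biased incorporation. Passing a `std::istringstream` holding a few records to `profile_fastq` is enough to try the sketch without a file.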