Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China.
Key Laboratory of Marine Drugs, Chinese Ministry of Education, School of Medicine and Pharmacy, Ocean University of China, Qingdao 266003, China.
Math Biosci Eng. 2023 Jan 31;20(4):6174-6190. doi: 10.3934/mbe.2023266.
With the development of next-generation protein sequencing technologies, sequence assembly algorithms have become a key component of the de novo sequencing process. Existing methods can assemble a single unknown protein chain. However, assembling monoclonal antibodies, which contain both light and heavy chains, remains an unsolved problem. To address this problem, we propose a new assembly method, DBAS, which integrates the quality scores and sequence alignment scores of de novo sequenced peptides into a weighted de Bruijn graph to assemble the final protein sequences. The method was used to assemble sequences from two datasets containing mixed light and heavy chains from antibodies. The results show that DBAS can assemble long antibody sequences from both mixed light and heavy chains and single chains. In addition, DBAS can distinguish the light and heavy chains by using BLAST sequence alignment. The algorithm performs well in both target sequence coverage and contig assembly accuracy.
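The idea of a weighted de Bruijn graph over peptides can be illustrated with a minimal sketch. It is not the DBAS implementation: the abstract does not specify how quality and alignment scores are combined, so the sketch assumes each peptide already carries a single combined score, and the function names, the k-mer size `k`, and the greedy traversal are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def build_weighted_graph(peptides, k=4):
    """Build a weighted de Bruijn graph from (sequence, score) pairs.

    Nodes are (k-1)-mers; each k-mer adds an edge prefix -> suffix.
    The edge weight is the sum of the scores of all peptides that
    contain the k-mer (an assumed weighting, not the DBAS formula).
    """
    graph = defaultdict(lambda: defaultdict(float))
    for seq, score in peptides:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            graph[kmer[:-1]][kmer[1:]] += score
    return graph


def greedy_contig(graph, start):
    """Extend a contig from `start`, always following the heaviest unused edge."""
    contig = start
    node = start
    used = set()  # edges already traversed, to guarantee termination on cycles
    while node in graph:
        candidates = [
            (weight, nxt)
            for nxt, weight in graph[node].items()
            if (node, nxt) not in used
        ]
        if not candidates:
            break
        _, nxt = max(candidates)
        used.add((node, nxt))
        contig += nxt[-1]  # each step appends one residue
        node = nxt
    return contig


if __name__ == "__main__":
    # Hypothetical peptides with made-up combined scores.
    peptides = [
        ("EVQLVESGG", 0.9),
        ("VESGGGLVQ", 0.8),
        ("ESGGALV", 0.2),  # low-scoring peptide that creates a branch
    ]
    graph = build_weighted_graph(peptides, k=4)
    print(greedy_contig(graph, "EVQ"))
```

In this toy input, the low-scoring third peptide creates a branch at node `SGG`, and the traversal follows the heavier edge supported by the higher-scoring peptide. A real assembler would also need to choose start nodes, handle repeats, and separate contigs belonging to different chains.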