Krantz Bryan A
Department of Microbial Pathogenesis, School of Dentistry, University of Maryland, Baltimore, 650 W. Baltimore Street, Baltimore, MD 21201, U.S.A.
bioRxiv. 2025 May 6:2025.05.04.652126. doi: 10.1101/2025.05.04.652126.
Rapid and accurate detection of peptide biomarkers using nanopore biosensors is critical for disease diagnosis and other biomedical applications. Processing large, complex single-channel translocation data streams poses a significant challenge for peptide analyte classification. Here, we present a supervised deep learning data processing pipeline for peptide classification from translocation events. The first stage employs a convolutional and recurrent neural network, adapted from the Deep-Channel multi-channel classifier, to classify raw current recordings into discrete conductance states, including partially blocked sub-conductance intermediates. The second stage, peptide classification, uses a novel branched-input network: a temporal convolutional network processes the conductance state sequence of each translocation event, and a dense network incorporates computed event-level and global kinetic features. Using idealized simulated multi-state translocation data for seven peptides, we demonstrate high classification accuracy (0.99) when global features are included alongside event-level features. For classifying mixture samples, where only event-level features are applicable, performance is more modest (0.68 accuracy). Peptide mixture predictions showed reasonable accuracy (mean absolute error, MAE, 0.045-0.161), although misclassification resulted in false positives. Event stochasticity and the similar kinetic parameters of some peptides made event-level prediction challenging. However, vote aggregation over translocation event streams achieves 100% accuracy when predicting pure peptide samples. This proof-of-concept study demonstrates a robust deep learning framework for nanopore peptide classification using simulated data, laying the groundwork for classifying peptides from complex mixtures using real experimental data with the anthrax toxin protective antigen nanopore.
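The aggregation and mixture-scoring steps mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not specify the voting rule or how mixture composition is estimated, so the sketch assumes a plain majority vote over event-level predictions and takes the fraction of events assigned to each peptide as the mixture estimate. The peptide labels `P1` to `P7` and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical labels for the seven peptides; the source does not name them.
PEPTIDES = ["P1", "P2", "P3", "P4", "P5", "P6", "P7"]


def majority_vote(event_predictions):
    """Return the label predicted most often across a stream of events.

    Assumption: simple majority voting. Ties go to the label seen first.
    """
    if not event_predictions:
        raise ValueError("need at least one event-level prediction")
    label, _count = Counter(event_predictions).most_common(1)[0]
    return label


def estimated_fractions(event_predictions, labels=PEPTIDES):
    """Estimate mixture composition as the share of events per label.

    Assumption: each event contributes one equally weighted vote.
    """
    if not event_predictions:
        raise ValueError("need at least one event-level prediction")
    counts = Counter(event_predictions)
    total = len(event_predictions)
    return {label: counts[label] / total for label in labels}


def mean_absolute_error(predicted, actual):
    """Mean absolute difference between predicted and true fractions."""
    return sum(abs(predicted[k] - actual[k]) for k in actual) / len(actual)


# Pure-sample case: a few misclassified events do not change the vote.
pure_stream = ["P1", "P3", "P1", "P1", "P2", "P1"]
pure_call = majority_vote(pure_stream)

# Mixture case: compare estimated fractions with a known composition.
mixture_stream = ["P1"] * 6 + ["P2"] * 4
true_fractions = {label: 0.0 for label in PEPTIDES}
true_fractions["P1"] = 0.5
true_fractions["P2"] = 0.5
mixture_estimate = estimated_fractions(mixture_stream)
mixture_mae = mean_absolute_error(mixture_estimate, true_fractions)
```

Under these assumptions, a vote over many events can recover the correct peptide even when individual events are often misclassified, whereas the same misclassifications shift the estimated mixture fractions directly and can assign a nonzero share to peptides that are absent.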