Identifying residues in unfolded whole proteins with a nanopore: a theoretical model based on linear inequalities.

Author information

Sampath G

Publication information

bioRxiv. 2023 Sep 3:2023.08.31.555759. doi: 10.1101/2023.08.31.555759.

Abstract

A theoretical model is proposed for identifying individual amino acids (AAs) in the primary sequence of an unfolded whole protein. It is based in part on a recent report (. 41, 1130-1139, 2023) that describes the unfolding and translocation of whole proteins at constant speed through a biological nanopore (alpha-hemolysin) of length 5 nm, with a residue dwell time inside the pore of ~10 μs. Here, current blockade levels in the pore due to the translocating protein are assumed to be measured with a low-bandwidth detector at a limited precision of 70 nm and a bandwidth of 20 kHz. Exclusion volumes in two pores of slightly different lengths are used as a computational proxy for the blockade signal; differences in subsequence exclusion volume along the protein sequence are computed from the sampled translocation signals in the two pores, shifted relative to each other multiple times. These differences are then converted into a system of linear inequalities that can be solved with linear programming and related methods; residues are coarsely identified as belonging to one of 4 subsets of the 20 standard AAs. To obtain the exact identity of a residue, an artifice analogous to the use of base-specific tags for DNA sequencing with a nanopore ( 113, 5233-5238, 2016) is used. Conjugates that add volume are attached to a given AA type; this biases the set of inequalities toward the volume of the conjugated AA, and from this biased set the position of every residue of that AA type in the whole sequence is extracted. Applying this step separately to each of the 20 standard AAs yields the full sequence. The procedure is illustrated with a protein in the human proteome (UniProt id UP000005640_9606).
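The two ideas in the abstract (bounding residue volumes from the difference between two pore signals, then biasing one AA type with a volume-adding conjugate) can be illustrated with a toy sketch. This is not the paper's algorithm. It assumes two hypothetical pores holding exactly `n` and `n + 1` residues, so that their signal difference isolates a single residue and each inequality has one unknown, which makes a linear-programming solver unnecessary. The residue volumes are approximate textbook values in cubic angstroms, and `PRECISION`, `tag_volume`, and all function names are illustrative assumptions.

```python
import math

# Approximate residue volumes in cubic angstroms (illustrative values only).
VOLUME = {
    "G": 60.1, "A": 88.6, "S": 89.0, "C": 108.5, "D": 111.1,
    "P": 112.7, "N": 114.1, "T": 116.1, "E": 138.4, "V": 140.0,
    "Q": 143.8, "H": 153.2, "M": 162.9, "I": 166.7, "L": 166.7,
    "K": 168.6, "R": 173.4, "F": 189.9, "Y": 193.6, "W": 227.8,
}

# Hypothetical measurement precision, in the same units as VOLUME.
PRECISION = 50.0


def pore_signal(seq, start, n, volume):
    """Total exclusion volume of the n residues in the pore from index start."""
    return sum(volume[aa] for aa in seq[start:start + n])


def residue_bins(seq, n, volume=VOLUME):
    """Quantized volume of each residue at positions n .. len(seq) - 1.

    The signal of a pore holding n + 1 residues minus that of a pore holding
    n residues, both starting at index i, equals the volume of residue i + n.
    Bin k encodes the inequality k * PRECISION <= v < (k + 1) * PRECISION.
    """
    bins = []
    for i in range(len(seq) - n):
        diff = pore_signal(seq, i, n + 1, volume) - pore_signal(seq, i, n, volume)
        bins.append(math.floor(diff / PRECISION))
    return bins


def candidates(bin_index):
    """Standard AAs whose untagged volume satisfies the bin's inequality."""
    lo, hi = bin_index * PRECISION, (bin_index + 1) * PRECISION
    return sorted(aa for aa, v in VOLUME.items() if lo <= v < hi)


def tagged_positions(seq, n, target, tag_volume=100.0):
    """Positions (>= n) where a volume-adding tag on `target` shifts the bin.

    The first n residues are not resolved by this toy scheme.
    """
    tagged = dict(VOLUME)
    tagged[target] += tag_volume
    plain = residue_bins(seq, n)
    biased = residue_bins(seq, n, tagged)
    return [i + n for i, (a, b) in enumerate(zip(plain, biased)) if a != b]


if __name__ == "__main__":
    seq = "MKWVTFGAL"  # arbitrary example sequence
    for pos, k in enumerate(residue_bins(seq, 3), start=3):
        print(pos, seq[pos], candidates(k))
    print("G found at:", tagged_positions(seq, 3, "G"))
```

With the assumed precision of 50, the untagged volumes happen to fall into four bins, but these are not claimed to match the four subsets reported in the paper. Repeating `tagged_positions` for each of the 20 AA types mirrors the per-type tagging step described in the abstract.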

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1b97/10491143/be87554bd885/nihpp-2023.08.31.555759v1-f0001.jpg
