Department of Molecular Pharmacology and Experimental Therapeutics (Windland Smith Rice Sudden Death Genomics Laboratory). Department of Cardiovascular Medicine, Division of Heart Rhythm Services (Windland Smith Rice Genetic Heart Rhythm Clinic). Department of Pediatric and Adolescent Medicine, Division of Pediatric Cardiology, Mayo Clinic.
Circ Genom Precis Med. 2024 Oct;17(5):e004584. doi: 10.1161/CIRCGEN.124.004584. Epub 2024 Aug 9.
Genetic testing for cardiac channelopathies is the standard of care. However, many rare genetic variants remain classified as variants of uncertain significance (VUS) because epidemiological and functional data are lacking. Whether deep protein language models can aid in VUS resolution remains unknown. Here, we set out to compare the performance of 2 deep protein language models at VUS resolution in the 3 most common long-QT syndrome-causative genes against the gold standard, patch clamp.
A total of 72 rare nonsynonymous VUS (9 19 , and 50 ) were engineered by site-directed mutagenesis and expressed in either HEK293 or TSA201 cells. The whole-cell patch-clamp technique was used to functionally characterize these variants. Two protein language models, evolutionary scale modeling version 1b (ESM1b) and AlphaMissense, were used to predict the effect of the missense variants, and their predictions were compared with the patch-clamp results.
Considering variants in all 3 genes, the evolutionary scale modeling, version 1b model had a receiver operating characteristic curve-area under the curve of 0.75 (P=0.0003), with a sensitivity of 88% and a specificity of 50%. AlphaMissense performed well compared with patch clamp, with a receiver operating characteristic curve-area under the curve of 0.85 (P<0.0001), a sensitivity of 80%, and a specificity of 76%.
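To make these metrics concrete, the following is a minimal sketch of how sensitivity, specificity, and the area under the receiver operating characteristic curve can be computed when model scores are compared against binary patch-clamp outcomes. It is not the authors' analysis: the function names, the toy labels and scores, and the 0.5 cutoff are all hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at a given score cutoff.

    labels: 1 = functionally abnormal by patch clamp, 0 = normal (assumed coding).
    scores: model-predicted deleteriousness; higher means more likely abnormal.
    """
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random abnormal variant
    scores higher than a random normal variant (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT values from the study.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]

sens, spec = sensitivity_specificity(toy_labels, toy_scores, threshold=0.5)
auc = roc_auc(toy_labels, toy_scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.3f}")
```

Unlike sensitivity and specificity, the area under the curve does not depend on a single cutoff, which is why a model can show a respectable area under the curve while still having low specificity at the threshold actually used.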
Deep protein language models aid in VUS resolution with high sensitivity but lower specificity. Thus, these tools cannot fully replace functional characterization but can aid in reducing the number of variants that may require functional analysis.