Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA.
Department of Biological Chemistry, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA.
Mol Syst Biol. 2021 Feb;17(2):e9840. doi: 10.15252/msb.20209840.
The integration of proteomic, transcriptomic, and genetic variant annotation data will improve our understanding of genotype-phenotype associations. Due, in part, to challenges associated with accurate inter-database mapping, such multi-omic studies have not extended to chemoproteomics, a method that measures the intrinsic reactivity and potential "druggability" of nucleophilic amino acid side chains. Here, we evaluated mapping approaches to match chemoproteomic-detected cysteine and lysine residues with their genetic coordinates. Our analysis revealed that database update cycles and reliance on stable identifiers can lead to pervasive misidentification of labeled residues. Enabled by this examination of mapping strategies, we then integrated our chemoproteomics data with computational methods for predicting genetic variant pathogenicity, which revealed that codons of highly reactive cysteines are enriched for genetic variants that are predicted to be more deleterious and allowed us to identify and functionally characterize a new damaging residue in the cysteine protease caspase-8. Our study provides a roadmap for more precise inter-database mapping and points to untapped opportunities to improve the predictive power of pathogenicity scores and to advance prioritization of putative druggable sites.
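To make the mapping problem concrete, the following is a minimal sketch, not the authors' pipeline, of how a labeled residue position can be converted to coding-sequence coordinates and checked against a specific sequence release. The function names and the toy sequences are hypothetical, and the arithmetic assumes a simple 1-based coding sequence with no untranslated-region offset and no splicing.

```python
from typing import Tuple


def codon_cds_range(residue_pos: int) -> Tuple[int, int]:
    """Return the 1-based, inclusive CDS nucleotide range encoding a 1-based residue."""
    if residue_pos < 1:
        raise ValueError("residue positions are 1-based")
    start = 3 * (residue_pos - 1) + 1
    return start, start + 2


def residue_matches(sequence: str, residue_pos: int, expected: str) -> bool:
    """Check that the sequence carries the expected amino acid at the 1-based position."""
    if not 1 <= residue_pos <= len(sequence):
        return False
    return sequence[residue_pos - 1] == expected


# Hypothetical toy sequences standing in for two releases of the same
# stable identifier; they are not real protein sequences.
old_release = "MKTCAYLG"   # cysteine at position 4
new_release = "MKTAACYLG"  # an insertion moves the cysteine to position 6

labeled_pos = 4  # position reported against the old release

ok_old = residue_matches(old_release, labeled_pos, "C")
ok_new = residue_matches(new_release, labeled_pos, "C")
cds_range = codon_cds_range(labeled_pos)
```

In this toy case the identifier is unchanged but the residue at position 4 is no longer a cysteine in the newer release, so a position-only lookup would assign the label to the wrong codon. Checking residue identity against the exact sequence version used for the search is one simple guard against that kind of misidentification.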