On the privacy risks of sharing clinical proteomics data.

Authors

Li Sujun, Bandeira Nuno, Wang Xiaofeng, Tang Haixu

Affiliations

School of Informatics and Computing, Indiana University, Bloomington, IN, USA.

Department of Computer Science and Engineering, University of California, San Diego, CA, USA.

Publication information

AMIA Jt Summits Transl Sci Proc. 2016 Aug 31;2016:122-31. eCollection 2016.

PMID:27595046
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5009298/
Abstract

Although the privacy issues in human genomic studies are well known, the privacy risks in clinical proteomic data have not been thoroughly studied. As a proof of concept, we report a comprehensive analysis of the privacy risks in clinical proteomic data. The analysis shows that a small number of peptides carrying the minor alleles (referred to as minor allelic peptides) at non-synonymous single nucleotide polymorphism (nsSNP) sites can be identified in typical clinical proteomic datasets acquired from the blood/serum samples of individual patients, and that a patient can be identified from these peptides with high confidence. Our results suggest significant privacy risks in raw clinical proteomic data. However, these risks can be mitigated by a straightforward pre-processing step that removes from the raw data the very small fraction (0.1%; 7.14 out of 7,504 spectra on average) of MS/MS spectra identified as minor allelic peptides, which has little or no impact on the subsequent analysis (and re-use) of these datasets.
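
The pre-processing step described in the abstract can be pictured with a minimal Python sketch. This is not the authors' pipeline: the `SpectrumMatch` record, the function names, and the toy peptide strings are all hypothetical, and the sketch assumes that each spectrum already carries a peptide identification and that a set of minor allelic peptide sequences is available.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class SpectrumMatch:
    """One MS/MS spectrum and the peptide it was matched to (hypothetical record)."""
    scan_id: int
    peptide: Optional[str]  # None when the spectrum has no confident identification


def strip_minor_allelic_spectra(
    matches: Iterable[SpectrumMatch],
    minor_allelic_peptides: Set[str],
) -> Tuple[List[SpectrumMatch], List[SpectrumMatch]]:
    """Split spectra into (kept, removed); removed ones matched a minor allelic peptide."""
    kept: List[SpectrumMatch] = []
    removed: List[SpectrumMatch] = []
    for m in matches:
        if m.peptide is not None and m.peptide in minor_allelic_peptides:
            removed.append(m)
        else:
            kept.append(m)
    return kept, removed


def removed_fraction(n_removed: float, n_total: float) -> float:
    """Fraction of spectra dropped by the pre-processing step."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_removed / n_total


# Toy data: the peptide strings are made up for illustration only.
matches = [
    SpectrumMatch(1, "LVNEVTEFAK"),
    SpectrumMatch(2, "AEFVEVTK"),
    SpectrumMatch(3, None),
    SpectrumMatch(4, "AEFVEVTK"),
]
minor = {"AEFVEVTK"}
kept, removed = strip_minor_allelic_spectra(matches, minor)
print([m.scan_id for m in kept], [m.scan_id for m in removed])

# Averages reported in the abstract: 7.14 of 7,504 spectra.
print(f"{removed_fraction(7.14, 7504):.4%}")
```

With the averages from the abstract, the removed fraction is about 0.095%, which the abstract rounds to 0.1%. Unidentified spectra are kept in this sketch, since the abstract only describes removing spectra identified as minor allelic peptides.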

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e327/5009298/c48dff404d54/2380571f1.jpg

Similar articles

1
On the privacy risks of sharing clinical proteomics data.
AMIA Jt Summits Transl Sci Proc. 2016 Aug 31;2016:122-31. eCollection 2016.
2
Detection and validation of non-synonymous coding SNPs from orthogonal analysis of shotgun proteomics data.
J Proteome Res. 2007 Jun;6(6):2331-40. doi: 10.1021/pr0700908. Epub 2007 May 9.
3
Between Access and Privacy: Challenges in Sharing Health Data.
Yearb Med Inform. 2018 Aug;27(1):55-59. doi: 10.1055/s-0038-1641216. Epub 2018 Aug 29.
4
Re-Identification Risk versus Data Utility for Aggregated Mobility Research Using Mobile Phone Location Data.
PLoS One. 2015 Oct 15;10(10):e0140589. doi: 10.1371/journal.pone.0140589. eCollection 2015.
5
Privacy Risks of Sharing Data from Environmental Health Studies.
Environ Health Perspect. 2020 Jan;128(1):17008. doi: 10.1289/EHP4817. Epub 2020 Jan 10.
6
Chromosome-Based Proteomic Study for Identifying Novel Protein Variants from Human Hippocampal Tissue Using Customized neXtProt and GENCODE Databases.
J Proteome Res. 2015 Dec 4;14(12):5028-37. doi: 10.1021/acs.jproteome.5b00472. Epub 2015 Nov 16.
7
Evaluation of Privacy Risks of Patients' Data in China: Case Study.
JMIR Med Inform. 2020 Feb 5;8(2):e13046. doi: 10.2196/13046.
8
Protein-based forensic identification using genetically variant peptides in human bone.
Forensic Sci Int. 2018 Jul;288:89-96. doi: 10.1016/j.forsciint.2018.04.016. Epub 2018 Apr 22.
9
In-depth analysis of protein inference algorithms using multiple search engines and well-defined metrics.
J Proteomics. 2017 Jan 6;150:170-182. doi: 10.1016/j.jprot.2016.08.002. Epub 2016 Aug 4.
10
Comparison of protein expression levels and proteomically-inferred genotypes using human hair from different body sites.
Forensic Sci Int Genet. 2019 Jul;41:19-23. doi: 10.1016/j.fsigen.2019.03.009. Epub 2019 Mar 11.

Cited by

1
Assessing Privacy Vulnerabilities in Genetic Data Sets: Scoping Review.
JMIR Bioinform Biotechnol. 2024 May 27;5:e54332. doi: 10.2196/54332.
2
Large scale proteomic studies create novel privacy considerations.
Sci Rep. 2023 Jun 7;13(1):9254. doi: 10.1038/s41598-023-34866-6.
3
Identifying individuals using proteomics: are we there yet?
Front Mol Biosci. 2022 Nov 29;9:1062031. doi: 10.3389/fmolb.2022.1062031. eCollection 2022.
4
Data Management of Sensitive Human Proteomics Data: Current Practices, Recommendations, and Perspectives for the Future.
Mol Cell Proteomics. 2021;20:100071. doi: 10.1016/j.mcpro.2021.100071. Epub 2021 Mar 10.
5
Ethical Principles, Constraints and Opportunities in Clinical Proteomics.
Mol Cell Proteomics. 2021 Jan 14;20:100046. doi: 10.1016/j.mcpro.2021.100046.
6
Beyond Genes: Re-Identifiability of Proteomic Data and Its Implications for Personalized Medicine.
Genes (Basel). 2019 Sep 5;10(9):682. doi: 10.3390/genes10090682.
7
A Golden Age for Working with Public Proteomics Data.
Trends Biochem Sci. 2017 May;42(5):333-341. doi: 10.1016/j.tibs.2017.01.001. Epub 2017 Jan 22.

References

1
LC-MS/MS-based serum proteomics for identification of candidate biomarkers for hepatocellular carcinoma.
Proteomics. 2015 Jul;15(13):2369-81. doi: 10.1002/pmic.201400364. Epub 2015 Apr 29.
2
Forensic DNA Phenotyping: Predicting human appearance from crime scene material for investigative purposes.
Forensic Sci Int Genet. 2015 Sep;18:33-48. doi: 10.1016/j.fsigen.2015.02.003. Epub 2015 Feb 16.
3
MS-GF+ makes progress towards a universal database search tool for proteomics.
Nat Commun. 2014 Oct 31;5:5277. doi: 10.1038/ncomms6277.
4
ProteomeXchange provides globally coordinated proteomics data submission and dissemination.
Nat Biotechnol. 2014 Mar;32(3):223-6. doi: 10.1038/nbt.2839.
5
Computational framework for identification of intact glycopeptides in complex samples.
Anal Chem. 2014 Jan 7;86(1):453-63. doi: 10.1021/ac402338u. Epub 2013 Dec 10.
6
RefSeq: an update on mammalian reference sequences.
Nucleic Acids Res. 2014 Jan;42(Database issue):D756-63. doi: 10.1093/nar/gkt1114. Epub 2013 Nov 19.
7
Large-scale mass spectrometric detection of variant peptides resulting from nonsynonymous nucleotide differences.
J Proteome Res. 2014 Jan 3;13(1):228-40. doi: 10.1021/pr4009207. Epub 2013 Nov 11.
8
Identifying personal genomes by surname inference.
Science. 2013 Jan 18;339(6117):321-4. doi: 10.1126/science.1229566.
9
The PRoteomics IDEntifications (PRIDE) database and associated tools: status in 2013.
Nucleic Acids Res. 2013 Jan;41(Database issue):D1063-9. doi: 10.1093/nar/gks1262. Epub 2012 Nov 29.
10
False discovery rates in spectral identification.
BMC Bioinformatics. 2012;13 Suppl 16(Suppl 16):S2. doi: 10.1186/1471-2105-13-S16-S2. Epub 2012 Nov 5.