• 文献检索
  • 文档翻译
  • 深度研究
  • 学术资讯
  • Suppr Zotero 插件Zotero 插件
  • 邀请有礼
  • 套餐&价格
  • 历史记录
应用&插件
Suppr Zotero 插件Zotero 插件浏览器插件Mac 客户端Windows 客户端微信小程序
定价
高级版会员购买积分包购买API积分包
服务
文献检索文档翻译深度研究API 文档MCP 服务
关于我们
关于 Suppr公司介绍联系我们用户协议隐私条款
关注我们

Suppr 超能文献

核心技术专利:CN118964589B侵权必究
粤ICP备2023148730 号-1Suppr @ 2026

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验

用于基因组序列的高效隐私保护可变长度子串匹配

Efficient privacy-preserving variable-length substring match for genome sequence.

作者信息

Nakagawa Yoshiki, Ohata Satsuya, Shimizu Kana

机构信息

Department of Computer Science and Engineering, Waseda University, Tokyo, Japan.

Self-employment, Tokyo, Japan.

出版信息

Algorithms Mol Biol. 2022 Apr 26;17(1):9. doi: 10.1186/s13015-022-00211-1.

DOI:10.1186/s13015-022-00211-1
PMID:35473587
原文链接:https://pmc.ncbi.nlm.nih.gov/articles/PMC9040336/
Abstract

The development of a privacy-preserving technology is important for accelerating genome data sharing. This study proposes an algorithm that securely searches a variable-length substring match between a query and a database sequence. Our concept hinges on a technique that efficiently applies FM-index for a secret-sharing scheme. More precisely, we developed an algorithm that can achieve a secure table lookup in such a way that [Formula: see text] is computed for a given depth of recursion where [Formula: see text] is an initial position, and V is a vector. We used the secure table lookup for vectors created based on FM-index. The notable feature of the secure table lookup is that time, communication, and round complexities are not dependent on the table length N, after the query input. Therefore, a substring match by reference to the FM-index-based table can also be conducted independently against the database length, and the entire search time is dramatically improved compared to previous approaches. We conducted an experiment using a human genome sequence with the length of 10 million as the database and a query with the length of 100 and found that the query response time of our protocol was at least three orders of magnitude faster than a non-indexed database search protocol under the realistic computation/network environment.

摘要

开发隐私保护技术对于加速基因组数据共享至关重要。本研究提出了一种算法,该算法可安全地搜索查询与数据库序列之间的可变长度子串匹配。我们的概念基于一种将FM索引有效应用于秘密共享方案的技术。更确切地说,我们开发了一种算法,该算法可以通过以下方式实现安全的表查找:对于给定的递归深度计算[公式:见原文],其中[公式:见原文]是初始位置,V是一个向量。我们将安全表查找用于基于FM索引创建的向量。安全表查找的显著特点是,在查询输入之后,时间、通信和轮次复杂度不依赖于表长度N。因此,通过参考基于FM索引的表进行子串匹配也可以独立于数据库长度进行,并且与以前的方法相比,整个搜索时间有了显著改善。我们使用长度为10 million的人类基因组序列作为数据库,并使用长度为100的查询进行了实验,发现在实际计算/网络环境下,我们协议的查询响应时间比非索引数据库搜索协议至少快三个数量级。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/dbe61b2eabb2/13015_2022_211_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/9936a92d783d/13015_2022_211_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/b080fdaaa4b9/13015_2022_211_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/353e4d9ab1c8/13015_2022_211_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/c9021a492edf/13015_2022_211_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/e5ecdbd8e119/13015_2022_211_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/5439fa3e7ec1/13015_2022_211_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/50dd2894a53f/13015_2022_211_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/dbe61b2eabb2/13015_2022_211_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/9936a92d783d/13015_2022_211_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/b080fdaaa4b9/13015_2022_211_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/353e4d9ab1c8/13015_2022_211_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/c9021a492edf/13015_2022_211_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/e5ecdbd8e119/13015_2022_211_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/5439fa3e7ec1/13015_2022_211_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/50dd2894a53f/13015_2022_211_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3da/9040336/dbe61b2eabb2/13015_2022_211_Fig8_HTML.jpg

相似文献

1
Efficient privacy-preserving variable-length substring match for genome sequence.用于基因组序列的高效隐私保护可变长度子串匹配
Algorithms Mol Biol. 2022 Apr 26;17(1):9. doi: 10.1186/s13015-022-00211-1.
2
An optimized FM-index library for nucleotide and amino acid search.一个用于核苷酸和氨基酸搜索的优化FM索引库。
Algorithms Mol Biol. 2021 Dec 31;16(1):25. doi: 10.1186/s13015-021-00204-6.
3
Efficient Privacy-Preserving -Means Clustering from Secret-Sharing-Based Secure Three-Party Computation.基于秘密共享的安全三方计算实现高效隐私保护的均值聚类
Entropy (Basel). 2022 Aug 18;24(8):1145. doi: 10.3390/e24081145.
4
Secure Wavelet Matrix: Alphabet-Friendly Privacy-Preserving String Search for Bioinformatics.安全小波矩阵:生物信息学中友好字母的隐私保护字符串搜索
IEEE/ACM Trans Comput Biol Bioinform. 2019 Sep-Oct;16(5):1675-1684. doi: 10.1109/TCBB.2018.2814039. Epub 2018 Mar 8.
5
HIA: a genome mapper using hybrid index-based sequence alignment.HIA:一种使用基于混合索引的序列比对的基因组映射器。
Algorithms Mol Biol. 2015 Dec 23;10:30. doi: 10.1186/s13015-015-0062-4. eCollection 2015.
6
Efficient privacy-preserving string search and an application in genomics.高效的隐私保护字符串搜索及其在基因组学中的应用。
Bioinformatics. 2016 Jun 1;32(11):1652-61. doi: 10.1093/bioinformatics/btw050. Epub 2016 Mar 2.
7
Secure multiparty computation of a comparison problem.比较问题的安全多方计算
Springerplus. 2016 Sep 5;5(1):1489. doi: 10.1186/s40064-016-3061-0. eCollection 2016.
8
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.在流行地区,服用抗叶酸抗疟药物的人群中,叶酸补充剂与疟疾易感性和严重程度的关系。
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
9
Round-Efficient Secure Inference Based on Masked Secret Sharing for Quantized Neural Network.基于掩码秘密共享的量化神经网络的轮次高效安全推理
Entropy (Basel). 2023 Feb 20;25(2):389. doi: 10.3390/e25020389.
10
Secure and Privacy-Preserving Body Sensor Data Collection and Query Scheme.安全且保护隐私的身体传感器数据收集与查询方案
Sensors (Basel). 2016 Feb 1;16(2):179. doi: 10.3390/s16020179.

引用本文的文献

1
Proxy panels enable privacy-aware outsourcing of genotype imputation.代理面板实现了基因型填充的隐私保护外包。
Genome Res. 2025 Feb 14;35(2):326-339. doi: 10.1101/gr.278934.124.

本文引用的文献

1
Efficient and Precise Secure Generalized Edit Distance and Beyond.高效且精确的安全广义编辑距离及其他
IEEE Trans Dependable Secure Comput. 2022 Jan-Feb;19(1):579-590. doi: 10.1109/tdsc.2020.2984219. Epub 2020 Apr 2.
2
Privately computing set-maximal matches in genomic data.在基因组数据中进行私有集最大匹配计算。
BMC Med Genomics. 2020 Jul 21;13(Suppl 7):72. doi: 10.1186/s12920-020-0718-x.
3
Federated discovery and sharing of genomic data using Beacons.使用信标进行基因组数据的联合发现与共享。
Nat Biotechnol. 2019 Mar;37(3):220-224. doi: 10.1038/s41587-019-0046-x.
4
Secure Wavelet Matrix: Alphabet-Friendly Privacy-Preserving String Search for Bioinformatics.安全小波矩阵:生物信息学中友好字母的隐私保护字符串搜索
IEEE/ACM Trans Comput Biol Bioinform. 2019 Sep-Oct;16(5):1675-1684. doi: 10.1109/TCBB.2018.2814039. Epub 2018 Mar 8.
5
Privacy-preserving techniques of genomic data-a survey.基因组数据隐私保护技术综述。
Brief Bioinform. 2019 May 21;20(3):887-895. doi: 10.1093/bib/bbx139.
6
A hybrid cloud read aligner based on MinHash and kmer voting that preserves privacy.一种基于 MinHash 和 kmer 投票的混合云读取对齐器,可保护隐私。
Nat Commun. 2017 May 16;8:15311. doi: 10.1038/ncomms15311.
7
Efficient privacy-preserving string search and an application in genomics.高效的隐私保护字符串搜索及其在基因组学中的应用。
Bioinformatics. 2016 Jun 1;32(11):1652-61. doi: 10.1093/bioinformatics/btw050. Epub 2016 Mar 2.
8
Privacy in the Genomic Era.基因组时代的隐私问题。
ACM Comput Surv. 2015 Sep;48(1). doi: 10.1145/2767007.
9
The Matchmaker Exchange: a platform for rare disease gene discovery.媒人交换平台:罕见病基因发现的平台。
Hum Mutat. 2015 Oct;36(10):915-21. doi: 10.1002/humu.22858.
10
Routes for breaching and protecting genetic privacy.突破和保护遗传隐私的途径。
Nat Rev Genet. 2014 Jun;15(6):409-21. doi: 10.1038/nrg3723. Epub 2014 May 8.