Kim Yongwook Bryce, Hemberg Erik, O'Reilly Una-May
Annu Int Conf IEEE Eng Med Biol Soc. 2016 Aug;2016:2479-2483. doi: 10.1109/EMBC.2016.7591233.
We introduce stratified locality-sensitive hashing (SLSH) for retrieving similar physiological waveform time series. SLSH further reduces the sublinear retrieval time achieved by the standard locality-sensitive hashing (LSH) method. A standard family of locality-sensitive hash functions can provide only a single perspective on the data, because each family corresponds one-to-one to a distinct distance function for measuring similarity. SLSH combines multiple locality-sensitive hash families with different distance functions, which lets it examine the data from more diverse and refined perspectives. We describe the SLSH procedures using locality-sensitive hash families for the l1 and cosine distances, and compare its performance with standard LSH on arterial blood pressure time series data extracted from the physiological waveform repository of the MIMIC2 database. When we allow a 5% decrease in accuracy as a trade-off, SLSH retrieves the five most similar waveforms 14 times faster than linear search and 1.7 times faster than standard LSH.
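The idea of layering hash families for two distances can be illustrated with a minimal sketch. It is not the authors' procedure, which the abstract does not specify. It assumes an outer layer built from Cauchy (1-stable) random projections, a common LSH family for the l1 distance, and an inner layer built from random hyperplanes, a common LSH family for the cosine distance. The class name `TwoLayerLSH` and the parameters `n_outer`, `n_inner`, and `width` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

import numpy as np


class TwoLayerLSH:
    """Illustrative two-layer index: l1 hashing outside, cosine hashing inside."""

    def __init__(self, dim, n_outer=4, n_inner=8, width=4.0, seed=0):
        rng = np.random.default_rng(seed)
        # Outer layer (assumption): Cauchy projections, a 1-stable family for l1.
        self.a = rng.standard_cauchy((n_outer, dim))
        self.b = rng.uniform(0.0, width, n_outer)
        self.width = width
        # Inner layer (assumption): random hyperplanes for the cosine distance.
        self.planes = rng.standard_normal((n_inner, dim))
        # outer key -> inner key -> list of indices into self.data
        self.table = defaultdict(lambda: defaultdict(list))
        self.data = []

    def _outer_key(self, x):
        # Quantize each shifted projection into a bin of size `width`.
        return tuple(np.floor((self.a @ x + self.b) / self.width).astype(int))

    def _inner_key(self, x):
        # One bit per hyperplane: which side of the plane x falls on.
        return tuple((self.planes @ x >= 0).astype(int))

    def add(self, x):
        x = np.asarray(x, dtype=float)
        idx = len(self.data)
        self.data.append(x)
        self.table[self._outer_key(x)][self._inner_key(x)].append(idx)

    def query(self, q, k=5):
        q = np.asarray(q, dtype=float)
        # Look up the outer bucket, then the inner bucket, without creating keys.
        bucket = self.table.get(self._outer_key(q), {})
        candidates = bucket.get(self._inner_key(q), [])
        # Rank only the surviving candidates by exact l1 distance.
        ranked = sorted(candidates, key=lambda i: np.abs(self.data[i] - q).sum())
        return ranked[:k]


# Usage on synthetic data (not MIMIC2 waveforms).
rng = np.random.default_rng(1)
segments = rng.standard_normal((200, 32))
index = TwoLayerLSH(dim=32)
for s in segments:
    index.add(s)
nearest = index.query(segments[3], k=5)
```

In this sketch a query is compared exactly only against items that share both its outer and inner keys, which is where any speed-up over linear search would come from. The cost is that true neighbors falling into other buckets are missed, the kind of accuracy trade-off the abstract reports.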