

VHP: Approximate Nearest Neighbor Search via Virtual Hypersphere Partitioning

This item is licensed under: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International

Files in This Item:
3397230.3397240.pdf (740.62 kB, PDF)
Please use this identifier to cite or link to this item: http://hdl.handle.net/2115/79717

Title: VHP: Approximate Nearest Neighbor Search via Virtual Hypersphere Partitioning
Authors: Lu, Kejing
Wang, Hongya
Wang, Wei
Kudo, Mineichi
Issue Date: May-2020
Publisher: Association for Computing Machinery (ACM)
Journal Title: Proceedings of the VLDB Endowment
Volume: 13
Issue: 9
Start Page: 1443
End Page: 1455
Publisher DOI: 10.14778/3397230.3397240
Abstract: Locality-sensitive hashing (LSH) is a widely practiced c-approximate nearest neighbor (c-ANN) search algorithm in high-dimensional spaces. The state-of-the-art LSH-based algorithm searches an unbounded and irregular space to identify candidates, which jeopardizes efficiency. To address this issue, we introduce the concept of virtual hypersphere partitioning. The core idea is to impose a virtual hypersphere, centered at the query, in the original feature space and only examine points inside the hypersphere. The search space of a hypersphere is isotropic and bounded, and thus more efficient to search than the existing one. In practice, we use multiple physical hyperspheres with different radii in corresponding projection subspaces to emulate the single virtual hypersphere. We also develop a principled method to compute the hypersphere radii for a given success probability. Based on virtual hypersphere partitioning, we propose a novel disk-based indexing and searching scheme, VHP, to answer c-ANN queries. In the indexing phase, VHP stores LSH projections in independent B+-trees. To process a query, VHP keeps increasing the radii of the physical hyperspheres in a coordinated manner, which in effect amounts to enlarging the virtual hypersphere, to accommodate more candidates until the success probability is met. Rigorous theoretical analysis shows that the proposed algorithm supports c-ANN search for arbitrarily small c >= 1 with a probability guarantee. Extensive experiments on a variety of datasets, including billion-scale ones, demonstrate that VHP can achieve different tradeoffs between efficiency and accuracy, and achieves up to 2x speedup in running time over the state-of-the-art methods.
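The query loop described in the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation: sorted Python lists stand in for the disk-based B+-trees, a single shared radius replaces the per-subspace radii, and the names `build_index`, `query`, `min_hits`, `budget`, `step`, and `growth` are hypothetical. The candidate rule (a point must fall inside the window on at least `min_hits` projections) and the stopping rule (a fixed candidate budget) are assumptions used in place of the paper's success-probability condition, which the abstract does not spell out.

```python
import bisect
import math
import random


def build_index(points, m, seed=0):
    """Project every point onto m random Gaussian directions and sort each projection."""
    rng = random.Random(seed)
    dim = len(points[0])
    directions = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(m)]
    # One sorted list of (projection value, point id) per direction:
    # an in-memory stand-in for one B+-tree per LSH projection.
    trees = []
    for a in directions:
        tree = sorted((sum(x * y for x, y in zip(a, p)), i) for i, p in enumerate(points))
        trees.append(tree)
    return directions, trees


def query(points, directions, trees, q, min_hits, budget, step=0.5, growth=2.0):
    """Widen all projection windows together until enough candidates are collected."""
    min_hits = min(min_hits, len(trees))  # guard: cannot hit more projections than exist
    q_proj = [sum(x * y for x, y in zip(a, q)) for a in directions]
    radius = step
    while True:
        hits = {}
        for tree, center in zip(trees, q_proj):
            # Range scan of [center - radius, center + radius] on this projection.
            lo = bisect.bisect_left(tree, (center - radius, -1))
            hi = bisect.bisect_right(tree, (center + radius, len(points)))
            for _, pid in tree[lo:hi]:
                hits[pid] = hits.get(pid, 0) + 1
        # Assumed candidate rule: inside the window on at least min_hits projections.
        candidates = [pid for pid, n in hits.items() if n >= min_hits]
        # Assumed stopping rule: candidate budget reached, or every point covered.
        if len(candidates) >= budget or len(candidates) == len(points):
            break
        radius *= growth  # enlarge all windows in a coordinated way
    # Verify candidates with true Euclidean distances in the original space.
    return min(candidates, key=lambda pid: math.dist(points[pid], q))
```

Calling `build_index(points, m=6)` once and then `query(points, directions, trees, q, min_hits=3, budget=10)` returns the id of the closest candidate found. A larger `budget` examines more points and trades running time for accuracy, which loosely mirrors the efficiency/accuracy tradeoff the abstract mentions.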
Rights: https://creativecommons.org/licenses/by-nc-nd/4.0/
Type: article
URI: http://hdl.handle.net/2115/79717
Appears in Collections: Graduate School of Information Science and Technology / Faculty of Information Science and Technology > Peer-reviewed Journal Articles, etc

Submitter: Kudo, Mineichi (工藤 峰一)
