Vector Retrieval (Approximate Nearest Neighbor Search, ANN): Useful or Useless? Simple at Its Core but Mythologized [0] Paper Collection: SIGMOD, VLDB, NIPS...


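This article opens with a curated collection of papers on vector retrieval, that is, approximate nearest neighbor search (ANN search, ANNS). The collection aims to gather high-quality research papers, articles, and resources that offer valuable insights and track progress in the field. ANN search is a core component of vector databases, retrieval-augmented generation (RAG), large-scale information retrieval, recommender systems, drug discovery, image search, and related areas.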

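First, this article will be updated continuously. Second, follow-up articles will walk through papers in the vector retrieval field in an accessible way, to shrink a field that has been made to look bigger than it is. Much of it is less complicated than it appears; in fact, many of the field's best-known works are heuristic.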

What Is Vector Retrieval, and Where Is It Used

Some explanations:
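As a rough illustration of the problem itself (not taken from the original article), the sketch below contrasts exact nearest neighbor search, which scans every vector, with a toy "approximate" search that ranks only a subset of candidates, and measures the result with recall@k. The function names, the random-subset candidate generator, and the data sizes are all hypothetical choices made for this example. Real ANN indexes such as graphs, LSH, or quantization pick candidates far more cleverly.

```python
import heapq
import random


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_knn(data, query, k):
    """Exact k nearest neighbors by scanning every vector (the brute-force baseline)."""
    return heapq.nsmallest(k, range(len(data)), key=lambda i: squared_l2(data[i], query))


def sampled_knn(data, query, k, candidate_ids):
    """Toy approximate search: rank only a candidate subset instead of all points."""
    return heapq.nsmallest(k, candidate_ids, key=lambda i: squared_l2(data[i], query))


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k that the approximate result recovered."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


if __name__ == "__main__":
    rng = random.Random(0)
    dim, n, k = 16, 2000, 10
    data = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
    query = [rng.gauss(0, 1) for _ in range(dim)]

    truth = exact_knn(data, query, k)
    # Hypothetical candidate generator: a random half of the dataset.
    candidates = rng.sample(range(n), n // 2)
    approx = sampled_knn(data, query, k, candidates)
    print("recall@%d = %.2f" % (k, recall_at_k(approx, truth)))
```

The trade-off this sketch exposes is the one ANN methods try to manage: fewer distance computations per query in exchange for recall that may fall below 1.0.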

Applications:

Papers

| Title | Link | Rough category | Notes |
| --- | --- | --- | --- |
| Approximate Nearest Neighbor Search on High Dimensional Data — Experiments, Analyses, and Improvement | Link | Survey | |
| Graph-based Nearest Neighbor Search: From Practice to Theory | Link | Theoretical | |
| FINGER: Fast Inference for Graph-based Approximate Nearest Neighbor Search | Link | Graph-based | |
| HVS: hierarchical graph structure based on Voronoi diagrams for solving approximate nearest neighbor search | Link | Graph-based | |
| DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node | Link | Graph-based | |
| Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs | Link | Graph-based | |
| SONG: Approximate Nearest Neighbor Search on GPU | Link | Graph-based | |
| Graph-based Nearest Neighbor Search: Promises and Failures | Link | Graph-based | |
| Improving Approximate Nearest Neighbor Search through Learned Adaptive Early Termination | Link | Graph-based | |
| A Comprehensive Survey and Experimental Comparison of Graph-Based Approximate Nearest Neighbor Search | Link | Survey | |
| Fast approximate nearest neighbor search with the navigating spreading-out graph | Link | Graph-based | |
| Non-metric Similarity Graphs for Maximum Inner Product Search | Link | Graph-based | |
| Understanding and Improving Proximity Graph-based Maximum Inner Product Search | Link | Graph-based | |
| Learning to Route in Similarity Graphs | Link | Graph-based | |
| Optimization of Indexing Based on k-Nearest Neighbor Graph for Proximity Search in High-dimensional Data | Link | Graph-based | |
| Fast Approximate Nearest Neighbor Search with a Dynamic Exploration Graph using Continuous Refinement | Link | Graph-based | |
| Efficient Approximate Nearest Neighbor Search in Multi-dimensional Databases | Link | Graph-based | |
| Scaling Graph-Based ANNS Algorithms to Billion-Size Datasets: A Comparative Analysis | Link | Graph-based | |
| SPANN: Highly-efficient Billion-scale Approximate Nearest Neighbor Search | Link | Graph-based | |
| Hierarchical Clustering-Based Graphs for Large Scale Approximate Nearest Neighbor Search | Link | Graph-based | |
| Fusion of graph-based indexing and product quantization for ANN search | Link | Graph-based | |
| Towards Efficient Index Construction and Approximate Nearest Neighbor Search in High-Dimensional Spaces | Link | Graph-based | |
| Scaling Graph-Based ANNS Algorithms to Billion-Size Datasets: A Comparative Analysis | Link | Survey | |
| Automating Nearest Neighbor Search Configuration with Constrained Optimization | Link | Learning | |
| Approximate Nearest Neighbor Search under Neural Similarity Metric for Large-Scale Recommendation | Link | Graph-based | |
| Norm Adjusted Proximity Graph for Fast Inner Product Retrieval | Link | Graph-based | |
| On Efficient Retrieval of Top Similarity Vectors | Link | Graph-based | |
| SONG: Approximate Nearest Neighbor Search on GPU | Link | GPU | |
| RTNN: Accelerating Neighbor Search Using Hardware Ray Tracing | Link | GPU | |
| Billion-scale similarity search with GPUs | Link | GPU | |
| Fast neural ranking on bipartite graph indices | Link | Neural Rank | |
| Fast Item Ranking under Neural Network based Measures | Link | Neural Rank | |
| Non-metric Similarity Graphs for Maximum Inner Product Search | Link | MIPS | |
| Möbius Transformation for Fast Inner Product Search on Graph | Link | MIPS | |
| Understanding and Improving Proximity Graph-based Maximum Inner Product Search | Link | MIPS | |
| Reinforcement Routing on Proximity Graph for Efficient Recommendation | Link | Learning | |
| From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective | Link | Learning | |
| Constructing Tree-based Index for Efficient and Effective Dense Retrieval | Link | Learning | |
| Reverse Maximum Inner Product Search: Formulation, Algorithms, and Analysis | Link | MIPS | |
| FARGO: Fast Maximum Inner Product Search via Global Multi-Probing | Link | LSH | |
| SRS: solving c-approximate nearest neighbor queries in high dimensional Euclidean space with a tiny index | Link | LSH | |
| From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective | Link | LSH | |
| LazyLSH: Approximate Nearest Neighbor Search for Multiple Distance Functions with a Single Index | Link | LSH | |
| HD-index: pushing the scalability-accuracy boundary for approximate kNN search in high-dimensional spaces | Link | LSH | |
| Falconn++: A Locality-sensitive Filtering Approach for Approximate Nearest Neighbor Search | Link | LSH | |
| Deep Semantic-Preserving Ordinal Hashing for Cross-Modal Similarity Search | Link | LSH | |
| Supervised Hierarchical Deep Hashing for Cross-Modal Retrieval | Link | LSH | |
| A Revisit of Hashing Algorithms for Approximate Nearest Neighbor Search | Link | Survey | |

Please note that some entries may require access rights or a membership to view the full content.

Details