FeatureRetrieval Function

FeatureRetrieval is a Faiss-based heterogeneous retrieval acceleration framework for NPUs that provides high-performance retrieval over massive data sets in high-dimensional space. It is built on TBE operators and written in C++, consistent with Faiss, and it supports the ARM and x86_64 platforms. FeatureRetrieval supports two types of retrieval: small-library retrieval, which performs exact search, and large-library retrieval, which performs approximate search. A small library contains 300,000 to 1,000,000 records, while a large library contains tens of millions or even hundreds of millions of records and supports feature vectors with 64 to 512 dimensions.

  • For small-library retrieval, brute-force search algorithms such as Flat, SQ, and INT8 compare the query against every feature vector in the base library and return the search results.
    • The INT8 algorithm performs brute-force search on quantized features. It is also referred to as int8flat in this document (for example, int8flat_generate_model.py).
    • The SQ algorithm quantizes features to 8-bit integers. It is also referred to as SQ8 in this document (for example, sq8_generate_model.py). For details about the data types of each algorithm, see Table 1.
  • For large-library retrieval, IVF-based algorithms such as IVFPQ, IVFSQ, and IVFINT8 are implemented on the Ascend platform based on the Faiss framework. Unlike a conventional inverted index, the IVF-based algorithms first cluster the features and then use the cluster centroids to narrow the retrieval scope, improving performance at the cost of some accuracy.
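
The difference between the two retrieval types can be sketched in a few lines of NumPy. This is a minimal illustration only: it does not use the FeatureRetrieval or Faiss APIs, and the names flat_search, build_ivf, and ivf_search, the toy k-means loop, and the data sizes are all assumptions made for the example. The parameters nlist (number of clusters) and nprobe (number of clusters scanned per query) follow the usual Faiss naming.

```python
import numpy as np

def flat_search(base, query, k):
    """Exact search: compare the query with every base vector (squared L2)."""
    dists = np.sum((base - query) ** 2, axis=1)
    ids = np.argsort(dists)[:k]
    return ids, dists[ids]

def _assign(base, centroids):
    """Return the index of the nearest centroid for each base vector."""
    d = ((base[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return np.argmin(d, axis=1)

def build_ivf(base, nlist, iters=10, seed=0):
    """Toy k-means (hypothetical helper): centroids plus one id list per cluster."""
    rng = np.random.default_rng(seed)
    centroids = base[rng.choice(len(base), nlist, replace=False)].copy()
    for _ in range(iters):
        assign = _assign(base, centroids)
        for c in range(nlist):
            members = base[assign == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    assign = _assign(base, centroids)  # final assignment to the final centroids
    lists = [np.flatnonzero(assign == c) for c in range(nlist)]
    return centroids, lists

def ivf_search(base, centroids, lists, query, k, nprobe):
    """Approximate search: scan only the nprobe clusters nearest to the query."""
    cdists = np.sum((centroids - query) ** 2, axis=1)
    probed = np.argsort(cdists)[:nprobe]
    cand = np.concatenate([lists[c] for c in probed])
    dists = np.sum((base[cand] - query) ** 2, axis=1)
    order = np.argsort(dists)[:k]
    return cand[order], dists[order], len(cand)

# Small synthetic data set; the sizes are arbitrary and chosen for the example.
rng = np.random.default_rng(0)
base = rng.standard_normal((2000, 64)).astype(np.float32)
query = base[0]

flat_ids, flat_dists = flat_search(base, query, k=5)

centroids, lists = build_ivf(base, nlist=16)
ivf_ids, ivf_dists, scanned = ivf_search(base, centroids, lists, query, k=5, nprobe=1)
full_ids, _, full_scanned = ivf_search(base, centroids, lists, query, k=5, nprobe=16)

print("flat scanned:", len(base), "ivf scanned (nprobe=1):", scanned)
```

With nprobe equal to nlist, every cluster is scanned and the result matches the brute-force result. Lowering nprobe reduces the number of vectors compared, which is where the speed-up and the possible loss of accuracy both come from.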

The low-level implementation of each algorithm uses TBE operators accelerated on the Ascend platform.

Table 1 Data types of Flat, SQ, and INT8 algorithms

Algorithm    Source Data Type    Result Data Type
Flat         float32             float32
SQ           float32             float32
INT8         int8                int32
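
The data types in Table 1 can be illustrated with a short NumPy sketch. It is not the FeatureRetrieval implementation. The per-dimension min/max scalar quantizer used for SQ, the squared L2 distance, and the inner-product score used for INT8 are assumptions chosen to make the example concrete, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 512  # upper end of the dimension range mentioned in this document

# Flat: float32 vectors in, float32 distances out.
base_f32 = rng.standard_normal((4, dim)).astype(np.float32)
query_f32 = rng.standard_normal(dim).astype(np.float32)
flat_dist = np.sum((base_f32 - query_f32) ** 2, axis=1)

# SQ: float32 vectors in, stored as 8-bit codes, float32 distances out.
# Assumed scheme: uniform per-dimension quantizer over [vmin, vmin + vdiff].
vmin = base_f32.min(axis=0)
vdiff = base_f32.max(axis=0) - vmin
vdiff[vdiff == 0] = 1.0  # avoid division by zero on constant dimensions
codes = np.round((base_f32 - vmin) / vdiff * 255).astype(np.uint8)
decoded = codes.astype(np.float32) / 255 * vdiff + vmin
sq_dist = np.sum((decoded - query_f32) ** 2, axis=1)

# INT8: int8 vectors in, int32 scores out.
# Products are accumulated in int32 because int8 would overflow immediately.
base_i8 = rng.integers(-128, 128, size=(4, dim), dtype=np.int8)
query_i8 = rng.integers(-128, 128, size=dim, dtype=np.int8)
int8_score = base_i8.astype(np.int32) @ query_i8.astype(np.int32)

# Largest possible magnitude of a 512-dimensional int8 inner product.
worst_case = dim * 128 * 128

print(flat_dist.dtype, sq_dist.dtype, int8_score.dtype)
print("worst case fits in int32:", worst_case <= np.iinfo(np.int32).max)
```

The worst-case check shows why a 32-bit result type is sufficient for int8 inputs at these dimensions: 512 products of at most 128 × 128 sum to well under the int32 limit.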