SearchByThreshold

API definition

APP_ERROR SearchByThreshold(int n, const float16_t *queries, float threshold, int topk, int *num, idx_t *indices, float *distances, unsigned int tableLen = 0, const float *table = nullptr);

Function

Adds threshold filtering on top of the basic search and returns only the results that meet the threshold condition. If a valid mapping table is passed (tableLen > 0 and table is a non-null pointer), the top k results after mapping are returned.

Input

int n: number of feature vectors to be queried.

const float16_t *queries: feature vectors to be queried. The length is n × dim.

float threshold: threshold for filtering. The API does not restrict the value range. If a table is passed for mapping, the API maps each distance to a score and then filters the scores by threshold.

int topk: number of results to return per query. The distances between each query and the base vectors are sorted, and the first topk results are returned.

unsigned int tableLen: length of the mapping table. The default value is 0, indicating that no mapping is performed. Currently, the mapping table length can be set to 10000.

const float *table: pointer to the mapping table, which stores tableLen valid mapping values. Currently, the supported redundancy length is 48, that is, the space that table points to is 10048 × sizeof(float) bytes long.
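
As an illustration of the buffer size described above, the following is a minimal sketch of how a caller might allocate a mapping table. The helper MakeLinearTable, the constants kTableLen and kRedundancy, and the linear score values are all hypothetical. The source only specifies the length (10000) and the redundancy (48), not what the entries should contain.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

constexpr unsigned int kTableLen = 10000;  // documented mapping table length
constexpr unsigned int kRedundancy = 48;   // documented redundancy length

// Hypothetical helper, not part of the library: builds a table of
// kTableLen + kRedundancy floats. The linear ramp from 0 to 1 is an
// assumption chosen only for illustration; real score values are up to the caller.
inline std::vector<float> MakeLinearTable()
{
    std::vector<float> table(kTableLen + kRedundancy, 0.0f);
    for (std::size_t i = 0; i < table.size(); ++i) {
        // Entries past kTableLen (the redundancy area) are clamped to 1.0.
        table[i] = std::min(1.0f, static_cast<float>(i) / static_cast<float>(kTableLen));
    }
    return table;
}
```

With such a buffer, kTableLen and table.data() would be passed as the tableLen and table arguments.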

Output

int *num: number of base vectors that meet the threshold condition for each query vector. The length is n.

idx_t *indices: indices of the base vectors that meet the threshold condition. For each query, the indices that meet the condition are recorded in sequence, and the rest of that query's topk slots is padding. The total length of indices is n × topk.

float *distances: distances between the query vector and the base vectors that meet the threshold condition. The layout and length are the same as those of indices.
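
The output layout can be read back as in the following sketch. CollectHits is a hypothetical helper, not a library function, and it is templated on the index type because the definition of idx_t is not given here. It assumes that each query owns topk consecutive slots and that only the first num[q] of them are valid.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: gathers the valid (index, distance) pairs per query
// from the flat n x topk output buffers, skipping the padding slots.
template <typename IdxT>
std::vector<std::vector<std::pair<IdxT, float>>> CollectHits(
    int n, int topk, const int *num, const IdxT *indices, const float *distances)
{
    std::vector<std::vector<std::pair<IdxT, float>>> hits(
        static_cast<std::size_t>(std::max(n, 0)));
    for (int q = 0; q < n; ++q) {
        // Query q starts at offset q * topk in both output buffers.
        const std::size_t base = static_cast<std::size_t>(q) * static_cast<std::size_t>(topk);
        // Defensive clamp: never read more than topk slots for one query.
        const int valid = std::min(num[q], topk);
        for (int j = 0; j < valid; ++j) {
            hits[q].emplace_back(indices[base + j], distances[base + j]);
        }
    }
    return hits;
}
```

The contents of the padding slots are not specified in this section, so the sketch never reads them.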

Return value

APP_ERROR: return status. For details, see Return Code Reference.

Restrictions

  • n: The value range is [0, capacity].
  • topk: The value range is [0, 1024].
  • If both tableLen and table are valid when passed, the API maps the calculated distance.

    First, the distance value is normalized to a floating-point number f1 in the range [0, 1]. Then f1 is multiplied by tableLen and rounded up to obtain an integer index in the range [0, tableLen]. The integer index is used as an offset to read the corresponding score from the memory that table points to. This completes the mapping, and the score is saved to distances.

    The index mapping formula is as follows: ((CosDistance + 1)/2) × tableLen

  • indices, queries, distances, and num must be non-null pointers, and the buffers they point to must have the required lengths. Otherwise, an out-of-bounds read or write may occur and crash the program.
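
The distance-to-score mapping described in the restrictions above can be sketched as follows. MapDistanceToIndex and MapDistanceToScore are hypothetical helpers written only to illustrate the documented formula. The library's internal implementation may differ, and the clamping step is an added safety assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: maps a cosine distance in [-1, 1] to a table offset
// in [0, tableLen] using ((CosDistance + 1) / 2) x tableLen, rounded up.
inline std::size_t MapDistanceToIndex(float cosDistance, unsigned int tableLen)
{
    assert(tableLen > 0);
    // Normalize to f1 in [0, 1]; the clamp (an assumption) guards against
    // small numeric overshoot outside [-1, 1].
    const float f1 = std::min(1.0f, std::max(0.0f, (cosDistance + 1.0f) / 2.0f));
    return static_cast<std::size_t>(std::ceil(f1 * static_cast<float>(tableLen)));
}

// Hypothetical helper: reads the score at the mapped offset. The table must
// hold more than tableLen floats so that offset tableLen itself is readable.
inline float MapDistanceToScore(float cosDistance, unsigned int tableLen, const float *table)
{
    assert(table != nullptr);
    return table[MapDistanceToIndex(cosDistance, tableLen)];
}
```

Because the index can equal tableLen, the offset 10000 is read when CosDistance is 1. That offset lies within the 10048-float buffer described for table.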