ComputeDistanceByThreshold

API Definition

APP_ERROR ComputeDistanceByThreshold(int n, const float16_t *queries, float threshold, int *num, idx_t *indices, float *distances, unsigned int tableLen = 0, const float *table = nullptr)

Function

Adds threshold filtering on top of ComputeDistance and returns only the distances that meet the threshold condition. If a valid mapping table is passed (tableLen > 0 and table is a non-null pointer), the distances are first mapped to scores and then filtered by threshold.

Input

int n: number of feature vectors to be queried.

const float16_t *queries: feature vectors to be queried. The length is n x dim, where dim is the vector dimension.

float threshold: threshold for filtering. The API does not restrict the value range. If a mapping table is passed, the API first maps each distance to a score and then filters the scores against threshold.

unsigned int tableLen: length of the mapping table. The default value is 0, indicating that no mapping is performed. Currently, the mapping table length can be set to 10000.

const float *table: pointer to the mapping table. It must point to storage that holds tableLen valid mapping values. Currently, the supported redundancy length is 48, that is, the space to which table points is 10048 x sizeof(float) bytes long.

Output

int *num: for each feature vector to be queried, the number of base library vectors that meet the threshold condition. The length is n.

idx_t *indices: indexes of the base library vectors that meet the threshold condition. The number of matching base library vectors varies from query to query. For each query, the valid indexes are recorded in sequence and the remaining space is padded up to ntotalPad. The total length of indices is n x ntotalPad, where ntotalPad = (ntotal + 15)/16 x 16, that is, ntotal rounded up to a multiple of 16.

float *distances: distances between the vector to be queried and the base library vectors that meet the threshold condition. Valid values are recorded in the same way, and occupy the same amount of space, as in indices.
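The output layout described above can be sketched as follows. This is a minimal illustration, not part of the SDK: PadTo16 and CollectHits are hypothetical helper names, and the index type is left as a template parameter because the definition of idx_t is not given here. It assumes the buffers were already filled by a successful call.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: ntotalPad = (ntotal + 15)/16 x 16,
// that is, ntotal rounded up to a multiple of 16.
inline std::size_t PadTo16(std::size_t ntotal)
{
    return (ntotal + 15) / 16 * 16;
}

// Hypothetical helper: gathers the valid (index, distance) pairs of query i.
// Each query owns a segment of ntotalPad slots; only the first num[i]
// slots of that segment are meaningful, the rest is padding.
template <typename IdxT>
std::vector<std::pair<IdxT, float>> CollectHits(int i, std::size_t ntotal,
                                                const int *num,
                                                const IdxT *indices,
                                                const float *distances)
{
    const std::size_t ntotalPad = PadTo16(ntotal);
    const std::size_t count = static_cast<std::size_t>(num[i]);
    assert(count <= ntotalPad);  // a query cannot match more than one segment

    const std::size_t base = static_cast<std::size_t>(i) * ntotalPad;
    std::vector<std::pair<IdxT, float>> hits;
    hits.reserve(count);
    for (std::size_t k = 0; k < count; ++k) {
        hits.emplace_back(indices[base + k], distances[base + k]);
    }
    return hits;
}
```

With this layout, both indices and distances need n x PadTo16(ntotal) elements, and the padded slots should never be read.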

Return Value

APP_ERROR: return status. For details, see Return Code Reference.

Restrictions

  • n: The value range is [0, capacity].
  • *indices: The length of the space to be provided is n x ntotalPad, where ntotalPad = (ntotal + 15)/16 x 16, that is, ntotal rounded up to a multiple of 16. After query i is filtered, its valid base library indexes are stored in the first *(num + i) slots of its ntotalPad-sized segment. The padded values carry no meaning.
  • *distances: The length of the space to be provided is n x ntotalPad.
  • aclrtMalloc is recommended for allocating *indices and *distances. It allocates the full physical memory up front, which helps reduce latency.
  • If both tableLen and *table meet the requirements when the parameters are passed, the API maps the calculated distance.

    First, the distance value is normalized to a floating point number f1 in the range [0, 1]. Then f1 is multiplied by tableLen and rounded up to obtain an integer index in the range [0, tableLen]. Finally, the integer index is used as an offset to read the corresponding score from the memory space pointed to by *table. This completes the mapping, and the score is saved to *distances.

    The index mapping formula is as follows: ((CosDistance + 1)/2) x tableLen
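    The mapping steps can be sketched on the host side as follows. This is only an illustration of the formula above, not the SDK implementation: MapIndex and MapScore are hypothetical names, the clamp to [0, 1] is an added safety assumption, and std::ceil is used because the text says "rounded up" (the rounding actually performed on the device may differ).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: index = ((CosDistance + 1)/2) x tableLen, rounded up.
inline std::size_t MapIndex(float cosDistance, unsigned int tableLen)
{
    assert(tableLen > 0);
    // Normalize the cosine distance from [-1, 1] to f1 in [0, 1].
    float f1 = (cosDistance + 1.0f) / 2.0f;
    // Assumption: clamp so that small numerical overshoots stay in range.
    f1 = std::fmin(std::fmax(f1, 0.0f), 1.0f);
    // Scale by tableLen and round up; the result lies in [0, tableLen].
    return static_cast<std::size_t>(std::ceil(f1 * tableLen));
}

// Hypothetical helper: reads the score stored at the mapped offset.
// The table must hold at least tableLen + 1 readable floats.
inline float MapScore(float cosDistance, const float *table, unsigned int tableLen)
{
    return table[MapIndex(cosDistance, tableLen)];
}
```

    Because the index can be equal to tableLen, the table needs more than tableLen readable entries, which the 48 redundant entries provide.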