Retrieval Execution on the Device

Currently, only the standard form is supported, in which the retrieval service runs on the host. In some special scenarios, however, the retrieval service needs to run on the device. This section describes how to perform retrieval on the device.

Prerequisites

  • CANN has been installed in open-form mode and the /usr/local/AscendMiniOSRun/ folder exists. For details, see the CANN Software Installation Guide (Open Form, Atlas Inference Products).
  • The 50 MB memory limit of the SSH service has been removed so that all dependency files can be sent. For details, see "Enabling the SSH Service Using the DSMI API" in the CANN Software Installation Guide (Open Form, Atlas Inference Products).
  • The host must use the ARM architecture.
  • The device reserves 4 GB of memory for P2P, and this memory is unavailable by default. To use this memory and reach the maximum library capacity, run the npu-smi info set -t p2p-mem-cfg -i "id" -d "value" command to disable the copy into the chip's BAR space. For details about how to use the command, see "Enabling or Disabling the Copy into the BAR Space of a Chip" in the Atlas Center Inference Card 25.5.0 npu-smi Command Reference.
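
The two host-side prerequisites that can be checked mechanically (ARM architecture and the presence of the open-form runtime folder) can be verified before starting. The following is a minimal sketch, not part of the SDK; the function name check_prereqs is hypothetical, and it does not check the SSH limit or the P2P memory setting.

```shell
#!/bin/sh
# Hypothetical helper: preflight check for two of the prerequisites above.
# Usage: check_prereqs ARCH RUNTIME_DIR
#   ARCH         machine architecture string, for example the output of `uname -m`
#   RUNTIME_DIR  open-form runtime folder (/usr/local/AscendMiniOSRun in this guide)
check_prereqs() {
    arch="$1"
    runtime_dir="$2"
    rc=0

    # The host must use the ARM architecture.
    case "$arch" in
        aarch64|arm64) ;;
        *)
            echo "host architecture is $arch, expected ARM (aarch64)" >&2
            rc=1
            ;;
    esac

    # The open-form CANN installation creates this folder.
    if [ ! -d "$runtime_dir" ]; then
        echo "missing folder: $runtime_dir" >&2
        rc=1
    fi

    return "$rc"
}

# Example call on the host (commented out so the block only defines the helper):
# check_prereqs "$(uname -m)" /usr/local/AscendMiniOSRun && echo "prerequisites look OK"
```

The function returns a non-zero status if either check fails and prints the reason to standard error, so it can gate a deployment script.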

Procedure

  1. Generate the operator required by the algorithm. For details about the algorithm, see Algorithm Introduction.
  2. Transfer the following dependency libraries to the device:
    • OpenBLAS: /opt/OpenBLAS/lib
    • Faiss: /usr/local/faiss/faiss1.10.0/lib
    • Toolkit runtime .so files: /usr/local/AscendMiniOSRun/acllib/lib64 and /usr/local/AscendMiniOSRun/aarch64-linux/data
    • Retrieval .so: ${MX_INDEX_HOME}/mxIndex/host/lib, where ${MX_INDEX_HOME} is the installation directory of Index SDK.
    • libgfortran.so from the compiler on the host: /usr/lib/aarch64-linux-gnu/libgfortran.so*
    • Binary file compiled by the demo
    • latest/opp/version.info in the toolkit directory
    • Operator file: ${MX_INDEX_HOME}/modelpath/

      Ensure that the operator file contains only operators for Atlas inference products. Otherwise, running on the device may fail.

  3. Log in to the device and configure the following environment variables.
    # Configure environment variables.
    export LD_LIBRARY_PATH=./lib:./lib64:./
    # Configure the directory where the version.info file is located.
    export ASCEND_OPP_PATH=./
    
  4. Log in to the device and run the test case.
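
The transfer in step 2 can be prepared by collecting the files into one directory on the host and copying that directory to the device. The following is a minimal sketch under stated assumptions: stage_into is a hypothetical helper, and the bundle layout (lib/, lib64/, and version.info at the top level) is inferred from the LD_LIBRARY_PATH and ASCEND_OPP_PATH values in step 3 rather than prescribed by the SDK. This guide does not specify where the aarch64-linux/data contents, the demo binary, or the operator files should be placed, so they are left out of the example calls.

```shell
#!/bin/sh
# Hypothetical helper: copy each source into a destination directory.
# Usage: stage_into DEST SRC...
#   A directory SRC has its contents copied into DEST; a file SRC is copied as is.
#   Missing sources are reported and make the function return non-zero.
stage_into() {
    dest="$1"
    shift
    mkdir -p "$dest" || return 1
    missing=0
    for src in "$@"; do
        if [ -d "$src" ]; then
            # -a preserves the symbolic links commonly used for versioned .so files.
            cp -a "$src"/. "$dest"/
        elif [ -e "$src" ]; then
            cp -a "$src" "$dest"/
        else
            echo "missing dependency: $src" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example calls on the host (commented out; "bundle" is an assumed directory name):
# stage_into bundle/lib /opt/OpenBLAS/lib /usr/local/faiss/faiss1.10.0/lib \
#     "${MX_INDEX_HOME}/mxIndex/host/lib" /usr/lib/aarch64-linux-gnu/libgfortran.so*
# stage_into bundle/lib64 /usr/local/AscendMiniOSRun/acllib/lib64
# Then copy the bundle to the device, for example with scp -r, using your own
# device address and target directory.
```

With this layout, running the commands in step 3 from inside the bundle directory on the device makes the relative paths ./lib, ./lib64, and ./ resolve to the staged files.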