Deploying RAG SDK on a Physical Machine
This section describes how to deploy RAG SDK on a physical machine running ubuntu20.04-live-server or Huawei Cloud EulerOS 2.0, using the common user HwHiAiUser as an example.
- The required dependencies have been installed as described in Installing Dependencies.
- CANN and RAG SDK must be installed by the same user. A common user is recommended.
Preparing for Installation
- ubuntu20.04-live-server: Ensure that Python 3.11, libpq-dev, and CMake 3.24.3 or later have been installed. To install libpq-dev and Python, perform the following steps:
# Install libpq-dev (required by psycopg2).
apt install -y libpq-dev
# Set PY_VERSION to python3.11.
export PY_VERSION=python3.11
# Add Python ppa.
add-apt-repository -y ppa:deadsnakes/ppa && apt-get update
# Install Python.
apt-get install -y --no-install-recommends $PY_VERSION $PY_VERSION-dev $PY_VERSION-distutils $PY_VERSION-venv
# Set the default Python.
ln -sf /usr/bin/$PY_VERSION /usr/bin/python3
ln -sf /usr/bin/$PY_VERSION /usr/bin/python
# Install pip.
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python3 get-pip.py
python3 -m pip install --upgrade setuptools
- Huawei Cloud EulerOS 2.0: Perform the following steps to install Python and dependencies.
# Ensure that the Yum repository of the system contains Python 3.11. For details, see Configuring an HCE Repository and Repo Configuration on openEuler.
yum update
yum install python3.11
yum install cmake swig postgresql-devel patch mesa-libGL
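The CMake requirement stated above (3.24.3 or later) can be checked before continuing. The following is a minimal sketch, not part of the official procedure: version_ge is a hypothetical helper name, and it assumes the GNU coreutils sort command with the -V (version sort) option is available.

```shell
# version_ge A B: succeed when version A is greater than or equal to version B.
# Relies on the GNU coreutils "sort -V" (version sort) extension.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Compare the installed CMake (if any) against the documented minimum.
if command -v cmake >/dev/null 2>&1; then
    # The first line of "cmake --version" has the form "cmake version X.Y.Z".
    cmake_version=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$cmake_version" 3.24.3; then
        echo "CMake $cmake_version meets the requirement."
    else
        echo "CMake $cmake_version is older than 3.24.3."
    fi
fi
```

A plain string comparison would rank 3.9.0 above 3.24.3, which is why the sketch uses version sort.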
Procedure
- Switch to the HwHiAiUser user and go to the /home/HwHiAiUser directory.
- Install torch and torch-npu.
# Install torch on x86.
pip3 install torch==2.1.0+cpu --index-url https://download.pytorch.org/whl/cpu
# Install torch on aarch64.
pip3 install torch==2.1.0
# Install torch-npu on both architectures.
pip3 install torch-npu==2.1.0.post12
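Because the torch command differs by architecture, the choice can be scripted. The sketch below only prints the matching command from this step. torch_install_cmd is a hypothetical helper name, and it assumes uname -m reports x86_64 or aarch64 on the supported machines.

```shell
# torch_install_cmd ARCH: print the torch install command from this step
# that matches the given machine architecture (as reported by "uname -m").
torch_install_cmd() {
    case "$1" in
        x86_64)
            echo "pip3 install torch==2.1.0+cpu --index-url https://download.pytorch.org/whl/cpu" ;;
        aarch64)
            echo "pip3 install torch==2.1.0" ;;
        *)
            echo "Unsupported architecture: $1" >&2
            return 1 ;;
    esac
}

# Print (do not run) the command for the current machine.
torch_install_cmd "$(uname -m)" || true
```

The torch-npu command is the same on both architectures, so it needs no selection logic.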
- Install torchvision-npu.
# Download the Torchvision Adapter code and go to the root directory of the plugin.
git clone https://gitee.com/ascend/vision.git vision_npu
cd vision_npu
git checkout v0.16.0-6.0.0
# Install the dependency library.
pip3 install -r requirement.txt
# Configure CANN environment variables.
source /home/HwHiAiUser/Ascend/ascend-toolkit/set_env.sh
# Compile the package.
python3 setup.py bdist_wheel
# Install the package.
cd dist
pip3 install torchvision_npu-*.whl
- Install OpenBLAS.
- Download the OpenBLAS v0.3.10 source package and decompress it.
wget https://github.com/xianyi/OpenBLAS/archive/v0.3.10.tar.gz -O OpenBLAS-0.3.10.tar.gz
tar -xf OpenBLAS-0.3.10.tar.gz
- Go to the OpenBLAS directory.
cd OpenBLAS-0.3.10
- Perform build and installation.
make FC=gfortran USE_OPENMP=1 -j
# Specify the installation path for a common user.
make PREFIX=/home/HwHiAiUser/OpenBLAS install
- Configure the environment variables of the library path.
vim ~/.bashrc
# Add the following information to the file:
export LD_LIBRARY_PATH=/home/HwHiAiUser/OpenBLAS/lib:$LD_LIBRARY_PATH
- Check whether the installation is successful.
cat /home/HwHiAiUser/OpenBLAS/lib/cmake/openblas/OpenBLASConfigVersion.cmake | grep 'PACKAGE_VERSION "'
If the correct version information is displayed, the installation is successful.
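As an alternative to editing ~/.bashrc by hand in the environment variable step, the export line can be appended from a script. The following sketch uses a hypothetical helper, append_once, which adds a line only if an identical line is not already in the file, so that rerunning the step does not duplicate the entry.

```shell
# append_once FILE LINE: append LINE to FILE unless an identical line already exists.
append_once() {
    touch "$1"
    # -x matches whole lines only, -F treats LINE as a fixed string, -q suppresses output.
    if ! grep -qxF -- "$2" "$1"; then
        printf '%s\n' "$2" >> "$1"
    fi
}

# Example call (commented out so that nothing is modified by accident):
# append_once ~/.bashrc 'export LD_LIBRARY_PATH=/home/HwHiAiUser/OpenBLAS/lib:$LD_LIBRARY_PATH'
```

The single quotes in the example keep $LD_LIBRARY_PATH unexpanded, so the variable is resolved each time ~/.bashrc is loaded rather than when the line is written.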
- Download the Faiss source code, build the Faiss wheel package, and install it.
Faiss is also installed when Index SDK is installed. However, only libfaiss.so is generated after compilation. You need to build and install the Faiss wheel package so that Faiss can be used in Python.
- Download the Faiss source package and decompress it.
# faiss 1.10.0
wget https://github.com/facebookresearch/faiss/archive/v1.10.0.tar.gz
tar -xf v1.10.0.tar.gz && cd faiss-1.10.0/faiss
- Create the install_faiss.sh script.
vi install_faiss.sh
- Add the following content to the install_faiss.sh script.
export FAISS_INSTALL_PATH=/usr/local/faiss/faiss1.10.0
# Depending on the OS, the Faiss library path may be ${FAISS_INSTALL_PATH}/lib or ${FAISS_INSTALL_PATH}/lib64.
export FAISS_INSTALL_PATH_LIB=${FAISS_INSTALL_PATH}/lib
mkdir -p ${FAISS_INSTALL_PATH}
sed -i "149 i virtual void search_with_filter (idx_t n, const float *x, idx_t k, float *distances, idx_t *labels, const void *mask = nullptr) const{}" Index.h
sed -i "49 i template <typename IndexT> IndexIDMapTemplate<IndexT>::IndexIDMapTemplate (IndexT *index, std::vector<idx_t> &ids): index (index), own_fields (false) { this->is_trained = index->is_trained; this->metric_type = index->metric_type; this->verbose = index->verbose; this->d = index->d; id_map = ids; }" IndexIDMap.cpp
sed -i "30 i explicit IndexIDMapTemplate (IndexT *index, std::vector<idx_t> &ids);" IndexIDMap.h
sed -i "217 i utils/sorting.h" CMakeLists.txt
cd .. && cmake -B build . -DFAISS_ENABLE_GPU=OFF -DPython_EXECUTABLE=/usr/bin/python3 -DBUILD_TESTING=OFF -DBUILD_SHARED_LIBS=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=${FAISS_INSTALL_PATH}
make -C build -j faiss
make -C build -j swigfaiss
# If an error message indicates that the wheel package is missing, install it using pip.
cd build/faiss/python && python3 setup.py bdist_wheel
cd ../../.. && make -C build install
cd build/faiss/python && cp libfaiss_python_callbacks.so ${FAISS_INSTALL_PATH_LIB}/
cd dist
# This operation may update NumPy to 2.x.x. You need to roll back NumPy to 1.26.4.
pip3 install faiss-1.10.0*.whl
- Press Esc, type :wq!, and press Enter to save the changes and exit.
- Run the install_faiss.sh script to install Faiss.
bash install_faiss.sh
- If an error message indicates that the wheel package is missing, install it using pip.
- After Faiss is installed, NumPy may be updated to 2.x.x. You need to roll back NumPy to 1.26.4.
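The install_faiss.sh script notes that the Faiss libraries may land in either lib or lib64, depending on the OS. The sketch below is one way to pick the directory after make -C build install has run. faiss_lib_dir is a hypothetical helper name, and the check assumes the shared library is named libfaiss.so, as described at the start of this step.

```shell
# faiss_lib_dir PREFIX: print PREFIX/lib64 if it contains libfaiss.so,
# otherwise fall back to PREFIX/lib.
faiss_lib_dir() {
    if [ -e "$1/lib64/libfaiss.so" ]; then
        echo "$1/lib64"
    else
        echo "$1/lib"
    fi
}

# Example call, to be used once Faiss has been installed under FAISS_INSTALL_PATH:
# export FAISS_INSTALL_PATH_LIB=$(faiss_lib_dir "$FAISS_INSTALL_PATH")
```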
- Install Index SDK.
- Grant the execute permission on the software package.
chmod +x Ascend-mindxsdk-mxindex_{version}_linux-{arch}.run
- Check the consistency and integrity of the software package.
./Ascend-mindxsdk-mxindex_{version}_linux-{arch}.run --check
If the following information is displayed, the software package has passed the verification:
Verifying archive integrity... 100% SHA256 checksums are OK. All good.
- Create an installation path for the software package.
- If you do not specify an installation path, the software is installed in the path where its package is located by default.
- If you need to specify an installation path, create it first. /home/HwHiAiUser/Ascend is used as an example.
mkdir -p /home/HwHiAiUser/Ascend
- Install Index SDK.
./Ascend-mindxsdk-mxindex_7.2.RC1_linux-aarch64.run --install --install-path=<Installation path> --platform=<npu_type>
If you need to specify the installation path, set <Installation path> to the path created in the previous step. If the message "Do you accept the EULA to install RAG SDK? [Y/N]" is displayed during the installation process, enter Y or y to agree to the EULA and continue installation. If you enter other characters, installation stops and the program exits.
If the following information is displayed, the software is successfully installed:
Uncompressing ASCEND MXINDEX RUN PACKAGE 100%
- Run the Index SDK script after Index SDK is installed.
cd <Installation path>/mxIndex/ops && ./custom_opp_{arch}.run
- Download and install AscendFaiss.
- Download the source package and decompress it.
wget https://gitee.com/ascend/mindsdk-referenceapps/repository/archive/master.zip
unzip master.zip && cd mindsdk-referenceapps-master/IndexSDK/faiss-python
- Create the install_ascendfaiss.sh script.
vi install_ascendfaiss.sh
- Add the following content to the install_ascendfaiss.sh script.
# Set the following environment variables:
export PY_VERSION=python3.11
export FAISS_INSTALL_PATH=/usr/local/faiss/faiss1.10.0
# Depending on the OS, the Faiss library path may be ${FAISS_INSTALL_PATH}/lib or ${FAISS_INSTALL_PATH}/lib64.
export FAISS_INSTALL_PATH_LIB=${FAISS_INSTALL_PATH}/lib
export INDEXSDK_INSTALL_PATH=/home/HwHiAiUser/Ascend/mxIndex
export PYTHON_HEADER=/usr/include/$PY_VERSION/
export ASCEND_INSTALL_PATH=/home/HwHiAiUser/Ascend/ascend-toolkit/latest
export DRIVER_INSTALL_PATH=/usr/local/Ascend/
export OPENBLAS_INSTALL_PATH=/home/HwHiAiUser/OpenBLAS
export NUMPY_INCLUDE=$(python3 -c "import numpy; print(numpy.get_include())")
swig -python -c++ -Doverride= -module swig_ascendfaiss -I${PYTHON_HEADER} -I${FAISS_INSTALL_PATH}/include -I${INDEXSDK_INSTALL_PATH}/include -DSWIGWORDSIZE64 -o swig_ascendfaiss.cpp swig_ascendfaiss.swig
g++ -std=c++11 -DFINTEGER=int -fopenmp -I/usr/local/include -I${ASCEND_INSTALL_PATH}/acllib/include -I${ASCEND_INSTALL_PATH}/runtime/include -fPIC -fstack-protector-all -Wall -Wreturn-type -D_FORTIFY_SOURCE=2 -g -O3 -Wall -Wextra -I${PYTHON_HEADER} -I${NUMPY_INCLUDE} -I${FAISS_INSTALL_PATH}/include -I${INDEXSDK_INSTALL_PATH}/include -c swig_ascendfaiss.cpp -o swig_ascendfaiss.o
g++ -std=c++11 -shared -fopenmp -L${ASCEND_INSTALL_PATH}/lib64 -L${ASCEND_INSTALL_PATH}/acllib/lib64 -L${ASCEND_INSTALL_PATH}/runtime/lib64 -L${DRIVER_INSTALL_PATH}/driver/lib64 -L${DRIVER_INSTALL_PATH}/driver/lib64/common -L${DRIVER_INSTALL_PATH}/driver/lib64/driver -L${FAISS_INSTALL_PATH_LIB} -L${INDEXSDK_INSTALL_PATH}/lib -Wl,-rpath-link=${ASCEND_INSTALL_PATH}/acllib/lib64:${ASCEND_INSTALL_PATH}/runtime/lib64:${DRIVER_INSTALL_PATH}/driver/lib64:${DRIVER_INSTALL_PATH}/driver/lib64/common:${DRIVER_INSTALL_PATH}/driver/lib64/driver -L/usr/local/lib -Wl,-z,relro -Wl,-z,now -Wl,-z,noexecstack -s -o _swig_ascendfaiss.so swig_ascendfaiss.o -L.. -lascendfaiss -lfaiss -lascend_hal -lc_sec
# If an error message indicates that the build package is missing, install it using pip.
python3 -m build
# This operation may update NumPy to 2.x.x. You need to roll back NumPy to 1.26.4.
cd dist && pip3 install ascendfaiss*.whl
export LD_LIBRARY_PATH=${INDEXSDK_INSTALL_PATH}/lib:${FAISS_INSTALL_PATH}/lib:$LD_LIBRARY_PATH
- Press Esc, type :wq!, and press Enter to save the changes and exit.
- Run the install_ascendfaiss.sh script to install AscendFaiss.
bash install_ascendfaiss.sh
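Both the Faiss and AscendFaiss wheel installations may upgrade NumPy to 2.x.x, which then has to be rolled back to 1.26.4. A minimal sketch of how this could be detected is shown below. numpy_needs_rollback is a hypothetical helper name, and the check on the installed version is skipped when python3 or NumPy is unavailable.

```shell
# numpy_needs_rollback VERSION: succeed when VERSION has a major version of 2 or higher.
numpy_needs_rollback() {
    major=${1%%.*}
    [ "$major" -ge 2 ] 2>/dev/null
}

# Check the installed NumPy (skipped when python3 or NumPy is unavailable).
if installed=$(python3 -c "import numpy; print(numpy.__version__)" 2>/dev/null); then
    if numpy_needs_rollback "$installed"; then
        echo "NumPy $installed found; roll back with: pip3 install numpy==1.26.4"
    else
        echo "NumPy $installed is in the 1.x series."
    fi
fi
```

The sketch only reports the situation. The rollback itself is left as a manual step so that the installed packages are not changed unexpectedly.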
- Install RAG SDK.
bash Ascend-mindxsdk-mxrag_<version>_linux-<arch>.run --install --install-path=<Installation path> --platform=<npu_type>
# Install third-party dependencies.
pip3 install rank_bm25==0.2.2 langchain-opengauss==0.1.5
# Install dependencies.
pip3 install -r <Installation path>/mxRag/requirements.txt
If the following information is displayed, the software is successfully installed:
Install package successfully
The --install command also supports the options listed in Table 1. If you enter an option that is not listed in the table, the installation may succeed or an error may be reported.
If the options queried by running the ./{run_file_name}.run --help command are not described in the following table, they are reserved or applicable to other processors. You can ignore them.
Table 1 Supported options of the installation package
Option
Meaning
--help | -h
Queries help information.
--info
Queries the build information of the software package.
--list
Queries the file list.
--check
Checks the consistency and integrity of the software package.
--quiet|-q
Enables silent mode to reduce the verbosity of human-machine interactions.
--nox11
Deprecated.
--noexec
Decompresses the software package to the current directory without running the installation script. It is used together with --extract=<path> in the format of --noexec --extract=<path>.
--extract=<path>
Decompresses the software package to a specified directory. It can be used with any of --noexec, --install, and --upgrade.
--tar arg1 [arg2 ...]
Runs the tar command on the software package. Use the options following tar as the command options. For example, the --tar xvf command indicates that the .run package will be decompressed to the current directory.
--version
Queries the version of the RAG SDK installation package.
--install
Installs the software package.
--install-path=<path>
(Optional) Customizes the root directory for installing the RAG SDK package. If it is not set, the directory where the current command is executed is used by default. The path must start with a slash (/) or tilde (~). The valid characters include "-_.0-9a-zA-Z/", and the path cannot contain consecutive dots (..). The length cannot exceed 1,024 characters.
If you do not specify the path, the default path is used.
- /usr/local/Ascend for the root user
- ${HOME}/Ascend for a non-root user
If this option is used to specify the installation directory, other users do not have the write permission on the directory. If a common user is specified for installation, the owner of the installation directory must be the current installation user.
--upgrade
Upgrades the software package, i.e., upgrades RAG SDK to the version contained in the installation package.
--platform
(Optional) Corresponds to the Ascend AI Processor type.
Run the npu-smi info command on the server where the Ascend AI Processor is installed, and then delete the last digit of Name. The obtained value is the value of --platform.
If the Atlas 800I A3 SuperPoD Server is used, the value is A3.
--whitelist
(Optional) Installs only the specified feature modules. The value can be operator or whl. To install multiple features, separate them with commas (,). If this option is not set, all features are installed.
operator: inference acceleration operator module.
whl: RAG SDK function module, including knowledge base management, vectorization, and cache.
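The constraints that Table 1 lists for --install-path can be checked before running the installer. The sketch below encodes only the rules as written in the table (leading slash or tilde, allowed characters, no consecutive dots, at most 1,024 characters). is_valid_install_path is a hypothetical helper name, and the installer's own validation may differ.

```shell
# is_valid_install_path PATH: succeed when PATH satisfies the --install-path
# rules listed in Table 1.
is_valid_install_path() {
    path=$1
    # Length limit: at most 1,024 characters.
    [ "${#path}" -le 1024 ] || return 1
    # Must start with a slash or a tilde.
    case "$path" in
        /*|"~"*) ;;
        *) return 1 ;;
    esac
    # Must not contain consecutive dots.
    case "$path" in
        *..*) return 1 ;;
    esac
    # After an optional leading tilde, only "-_.0-9a-zA-Z/" are allowed.
    rest=${path#"~"}
    if printf '%s' "$rest" | LC_ALL=C grep -q '[^_.0-9a-zA-Z/-]'; then
        return 1
    fi
    return 0
}

# Example with the path used in this section.
if is_valid_install_path /home/HwHiAiUser/Ascend; then
    echo "Path accepted."
fi
```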
The following options are not displayed in --help. Do not use them directly.
- --xwin: uses the xwin operating mode.
- --phase2: performs the second step.
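Table 1 describes the --platform value as the Name shown by npu-smi info with its last digit deleted. That rule can be sketched as a one-line string operation. platform_from_name is a hypothetical helper name, and the Name value 310P3 in the example is an assumed sample, not taken from this document.

```shell
# platform_from_name NAME: print NAME with one trailing digit removed.
# A name that does not end in a digit is printed unchanged.
platform_from_name() {
    printf '%s\n' "${1%[0-9]}"
}

# Assumed sample Name value; read the real one from the npu-smi info output.
platform_from_name 310P3
```

For the Atlas 800I A3 SuperPoD Server, Table 1 gives the value A3 directly, so no derivation is needed there.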
- Set the RAG SDK running environment variables.
- Use vim to open the ~/.bashrc file and add the following content to the end of the file:
export MX_INDEX_FINALIZE=0
export PY_VERSION=python3.11
export LOGURU_FORMAT='<green>{time:YYYY-MM-DD HH:mm:ss.SSS}</green> | <level>{level: <8}</level> | <cyan>{name}</cyan>:<cyan>{function}</cyan>:<cyan>{line}</cyan> - <level>{message!r}</level>'
export MX_INDEX_MODELPATH=/home/HwHiAiUser/Ascend/modelpath
# Set the Index SDK installation path. If the default path is not used, change the path as required.
export MX_INDEX_INSTALL_PATH=/home/HwHiAiUser/Ascend/mxIndex
export MX_INDEX_MULTITHREAD=1
export ASCEND_HOME=$HOME/Ascend/
# If Faiss was installed in a different path (for example, the FAISS_INSTALL_PATH set in install_faiss.sh), change the Faiss library path accordingly.
export LD_LIBRARY_PATH=/home/HwHiAiUser/Ascend/mxIndex/lib:/home/HwHiAiUser/faiss/faiss1.10.0/lib:$LD_LIBRARY_PATH
export PYTHONPATH=/home/HwHiAiUser/.local/lib/$PY_VERSION/site-packages/mx_rag/libs:$PYTHONPATH
export LD_PRELOAD=$(ls /home/HwHiAiUser/.local/lib/$PY_VERSION/site-packages/scikit_learn.libs/libgomp-*):$LD_PRELOAD
source /home/HwHiAiUser/Ascend/ascend-toolkit/set_env.sh
source /home/HwHiAiUser/Ascend/nnal/atb/set_env.sh
source /home/HwHiAiUser/Ascend/mxRag/script/set_env.sh
- Save the settings and exit. Run the following command for the environment to take effect:
source ~/.bashrc
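After sourcing ~/.bashrc, it can be useful to confirm that the variables are visible in the current shell. The following sketch lists any variable that is unset or empty. missing_env_vars is a hypothetical helper name, and the variable names are the ones exported in the ~/.bashrc content above.

```shell
# missing_env_vars NAME...: print each listed variable that is unset or empty;
# succeed only when none is missing.
missing_env_vars() {
    status=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
            status=1
        fi
    done
    return $status
}

# Variables taken from the ~/.bashrc content in this step.
missing_env_vars MX_INDEX_FINALIZE PY_VERSION MX_INDEX_MODELPATH \
    MX_INDEX_INSTALL_PATH MX_INDEX_MULTITHREAD ASCEND_HOME || true
```

No output means that all listed variables are set.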
The following error information may be displayed during RAG SDK installation:
ERROR: Cannot uninstall 'xxx'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
It indicates that the xxx module is an OS built-in component and cannot be directly upgraded. You can run the pip3 install -r requirements.txt --ignore-installed command to install the module.
Checking the Operating Environment
- Switch to the running user HwHiAiUser.
- Run the npu-smi info command to check whether the driver is properly mounted.
If the value of Health is OK, the current processor is healthy. The following information is only an example.
+--------------------------------------------------------------------------------------------------------+
| npu-smi 24.1.rc2 Version: 24.1.rc2 |
+-------------------------------+-----------------+------------------------------------------------------+
| NPU Name | Health | Power(W) Temp(C) Hugepages-Usage(page) |
| Chip Device | Bus-Id | AICore(%) Memory-Usage(MB) |
+===============================+=================+======================================================+
| 7 xxx | OK | NA 44 0 / 0 |
| 0 0 | 0000:83:00.0 | 0 1851 / 21527 |
+===============================+=================+======================================================+
| 8 xxx | OK | NA 44 0 / 0 |
| 0 1 | 0000:84:00.0 | 0 1852 / 21527 |
+===============================+=================+======================================================+
+-------------------------------+-----------------+------------------------------------------------------+
| NPU Chip | Process id | Process name | Process memory(MB) |
+===============================+=================+======================================================+
| No running processes found in NPU 7 |
+===============================+=================+======================================================+
| No running processes found in NPU 8 |
+===============================+=================+======================================================+
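The Health check can also be scripted. The sketch below is a heuristic that assumes the npu-smi info output keeps the layout of the example above: pipe-separated columns, with the Health value in the second column of each row that starts with an NPU ID. npu_health_ok is a hypothetical helper name, and other npu-smi versions may format the output differently.

```shell
# npu_health_ok: read "npu-smi info" output on stdin and succeed only when
# at least one NPU row is found and every Health value is OK.
npu_health_ok() {
    awk -F'|' '
        {
            npu = $2; health = $3
            gsub(/^ +| +$/, "", npu)
            gsub(/^ +| +$/, "", health)
        }
        # NPU rows start with a numeric ID and carry an alphabetic Health value;
        # Bus-Id and process rows are skipped because their second column is not alphabetic.
        npu ~ /^[0-9]+ / && health ~ /^[A-Za-z]+$/ {
            rows++
            if (health != "OK") bad++
        }
        END { exit (rows > 0 && bad == 0) ? 0 : 1 }
    '
}

# Run the check only where npu-smi is available.
if command -v npu-smi >/dev/null 2>&1; then
    if npu-smi info | npu_health_ok; then
        echo "All NPUs report Health OK."
    else
        echo "At least one NPU is not healthy, or no NPU was found."
    fi
fi
```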