Installing MindIE Turbo (Containerized)

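This section uses an Atlas 800I A2 inference server with openeuler24.03 as the base image as an example. It describes how to build MindIE Turbo image files for the Arm and x86 architectures and how to start a containerized MindIE Turbo deployment. Ensure that the server has a stable network connection.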

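The image build procedure differs by operating system architecture. The main difference is that on Arm, it is recommended to use Python, torch_npu, and torch packages generated with compilation optimization techniques. For details about these techniques, see the "Introduction to Compilation Optimization Techniques" section of the PyTorch Training Model Migration and Tuning Guide.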

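Prerequisites

  • The NPU driver and firmware are installed on the host. If they are not, see the "Selecting an Installation Scenario" section of the CANN Software Installation Guide, select the installation scenario as follows, and then install them according to the "Installing the NPU Driver and Firmware" section.
    • Installation mode: select "Install on a physical machine".
    • Operating system: select the operating system in use. For the operating systems supported by MindIE, see the "Installation Notes" section of the MindIE Installation Guide.
    • Service scenario: select "Training & Inference & Development and Debugging".
  • Docker (version 24.x.x or later) is installed on the host by the user. For installation instructions, see the official Docker installation documentation. Run the following command to check the current Docker version:
    docker --version
  • The server has a stable network connection. The build downloads several resources online, including the Python source code, build tools, and various dependencies, so an offline build is not possible.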
  • The environment used to build the image must meet the following requirements:
    Table 1 Docker versions

    Docker        Docker Compose
    >=24.x.x      >=2
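
The version requirements in Table 1 can be checked with a short script before building. The following is a minimal sketch and not part of the official procedure: the helper name version_ge is hypothetical, and it assumes a sort implementation that supports -V (version sort), such as GNU sort. It only reads the versions reported by docker version and docker compose version.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is greater than or equal to version $2.
# Assumes sort supports -V (version sort), as GNU sort does.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Query Docker only when the docker command is actually installed.
if command -v docker >/dev/null 2>&1; then
    docker_ver="$(docker version --format '{{.Client.Version}}' 2>/dev/null)"
    compose_ver="$(docker compose version --short 2>/dev/null)"
    compose_ver="${compose_ver#v}"   # drop a leading "v" if present

    if version_ge "${docker_ver:-0}" "24"; then
        echo "Docker ${docker_ver}: OK"
    else
        echo "Docker ${docker_ver:-not found}: 24.x.x or later is required"
    fi

    if version_ge "${compose_ver:-0}" "2"; then
        echo "Docker Compose ${compose_ver}: OK"
    else
        echo "Docker Compose ${compose_ver:-not found}: 2 or later is required"
    fi
fi
```

The comparison is done on version strings rather than on the major number alone, so the same helper works for both thresholds in the table.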

Procedure

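  • Build the MindIE Turbo image file for the Arm architecture
    1. Place the obtained Arm packages (the MindIE Turbo package, the CANN packages, the compile-optimized Python package, the compile-optimized torch_npu, and the compile-optimized torch) in a directory, for example "/home/package".
    2. In "/home/package", write the Dockerfile and docker-compose.yml. The directory structure is as follows: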
      .
      ├── Ascend-cann-kernels-910b_8.1.RC1_linux-aarch64.run
      ├── Ascend-cann-nnal_8.1.RC1_linux-aarch64.run
      ├── Ascend-cann-toolkit_8.1.RC1_linux-aarch64.run
      ├── Ascend-mindie-turbo_2.0.RC1_py311_linux_aarch64.tar.gz
      ├── Dockerfile
      ├── docker-compose.yml
      ├── libcrypto.so.1.1
      ├── libomp.so
      ├── libssl.so.1.1
      ├── py311_bisheng.tar.gz
      ├── torch-2.5.1-cp311-cp311-linux_aarch64.whl
      └── torch_npu-2.5.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
      1. Write the Dockerfile.

        This Dockerfile defaults to openeuler24.03, the Arm architecture, and Python 3.11. It is for reference only; modify it to suit your environment.

        FROM hub.oepkgs.net/openeuler/openeuler:24.03-lts AS base
        ENV LD_LIBRARY_PATH=/usr/local/Ascend/driver/lib64/driver:/usr/local/Ascend/driver/lib64/common:/usr/local/lib:$LD_LIBRARY_PATH
        ENV DEBIAN_FRONTEND=noninteractive
        ENV TZ=Asia/Shanghai
        RUN echo "sslverify=false" >>/etc/yum.conf && \
            yum update -y && \
            yum install -y git wget vim
        ###################################### 1. INSTALL Compile Optimized Python #########################
        COPY ./libcrypto.so.1.1 ./libomp.so ./libssl.so.1.1 ./py311_bisheng.tar.gz /tmp/
        RUN mv /tmp/*.so* /usr/local/lib && \
            tar -zxvf /tmp/py311_bisheng.*  -C /usr/local/  && \
            mv /usr/local/py311_bisheng /usr/local/python && \
            sed -i "1c#\!/usr/local/python/bin/python3.11" /usr/local/python/bin/pip3 && \
            sed -i "1c#\!/usr/local/python/bin/python3.11" /usr/local/python/bin/pip3.11 && \
            ln -sf  /usr/local/python/bin/python3  /usr/bin/python && \
            ln -sf  /usr/local/python/bin/python3  /usr/bin/python3 && \
            ln -sf  /usr/local/python/bin/pip3  /usr/bin/pip3 && \
            ln -sf  /usr/local/python/bin/pip3  /usr/bin/pip && \
            sed -i "1c#\!/usr/bin/python3.11" /usr/bin/yum && \
            echo "sslverify=false" >>/etc/yum.conf
          
        ENV PATH=/usr/bin:/usr/local/python/bin:$PATH
        ENV LANG=C.UTF-8
        # Add a pip mirror to avoid pip install timeouts
        RUN mkdir ~/.pip  && \
            echo "[global]" > ~/.pip/pip.conf && \
            echo "index-url = http://mirrors.aliyun.com/pypi/simple/" >> ~/.pip/pip.conf && \
            echo "trusted-host = mirrors.aliyun.com" >> ~/.pip/pip.conf && \
            pip3 install \
            --disable-pip-version-check \
            --no-cache-dir \
            --no-compile \
            'setuptools==65.5.1' \
            wheel 
        ################################################## 2. CANN builder ##################################################
        FROM base AS cann_installer
        ARG DEVICE
        ARG CANN_VERSION
        ARG ARCH
        # Toolkit envs
        ENV ASCEND_TOOLKIT_HOME=/usr/local/Ascend/ascend-toolkit/latest
        ENV LD_LIBRARY_PATH=${ASCEND_TOOLKIT_HOME}/lib64:${ASCEND_TOOLKIT_HOME}/lib64/plugin/opskernel:$LD_LIBRARY_PATH
        ENV LD_LIBRARY_PATH=${ASCEND_TOOLKIT_HOME}/lib64/plugin/nnengine:${ASCEND_TOOLKIT_HOME}/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/${ARCH}:$LD_LIBRARY_PATH
        ENV LD_LIBRARY_PATH=${ASCEND_TOOLKIT_HOME}/tools/aml/lib64:${ASCEND_TOOLKIT_HOME}/tools/aml/lib64/plugin:$LD_LIBRARY_PATH
        ENV PYTHONPATH=${ASCEND_TOOLKIT_HOME}/python/site-packages:${ASCEND_TOOLKIT_HOME}/opp/built-in/op_impl/ai_core/tbe:$PYTHONPATH
        ENV PATH=${ASCEND_TOOLKIT_HOME}/bin:${ASCEND_TOOLKIT_HOME}/compiler/ccec_compiler/bin:${ASCEND_TOOLKIT_HOME}/tools/ccec_compiler/bin:$PATH
        ENV ASCEND_AICPU_PATH=${ASCEND_TOOLKIT_HOME} \
            ASCEND_OPP_PATH=${ASCEND_TOOLKIT_HOME}/opp \
            TOOLCHAIN_HOME=${ASCEND_TOOLKIT_HOME}/toolkit \
            ASCEND_HOME_PATH=${ASCEND_TOOLKIT_HOME}
        COPY ./Ascend-cann-kernels-${DEVICE}_${CANN_VERSION}_linux-${ARCH}.run ./Ascend-cann-nnal_${CANN_VERSION}_linux-${ARCH}.run ./Ascend-cann-toolkit_${CANN_VERSION}_linux-${ARCH}.run /tmp/
        RUN chmod +x /tmp/*.run && \
            /tmp/Ascend-cann-toolkit_*linux-*.run --install -q && \
            /tmp/Ascend-cann-kernels-*_linux-*.run --install -q && \
            /tmp/Ascend-cann-nnal_*_linux-*.run --install  -q && \
            rm -rf /tmp/*
        ################################################## 3. Install CANN ##################################################
        FROM base AS cann
        # Re-declare ARCH: build arguments are scoped to the stage that declares them
        ARG ARCH
        # Toolkit envs
        ENV ASCEND_TOOLKIT_HOME=/usr/local/Ascend/ascend-toolkit/latest
        ENV LD_LIBRARY_PATH=${ASCEND_TOOLKIT_HOME}/lib64:${ASCEND_TOOLKIT_HOME}/lib64/plugin/opskernel:$LD_LIBRARY_PATH
        ENV LD_LIBRARY_PATH=${ASCEND_TOOLKIT_HOME}/lib64/plugin/nnengine:${ASCEND_TOOLKIT_HOME}/opp/built-in/op_impl/ai_core/tbe/op_tiling/lib/linux/${ARCH}:$LD_LIBRARY_PATH
        ENV LD_LIBRARY_PATH=${ASCEND_TOOLKIT_HOME}/tools/aml/lib64:${ASCEND_TOOLKIT_HOME}/tools/aml/lib64/plugin:$LD_LIBRARY_PATH
        ENV PYTHONPATH=${ASCEND_TOOLKIT_HOME}/python/site-packages:${ASCEND_TOOLKIT_HOME}/opp/built-in/op_impl/ai_core/tbe:$PYTHONPATH
        ENV PATH=${ASCEND_TOOLKIT_HOME}/bin:${ASCEND_TOOLKIT_HOME}/compiler/ccec_compiler/bin:${ASCEND_TOOLKIT_HOME}/tools/ccec_compiler/bin:$PATH
        ENV ASCEND_AICPU_PATH=${ASCEND_TOOLKIT_HOME} \
            ASCEND_OPP_PATH=${ASCEND_TOOLKIT_HOME}/opp \
            TOOLCHAIN_HOME=${ASCEND_TOOLKIT_HOME}/toolkit \
            ASCEND_HOME_PATH=${ASCEND_TOOLKIT_HOME}
        # NNAL envs
        ENV ATB_HOME_PATH=/usr/local/Ascend/nnal/atb/latest/atb/cxx_abi_0
        ENV LD_LIBRARY_PATH=${ATB_HOME_PATH}/lib:${ATB_HOME_PATH}/examples:${ATB_HOME_PATH}/tests/atbopstest:$LD_LIBRARY_PATH \
            PATH=${ATB_HOME_PATH}/bin:$PATH \
            ASDOPS_HOME_PATH=${ATB_HOME_PATH}
        # NNAL non-path envs
        ENV ATB_STREAM_SYNC_EVERY_KERNEL_ENABLE=0
        ENV ATB_STREAM_SYNC_EVERY_RUNNER_ENABLE=0
        ENV ATB_STREAM_SYNC_EVERY_OPERATION_ENABLE=0
        ENV ATB_OPSRUNNER_SETUP_CACHE_ENABLE=1
        ENV ATB_OPSRUNNER_KERNEL_CACHE_TYPE=3
        ENV ATB_OPSRUNNER_KERNEL_CACHE_LOCAL_COUNT=1
        ENV ATB_OPSRUNNER_KERNEL_CACHE_GLOABL_COUNT=5
        ENV ATB_OPSRUNNER_KERNEL_CACHE_TILING_SIZE=10240
        ENV ATB_WORKSPACE_MEM_ALLOC_ALG_TYPE=1
        ENV ATB_WORKSPACE_MEM_ALLOC_GLOBAL=0
        ENV ATB_COMPARE_TILING_EVERY_KERNEL=0
        ENV ATB_HOST_TILING_BUFFER_BLOCK_NUM=128
        ENV ATB_DEVICE_TILING_BUFFER_BLOCK_NUM=32
        ENV ATB_SHARE_MEMORY_NAME_SUFFIX=""
        ENV ATB_LAUNCH_KERNEL_WITH_TILING=1
        ENV ATB_MATMUL_SHUFFLE_K_ENABLE=1
        ENV ATB_RUNNER_POOL_SIZE=64
        ENV ASDOPS_HOME_PATH=${ATB_HOME_PATH}
        ENV ASDOPS_MATMUL_PP_FLAG=1
        ENV ASDOPS_LOG_LEVEL=ERROR
        ENV ASDOPS_LOG_TO_STDOUT=0
        ENV ASDOPS_LOG_TO_FILE=1
        ENV ASDOPS_LOG_TO_FILE_FLUSH=0
        ENV ASDOPS_LOG_TO_BOOST_TYPE=atb
        ENV ASDOPS_LOG_PATH=~
        ENV ASDOPS_TILING_PARSE_CACHE_DISABLE=0
        ENV LCCL_DETERMINISTIC=0
        ENV TASK_QUEUE_ENABLE=2
        ENV OMP_PROC_BIND=false
        COPY --from=cann_installer /usr/local/Ascend /usr/local/Ascend
        COPY --from=cann_installer /etc/Ascend /etc/Ascend
        ################################################## 4. Install vLLM  && vLLM_Ascend ##################################################
        FROM cann AS vllm
        RUN git config --global http.sslVerify false  && \
            git clone -b v0.7.3 https://github.com/vllm-project/vllm.git && \
            cp -r vllm /tmp && \
            cd /tmp/vllm && \
            pip install -r requirements-common.txt && \
            pip install -r requirements-build.txt && \
            VLLM_TARGET_DEVICE=empty pip install . 
        RUN git config --global http.sslVerify false  && \
            git clone -b v0.7.3-dev https://github.com/vllm-project/vllm-ascend.git && \
            cd vllm-ascend && \
            pip install .
        ############################################# 5. Install Torch_npu & Torch & MindieTurbo ################################
        FROM vllm AS turbo
        COPY ./Ascend-mindie-turbo_2.0.RC1_py311_linux_aarch64.tar.gz /tmp
        RUN cd /tmp && \
            tar -xzvf /tmp/Ascend-mindie-turbo_2.0.RC1_py311_linux_aarch64.tar.gz && \
            cd /tmp/Ascend-mindie-turbo_2.0.RC1_py311_linux_aarch64 && \
            pip install *.whl && \
            pip cache purge && \
            rm -rf /tmp/*
            
        COPY ./torch_npu-2.5.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl ./torch-2.5.1-cp311-cp311-linux_aarch64.whl /tmp/
        RUN pip install /tmp/torch_npu-*.whl --force-reinstall --no-deps && \
            pip install /tmp/torch-2.5.1*.whl --force-reinstall --no-deps && \
            pip install pandas gevent sacrebleu rouge_score pybind11 pytest && \
            pip cache purge && \
            rm -rf /tmp/*
      2. Write the docker-compose.yml file.
        services:
          mindie-turbo:
            build:
              context: .
              network: host
              dockerfile: Dockerfile
              target: turbo
              args:
                DEVICE: 910b
                ARCH: aarch64
                CANN_VERSION: 8.1.RC1
            image: mindie-turbo:800I-A2-py311-Openeuler24.03-aarch64
            container_name: mindie-turbo
            volumes:
              - /usr/local/Ascend/driver:/usr/local/Ascend/driver
              - /usr/local/bin/npu-smi:/usr/local/bin/npu-smi
              - /usr/local/dcmi:/usr/local/dcmi
              - /usr/local/sbin:/usr/local/sbin
              - /data:/data
            working_dir: /workspace
            entrypoint: /bin/bash
            tty: true
            privileged: true

        The "/data:/data" volume above is only an example; mount the directories you need.

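    3. In the "/home/package" directory, run the following command to build the image (this does not start the container):
      docker compose build
    4. Run the following command to start the container in the background.
      docker compose up -d
    5. Run the following command to enter the container. With the docker-compose.yml above, <container-name> is mindie-turbo.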
      docker exec -it <container-name> /bin/bash
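  • Build the MindIE Turbo image file for the x86 architecture
    On x86, the compile-optimized packages are not needed. Download only the x86 CANN packages and the MindIE Turbo package, and adapt the Dockerfile written in the Arm procedure above as described below.
    1. Place the obtained x86 MindIE Turbo package and CANN packages in a directory, for example "/home/package".
    2. In "/home/package", write the Dockerfile and docker-compose.yml. The directory structure is as follows: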
      .
      ├── Ascend-cann-kernels-910b_8.1.RC1_linux-x86_64.run
      ├── Ascend-cann-nnal_8.1.RC1_linux-x86_64.run
      ├── Ascend-cann-toolkit_8.1.RC1_linux-x86_64.run
      ├── Ascend-mindie-turbo_2.0.RC1_py311_linux_x86_64.tar.gz
      ├── Dockerfile
      └── docker-compose.yml
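    3. Because the compile-optimized Python is not needed, replace the "install Python" part (section 1, "INSTALL Compile Optimized Python") of the Arm Dockerfile with the following code block.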
      ########################################## Install python ###############################################################
      RUN echo "sslverify=false" >>/etc/yum.conf && \
          yum update -y && \
          yum --setopt=sslverify=false install -y python3-pip shadow-utils git wget vim util-linux findutils python3-devel pciutils && \
          ln -sf /usr/bin/pip3.11 /usr/bin/pip && \
          ln -sf /usr/bin/python3.11 /usr/bin/python && \
          sed -i "1c#\!/usr/bin/python3.11" /usr/bin/yum && \
          echo "sslverify=false" >>/etc/yum.conf && \
          pip install --no-cache-dir wheel
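    4. Because the compile-optimized torch and torch_npu are not needed, torch and torch_npu are installed directly by pip install while vLLM and vLLM Ascend are installed.
      Replace the "install vLLM and vLLM Ascend" part (section 4) of the Arm Dockerfile with the following code block.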
      ########################################## Install vLLM && vLLM_Ascend #######################################################
      FROM cann AS vllm
      
      RUN git config --global http.sslVerify false  && \
          git clone -b v0.7.3 https://github.com/vllm-project/vllm.git && \
          cp -r vllm /tmp && \
          cd /tmp/vllm && \
          pip install -r requirements-common.txt && \
          pip install -r requirements-build.txt && \
          pip install ray && \
          VLLM_TARGET_DEVICE="empty" python -m pip install . --extra-index-url https://download.pytorch.org/whl/cpu/ && \
          python -m pip uninstall -y triton
      
      RUN git config --global http.sslVerify false  && \
          git clone -b v0.7.3-dev https://github.com/vllm-project/vllm-ascend.git && \
          cd vllm-ascend && \
          python -m pip install . --extra-index-url https://download.pytorch.org/whl/cpu/

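      The rest of the procedure is the same as for the Arm build. Where the Arm Dockerfile and docker-compose.yml hard-code aarch64 (for example the ARCH build argument, the image tag, and the MindIE Turbo package file name), use the x86_64 names from the x86 directory listing, and omit the steps that copy and install the compile-optimized torch and torch_npu wheels.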
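
Because the Dockerfile copies every package by its exact file name, a missing or misnamed file only surfaces partway through docker compose build. The following is a minimal sketch of a pre-build check, not part of the official procedure: the function name missing_packages is hypothetical, and the file names are taken from the two directory listings in this section, so adjust them if your package versions differ.

```shell
#!/bin/sh
# Hypothetical helper: print each expected file that is missing from directory $1
# for architecture $2 (aarch64 or x86_64). Prints nothing when everything is present.
# Note: plain POSIX sh has no local variables, so dir, arch and f are global.
missing_packages() {
    dir="$1"
    arch="$2"
    # Files needed on both architectures.
    set -- \
        "Ascend-cann-kernels-910b_8.1.RC1_linux-${arch}.run" \
        "Ascend-cann-nnal_8.1.RC1_linux-${arch}.run" \
        "Ascend-cann-toolkit_8.1.RC1_linux-${arch}.run" \
        "Ascend-mindie-turbo_2.0.RC1_py311_linux_${arch}.tar.gz" \
        "Dockerfile" \
        "docker-compose.yml"
    # The Arm build additionally needs the compile-optimized packages.
    if [ "$arch" = "aarch64" ]; then
        set -- "$@" \
            "libcrypto.so.1.1" \
            "libomp.so" \
            "libssl.so.1.1" \
            "py311_bisheng.tar.gz" \
            "torch-2.5.1-cp311-cp311-linux_aarch64.whl" \
            "torch_npu-2.5.1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl"
    fi
    for f in "$@"; do
        [ -e "${dir}/${f}" ] || echo "$f"
    done
}

# Example call (not executed here):
#   missing_packages /home/package "$(uname -m)"
```

On these servers uname -m reports aarch64 or x86_64, which matches the suffixes used in the package file names, so the example call selects the right list for the host.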