Creating a Post-training Image for Reinforcement Learning (verl)
verl is a flexible, efficient, and production-ready reinforcement learning (RL) training framework designed for the post-training phase of large language models (LLMs). This section describes how to build a verl post-training image based on Ubuntu 20.04.
Obtaining Software Packages
Obtain the software packages of the corresponding OS and the Dockerfile and script files required for packaging the image by referring to Table 1.
| Software Package | Mandatory (Yes/No) | Description | How to Obtain |
|---|---|---|---|
| Kernels | Yes | CANN binary operator package. The value of arch can be aarch64 or x86_64. The following example uses 8.2.RC1. | NOTE: Obtain a software package that matches the server model. |
| CANN | Yes | CANN development kit, which is used to install ToolKit and NNAL. The following example uses 8.2.RC1. | NOTE: Obtain a software package that matches the server model. |
| get-pip.py | Yes | Required for installing the pip module. | `curl -k https://bootstrap.pypa.io/get-pip.py -o get-pip.py` |
| version.info | Yes | Driver version information file. | Copy the /usr/local/Ascend/driver/version.info file from the host. |
| ascend_install.info | Yes | Driver installation information file. | Copy the /etc/ascend_install.info file from the host. |
| vLLM | Yes | Inference engine (v0.9.1 branch) used in the example. | `git clone -b v0.9.1 https://github.com/vllm-project/vllm.git`<br>After the package is downloaded, change the torch version in vllm/requirements/build.txt to 2.5.1. |
| vllm-ascend | Yes | Adaptation plugin of vLLM on NPUs. Use commitid: 4014ad2a46e01c79fd8d98d6283404d0bc414dce. | `git clone -b v0.9.1-dev https://github.com/vllm-project/vllm-ascend.git`<br>`cd vllm-ascend`<br>`git checkout 4014ad2a46e01c79fd8d98d6283404d0bc414dce`<br>Then, change the torch-npu version in requirements.txt to 2.5.1.post1. |
| Megatron-LM | Yes | Megatron v0.12.1 is used as the training backend. | `git clone https://github.com/NVIDIA/Megatron-LM.git`<br>`cd Megatron-LM`<br>`git checkout core_v0.12.1` |
| MindSpeed | Yes | MindSpeed is used as the training backend. Use commitid: 1f13e6fdbfd701ea7e045c8d6bb2469fab9775a7. | `git clone https://gitcode.com/Ascend/MindSpeed.git`<br>`cd MindSpeed`<br>`git checkout 1f13e6fdbfd701ea7e045c8d6bb2469fab9775a7` |
| verl | Yes | Post-training framework. Use commitid: 02f4386ae89c9a25863dca0bb8b6e119b2f01385. | `git clone https://github.com/volcengine/verl.git`<br>`cd verl`<br>`git checkout 02f4386ae89c9a25863dca0bb8b6e119b2f01385` |
| rl-plugin | Yes | Adaptation plugin of verl on NPUs. Use commitid: 9a679fc3be95d162b78d42e9e3df569c30a89a5e. | `git clone https://gitcode.com/Ascend/MindSpeed-RL.git`<br>`cd MindSpeed-RL/rl-plugin`<br>`git checkout 9a679fc3be95d162b78d42e9e3df569c30a89a5e` |
| Dockerfile | Yes | Required for creating an image. | - |
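Table 1 asks for two manual edits to requirements files (torch in vllm/requirements/build.txt and torch-npu in vllm-ascend/requirements.txt). The following is a minimal sketch of how those edits could be scripted. The function name `pin_requirement` is hypothetical, and the sketch assumes that each package appears at the start of a line followed by a version operator; any environment marker after the version on that line would be dropped.

```shell
# Hypothetical helper: rewrite the version constraint of one package in a
# pip requirements file to an exact pin.
pin_requirement() {
    # $1 = requirements file, $2 = package name, $3 = version to pin
    file=$1 pkg=$2 version=$3
    tmp=$(mktemp) || return 1
    # The name must be followed directly by a version operator, so pinning
    # "torch" does not touch "torch-npu" or "torchvision".
    if sed -E "s/^${pkg}[[:space:]]*[<>=~!].*/${pkg}==${version}/" "$file" > "$tmp"; then
        cat "$tmp" > "$file"   # overwrite in place, keeping file permissions
    fi
    rm -f "$tmp"
}

# Usage, run from the directory that contains the cloned repositories:
# pin_requirement vllm/requirements/build.txt torch 2.5.1
# pin_requirement vllm-ascend/requirements.txt torch-npu 2.5.1.post1
```

Review the resulting files afterwards, because the actual line format in those repositories may differ from what this sketch assumes.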
To avoid using a software package that has been tampered with during transmission or storage, download its digital signature file for integrity check while downloading the software package.
After the software package is downloaded from the Support website, verify its PGP digital signature by referring to the OpenPGP Signature Verification Guide. If the software package fails the verification, do not use the software package, and contact Huawei technical support.
The verification is also required before the installation or update of the software package.
For carriers: https://support.huawei.com/carrier/digitalSignatureAction.
For enterprise customers: https://support.huawei.com/enterprise/en/tool/pgp-verify-TL1000000054.
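Once the packages are downloaded and verified, it can help to confirm that everything listed in Table 1 is present in the build directory before building. The following is a minimal sketch; the function name `check_build_context` is hypothetical, and the .run file names are taken from the aarch64 8.2.RC1 defaults in the example Dockerfile, so adjust them to the packages you actually downloaded.

```shell
# Hypothetical pre-build check: report any file or directory from Table 1
# that is missing from the build directory.
check_build_context() {
    # $1 = build directory (defaults to the current directory)
    dir=${1:-.}
    missing=0
    for name in Dockerfile get-pip.py version.info ascend_install.info \
                Ascend-cann-toolkit_8.2.RC1_linux-aarch64.run \
                Ascend-cann-nnal_8.2.RC1_linux-aarch64.run \
                Atlas-A3-cann-kernels_8.2.RC1_linux-aarch64.run; do
        [ -f "$dir/$name" ] || { echo "missing file: $name"; missing=1; }
    done
    for name in vllm vllm-ascend Megatron-LM MindSpeed verl MindSpeed-RL/rl-plugin; do
        [ -d "$dir/$name" ] || { echo "missing directory: $name"; missing=1; }
    done
    # Return 0 only if nothing is missing.
    return "$missing"
}

# Usage:
# check_build_context . && echo "build directory is complete"
```

The source repositories are checked as directories because the Dockerfile copies the whole build directory into the image with COPY.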
Procedure
- Prepare the required software packages on the host by referring to Table 1.
- Write the Dockerfile as follows.

      FROM ubuntu:20.04
      WORKDIR /root
      COPY . .

      ARG HOST_ASCEND_BASE=/usr/local/Ascend
      ARG TOOLKIT_PATH=/usr/local/Ascend/toolkit/latest
      ARG TOOLKIT=Ascend-cann-toolkit_8.2.RC1_linux-aarch64.run
      ARG NNAL=Ascend-cann-nnal_8.2.RC1_linux-aarch64.run
      ARG KERNEL=Atlas-A3-cann-kernels_8.2.RC1_linux-aarch64.run

      RUN echo "nameserver 114.114.114.114" > /etc/resolv.conf

      RUN echo "deb http://repo.huaweicloud.com/ubuntu-ports/ focal main restricted universe multiverse\n\
      deb http://repo.huaweicloud.com/ubuntu-ports/ focal-updates main restricted universe multiverse\n\
      deb http://repo.huaweicloud.com/ubuntu-ports/ focal-backports main restricted universe multiverse\n\
      deb http://ports.ubuntu.com/ubuntu-ports/ focal-security main restricted universe multiverse" > /etc/apt/sources.list

      RUN umask 0022 && apt update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends software-properties-common
      RUN umask 0022 && add-apt-repository ppa:deadsnakes/ppa && apt update && \
          apt autoremove -y python python3 && \
          apt install -y python3.10 python3.10-dev vim patch gcc g++ make cmake build-essential \
              libbz2-dev libreadline-dev wget curl llvm libncurses5-dev libncursesw5-dev xz-utils \
              tk-dev liblzma-dev m4 dos2unix libopenblas-dev git libjemalloc2 libomp-dev net-tools

      # Create Python soft links.
      RUN ln -s /usr/bin/python3.10 /usr/bin/python
      RUN unlink /usr/bin/python3
      RUN ln -s /usr/bin/python3.10 /usr/bin/python3
      RUN ln -s /usr/bin/python3.10-config /usr/bin/python-config
      RUN ln -s /usr/bin/python3.10-config /usr/bin/python3-config
      RUN umask 0022 && python get-pip.py

      # Configure the pip mirror.
      RUN mkdir -p ~/.pip \
      && echo '[global] \n\
      index-url=https://mirrors.huaweicloud.com/repository/pypi/simple\n\
      trusted-host=mirrors.huaweicloud.com' >> ~/.pip/pip.conf

      # Time zone
      RUN ln -sf /usr/share/zoneinfo/UTC /etc/localtime

      # Create the HwHiAiUser user and owner. Ensure that the UID and GID are the same as those on the physical machine to avoid ownerless files.
      # In the example, the user and corresponding group are automatically created, and the UID and GID are both 1000.
      RUN useradd -d /home/HwHiAiUser -u 1000 -m -s /bin/bash HwHiAiUser

      # Ascend package
      # Copy the /usr/local/Ascend/driver/version.info file on the host to the current directory before the build.
      RUN umask 0022 && \
          cp ascend_install.info /etc/ && \
          mkdir -p /usr/local/Ascend/driver/ && \
          cp version.info /usr/local/Ascend/driver/ && \
          chmod +x $TOOLKIT && \
          chmod +x $KERNEL && \
          chmod +x $NNAL

      RUN umask 0022 && ./$TOOLKIT --install-path=/usr/local/Ascend/ --install --quiet
      RUN umask 0022 && . /usr/local/Ascend/ascend-toolkit/set_env.sh && ./$KERNEL --install --quiet
      RUN umask 0022 && . /usr/local/Ascend/ascend-toolkit/set_env.sh && ./$NNAL --install --quiet

- Build the image. Note that the period (.) at the end of the command must not be omitted.

      docker build -t verl-train:v1 .

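The example Dockerfile hardcodes aarch64 package names in its TOOLKIT, NNAL, and KERNEL arguments, while Table 1 notes that the architecture can be aarch64 or x86_64. The following is a minimal sketch of how matching `--build-arg` values could be generated. The function name `cann_build_args` is hypothetical, and the x86_64 file names are an assumption that simply follows the aarch64 naming pattern; the kernels package prefix also depends on the server model, so check the names against the files you downloaded.

```shell
# Hypothetical helper: print one --build-arg option per line for the CANN
# packages, following the file name pattern used in the example Dockerfile.
cann_build_args() {
    # $1 = architecture (aarch64 or x86_64), $2 = CANN version (default 8.2.RC1)
    arch=$1
    version=${2:-8.2.RC1}
    case "$arch" in
        aarch64|x86_64) ;;
        *) echo "unsupported arch: $arch" >&2; return 1 ;;
    esac
    printf '%s\n' "--build-arg TOOLKIT=Ascend-cann-toolkit_${version}_linux-${arch}.run"
    printf '%s\n' "--build-arg NNAL=Ascend-cann-nnal_${version}_linux-${arch}.run"
    # Assumption: the Atlas-A3 prefix from the example applies to your server model.
    printf '%s\n' "--build-arg KERNEL=Atlas-A3-cann-kernels_${version}_linux-${arch}.run"
}

# Usage (the substitution is left unquoted on purpose so that each option
# becomes a separate argument):
# docker build $(cann_build_args "$(uname -m)") -t verl-train:v1 .
```

On x86_64, the apt sources in the example Dockerfile would also need to change, because the ubuntu-ports repositories serve only non-x86 architectures.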
- Start the container, and then install the required packages in it.

      docker run -it \
          -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
          verl-train:v1 /bin/bash

Run the following commands in the container:

    source /usr/local/Ascend/driver/bin/setenv.bash
    source /usr/local/Ascend/ascend-toolkit/set_env.sh
    source /usr/local/Ascend/nnal/atb/set_env.sh
    source /usr/local/Ascend/nnal/asdsip/set_env.sh

    # Install vLLM.
    cd vllm && \
    pip install -r requirements/build.txt -i https://mirrors.aliyun.com/pypi/simple/ && \
    pip install -r requirements/common.txt -i https://mirrors.aliyun.com/pypi/simple/ && \
    VLLM_TARGET_DEVICE=empty python setup.py develop && \
    cd ..

    # Install vllm-ascend.
    cd vllm-ascend && pip install -v -e . && cd ..

    # Install Megatron.
    cd Megatron-LM && git checkout core_v0.12.1 && pip install -e . && cd ..

    # Install MindSpeed.
    cd MindSpeed && pip install -e . && cd ..

    # Install verl.
    cd verl && pip install -e . && cd ..

    # Install the verl plugin.
    cd MindSpeed-RL/rl-plugin && pip install -v -e . && cd ..

- If an error message is displayed indicating that the CMake path of torch cannot be found during the installation of vllm-ascend, run the following command to specify CMAKE_PREFIX_PATH for installation:

      CMAKE_PREFIX_PATH=/usr/local/lib/python3.10/dist-packages/torch/share/cmake/Torch/ pip install -v -e .

- If an error message is displayed indicating that the README.md file cannot be found during the installation of verl, create a README.md file in MindSpeed-RL/rl-plugin. The content may be arbitrary.
- After the installation is complete, if the torch version is not 2.5.1 or the torchvision version is not 0.20.1, reinstall torch 2.5.1 and torchvision 0.20.1.
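The torch and torchvision version check described above can be scripted. The following is a minimal sketch with a hypothetical function name, `torch_fix_command`. It reads the output of `pip list --format=freeze` and prints a reinstall command only when either version differs from the expected one; a local build suffix such as `+cpu` is ignored when comparing.

```shell
# Hypothetical helper: read `pip list --format=freeze` output on stdin and
# print the reinstall command if torch or torchvision is not the expected version.
torch_fix_command() {
    awk -F'==' '
        $1 == "torch"       { torch = $2 }
        $1 == "torchvision" { vision = $2 }
        END {
            # Drop a local version suffix such as "+cpu" before comparing.
            sub(/\+.*/, "", torch)
            sub(/\+.*/, "", vision)
            if (torch != "2.5.1" || vision != "0.20.1")
                print "pip install torch==2.5.1 torchvision==0.20.1"
        }'
}

# Usage inside the container:
# pip list --format=freeze | torch_fix_command
```

The function only prints the command rather than running it, so you can review it before reinstalling.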
- Run the following commands in a new window to save the image.

      # Search for the container ID.
      docker ps | grep verl-train
      # Commit the container as the image. Replace <container_id> with the actual container ID.
      docker commit <container_id> verl-train:v1

  NOTE:
  To make the Dockerfile more secure, you can define a HEALTHCHECK instruction in it based on service requirements. The HEALTHCHECK [OPTIONS] CMD instruction runs a command inside the container to check the container running status.
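The `grep` lookup in the save step matches any line that contains verl-train, including other tags. The following is a minimal sketch of an exact match on the image name, using a hypothetical function `container_ids_for_image` that filters the output of `docker ps --format '{{.ID}} {{.Image}}'`.

```shell
# Hypothetical helper: read "<container id> <image>" lines on stdin and print
# the IDs of containers whose image name matches $1 exactly.
container_ids_for_image() {
    awk -v image="$1" '$2 == image { print $1 }'
}

# Usage on the host:
# docker ps --format '{{.ID}} {{.Image}}' | container_ids_for_image verl-train:v1
```

If more than one ID is printed, several containers were started from the same image, so pick the one in which the packages were installed before running `docker commit`.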
