Obtaining the Compilation Optimization Package

Configuring the compilation optimization environment is complex, and the full process takes a long time. To speed up deployment and get an out-of-the-box setup, you can instead download a pre-built, general-purpose compilation optimization package. It contains a compressed package of Python built with compilation optimization and the .whl installation packages of torch and torch_npu. Once decompressed, the Python package can be put into use by creating a soft link to it. The torch and torch_npu .whl packages are optimized using model data from typical scenarios, so their performance benefits generalize well. Table 1 lists where to obtain these packages and the runtime dependency SO files.
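The soft-link step mentioned above could be scripted roughly as follows. This is a minimal sketch, not the documented procedure: the archive name, the installation directory, the interpreter path inside the archive, and the link location are all placeholders that must be checked against the package you actually download.

```python
import os
import shutil
from pathlib import Path


def link_python(archive, install_dir, interpreter_rel, link_path):
    """Unpack the Python archive and point a soft link at its interpreter.

    All four arguments are caller-supplied placeholders; the real archive
    name and its internal layout are not specified in this document.
    """
    install_dir = Path(install_dir)
    install_dir.mkdir(parents=True, exist_ok=True)
    # unpack_archive infers the format (.zip, .tar.gz, ...) from the file name.
    shutil.unpack_archive(str(archive), str(install_dir))

    target = install_dir / interpreter_rel
    if not target.exists():
        raise FileNotFoundError(f"interpreter not found: {target}")

    link_path = Path(link_path)
    # Replace a stale link, but never overwrite a regular file.
    if link_path.is_symlink():
        link_path.unlink()
    elif link_path.exists():
        raise FileExistsError(f"refusing to overwrite: {link_path}")
    os.symlink(target.resolve(), link_path)
    return link_path
```

The function refuses to replace a regular file at the link location, so an existing system interpreter is not silently overwritten.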

Table 1 Obtaining software packages in the vLLM scenario

File                                                How to Obtain

Python package                                      https://repo.oepkgs.net/ascend/pytorch/vllm/python/

.whl installation packages of torch and torch_npu   https://repo.oepkgs.net/ascend/pytorch/vllm/torch/

Runtime dependency SO                               https://repo.oepkgs.net/ascend/pytorch/vllm/lib/
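The locations in Table 1 are directories rather than individual files, so the exact file names have to be looked up before downloading. The sketch below shows one way to do that. It assumes each URL returns a plain HTML index page whose entries are `<a href>` links, which this document does not confirm; the helper names `extract_files` and `list_repo` are illustrative only.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

# Directory URLs copied from Table 1.
REPOS = {
    "python": "https://repo.oepkgs.net/ascend/pytorch/vllm/python/",
    "torch": "https://repo.oepkgs.net/ascend/pytorch/vllm/torch/",
    "lib": "https://repo.oepkgs.net/ascend/pytorch/vllm/lib/",
}


class _LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_files(html, base_url):
    """Return absolute URLs of file entries found in an index page."""
    parser = _LinkCollector()
    parser.feed(html)
    files = []
    for href in parser.hrefs:
        # Skip sort links, fragments, the parent directory and subdirectories.
        if href.startswith(("?", "#")) or href.endswith("/"):
            continue
        files.append(urljoin(base_url, href))
    return files


def list_repo(name, timeout=30):
    """Fetch one Table 1 directory (assumed HTML index) and list its files."""
    base_url = REPOS[name]
    with urlopen(base_url, timeout=timeout) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    return extract_files(html, base_url)
```

For example, `list_repo("torch")` would return the file URLs found under the torch directory; the .whl entries among them can then be passed to `pip install`, which accepts a wheel URL as an argument.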