Open-Source Inference Engine
MindIE Turbo is a hardware adaptation plugin designed for efficient vLLM inference, enabling seamless interworking between NPUs and the vLLM framework. It accelerates LLM inference on NPUs, achieving higher throughput and lower latency.
MindIE Inference Engine
MindIE is a high-performance AI inference engine, enabling accelerated execution, debugging, tuning, and rapid migration. With its layered open architecture and unified interfaces, MindIE simplifies development while delivering peak performance through deeply optimized capabilities.
Customer-Developed Inference Engine
Customers can interconnect their inference engines with CANN through open APIs and acceleration libraries. This flexible architecture ensures high performance and stable deployment.
Development Resources
Installation Resources
Get Open-Source Inference Engine Resources
Use the Dockerfile to build an image and prepare the base environment required for models, including CANN, FrameworkPTAdapter, MindIE Turbo, and vLLM, for quick model inference. For details, refer to Set up using Docker.
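The layers such a Dockerfile assembles can be sketched roughly as follows. This is an illustrative outline only, not the project's actual Dockerfile: the base image, package names, and versions below are all assumptions (refer to Set up using Docker for the real file).

```dockerfile
# Illustrative sketch -- every image name, package name, and version here is an
# assumption; the actual Dockerfile ships with the open-source project.
FROM ubuntu:22.04

# 1. CANN: the NPU compute architecture layer. The real Dockerfile installs the
#    CANN toolkit package (or starts from a base image that already contains it).

# 2. Python stack: FrameworkPTAdapter (published on PyPI as torch-npu), vLLM,
#    and the MindIE Turbo plugin ("mindie-turbo" is an assumed package name).
RUN pip install torch-npu vllm mindie-turbo

CMD ["/bin/bash"]
```

Building this image and starting a container from it yields an environment where vLLM can run model inference on NPUs without further manual setup.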
Get MindIE Inference Engine Image
This image is pre-configured with the base environment required for model execution, including CANN, FrameworkPTAdapter, MindIE, and ATB Models, enabling rapid inference setup.
Models
View LLMs Supported by MindIE
LLMs and their versions supported by MindIE.
Supported:
DeepSeek
Qwen
LLaMA
ChatGLM
Baichuan
...



