Environment Setup

This section describes the deployment architecture of AMCT in different product forms.

Product Models

The deployment architecture varies with the product form. The following first describes the product forms and then the deployment architecture of each.

Taking the Ascend AI Processor as an example: if its PCIe port works as the master (root complex) and drives peripherals, this mode is referred to as the RC mode (Ascend RC in the following); if the PCIe port serves as a slave (endpoint), this mode is referred to as the EP mode (Ascend EP in the following).

  • The working modes of the Ascend AI Processor are as follows:
    • The Atlas 200/300/500 Inference Product supports both the EP and RC modes.
    • The Atlas Training Series Product supports only the EP mode.
  • The following products support the RC mode: Atlas 200 AI accelerator module and Atlas 200 DK.

    The CPUs of such products directly run the AI service software specified by the running user and connect to peripherals, such as network cameras, I2C sensors, and SPI displays, which act as secondary devices.

  • The following products support the EP mode:

    Atlas 200/300/500 Inference Product: Atlas 200 AI accelerator module, Atlas 300I inference card, Atlas 500 AI edge station, Atlas 500 Pro AI edge server, and Atlas 800 inference server

    Atlas Training Series Product: Atlas 800 training server and Atlas 300T training card

    In EP mode, the host acts as the master, the device acts as the slave, and the customer's AI applications run on the host. The product, acting as a device, connects to the host over the PCIe interface, and the host loads AI tasks onto the Ascend AI Processor on the device over this interface.

Figure 1 shows the products and architecture of the two modes.

The concepts of host and device are described as follows:

  • Host: an x86 server or an Arm server connected to the hardware powered by an Ascend AI Processor. It leverages the neural network (NN) compute capability provided by the Ascend AI Processor.
  • Device: a hardware backend powered by an Ascend AI Processor. It provides the server with the NN compute capability over the PCIe interface.
Figure 1 RC and EP scenarios

Environment Setup in Ascend EP Mode

Figure 2 shows the deployment architecture of AMCT. For details about the supported OSs, see Supported Operating Systems. Before running inference, use the ATC tool to convert the quantized model into an offline model adapted to the Ascend AI Processor.

Figure 2 Environment setup in Ascend EP mode
  1. Deploy AMCT on a server that meets the requirements and use it to compress the model.
  2. Use the ATC tool to convert the compressed model into an offline model adapted to the Ascend AI Processor (see the sketch after this list).
  3. Run inference with the .om offline model obtained in step 2 on the server powered by the Ascend AI Processor.
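
For illustration, the following minimal Python sketch drives step 2 by calling the ATC command line through subprocess. The model file name, output prefix, and --soc_version value are placeholders for this example; --framework=5 assumes the compressed model from step 1 is in ONNX format.

    import subprocess

    # Placeholder: quantized model produced by AMCT in step 1 (assumed ONNX here).
    quantized_model = "resnet50_quantized.onnx"
    output_prefix = "resnet50_quantized"      # ATC appends ".om" to this prefix

    # Step 2: convert the compressed model into an offline (.om) model with ATC.
    atc_cmd = [
        "atc",
        f"--model={quantized_model}",
        "--framework=5",                      # 5 = ONNX; use the code for your framework
        f"--output={output_prefix}",
        "--soc_version=Ascend310",            # example value; must match your Ascend AI Processor
    ]
    result = subprocess.run(atc_cmd, capture_output=True, text=True)
    print(result.stdout)
    result.check_returncode()                 # raises CalledProcessError if conversion failed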

Environment Setup in Ascend RC Mode

Figure 3 shows the environment setup for AMCT. Currently, only Ubuntu 20.04 (AArch64) and Ubuntu 18.04 (AArch64) are supported. For details, see Checking the OS Requirements and Environment. Before running inference, use the ATC tool to convert the quantized model into an offline model adapted to the Ascend AI Processor.
Figure 3 Environment setup in Ascend RC mode
  1. Install AMCT on an Ubuntu (AArch64) server to perform model compression.
  2. Use the ATC tool to convert the compressed model into an offline model adapted to the Ascend AI Processor.
  3. Run inference with the .om offline model obtained in step 2 on the server powered by the Ascend AI Processor (a minimal sketch follows).
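
The following minimal sketch of step 3 assumes the pyACL binding (the acl Python module shipped with CANN) is available on the server. It only initializes the runtime and loads the converted .om model to confirm that it is usable on the device; input/output buffer setup and model execution are omitted, and the file name is a placeholder.

    import acl

    DEVICE_ID = 0
    OM_MODEL = "resnet50_quantized.om"   # placeholder: .om file produced by ATC in step 2

    # Initialize the ACL runtime and bind the target device.
    assert acl.init() == 0, "acl.init failed"
    assert acl.rt.set_device(DEVICE_ID) == 0, "set_device failed"
    context, ret = acl.rt.create_context(DEVICE_ID)
    assert ret == 0, "create_context failed"

    # Load the offline model onto the Ascend AI Processor.
    model_id, ret = acl.mdl.load_from_file(OM_MODEL)
    assert ret == 0, "model load failed"
    print("Loaded", OM_MODEL, "with model id", model_id)

    # ... build input/output datasets and call acl.mdl.execute(model_id, ...) here ...

    # Release resources in reverse order of acquisition.
    acl.mdl.unload(model_id)
    acl.rt.destroy_context(context)
    acl.rt.reset_device(DEVICE_ID)
    acl.finalize()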