Containers Exit Abnormally and Cannot Run

Symptom

After a containerized application is successfully deployed on FusionDirector, a container exits abnormally and cannot run.

Possible Causes

  1. The container configuration is incorrect. For example, required environment variables or dependencies are missing, the necessary permissions are not granted, or the NPU resource allocation fails.
  2. The image file does not match the system architecture.
  3. The application in the container is abnormal.

Procedure

  1. On FusionDirector, choose Menu > Devices > Device List, select Servers or Edge Devices based on the device type, and click Device Name or BMC IP in the device list. Click the Containerized Applications tab and check whether the container running status reports any error information.

    • If yes, rectify the fault based on the error information, deploy the container again, and check whether the container can run properly.
      • If yes, the fault is rectified.
      • If no, repeat this step until the container runs successfully or the error information can no longer be used to locate the fault.
    • If no, or the error information cannot be used to locate the fault, go to 2.
  2. On FusionDirector, choose Menu > Devices > Device List, select Servers or Edge Devices based on the device type, and click Device Name or BMC IP in the device list. Click Current Alarms and check whether any alarm related to NPU resource allocation exists.
    • If yes, log in to the CLI of the device and run the docker inspect $(docker ps -q) | grep Devices -n3 command to view the Devices information of each running container. Check whether the number of containers whose Devices information is not empty exceeds the number of NPUs. If it does, delete unnecessary NPU containers and deploy the containers again.
    • If no, go to 3.
  3. Log in to the CLI of the device, run the docker ps -a command to view all container IDs, and run the docker logs {containerID} command to check whether the error message Exec format error is displayed for the abnormal container.
    • If yes, the image file does not match the system architecture. Use an image file whose architecture corresponds to that of the device to deploy the container again.
    • If no, go to 4.
  4. Log in to the CLI of the device, run the docker ps -a command to check the ID of the abnormal container, and run the docker logs {containerID} command to check whether any other error information is displayed for the abnormal container.
    • If yes, rectify the fault based on the error information, deploy the container again, and check whether the container can run properly.
      • If yes, the fault is rectified.
      • If no, repeat this step until the container runs successfully or no error information is displayed.
    • If no, go to 5.
  5. Collect log files generated by service applications in the container and check whether error information exists.
    • If yes, rectify the fault based on the error information, deploy the container again, and check whether the container can run properly.
      • If yes, the fault is rectified.
      • If no, repeat this step until the container runs successfully or no error information is displayed.
    • If no, contact maintenance personnel.
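
The comparison in step 2 (containers with non-empty Devices information versus the number of NPUs) can be tedious to do by eye when many containers are running. The following is a minimal Python sketch of that count, not part of FusionDirector. It assumes that docker inspect prints a JSON array and that each entry carries its device mappings under HostConfig.Devices. The function names are hypothetical, and the NPU count must be supplied by the operator because this procedure does not state how to query it. Like the grep command in step 2, the sketch treats every container with a mapped device as an NPU container.

```python
import json
import subprocess
from typing import List


def containers_with_devices(inspect_json: str) -> List[str]:
    """Return the names of containers whose HostConfig.Devices list is not empty."""
    names = []
    for entry in json.loads(inspect_json):
        # HostConfig.Devices is null or [] when no host device is mapped.
        devices = (entry.get("HostConfig") or {}).get("Devices") or []
        if devices:
            names.append(str(entry.get("Name", "")).lstrip("/"))
    return names


def npu_oversubscribed(inspect_json: str, npu_count: int) -> bool:
    """Return True if more containers map devices than there are NPUs."""
    return len(containers_with_devices(inspect_json)) > npu_count


def inspect_running_containers() -> str:
    """Run docker inspect on all running containers and return the raw JSON text."""
    ids = subprocess.run(
        ["docker", "ps", "-q"], capture_output=True, text=True, check=True
    ).stdout.split()
    if not ids:
        return "[]"  # docker inspect fails when it is called without arguments
    return subprocess.run(
        ["docker", "inspect", *ids], capture_output=True, text=True, check=True
    ).stdout
```

On the device, npu_oversubscribed(inspect_running_containers(), npu_count) returns True when unnecessary NPU containers need to be deleted, and containers_with_devices() lists the candidates by name.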
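
Steps 3 and 4 both start by finding the abnormal container in the docker ps -a output. As a sketch of one way to narrow that list, the Python helper below parses the output of docker ps -a --format '{{json .}}' (one JSON object per line) and keeps only the containers whose Status begins with "Exited (code)". The function name and the returned field names are hypothetical. Containers in other abnormal states, such as Restarting, are not covered and still need to be checked manually.

```python
import json
import re
from typing import Dict, List

# Matches a Status value such as "Exited (1) 5 minutes ago".
_EXITED = re.compile(r"^Exited \((\d+)\)")


def exited_containers(ps_json_lines: str) -> List[Dict[str, object]]:
    """Return ID, name, and exit code for every exited container in the listing."""
    found = []
    for line in ps_json_lines.splitlines():
        if not line.strip():
            continue  # skip blank lines in the captured output
        row = json.loads(line)
        match = _EXITED.match(row.get("Status", ""))
        if match:
            found.append({
                "id": row.get("ID"),
                "name": row.get("Names"),
                "exit_code": int(match.group(1)),
            })
    return found
```

Each returned id can then be passed to docker logs {containerID} as described in steps 3 and 4.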
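
The log check in step 3 can also be scripted. The Python sketch below captures the docker logs output of one container and looks for the Exec format error message without regard to letter case, because the exact capitalization and prefix of the message can vary. As an additional check that is not part of the procedure above, it reads the image architecture with docker image inspect and normalizes the machine name of the device to Docker's naming (for example, aarch64 to arm64) so that the two values can be compared. All function names are hypothetical.

```python
import subprocess

# Machine names as reported by uname -m, mapped to Docker architecture names.
_MACHINE_TO_DOCKER_ARCH = {
    "x86_64": "amd64",
    "amd64": "amd64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def has_exec_format_error(log_text: str) -> bool:
    """Return True if the log text contains the Exec format error message."""
    return "exec format error" in log_text.lower()


def docker_arch(machine: str) -> str:
    """Translate a machine name to Docker's name; unknown names pass through."""
    return _MACHINE_TO_DOCKER_ARCH.get(machine.lower(), machine.lower())


def container_logs(container_id: str) -> str:
    """Return the combined stdout and stderr output of docker logs."""
    result = subprocess.run(
        ["docker", "logs", container_id], capture_output=True, text=True
    )
    return result.stdout + result.stderr


def image_arch(image: str) -> str:
    """Return the architecture recorded in the image metadata."""
    result = subprocess.run(
        ["docker", "image", "inspect", "--format", "{{.Architecture}}", image],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

If has_exec_format_error(container_logs(container_id)) is True, or if image_arch(image) differs from docker_arch(platform.machine()) on the device (with the standard platform module imported), the image most likely does not match the system architecture and must be replaced as described in step 3.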