Unpacking

When you bring up a customer model, use the technology stack that the customer requires, typically TensorFlow or PyTorch. Select the software stack for unpacking, that is, the first out-of-the-box run of the model, according to the inference framework that the customer already uses.

If the customer uses TensorFlow, the model is usually delivered as a .pb file, and you can run it on the customer's software stack. For details, see the community demo (TensorFlow Inference Samples).
If the customer uses PyTorch, you can run inference with the TorchAir suite. For details, see the community demo (TorchAir Inference Samples).
If the customer uses the ecosystem route, inference goes through the Inductor + Triton flow.

After unpacking, you can take a first look at the performance baseline and compare it with the target. For further analysis, use the profiling methods described in the preceding sections of this document.
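
A first performance baseline can be taken with a simple timing loop before any profiling tool is involved. The following is a minimal sketch in Python that uses only the standard library. The names measure_latency, gap_to_target, run_once, and target_ms are hypothetical and do not come from any of the stacks above; run_once stands in for a single inference call on whichever route was selected and is assumed to return only after the output is available.

```python
import statistics
import time


def measure_latency(run_once, warmup=5, iters=50):
    """Time a zero-argument callable and return latency statistics in milliseconds."""
    for _ in range(warmup):
        run_once()  # warm-up runs absorb one-time costs such as compilation and caching
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Nearest-rank style index for the 99th percentile, clamped to the last sample.
    p99_index = min(len(samples) - 1, int(0.99 * len(samples)))
    return {
        "mean_ms": statistics.fmean(samples),
        "p50_ms": statistics.median(samples),
        "p99_ms": samples[p99_index],
    }


def gap_to_target(baseline_ms, target_ms):
    """Return baseline / target; a value above 1.0 means the target is not yet met."""
    return baseline_ms / target_ms


if __name__ == "__main__":
    # Hypothetical stand-in for one inference call; replace with the real model invocation.
    def run_once():
        sum(i * i for i in range(10_000))

    target_ms = 5.0  # hypothetical latency target agreed with the customer
    stats = measure_latency(run_once)
    print(stats)
    print(f"baseline / target = {gap_to_target(stats['mean_ms'], target_ms):.2f}")
```

Warm-up iterations are kept out of the statistics because the first runs of a compiled model usually include graph compilation and cache population, which would distort the baseline. If the real inference call is asynchronous, it should block until results are ready inside run_once; otherwise the loop measures only the launch time.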
