TensorFlow and PyTorch Performance Data Collection
TensorFlow does not provide a profiling API that can be called directly from a script, so performance data is collected by running the msprof command. Dynamic collection is generally used to limit the amount of data collected. For details, see "Dynamic Profiling" in the CANN Performance Tuning Tool User Guide.
Sample:
import npu_device
from npu_device.compat.v1.npu_init import *
import numpy as np
import tensorflow as tf
tf.compat.v1.disable_eager_execution()
session_config = tf.compat.v1.ConfigProto()
custom_op = session_config.graph_options.rewrite_options.custom_optimizers.add()
custom_op.name = "NpuOptimizer"
custom_op.parameter_map["graph_max_parallel_model_num"].i = 1
custom_op.parameter_map["aicore_num"].s = tf.compat.as_bytes("7|10")
session_config.graph_options.rewrite_options.remapping = RewriterConfig.OFF
left_shape = [1, 8000]
right_shape = [800, 1]
x = tf.compat.v1.placeholder(tf.int64, shape=left_shape)
y = tf.compat.v1.placeholder(tf.int64, shape=right_shape)
equal_ret = tf.math.equal(x, y)
inputs_x = np.random.rand(*left_shape)
inputs_x = inputs_x.astype(np.int64)
inputs_y = np.random.rand(*right_shape)
inputs_y = inputs_y.astype(np.int64)
with tf.compat.v1.Session(config=session_config) as sess:
    for i in range(100000):
        result = sess.run(equal_ret, feed_dict={x: inputs_x, y: inputs_y})
    print(result)
- Run inference in a loop. Once the model has been running inference for a while, obtain the PID of the running program. For example, if the PID is 9527, run the following command to start dynamic profiling:
msprof --dynamic=on --pid=9527 --output=/home/projects/output --model-execution=on --runtime-api=on --aicpu=on
> start
...
> stop
...
> quit
The time window for dynamic profiling starts when the start command is executed and ends when the stop command is executed.
- After the data is collected, you need to manually parse the data. Go to the directory (generally a directory with a timestamp) where the data is collected and run the following command to parse the data:
# Enable parsing and export the profiling result to the current directory.
msprof --parse=on --output=./
# Enable the export function and save the result in CSV format to the current directory.
msprof --export=on --output=. --summary-format=csv
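The dynamic profiling command in the first step can be assembled from a script once the PID is known. The following is a minimal sketch and not part of the msprof tooling: the helper name dynamic_profiling_command is hypothetical, and it uses only the options shown in the example command. Inside the inference script, os.getpid() returns the PID to pass in.

```python
import os
import shlex


def dynamic_profiling_command(pid, output_dir):
    """Return the msprof argument list from the example in this section.

    Hypothetical helper: it only reproduces the options shown above.
    """
    return [
        "msprof",
        "--dynamic=on",
        f"--pid={pid}",
        f"--output={output_dir}",
        "--model-execution=on",
        "--runtime-api=on",
        "--aicpu=on",
    ]


if __name__ == "__main__":
    # os.getpid() is the PID of the current process; in practice, use the
    # PID of the running inference program instead.
    cmd = dynamic_profiling_command(os.getpid(), "/home/projects/output")
    print(shlex.join(cmd))
```

The start, stop, and quit commands are still entered at the msprof prompt by hand; this sketch only builds the command line.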
The longer the collection time, the longer parsing takes. A collection time of 5s is generally used.
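The parsing step can also be scripted. The sketch below assumes that the timestamped directory is a direct child of the path passed to --output and that the most recently modified child is the one to parse. Both are assumptions about the output layout, and the helper names newest_subdirectory and parse_and_export are hypothetical. The msprof options are the ones shown in this section.

```python
import subprocess
from pathlib import Path


def newest_subdirectory(output_dir):
    """Return the most recently modified direct subdirectory, or None."""
    subdirs = [p for p in Path(output_dir).iterdir() if p.is_dir()]
    return max(subdirs, key=lambda p: p.stat().st_mtime, default=None)


def parse_and_export(collection_dir):
    """Run the parse and export commands from inside collection_dir."""
    commands = [
        ["msprof", "--parse=on", "--output=./"],
        ["msprof", "--export=on", "--output=.", "--summary-format=csv"],
    ]
    for cmd in commands:
        # check=True raises CalledProcessError at the first failing command.
        subprocess.run(cmd, cwd=collection_dir, check=True)
```

A typical call would be parse_and_export(newest_subdirectory("/home/projects/output")), after checking that the returned directory is not None.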