Collecting Profile Data Locally (by Calling the Profiler Class)

You can collect profile data for a specific part of a script by calling the Profiler class in the npu_bridge.profiler.profiler module. Only the operations executed within the scope of the Profiler class are profiled.

The following describes how to call the Profiler class to enable profile data collection.

  1. Import the Profiler class.
    from npu_bridge.npu_init import *
    
  2. Use the with statement to create a Profiler object, and place the operations to be profiled inside the with block.
    The following code snippet builds a graph containing the Add operator and executes it in a session. Because sess.run(add, ...) runs inside the Profiler scope, level-L1 profile data and the ratios of computing-related metrics (aic_metrics="ArithmeticUtilization") are collected. Because output_path is "./", the profile data is stored in the current script execution path.
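    # Assumes "import tensorflow as tf" (TensorFlow 1.x) and that session_config
    # (the session configuration) is defined earlier in the script.
    g = tf.Graph()
    with g.as_default():
      a = tf.placeholder(tf.int32, (None, None))
      b = tf.constant([[1, 2], [2, 3]], dtype=tf.int32, shape=(2, 2))
      c = tf.placeholder(tf.int32, (None, None))
      add = tf.add(a, b)

    with tf.Session(config=session_config, graph=g) as sess:
      # Only the operations run inside this with block are profiled.
      with profiler.Profiler(level="L1", aic_metrics="ArithmeticUtilization", output_path="./"):
        result = sess.run(add, feed_dict={a: [[-20, 2], [1, 3]], c: [[1], [-21]]})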
    

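    To collect profile data for specific steps only, place those steps inside their own Profiler scope. In the following example, the two iterations that run add1 are profiled with aic_metrics="PipeUtilization", and the four iterations that run add3 are profiled with aic_metrics="ArithmeticUtilization".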

    # Assumes "import tensorflow as tf" (TensorFlow 1.x) and that session_config
    # (the session configuration) is defined earlier in the script.
    g = tf.Graph()
    with g.as_default():
      a = tf.placeholder(tf.int32, (None, None))
      b = tf.constant([[1, 2], [2, 3]], dtype=tf.int32, shape=(2, 2))
      c = tf.placeholder(tf.int32, (None, None))
      d = tf.constant([[1, 2], [2, 3]], dtype=tf.int32, shape=(2, 2))
      add1 = tf.add(a, b)
      add2 = tf.add(c, d)
      add3 = tf.add(add1, add2)

    with tf.Session(config=session_config, graph=g) as sess:
      # First scope: profile the two steps that run add1.
      with profiler.Profiler(level="L1", aic_metrics="PipeUtilization", output_path="/home/test/profiling_data"):
        for i in range(2):
          result = sess.run(add1, feed_dict={a: [[-20], [1]]})
      # Second scope: profile the four steps that run add3.
      with profiler.Profiler(level="L1", aic_metrics="ArithmeticUtilization", output_path="/home/test/profiling_data"):
        for i in range(4):
          result = sess.run(add3, feed_dict={a: [[-20, 2], [1, 3]], c: [[1], [-21]]})
    

    For details about the constraints on the Profiler class, see "Constraints" in Profiler Constructor.
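The per-step scoping described above can also be driven from inside a loop. The following is a minimal sketch, not part of npu_bridge: profiling_scope, run_steps, and RecordingScope are hypothetical names introduced for illustration, and RecordingScope is only a stand-in for profiler.Profiler so that the sketch runs without an NPU. It assumes that opening a separate Profiler scope for each selected step is acceptable under the constraints referenced above.

```python
import contextlib


def profiling_scope(step, profiled_steps, make_profiler):
    """Return a profiler context for selected steps, and a no-op context otherwise.

    make_profiler is a zero-argument callable that builds a fresh context
    manager each time it is called (hypothetical helper, not an npu_bridge API).
    """
    if step in profiled_steps:
        return make_profiler()
    return contextlib.nullcontext()


class RecordingScope:
    """Stand-in for profiler.Profiler; it only records whether a scope is open."""

    active = False

    def __enter__(self):
        RecordingScope.active = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        RecordingScope.active = False
        return False  # do not swallow exceptions raised inside the scope


def run_steps(num_steps, profiled_steps, make_profiler, run_step):
    """Run num_steps steps, wrapping only the selected ones in a profiler scope."""
    for step in range(num_steps):
        with profiling_scope(step, profiled_steps, make_profiler):
            run_step(step)


if __name__ == "__main__":
    seen = []
    # Steps 1 and 3 run inside a scope; steps 0, 2, and 4 run outside any scope.
    run_steps(5, {1, 3}, RecordingScope,
              lambda s: seen.append((s, RecordingScope.active)))
    print(seen)
```

In a real script, make_profiler would be something like lambda: profiler.Profiler(level="L1", aic_metrics="PipeUtilization", output_path="/home/test/profiling_data"), and run_step would call sess.run for that step. Passing the factory in as an argument keeps the step-selection logic independent of the profiler, so it can be checked without NPU hardware.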