Profile Data Collection

You can collect profile data for performance analysis during graph loading and running. This section describes how to collect profile data.

Overview

This feature is not supported by the Atlas 200I/500 A2 inference products.

Profile data can be collected in two modes, as described in the following table.

Table 1 Profile data collection modes

No.       Collection Mode
------    ------------------------------------------------------------
Mode 1    Pass the following options to the GEInitializeV2 call:
            • ge.exec.profilingMode
            • ge.exec.profilingOptions
          This mode saves the profile data to the location set in the
          output parameter of ge.exec.profilingOptions.

Mode 2    To enable iteration tracing, pass ge.exec.profilingOptions
          to the GEInitializeV2 call. The required fields include
          training_trace, bp_point, and fp_point.
          This mode saves the profile data to the location set in the
          profiler_path parameter of aclgrphProfInit.
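As a rough illustration of the two rows in Table 1, the following sketch only assembles the option maps that would be handed to GEInitializeV2. It does not call GE, so it builds without the CANN headers. The helper names MakeMode1Options and MakeMode2Options are hypothetical, and the "1" switch value, the "on" value for training_trace, and the JSON layout of ge.exec.profilingOptions are assumptions that should be checked against the option reference for your CANN version.

```cpp
#include <map>
#include <string>

// Hypothetical helper for Mode 1: profiling is enabled through GE options only.
// Assumption: "1" turns profiling on and the options value is a JSON string.
std::map<std::string, std::string> MakeMode1Options(const std::string &outputDir) {
  std::map<std::string, std::string> options;
  options["ge.exec.profilingMode"] = "1";
  options["ge.exec.profilingOptions"] = "{\"output\":\"" + outputDir + "\"}";
  return options;
}

// Hypothetical helper for Mode 2: only the iteration tracing fields are set here,
// because the output location comes from aclgrphProfInit in this mode.
std::map<std::string, std::string> MakeMode2Options(const std::string &fpPoint,
                                                    const std::string &bpPoint) {
  std::map<std::string, std::string> options;
  options["ge.exec.profilingOptions"] =
      "{\"training_trace\":\"on\",\"fp_point\":\"" + fpPoint +
      "\",\"bp_point\":\"" + bpPoint + "\"}";
  return options;
}
```

The returned map would then be passed as the options argument of the initialization call, for example ge::GEInitializeV2(MakeMode1Options("/home/test/prof")).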

Configuration Before Profiling

Before profiling, obtain the sample by referring to the description in Sample Reference, build and run the graph sample, and then perform the following operations:

  1. Add #include "ge/ge_prof.h" at the beginning of the source code file main.cpp.
  2. Add the -lmsprofiler option to the LIBS line in the Makefile build script. Alternatively, add msprofiler to the target_link_libraries line in the CMakeLists.txt file.

The header file ge/ge_prof.h defines the Profiling configuration APIs. The corresponding library file is libmsprofiler.so. The header file is in the include/ directory under the CANN component directory, and the library file is in the lib64/ directory under the CANN component directory.

Collecting Profile Data Using Mode 2

This feature is a global feature rather than a session-level feature. The configuration takes effect on all sessions.

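Call the APIs in the following sequence: aclgrphProfInit, aclgrphProfCreateConfig, aclgrphProfStart, RunGraph (on the session), aclgrphProfStop, aclgrphProfDestroyConfig, and aclgrphProfFinalize. In the example, GEInitializeV2 is called before aclgrphProfInit, and GEFinalizeV2 is called after aclgrphProfFinalize.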

A call example is as follows.
  // Construct a graph. The details are omitted here.
  // ...

  // Initialize the GE.
  std::map<std::string, std::string> ge_options = {{"ge.socVersion", "xxx"}, {"ge.graphRunMode", "1"}};
  ge::GEInitializeV2(ge_options);

  // Initialize Profiling. The result path must be created in advance.
  std::string profilerResultPath = "/home/test/prof";
  uint32_t length = static_cast<uint32_t>(profilerResultPath.length());
  ge::Status ret = ge::aclgrphProfInit(profilerResultPath.c_str(), length);

  // Create a session and add a graph.
  std::map<std::string, std::string> options = {{"a", "b"}, {"c", "d"}};
  uint32_t graphId = 0;
  ge::Graph graph;
  ge::GeSession *session = new ge::GeSession(options);
  ret = session->AddGraph(graphId, graph);

  // Set sampling parameters for Profiling.
  uint32_t deviceid_list[1] = {0};
  uint32_t device_nums = 1;
  uint64_t data_type_config = ge::ProfDataTypeConfig::kProfTaskTime |
                              ge::ProfDataTypeConfig::kProfAiCoreMetrics |
                              ge::ProfDataTypeConfig::kProfAicpu |
                              ge::ProfDataTypeConfig::kProfTrainingTrace |
                              ge::ProfDataTypeConfig::kProfHccl |
                              ge::ProfDataTypeConfig::kProfL2cache;
  std::vector<ge::Tensor> inputs;
  std::vector<ge::Tensor> outputs;
  ge::ProfAicoreEvents *aicore_events = nullptr;
  ge::ProfilingAicoreMetrics aicore_metrics = ge::ProfilingAicoreMetrics::kAicoreArithmeticUtilization;
  ge::aclgrphProfConfig *pro_config = ge::aclgrphProfCreateConfig(deviceid_list, device_nums, aicore_metrics, aicore_events, data_type_config);

  ge::aclgrphProfStart(pro_config);             // Start profile data collection.

  session->RunGraph(graphId, inputs, outputs);  // Run the graph.

  ge::aclgrphProfStop(pro_config);              // Stop profile data collection.

  // Release resources in the reverse order of creation.
  ge::aclgrphProfDestroyConfig(pro_config);     // Destroy the Profiling configuration.

  ge::aclgrphProfFinalize();                    // End profiling.

  delete session;                               // Release the session.
  ge::GEFinalizeV2();                           // Release GE resources.