Weight Update

API Call Sequence

In the weight update scenario, call the following APIs in sequence to update weights dynamically during model execution after the model is built:

  1. Build and save the model in Ascend Graph mode. The model contains the inference graph, weight initialization graph, and weight update graph.

    Call aclgrphBundleBuildModel to build the model and aclgrphBundleSaveModel to save it. For details about the APIs, see aclgrphBundleBuildModel and aclgrphBundleSaveModel.

    The weight initialization graph is optional. Decide whether to include it based on your service scenario. Omitting it reduces the device memory required to load the model.

  2. Call aclmdlBundleLoadFromFile or aclmdlBundleLoadFromMem to load the model.
  3. Call aclmdlBundleGetModelId to obtain the IDs of the three graphs.
  4. Call the model execution API (for example, aclmdlExecute) to execute the weight initialization graph based on the weight initialization graph ID.
  5. To update a weight, first call aclmdlSetDatasetTensorDesc to set the tensor descriptions of the weight update graph inputs.
  6. Call the model execution API (for example, aclmdlExecute) to execute the weight update graph based on the weight update graph ID.
  7. Call the model execution API (for example, aclmdlExecute) to execute the inference graph based on the inference graph ID.
  8. After the inference is complete, call aclmdlBundleUnload to unload the model.

Sample Code

This section focuses on the code logic of model inference. For details about how to initialize and deinitialize AscendCL, see Initializing AscendCL. For details about how to allocate and destroy runtime resources, see Runtime Resource Allocation and Deallocation.

After each API call, add exception handling branches and record error logs and info logs. The following code sample shows only the key steps and is for reference only.

#include <vector>
#include "acl/acl.h"

// 1. Initialize resources.
aclInit(nullptr);
aclrtSetDevice(0);

// 2. Load the model built in Ascend Graph mode. The model contains the inference graph, weight initialization graph, and weight update graph. The model file bundle.om is used as an example.
uint32_t bundle_id = 0;
aclmdlBundleLoadFromFile("./bundle.om", &bundle_id);

// 3. Obtain the ID of each graph in the model.
size_t modelNum = 0;
aclmdlBundleGetModelNum(bundle_id, &modelNum);

// The input of aclgrphBundleBuildModel is three graphs, each at a fixed index.
uint32_t infer_id = 0;
aclmdlBundleGetModelId(bundle_id, 0, &infer_id);
uint32_t init_id = 0;
aclmdlBundleGetModelId(bundle_id, 1, &init_id);
uint32_t update_id = 0;
aclmdlBundleGetModelId(bundle_id, 2, &update_id);

// If the weight does not need to be updated, execute the weight initialization graph and inference graph.
// 4. Execute the weight initialization graph. For details about how to prepare the model input and output, see the sample code in other sections about inference features under "Model Inference".
aclmdlExecute(init_id, init_mdl_input, init_mdl_output);

// 5. Execute the inference graph. For details about how to prepare the model input and output, see the sample code in other sections about inference features under "Model Inference".
aclmdlExecute(infer_id, infer_mdl_input, infer_mdl_output);

// If a weight needs to be updated, update the weight before executing the inference graph.
// 6. Set the tensor descriptions of the weight update graph inputs.
// For a weight that does not need to be updated (index 0 in this example), pass an empty-tensor shape. The device memory of that input must still be valid.
size_t no_need_refresh_index = 0;
// A dimension of 0 in the dims array marks the tensor as empty.
std::vector<int64_t> empty_dims{0};
auto empty_desc = aclCreateTensorDesc(ACL_FLOAT, empty_dims.size(), empty_dims.data(), ACL_FORMAT_ND);
aclmdlSetDatasetTensorDesc(update_mdl_input, empty_desc, no_need_refresh_index);

// The following is an example of updating the weight at index 1.
size_t need_refresh_index = 1;
std::vector<int64_t> new_dims{1, 3, 224, 224};
auto new_desc = aclCreateTensorDesc(ACL_FLOAT, new_dims.size(), new_dims.data(), ACL_FORMAT_ND);
aclmdlSetDatasetTensorDesc(update_mdl_input, new_desc, need_refresh_index);

// 7. Execute the weight update graph. For details about how to prepare the model input and output, see the sample code in other sections about inference features under "Model Inference".
aclmdlExecute(update_id, update_mdl_input, update_mdl_output);

// 8. Execute the inference graph. For details about how to prepare the model input and output, see the sample code in other sections about inference features under "Model Inference".
aclmdlExecute(infer_id, infer_mdl_input, infer_mdl_output);

// 9. Unload the bundle model.
aclmdlBundleUnload(bundle_id);

// 10. Release runtime resources and deinitialize AscendCL.
aclrtResetDevice(0);
aclFinalize();
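
The sample above omits return value checks for brevity. The following is a minimal sketch of one way to add the exception handling branches and logs mentioned earlier. It is not part of AscendCL: Status, kSuccess, checkRet, Step, and runSteps are hypothetical names used only for illustration, and Status stands in for aclError on the assumption that 0 indicates success (as with ACL_SUCCESS). The sketch does not include any AscendCL header so that it can be compiled on its own.

```cpp
#include <cstddef>
#include <cstdio>
#include <functional>
#include <vector>

// Stand-in for aclError. Assumption: 0 means success, as with ACL_SUCCESS.
using Status = int;
constexpr Status kSuccess = 0;

// Hypothetical helper: logs the outcome of one call and reports whether it succeeded.
bool checkRet(Status ret, const char* what) {
    if (ret != kSuccess) {
        std::fprintf(stderr, "[ERROR] %s failed, ret = %d\n", what, ret);
        return false;
    }
    std::fprintf(stdout, "[INFO] %s succeeded\n", what);
    return true;
}

// One named step. In real code, `call` would wrap an AscendCL call such as aclmdlExecute.
struct Step {
    const char* name;
    std::function<Status()> call;
};

// Runs the steps in order and stops at the first failure.
// Returns the number of steps that completed successfully.
std::size_t runSteps(const std::vector<Step>& steps) {
    std::size_t done = 0;
    for (const Step& step : steps) {
        if (!checkRet(step.call(), step.name)) {
            break;
        }
        ++done;
    }
    return done;
}
```

In this sketch, each step of the sequence (load, initialize weights, update weights, infer, unload) would be wrapped in a Step. Stopping at the first failure keeps later steps from running against a model that was not loaded or a weight that was not updated. Cleanup calls such as aclmdlBundleUnload would still need to run on the failure path.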