Weight Update
API Call Sequence
In the weight update scenario, after the model is built, call the following APIs in sequence to update weights dynamically at run time:
- Build and save the model in Ascend Graph mode. The model contains the inference graph, weight initialization graph, and weight update graph.
Call the aclgrphBundleBuildModel API to build the model and the aclgrphBundleSaveModel API to save it. For details about these APIs, see the Ascend Graph Developer Guide.
The weight initialization graph is optional. Decide whether to include it based on the service scenario. Omitting it reduces the device memory required to load the model.
- Call the bundle_load_from_file or bundle_load_from_mem API to load the model.
- Call the bundle_get_model_id API to obtain the IDs of the three graphs.
- Call the model execution API (for example, execute) to execute the weight initialization graph based on the weight initialization graph ID.
- Before updating a weight, call the set_dataset_tensor_desc API to set the tensor descriptions of the weight update graph inputs.
- Call the model execution API (for example, execute) to execute the weight update graph based on the weight update graph ID.
- Call the model execution API (for example, execute) to execute the inference graph based on the inference graph ID.
- After the inference is complete, call the bundle_unload API to unload the model.
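The graph ID lookup in the sequence above can be sketched as a small helper. This is a minimal sketch, not part of pyACL: get_bundle_graph_ids is a hypothetical name, the mdl argument is assumed to be acl.mdl, and both calls are assumed to return a (value, ret) pair with ret equal to 0 on success, as in the sample code in this section.

```python
def get_bundle_graph_ids(mdl, bundle_id):
    """Return the IDs of all graphs in a loaded bundle, in index order.

    `mdl` is assumed to be `acl.mdl`; it is passed in as a parameter so
    the helper can also be exercised with a stand-in object.
    """
    # Assumption: each call returns (value, ret), where ret == 0 means success.
    model_num, ret = mdl.bundle_get_model_num(bundle_id)
    if ret != 0:
        raise RuntimeError(f"bundle_get_model_num failed, ret={ret}")

    graph_ids = []
    for index in range(model_num):
        graph_id, ret = mdl.bundle_get_model_id(bundle_id, index)
        if ret != 0:
            raise RuntimeError(f"bundle_get_model_id({index}) failed, ret={ret}")
        graph_ids.append(graph_id)
    return graph_ids
```

Which index maps to which graph depends on the order in which the graphs were passed when the bundle was built, so the caller still has to know that order.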
Sample Code
The examples in this section focus on the code logic of model inference. For details about AscendCL initialization and deinitialization, see Initializing pyACL. For details about how to allocate and release runtime resources, see Runtime Resource Allocation and Deallocation.
After each API call, add an exception handling branch and record error and warning logs. The following snippet shows the key steps only and cannot be built or run as is.
import acl

# 1. Initialize resources.
ret = acl.init(config_path)
ret = acl.rt.set_device(device_id)

# 2. Load the model built in Ascend Graph mode. The model contains the inference graph,
#    weight initialization graph, and weight update graph. The model file bundle.om is used as an example.
bundle_id, ret = acl.mdl.bundle_load_from_file("./bundle.om")

# 3. Obtain the ID of each graph in the model.
model_num, ret = acl.mdl.bundle_get_model_num(bundle_id)
# The three graphs are passed to aclgrphBundleBuildModel in a fixed order, so their indexes are fixed.
infer_id, ret = acl.mdl.bundle_get_model_id(bundle_id, 0)
init_id, ret = acl.mdl.bundle_get_model_id(bundle_id, 1)
update_id, ret = acl.mdl.bundle_get_model_id(bundle_id, 2)

# If no weight needs to be updated, execute only the weight initialization graph and the inference graph.
# For details about how to prepare the model input and output in the steps below,
# see the sample code in other sections about inference features under "Model Inference".
# 4. Execute the weight initialization graph.
ret = acl.mdl.execute(init_id, init_mdl_input, init_mdl_output)
# 5. Execute the inference graph.
ret = acl.mdl.execute(infer_id, infer_mdl_input, infer_mdl_output)

# If a weight needs to be updated, update it before executing the inference graph.
# 6. Set the tensor descriptions of the weight update graph inputs.
# If a weight does not need to be updated (here, weight 0), its shape can be passed as an empty tensor,
# but the device memory must still be valid.
no_need_refresh_index = 0
dims = [0]  # If the elements in the dims array are 0, the tensor is empty.
tensor_desc = acl.create_tensor_desc(data_type, dims, format)
update_mdl_input, ret = acl.mdl.set_dataset_tensor_desc(update_mdl_input, tensor_desc, no_need_refresh_index)
# The following is an example of updating weight 1.
need_refresh_index = 1
dims = [1, 3, 224, 224]
tensor_desc = acl.create_tensor_desc(data_type, dims, format)
update_mdl_input, ret = acl.mdl.set_dataset_tensor_desc(update_mdl_input, tensor_desc, need_refresh_index)

# 7. Execute the weight update graph.
ret = acl.mdl.execute(update_id, update_mdl_input, update_mdl_output)
# 8. Execute the inference graph.
ret = acl.mdl.execute(infer_id, infer_mdl_input, infer_mdl_output)

# 9. Unload the bundle model.
ret = acl.mdl.bundle_unload(bundle_id)

# 10. Release resources.
ret = acl.rt.reset_device(device_id)
ret = acl.finalize()
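The sample leaves out the exception handling branch that this section asks for after each API call. One possible shape for it is sketched below. The names check_ret, ACL_SUCCESS, and the "weight_update" logger are hypothetical and not part of pyACL, and the sketch assumes that a return code of 0 means success.

```python
import logging

logger = logging.getLogger("weight_update")  # hypothetical logger name

ACL_SUCCESS = 0  # assumption: pyACL reports success as return code 0


def check_ret(api_name, ret):
    """Log an error and raise if a pyACL-style return code signals failure."""
    if ret != ACL_SUCCESS:
        logger.error("%s failed, ret=%d", api_name, ret)
        raise RuntimeError(f"{api_name} failed, ret={ret}")
```

In the sample, each call would then be followed by a check, for example check_ret("acl.mdl.execute", ret) after executing the weight update graph. Raising an exception lets the caller release the bundle and the device in one cleanup path.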