Weight Update

In the weight update scenario, you can call the following APIs to build a model whose weights can be dynamically updated during model execution, after the model has been built.

Overview

The following figure illustrates the API call sequence for building a graph with updatable weights into an offline model.

aclgrphBuildInitialize: initializes the system and allocates resources after a graph is defined.
aclgrphConvertToWeightRefreshableGraphs: generates graphs whose weights can be updated, including the weight initialization graph, weight update graph, and inference graph.
aclgrphBundleBuildModel: builds the graphs whose weights can be updated into an offline model adapted to the Ascend AI Processor. The built-in OPP and custom OPP are loaded during the build, and the resulting model is stored in the memory buffer.
aclgrphBundleSaveModel: serializes the offline model in the memory buffer to an .om file.
aclgrphBuildFinalize: ends the build process and releases the allocated resources.

The APIs for model building and saving can be called repeatedly in a process, enabling building and saving of multiple offline models.
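
The pattern described above, one initialize/finalize pair wrapping several build-and-save rounds, can be sketched as follows. This is a minimal sketch rather than GE code: `FakeInitialize`, `FakeBuildAndSave`, and `FakeFinalize` are hypothetical stand-ins for aclgrphBuildInitialize, the aclgrphBundleBuildModel/aclgrphBundleSaveModel pair, and aclgrphBuildFinalize, so that the control flow compiles without the Ascend toolkit. The `FinalizeGuard` class is likewise an assumption and not part of the GE API; it only ensures that finalization runs once on every exit path.

```cpp
#include <cassert>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-ins for the GE build APIs; they only record calls.
static int g_finalize_calls = 0;
static std::vector<std::string> g_saved_models;

bool FakeInitialize() { return true; }           // stands in for aclgrphBuildInitialize
void FakeFinalize() { ++g_finalize_calls; }      // stands in for aclgrphBuildFinalize
bool FakeBuildAndSave(const std::string &path) { // stands in for build + save
    if (path.empty()) {
        return false;                            // simulate a failed build
    }
    g_saved_models.push_back(path);
    return true;
}

// Calls the finalize stand-in exactly once when the enclosing scope is left.
class FinalizeGuard {
public:
    FinalizeGuard() = default;
    FinalizeGuard(const FinalizeGuard &) = delete;
    FinalizeGuard &operator=(const FinalizeGuard &) = delete;
    ~FinalizeGuard() { FakeFinalize(); }
};

// Builds and saves every model in `paths` inside one initialize/finalize pair.
// Returns 0 on success and -1 on the first failure.
int BuildAll(const std::vector<std::string> &paths) {
    if (!FakeInitialize()) {
        return -1;
    }
    FinalizeGuard guard;  // finalization now runs on every return below
    for (const auto &path : paths) {
        if (!FakeBuildAndSave(path)) {
            std::cout << "build and save failed" << std::endl;
            return -1;
        }
    }
    return 0;
}
```

Calling `BuildAll({"./model_a", "./model_b"})` in this sketch saves two models and finalizes once. With the real APIs, the same structure would replace the repeated finalize calls on each error path, provided the guard is created only after initialization succeeds.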

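Procedure

// Included header files:
#include <iostream>
#include <map>
#include <vector>
#include "ge_ir_build.h"
#include "ge_api_types.h"

// The sample refers to GE types such as Graph and AscendString without the ge:: prefix.
using namespace ge;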

// 1. Generate a graph. GenGraph is not declared in the headers above; it stands for your own graph construction code.
Graph origin_graph("Irorigin_graph");
GenGraph(origin_graph);

// 2. After the graph is created, call aclgrphBuildInitialize to initialize the system and allocate resources.
std::map<AscendString, AscendString> global_options;
auto status = aclgrphBuildInitialize(global_options);
if (status != GRAPH_SUCCESS) {
    std::cout << "aclgrphBuildInitialize failed!" << std::endl;
    return -1;
}

// 3. Generate graphs whose weights can be updated, including the weight initialization graph, weight update graph, and inference graph.
// The weight initialization graph is optional; include it only if your service requires it. Omitting it reduces the device memory required for model loading.
WeightRefreshableGraphs weight_refreshable_graphs;
std::vector<AscendString> const_names;
const_names.emplace_back(AscendString("const_1"));
const_names.emplace_back(AscendString("const_2"));
status = aclgrphConvertToWeightRefreshableGraphs(origin_graph, const_names, weight_refreshable_graphs);
if (status != GRAPH_SUCCESS) {
    std::cout << "aclgrphConvertToWeightRefreshableGraphs failed!" << std::endl;
    aclgrphBuildFinalize();
    return -1;
}

// 4. Build the graphs whose weights can be updated into an offline model and store it in the memory buffer.
std::map<AscendString, AscendString> options;
std::vector<ge::GraphWithOptions> graph_and_options;
// Inference graph
graph_and_options.push_back(GraphWithOptions{weight_refreshable_graphs.infer_graph, options});
// Weight initialization graph
graph_and_options.push_back(GraphWithOptions{weight_refreshable_graphs.var_init_graph, options});
// Weight update graph
graph_and_options.push_back(GraphWithOptions{weight_refreshable_graphs.var_update_graph, options});
ge::ModelBufferData model;
status = aclgrphBundleBuildModel(graph_and_options, model);
if (status != GRAPH_SUCCESS) {
    std::cout << "aclgrphBundleBuildModel failed" << std::endl;
    aclgrphBuildFinalize();
    return -1;
}

// 5. Save the model in the memory buffer as an offline model file.
const char *file = "./xxx";
status = aclgrphBundleSaveModel(file, model);
if (status != GRAPH_SUCCESS) {
    std::cout << "aclgrphBundleSaveModel failed" << std::endl;
    aclgrphBuildFinalize();
    return -1;
}
// 6. End the build process and destroy allocations.
aclgrphBuildFinalize();
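
The comment in step 3 notes that the weight initialization graph is optional. A minimal sketch of making that choice explicit when assembling the list passed to the bundle build is shown below. The types `FakeGraph` and `FakeRefreshableGraphs` and the helper `CollectGraphs` are hypothetical stand-ins that mirror only the three member names used in the sample (infer_graph, var_init_graph, var_update_graph); they are not the GE definitions, and the ordering simply follows the sample above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a GE graph; it only carries a name.
struct FakeGraph {
    std::string name;
};

// Hypothetical stand-in mirroring the member names used in the sample.
struct FakeRefreshableGraphs {
    FakeGraph infer_graph{"infer"};
    FakeGraph var_init_graph{"var_init"};
    FakeGraph var_update_graph{"var_update"};
};

// Collects the graphs to bundle in the same order as the sample:
// inference graph, then (optionally) weight initialization, then weight update.
std::vector<FakeGraph> CollectGraphs(const FakeRefreshableGraphs &graphs, bool include_init) {
    std::vector<FakeGraph> selected;
    selected.push_back(graphs.infer_graph);
    if (include_init) {
        selected.push_back(graphs.var_init_graph);
    }
    selected.push_back(graphs.var_update_graph);
    return selected;
}
```

In this sketch, `CollectGraphs(graphs, false)` yields two entries (inference and weight update), while `CollectGraphs(graphs, true)` yields all three. With the real API, the equivalent change would be to skip the push_back of the var_init_graph entry in step 4.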

Follow-up Procedure

To run inference through the acl API on the offline model built from the graphs with updatable weights, you must load the model with the aclmdlBundleLoadFromFile or aclmdlBundleLoadFromMem API and then call the aclmdlExecute API to perform inference. For details, see "Weight Update" in Application Development Guide (C&C++).
