Model Adaptation
Adapt your model so that it is compatible with Rec SDK TensorFlow. During adaptation, you can also add Rec SDK TensorFlow functions to the model. This section describes the key steps in model adaptation and how to add the desired functions.
To combine several functions, modify the corresponding key steps. To view the invoking process of a single function, see Function Training Process.
Feature eviction and dynamic capacity expansion of the NPU's on-chip memory cannot be enabled at the same time.
The key steps are as follows:
- Initialize the framework.
Initialize the Rec SDK TensorFlow model training framework by calling init.
If you want to add a function, select the required function in this step and do as follows.
Table 1 Features
| Feature | Configuration Procedure |
| --- | --- |
| Dynamic capacity expansion | Set use_dynamic_expansion to True to enable dynamic capacity expansion of the NPU's on-chip memory. The default value of this parameter is False. The DDR and SSD modes support only dynamic capacity expansion of the memory or drive. |
| Dynamic shape | Set use_dynamic = True in the init API. Before enabling dynamic shape, install the Kernels operator package. For details, see "Installing Kernels" in "Installing CANN" in CANN Software Installation Guide. |
| Automatic graph modification | - |
| Feature access and eviction | - |
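The switches in the table above can be collected and checked before init is called. The following is a minimal sketch, not part of the SDK: build_init_kwargs and its enable_eviction argument are hypothetical names introduced for illustration, and only use_dynamic_expansion and use_dynamic come from this section. The helper also enforces the restriction that feature eviction and dynamic capacity expansion cannot be enabled together.

```python
def build_init_kwargs(use_dynamic_expansion=False, use_dynamic=False,
                      enable_eviction=False):
    """Hypothetical helper: collect the function switches passed to init.

    enable_eviction is NOT an init parameter. It only tells this helper that
    feature eviction will be configured in a later step, so that the
    documented mutual exclusion can be checked early.
    """
    if use_dynamic_expansion and enable_eviction:
        raise ValueError(
            "feature eviction and dynamic capacity expansion "
            "cannot be enabled at the same time")
    return {
        # Documented default of use_dynamic_expansion is False.
        "use_dynamic_expansion": use_dynamic_expansion,
        # Dynamic shape switch; the False default here is this helper's choice.
        "use_dynamic": use_dynamic,
    }


init_kwargs = build_init_kwargs(use_dynamic_expansion=True, use_dynamic=True)
# The result would then be forwarded to the SDK, e.g. init(..., **init_kwargs).
# The other init arguments are omitted because they depend on your setup.
```

Checking the combination before initialization keeps the error close to the configuration that caused it.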
- Define features or enable automatic graph modification.
- Defining the feature list and model
Use FeatureSpec to define the feature list and configure the corresponding model.
If you want to add a function, select the required function in this step and do as follows.
Table 2 Features
| Feature | Configuration Procedure |
| --- | --- |
| Dynamic capacity expansion | - |
| Dynamic shape | - |
| Feature access and eviction | To enable feature access, set access_threshold to a value greater than or equal to 0 (unit: count). A value less than -1 causes a parameter error. To enable feature eviction: (1) Set eviction_threshold to a value greater than or equal to 0 (unit: second). A value less than -1 causes a parameter error. (2) Set index_key to the FeatureSpec of the timestamp and pass the is_timestamp=True parameter, which indicates that the dataset contains a timestamp. (3) Use the EvictHook API to set a hook for the eviction triggering mode. This API contains three parameters: evict_enable=True, evict_time_interval=24 * 60 * 60, and evict_step_interval=10000, which are the eviction function switch, the eviction triggering interval (unit: second), and the global step interval, respectively. Either evict_time_interval or evict_step_interval can be set. The feature eviction hook is used only in training mode. |
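The threshold rules and the two eviction triggers described above can be sketched as follows. This is an illustration only: threshold_enabled and eviction_due are hypothetical helpers, not SDK APIs. The text does not say what the value -1 means, so the sketch assumes that -1 leaves the function disabled. It also assumes that eviction fires once the configured time or step interval has elapsed.

```python
def threshold_enabled(name, value):
    """Return True if the threshold value enables its function.

    Mirrors the documented rules: a value >= 0 enables the function and a
    value < -1 is a parameter error. Treating -1 as "disabled" is an
    assumption of this sketch.
    """
    if value < -1:
        raise ValueError(f"{name} must not be less than -1, got {value}")
    return value >= 0


def eviction_due(elapsed_seconds, steps_since_last_eviction,
                 evict_time_interval=None, evict_step_interval=None):
    """Decide whether an eviction pass should run (assumed semantics).

    Whichever interval is set is compared against the time or the number of
    global steps that has passed since the previous eviction.
    """
    if evict_time_interval is not None:
        return elapsed_seconds >= evict_time_interval
    if evict_step_interval is not None:
        return steps_since_last_eviction >= evict_step_interval
    return False  # no trigger configured


access_on = threshold_enabled("access_threshold", 5)
evict_on = threshold_enabled("eviction_threshold", -1)
# 90000 s have passed and the interval is one day (86400 s).
run_now = eviction_due(90000, 0, evict_time_interval=24 * 60 * 60)
```

In this sketch the time interval takes priority if both intervals are passed; that ordering is a choice made for the example, not documented SDK behavior.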
- Automatic graph modification
In NPUEstimator mode, add the GraphModifierHook of the automatic graph modification function to the hooks of each NPUEstimator mode that you use (train, predict, and train_and_evaluate). For example, if the current mode is train, add GraphModifierHook to the training hooks so that training runs in automatic graph modification mode.
If you want to add a function, select the required function and do as follows.
Table 3 Features
| Feature | Configuration Procedure |
| --- | --- |
| Dynamic capacity expansion | - |
| Dynamic shape | - |
| Feature access and eviction | When using sparse_lookup, set access_and_evict_config. This parameter is a dict consisting of two key-value pairs: the keys are access_threshold and eviction_threshold, and each value is the corresponding threshold. |
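As a sketch of the dict described above, the helper below builds access_and_evict_config from the two thresholds. make_access_and_evict_config is a hypothetical name and the threshold values are placeholders; only the two keys and their units come from this section.

```python
def make_access_and_evict_config(access_threshold, eviction_threshold):
    """Hypothetical helper: build the dict passed as access_and_evict_config."""
    return {
        "access_threshold": access_threshold,      # unit: count
        "eviction_threshold": eviction_threshold,  # unit: second
    }


# Placeholder values: admit a feature after 10 accesses, evict after 7 days.
config = make_access_and_evict_config(access_threshold=10,
                                      eviction_threshold=7 * 24 * 60 * 60)
# config would be passed as sparse_lookup(..., access_and_evict_config=config);
# the remaining sparse_lookup arguments are omitted here.
```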
- Define a dataset. Skip this step if you select automatic graph modification mode.
Use FeatureSpec to define a feature list, create a dataset based on the feature list, preprocess the dataset, call the get_asc_insert_func API to obtain the data preprocessing API of Rec SDK TensorFlow, and apply the API to the dataset.
- Define an optimizer.
Select an optimizer under mx_rec.optimizers and call the optimizer API to obtain the optimizer object for the sparse network layer. For details about the available optimizers, see Optimizers. For the dense network layer, you can use a built-in TensorFlow optimizer.
If you want to add a function, select the required function in this step and do as follows.
Table 4 Features
| Feature | Configuration Procedure |
| --- | --- |
| Dynamic capacity expansion | Call the create_hash_optimizer_by_address API of the corresponding optimizer in the mx_rec.optimizers package to create a sparse_optimizer table to enable dynamic capacity expansion of the NPU's on-chip memory. The following lists the available optimizers: |
| Dynamic shape | - |
| Automatic graph modification | - |
| Feature access and eviction | - |
- Create a sparse table.
Create a sparse network layer by calling the create_table API. A sparse network layer can be created for each sparse feature.
In Estimator mode, the create_table API must be called in model_fn passed to Estimator. The Estimator source code creates a graph instance when model_fn is called, but it is not the same as the default graph where the entry main function is located.
Import the sparse network layer and feature list, create a model computational graph, and call the sparse_lookup API in the computational graph to query features and calculate errors.
Table 5 Features
| Feature | Configuration Procedure |
| --- | --- |
| Dynamic capacity expansion | - |
| Dynamic shape | - |
| Automatic graph modification | Query the sparse feature table. Call sparse_lookup and set modify_graph to True to enable the automatic graph modification mode during table query. The default value of this parameter is False. |
| Feature access and eviction | - |
- Define the gradient calculation and optimization processes.
Call get_dense_and_sparse_variable to obtain the parameters of the dense network layer and sparse network layer. Use the optimizer to calculate gradients and perform optimization.
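The reason for handling dense and sparse parameters separately can be illustrated without the SDK. The sketch below is a conceptual stand-in, not the SDK's implementation: dense_sgd_step and sparse_sgd_step are hypothetical functions, plain SGD is assumed for both groups, and the parameters are ordinary Python lists and dicts instead of the variables returned by get_dense_and_sparse_variable. It shows that a dense update touches every weight, while a sparse update touches only the embedding rows that were looked up.

```python
def dense_sgd_step(weights, grads, lr):
    """Plain SGD over every dense weight."""
    return [w - lr * g for w, g in zip(weights, grads)]


def sparse_sgd_step(table, row_grads, lr):
    """Plain SGD over only the embedding rows that received a gradient.

    table maps a feature id to its embedding row; row_grads holds gradients
    only for the ids that appeared in the current batch.
    """
    updated = {row_id: list(row) for row_id, row in table.items()}
    for row_id, grad in row_grads.items():
        updated[row_id] = [w - lr * g for w, g in zip(updated[row_id], grad)]
    return updated


dense_weights = [1.0, 2.0]
embedding_table = {7: [1.0, 1.0], 9: [3.0, 3.0]}

new_dense = dense_sgd_step(dense_weights, grads=[2.0, 4.0], lr=0.5)
# Only id 7 was looked up in this batch, so row 9 stays unchanged.
new_table = sparse_sgd_step(embedding_table, row_grads={7: [2.0, 2.0]}, lr=0.5)
```

In a real model, the sparse group would be updated by an optimizer from mx_rec.optimizers and the dense group by a TensorFlow optimizer, as described in the optimizer step.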
If you want to add a function, select the required function in this step and do as follows.
- Load and preprocess data. Skip this step if automatic graph modification is enabled.
When FeatureSpec is used to define the feature list, call start_asc_pipeline to start the data pipeline.