Model Adaptation

Before training, you must adapt your model so that it is compatible with Rec SDK TensorFlow. During adaptation, you can also add Rec SDK TensorFlow functions to the model. This section describes the key steps in model adaptation and how to add the desired functions.

To use several functions together, modify each of the corresponding key steps. For the calling process of a single function, see Function Training Process.

Feature eviction and dynamic capacity expansion of the on-chip memory cannot be enabled at the same time.
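The restriction above can be checked before any SDK call is made. The following is a minimal sketch in plain Python. The function name check_feature_combination and the feature labels are hypothetical and are not part of Rec SDK TensorFlow.

```python
# Hypothetical pre-flight check; none of these names come from Rec SDK TensorFlow.
KNOWN_FEATURES = {
    "dynamic_expansion",        # dynamic capacity expansion of the on-chip memory
    "dynamic_shape",
    "auto_graph_modification",
    "feature_access",
    "feature_eviction",
}


def check_feature_combination(features):
    """Return the requested features as a set, or raise ValueError if invalid."""
    requested = set(features)
    unknown = requested - KNOWN_FEATURES
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    # The only exclusion stated in this section.
    if {"dynamic_expansion", "feature_eviction"} <= requested:
        raise ValueError(
            "feature eviction and dynamic capacity expansion of the on-chip "
            "memory cannot be enabled at the same time"
        )
    return requested
```

Calling check_feature_combination(["dynamic_shape", "feature_eviction"]) returns the set unchanged, while requesting "dynamic_expansion" together with "feature_eviction" raises ValueError.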

The key steps are as follows:

  1. Initialize the framework.

    Initialize the Rec SDK TensorFlow model training framework by calling init.

    To add a function, configure it in this step as described in Table 1.

    Table 1 Features

    Feature | Configuration Procedure
    Dynamic capacity expansion | Set use_dynamic_expansion to True to enable dynamic capacity expansion of the NPU's on-chip memory. The default value is False. In DDR and SSD modes, only dynamic capacity expansion of the memory or drive is supported.
    Dynamic shape | Set use_dynamic = True in the init API. Before enabling dynamic shape, install the Kernels operator package. For details, see "Installing Kernels" in "Installing CANN" in CANN Software Installation Guide.
    Automatic graph modification | -
    Feature access and eviction | -
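As a rough illustration of Table 1, the selected functions can be collected into keyword arguments for init. This is only a sketch: build_init_kwargs is a hypothetical helper, and it is an assumption that init accepts use_dynamic_expansion and use_dynamic as keyword arguments next to its other arguments, which are not listed in this section.

```python
def build_init_kwargs(dynamic_expansion=False, dynamic_shape=False):
    """Hypothetical helper: map the Table 1 choices to init parameter names."""
    kwargs = {}
    if dynamic_expansion:
        # Parameter name taken from Table 1; it defaults to False when omitted.
        kwargs["use_dynamic_expansion"] = True
    if dynamic_shape:
        # Parameter name taken from Table 1.
        kwargs["use_dynamic"] = True
    return kwargs


kwargs = build_init_kwargs(dynamic_shape=True)
# Assumption: the result would then be passed on as init(..., **kwargs),
# where "..." stands for the arguments this section does not describe.
```

Omitting a key leaves the corresponding init default untouched, so the helper never has to guess a default value that this section does not state.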

  2. Define an optimizer.

    Select an optimizer under mx_rec.optimizers and call its API to obtain the optimizer object for the sparse network layer. For details about the available optimizers, see Optimizers. For the dense network layer, you can use a built-in TensorFlow optimizer.

    To add a function, configure it in this step as described in Table 2.

    Table 2 Features

    Feature | Configuration Procedure
    Dynamic capacity expansion | Call the create_hash_optimizer_by_address API of the corresponding optimizer in the mx_rec.optimizers package to create a sparse_optimizer table to enable dynamic capacity expansion of the on-chip memory. The following lists the available optimizers:
    Dynamic shape | -
    Automatic graph modification | -
    Feature access and eviction | -

  3. Define features or enable automatic graph modification.
    • Defining the feature list and model

      Use FeatureSpec to define the feature list and configure the corresponding model.

      To add a function, configure it in this step as described in Table 3.

      Table 3 Features

      Feature | Configuration Procedure
      Dynamic capacity expansion | -
      Dynamic shape | -
      Feature access and eviction | In FeatureSpec mode, perform the following configuration by referring to FeatureSpec:
      1. To enable feature access, set access_threshold to a value greater than or equal to 0 (unit: count). If access_threshold is set to a value less than -1, a parameter error is reported.
      2. To enable feature eviction, perform the following steps:
        1. Set eviction_threshold to a value greater than or equal to 0 (unit: second). If eviction_threshold is set to a value less than -1, a parameter error is reported.
        2. Define a FeatureSpec whose index_key is the timestamp and pass is_timestamp=True, which indicates that the dataset contains a timestamp.
        3. Use the EvictHook API to set a hook that controls when eviction is triggered. The API takes three parameters: evict_enable=True (the eviction switch), evict_time_interval=24 * 60 * 60 (the eviction triggering interval, in seconds), and evict_step_interval=10000 (the global step interval). Either evict_time_interval or evict_step_interval can be set.
      3. The feature eviction hook is used only in training mode.
    • Automatic graph modification

      Skip this step if you select this mode.
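To make the two eviction trigger modes of EvictHook in FeatureSpec mode more concrete, the sketch below models the decision in plain Python. It is not the EvictHook implementation. The function should_evict is hypothetical, and the trigger conditions (elapsed time reaching evict_time_interval, or the global step being a multiple of evict_step_interval) and the "exactly one interval" rule are assumptions, the latter being one reading of "either ... can be set".

```python
def should_evict(elapsed_seconds, global_step, evict_enable=True,
                 evict_time_interval=None, evict_step_interval=None):
    """Hypothetical trigger check; not the EvictHook implementation."""
    if not evict_enable:
        return False
    # Assumption: exactly one of the two intervals is configured.
    if (evict_time_interval is None) == (evict_step_interval is None):
        raise ValueError(
            "set exactly one of evict_time_interval or evict_step_interval"
        )
    if evict_time_interval is not None:
        # Assumption: trigger once the time since the last eviction
        # reaches the configured interval (in seconds).
        return elapsed_seconds >= evict_time_interval
    # Assumption: trigger on every evict_step_interval-th global step.
    return global_step > 0 and global_step % evict_step_interval == 0
```

With evict_time_interval=24 * 60 * 60, the check fires once a full day has elapsed. With evict_step_interval=10000, it fires at global steps 10000, 20000, and so on.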

  4. Define a dataset. Skip this step if you select automatic graph modification mode.

    Use FeatureSpec to define a feature list, create a dataset based on the feature list, and preprocess the dataset. Then call the get_asc_insert_func API to obtain the data preprocessing API of Rec SDK TensorFlow, and apply that API to the dataset.

  5. Create a sparse table.

    Create a sparse network layer by calling the create_table API. A sparse network layer can be created for each sparse feature.

  6. Create a model computational graph.

    Pass in the sparse network layer and feature list, create the model computational graph, and call the sparse_lookup API in the computational graph to look up features and calculate the loss.

    To add a function, configure it in this step as described in Table 4.

    Table 4 Features

    Feature | Configuration Procedure
    Dynamic capacity expansion | -
    Dynamic shape | -
    Automatic graph modification | Call sparse_lookup to query the sparse feature table. If modify_graph is set to True, the automatic graph modification mode is used during table query. The default value of this parameter is False.
    Feature access and eviction | In automatic graph modification mode, set the access_and_evict_config parameter when calling sparse_lookup. The parameter is a dict with two key-value pairs. The keys are access_threshold and eviction_threshold, and each value is the corresponding threshold.
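The access_and_evict_config dictionary described above might be built as in the following sketch. The helper make_access_and_evict_config is hypothetical. The rule that a value less than -1 is a parameter error is taken from the FeatureSpec mode description, and applying the same rule here is an assumption.

```python
def make_access_and_evict_config(access_threshold, eviction_threshold):
    """Hypothetical helper: build the dict passed as access_and_evict_config."""
    config = {
        "access_threshold": access_threshold,      # unit: count
        "eviction_threshold": eviction_threshold,  # unit: second
    }
    for key, value in config.items():
        # Assumption: the FeatureSpec-mode rule (values below -1 are rejected)
        # also applies in automatic graph modification mode.
        if value < -1:
            raise ValueError(f"{key} must not be less than -1, got {value}")
    return config


config = make_access_and_evict_config(access_threshold=100,
                                      eviction_threshold=3600)
# Assumption: config would then be passed to sparse_lookup through its
# access_and_evict_config parameter, together with modify_graph=True.
```

The threshold values 100 and 3600 are placeholders chosen for illustration only.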

  7. Define the gradient calculation and optimization processes.

    Call get_dense_and_sparse_variable to obtain the parameters of the dense network layer and sparse network layer. Use the optimizer to calculate gradients and perform optimization.

    To add a function, configure it in this step as described in Table 5.

    Table 5 Features

    Feature | Configuration Procedure
    Dynamic capacity expansion | To use dynamic capacity expansion of the on-chip memory, perform the following steps:
    1. Obtain the embedding representation result (emb) and mapping address (addr).
      • Use the tf.get_collection("ASCEND_SPARSE_LOOKUP_LOCAL_EMB") API to obtain the embedding representation result for training.
      • Use the tf.get_collection("ASCEND_SPARSE_LOOKUP_ID_OFFSET") API to obtain the mapping address for training.
    2. Perform backward gradient calculation. Use the tf.gradients(loss, emb) API to differentiate the loss with respect to the embedding representation result obtained in step 1. The result is the gradient (grad).
    3. Perform the backward sparse table update. Call sparse_optimizer.apply_gradients([grad, addr]) on the sparse optimizer created earlier to update the sparse table entries at the mapping address.
    Dynamic shape | -
    Automatic graph modification | -
    Feature access and eviction | -
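The three steps listed for dynamic capacity expansion (look up emb and addr, compute grad, update the table by address) can be traced with a toy model in plain Python. This sketch does not use TensorFlow or Rec SDK TensorFlow. The dictionary table, the helper names, the toy loss 0.5 * sum(x**2), and the plain SGD update are all assumptions chosen so that the arithmetic is easy to follow. They stand in for the real collections, tf.gradients, and the sparse optimizer.

```python
# Toy stand-in for the sparse table: mapping address -> embedding row.
table = {10: [1.0, 2.0], 20: [4.0, 6.0], 30: [8.0, 8.0]}


def lookup(table, addrs):
    """Step 1: return the embedding rows (emb) for the mapping addresses."""
    return [list(table[a]) for a in addrs]


def toy_gradients(emb):
    """Step 2: gradient of the toy loss 0.5 * sum(x**2), which equals emb."""
    return [list(row) for row in emb]


def apply_gradients(table, grads, addrs, learning_rate=0.5):
    """Step 3: plain SGD update of only the rows at the given addresses."""
    for grad, addr in zip(grads, addrs):
        table[addr] = [w - learning_rate * g for w, g in zip(table[addr], grad)]


addr = [10, 30]                      # addresses touched by this batch
emb = lookup(table, addr)            # step 1
grad = toy_gradients(emb)            # step 2
apply_gradients(table, grad, addr)   # step 3
```

Only the rows at addresses 10 and 30 change (each is halved by this toy update), while the row at address 20 is left untouched. This is the point of updating by address: rows that were not looked up in the batch receive no update.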

  8. Load and preprocess data.
    • FeatureSpec mode

      Call start_asc_pipeline to start the data pipeline.

    • Automatic graph modification mode

      Call modify_graph_and_start_emb_cache, and replace sess.run(iterator.initializer) with the dataset initialization call for automatic graph modification: sess.run(get_initializer(True)) for training, or sess.run(get_initializer(False)) for evaluation.