create_table

Function

Creates a sparse table.

Prototype

def create_table(key_dtype, dim, name, emb_initializer, device_vocabulary_size=1, host_vocabulary_size=0, ssd_vocabulary_size=0, ssd_data_path=(os.getcwd(),), is_save=True, is_dp=False, init_param=1.0, all2all_gradients_op=All2allGradientsOp.SUM_GRADIENTS.value, enable_merge=False, padding_keys=None, padding_keys_mask=False, padding_keys_len=None, value_dtype=tf.float32, shard_num=1, fusion_optimizer_var=True, hashtable_threshold=0)

Parameters

Parameter

Type

Mandatory/Optional

Description

key_dtype

Dtype of TensorFlow

Mandatory

Data type of the sparse feature key. The value can be tf.int64 and tf.int32.

dim

  • int
  • tf.TensorShape

Mandatory

Embedding dimension. The value ranges from 1 to 8192. If dim needs to be set to a value greater than 512, ensure that the memory and drive space are sufficient. Alternatively, you can use the DDR mode or reduce the vocabulary size of a sparse table.

If the input parameter is of the tf.TensorShape type, the value of ndims must be 1, indicating the dimension of the embedding layer.

Set this parameter based on the actual server configurations.
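
As a rough aid for the sizing advice above, the following sketch estimates the raw storage of a table as vocabulary size x dim x 4 bytes (one tf.float32 vector per key). The helper name estimate_table_bytes is hypothetical and is not part of this API. The estimate ignores keys, optimizer state, and framework overhead, so treat it as a lower bound.

```python
def estimate_table_bytes(vocabulary_size: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw embedding storage only: one vector of `dim` float32 values per key."""
    if not 1 <= dim <= 8192:
        raise ValueError("dim must be in the range [1, 8192]")
    if vocabulary_size < 0:
        raise ValueError("vocabulary_size must not be negative")
    return vocabulary_size * dim * bytes_per_value


# Hypothetical sizing check for a table at the documented warning thresholds.
size_in_bytes = estimate_table_bytes(vocabulary_size=25_600_000, dim=512)
print(f"{size_in_bytes / 2**30:.2f} GiB of raw embedding data")
```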

name

str

Mandatory

Name of a sparse table. The name can contain only digits, letters, underscores (_), and periods (.). The name length must be in the range [1, 100].

The sparse table name must be unique.
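
The naming rules above can be checked before calling create_table. The sketch below is a minimal illustration. The helper names is_valid_table_name and find_duplicate_names are hypothetical, and the sketch assumes that "letters" means ASCII letters.

```python
import re

# Digits, ASCII letters, underscores, and periods; length 1 to 100.
_TABLE_NAME_PATTERN = re.compile(r"[0-9A-Za-z_.]{1,100}")


def is_valid_table_name(name) -> bool:
    """Return True if `name` satisfies the documented character and length rules."""
    return isinstance(name, str) and _TABLE_NAME_PATTERN.fullmatch(name) is not None


def find_duplicate_names(names):
    """Return the set of names that appear more than once."""
    seen, duplicates = set(), set()
    for name in names:
        if name in seen:
            duplicates.add(name)
        seen.add(name)
    return duplicates


print(is_valid_table_name("sparse_embeddings_table"))
print(find_duplicate_names(["user_table", "item_table", "user_table"]))
```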

emb_initializer

Initializer type of TensorFlow

Mandatory

Initial value generator at the embedding layer.

device_vocabulary_size

int

Optional

Number of embeddings (vocabulary size) stored on the device. The default value is 1. The value ranges from 1 to 1 billion. If this parameter is set to a value greater than 25600000, ensure that the memory and drive space are sufficient. Alternatively, enable dynamic capacity expansion of the on-chip memory or reduce dim of the sparse table. Set this parameter based on the actual server configurations.

If DDR/SSD storage is enabled, that is, host_vocabulary_size is not 0, device_vocabulary_size must be greater than or equal to the number of keys after deduplication in two consecutive batches. The on-chip memory must be able to store data of at least two batches. In this case, the on-chip memory is used only as cache.
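
To illustrate the two-batch requirement above, the sketch below derives a candidate lower bound for device_vocabulary_size from sample batches of keys. The helper name min_device_vocabulary_size is hypothetical. The sketch adds the deduplicated key counts of each pair of consecutive batches, which is a conservative reading of the rule (it is never smaller than the number of distinct keys across the pair).

```python
def min_device_vocabulary_size(batches):
    """Largest sum of deduplicated key counts over any two consecutive batches."""
    unique_counts = [len(set(batch)) for batch in batches]
    if not unique_counts:
        return 0
    if len(unique_counts) == 1:
        return unique_counts[0]
    return max(a + b for a, b in zip(unique_counts, unique_counts[1:]))


# Hypothetical key batches; real batches would come from the input dataset.
sample_batches = [[1, 2, 2, 3], [3, 4], [5, 6, 7, 8]]
print(min_device_vocabulary_size(sample_batches))
```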

host_vocabulary_size

int

Optional

Number of embeddings (vocabulary size) stored in the DDR on the host. The default value is 0. The value ranges from 0 to 1 billion.
  • If the value is 0, the DDR function on the host is disabled. If the SSD storage is not enabled, ensure that the DDR can store all data.
  • If the value is not 0, the function is enabled. In this case, you need to disable dynamic capacity expansion on the on-chip memory side, that is, set use_dynamic_expansion to False. By default, the dynamic capacity expansion mode on the DDR memory side is used. If host_vocabulary_size is set to a value greater than 100 million, ensure that the memory and drive space are sufficient, or reduce the value of dim in the sparse table.

In dynamic capacity expansion mode (when use_dynamic_expansion is set to True), the on-chip memory is used as the unique storage unit by default, and this variable is set to 0.

An out-of-memory (OOM) error occurs when the memory of the single-server system is exceeded.

Set this parameter based on the actual server configurations.

ssd_vocabulary_size

int

Optional

Enables the function of storing embedding data on SSDs. The default value is 0, indicating that the function is disabled. If the value is greater than 0, this function can be enabled only when the value of host_vocabulary_size is also greater than 0. The value ranges from 0 to 1 billion.

In dynamic capacity expansion mode (when use_dynamic_expansion is set to True), the on-chip memory is used as the unique storage unit by default, and this variable is set to 0.

Set this parameter based on the actual server configurations.
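
The relationship between host_vocabulary_size, ssd_vocabulary_size, and use_dynamic_expansion described above can be summarized in a small check. This is a sketch only. The function storage_mode and the mode labels it returns are hypothetical and do not reflect how the library itself validates its arguments.

```python
ONE_BILLION = 1_000_000_000


def storage_mode(host_vocabulary_size=0, ssd_vocabulary_size=0, use_dynamic_expansion=False):
    """Return a label for the storage tiers implied by the documented rules."""
    for label, value in (("host_vocabulary_size", host_vocabulary_size),
                         ("ssd_vocabulary_size", ssd_vocabulary_size)):
        if not 0 <= value <= ONE_BILLION:
            raise ValueError(f"{label} must be in the range [0, 1 billion]")
    if use_dynamic_expansion:
        # In dynamic expansion mode both sizes are treated as 0.
        return "device (dynamic expansion)"
    if ssd_vocabulary_size > 0 and host_vocabulary_size == 0:
        raise ValueError("ssd_vocabulary_size > 0 requires host_vocabulary_size > 0")
    if ssd_vocabulary_size > 0:
        return "device + DDR + SSD"
    if host_vocabulary_size > 0:
        return "device + DDR"
    return "device"


print(storage_mode(host_vocabulary_size=50_000_000, ssd_vocabulary_size=200_000_000))
```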

ssd_data_path

  • list[str]
  • Tuple[str]

Optional

The value is the path of the running script by default.
  • If the list is empty, the SSD storage path is the one where the running script is located by default.
  • If the list is not empty and the path is valid, data is stored in the corresponding path in sequence.
  • If the drive space corresponding to a path is insufficient, the system tries the next path until an exception is thrown when all drive space is insufficient.
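
The fallback order described above can be pictured with the following sketch. It is an illustration under assumptions, not the library's implementation: the function pick_ssd_path, its required_bytes argument, and the use of shutil.disk_usage to measure free space are all hypothetical choices made for this example.

```python
import os
import shutil


def pick_ssd_path(paths, required_bytes, free_bytes=lambda p: shutil.disk_usage(p).free):
    """Return the first path with enough free space, trying paths in order."""
    # An empty list falls back to the current working directory, mirroring the default.
    candidates = list(paths) or [os.getcwd()]
    for path in candidates:
        if free_bytes(path) >= required_bytes:
            return path
    raise OSError("Insufficient drive space on all configured SSD paths")


# Usage with the real file system: needs at least 1 byte free in the working directory.
print(pick_ssd_path([], required_bytes=1))
```

The free_bytes argument exists only so that the selection logic can be exercised without touching real drives.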

is_save

bool

Optional

Whether to save embedding data. The default value is True.

Value:

  • True: Save embedding data.
  • False: Do not save embedding data.

is_dp

bool

Optional

Whether to enable the data parallelism function of sparse tables. The default value is False.

Enable the DP mode (is_dp = True) to configure a data parallel mode for sparse tables. You are advised to enable this mode when there are 10 GB-level tables, an end-to-end bottleneck occurs on the NPU, and the NPU ALL2ALL communication traffic of sparse tables is less than 16 MB. This can improve performance to some extent, with a sparse communication gain of approximately 15%.

Note that the function and accuracy are not affected when the DP and MP modes are used together for training resumption. However, this operation is not recommended.

init_param

float

Optional

Coefficient for embedding initialization. The default value is 1.0. Value range: [-10, 10].

If the value of init_param is greater than 1.0 or less than -1.0, you are advised to decrease the value of batch_size to prevent program exceptions caused by excessive NPU usage.

all2all_gradients_op

string

Optional

Gradient aggregation mode after distributed gradient backpropagation. The default value is sum_gradients.

  • sum_gradients: sum of gradients of all ranks.
  • sum_gradients_and_div_by_ranksize: sum of all rank gradients divided by rank size.
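
The difference between the two modes can be shown with plain Python lists standing in for per-rank gradients. The function aggregate_gradients below is a hypothetical illustration of the arithmetic only. It does not perform any communication and is not the library's operator.

```python
def aggregate_gradients(per_rank_gradients, op="sum_gradients"):
    """Combine one gradient vector per rank according to the selected mode."""
    rank_size = len(per_rank_gradients)
    # Element-wise sum across ranks.
    summed = [sum(values) for values in zip(*per_rank_gradients)]
    if op == "sum_gradients":
        return summed
    if op == "sum_gradients_and_div_by_ranksize":
        return [value / rank_size for value in summed]
    raise ValueError(f"Unsupported all2all_gradients_op: {op}")


gradients = [[1.0, 2.0], [3.0, 4.0]]  # two ranks, two gradient elements each
print(aggregate_gradients(gradients))
print(aggregate_gradients(gradients, op="sum_gradients_and_div_by_ranksize"))
```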

enable_merge

bool

Optional

Whether to enable automatic merging of embedding tables. Tables can be merged only if their key_dtype, dim, emb_initializer, is_save, is_dp, init_param, all2all_gradients_op, and padding_keys_mask (optional) parameters are identical at creation. Merging reduces the number of CPU threads and the number of communication channels between the host and device, saving system resources.

The IDs of the tables to be merged must be independent of each other. Otherwise, the precision will be affected. For example, if UserEmbeddingTable and ItemEmbeddingTable are to be merged, the values of UserID and ItemID cannot be the same.

Currently, only the dynamic on-chip memory expansion mode is supported.

To enable automatic table merging, you need to enable the automatic graph modification and one-table-multiple-query functions.

  • False: Disable automatic merging. This is the default value.
  • True: Enable automatic merging.
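
Because merged tables must not share IDs, it can help to check sample IDs for overlap before enabling merging. The sketch below is a minimal illustration. The helper overlapping_ids is hypothetical, and the table names follow the example above.

```python
def overlapping_ids(ids_by_table):
    """Return the IDs that occur in more than one table."""
    owner = {}
    overlaps = set()
    for table_name, ids in ids_by_table.items():
        for key in set(ids):
            if key in owner and owner[key] != table_name:
                overlaps.add(key)
            else:
                owner[key] = table_name
    return overlaps


sample_ids = {
    "UserEmbeddingTable": [1, 2, 3],
    "ItemEmbeddingTable": [3, 4],
}
print(overlapping_ids(sample_ids))  # a non-empty result means merging would affect precision
```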

padding_keys

  • int64
  • list[int64]
  • None

Optional

Padding key or keys, usually keys of the sparse feature in the dataset. The default value is None, in which case padding_keys_mask must be False and padding_keys_len must be None, indicating normal training updates. If the value is int64 or list[int64], set padding_keys_mask to True and padding_keys_len to the shape of the sparse feature in the dataset that the key belongs to. The embedding corresponding to the key is then not updated.

padding_keys_mask

bool

Optional

Whether to update the embedding corresponding to padding_keys. The default value is False. In this case, set padding_keys to None and padding_keys_len to None, indicating that the embedding is updated during normal training. If the value is True, set padding_keys to int64/list[int64] and padding_keys_len to int32, indicating that the embedding does not need to be updated.

padding_keys_len

  • int32
  • None

Optional

Shape of the sparse feature in the dataset. Generally, the value is in the format of batch size x feature vector dimension. The default value is None. In this case, set padding_keys to None and padding_keys_mask to False, indicating normal training update. If the value is int32, set padding_keys to int64/list[int64] and padding_keys_mask to True, indicating that the embedding corresponding to padding_keys does not need to be updated.
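
The three padding parameters must be set together, as described above. The sketch below checks that combination. The function check_padding_args and the labels it returns are hypothetical, and plain Python int values stand in for int64 and int32.

```python
def _is_int(value):
    """True for plain integers; bool is excluded even though it subclasses int."""
    return isinstance(value, int) and not isinstance(value, bool)


def check_padding_args(padding_keys=None, padding_keys_mask=False, padding_keys_len=None):
    """Validate that the padding parameters form one of the two documented combinations."""
    keys_given = _is_int(padding_keys) or (
        isinstance(padding_keys, list)
        and len(padding_keys) > 0
        and all(_is_int(key) for key in padding_keys)
    )
    if padding_keys is None and padding_keys_mask is False and padding_keys_len is None:
        return "normal update"
    if keys_given and padding_keys_mask is True and _is_int(padding_keys_len):
        return "padding keys not updated"
    raise ValueError("padding_keys, padding_keys_mask, and padding_keys_len must be set together")


print(check_padding_args())
print(check_padding_args(padding_keys=[0, -1], padding_keys_mask=True, padding_keys_len=1024))
```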

value_dtype

Dtype of TensorFlow

Optional

Data type of the sparse feature value. Only tf.float32 is supported, which is also the default value.

shard_num

int

Optional

Number of partitions at the embedding layer. The default value is 1. Value range: [1, 8192]

fusion_optimizer_var

bool

Optional

Whether to enable fusion optimization. The default value is True.

Value:

  • True: fusion optimization enabled
  • False: fusion optimization disabled

hashtable_threshold

int

Optional

Hash table threshold. If the value is greater than the threshold, the hash table is used. If the value is less than the threshold, the linear table is used. The default value is 0. Value range: [0, 2147483647].

  • For padding_keys, padding_keys_mask, and padding_keys_len:
    • The DP mode and lazy adam fusion operator mode are not supported.
    • If padding keys need to be set for all tables, set the static shape mode in the initialization API, for example, init(use_dynamic=False). In this case, also set drop_remainder=True when batching the dataset in the model script, for example, dataset = dataset.batch(batch_size, drop_remainder=True).
  • Before enabling the DDR/SSD mode, ensure that the automatic graph modification mode is also enabled.
  • If dynamic capacity expansion of the on-chip memory is required (use_dynamic_expansion=True is passed to init), host_vocabulary_size and ssd_vocabulary_size are treated as 0 regardless of the values passed, so they do not take effect and can be omitted.
  • If dynamic capacity expansion of the on-chip memory is not required, determine the storage mode based on whether the values of host_vocabulary_size and ssd_vocabulary_size are 0.
  • If host_vocabulary_size is 0, the DDR function on the host is disabled. If it is not 0, the function is enabled. All embedding tables must either use or not use the host DDR function together, that is, host_vocabulary_size must be 0 for all tables or nonzero for all tables. Otherwise, an error is reported during parameter verification. The error information is as follows:
    ValueError: The host-side DDR function of all tables must be used or not used at the same time. However, host voc size of each table is [].
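
The all-or-none rule for host_vocabulary_size can be checked across tables before they are created. The sketch below is illustrative only. The helper check_host_ddr_consistency is hypothetical and its message is modeled loosely on the error shown above, not copied from the library's code.

```python
def check_host_ddr_consistency(host_vocabulary_sizes):
    """Return True if all tables use host DDR, False if none do; raise if mixed."""
    enabled = [size != 0 for size in host_vocabulary_sizes]
    if any(enabled) and not all(enabled):
        raise ValueError(
            "The host-side DDR function of all tables must be used or not used "
            f"at the same time. However, host voc size of each table is {list(host_vocabulary_sizes)}."
        )
    return any(enabled)


print(check_host_ddr_consistency([0, 0, 0]))
print(check_host_ddr_consistency([1_000_000, 2_000_000]))
```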
    

Return Value

  • Success: Sparse table instance
    The returned sparse table instance provides the following two methods.

    Method

    Function

    Prototype

    Parameter

    Return Value

    size()

    Obtains the size of a sparse table.

    def size()
    

    None

    • Success: size of a sparse table
    • Failure: An exception is thrown.

    capacity()

    Obtains the capacity of a sparse table.

    def capacity()
    

    None

    • Success: capacity of a sparse table
    • Failure: An exception is thrown.
  • Failure: An exception is thrown.

Example

import tensorflow as tf
from mx_rec.core.embedding import create_table
sparse_hashtable = create_table(key_dtype=tf.int32,
                                dim=tf.TensorShape([128]),
                                name="sparse_embeddings_table",
                                emb_initializer=tf.truncated_normal_initializer(),
                                device_vocabulary_size=24_000_000 * 8,
                                host_vocabulary_size=0)
table_size = sparse_hashtable.size()      # Obtain the used size of a sparse table.
table_capacity = sparse_hashtable.capacity()   # Obtain the capacity of a sparse table.

See Also

For details about the API call sequence and example, see Porting and Training.