create_table
Function
Creates a sparse table.
Prototype
    def create_table(key_dtype, dim, name, emb_initializer, device_vocabulary_size=1, host_vocabulary_size=0, ssd_vocabulary_size=0, ssd_data_path=(os.getcwd(),), is_save=True, is_dp=False, init_param=1.0, all2all_gradients_op=All2allGradientsOp.SUM_GRADIENTS.value, enable_merge=False, padding_keys=None, padding_keys_mask=False, padding_keys_len=None, value_dtype=tf.float32, shard_num=1, fusion_optimizer_var=True, hashtable_threshold=0)
Parameters
| Parameter | Type | Mandatory/Optional | Description |
|---|---|---|---|
| key_dtype | Dtype of TensorFlow | Mandatory | Data type of the sparse feature key. The value can be tf.int64 or tf.int32. |
| dim | | Mandatory | Embedding dimension. The value ranges from 1 to 8192. If the input is of the tf.TensorShape type, its ndims must be 1, and that single dimension is the dimension of the embedding layer. If dim needs to be set to a value greater than 512, ensure that the memory and drive space are sufficient. Alternatively, use the DDR mode or reduce the vocabulary size of the sparse table. Set this parameter based on the actual server configuration. |
| name | str | Mandatory | Name of the sparse table. The name can contain only digits, letters, underscores (_), and periods (.). The name length ranges from 1 to 100. The name must be unique. |
| emb_initializer | Initializer type of TensorFlow | Mandatory | Initializer that generates the initial values of the embedding layer. |
| device_vocabulary_size | int | Optional | Vocabulary size on the device, that is, the number of embeddings stored in the on-chip memory. The default value is 1. The value ranges from 1 to 1 billion. If this parameter is set to a value greater than 25600000, ensure that the memory and drive space are sufficient. Alternatively, enable dynamic capacity expansion of the on-chip memory or reduce dim of the sparse table. Set this parameter based on the actual server configuration. If DDR/SSD storage is enabled (host_vocabulary_size is not 0), the on-chip memory is used only as a cache and must be able to hold the data of at least two batches: device_vocabulary_size must be greater than or equal to the number of deduplicated keys in two consecutive batches. |
| host_vocabulary_size | int | Optional | Vocabulary size in the host DDR, that is, the number of embeddings stored in the DDR on the host. The default value is 0. The value ranges from 0 to 1 billion. In dynamic capacity expansion mode (use_dynamic_expansion is set to True), the on-chip memory is the only storage unit by default, and this parameter is set to 0. An out-of-memory (OOM) error occurs when the memory of the single-server system is exceeded. Set this parameter based on the actual server configuration. |
| ssd_vocabulary_size | int | Optional | Enables storage of embedding data on SSDs. The default value is 0, indicating that the function is disabled. A value greater than 0 enables the function, but only if host_vocabulary_size is also greater than 0. The value ranges from 0 to 1 billion. In dynamic capacity expansion mode (use_dynamic_expansion is set to True), the on-chip memory is the only storage unit by default, and this parameter is set to 0. Set this parameter based on the actual server configuration. |
| ssd_data_path | | Optional | By default, the value is the path from which the script is run (the prototype default is (os.getcwd(),)). |
| is_save | bool | Optional | Whether to save embedding data. The default value is True. |
| is_dp | bool | Optional | Whether to enable data parallelism (DP) for the sparse table. The default value is False. Setting is_dp = True configures a data parallel mode for the sparse table. You are advised to enable this mode when there are 10 GB-level tables, an end-to-end bottleneck occurs on the NPU, and the NPU ALL2ALL communication traffic of sparse tables is less than 16 MB. This can improve performance, with a sparse communication gain of approximately 15%. Function and accuracy are not affected when the DP and MP modes are used together for training resumption, but this operation is not recommended. |
| init_param | float | Optional | Coefficient for embedding initialization. The default value is 1.0. Value range: [-10, 10]. If init_param is greater than 1.0 or less than -1.0, you are advised to decrease batch_size to prevent program exceptions caused by excessive NPU usage. |
| all2all_gradients_op | string | Optional | Gradient aggregation mode after distributed gradient backpropagation. The default value is sum_gradients. |
| enable_merge | bool | Optional | Whether to enable embedding table merging. The default value is False. Tables can be merged only if their key_dtype, dim, emb_initializer, is_save, is_dp, init_param, all2all_gradients_op, and padding_keys_mask (optional) parameters are the same at creation. A merged embedding table reduces the number of CPU threads and the number of communication channels between the host and device, saving system resources. The IDs of the tables to be merged must be independent of each other. Otherwise, the precision is affected. For example, if UserEmbeddingTable and ItemEmbeddingTable are to be merged, UserID and ItemID values cannot overlap. Currently, only the dynamic on-chip memory expansion mode is supported. To enable automatic table merging, you must also enable the automatic graph modification and one-table-multiple-query functions. |
| padding_keys | | Optional | Key (or keys) of the sparse feature in the dataset whose embedding is not to be updated. The default value is None. In this case, set padding_keys_mask to False and padding_keys_len to None, indicating normal training update. If the value is int64/list[int64], set padding_keys_mask to True and padding_keys_len to shape, where shape is the shape of the sparse feature in the dataset corresponding to the key. The embedding corresponding to the key is then not updated. |
| padding_keys_mask | bool | Optional | Whether to mask the embedding corresponding to padding_keys so that it is not updated. The default value is False. In this case, set padding_keys to None and padding_keys_len to None, indicating that the embedding is updated during normal training. If the value is True, set padding_keys to int64/list[int64] and padding_keys_len to int32, indicating that the embedding is not updated. |
| padding_keys_len | | Optional | Shape of the sparse feature in the dataset. Generally, the value is batch size x feature vector dimension. The default value is None. In this case, set padding_keys to None and padding_keys_mask to False, indicating normal training update. If the value is int32, set padding_keys to int64/list[int64] and padding_keys_mask to True, indicating that the embedding corresponding to padding_keys is not updated. |
| value_dtype | Dtype of TensorFlow | Optional | Data type of the sparse feature value. Only tf.float32 is supported, which is also the default value. |
| shard_num | int | Optional | Number of partitions at the embedding layer. The default value is 1. Value range: [1, 8192]. |
| fusion_optimizer_var | bool | Optional | Whether to enable fusion optimization. The default value is True. |
| hashtable_threshold | int | Optional | Hash table threshold. If the value is greater than the threshold, the hash table is used. If the value is less than the threshold, the linear table is used. The default value is 0. Value range: [0, 2147483647]. |
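The documented ranges and the coupling between the three padding parameters can be checked before a table is created. The following is a minimal sketch, not part of the library: check_table_args is a hypothetical helper, "letters" is assumed to mean ASCII letters, and dim is assumed to be passed as a plain int.

```python
import re

# Assumption: "letters" means ASCII letters. Length 1 to 100 as documented.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_.]{1,100}")


def check_table_args(name, dim, init_param=1.0, shard_num=1,
                     padding_keys=None, padding_keys_mask=False,
                     padding_keys_len=None):
    """Hypothetical pre-flight check: return a list of problems (empty if none)."""
    problems = []
    if not _NAME_PATTERN.fullmatch(name):
        problems.append("name must be 1-100 characters from [A-Za-z0-9_.]")
    if not 1 <= dim <= 8192:
        problems.append("dim must be in [1, 8192]")
    if not -10 <= init_param <= 10:
        problems.append("init_param must be in [-10, 10]")
    if not 1 <= shard_num <= 8192:
        problems.append("shard_num must be in [1, 8192]")

    # The padding parameters are either all left at their defaults
    # (normal update) or all set (embedding of padding_keys not updated).
    padding_off = (padding_keys is None and not padding_keys_mask
                   and padding_keys_len is None)
    padding_on = (padding_keys is not None and bool(padding_keys_mask)
                  and padding_keys_len is not None)
    if not (padding_off or padding_on):
        problems.append("padding_keys, padding_keys_mask and "
                        "padding_keys_len must be set together")
    return problems


print(check_table_args("sparse_embeddings_table", 128))
```

An empty list means that none of the checked constraints is violated. The real API may validate more (or differently) than this sketch does.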
- For padding_keys, padding_keys_mask, and padding_keys_len:
  - The DP mode and the lazy adam fusion operator mode are not supported.
  - If padding keys need to be set for all tables, set the static shape mode in the initialization API, for example, init(use_dynamic=False). In this case, the model script must also use drop_remainder=True, for example, dataset = dataset.batch(batch_size, drop_remainder=True).
- Before enabling the DDR/SSD mode, ensure that the automatic graph modification mode is also enabled.
- If dynamic capacity expansion of the on-chip memory is required (use_dynamic_expansion=True is passed to init), host_vocabulary_size and ssd_vocabulary_size are set to 0 regardless of the values passed, so they do not take effect. You can also leave them unset.
- If dynamic capacity expansion of the on-chip memory is not required, the storage mode is determined by whether host_vocabulary_size and ssd_vocabulary_size are 0.
  - If host_vocabulary_size is 0, the DDR function on the host is disabled. Any other value enables it. All embedding tables must either use or not use the host DDR function together, that is, host_vocabulary_size must be 0 for all tables or non-zero for all tables. Otherwise, parameter verification reports the following error:
    ValueError: The host-side DDR function of all tables must be used or not used at the same time. However, host voc size of each table is [].
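The storage rules above can be summarized in a few lines of Python. This is only an illustrative sketch: storage_mode and check_ddr_consistency are hypothetical helpers, the returned labels are invented for this example, and raising an error for an SSD size without a host size is an assumption about how such input should be treated.

```python
def storage_mode(host_vocabulary_size=0, ssd_vocabulary_size=0,
                 use_dynamic_expansion=False):
    """Hypothetical helper: name the storage tiers a table would use."""
    if use_dynamic_expansion:
        # Host and SSD sizes are treated as 0 in dynamic expansion mode.
        return "device"
    if ssd_vocabulary_size > 0 and host_vocabulary_size == 0:
        # Assumption: treat SSD without host DDR as a configuration error.
        raise ValueError("ssd_vocabulary_size > 0 requires "
                         "host_vocabulary_size > 0")
    if ssd_vocabulary_size > 0:
        return "device+ddr+ssd"
    if host_vocabulary_size > 0:
        return "device+ddr"
    return "device"


def check_ddr_consistency(host_sizes):
    """Raise if some tables enable host DDR while others do not."""
    enabled = [size != 0 for size in host_sizes]
    if any(enabled) and not all(enabled):
        raise ValueError("host_vocabulary_size must be 0 for all tables "
                         f"or non-zero for all tables, got {list(host_sizes)}")


print(storage_mode(host_vocabulary_size=1_000_000))
check_ddr_consistency([1_000_000, 2_000_000])  # consistent, no error
```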
Return Value
- Success: Sparse table instance. The returned instance provides the following two methods:

| Method | Function | Prototype | Parameter | Return Value |
|---|---|---|---|---|
| size() | Obtains the size of a sparse table. | def size() | None | Success: size of a sparse table. Failure: An exception is thrown. |
| capacity() | Obtains the capacity of a sparse table. | def capacity() | None | Success: capacity of a sparse table. Failure: An exception is thrown. |

- Failure: An exception is thrown.
Example
    import tensorflow as tf
    from mx_rec.core.embedding import create_table

    sparse_hashtable = create_table(key_dtype=tf.int32,
                                    dim=tf.TensorShape([128]),
                                    name="sparse_embeddings_table",
                                    emb_initializer=tf.truncated_normal_initializer(),
                                    device_vocabulary_size=24_000_000 * 8,
                                    host_vocabulary_size=0)
    table_size = sparse_hashtable.size()  # Obtain the used size of a sparse table.
    table_capacity = sparse_hashtable.capacity()  # Obtain the capacity of a sparse table.
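To judge whether the memory is sufficient for a given configuration, a rough lower bound can be computed from the vocabulary size and dim. The sketch below counts only the raw float32 embedding values for the numbers used in the example. It ignores keys, optimizer state, and any table overhead, and it makes no assumption about how the rows are distributed across devices. raw_embedding_bytes is a hypothetical helper.

```python
def raw_embedding_bytes(vocabulary_size, dim, bytes_per_value=4):
    """Bytes for vocabulary_size x dim values (4 bytes each for tf.float32)."""
    return vocabulary_size * dim * bytes_per_value


total = raw_embedding_bytes(24_000_000 * 8, 128)
print(total)                    # 98304000000
print(round(total / 2**30, 2))  # 91.55 (GiB)
```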
See Also
For details about the API call sequence and example, see Porting and Training.