Feature Overview
Rec SDK Torch provides the following functions:
- Core model training capabilities:
  - Supports single-server single-device training and single-server multi-device distributed training.
  - Supports models developed based on Torch.
- Recommendation-specific functions:
Leveraging its sparse table solution, Rec SDK Torch provides essential recommendation features, such as non-affinity operator offloading and hash mapping.
- Large-scale sparse table features:
Key Features
Rec SDK Torch provides the following key features: hash mapping, EBC table lookup, row-wise table splitting, pipeline table lookup, and fused table lookup operators.
- Hash mapping
Torch provides nn.Embedding for table lookup with dense IDs, that is, IDs that are already valid row numbers of the table. In recommendation scenarios, however, most raw feature IDs are discrete values that cannot be used directly for table lookup. The common practice is to convert the discrete IDs into row numbers of the table beforehand. Rec SDK Torch instead provides a hash mapping function that maps discrete IDs to row numbers of the embedding table, so IDs do not need to be converted in advance.
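The idea of mapping discrete IDs to row numbers can be illustrated in plain Python. This is a minimal sketch, not the Rec SDK Torch API: the class name HashMapper, its lookup method, and the "next free row" assignment policy are assumptions made for illustration, and the actual implementation may differ.

```python
class HashMapper:
    """Hypothetical helper: gives each unseen raw ID the next free table row."""

    def __init__(self, num_rows):
        self.num_rows = num_rows      # capacity of the embedding table
        self._id_to_row = {}          # hash map: raw ID -> row number

    def lookup(self, raw_ids):
        rows = []
        for raw_id in raw_ids:
            row = self._id_to_row.get(raw_id)
            if row is None:
                if len(self._id_to_row) >= self.num_rows:
                    raise RuntimeError("embedding table is full")
                # Assumed policy: first-seen IDs take rows 0, 1, 2, ...
                row = len(self._id_to_row)
                self._id_to_row[raw_id] = row
            rows.append(row)
        return rows


mapper = HashMapper(num_rows=4)
# Raw IDs are large and non-contiguous; the returned rows are valid indices.
rows = mapper.lookup([9_000_000_123, 42, 9_000_000_123, 77])
print(rows)  # [0, 1, 0, 2]
```

The returned row numbers are the kind of dense indices that nn.Embedding expects, and a repeated raw ID always maps to the same row.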
- EBC table lookup
This feature corresponds to nn.EmbeddingBag in native Torch: when multiple IDs are specified for one lookup, their embeddings are pooled by summation or averaging.
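The pooling behavior can be sketched without any framework. The function below is a plain-Python illustration under stated assumptions: the name pooled_lookup is hypothetical, and the flat ID list plus offsets layout is borrowed from the way nn.EmbeddingBag accepts its input. It is not the Rec SDK Torch interface.

```python
def pooled_lookup(table, ids, offsets, mode="sum"):
    """Pool the rows of `table` selected by each bag of IDs.

    ids     -- flat list of row numbers for all bags
    offsets -- start position of each bag inside `ids`
    """
    if mode not in ("sum", "mean"):
        raise ValueError("mode must be 'sum' or 'mean'")
    bounds = list(offsets) + [len(ids)]
    dim = len(table[0])
    pooled = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        acc = [0.0] * dim
        for row in ids[start:end]:
            for k in range(dim):
                acc[k] += table[row][k]
        if mode == "mean" and end > start:
            acc = [v / (end - start) for v in acc]
        pooled.append(acc)
    return pooled


table = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
ids = [0, 2, 1]      # bag 0 holds rows 0 and 2, bag 1 holds row 1
offsets = [0, 2]
sum_out = pooled_lookup(table, ids, offsets, mode="sum")
mean_out = pooled_lookup(table, ids, offsets, mode="mean")
print(sum_out)   # [[6.0, 8.0], [3.0, 4.0]]
print(mean_out)  # [[3.0, 4.0], [3.0, 4.0]]
```

Each bag yields one output vector regardless of how many IDs it contains, which is what makes pooled lookup suitable for multi-valued features.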
- Row-wise table splitting
When embeddings are split across multiple tables, the split is row-wise. A modulo-based bucketing strategy is used: the remainder of the ID after division by the number of buckets determines which bucket holds the embedding.
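Modulo-based bucketing can be shown in a few lines. This is a sketch under assumptions: the function names bucket_of and split_rows and the choice of three buckets are illustrative only and are not taken from Rec SDK Torch.

```python
def bucket_of(row_id, num_buckets):
    """Bucket index is the remainder of the ID divided by the bucket count."""
    return row_id % num_buckets


def split_rows(row_ids, num_buckets):
    """Group IDs by the bucket that would hold their embedding."""
    buckets = [[] for _ in range(num_buckets)]
    for row_id in row_ids:
        buckets[bucket_of(row_id, num_buckets)].append(row_id)
    return buckets


buckets = split_rows([0, 1, 2, 3, 4, 5, 6, 7], num_buckets=3)
print(buckets)  # [[0, 3, 6], [1, 4, 7], [2, 5]]
```

With consecutive IDs the modulo rule spreads rows evenly, so each bucket receives a similar share of the table.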
- Pipeline table lookup
A Rec SDK Torch table lookup task consists of multiple subtasks, such as communication, CPU computing, and NPU computing. Rec SDK Torch provides a pipeline table lookup method that lets these subtasks run in parallel, so that the subtasks of different batches overlap and the hardware computing power is better utilized.
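The general shape of such a pipeline can be sketched with threads and queues. Everything below is an assumption for illustration: run_pipeline and the three stage functions are hypothetical stand-ins for communication, CPU computing, and NPU computing, and the sketch shows only how stages of different batches can overlap, not how Rec SDK Torch schedules them.

```python
import queue
import threading

_STOP = object()  # sentinel that tells each stage to shut down


def run_pipeline(batches, stages):
    """Run every stage in its own thread, passing batches through queues."""
    queues = [queue.Queue(maxsize=2) for _ in range(len(stages) + 1)]

    def worker(stage, inbox, outbox):
        while True:
            item = inbox.get()
            if item is _STOP:
                outbox.put(_STOP)
                return
            outbox.put(stage(item))

    def feed():
        for batch in batches:
            queues[0].put(batch)
        queues[0].put(_STOP)

    threads = [threading.Thread(target=feed)]
    for i, stage in enumerate(stages):
        threads.append(threading.Thread(
            target=worker, args=(stage, queues[i], queues[i + 1])))
    for t in threads:
        t.start()

    results = []
    while True:
        item = queues[-1].get()
        if item is _STOP:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


def exchange_ids(batch):      # stand-in for the communication subtask
    return sorted(set(batch))


def map_to_rows(ids):         # stand-in for the CPU computing subtask
    return [i % 8 for i in ids]


def gather(rows):             # stand-in for the NPU computing subtask
    return sum(rows)


outputs = run_pipeline([[3, 3, 10], [8, 1], [17]],
                       [exchange_ids, map_to_rows, gather])
print(outputs)  # [5, 1, 1]
```

While one batch is in the last stage, the next batch can already be in an earlier stage. Because each stage is a single thread reading from a FIFO queue, results come out in the same order as the input batches.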
- Fused table lookup operators
Rec SDK Torch provides fused table lookup operators that cover gradient computation and the optimizer, improving table lookup performance.