RunGraphDistribute
Applicability
| Product | Supported or Not |
|---|---|
| | √ |
| | √ |
| | x |
| | √ |
| | √ |
Header File/Library File
- Header file: #include <ge/ge_api.h>
- Library file: libge_runner.so
Function Usage
Synchronously runs the graph with the specified ID after the input refdata node has been partitioned, and outputs the running result.
Difference from RunGraph: with this API, the input refdata node is partitioned, and the output contains the result of each device.
Prototype
Status RunGraphDistribute(uint32_t graph_id, const std::map<int32_t, std::vector<Tensor>> &device_to_inputs, std::map<int32_t, std::vector<Tensor>> &device_to_outputs)
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| graph_id | Input | ID of the graph to be run. |
| device_to_inputs | Input | Input tensors of the computational graph, keyed by device ID after partitioning. The tensor memory is allocated on the host. Type: const std::map<int32_t, std::vector<Tensor>>. |
| device_to_outputs | Output | Output tensors of the computational graph, keyed by device ID. You do not need to allocate the memory manually; GE allocates and initializes it after the run completes. Type: std::map<int32_t, std::vector<Tensor>>. |
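The per-device map layout described above can be sketched as follows. This is a minimal illustration, not GE code: the `Tensor` struct is a stand-in for the GE tensor type so the snippet compiles without the GE headers, and `DeviceTensors`, `BuildDeviceInputs`, and `OutputCount` are hypothetical helper names that do not exist in the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Stand-in for the GE Tensor type (assumption, for illustration only).
struct Tensor {
  std::string name;
};

// Same container shape as device_to_inputs / device_to_outputs in the prototype.
using DeviceTensors = std::map<int32_t, std::vector<Tensor>>;

// Hypothetical helper: maps each device ID to its own list of host-side inputs.
DeviceTensors BuildDeviceInputs(const std::vector<int32_t> &device_ids,
                                const std::vector<std::vector<Tensor>> &per_device) {
  // One input list is expected per device ID.
  assert(device_ids.size() == per_device.size());
  DeviceTensors device_to_inputs;
  for (std::size_t i = 0; i < device_ids.size(); ++i) {
    device_to_inputs[device_ids[i]] = per_device[i];
  }
  return device_to_inputs;
}

// Hypothetical helper: number of tensors recorded for a device, 0 if the ID is absent.
std::size_t OutputCount(const DeviceTensors &device_to_outputs, int32_t device_id) {
  auto it = device_to_outputs.find(device_id);
  return it == device_to_outputs.end() ? 0U : it->second.size();
}
```

In real code, a map built this way would be passed as device_to_inputs, and an empty map of the same type would be passed as device_to_outputs for GE to fill in.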
Returns
Restrictions
- If the graph specified by graph_id takes full input, the inputs must be in the following order: model data input + batch_index + kv.
- If the graph specified by graph_id takes incremental input, the inputs must be in the following order: model data input + kv.
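The two input orders above can be illustrated with a small sketch that assembles the input list for one device. It assumes a stand-in `Tensor` struct instead of the GE tensor type, and `AssembleInputs` is a hypothetical helper, not part of the GE API. Passing a null `batch_index` selects the incremental layout.

```cpp
#include <string>
#include <vector>

// Stand-in for the GE Tensor type (assumption, for illustration only).
struct Tensor {
  std::string name;
};

// Hypothetical helper: builds one device's input list in the required order.
// Full input:        model data input + batch_index + kv
// Incremental input: model data input + kv (batch_index == nullptr)
std::vector<Tensor> AssembleInputs(const std::vector<Tensor> &model_inputs,
                                   const Tensor *batch_index,
                                   const std::vector<Tensor> &kv) {
  std::vector<Tensor> inputs(model_inputs);  // model data input comes first
  if (batch_index != nullptr) {
    inputs.push_back(*batch_index);          // only present for full input
  }
  inputs.insert(inputs.end(), kv.begin(), kv.end());  // kv tensors come last
  return inputs;
}
```

The resulting vector would be stored under the corresponding device ID in device_to_inputs.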