ShardGraphsToFile

Description

This API is used for the distributed build and graph sharding of foundation models. For a large-scale graph, call this API to shard the graph and save the sharded graphs as .pb files.

The graphs in a session are sharded in the order in which they were added with AddGraph. Each sharded graph is saved as a .pb file.

The sharding mode is specified by the ge.graphParallelOptionPath option. See Command-Line Options. If the parallel graph function is disabled, this API does not work.

Naming rules for sharded graphs:

  • In SPMD mode: The naming format is {Original graph name}_{ClusterId}_{ItemId}_{ChipId}_{VirtualStageId}_{Original GraphId}. The number of sharded graphs equals the number of original graphs multiplied by the number of shards.
  • In non-SPMD mode: The naming format is {Original graph name}_{Original GraphId}. The number of sharded graphs equals the number of original graphs.
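The naming rules above can be sketched as a few small helpers. This is only an illustration: SpmdShardName, NonSpmdShardName, and ExpectedShardedGraphCount are hypothetical names that are not part of the GE API, and the sketch assumes each ID is written as a plain decimal integer, which the rules above do not specify.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of the GE API). SPMD format:
// {Original graph name}_{ClusterId}_{ItemId}_{ChipId}_{VirtualStageId}_{Original GraphId}
// Assumption: each ID is rendered as a decimal integer.
std::string SpmdShardName(const std::string &graph_name, uint32_t cluster_id,
                          uint32_t item_id, uint32_t chip_id,
                          uint32_t virtual_stage_id, uint32_t graph_id) {
  return graph_name + "_" + std::to_string(cluster_id) + "_" +
         std::to_string(item_id) + "_" + std::to_string(chip_id) + "_" +
         std::to_string(virtual_stage_id) + "_" + std::to_string(graph_id);
}

// Hypothetical helper (not part of the GE API). Non-SPMD format:
// {Original graph name}_{Original GraphId}
std::string NonSpmdShardName(const std::string &graph_name, uint32_t graph_id) {
  return graph_name + "_" + std::to_string(graph_id);
}

// Number of sharded graphs to expect: originals x shards in SPMD mode,
// and one per original graph otherwise.
std::size_t ExpectedShardedGraphCount(std::size_t original_graphs,
                                      std::size_t shards, bool spmd) {
  return spmd ? original_graphs * shards : original_graphs;
}
```

For example, under these assumptions three original graphs sharded four ways in SPMD mode would yield twelve sharded graphs, while the same three graphs in non-SPMD mode would yield three.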

After a graph is sharded, the original graph no longer exists in the session. The newly generated graph that replaces it is assigned a different graph ID from the original one.

  • Difference between this API and SaveGraphsToPb:

    ShardGraphsToFile is applicable to distributed build and graph sharding of foundation models, while SaveGraphsToPb applies to any graph.

  • Differences between this API and ShardGraphs:
    • ShardGraphsToFile searches for the strategy, shards a graph, and outputs the sharded graphs. It also writes the sharded graphs to disk (file_path must be a valid path).
    • ShardGraphs searches for the strategy, shards a graph, and outputs the sharded graphs. The sharded graphs are kept in memory and are written to disk by a subsequent call to SaveGraphsToPb.

Prototype

Status ShardGraphsToFile(const char_t *file_path = "./") const;

Restrictions

This API works only when the parallel graph function is enabled.

Parameters

Parameter | Input/Output | Description
file_path | Input | Directory for storing the graph and weight. It must be a valid path. If this parameter is null, the graph is sharded without being saved as .pb files.
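Because file_path must be a valid path, a caller may want to check the directory before invoking the API. The following is a minimal sketch that assumes "valid" means an existing directory, which this reference does not define further. IsExistingDirectory is a hypothetical helper, not part of the GE API, and it requires C++17 for std::filesystem.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper (not part of the GE API).
// Assumption: a "valid path" is a directory that already exists.
bool IsExistingDirectory(const std::string &path) {
  if (path.empty()) {
    return false;
  }
  std::error_code ec;  // Non-throwing overload: any error yields false.
  return std::filesystem::is_directory(path, ec);
}
```

A caller could run this check on the intended output directory and skip or report the sharding request when it returns false.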

Returns

Parameter | Type | Description
- | Status | SUCCESS: success. FAILED: failure.

Example

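Session session(options); // The parallel graph function is enabled through options.
Graph init_graph("init_graph");
Graph first_graph("first_graph");
Graph second_graph("second_graph");
// Build each graph and add it to the session with AddGraph before sharding (omitted here).
Status ret = session.ShardGraphsToFile("/xxx/graph_path");
if (ret != SUCCESS) {
  // Handle the sharding failure.
}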