initialize_system
Applicability
| Product | Supported |
|---|---|
| | √ |
| | √ |
| | ☓ |
| | ☓ |
| | √ |
Description
Initializes GE as a separate step so that the GE initialization time is excluded from the training time statistics. This API is generally not required for training. However, before using a collective communication API, call this API to initialize collective communication.
Prototype
```python
def initialize_system(name=None)
```
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| name | Input | Operator name |
Returns
An operator that initializes GE when it is executed with sess.run(op).
Restrictions
If you call initialize_system and also need to enable the following functions during training, configure them in the session that runs initialize_system.
Example
If you use an HCCL API such as get_local_rank_id, get_rank_size, or get_rank_id before sess.run() or estimator.train(), start a separate session and run initialize_system in it to initialize collective communication first. After training is complete, run shutdown_system and close the session.
```python
import tensorflow as tf
from tensorflow.core.protobuf.rewriter_config_pb2 import RewriterConfig
from npu_bridge.npu_init import *

# Build the operators that initialize and shut down the system.
npu_init = npu_ops.initialize_system()
npu_shutdown = npu_ops.shutdown_system()

# Session configuration for running on the NPU.
config = tf.ConfigProto()
custom_op = config.graph_options.rewrite_options.custom_optimizers.add()
custom_op.name = "NpuOptimizer"
custom_op.parameter_map["use_off_line"].b = True
config.graph_options.rewrite_options.remapping = RewriterConfig.OFF
config.graph_options.rewrite_options.memory_optimization = RewriterConfig.OFF

# Dedicated session that initializes collective communication.
init_sess = tf.Session(config=config)
init_sess.run(npu_init)

# Call an HCCL API...
# Perform training...

# Shut down the system, then release the session.
init_sess.run(npu_shutdown)
init_sess.close()
```
Alternatively, use the session as a context manager so that it is closed automatically:
```python
import tensorflow as tf
from tensorflow.core.protobuf.rewriter_config_pb2 import RewriterConfig
from npu_bridge.npu_init import *

# Build the operators that initialize and shut down the system.
npu_init = npu_ops.initialize_system()
npu_shutdown = npu_ops.shutdown_system()

# Session configuration for running on the NPU.
config = tf.ConfigProto()
custom_op = config.graph_options.rewrite_options.custom_optimizers.add()
custom_op.name = "NpuOptimizer"
custom_op.parameter_map["use_off_line"].b = True
config.graph_options.rewrite_options.remapping = RewriterConfig.OFF
config.graph_options.rewrite_options.memory_optimization = RewriterConfig.OFF

# The session is closed automatically when the with block exits.
with tf.Session(config=config) as sess:
    sess.run(npu_init)

    # Call an HCCL API...
    # Perform training...

    sess.run(npu_shutdown)
```
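In the examples above, the shutdown operator is not executed if training raises an exception. If you want the shutdown step to run in that case as well, one option is to wrap the init/shutdown sequence in a small context manager. The following is a minimal sketch rather than part of npu_bridge: the helper name `initialized_session`, its parameters, and the `FakeSession` stand-in are all hypothetical and are used only so that the snippet runs without an NPU.

```python
from contextlib import contextmanager


@contextmanager
def initialized_session(make_session, init_op, shutdown_op):
    """Run init_op on entry, shutdown_op on exit, then close the session.

    Hypothetical helper. make_session is a zero-argument callable,
    for example: lambda: tf.Session(config=config).
    """
    sess = make_session()
    try:
        sess.run(init_op)
        try:
            yield sess
        finally:
            # Runs even if the body raised, but only after init_op succeeded.
            sess.run(shutdown_op)
    finally:
        # The session is always closed, even if init_op itself failed.
        sess.close()


class FakeSession:
    """Stand-in for tf.Session that only records calls (illustration only)."""

    def __init__(self):
        self.calls = []

    def run(self, op):
        self.calls.append(("run", op))

    def close(self):
        self.calls.append(("close",))


fake = FakeSession()
try:
    with initialized_session(lambda: fake, "npu_init", "npu_shutdown") as sess:
        sess.run("train_step")
        raise RuntimeError("simulated training failure")
except RuntimeError:
    pass

print(fake.calls)
```

With a real session, the call would look like `with initialized_session(lambda: tf.Session(config=config), npu_init, npu_shutdown) as sess:`, with the HCCL calls and training placed inside the block.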