register_checker

Function

Registers an asynchronous callback that receives the result of a data integrity check.

Format

mindio_acp.register_checker(callback, check_dict, user_context, timeout_sec)

Parameters

Parameter

Mandatory/Optional

Description

Value

callback

Mandatory

Callback function. Its first parameter, result, is the data integrity check result (0: success; any other value: failure). Its second parameter is user_context.

Valid function name

check_dict

Mandatory

Data integrity check condition, of the dict type. It is used to check whether the number of files in each specified path meets the requirement.

  • key: data path
  • value: number of files in the corresponding path specified by key

user_context

Mandatory

Second parameter of the callback.

-

timeout_sec

Mandatory

Timeout interval of the callback, in seconds.

NOTE:

If the training client log contains "watching checkpoint failed", increase the value of this parameter.

The relevant code is in the async_write_tracker_file function in mindio_acp/acc_checkpoint/framework_acp.py, under the actual installation path of mindio_acp.

[1, 3600]
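
To make the meaning of check_dict concrete, the following is a minimal local sketch of a file-count check. It is not the mindio_acp implementation. The helper names count_files and local_check are hypothetical, and the sketch assumes a non-recursive count and an exact match, neither of which is stated in this reference.

```python
import os


def count_files(path):
    """Count regular files directly inside path (non-recursive; an assumption)."""
    return sum(1 for entry in os.scandir(path) if entry.is_file())


def local_check(check_dict):
    """Return 0 if every path holds the expected number of files, else 1.

    Mirrors the callback convention (0: success; other values: failure).
    Exact-match comparison is an assumption made for this illustration.
    """
    for path, expected in check_dict.items():
        if not os.path.isdir(path) or count_files(path) != expected:
            return 1
    return 0
```

For example, local_check({'/mnt/dpc01/checkpoint-last': 4}) returns 0 only when that directory exists and contains exactly four regular files.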

Usage Example

>>> import mindio_acp
>>> def callback(result, user_context):
...     # result: 0 means the integrity check succeeded; any other value means it failed
...     if result == 0:
...         print("success")
...     else:
...         print("fail")
...
>>> context_obj = None
>>> check_dict = {'/mnt/dpc01/checkpoint-last': 4}
>>> mindio_acp.register_checker(callback, check_dict, context_obj, 1000)
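
The example above passes None as user_context. As a possible variation, the sketch below passes an object that records the result, so that other code can wait for the callback. CheckState and on_check_done are hypothetical names. Whether the callback runs on a separate thread is an assumption, and the callback is invoked directly here so that the sketch runs without mindio_acp installed.

```python
import threading


class CheckState:
    """Hypothetical context object passed as user_context."""

    def __init__(self):
        self.done = threading.Event()  # set once the callback has run
        self.result = None             # result code received by the callback


def on_check_done(result, user_context):
    # Store the result (0: success; other values: failure) and wake up waiters.
    user_context.result = result
    user_context.done.set()


state = CheckState()
# With mindio_acp installed, registration would look like:
#   mindio_acp.register_checker(on_check_done, {'/mnt/dpc01/checkpoint-last': 4}, state, 1000)
# Here the callback is called directly to simulate a successful check.
on_check_done(0, state)

state.done.wait(timeout=1)
print("success" if state.result == 0 else "fail")
```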

Return Value

  • None: failure
  • 1: success