ReportRecoverStrategy

Description

Receives the fault recovery strategies that the client reports for a job.

Prototype

 rpc ReportRecoverStrategy(RecoverStrategyRequest) returns (Status) {}

Input Parameters

Parameter

Type (Defined by Protobuf)

Description

RecoverStrategyRequest

 message RecoverStrategyRequest {
   string jobId = 1;
   repeated FaultRank faultRankIds = 2;
   repeated string strategies = 3;
 }

RecoverStrategyRequest.jobId: job ID

RecoverStrategyRequest.faultRankIds: list of the faulty processors, identified by global rank. Each FaultRank entry is a key-value pair of fault information consisting of rankId (global rank ID) and faultType (fault type): faultType = 0 indicates an on-chip memory fault, and faultType = 1 indicates any other fault.

RecoverStrategyRequest.strategies: list of recovery strategies supported by the current job
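
Example (illustrative sketch only)

The following Python sketch shows one way the request fields described above could be populated. It does not use generated gRPC stubs: the dataclasses only mirror the message layout. The field types of FaultRank (string rankId, integer faultType), the helper build_request, the job ID "job-001", and the strategy names "retry" and "dump" are assumptions made for illustration and are not defined by this interface.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Iterable, List, Tuple


class FaultType(IntEnum):
    """Fault types documented for FaultRank.faultType."""
    ON_CHIP_MEMORY = 0  # faultType = 0: on-chip memory fault
    OTHER = 1           # faultType = 1: any other fault


@dataclass
class FaultRank:
    # Field types are assumptions; the interface only names the two fields.
    rankId: str
    faultType: int


@dataclass
class RecoverStrategyRequest:
    jobId: str
    faultRankIds: List[FaultRank] = field(default_factory=list)
    strategies: List[str] = field(default_factory=list)


def build_request(job_id: str,
                  faults: Iterable[Tuple[int, int]],
                  strategies: Iterable[str]) -> RecoverStrategyRequest:
    """Hypothetical helper: builds a request from (global rank, fault type) pairs.

    FaultType(t) raises ValueError for any value other than 0 or 1.
    """
    ranks = [FaultRank(rankId=str(r), faultType=int(FaultType(t)))
             for r, t in faults]
    return RecoverStrategyRequest(jobId=job_id,
                                  faultRankIds=ranks,
                                  strategies=list(strategies))


# Placeholder values, not taken from the interface definition.
request = build_request(
    "job-001",
    [(3, FaultType.ON_CHIP_MEMORY), (7, FaultType.OTHER)],
    ["retry", "dump"],
)
```

In a real client, the same values would be set on the message class generated from the Protobuf definition and passed to the ReportRecoverStrategy stub method.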

Return Value

Return Value

Type (Defined by Protobuf)

Description

Status

 message Status {
   int32 code = 1;
   string info = 2;
 }

Status.code: return code

  • 0: The recovery process is normal.
  • Other values: The recovery process is abnormal and rescheduling is triggered.

Status.info: return information
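
Example (illustrative sketch only)

The return code semantics above can be sketched as a small check on the caller side. The Status dataclass below only mirrors the message layout, and the helper names recovery_is_normal and describe, as well as the sample code 1 and its info text, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Status:
    code: int = 0
    info: str = ""


def recovery_is_normal(status: Status) -> bool:
    """Code 0 means the recovery process is normal; any other value is abnormal."""
    return status.code == 0


def describe(status: Status) -> str:
    """Hypothetical helper: turns a Status into a log-friendly message."""
    if recovery_is_normal(status):
        return "recovery process is normal"
    # Any non-zero code: the recovery is abnormal and rescheduling is triggered.
    return (f"recovery process is abnormal (code {status.code}), "
            f"rescheduling is triggered: {status.info}")


ok = Status(code=0, info="")
failed = Status(code=1, info="sample failure text")  # placeholder values
```

Only the distinction between 0 and non-zero is documented here, so the sketch does not assign meaning to individual non-zero codes.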