SubscribeProcessManageSignal

Description

Receives a client's request to subscribe to process control signals. The server allocates one message queue per job and monitors that queue for pending messages. When a message is queued, the server pushes it to the client through the gRPC stream.
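The queue-per-job behavior described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the server's actual implementation: the `Signal` and `SignalBroker` classes, the polling interval, and the `max_idle_polls` cutoff are all hypothetical names introduced for the example.

```python
import queue
from dataclasses import dataclass
from typing import Dict, Iterator


@dataclass
class Signal:
    """Hypothetical stand-in for a subset of ProcessManageSignal fields."""
    uuid: str
    jobId: str
    signalType: str


class SignalBroker:
    """Keeps one message queue per job, as the description outlines."""

    def __init__(self) -> None:
        self._queues: Dict[str, "queue.Queue[Signal]"] = {}

    def queue_for(self, job_id: str) -> "queue.Queue[Signal]":
        # Allocate the queue lazily the first time a job ID is seen.
        return self._queues.setdefault(job_id, queue.Queue())

    def publish(self, signal: Signal) -> None:
        # Route the signal to the queue that belongs to its job.
        self.queue_for(signal.jobId).put(signal)

    def subscribe(self, job_id: str, poll_seconds: float = 0.1,
                  max_idle_polls: int = 3) -> Iterator[Signal]:
        # In a gRPC servicer, a generator like this could back the
        # server-streaming method: each yielded item is one stream message.
        # max_idle_polls exists only so that this sketch terminates (assumption).
        q = self.queue_for(job_id)
        idle = 0
        while idle < max_idle_polls:
            try:
                signal = q.get(timeout=poll_seconds)
            except queue.Empty:
                idle += 1
                continue
            idle = 0
            yield signal
```

Publishing signals for two different jobs and then iterating over `subscribe("job-a")` yields only the signals queued for `job-a`, which is the isolation a per-job queue is meant to provide.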

Prototype

 rpc SubscribeProcessManageSignal(ClientInfo) returns (stream ProcessManageSignal){}

Input Parameters

Parameter

Type (Defined by Protobuf)

Description

ClientInfo

 message ClientInfo {
     string jobId = 1;
     string role = 2;
 }

ClientInfo.jobId: job ID

ClientInfo.role: client role

Data to Be Sent

Parameter

Type (Defined by Protobuf)

Description

ProcessManageSignal

 message FaultRank {
     string rankId = 1;
     string faultType = 2;
 }

 message ProcessManageSignal {
     string uuid = 1;
     string jobId = 2;
     string signalType = 3;
     repeated string actions = 4;
     repeated FaultRank faultRanks = 5;
     string changeStrategy = 6;
     int64 timeout = 7;
 }

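FaultRank.rankId: string; ID of the faulty card

FaultRank.faultType: string; fault type

ProcessManageSignal.uuid: string; UUID of the signal

ProcessManageSignal.jobId: string; training job ID

ProcessManageSignal.signalType: string; signal type

ProcessManageSignal.actions: repeated string; actions to be executed

ProcessManageSignal.faultRanks: repeated FaultRank; information about the faulty cards

ProcessManageSignal.changeStrategy: string; recovery policy to be executed

ProcessManageSignal.timeout: int64; timeout interval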
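To make the field layout concrete, the sketch below mirrors the two messages as plain Python dataclasses and renders a received signal as a one-line summary. It is an illustration only: real clients would use the classes generated from the .proto file, the `summarize` helper is hypothetical, and the defaults assume proto3 zero values for unset fields.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FaultRank:
    rankId: str = ""
    faultType: str = ""


@dataclass
class ProcessManageSignal:
    # Defaults follow proto3 zero values (assumption: the file uses proto3).
    uuid: str = ""
    jobId: str = ""
    signalType: str = ""
    actions: List[str] = field(default_factory=list)
    faultRanks: List[FaultRank] = field(default_factory=list)
    changeStrategy: str = ""
    timeout: int = 0


def summarize(signal: ProcessManageSignal) -> str:
    """Render a one-line, log-friendly summary of a received signal."""
    # Empty repeated fields are shown as "-" so the line stays easy to scan.
    ranks = ",".join(f"{r.rankId}:{r.faultType}" for r in signal.faultRanks) or "-"
    actions = ",".join(signal.actions) or "-"
    return (f"job={signal.jobId} type={signal.signalType} "
            f"actions={actions} faultRanks={ranks} timeout={signal.timeout}")
```

A helper of this kind is useful mainly for logging, since `actions` and `faultRanks` are repeated fields and may legitimately be empty.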

Return Value

Return Value

Type (Defined by Protobuf)

Description

stream

gRPC stream

  • This API returns a gRPC stream. The concrete type of the returned stream object depends on the client's programming language.
  • The client calls the stream's receive method (the exact name depends on the client's programming language) to receive data pushed by the server.

nodeRankIds

String array

Node rank IDs of faulty nodes.

extraParams

String

The scaling policy information is serialized as a JSON string, passed transparently to MindIO via TaskD, and ultimately delivered to the callback function for parsing.
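Returning to the gRPC stream this API returns: in Python, a server-streaming call yields an iterator of responses, so "receiving" amounts to iterating over it. The sketch below shows that loop under stated assumptions. The `consume_signals` helper is hypothetical, and the stub name in the trailing comment is a placeholder because the service name is not given here.

```python
from typing import Any, Callable, Iterable


def consume_signals(stream: Iterable[Any], handle: Callable[[Any], None]) -> int:
    """Pass every message pushed by the server to `handle`.

    Returns the number of messages processed once the stream ends.
    """
    count = 0
    for signal in stream:  # blocks until the server pushes the next message
        handle(signal)
        count += 1
    return count


# With generated gRPC code, usage would look roughly like this
# (SomeServiceStub is a placeholder name, not the real stub):
#   stub = SomeServiceStub(channel)
#   stream = stub.SubscribeProcessManageSignal(ClientInfo(jobId="job-a", role="<role>"))
#   consume_signals(stream, print)
```

Taking the stream as a plain iterable keeps the handling logic testable without a running server, since any list or generator can stand in for the gRPC call.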