LaunchWithZeroEleOutputTensors
Function
Configures whether an operator is still executed on the NPU when all of its outputs are empty tensors (tensors with zero elements).
Prototype
OpAICoreDef &OpAICoreDef::LaunchWithZeroEleOutputTensors(bool launchFlag)
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| launchFlag | Input | For a user-developed custom operator, set this parameter to true if the operator must still be executed on the NPU when all of its outputs are empty tensors. Otherwise, the operator is not executed. |
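The effect of launchFlag can be pictured as a simple launch decision. The following is a minimal sketch of that decision only, not the framework's actual implementation: Shape, ElementCount, AllOutputsEmpty, and ShouldLaunch are hypothetical names introduced for illustration, and dynamic (negative) dimensions are assumed to be out of scope.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a tensor shape; not part of the operator-definition API.
using Shape = std::vector<int64_t>;

// Returns the number of elements described by a shape.
// Any dimension of length 0 makes the whole tensor empty.
int64_t ElementCount(const Shape& shape) {
    int64_t count = 1;
    for (int64_t dim : shape) {
        // Assumption: this sketch only handles fully known, non-negative dimensions.
        assert(dim >= 0);
        count *= dim;
    }
    return count;
}

// Returns true only when every output tensor has zero elements.
bool AllOutputsEmpty(const std::vector<Shape>& outputs) {
    for (const Shape& shape : outputs) {
        if (ElementCount(shape) != 0) {
            return false;
        }
    }
    return true;
}

// Hypothetical launch decision mirroring the launchFlag description:
// an operator whose outputs are all empty runs only if launchFlag is true.
bool ShouldLaunch(const std::vector<Shape>& outputs, bool launchFlag) {
    return launchFlag || !AllOutputsEmpty(outputs);
}
```

In this sketch, an operator with a single output of shape {0, 16} is skipped when launchFlag is false and launched when it is true, while an operator with at least one non-empty output is launched regardless of the flag.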
Returns
A reference to the OpAICoreDef operator definition, which allows calls to be chained. For details, see OpAICoreDef.
Restrictions
None
Example
class AddCustom : public OpDef {
public:
    AddCustom(const char* name) : OpDef(name)
    {
        this->Input("x")
            .ParamType(REQUIRED);
        this->Input("y")
            .ParamType(REQUIRED);
        this->Output("z")
            .ParamType(REQUIRED);
        this->SetInferShape(ge::InferShape);
        this->AICore()
            .SetTiling(optiling::TilingFunc)
            .SetTilingParse(optiling::TilingPrepare)
            .SetOpSelectFormat(optiling::OpSelectFormat)
            .SetCheckSupport(optiling::CheckSupported)
            .LaunchWithZeroEleOutputTensors(true);
        OpAICoreConfig aicConfig;
        aicConfig.DynamicCompileStaticFlag(true)
            .DynamicFormatFlag(true)
            .DynamicRankSupportFlag(true)
            .DynamicShapeSupportFlag(true)
            .NeedCheckSupportFlag(true)
            .PrecisionReduceFlag(true);
        // Note: Replace soc_version with the actual AI Processor version.
        this->AICore().AddConfig("soc_version", aicConfig);
    }
};