Table 1 Interface list
Interface definition	Description
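aclOpExecutor()	Constructor of aclOpExecutor, the context structure that records the runtime information of a host-side API. Almost every operation in a host-side API goes through this structure.
CreateView(const aclTensor *tensor, const op::Shape &shape, int64_t offset)	Creates a view tensor of an existing aclTensor. The two tensors share memory; the shape and offset of the view can be specified.
UpdateTensorAddr(void *workspaceAddr, const size_t size)	Refreshes the address of each workspace after the workspace address has been allocated.
GetWorkspaceAddr()	Returns the start pointer of the workspace.
GetWorkspaceSize()	Returns the required workspace size.
GetLinearWorkspaceSize()	Deprecated. Developers can ignore this interface.
GetWorkspaceOffsets()	Returns the recorded list of workspace offsets.
SetWorkspaceOffsets(const op::FVector<uint64_t> &workspaceOffsets)	Sets the list of workspace offsets to record.
Run()	Runs the tasks in the aclOpExecutor execution queue.
GetStream()	Returns the current execution stream.
GetInputTensors()	Returns the input aclTensors of the host-side API.
GetOutputTensors()	Returns the output aclTensors of the host-side API.
GetLogInfo()	Returns the log information stored in the aclOpExecutor.
SetLogInfo(const op::internal::OpLogInfo &logInfo)	Sets the log information stored in the aclOpExecutor.
SetStream(aclrtStream stream)	Sets the current execution stream.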
AddTensorRelation(const aclTensor *tensorOut, const aclTensor *tensorMiddle)	Records an address-equivalence relation between two aclTensors; used by the aclnn cache.
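AbandonCache(bool disableRepeat = false)	Marks the tasks recorded in the current aclOpExecutor as not using the aclnn cache.
UpdateStorageAddr()	Refreshes aclTensor addresses when the aclOpExecutor is reused.
SetRepeatable()	Attempts to mark the current aclOpExecutor as reusable.
IsRepeatable()	Checks whether the current aclOpExecutor is reusable.
FinalizeCache()	Completes the final data saving for the aclnn cache.
RepeatRunWithCache(void *workspaceAddr, const aclrtStream stream)	When the aclOpExecutor is reused, attempts to execute the tasks through the aclnn cache.
CheckLauncherRepeatable()	Checks whether every task in the aclOpExecutor task list allows the aclOpExecutor to be reused.
AddCache()	Saves the aclnn cache held by the aclOpExecutor to global management.
GetOpExecCache()	Returns the aclnn cache recorded in the aclOpExecutor.
SetIOTensorList()	Records the input and output aclTensors of the host-side API.
GetGraph()	Returns the execution graph of the host-side API.
GetMagicNumber()	Returns the magic number, which is used to distinguish different objects.
UniqueExecutor(const char *funcName)	Constructor of UniqueExecutor, the factory that constructs aclOpExecutor.
UniqueExecutor()	Constructor of UniqueExecutor, the factory that constructs aclOpExecutor.
get()	Returns the aclOpExecutor pointer held by the UniqueExecutor.
ReleaseTo(aclOpExecutor **executor)	Passes the aclOpExecutor pointer held by the UniqueExecutor to the target aclOpExecutor pointer.
UniqueExecutor(const UniqueExecutor &)	Deprecated. Developers can ignore this interface.
GetOpExecCacheFromExecutor(aclOpExecutor *)	Attempts to convert an externally supplied aclOpExecutor into an aclnn cache object.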
InitL2Phase1Context(const char *l2Name, [[maybe_unused]] aclOpExecutor *executor)	Initializes some of the DFX variables for phase 1 of the host-side API.
InitL2Phase2Context([[maybe_unused]] const char *l2Name, aclOpExecutor *executor)	Initializes some of the DFX variables for phase 2 of the host-side API.
InitL0Context(const char *profilingName, aclOpExecutor *executor)	Initializes some of the DFX variables for an L0 interface.
CreatAiCoreKernelLauncher([[maybe_unused]] const char *l0Name, uint32_t opType, aclOpExecutor *executor, op::OpArgContext *args)	Creates an AI Core task object.
CreatDSAKernelLauncher([[maybe_unused]] const char *l0Name, uint32_t opType, DSA_TASK_TYPE dsaTask, aclOpExecutor *executor, op::OpArgContext *args)	Creates a DSA task object.
InferShape(uint32_t optype, op::OpArgList &inputs, op::OpArgList &outputs, op::OpArgList &attrs)	Runs infer shape for the specified operator and returns the result.
AllocTensor(const op::Shape &shape, op::DataType dataType, op::Format format = op::Format::FORMAT_ND)	Allocates a device-side aclTensor from the given combination of inputs. The aclTensor is a workspace tensor by default.
AllocTensor(const op::Shape &storageShape, const op::Shape &originShape, op::DataType dataType, op::Format storageFormat, op::Format originFormat)
AllocTensor(op::DataType dataType, op::Format storageFormat, op::Format originFormat)
AllocHostTensor(const op::Shape &shape, op::DataType datatype, op::Format format = op::Format::FORMAT_ND)	Allocates a host-side aclTensor from the given combination of inputs.
AllocHostTensor(const op::Shape &storageShape, const op::Shape &originShape, op::DataType dataType, op::Format storageFormat, op::Format originFormat)	Allocates a host-side aclTensor from the given combination of inputs.
AllocHostTensor(const int64_t *value, uint64_t size, op::DataType dataType)	Allocates a host-side aclTensor whose content is taken from memory of the specified data type.
AllocHostTensor(const uint64_t *value, uint64_t size, op::DataType dataType)
AllocHostTensor(const bool *value, uint64_t size, op::DataType dataType)
AllocHostTensor(const char *value, uint64_t size, op::DataType dataType)
AllocHostTensor(const int32_t *value, uint64_t size, op::DataType dataType)
AllocHostTensor(const uint32_t *value, uint64_t size, op::DataType dataType)
AllocHostTensor(const int16_t *value, uint64_t size, op::DataType dataType)
AllocHostTensor(const uint16_t *value, uint64_t size, op::DataType dataType)
AllocHostTensor(const int8_t *value, uint64_t size, op::DataType dataType)
AllocHostTensor(const uint8_t *value, uint64_t size, op::DataType dataType)
AllocHostTensor(const double *value, uint64_t size, op::DataType dataType)
AllocHostTensor(const float *value, uint64_t size, op::DataType dataType)
AllocHostTensor(const op::fp16_t *value, uint64_t size, op::DataType dataType)
AllocHostTensor(const op::bfloat16 *value, uint64_t size, op::DataType dataType)
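AllocIntArray(const int64_t *value, uint64_t size)	Allocates an aclArray of int type and assigns the given values to it.
AllocFloatArray(const float *value, uint64_t size)	Allocates an aclArray of float type and assigns the given values to it.
AllocBoolArray(const bool *value, uint64_t size)	Allocates an aclArray of bool type and assigns the given values to it.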
AllocTensorList(const aclTensor *const *tensors, uint64_t size)	Allocates an aclTensorList and assigns the given tensors to it.
AllocScalarList(const aclScalar *const *scalars, uint64_t size)	Allocates an aclScalarList and assigns the given scalars to it.
AllocScalar(const void *data, op::DataType dataType)	Allocates an aclScalar of the specified type and assigns the given value to it.
AllocScalar(float value)
AllocScalar(double value)
AllocScalar(op::fp16_t value)
AllocScalar(op::bfloat16 value)
AllocScalar(int32_t value)
AllocScalar(int64_t value)
AllocScalar(int16_t value)
AllocScalar(int8_t value)
AllocScalar(uint32_t value)
AllocScalar(uint64_t value)
AllocScalar(uint16_t value)
AllocScalar(uint8_t value)
AllocScalar(bool value)
ConvertToTensor(const aclIntArray *value, op::DataType dataType)	Converts host-side data of various types into a host-side aclTensor.
ConvertToTensor(const aclBoolArray *value, op::DataType dataType)
ConvertToTensor(const aclFloatArray *value, op::DataType dataType)
ConvertToTensor(const aclFp16Array *value, op::DataType dataType)
ConvertToTensor(const aclBf16Array *value, op::DataType dataType)
ConvertToTensor(const T *value, uint64_t size, op::DataType dataType)
ConvertToTensor(const aclScalar *value, op::DataType dataType)	Converts host-side data of various types into a host-side aclTensor.
AddToKernelLauncherList(op::KernelLauncher *obj)	Adds an AI Core task to the execution queue.
AddToKernelLauncherListDvpp(uint32_t opType, op::KernelLauncher *obj, op::OpArgContext *args)	Adds a DVPP task to the execution queue.
AddToKernelLauncherListCopyTask(uint32_t opType, op::KernelLauncher *obj, op::OpArgList &inputs, op::OpArgList &outputs, op::OpArgList &workspace)	Adds a data-copy task to the execution queue.
AddToKernelLauncherListAiCpu(int32_t opType, op::KernelLauncher *obj, op::OpArgContext *args)	Adds an AI CPU task to the execution queue.
CommonOpExecutorRun(void *workspace, uint64_t workspaceSize, aclOpExecutor *executor, aclrtStream stream)	Executes all tasks in the context, using the externally supplied aclOpExecutor, workspace, and stream.
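
The table suggests a two-phase flow: a first call builds an executor through UniqueExecutor, queues tasks, reports GetWorkspaceSize(), and hands the pointer out with ReleaseTo(); a second call then runs the queued tasks against a caller-provided workspace. The sketch below imitates that ownership handoff with mock types only. MockExecutor, MockUniqueExecutor, MockGetWorkspaceSize, MockRun, and DemoTwoPhase are hypothetical names invented for illustration. They are not the real aclOpExecutor classes, and details such as deleting the executor after the run are choices of this mock, not documented behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for aclOpExecutor: a task queue plus a workspace size.
class MockExecutor {
public:
    void AddTask(std::function<void(void *)> task, uint64_t workspaceBytes) {
        tasks_.push_back(std::move(task));
        workspaceSize_ += workspaceBytes;
    }
    uint64_t GetWorkspaceSize() const { return workspaceSize_; }
    // Runs the queued tasks in insertion order.
    void Run(void *workspace) {
        for (auto &task : tasks_) {
            task(workspace);
        }
    }

private:
    std::vector<std::function<void(void *)>> tasks_;
    uint64_t workspaceSize_ = 0;
};

// Hypothetical stand-in for UniqueExecutor: owns the executor until ReleaseTo().
class MockUniqueExecutor {
public:
    MockUniqueExecutor() : executor_(new MockExecutor()) {}
    ~MockUniqueExecutor() { delete executor_; }  // no-op after ReleaseTo()
    MockUniqueExecutor(const MockUniqueExecutor &) = delete;
    MockUniqueExecutor &operator=(const MockUniqueExecutor &) = delete;

    MockExecutor *get() const { return executor_; }
    MockExecutor *operator->() const { return executor_; }
    // Transfers ownership to the caller and leaves this object empty.
    void ReleaseTo(MockExecutor **out) {
        *out = executor_;
        executor_ = nullptr;
    }

private:
    MockExecutor *executor_;
};

// Phase 1 (mock): queue tasks, report the workspace size, hand out the executor.
int MockGetWorkspaceSize(int *counter, uint64_t *workspaceSize, MockExecutor **executor) {
    MockUniqueExecutor unique;
    unique->AddTask([counter](void *) { *counter += 1; }, 64);
    unique->AddTask([counter](void *) { *counter += 10; }, 32);
    *workspaceSize = unique->GetWorkspaceSize();
    unique.ReleaseTo(executor);
    return 0;
}

// Phase 2 (mock): run every queued task, then free the executor.
int MockRun(void *workspace, uint64_t workspaceSize, MockExecutor *executor) {
    if (executor == nullptr || workspaceSize < executor->GetWorkspaceSize()) {
        return -1;
    }
    executor->Run(workspace);
    delete executor;  // mock-only choice so the sketch does not leak
    return 0;
}

// Drives both phases and returns the counter the two tasks updated.
int DemoTwoPhase() {
    int counter = 0;
    uint64_t size = 0;
    MockExecutor *executor = nullptr;
    if (MockGetWorkspaceSize(&counter, &size, &executor) != 0) {
        return -1;
    }
    assert(size == 96);  // 64 + 32 bytes requested by the two tasks
    std::vector<uint8_t> workspace(size);
    if (MockRun(workspace.data(), size, executor) != 0) {
        return -1;
    }
    return counter;
}
```

Calling DemoTwoPhase() from a main function exercises both phases. If phase 1 returned before ReleaseTo(), the mock factory's destructor would free the executor, which is the reason for routing construction through a factory object in this sketch.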
