msServiceProfiler

Performance tuning for serving frameworks often feels like a "black box," making it difficult to locate issues (for example, slower responses under higher request loads or varying performance across different devices).

msServiceProfiler, a serving optimization tool, provides end-to-end performance profiling. It displays the performance of framework scheduling and model inference side by side, so users can quickly determine whether a bottleneck is caused by the framework or by the model, and then improve service performance.

This section is only a quick-start guide to the serving tuning tool. For details about tool operations, APIs, parameters, and fields, see "msServiceProfiler".

Prerequisites

Procedure

  1. Configure the environment variables.

    To enable the collection capability of msServiceProfiler, set the environment variable SERVICE_PROF_CONFIG_PATH before deploying the MindIE Motor service. If the variable is misspelled, or is not set before the service is deployed, collection cannot be enabled.

    The following uses ms_service_profiler_config.json as an example to show how to set the environment variable.

    export SERVICE_PROF_CONFIG_PATH="./ms_service_profiler_config.json"

    The value of SERVICE_PROF_CONFIG_PATH must be the path to a JSON file, including the file name. This JSON file is the configuration file that controls profile data collection. For example, it specifies the path for storing profile metadata and enables or disables operator collection. For details about the fields, see step 3. If no configuration file exists at the specified path, the tool automatically generates a default configuration file, in which collection is disabled.

    In a multi-node deployment, do not place the configuration file, or the data storage path it specifies, in a shared directory (such as a network share). Writes to such a location may pass through additional network or buffering steps instead of going directly to disk, which can lead to unexpected behavior or results.
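Because a misspelled or unset variable silently prevents collection, it can help to check the setting before deploying the service. The following is a minimal sketch of such a pre-flight check. The helper name check_profiler_env and its status strings are hypothetical and are not part of msServiceProfiler; only the variable name comes from this guide.

```python
import json
import os
from pathlib import Path

ENV_VAR = "SERVICE_PROF_CONFIG_PATH"  # variable name taken from this guide


def check_profiler_env(environ=None):
    """Return a short status string for the profiler configuration path.

    Hypothetical helper for illustration; not provided by msServiceProfiler.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(ENV_VAR)
    if not value:
        return "unset"          # collection cannot be enabled
    path = Path(value)
    if path.suffix.lower() != ".json":
        return "not-json-path"  # the value should be the path to a JSON file
    if not path.exists():
        return "missing"        # the tool is documented to create a default file
    try:
        with path.open(encoding="utf-8") as handle:
            json.load(handle)
    except (OSError, json.JSONDecodeError):
        return "unreadable"     # file exists but is not valid JSON
    return "ok"


if __name__ == "__main__":
    print(f"{ENV_VAR}: {check_profiler_env()}")
```

A result of "missing" is not necessarily an error, since the tool generates a default configuration at that path when the service starts.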

  2. Run the MindIE Motor service.

    If the environment variables are correctly configured, the tool outputs the following logs starting with [msservice_profiler] before the service deployment is complete, indicating that msServiceProfiler has been started:

    [msservice_profiler] [PID:225] [INFO] [ParseEnable:179] profile enable_: false
    [msservice_profiler] [PID:225] [INFO] [ParseAclTaskTime:264] profile enableAclTaskTime_: false
    [msservice_profiler] [PID:225] [INFO] [ParseAclTaskTime:265] profile msptiEnable_: false
    [msservice_profiler] [PID:225] [INFO] [LogDomainInfo:357] profile enableDomainFilter_: false

    If the configuration file specified by SERVICE_PROF_CONFIG_PATH does not exist, the tool outputs logs indicating that the file was created automatically. Using the configuration in step 1 as an example, the tool outputs the following log:

    [msservice_profiler] [PID:225] [INFO] [SaveConfigToJsonFile:588] Successfully saved profiler configuration to: ./ms_service_profiler_config.json
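When the service log is long, the profiler lines can be picked out programmatically. The sketch below parses lines in the layout shown in the samples above (tag, PID, level, function:line, message). The regular expression is inferred from those samples only, so treat the layout as an assumption; the names ProfilerLog, parse_profiler_log, and profiler_started are hypothetical.

```python
import re
from typing import Iterable, NamedTuple, Optional

# Pattern inferred from the sample log lines in this guide (an assumption,
# not a documented log format).
LOG_RE = re.compile(
    r"^\s*\[msservice_profiler\]\s+\[PID:(?P<pid>\d+)\]\s+\[(?P<level>\w+)\]\s+"
    r"\[(?P<func>\w+):(?P<line>\d+)\]\s+(?P<message>.*)$"
)


class ProfilerLog(NamedTuple):
    pid: int
    level: str
    func: str
    line: int
    message: str


def parse_profiler_log(text: str) -> Optional[ProfilerLog]:
    """Parse one log line; return None if it is not a profiler line."""
    match = LOG_RE.match(text)
    if match is None:
        return None
    return ProfilerLog(
        pid=int(match["pid"]),
        level=match["level"],
        func=match["func"],
        line=int(match["line"]),
        message=match["message"].strip(),
    )


def profiler_started(lines: Iterable[str]) -> bool:
    """True if at least one line carries the [msservice_profiler] prefix."""
    return any(parse_profiler_log(item) is not None for item in lines)
```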
  3. Collect data.

    After the MindIE Motor service is successfully deployed, you can precisely control collection behavior by modifying fields in the configuration file.

    {
    	"enable": 1,
    	"prof_dir": "${PATH}/prof_dir/",
    	"acl_task_time": 0
    ...    # Only these three fields are shown as an example.
    }
    
    Table 1 Parameter description

    Parameter: enable
    Required: Yes
    Description: Globally enables or disables profile data collection. Possible values:

    • 0: disabled
    • 1: enabled

    If this parameter is set to 0, no data is collected even if other parameters enable their corresponding features. If this is the only parameter set to 1, only serving profile data is collected.

    Parameter: prof_dir
    Required: No
    Description: Path for storing the collected profile data. The default value is ${HOME}/.ms_server_profiler.

    This path stores the original profile data. Subsequent parsing steps are required to obtain visualized profile data files for analysis.

    If prof_dir is modified while enable is 0, the change takes effect when enable is later changed to 1. If prof_dir is modified while enable is 1, the change does not take effect.

    Parameter: acl_task_time
    Required: No
    Description: Whether to profile operator delivery and execution durations. Possible values:

    • 0: disabled (default). Profiling is also disabled if the value is invalid.
    • 1: enabled

    NOTE:
    • Enabling this function consumes some device performance, which makes the collected profile data less accurate. Enable it only when the model execution time is abnormal and needs further analysis.
    • Operator collection generates large amounts of data. Collecting for 3 to 5 seconds is usually enough. A longer collection consumes more disk space and increases parsing time, so locating performance issues takes longer.
    • The default operator collection level is L0. To enable other operator collection levels, see "msServiceProfiler" for more parameter information.

    If enable stays set to 1, the tool collects data for each request from the moment the MindIE Motor inference service receives it until the request ends, and the directory under prof_dir keeps growing. It is therefore advisable to collect data only during key time periods.

    Whenever the enable field changes, the tool outputs corresponding logs to indicate the change.

    [msservice_profiler] [PID:3259] [INFO] [DynamicControl:407] Profiler Enabled Successfully!

    Or

    [msservice_profiler] [PID:3057] [INFO] [DynamicControl:411] Profiler Disabled Successfully!

    Whenever enable is changed from 0 to 1, all fields in the configuration file are reloaded by the tool, enabling dynamic updates.
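Collecting only during a key time period can be scripted by flipping enable to 1, waiting, and flipping it back to 0. The following is a minimal sketch under two assumptions that this guide does not confirm: that the configuration file is strict JSON, and that the tool detects an in-place rewrite of the file. The helper names set_enable and collect_for are hypothetical.

```python
import json
import time
from pathlib import Path


def set_enable(config_path, enabled):
    """Rewrite only the `enable` field and keep every other field unchanged."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["enable"] = 1 if enabled else 0
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config


def collect_for(config_path, seconds, sleep=time.sleep):
    """Enable collection, wait `seconds`, then always disable it again."""
    set_enable(config_path, True)
    try:
        sleep(seconds)
    finally:
        # Runs even if the wait is interrupted, so collection is not left on.
        set_enable(config_path, False)


if __name__ == "__main__":
    # Example: a short window, in line with the 3 to 5 seconds suggested
    # for operator collection.
    collect_for("./ms_service_profiler_config.json", 5)
```

Because all fields are reloaded when enable changes from 0 to 1, any change to prof_dir or acl_task_time should be written before the window is opened.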

  4. Parse data.
    1. Install environment dependencies.
      python >= 3.10
      pandas >= 2.2
      numpy >= 1.24.3
      psutil >= 5.9.5
    2. Run the parsing command.
      python3 -m ms_service_profiler.parse --input-path=${PATH}/prof_dir

      --input-path is the path specified by the prof_dir parameter in step 3.

      After parsing, parsed profile data files are generated in the directory where the command is executed.
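The dependency check and the parsing command can be combined in a small wrapper. The sketch below compares installed versions against the minimums listed above and builds the documented command line. The helper names are hypothetical, and the version comparison is a simplification that looks only at the leading numeric components (pre-release suffixes are ignored).

```python
import re
import sys
from importlib import metadata

# Minimum versions as listed in this guide.
MIN_PYTHON = (3, 10)
MINIMUMS = {"pandas": "2.2", "numpy": "1.24.3", "psutil": "5.9.5"}


def version_tuple(text):
    """Turn '1.24.3rc1' into (1, 24, 3), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def missing_requirements(installed=None, python=None):
    """Return a list of problems; an empty list means all minimums are met.

    `installed` may map package names to version strings (useful for testing);
    by default the versions are read from the current environment.
    """
    python = tuple(sys.version_info[:2]) if python is None else tuple(python)
    problems = []
    if python < MIN_PYTHON:
        problems.append(f"python {python[0]}.{python[1]} is older than 3.10")
    for name, minimum in MINIMUMS.items():
        if installed is not None:
            found = installed.get(name)
        else:
            try:
                found = metadata.version(name)
            except metadata.PackageNotFoundError:
                found = None
        if found is None:
            problems.append(f"{name} is not installed (need >= {minimum})")
        elif version_tuple(found) < version_tuple(minimum):
            problems.append(f"{name} {found} is older than {minimum}")
    return problems


def parse_command(prof_dir):
    """Build the parsing command from this guide as an argument list."""
    return ["python3", "-m", "ms_service_profiler.parse", f"--input-path={prof_dir}"]
```

The returned list can be passed to subprocess.run once missing_requirements() returns an empty list. Parsed files are written to the directory where the command is executed, so choose the working directory accordingly.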

  5. Perform tuning analysis.

    The parsed profile data is provided in .db, .csv, and .json formats. You can use the .csv files to quickly analyze dimensions such as requests and scheduling, or import the .db or .json files into MindStudio Insight for visualized analysis. For detailed operations, see "Serving Tuning" in MindStudio Insight User Guide.
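Before opening the results, it can be useful to see which parsed files exist for each format. The following sketch groups files in an output directory by the three extensions named above. This guide does not describe the file names or the directory layout of the parsed output, so the recursive search and the helper name group_parsed_outputs are assumptions.

```python
from collections import defaultdict
from pathlib import Path

PARSED_SUFFIXES = {".db", ".csv", ".json"}  # formats named in this guide


def group_parsed_outputs(output_dir):
    """Map each known suffix to the sorted list of matching files found."""
    groups = defaultdict(list)
    for path in sorted(Path(output_dir).rglob("*")):
        suffix = path.suffix.lower()
        if path.is_file() and suffix in PARSED_SUFFIXES:
            groups[suffix].append(path)
    return dict(groups)


if __name__ == "__main__":
    # The current directory is used because parsed files are generated in the
    # directory where the parsing command was executed.
    for suffix, files in group_parsed_outputs(".").items():
        print(suffix, len(files))
```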

    The profile data is presented in a visual format on MindStudio Insight, as shown in the following figure.