Starting TF Serving

This section describes how to start TF Serving. Replace the example paths with your actual paths as needed. Ensure that the installation user has the read, or read and write permissions on the paths described in this document.

Create the tf_serving_test folder in any directory (the $HOME directory is used as an example in this section), create the config.cfg configuration file in the folder, and add the following content to the file: For details about the fields, see Session Configuration.

platform_configs {
key: "tensorflow"
  value {
    source_adapter_config {
      [type.googleapis.com/tensorflow.serving.SavedModelBundleSourceAdapterConfig] {
        legacy_config {
          session_config {
            graph_options {
              rewrite_options {
                custom_optimizers {
                  name: "NpuOptimizer"
                  parameter_map: {
                    key: "use_off_line"
                    value: {
                      b: true
                    }
                  }
                  parameter_map: {
                    key: "mix_compile_mode"
                    value: {
                      b: true
                    }
                  }
                  parameter_map: {
                    key: "graph_run_mode"
                    value: {
                      i: 0
                    }
                  }
                  parameter_map: {
                    key: "precision_mode"
                    value: {
                      s: "force_fp16"
                    }
                  }                                    
                }
                remapping: OFF
              }
            }
          }
        } 
      }
    }
  }
}

(Optional) If multiple models are loaded, create the model import configuration file models.config in the tf_serving_test folder and add the following content.

The inception_v3_flowers, inception_v4, and inception_v4_imagenet models are used as examples. Replace them with the actual model names.

model_config_list:{
        config:{
          name:"inception_v3_flowers",      # Model name
          base_path:"$HOME/tf_serving_test/inception_v3_flowers",  # Model path
          model_platform:"tensorflow"
	},
	config:{
          name:"inception_v4",
          base_path:"$HOME/tf_serving_test/inception_v4",
          model_platform:"tensorflow"
	},
        config:{
          name:"inception_v4_imagenet",
          base_path:"$HOME/tf_serving_test/inception_v4_imagenet",
          model_platform:"tensorflow"
        }
}

Place the trained SavedModel in the tf_serving_test directory. For details, see the following directory structure.

       
            squeezenext/ 
└── 1
     ├── saved_model.pb
     └── variables
         ├── variables.data-00000-of-00001
         └── variables.index

Wherein, 1 indicates the version number.

To improve the TF Serving deployment performance, you can convert the model from the SavedModel format to the .om format. For details, see Converting a SavedModel to an .om Model. When an .om model is used for online inference, the data dump function for accuracy comparison is not supported.

Set environment variables.

Add the npu_bridge path to the environment variable LD_LIBRARY_PATH.

         
              export LD_LIBRARY_PATH=${TFPLUGIN_INSTALL_PATH}/npu_bridge:$LD_LIBRARY_PATH

${TFPLUGIN_INSTALL_PATH} indicates the installation path of the TF Adapter package.

Add the tf_adapter path to the LD_LIBRARY_PATH environment variable.

         
              export LD_LIBRARY_PATH=$HOME/serving-1.15.0/third_party/tf_adapter:$LD_LIBRARY_PATH

Set environment variables based on the selected CANN package.

         
              # Configure environment variables of the CANN software. The default installation path of the root user is used as an example.
source /usr/local/Ascend/cann/set_env.sh

# TF Adapter Python library. ${TFPLUGIN_INSTALL_PATH} indicates the installation path of the TF Adapter package.
export PYTHONPATH=${TFPLUGIN_INSTALL_PATH}:$PYTHONPATH

export JOB_ID=10087

Start tensorflow_model_server, and import the configuration file in steps 1 and 2. For example:

For a single model, run the following command:

        
             tensorflow_model_server --port=8500 --rest_api_port=8501 --model_base_path=$HOME/tf_serving_test/squeezenext --model_name=squeezenext --platform_config_file=$HOME/tf_serving_test/config.cfg

For multiple models, run the following command:

       
            tensorflow_model_server --port=8500 --rest_api_port=8501 --model_config_file=$HOME/tf_serving_test/models.config --allow_version_labels_for_unavailable_models=true --model_config_file_poll_wait_seconds=60 --platform_config_file=$HOME/tf_serving_test/config.cfg

If tensorflow_model_server fails to be started after the CANN software of another version is installed, rectify the fault by referring to Rebuilding TF Serving.

Use an absolute path. If the startup is successful, the following information is displayed:

You can run the tensorflow_model_server --help command to view the startup mode and options. The following table describes the options.

**Table 1** Options
Option	Description
--port	Uses the GPRC mode for communication.
--rest_api_port	Uses the HTTP/REST API mode for communication. If set to 0, this option does not take effect. In addition, the specified port number must be different from that of the GPRC mode.
--model_config_file	Imports multiple models. The file must be in the same directory as the models and --platform_config_file configuration file.
--model_config_file_poll_wait_seconds	Sets the interval for updating the --model_config_file configuration file. When the service is enabled, the models written to --model_config_file are updated in real time and loaded to the server. Unit: s.
--model_name	Loads a single model. The value is the parent directory name of the version directory where the model is located.
--model_base_path	Sets the path of the loaded model. If --model_config_file has been configured, ignore this option.
--platform_config_file	Sets the feature configuration file.

Parent topic: Deploying TensorFlow Serving–based Online Inference Services