Starting TF Serving
The following uses the installation user HwHiAiUser as an example to describe how to start TF Serving. Replace it with the actual username. Ensure that the installation user has read permission, or read and write permissions, on the paths described in this document.
- Create the tf_serving_test folder in the installation user directory, create the config.cfg file in the folder, and add the following content to the file. For details about the fields, see Session Configuration.
platform_configs {
  key: "tensorflow"
  value {
    source_adapter_config {
      [type.googleapis.com/tensorflow.serving.SavedModelBundleSourceAdapterConfig] {
        legacy_config {
          session_config {
            graph_options {
              rewrite_options {
                custom_optimizers {
                  name: "NpuOptimizer"
                  parameter_map: { key: "use_off_line" value: { b: true } }
                  parameter_map: { key: "mix_compile_mode" value: { b: true } }
                  parameter_map: { key: "graph_run_mode" value: { i: 0 } }
                  parameter_map: { key: "precision_mode" value: { s: "force_fp16" } }
                }
                remapping: OFF
              }
            }
          }
        }
      }
    }
  }
}
- (Optional) If multiple models are loaded, create the model import configuration file models.config in the tf_serving_test folder and add the following content. The inception_v3_flowers, inception_v4, and inception_v4_imagenet models are used as examples. Replace them with the actual model names.
model_config_list: {
  config: {
    name: "inception_v3_flowers",  # Model name
    base_path: "/home/HwHiAiUser/tf_serving_test/inception_v3_flowers",  # Model path
    model_platform: "tensorflow"
  },
  config: {
    name: "inception_v4",
    base_path: "/home/HwHiAiUser/tf_serving_test/inception_v4",
    model_platform: "tensorflow"
  },
  config: {
    name: "inception_v4_imagenet",
    base_path: "/home/HwHiAiUser/tf_serving_test/inception_v4_imagenet",
    model_platform: "tensorflow"
  }
}
- Place the trained SavedModel in the tf_serving_test directory. For details, see the following directory structure.
squeezenext/
└── 1
    ├── saved_model.pb
    └── variables
        ├── variables.data-00000-of-00001
        └── variables.index
In the preceding directory structure, 1 indicates the model version number.
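If the exported SavedModel is not yet organized this way, the following minimal sketch arranges it (assuming the exported files sit in a local export/ directory and the model is named squeezenext; adjust both to your setup):

# Create the versioned directory expected by TF Serving (1 is the version number).
mkdir -p /home/HwHiAiUser/tf_serving_test/squeezenext/1
# Copy the exported saved_model.pb and variables folder into version 1.
cp -r export/saved_model.pb export/variables /home/HwHiAiUser/tf_serving_test/squeezenext/1/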
To improve the TF Serving deployment performance, you can convert the model from the SavedModel format to the .om format. For details, see Converting a SavedModel to an .om Model. When an .om model is used for online inference, the data dump function of Model Accuracy Analyzer is not supported.
- Set environment variables.
- Add the npu_bridge path to the environment variable LD_LIBRARY_PATH.
export LD_LIBRARY_PATH=${TFPLUGIN_INSTALL_PATH}/npu_bridge:$LD_LIBRARY_PATH
${TFPLUGIN_INSTALL_PATH} is the installation path of the TF Adapter package.
- Add the tf_adapter path to the LD_LIBRARY_PATH environment variable.
export LD_LIBRARY_PATH=/home/HwHiAiUser/xxx/serving-1.15.0/third_party/tf_adapter:$LD_LIBRARY_PATH
xxx is the installation path of TF Serving.
- Set environment variables based on the selected CANN package.
- Include the installation path of the online inference dependency in the environment variable.
- Scenario 1: Install Ascend-CANN-Toolkit for inference on an Ascend AI device, which serves as the development environment.
. /home/HwHiAiUser/Ascend/ascend-toolkit/set_env.sh
- Scenario 2: Install Ascend-CANN-NNAE on an Ascend AI device.
. /home/HwHiAiUser/Ascend/nnae/set_env.sh
- Set the environment variable for the TFPlugin package.
export PYTHONPATH=${TFPLUGIN_INSTALL_PATH}:$PYTHONPATH
${TFPLUGIN_INSTALL_PATH} is the installation path of the TF Adapter package.
- If multiple Python versions exist in the operating environment, specify the Python 3.7.5 installation path.
export PATH=/usr/local/python3.7.5/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/python3.7.5/lib:$LD_LIBRARY_PATH
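Taken together, a minimal consolidated setup sketch for Scenario 1 might look as follows. The TFPLUGIN_INSTALL_PATH value below is a hypothetical example; replace it and the other paths with your actual installation paths.

# Hypothetical TF Adapter (TFPlugin) installation path; replace with the actual one.
export TFPLUGIN_INSTALL_PATH=/home/HwHiAiUser/Ascend/tfplugin
# npu_bridge and tf_adapter libraries (xxx is the TF Serving installation path).
export LD_LIBRARY_PATH=${TFPLUGIN_INSTALL_PATH}/npu_bridge:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/home/HwHiAiUser/xxx/serving-1.15.0/third_party/tf_adapter:$LD_LIBRARY_PATH
# CANN environment variables (Scenario 1: Ascend-CANN-Toolkit).
. /home/HwHiAiUser/Ascend/ascend-toolkit/set_env.sh
# TFPlugin Python path.
export PYTHONPATH=${TFPLUGIN_INSTALL_PATH}:$PYTHONPATH
# Pin Python 3.7.5 if multiple Python versions are installed.
export PATH=/usr/local/python3.7.5/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/python3.7.5/lib:$LD_LIBRARY_PATH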
- Start tensorflow_model_server and import the configuration files created in steps 1 and 2. For example:
For a single model, run the following command:
tensorflow_model_server --port=8500 --rest_api_port=8501 --model_base_path=/home/HwHiAiUser/tf_serving_test/squeezenext --model_name=squeezenext --platform_config_file=/home/HwHiAiUser/tf_serving_test/config.cfg
For multiple models, run the following command:
tensorflow_model_server --port=8500 --rest_api_port=8501 --model_config_file=/home/HwHiAiUser/tf_serving_test/models.config --allow_version_labels_for_unavailable_models=true --model_config_file_poll_wait_seconds=60 --platform_config_file=/home/HwHiAiUser/tf_serving_test/config.cfg
If tensorflow_model_server fails to be started after the CANN software of another version is installed, rectify the fault by referring to Recompiling TF Serving.
Use absolute paths in the command. If the startup is successful, the following information is displayed:

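To confirm that the model has been loaded, you can also query the standard TF Serving REST model status endpoint (this assumes the single-model example above, with --rest_api_port=8501 and model name squeezenext):

curl http://localhost:8501/v1/models/squeezenext

If the model was loaded successfully, the returned model version status should report the state AVAILABLE.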
You can run the tensorflow_model_server --help command to view the startup mode and options. The following table describes the options.
Table 1 Options

| Option | Description | Example |
| --- | --- | --- |
| --port | Uses the gRPC mode for communication. | 8500 |
| --rest_api_port | Uses the HTTP/REST API mode for communication. If set to 0, this option does not take effect. The specified port number must be different from that of the gRPC mode. | 8501 |
| --model_config_file | Imports multiple models. The file must be in the same directory as the models and the --platform_config_file configuration file. | /home/HwHiAiUser/tf_serving_test/models.config |
| --model_config_file_poll_wait_seconds | Sets the interval (in seconds) for updating the --model_config_file configuration file. While the service is running, models written to --model_config_file are updated and loaded to the server. | 60 |
| --model_name | Loads a single model. The value is the parent directory name of the version directory where the model is located. | squeezenext |
| --model_base_path | Sets the path of the loaded model. If --model_config_file has been configured, ignore this option. | /home/HwHiAiUser/tf_serving_test/squeezenext |
| --platform_config_file | Sets the feature configuration file. | /home/HwHiAiUser/tf_serving_test/config.cfg |
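Once the server is running, you can send an inference request through the REST API. The following is a minimal sketch using the standard TF Serving predict endpoint; the request body is a placeholder and must match the input signature of your model:

curl -X POST http://localhost:8501/v1/models/squeezenext:predict \
  -d '{"instances": [[0.1, 0.2, 0.3]]}'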