Preparing .npy Data of a TensorFlow Model

Generating the .npy File

This version cannot generate NumPy (.npy) files for a TensorFlow model. You need to install a TensorFlow environment and prepare the NumPy data yourself in advance. This section provides only an example of generating .npy files for a TensorFlow model, for reference.

Generating .npy files of a TensorFlow model requires a complete, executable, standard TensorFlow model application project. You can use the TensorFlow debugger (tfdbg) to generate the .npy files. The major steps are as follows:

  1. Add the tfdbg debugging configuration to the TensorFlow inference script:
    • In Estimator mode:
      from tensorflow.python import debug as tf_debug
      training_hooks = [train_helper.PrefillStagingAreaHook(), tf_debug.LocalCLIDebugHook()]
      Add the tfdbg hook, as shown in Figure 1.
      Figure 1 Estimator mode
    • In session.run mode:
      from tensorflow.python import debug as tf_debug
      sess = tf_debug.LocalCLIDebugWrapperSession(sess, ui_type="readline")
      Set the tfdbg wrapper before run, as shown in Figure 2.
      Figure 2 Session.run mode
  2. Run the inference script.
  3. After inference is complete, the script enters the tfdbg interactive CLI. Run the run command.

    After the run command is executed, the tensor data can be saved as .npy files from the tfdbg CLI.

Collecting the .npy File

After the run command is executed, collect the .npy files. tfdbg can dump only one tensor per command. To collect all .npy files automatically, perform the following operations:

  1. In the tfdbg CLI, run lt > tensor_name to write the names of all tensors to a temporary file named tensor_name.
  2. Open a new CLI and run the following command to generate the commands to be executed in the tfdbg CLI:
    timestamp=$(($(date +%s%N)/1000)) ; cat tensor_name | awk '{print "pt",$4,$4}' | awk '{gsub("/", "_", $3);gsub(":", ".", $3);print($1,$2,"-n 0 -w "$3".""'$timestamp'"".npy")}' > tensor_name_cmd.txt

    The .npy file names produced by this command comply with the naming rules for accuracy comparison. tensor_name is the name of the file that stores the tensor list; replace it if you saved the list under a different name. The value of timestamp must match the regular expression [0-9]{1,255}.

  3. Go back to the tfdbg CLI and run the commands generated in the previous step (the contents of tensor_name_cmd.txt) to save all .npy files.
    The .npy files are stored using numpy.save(). In the file names generated by the command in step 2, slashes (/) in tensor names are replaced by underscores (_) and colons (:) are replaced by periods (.).

    If the commands cannot be pasted into the CLI, run the mouse off command in the tfdbg CLI to disable mouse mode, and then paste again.

  4. Check whether names of the generated .npy files comply with the naming rules, as shown in Figure 3.
    • An .npy file is named in the format {op_name}.{output_index}.{timestamp}.npy, where op_name may contain only the characters A-Z, a-z, 0-9, underscore (_), and hyphen (-), timestamp must match the regular expression [0-9]{1,255}, and output_index is a number.
    • If the name of an .npy file exceeds 255 characters due to a long operator name, comparison of this operator is not supported.
    • The names of some .npy files may not meet the naming requirements because of tfdbg or the operating environment. You can manually rename these files according to the naming rules. If a large number of .npy files do not meet the requirements, generate the .npy files again by referring to How Do I Handle Exceptions in the Generated .npy File Names in Batches?
    Figure 3 Viewing the .npy files
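
The command generation in step 2 and the name check in step 4 can also be sketched in Python. This is a minimal illustration, not part of the tool: the function names are hypothetical, and it assumes that each data row written by lt carries the tensor name (containing a colon) in its fourth whitespace-separated field, as the awk command above does. The sample row is made up for demonstration.

```python
import re
import time

# Naming rule from step 4: {op_name}.{output_index}.{timestamp}.npy
NPY_NAME = re.compile(r"[A-Za-z0-9_-]+\.[0-9]+\.[0-9]{1,255}\.npy")


def build_pt_command(tensor_name, timestamp):
    """Mirror the awk pipeline: '/' -> '_', ':' -> '.', then append the timestamp."""
    file_stem = tensor_name.replace("/", "_").replace(":", ".")
    return f"pt {tensor_name} -n 0 -w {file_stem}.{timestamp}.npy"


def commands_from_listing(lines, timestamp):
    """Yield one pt command per row whose fourth field looks like a tensor name."""
    for line in lines:
        fields = line.split()
        # Assumption: data rows hold the tensor name (with ':') in field 4;
        # header rows and blank lines are skipped.
        if len(fields) >= 4 and ":" in fields[3]:
            yield build_pt_command(fields[3], timestamp)


def is_valid_npy_name(file_name):
    """Check the naming rule and the 255-character limit on the file name."""
    return len(file_name) <= 255 and NPY_NAME.fullmatch(file_name) is not None


if __name__ == "__main__":
    # Microseconds since the epoch, like $(date +%s%N)/1000 in the shell command.
    timestamp = time.time_ns() // 1000
    sample_rows = ["[0.000] 1.82k VariableV2 hidden/weights:0"]  # hypothetical row
    for command in commands_from_listing(sample_rows, timestamp):
        print(command)
```

To check files that already exist, is_valid_npy_name can be applied to each file name in the output directory (for example, the names returned by os.listdir) to list the files that need renaming.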