Text Detection (CTPN)

Training Parameters and Value Ranges

The following table describes the names, types, value ranges, default values, and descriptions of training parameters of the CTPN model.

Table 1 Parameters for CTPN model training (ctpn_mindspore/model_train.py)

| Parameter | Type | Value Range | Default Value | Description |
|---|---|---|---|---|
| --train_dataset_path | String | - | None | Path of the training dataset. |
| --pretrained_ckpt_path | String | - | "pre_trained_ckpt" | Path of the pre-trained model. |
| --train_output_path | String | - | "train_output_path" | Output path of the training result. |
| --device_id | Integer | 0–7 | 0 | ID of the NPU used for training. |
| --device_num | Integer | 1 only (only single-device training is supported) | 1 | Number of NPUs used for training. |
| --input_width | String | An integer multiple of 16 within (0,1280) | 960 | Input width for training. |
| --input_height | String | An integer multiple of 16 within (0,1280) | 576 | Input height for training. |
| --batch_size | Integer | 1 only | 1 | Number of images per training batch. |
| --init_lr | Float | (0,1) | 0.0005 | Initial learning rate for training. |
| --epoch_size | Integer | [1,10000] | 10 | Number of training epochs. |
| --run_eval | Bool | True or False | True | Whether to perform evaluation during training. |
| --eval_start_epoch | Integer | [1,10000] | 1 | Epoch from which evaluation starts during training. |
| --save_best_ckpt | Bool | True or False | True | Whether to save the optimal CKPT file. This parameter will be deleted in later versions. Currently, when run_eval is set to True, the optimal CKPT file is saved by default. |
| --eval_interval | Integer | [1,10000] | 1 | Interval, in epochs, between evaluations during training. |
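The value ranges in Table 1 can be checked before a training job is launched. The following is a minimal sketch of such a check using argparse. It is not the argument handling of model_train.py itself: the helper names (multiple_of_16, bounded_int, open_unit_float, build_parser) are hypothetical, and the width and height are parsed as integers here for range checking even though the table lists their type as String.

```python
import argparse


def multiple_of_16(text):
    """Parse a size that must be a multiple of 16 within the open interval (0, 1280)."""
    value = int(text)
    if not 0 < value < 1280 or value % 16 != 0:
        raise argparse.ArgumentTypeError(
            f"{text!r} must be an integer multiple of 16 within (0, 1280)")
    return value


def bounded_int(low, high):
    """Return a parser for integers in the closed interval [low, high]."""
    def parse(text):
        value = int(text)
        if not low <= value <= high:
            raise argparse.ArgumentTypeError(
                f"{text!r} must be within [{low}, {high}]")
        return value
    return parse


def open_unit_float(text):
    """Parse a float that must lie in the open interval (0, 1)."""
    value = float(text)
    if not 0.0 < value < 1.0:
        raise argparse.ArgumentTypeError(f"{text!r} must be within (0, 1)")
    return value


def build_parser():
    # Hypothetical parser that mirrors the ranges and defaults listed in Table 1.
    parser = argparse.ArgumentParser(description="CTPN training parameter check (sketch)")
    parser.add_argument("--input_width", type=multiple_of_16, default=960)
    parser.add_argument("--input_height", type=multiple_of_16, default=576)
    parser.add_argument("--device_id", type=bounded_int(0, 7), default=0)
    parser.add_argument("--device_num", type=bounded_int(1, 1), default=1)
    parser.add_argument("--batch_size", type=bounded_int(1, 1), default=1)
    parser.add_argument("--init_lr", type=open_unit_float, default=0.0005)
    parser.add_argument("--epoch_size", type=bounded_int(1, 10000), default=10)
    parser.add_argument("--eval_start_epoch", type=bounded_int(1, 10000), default=1)
    parser.add_argument("--eval_interval", type=bounded_int(1, 10000), default=1)
    return parser
```

With this sketch, a value such as --input_width=1000 (not a multiple of 16) or --batch_size=2 is rejected with an error message before any training starts.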

Training Command Reference

The command for training the CTPN model is as follows:

python3 model_train.py --train_dataset_path=/path_to/images --train_output_path=./output_dir --pretrained_ckpt_path=/path_to/ckpt --epoch_size=20 --batch_size=1 --input_width=960 --input_height=576 --init_lr=0.0005 --device_num=1 --device_id=0 --run_eval=True
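When the same command has to be issued repeatedly with different settings, it can be assembled from a dictionary of parameters. The following is a minimal sketch under that assumption; build_train_command is a hypothetical helper, the paths are the placeholders from the command above, and the actual launch is left commented out.

```python
import shlex
import sys


def build_train_command(params, script="model_train.py", python=sys.executable):
    """Turn a {name: value} mapping into a --name=value argument list."""
    cmd = [python, script]
    for name, value in params.items():
        cmd.append(f"--{name}={value}")
    return cmd


# Parameter values copied from the documented training command.
train_params = {
    "train_dataset_path": "/path_to/images",
    "train_output_path": "./output_dir",
    "pretrained_ckpt_path": "/path_to/ckpt",
    "epoch_size": 20,
    "batch_size": 1,
    "input_width": 960,
    "input_height": 576,
    "init_lr": 0.0005,
    "device_num": 1,
    "device_id": 0,
    "run_eval": True,
}

cmd = build_train_command(train_params)
print(shlex.join(cmd))  # Shows the command line that would be executed.

# To start training, run the command from the directory containing model_train.py:
# import subprocess
# subprocess.run(cmd, check=True)
```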

Model training is stochastic, so results vary between runs; rely on the evaluated accuracy. The following figure shows the training accuracy obtained with four images and 20 epochs when the preceding parameters are used.

Figure 1 Training accuracy

After the training is complete, the log information is displayed as follows:

Figure 2 Log information

After the training is complete, the .ckpt, .a310.om, .a310p.om, and .air model files are generated in the output directory specified by the --train_output_path parameter.
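To confirm that all four kinds of model files were produced, the output directory can be scanned by file suffix. The following is a minimal sketch; find_model_files is a hypothetical helper, and searching subdirectories recursively is an assumption, because the exact layout of the output directory is not described here.

```python
from pathlib import Path

# Suffixes of the model files listed above.
EXPECTED_SUFFIXES = (".ckpt", ".a310.om", ".a310p.om", ".air")


def find_model_files(output_dir):
    """Map each expected suffix to the files under output_dir whose names end with it."""
    root = Path(output_dir)
    found = {suffix: [] for suffix in EXPECTED_SUFFIXES}
    if not root.is_dir():
        return found
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        for suffix in EXPECTED_SUFFIXES:
            if path.name.endswith(suffix):
                found[suffix].append(path)
    return found


if __name__ == "__main__":
    # "./output_dir" is the value passed to --train_output_path in the command above.
    for suffix, paths in find_model_files("./output_dir").items():
        status = ", ".join(str(p) for p in paths) if paths else "missing"
        print(f"{suffix}: {status}")
```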

Evaluation Parameters and Value Ranges

The following table describes the names, types, value ranges, default values, and descriptions of evaluation parameters of the CTPN model.

Table 2 Evaluation parameters (ctpn_mindspore/model_eval.py)

| Parameter | Type | Value Range | Default Value | Description |
|---|---|---|---|---|
| --eval_dataset_path | String | - | "" | Path of the evaluation dataset. |
| --eval_ckpt_path | String | - | "" | Path of the trained model. |
| --eval_output_path | String | - | "/eval_output_path" | Output path of the evaluation results. |
| --device_id | Integer | 0–7 | 0 | ID of the NPU used for evaluation. |
| --text_iou_thresh | Float | - | 0.2 | IoU threshold for text connection. |
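For readers unfamiliar with the metric behind --text_iou_thresh, intersection over union (IoU) is the overlap area of two boxes divided by the area of their union. The following sketch only illustrates the metric for axis-aligned boxes given as (x_min, y_min, x_max, y_max). How model_eval.py applies the threshold when connecting text is not described here, so box_iou and exceeds_threshold are hypothetical helpers rather than the tool's implementation.

```python
def box_iou(box_a, box_b):
    """Return the IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def exceeds_threshold(box_a, box_b, thresh=0.2):
    """Compare the IoU of two boxes against a threshold (0.2 is the documented default)."""
    return box_iou(box_a, box_b) > thresh
```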

Evaluation Command Reference

The command for evaluating the CTPN model is as follows:

python3 model_eval.py --eval_dataset_path=path_to/images --eval_ckpt_path=path_to/your_ckpt --eval_output_path=./eval_result --device_id=0

The CKPT file generated during training is used to evaluate the model accuracy. The following figure shows an example.

Figure 3 Model evaluation accuracy

In the evaluation output directory, the all_images, ng_images, and ok_images folders store the image evaluation results, and the statistics.csv file stores the corresponding accuracy results.

Figure 4 Evaluation directory
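The evaluation output can be summarized with a short script. The following is a minimal sketch that counts the files in each result folder and loads statistics.csv as raw rows. It assumes that the three folders and the CSV file sit directly under the directory passed to --eval_output_path; the column layout of statistics.csv is not described here, so the rows are returned without interpretation. summarize_eval_output is a hypothetical helper.

```python
import csv
from pathlib import Path

# Folder names taken from the description of the evaluation directory.
RESULT_FOLDERS = ("all_images", "ng_images", "ok_images")


def summarize_eval_output(eval_dir):
    """Return ({folder: file_count}, rows_of_statistics_csv) for an evaluation directory."""
    root = Path(eval_dir)
    counts = {}
    for name in RESULT_FOLDERS:
        folder = root / name
        if folder.is_dir():
            counts[name] = sum(1 for p in folder.iterdir() if p.is_file())
        else:
            counts[name] = 0

    rows = []
    stats_file = root / "statistics.csv"
    if stats_file.is_file():
        with stats_file.open(newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle))
    return counts, rows


if __name__ == "__main__":
    # "./eval_result" is the value passed to --eval_output_path in the command above.
    folder_counts, stat_rows = summarize_eval_output("./eval_result")
    for name, count in folder_counts.items():
        print(f"{name}: {count} file(s)")
    for row in stat_rows:
        print(row)
```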