Model List

Note: The model list of this migration tool is provided for reference only. The line numbers mentioned in the remarks are also for reference only.

Table 1 Supported PyTorch models

No.

Model

Reference Link to Original Training Project Code

Remarks

1

3D-Transformer-tr_spe

https://github.com/smiles724/Molformer/tree/f5cad25e037b0a63c7370c068a9c477f4004c5ea

-

2

3D-Transformer-tr_cpe

3

3D-Transformer-tr_full

4

AFM

https://github.com/shenweichen/DeepCTR-Torch/tree/b4d8181e86c2165722fa9331bc16185832596232

Except for the DIN model, these models do not have corresponding training scripts. Before migrating such a model, copy the ./examples/run_din.py file, rename the copy run_<model_name>.py, and modify it as follows:

  1. Import the model structure, for example, from deepctr_torch.models.ccpm import CCPM.
  2. Input different arguments based on the model structure to initialize the network, for example, model = CCPM(feature_columns, feature_columns, device=device).
  3. Modify the inputs of the network based on whether the network supports the dense_feature.
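
A minimal sketch of the resulting run_ccpm.py, using stand-in stubs (the CCPM class and column names below are illustrative placeholders, not the real deepctr_torch API):

```python
# Sketch of a copied run_din.py renamed run_ccpm.py; CCPM here is a stub that only
# mirrors the initialization pattern described in steps 1 and 2 above.
class CCPM:  # stand-in for: from deepctr_torch.models.ccpm import CCPM
    def __init__(self, linear_feature_columns, dnn_feature_columns, device="cpu"):
        self.linear_feature_columns = linear_feature_columns
        self.dnn_feature_columns = dnn_feature_columns
        self.device = device

feature_columns = ["sparse_feat_a", "sparse_feat_b"]  # hypothetical column names
device = "cpu"

# Step 2: initialize the network with the arguments this model structure expects.
model = CCPM(feature_columns, feature_columns, device=device)
```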

5

AutoInt

6

CCPM

7

DCN

8

DeepFM

9

DIN

10

FiBiNET

11

MLR

12

NFM

13

ONN

14

PNN

15

WDL

16

xDeepFM

17

BERT

https://github.com/codertimo/BERT-pytorch/tree/d10dc4f9d5a6f2ca74380f62039526eb7277c671

  • After the migration, the project can be used only after being installed. The installation procedure is as follows:
    • Delete the torch item from the requirements.txt file.
    • Run the python3 setup.py install command.
  • For details, see the README.md file of the repository.

18

BEiT

https://github.com/microsoft/unilm/tree/9cbfb3e40eedad33a8d2f1f15c4a1e26fa50a5b1

  • Perform the following operations before the migration:
    • Download the model's source code and retain only the beit folder.
    • Download the open-source code of pytorch-image-models 0.3.3 and move the timm folder in that code to the beit folder.
  • After the migration, because the PyTorch model weights cannot be converted to MindSpore model weights, comment out the code in lines 550 and 560 of utils.py.

19

BiT-M-R101x1

https://github.com/google-research/big_transfer/tree/140de6e704fd8d61f3e5ea20ffde130b7d5fd065

20

BiT-M-R101x3

21

BiT-M-R152x2

22

BiT-M-R152x4

23

BiT-M-R50x1

24

BiT-M-R50x3

25

BiT-S-R101x1

26

BiT-S-R101x3

27

BiT-S-R152x2

28

BiT-S-R152x4

29

BiT-S-R50x1

30

BiT-S-R50x3

31

CenterNet-ResNet50

https://github.com/bubbliiiing/centernet-pytorch/tree/91b63b9d0fef2e249fbddee8266c79377f0c7946

  • After the migration, process the dataset based on the README.md file of the repository.
  • No trained MindSpore model weights exist, so leave model_path in train.py empty.

32

CenterNet-HourglassNet

33

Conformer-tiny

https://github.com/pengzhiliang/Conformer/tree/815aaad3ef5dbdfcf1e11368891416c2d7478cb1

  • Before the migration, place the timm library (version 0.3.2 is recommended) in the root directory of the original code.
  • The framework does not support --repeated-aug currently. Use --no-repeated-aug during training.

34

Conformer-small

35

Conformer-base

36

DeiT-tiny

37

DeiT-small

38

DeiT-base

39

CvT-13

https://github.com/microsoft/CvT/tree/f851e681966390779b71380d2600b52360ff4fe1

  • Before the migration, place the timm (version 0.3.2 is recommended) and einops libraries in the root directory of the original code.
  • Modify the ./run.sh file before the migration as follows:
    • Change the training start mode (lines 4 to 10) in train() to python3 tools/train.py ${EXTRA_ARGS}.
    • Change the test start mode (lines 15 to 21) in test() to python3 tools/test.py ${EXTRA_ARGS}.

40

CvT-21

41

CvT-W24

42

albert-base-v1

https://github.com/huggingface/transformers/tree/49cd736a288a315d741e5c337790effa4c9fa689

Before the migration, remove the template files of the original repository. These files are not Python files but are suffixed with .py.

mv templates ../  
After the migration, make the following modifications:
  • To avoid the 'list index out of range' error, remove the [1] index from the d["torch_dtype"] = x2ms_adapter.tensor_api.split(str(d["torch_dtype"]), ".")[1] statement in src/transformers/configuration_utils.py.
    After the modification:
    d["torch_dtype"] = x2ms_adapter.tensor_api.split(str(d["torch_dtype"]), ".")
  • Remove the [1] index from the model_to_save.config.torch_dtype = x2ms_adapter.tensor_api.split(str(dtype), ".")[1] statement in src/transformers/modeling_utils.py.

    After the modification:

    model_to_save.config.torch_dtype = x2ms_adapter.tensor_api.split(str(dtype), ".")
  • Change the return value of is_torch_available() in ./src/transformers/utils/import_utils.py to True to follow the original PyTorch process.

    Before the modification:

    def is_torch_available():
        return _torch_available

    After the modification:

    def is_torch_available():
        return True
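
The reason for dropping the [1] index can be reproduced in plain Python: a PyTorch dtype stringifies with a dot ("torch.float32"), while the dtype string seen after migration may contain no dot, so taking the second split element raises the error. The dtype strings below are illustrative:

```python
# Why the [1] index fails after migration: str(dtype) may no longer contain a dot.
pytorch_style = "torch.float32"
migrated_style = "Float32"  # hypothetical post-migration dtype string with no dot

assert pytorch_style.split(".")[1] == "float32"  # two parts: indexing [1] is safe

parts = migrated_style.split(".")
assert parts == ["Float32"]  # one part: parts[1] would raise IndexError
```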

43

albert-large-v1

44

albert-xlarge-v1

45

albert-xxlarge-v1

46

albert-Text classification

47

albert-TokenClassification

48

albert-QA

49

albert-MultipleChoice

50

bert-base-uncased

51

bert-large-uncased

52

bert-base-QA

53

bert-base-Text classification

54

bert-base-Multiple Choice

55

bert-base-token-classification

56

distilbert-base-uncased

57

distilbert-base-QA

58

distilbert-base-Text classification

59

roberta-base

60

roberta-large

61

roberta-base-Multiple Choice

62

roberta-base-Text classification

63

roberta-base-token-classification

64

roberta-base-QA

65

xlm-mlm-en-2048

66

xlm-mlm-ende-1024

67

xlm-mlm-enro-1024

68

xlm-clm-enfr-1024

69

xlm-Text classification

70

xlm-Roberta-base

71

xlm-roberta-large

72

xlm-roberta-Text classification

73

xlm-roberta-token-classification

74

xlm-roberta-QA

75

xlnet-base-cased

76

xlnet-large-cased

77

XLNet-base-Text classification

78

XLNet-base-token-classification

79

XLNet-base-Multiple Choice

80

XLNet-base-QA

81

DistilRoBERTa

After the migration, modify the definition of is_torch_available() in ./src/transformers/utils/import_utils.py.

Before the modification:

def is_torch_available():
    return _torch_available

After the modification:

def is_torch_available():
    return True

82

Transformer-XL

After the migration, make the following modifications:

  • Modify the definition of is_torch_available() in ./src/transformers/utils/import_utils.py.

    Before the modification:

    def is_torch_available():
        return _torch_available

    After the modification:

    def is_torch_available():
        return True
  • Because a MindSpore dtype cast to a string has a different structure from a torch dtype, you need to modify ./src/transformers/modeling_utils.py as follows:

    Before the modification:

    model_to_save.config.torch_dtype = x2ms_adapter.tensor_api.split(str(dtype), ".")[1]

    After the modification:

    model_to_save.config.torch_dtype = str(dtype) 

83

EfficientNet-B0

https://github.com/lukemelas/EfficientNet-PyTorch/tree/7e8b0d312162f335785fb5dcfa1df29a75a1783a

-

84

EfficientNet-B1

85

EfficientNet-B2

86

EfficientNet-B3

87

EfficientNet-B4

88

EfficientNet-B5

89

EfficientNet-B6

90

EfficientNet-B7

91

EfficientNet-B8

92

egfr-att

https://github.com/lehgtrung/egfr-att/tree/0666ee90532b1b1a7a2a179f8fbf10af1fdf862f

-

93

Faster R-CNN

https://github.com/AlphaJia/pytorch-faster-rcnn/tree/943ef668facaacf77a4822fe79331343a6ebca2d

  • The following backbone networks are supported:
    • MobileNet
    • ResNet-FPN
    • VGG16
    • HRNet
  • Before the migration, make the following modifications:
    • Because the MultiScaleRoIAlign operator of TorchVision 0.9.0 is used, copy the torchvision/ops/poolers.py file that contains the operator to the root directory, and change the import of the operator in ./utils/train_utils.py and ./utils/faster_rcnn_utils.py to the following:
      from poolers import MultiScaleRoIAlign
    • Because MindSpore does not have the API corresponding to torch.utils.data.Subset, you need to comment out the code related to the API in ./utils/coco_utils.py. The following is an example:
      # if isinstance(dataset, torch.utils.data.Subset):
      #     dataset = dataset.dataset
  • After the migration, make the following modifications:
    • Because the BitwiseOr operator in MindSpore does not support UINT8 inputs, you need to modify the following expression in ./utils/roi_header_util.py:
      Before the modification:
      pos_inds_img | neg_inds_img
      After the modification:
      pos_inds_img.astype(mindspore.int32) | neg_inds_img.astype(mindspore.int32)
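
The cast can be illustrated with NumPy as a stand-in (the mask values below are made-up examples, not the project's actual tensors):

```python
import numpy as np

# NumPy stand-in for the mask union: cast the UINT8 masks to int32 before the
# bitwise OR, mirroring the fix for MindSpore's BitwiseOr, which rejects UINT8.
pos_inds_img = np.array([1, 0, 1, 0], dtype=np.uint8)
neg_inds_img = np.array([0, 1, 0, 0], dtype=np.uint8)

sampled = pos_inds_img.astype(np.int32) | neg_inds_img.astype(np.int32)
```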

94

FCOS-ResNet50

https://github.com/zhenghao977/FCOS-PyTorch-37.2AP/tree/2bfa4b6ca57358f52f7bc7b44f506608e99894e6

After the migration, make the following modifications:

  • Because the VOC dataset is used, you need to change the dataset path in line 39 of the ./train_voc.py code to the actual path.
  • Because MindSpore does not have the corresponding scatter operator, you need to modify the ./model/loss.py file as follows:
    1. Replace lines 125 and 126 with the following code:
      min_indices = mindspore.ops.ArgMinWithValue(-1)(areas.reshape(-1, areas.shape[-1]))
      tmp = np.arange(0, batch_size * h_mul_w).astype(np.int32)
      indices = mindspore.ops.Concat(-1)((mindspore.ops.ExpandDims()(mindspore.Tensor(tmp), -1), mindspore.ops.ExpandDims()(min_indices[0], -1)))
      reg_targets = mindspore.ops.GatherNd()(ltrb_off.reshape(-1, m, 4), indices) 
    2. Replace line 130 with the following code:
      cls_targets = mindspore.ops.GatherNd()(classes.reshape(-1, m, 1), indices) 
    3. Import the corresponding package to line 7 of the file.
      import numpy as np 
  • Because no pre-trained MindSpore models exist, you need to change the values of pretrained, freeze_stage_1, and freeze_bn in ./model/config.py to False.

95

FCOS-ResNet101

96

MGN-strong

https://git.openi.org.cn/Smart_City_Model_Zoo/mgn-strong

  • Before the migration, make the following modifications:
    1. Because this model depends on TorchVision, you need to copy the models/ folder in the torchvision/ directory to the ./mgn-strong/model/ directory.
    2. Change the content of ./mgn-strong/model/models/__init__.py to the following:
      from .resnet import *
    3. Modify the import statement in line 7 of ./mgn-strong/model/mgn.py.

      Before the modification:

      from torchvision.models.resnet import resnet50, Bottleneck, resnet101

      After the modification:

      from .models.resnet import resnet50, Bottleneck, resnet101
    4. Modify the addmm_ call statement in line 83 of ./mgn-strong/loss/triplet.py.

      Before the modification:

      dist.addmm_(1, -2, inputs, inputs.t())
      After the modification:
      dist.addmm_(inputs, inputs.t(), beta=1, alpha=-2)
  • Ensure that the model runs on MindSpore 1.7.0.
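
The addmm_ rewrite keeps the same arithmetic, beta * dist + alpha * (inputs @ inputs.T); a NumPy check with a toy input:

```python
import numpy as np

# NumPy check that the keyword form computes the same pairwise-distance update:
# dist.addmm_(inputs, inputs.t(), beta=1, alpha=-2) == 1*dist + (-2)*(inputs @ inputs.T)
inputs = np.array([[0.0, 1.0],
                   [1.0, 0.0]])
sq = (inputs ** 2).sum(axis=1, keepdims=True)
dist = sq + sq.T                                  # ||x_i||^2 + ||x_j||^2
dist = 1 * dist + (-2) * (inputs @ inputs.T)      # squared Euclidean distances
```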

97

MobileNetV1 SSD

https://github.com/qfgaohao/pytorch-ssd/tree/f61ab424d09bf3d4bb3925693579ac0a92541b0d

In MindSpore, tensors cannot be used during dataset loading, and ModuleList does not support slicing. Therefore, before the migration, modify the ./vision/ssd/ssd.py file in the original project folder as follows:

  • Change the for loop in line 57 to the following loop:
    for idx in range(start_layer_index, end_layer_index):
        layer = self.base_net[idx]
  • Insert the center_form_priors = center_form_priors.asnumpy() statement before the self.center_form_priors = center_form_priors statement in line 143.
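
The index-based loop can be sketched in plain Python (the layer list below is a made-up stand-in for self.base_net):

```python
# Pure-Python stand-in: iterate sub-layers by index instead of slicing the
# container, since ModuleList slicing is not supported after migration.
layers = [lambda v, k=k: v + k for k in range(5)]   # hypothetical base_net sub-layers

def run_base_net(x, start_layer_index, end_layer_index):
    for idx in range(start_layer_index, end_layer_index):
        layer = layers[idx]
        x = layer(x)
    return x
```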

98

MobileNetV1 SSD-Lite

99

MobileNetV2 SSD-Lite

100

MobileNetV3-Large SSD-Lite

101

MobileNetV3-Small SSD-Lite

102

SqueezeNet SSD-Lite

103

VGG16 SSD

104

SqueezeNet

https://github.com/weiaicunzai/pytorch-cifar100/tree/2149cb57f517c6e5fa7262f958652227225d125b

105

InceptionV3

106

InceptionV4

107

InceptionResNetV2

108

Xception

109

Attention56

110

StochasticDepth18

111

StochasticDepth34

112

StochasticDepth50

113

StochasticDepth101

114

VGG11

115

VGG13

116

VGG16

117

DenseNet161

118

DenseNet169

119

DenseNet201

120

PreActResNet34

121

PreActResNet50

122

PreActResNet101

123

PreActResNet152

124

ResNeXt152

125

SEResNet34

126

SEResNet50

127

SEResNet101

128

VGG19

https://github.com/kuangliu/pytorch-cifar/tree/49b7aa97b0c12fe0d4054e670403a16b6b834ddd

129

PreActResNet18

130

DenseNet121

131

ResNeXt29_2x64d

132

MobileNet

133

MobileNetV2

134

SENet18

135

ShuffleNetG2

136

GoogleNet

137

DPN92

138

RetinaNet-ResNet34

https://github.com/yhenon/pytorch-retinanet/tree/0348a9d57b279e3b5b235461b472d37da5feec3d

  • Because the original repository code contains a torch version check and torch model loading, modify the original project script ./train.py before the migration.
    • In the backbone model selection in lines 77 to 88, set the pretrained parameter to False.
    • Delete assert torch.__version__.split('.')[0] == '1' from line 18.
  • After the migration, make the following modifications due to the restrictions on MindSpore backpropagation and dataset loading:
    • Replace lines 25 and 26 in ./retinanet/losses.py with the following code:
      def print_grad_fn(cell_id, grad_input, grad_output):
          pass
      class FocalLoss(mindspore.nn.Cell):
          def __init__(self):
              super(FocalLoss, self).__init__()
              self.register_backward_hook(print_grad_fn)

139

RetinaNet-ResNet50

140

Res2Net

https://github.com/Res2Net/Res2Net-ImageNet-Training/tree/d77c16ff111522c64e918900f100699acc62f706

Because the migration of torchvision.models APIs is not supported, modify the original project as follows:

  1. Create the ./res2net_pami/models directory.
  2. In ./res2net_pami/main.py, change import torchvision.models as models to import models.

141

ResNet-18

https://github.com/pytorch/examples/tree/41b035f2f8faede544174cfd82960b7b407723eb/imagenet

Because the migration of torchvision.models APIs is not supported, modify the original project as follows:

  1. Create the ./imagenet/models directory.
  2. Copy torchvision/models/resnet.py from the TorchVision library (version 0.6.0) to ./imagenet/models and delete the from .utils import load_state_dict_from_url statement.
  3. Create the ./imagenet/models/__init__.py file. The file content is as follows:
    from .resnet import *
  4. In ./main.py, change import torchvision.models as models to import models.

142

ResNet-34

143

ResNet-50

144

ResNet-101

145

ResNet-152

146

ResNeXt-50 (32x4d)

147

ResNeXt-101 (32x8d)

148

Wide ResNet-50-2

149

Wide ResNet-101-2

150

sparse_rcnnv1-resnet50

https://github.com/liangheming/sparse_rcnnv1/tree/65f54808f43c34639085b01f7ebc839a3335a386

After the migration, make the following modifications:
  • In ./x2ms_adapter/nn.py, manually change the values of batch_size, src_seq_length, and tgt_seq_length of the initialization function in the MultiheadAttention class.
  • In ./nets/common.py, change the if x.requires_grad: statement to if True:.
  • In ./losses/sparse_rcnn_loss.py, convert the item[i] variable of the linear_sum_assignment function to NumPy.
    indices = linear_sum_assignment(item[i].asnumpy())
  • Modify ./datasets/coco.py as follows:
    • Change the return statement return box_info of the __getitem__ definition module to return box_info.img, box_info.labels, box_info.boxes.
    • Modify the for loop of the collect_fn definition module.
      image, labels, boxes = item  # Add
      img = (image[:, :, ::-1] / 255.0 - np.array(rgb_mean)) / np.array(rgb_std)  # Change item.img to image.
      target = x2ms_np.concatenate([labels[:, None], boxes], axis=-1)  # Change item.labels to labels and item.boxes to boxes.

151

sparse_rcnnv1-resnet101

152

ShuffleNetV2

https://github.com/megvii-model/ShuffleNet-Series/tree/aa91feb71b01f28d0b8da3533d20a3edb11b1810

-

153

ShuffleNetV2+

154

SMSD

https://git.openi.org.cn/PCLNLP/Sarcasm_Detection/src/commit/54bae1f2306a4d730551b4508ef502cfdbe79918

Before the migration, perform the following operations:

  • Create the state_dict folder in ./SMSD/.
  • Add the following statement to ./SMSD/models/__init__.py to import the SMSD_bi model:
    from models.SMSD_bi import SMSD_bi

You can use the --repeat parameter in the migrated code to control the number of training repetitions (with the SMSD_bi model as an example).

python3 train.py --model_name SMSD_bi --repeat 1

155

SMSD_bi

156

Swin-Transformer

https://github.com/microsoft/Swin-Transformer/tree/5d2aede42b4b12cb0e7a2448b58820aeda604426

  • Before the migration, place the timm library code in the root directory of the original code.
  • The recommended timm library version is 0.4.12.
  • Currently, the --cfg parameter supports only the following configuration files:
    • swin_tiny_patch4_window7_224.yaml
    • swin_tiny_c24_patch4_window8_256.yaml
    • swin_small_patch4_window7_224.yaml
    • swin_base_patch4_window7_224.yaml

157

Transformer

https://github.com/SamLynnEvans/Transformer/tree/e06ae2810f119c75aa34585442872026875e6462

Migrate the torchtext library on which the scripts in the code repository depend and note the following:

  • Copy the migrated torchtext_x2ms to the script folder.
  • Rename torchtext_x2ms to torchtext to ensure that the migrated torchtext is called.
  • The recommended torchtext version is 0.6.0.

158

UNet

https://github.com/milesial/Pytorch-UNet/tree/e1a69e7c6ce18edd47271b01e4aabc03b436753d

-

159

RCNN-Unet

https://github.com/bigmb/Unet-Segmentation-Pytorch-Nest-of-Unets/tree/c050f5eab6778cba6dcd8f8a68b74c9e62a698c8

Before the migration, perform the following operations:

  • Due to the syntax restrictions on MindSpore derivation, the comments in lines 249 and 252 of ./pytorch_run.py need to be indented to a multiple of four spaces.
  • The model requires that the size of the input image be a multiple of 16. Therefore, if the size of a dataset image is not a multiple of 16, you need to remove the comments in lines 121, 122, 505, and 506 in ./pytorch_run.py to crop and scale the image to a multiple of 16.
  • If dataset label images have one channel, add .convert('RGB') to the end of line 293 in ./pytorch_run.py to convert the images to 3-channel images.
  • If ModuleList is used in MindSpore, the weight names of the sublayers change. Therefore, change torch.nn.ModuleList in line 350 of ./pytorch_run.py to list; otherwise, a saved checkpoint file cannot be reloaded.

160

Attention Unet

161

RCNN-Attention Unet

162

Nested Unet

163

ViT-B_16

https://github.com/jeonsworld/ViT-pytorch/tree/460a162767de1722a014ed2261463dbbc01196b6

The cifar-10-bin dataset is required, which can be obtained from https://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz.

164

ViT-B_32

165

ViT-L_16

166

ViT-L_32

167

ViT-H_14

168

R50-ViT-B_16

169

YOLOv3

https://github.com/ultralytics/yolov3/tree/ae37b2daa74c599d640a7b9698eeafd64265f999

After the migration, make the following modifications:

  • Modify ./models/yolo.py.
    class Detect(mindspore.nn.Cell):
        stride = None  # Delete
        onnx_dynamic = False
        def __init__(self, …):
            ...
            self.stride = None  # Add
  • Modify the build_targets function in ./utils/loss.py.
    Before the modification:
    gij = x2ms_adapter.tensor_api.long((...))
    gi, gj = gij.T
    ...
    tbox.append(...)
    After the modification:
    gij = x2ms_adapter.tensor_api.long((…)).T
    gi, gj = gij
    ...
    gij = gij.T
    tbox.append(...)
  • Modify the run function in ./val.py.
    • Delete path, shape = Path(paths[si]), shapes[si][0].
    • Delete the calling position of the scale_coords function.
    • Delete callbacks.run('on_val_image_end',…).
  • Change all nn.* in the model configuration file {model_name}.yaml in the ./models/ directory to x2ms_adapter.nn.*.
  • In the multi-device scenario, change the value of rect in val_loader = create_dataloader (...) in train.py to False.

170

YOLOv3-Tiny

171

YOLOv3-SSP

172

YOLOv4

https://github.com/WongKinYiu/PyTorch_YOLOv4/tree/eb5f1663ed0743660b8aa749a43f35f505baa325

After the migration, make the following modifications:

  • Modify the create_module function in ./model/models.py.
    Before the modification:
    module_list[j][0].bias = mindspore.Parameter(bias_, ...)
    After the modification:
    module_list[j][0].bias = mindspore.Parameter(bias.reshape(bias_.shape), ...)
  • Modify ./utils/datasets.py.
    Before the modification:
    if os.path.isfile(cache_path): 
    After the modification:
     if False:
  • Modify the build_targets function in ./utils/loss.py.
    Before the modification:
    gij = x2ms_adapter.tensor_api.long((...))
    gi, gj = gij.T
    ...
    tbox.append(...)
    After the modification:
    gij = x2ms_adapter.tensor_api.long((…)).T
    gi, gj = gij
    ...
    gij = gij.T
    tbox.append(...)
  • Modify ./train.py.
    • Change if '.bias' in k: to if '.bias' in k or '.beta' in k:.
    • Change character string 'Conv2d.weight' to '.weight'.
  • In the multi-device scenario, change the value of rect in testloader = create_dataloader(…) in ./train.py to False.

173

YOLOv4-tiny

174

YOLOv4-pacsp

175

YOLOv4-paspp

176

YOLOv4-csp-leaky

177

YOLOv5l

https://github.com/ultralytics/yolov5/tree/8c420c4c1fb3b83ef0e60749d46bcc2ec9967fc5

After the migration, make the following modifications:

  • Modify ./models/yolo.py.
    class Detect(mindspore.nn.Cell):
        stride = None #Delete
    ...
        def __init__(self, …):
    ...
    	self.stride = None #Add
  • Modify the build_targets function in ./utils/loss.py.
    Before the modification:
    gij = x2ms_adapter.tensor_api.long((...))
    gi, gj = gij.T
    ...
    tbox.append(...)
    After the modification:
    gij = x2ms_adapter.tensor_api.long((…)).T
    gi, gj = gij
    ...
    gij = gij.T
    tbox.append(...)
  • Modify the run function in ./val.py.
    • Delete path, shape = Path(paths[si]), shapes[si][0].
    • Delete the calling position of the scale_coords function.
    • Delete callbacks.run('on_val_image_end',…).
  • Change all nn.* in the model configuration file {model_name}.yaml in the ./models/ directory to x2ms_adapter.nn.*.
  • In the multi-device scenario, change the value of rect in val_loader = create_dataloader (...) in ./train.py to False.

178

YOLOv5m

179

YOLOv5n

180

YOLOv5s

181

YOLOv5x

182

YOLOX

https://github.com/bubbliiiing/yolox-pytorch/tree/1448e849ac6cdd7d1cec395e30410f49a83d44ec

After the migration, make the following modifications:
  • Comment out the code in line 341 of ./train.py.
    #'adam'  : optim_register.adam(pg0, Init_lr_fit, betas = (momentum, 0.999))
  • Before training, run the following command to prevent HCCL timeout:
    export HCCL_CONNECT_TIMEOUT=3000

183

AAGCN-ABSA

https://git.openi.org.cn/PCLNLP/SentimentAnalysisNLP/src/commit/7cf38449dad742363053c4cc380ebfe33292184d

-

184

CAER-ABSA

  • This model depends on the third-party library pytorch-pretrained-bert. Download it and copy its subdirectory pytorch_pretrained_bert to the SentimentAnalysisNLP/ directory.
  • Move the definition of the BertLayerNorm class in line 158 of the ./SentimentAnalysisNLP/pytorch_pretrained_bert/modeling.py file out of the try-except block.

185

GIN-ABSA

  • This model depends on the third-party library pytorch-pretrained-bert. Download it and copy its subdirectory pytorch_pretrained_bert to the SentimentAnalysisNLP/ directory.
  • MindSpore does not support tensor creation during data processing. Therefore, you need to remove tensor creation during dataset initialization from ./GIN-ABSA/data_utils.py, including removing the torch.tensor() operation from line 147 and the torch.tensor() operation from line 234.

186

Scon-ABSA

  • Because this model depends on the pre-training weight of bert-base-uncased in huggingface, you need to download pytorch_model.bin, convert it to pytorch_model.ckpt in MindSpore format, and load the generated model weight in the script.
  • Because this model depends on the third-party library pytorch-pretrained-bert, you need to download it and copy its subdirectory pytorch_pretrained_bert to the SentimentAnalysisNLP directory.
  • Move the definition of the BertLayerNorm class in line 158 of the ./pytorch_pretrained_bert/modeling.py file out of the try-except block.

187

Trans-ECE

  • Because this model depends on the pre-training weight of bert-base-chinese in huggingface, you need to download bert-base-chinese to the current directory and convert the model weight file pytorch_model.bin to pytorch_model.bin.ckpt in MindSpore format.
  • Due to defects in the original code, you need to use a list to wrap the filter in lines 48 and 49 and delete the unnecessary trans_optimizer parameter from line 54 in ./Trans-ECE/Run.py.
  • Because custom optimizers are not supported, you need to change BertAdam in line 55 of ./Trans-ECE/Run.py to optim.Adam.

188

PyramidNet 101

https://github.com/dyhan0920/PyramidNet-PyTorch/tree/5a0b32f43d79024a0d9cd2d1851f07e6355daea2

Before the migration, make the following modifications:

  • Because the original code in the repository has restrictions on the Python and PyTorch versions, you need to make code adaptation according to https://github.com/dyhan0920/PyramidNet-PyTorch/issues/5.
  • Because the migrated code does not need the torchvision module, you need to comment out lines 23 to 25 in train.py.
    #model_names = sorted(name for name in models.__dict__
    #    if name.islower() and not name.startswith("__")
    #    and callable(models.__dict__[name]))

189

PyramidNet 164 bottleneck

190

PyramidNet 200 bottleneck

Table 2 TensorFlow 2 model list

No.

Model

Reference Link to Original Training Project Code

Remarks

1

ALBERT_base_v2

https://github.com/huggingface/transformers/tree/49cd736a28

Before the migration, remove the template files of the original repository. These files are not Python files but are suffixed with .py.

mv templates ../  

After the migration, make the following modifications:

  • In ./examples/tensorflow/language-modeling/run_mlm.py:
    • Add the following package import statement:
      from x2ms_adapter.keras.losses import SparseCategoricalCrossentropy
    • Change the value of return_tensors in DataCollatorForLanguageModeling from tf to np.

      Before the modification:

      data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=data_args.mlm_probability, return_tensors="tf")

      After the modification:

      data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=data_args.mlm_probability, return_tensors="np")
    • Modify the parameter called by model.compile.

      Before the modification:

      model.compile(optimizer=optimizer)

      After the modification:

      model.compile(optimizer=optimizer, loss=SparseCategoricalCrossentropy(True))
  • Modify the dtype_byte_size method of ./src/transformers/modeling_tf_utils.py.

    Before the modification:

    bit_search = re.search("[^\d](\d+)$", dtype.name)

    After the modification:

    bit_search = re.search("[^\d](\d+)$", str(dtype))

2

ALBERT_large_v2

3

ALBERT_xlarge_v2

4

ALBERT_xxlarge_v2

5

ALBERT_base_v1

6

ALBERT_large_v1

7

ALBERT_xlarge_v1

8

ALBERT_xxlarge_v1

9

roberta-base

10

roberta-large

11

RBT6

12

RBT4

13

RBTL3

14

RBT3

15

DenseNet_121

https://github.com/calmisential/Basic_CNNs_TensorFlow2/tree/f063c84451f12e904f9c91c51278be52afccb0c2

  • Configure epoch, batch_size, and the dataset path in ./configuration.py as required.
  • Before the migration, comment out the regnet.RegNet code line in the ./models/__init__.py file.
    #regnet.RegNet()

16

DenseNet_169

17

EfficientNet_B0

18

EfficientNet_B1

19

Inception_V4

20

MobileNet_V1

21

MobileNet_V2

22

MobileNet_V3_Large

23

MobileNet_V3_Small

24

ResNet_101

25

ResNet_152

26

ResNet_18

27

ResNet_34

28

ResNet_50

29

ResNext_101

30

ResNext_50

31

Shufflenet_V2_x0_5

32

Shufflenet_V2_x1_0

33

AFM

https://github.com/ZiyaoGeng/Recommender-System-with-TF2.0/tree/1d2aa5bf551873d5626539c196705db46d55c7b6

Because every network folder depends on the ./data_process/ directory, you need to directly migrate the Recommender-System-with-TF2.0/ directory or copy the ./data_process/ directory to the network folder before the migration.

34

Caser

35

DCN

36

Deep_Crossing

37

DeepFM

38

DNN

39

FFM

40

FM

41

MF

42

NFM

43

PNN

44

WDL

45

BiLSTM-CRF

https://github.com/kangyishuai/BiLSTM-CRF-NER/tree/84bde29105b13cd8128bb0ae5d043c4712a756cb

  • Training must be performed in MindSpore 1.7.
  • Download the complete dataset according to README.md of the original training project, decompress the dataset, and copy files in the dataset to ./data.
  • After the migration, decrease the values of batch_size, hidden_num, and embedding_size in ./main.py based on the training status. The following is an example:
    params = {
        "maxlen": 128,
        "batch_size": 140,
        "hidden_num": 64,
        "embedding_size": 64,
        "lr": 1e-3,
        "epochs": 10
    }

46

FCN

https://github.com/YunYang1994/TensorFlow2.0-Examples/tree/299fd6689f242d0f647a96b8844e86325e9fcb46/5-Image_Segmentation/FCN

The scipy.misc.imread method used in ./parser_voc.py is an API of SciPy earlier than 1.2.0. MindSpore is compatible with SciPy 1.5.2 or later. Therefore, use the imageio.imread recommended in the official deprecation warning of SciPy.

47

GoogleNet

https://github.com/marload/ConvNets-TensorFlow2/tree/29411e941c4aa72309bdb53c67a6a2fb8db57589

After the load_data() API is migrated, use the data_dir parameter to specify the dataset path or place the dataset in the default path ~/x2ms_datasets/cifar10/cifar-10-batches-py.

48

SqueezeNet

49

Unet

https://github.com/YunYang1994/TensorFlow2.0-Examples/tree/299fd6689f242d0f647a96b8844e86325e9fcb46/5-Image_Segmentation/Unet

Use Membrane as the dataset, which can be obtained from README.md of the training project.

50

Vit

https://github.com/tuvovan/Vision_Transformer_Keras/tree/6a1b0959a2f5923b1741335aca5bc2f8dcc7c1f9

  • After the load() API is ported, use the data_dir parameter to specify the dataset path or place the dataset in the default path ~/x2ms_datasets/cifar10/cifar-10-batches-bin.
  • Delete the comma (,) from early_stop = tf.keras.callbacks.EarlyStopping(patience=10), in train.py to ensure that the callback object is a single instance instead of a tuple.
Table 3 TensorFlow 1 model list

No.

Model

Reference Link to Original Training Project Code

Remarks

1

ALBERT-base-v2

https://github.com/google-research/ALBERT/tree/a36e095d3066934a30c7e2a816b2eeb3480e9b87

Before the migration, make the following modifications:

  • In ./classifier_utils.py, change the following statement:
    if t.dtype == tf.int64:

    To:

    if t.dtype == "int64":
  • Modify the ./optimization.py file as follows:
    • optimizer = AdamWeightDecayOptimizer(

      Changes to:

      optimizer = tf.keras.optimizers.Adam(
    • train_op = tf.group(train_op, [global_step.assign(new_global_step)])

      Changes to:

      train_op = tf.group(train_op, global_step)
  • If the Glue-MNLI dataset is used, the Record dataset needs to be generated based on the README file.

2

ALBERT-large-v2

3

ALBERT-xlarge-v2

4

ALBERT-xxlarge-v2

5

Attention-Based Bidirectional RNN

https://github.com/dongjun-Lee/text-classification-models-tf/tree/768ea13547104f56786c52f0c6eb99912c816a09

Because the training parameters have been processed by the dropout operator in MindSpore, you need to change the value of the self.keep_prob attribute in the model definition file to 0.5 without using the where statement.

6

Character-level CNN

7

RCNN

8

Very Deep CNN

9

Word-level Bidirectional RNN

10

Word-level CNN

11

BERT-Tiny

https://github.com/google-research/bert/tree/eedf5716ce1268e56f0a50264a88cafad334ac61

Before the migration, make the following modifications:

  • In ./run_classifier.py:
    • Delete .value from hidden_size = output_layer.shape[-1].value in line 592 of the source code.
      hidden_size = output_layer.shape[-1]
    • Comment out the following code in lines 869 and 870 of the source code:
      #file_based_convert_examples_to_features(
      #     train_examples, label_list, FLAGS.max_seq_length, tokenizer, train_file)
    • In the _decode_record function, change line 529 of the source code:
      if t.dtype == tf.int64:

      To:

      if t.dtype == 'int64':
  • In ./optimization.py:
    • Replace the instantiation code of AdamWeightDecayOptimizer with optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate).
      #optimizer = AdamWeightDecayOptimizer(
      #...
      #)
      optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
    • Before the modification:
      train_op = tf.group(train_op, [global_step.assign(new_global_step)])

      After the modification:

      train_op = tf.group(train_op, global_step)
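The .value deletion in run_classifier.py reflects that, after migration, shape entries are plain integers rather than TF 1.x Dimension objects. A small check, with NumPy standing in for the migrated tensor type:

```python
import numpy as np

output_layer = np.zeros((8, 128))
# TF 1.x returned a Dimension object here, requiring `.value`; the
# migrated code exposes an int directly, so `.value` must be removed.
hidden_size = output_layer.shape[-1]
assert hidden_size == 128
```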

12

BERT-Mini

13

BERT-Small

14

BERT-Medium

15

BERT-Base

16

RBT6

https://github.com/bojone/bert4keras/tree/9c1c916def4d515a046c414

Before the migration, make the following modifications:

  • In ./examples/task_language_model.py:
    • Change the values of checkpoint_path, config_path, dict_path, the input training data path, and batch_size.
    • txt = open(txt, encoding='gbk').read()

      Change it to:

      txt = open(txt, encoding='utf8').read()
  • In ./bert4keras/layers.py, add the following import statement after the from keras.layers import * statement:
    from keras.layers import Input, Dropout, Lambda, Add, Dense, Activation
  • In ./bert4keras/models.py, add the following import statement after the from bert4keras.layers import * statement:
    from bert4keras.layers import Input, Dropout, Lambda, Add, K, Dense, Activation
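The encoding change matters because the training corpus is UTF-8 text; opening it with encoding='gbk' misdecodes or rejects its multi-byte characters. A self-contained check, with a temporary file standing in for the corpus:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "corpus.txt")
with open(path, "w", encoding="utf8") as f:
    f.write("深度学习")  # UTF-8 text, as in the training corpus

# Reading with the matching encoding recovers the text intact; the
# original encoding='gbk' would misdecode these UTF-8 bytes.
txt = open(path, encoding="utf8").read()
assert txt == "深度学习"
```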

17

RBT4

18

RBTL3

19

RBT3

20

RoBERTa-wwm-ext-large

21

RoBERTa-wwm-ext

22

Bi-LSTM-CRF

https://github.com/fzschornack/bi-lstm-crf-tensorflow/tree/5181106

  • Before the migration, create the ./bi-lstm-crf-tensorflow.py file and copy the code in the bi-lstm-crf-tensorflow.ipynb file to the newly created Python file.
  • After the migration, adjust the value assigned to the num_units variable in ./bi-lstm-crf-tensorflow.py based on how training behaves. The following example changes the value to 64:
    #num_units = 128
    num_units = 64

23

CNN-LSTM-CTC

https://github.com/watsonyanghx/CNN_LSTM_CTC_Tensorflow/tree/6999cd19285e7896cfe77d50097b0d96fb4e53e8

  • Before the migration, comment out line 43 in utils.py.
    #tf.app.flags.DEFINE_string('log_dir', './log', 'the logging dir')
  • After the migration, decrease the value of validation_steps in ./utils.py to facilitate quick observation of the model convergence effect during training. The following example changes the value to 50:
    x2ms_FLAGS.define_integer('validation_steps', 50, 'the step to validation')
  • In the root directory of the project, create the ./imgs/train and ./imgs/val folders, and save the specified training and test data to the folders.
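The folder creation can be scripted with the standard library; this helper is illustrative (paths as given in the remark, relative to the project root), and exist_ok keeps re-runs from failing:

```python
import os

def make_image_dirs(root="."):
    """Create the imgs/train and imgs/val folders under `root`."""
    for sub in ("train", "val"):
        os.makedirs(os.path.join(root, "imgs", sub), exist_ok=True)
```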

24

LeNet

https://github.com/Jackpopc/aiLearnNotes/tree/7069a705bbcbea1ac24

  • After the migration, download the MNIST dataset and save it to the directory where commands are executed. Decompress all .gz files in MNIST and delete the original .gz files.
  • If the network convergence is poor, decrease the learning rate (LR) and increase the training epochs (EPOCHS).
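Decompressing the MNIST archives and removing the originals can be done with the standard library; this sketch assumes the .gz files sit in the given directory:

```python
import glob
import gzip
import os
import shutil

def decompress_all(directory="."):
    """Decompress every .gz file in `directory`, then delete the archives."""
    for gz_path in glob.glob(os.path.join(directory, "*.gz")):
        out_path = gz_path[:-3]  # strip the .gz suffix
        with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        os.remove(gz_path)  # delete the original archive, per the remark
```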

25

AlexNet

26

ResNet-18

https://github.com/taki0112/ResNet-Tensorflow/tree/f395de3a53d

Before the migration, install the jedi dependency. After the migration, perform the following adaptations:

  • After the load() API is migrated, use the data_dir parameter to specify the dataset path or place the required dataset in the default path.
    • ~/x2ms_datasets/cifar100/cifar-100-python
    • ~/x2ms_datasets/cifar10/cifar-10-batches-py
    • ~/x2ms_datasets/mnist.npz
    • ~/x2ms_datasets/fashion-mnist
  • MindSpore cannot evaluate a large amount of test data at a time.
    • Modify the code to evaluate the test dataset in batches.
    • Change the first dimension of the shape of the test-dataset placeholders to 1.
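One way to satisfy both points is a small evaluation loop that feeds the graph one sample at a time. The predict_fn name is illustrative; it stands for whatever call runs the batch-1 placeholder through the network:

```python
import numpy as np

def evaluate_in_batches(samples, predict_fn, batch_size=1):
    """Run predict_fn over `samples` in fixed-size chunks and
    concatenate the per-batch outputs."""
    outputs = []
    for start in range(0, len(samples), batch_size):
        outputs.append(predict_fn(samples[start:start + batch_size]))
    return np.concatenate(outputs)

# Toy check with an identity "model".
preds = evaluate_in_batches(np.arange(6), lambda x: x, batch_size=1)
assert preds.tolist() == [0, 1, 2, 3, 4, 5]
```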

27

ResNet-34

28

ResNet-50

29

ResNet-101

30

ResNet-152