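Once a model has been migrated with the ATB high-performance acceleration library, MindIE provides the MindIE-Service component to help users quickly set up a served inference pipeline. MindIE-Service also provides interfaces compatible with third-party inference serving APIs, so that users can quickly integrate third-party frameworks.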
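This section walks through the steps using service-based scheduled inference of Llama2-7B as the example. For details on MindIE-Service service-based scheduling and inference, see the corresponding chapter of the "MindIE-Service Developer Guide".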
    # Convert the Chinese prompts of the oasst1 dataset into rows of token IDs
    # and save them to a CSV file (one prompt per row).
    import csv
    import glob
    from pathlib import Path

    import pyarrow.parquet as pq
    from transformers import AutoTokenizer


    def read_oa(dataset_path, tokenizer_model):
        out_list = []
        for file_path in glob.glob((Path(dataset_path) / "*.parquet").as_posix()):
            data_dict = pq.read_table(file_path).to_pandas()
            # Keep only the Chinese samples.
            data_dict = data_dict[data_dict['lang'] == 'zh']
            ques_list = data_dict['text'].to_list()
            for ques in ques_list:
                tokens = tokenizer_model.encode(ques)
                # Truncate prompts longer than 2048 tokens.
                if len(tokens) <= 2048:
                    out_list.append(tokens)
                else:
                    out_list.append(tokens[0:2048])
        return out_list


    def save_csv(file_path, out_tokens_list):
        with open(file_path, 'w', newline='') as csvfile:
            csv_writer = csv.writer(csvfile)
            for row in out_tokens_list:
                csv_writer.writerow(row)


    if __name__ == '__main__':
        model_path = "/data/models/baichuan2-7b"  # directory that provides the tokenizer
        oa_dir = "/home/xxx/oasst1"               # directory containing the *.parquet files
        save_path = "oa_tokens.csv"
        tokenizer_model = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True, use_fast=True)
        tokens_lists = read_oa(oa_dir, tokenizer_model)
        save_csv(save_path, tokens_lists)
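Before handing the file to the engine, it can be useful to confirm that the CSV contains what was intended. The following is a minimal sketch and not part of MindIE: `load_token_rows` and `check_token_rows` are hypothetical helper names, and the 2048 limit is assumed to match the truncation length used in the conversion script.

```python
import csv

MAX_TOKENS = 2048  # assumption: same limit as the truncation in the conversion script


def load_token_rows(csv_path):
    """Read a token CSV back into a list of integer lists, one list per prompt."""
    rows = []
    with open(csv_path, newline="") as csvfile:
        for row in csv.reader(csvfile):
            if row:  # skip blank lines
                rows.append([int(tok) for tok in row])
    return rows


def check_token_rows(rows, max_tokens=MAX_TOKENS):
    """Return the indices of rows that are longer than max_tokens."""
    return [i for i, row in enumerate(rows) if len(row) > max_tokens]
```

For example, `check_token_rows(load_token_rows("oa_tokens.csv"))` returns an empty list when every prompt fits within the limit.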
cd /usr/local/Ascend/mindie/latest/mindie-service
Here ${HOME} is the current user's home directory.
Write permission must be added to the "conf/config.json" file before it can be modified.
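One way to add the write permission is sketched below, assuming the current user owns the file and the script is run from the mindie-service directory. `add_owner_write` is a hypothetical helper name; it only adds the owner's write bit and leaves all other mode bits unchanged.

```python
import os
import stat


def add_owner_write(path):
    """Add write permission for the file owner and return the new mode bits."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_IWUSR)
    return stat.S_IMODE(os.stat(path).st_mode)
```

Usage would be `add_owner_write("conf/config.json")`.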
Parameter | Value |
---|---|
httpsEnabled | false |
npuDeviceIds | [[0]] |
modelName | llama2_7b |
modelWeightPath | Actual path of the llama2-7B weights |
worldSize | 1 |
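The parameters in the table can also be applied with a script instead of by hand. The sketch below is an illustration under stated assumptions, not a MindIE tool: it assumes each parameter name appears as a key somewhere in the JSON (the exact nesting of "conf/config.json" is not shown here, so the keys are searched recursively), `set_keys` and `update_config` are hypothetical helper names, and the weight path is a placeholder.

```python
import json


def set_keys(node, updates):
    """Recursively set every key listed in `updates`, wherever it occurs.

    Returns the set of keys that were found and updated.
    """
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key in updates:
                node[key] = updates[key]
                found.add(key)
            else:
                found |= set_keys(value, updates)
    elif isinstance(node, list):
        for item in node:
            found |= set_keys(item, updates)
    return found


def update_config(config_path, updates):
    """Load the JSON config, apply `updates`, and write the file back."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    missing = set(updates) - set_keys(config, updates)
    if missing:
        # Fail loudly rather than silently skipping a parameter.
        raise KeyError(f"keys not found in {config_path}: {sorted(missing)}")
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=4)
        f.write("\n")


# Values from the table above.
UPDATES = {
    "httpsEnabled": False,
    "npuDeviceIds": [[0]],
    "modelName": "llama2_7b",
    "modelWeightPath": "/path/to/llama2-7b",  # placeholder: use the actual weight path
    "worldSize": 1,
}
```

Calling `update_config("conf/config.json", UPDATES)` from the mindie-service directory would then rewrite the file. Note that the file is re-serialized, so its original indentation may change.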
./bin/llm_engine_test /path/of/token_gsm8k.csv