Multi LoRA Inference Examples

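Add the LoRA model weight configuration as described in the Multi LoRA usage example.
Configure the Multi LoRA serving parameters as described in the LoraModules parameter description.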

Test example with a single specified LoRA model

benchmark \
--DatasetPath "/{dataset path}/GSM8K/" \
--DatasetType gsm8k \
--ModelName {LoRA model ID} \
--ModelPath "/{model weight path}/llama_7b" \
--TestType vllm_client \
--Http https://{ipAddress}:{port} \
--ManagementHttp https://{managementIpAddress}:{managementPort} \
--Tokenizer True \
--TestAccuracy True \
--DoSampling False

Test example with different LoRA models applied to different data

benchmark \
--DatasetPath "/{dataset path}/GSM8K/" \
--DatasetType gsm8k \
--ModelName llama_7b \
--ModelPath "/{model weight path}/llama_7b" \
--TestType vllm_client \
--Http https://{ipAddress}:{port} \
--ManagementHttp https://{managementIpAddress}:{managementPort} \
--Tokenizer True \
--TestAccuracy True \
--DoSampling False \
--LoraDataMappingFile {path to the LoRA model to inference data mapping file}

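LoRA model to inference data mapping file

The user must create the mapping file (JSON) between LoRA models and inference data. A reference script follows (named generate_lora_mapping_file.py here as an example):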

import json
import random


def create_json(file_path):
    # Candidate model names
    choices = ["BaseModel", "LoraAdapter1", "LoraAdapter2"]
    with open(file_path, 'w', encoding='utf-8') as file:
        # Randomly pick BaseModel or a LoRA model for each of 1000 data items
        # (adapt the mapping logic to your own needs)
        for i in range(1000):
            choice = random.choice(choices)
            # BaseModel needs no mapping; unmapped data uses BaseModel by default
            if choice == "BaseModel":
                continue
            # Each line stores one mapping from a data ID to a LoRA model ID
            data = json.dumps({str(i): choice})
            file.write(data + "\n")


if __name__ == "__main__":
    create_json("lora_mapping.json")

Run the script above to generate the configuration file (named lora_mapping.json here as an example):

python generate_lora_mapping_file.py

Example contents of the lora_mapping.json configuration file:

{"0": "LoraAdapter1"}
{"1": "LoraAdapter2"}
{"6": "LoraAdapter1"}
{"7": "LoraAdapter2"}
{"8": "LoraAdapter1"}
{"9": "LoraAdapter2"}
{"10": "LoraAdapter1"}
{"11": "LoraAdapter1"}
{"13": "LoraAdapter2"}

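Each line of the JSON configuration file has the format {"data ID": "LoRA model ID"}.
BaseModel needs no mapping; data without a mapping uses BaseModel by default.
If a specified data ID does not exist (for example, it exceeds the actual number of data items), that mapping has no effect.
If a LoRA model ID is not present in the configuration described in the LoraModules parameter description, an error is reported when --TestType is set to "openai"; when it is set to "vllm_client" or "tgi_client", BaseModel is used for inference by default.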
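
The rules above can be checked before a benchmark run. The following is a minimal sketch, not part of the benchmark tool: the function name check_lora_mapping, its parameters, and the set of configured LoRA model IDs are hypothetical, and it assumes data IDs are 0-based indices smaller than the dataset size.

```python
import json


def check_lora_mapping(lines, known_lora_ids, dataset_size):
    """Split mapping lines into valid mappings and reported problems.

    Assumption: data IDs are 0-based indices below dataset_size.
    """
    valid = {}
    problems = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)  # one JSON object per line
        for data_id, lora_id in entry.items():
            if not data_id.isdecimal() or int(data_id) >= dataset_size:
                # Such a mapping would have no effect
                problems.append((line_no, f"data ID {data_id} is outside the dataset"))
            elif lora_id not in known_lora_ids:
                # Would fail with "openai" or fall back to BaseModel otherwise
                problems.append((line_no, f"LoRA model ID {lora_id} is not configured"))
            else:
                valid[int(data_id)] = lora_id
    return valid, problems


if __name__ == "__main__":
    # Hypothetical adapter names; use the IDs from your own LoraModules configuration
    configured = {"LoraAdapter1", "LoraAdapter2"}
    with open("lora_mapping.json", encoding="utf-8") as mapping_file:
        valid, problems = check_lora_mapping(mapping_file, configured, dataset_size=1000)
    print(f"{len(valid)} valid mappings")
    for line_no, message in problems:
        print(f"line {line_no}: {message}")
```

Running such a check first surfaces unknown LoRA model IDs up front, instead of through an error in "openai" mode or a silent fallback to BaseModel in the other modes.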
