评估功能
主要基于sentence-transformers框架提供的InformationRetrievalEvaluator方法,使用前述评估数据辅助生成方法生成的评估数据集对embedding模型进行评估,评估成功后将会返回以下指标:
{'cosine_accuracy@1', 'cosine_accuracy@3', 'cosine_accuracy@5', 'cosine_accuracy@10', 'cosine_precision@1', 'cosine_precision@3', 'cosine_precision@5', 'cosine_precision@10', 'cosine_recall@1', 'cosine_recall@3', 'cosine_recall@5', 'cosine_recall@10', 'cosine_ndcg@10', 'cosine_mrr@10', 'cosine_map@100', 'dot_accuracy@1', 'dot_accuracy@3', 'dot_accuracy@5', 'dot_accuracy@10', 'dot_precision@1', 'dot_precision@3', 'dot_precision@5', 'dot_precision@10', 'dot_recall@1', 'dot_recall@3', 'dot_recall@5', 'dot_recall@10', 'dot_ndcg@10', 'dot_mrr@10', 'dot_map@100'}
调用示例
import torch import torch_npu from sentence_transformers import SentenceTransformer from sentence_transformers.evaluation import InformationRetrievalEvaluator from datasets import load_dataset torch.npu.set_device(torch.device("npu:0")) model = SentenceTransformer("model_path", device="npu" if torch.npu.is_available() else "cpu") eval_data = load_dataset("json", data_files="evaluate_data.jsonl", split="train") eval_data = eval_data.add_column("id", range(len(eval_data))) corpus = dict( zip(eval_data["id"], eval_data["corpus"]) ) queries = dict( zip(eval_data["id"], eval_data["query"]) ) relevant_docs = {} for q_id in queries: relevant_docs[q_id] = [q_id] evaluator = InformationRetrievalEvaluator(queries=queries, corpus=corpus, relevant_docs=relevant_docs, name="model_name") result = evaluator(model) print(result)
父主题: 模型评估和微调方法