Class Introduction

Function

Uses Transformers to load a model locally and calculate relevance scores for documents. It inherits from the abstract class Reranker. Currently, the bge-reranker-large and bge-reranker-base models are supported.

If the configured model weights are not in the safetensors format, convert them to safetensors before use. This avoids the security risks of insecure weight formats such as CKPT and BIN.
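The conversion mentioned above could look roughly like the following. This is a minimal sketch, assuming the checkpoint loads with Hugging Face Transformers as a sequence-classification model; the helper names find_unsafe_weight_files and convert_to_safetensors and the suffix list are hypothetical and are not part of mx_rag.

```python
from pathlib import Path

# File suffixes treated as pickle-based (insecure) weight formats in this sketch.
UNSAFE_SUFFIXES = {".bin", ".ckpt", ".pt", ".pth"}


def find_unsafe_weight_files(model_dir):
    """Return the sorted names of files in model_dir that use an insecure weight format."""
    return sorted(
        p.name for p in Path(model_dir).iterdir()
        if p.is_file() and p.suffix.lower() in UNSAFE_SUFFIXES
    )


def convert_to_safetensors(src_dir, dst_dir):
    """Load a checkpoint with Transformers and re-save it in the safetensors format."""
    # Imported lazily so the directory check above works without these packages.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model = AutoModelForSequenceClassification.from_pretrained(src_dir)
    tokenizer = AutoTokenizer.from_pretrained(src_dir)
    # safe_serialization=True writes model.safetensors instead of pytorch_model.bin.
    model.save_pretrained(dst_dir, safe_serialization=True)
    tokenizer.save_pretrained(dst_dir)
```

Loading the original checkpoint still deserializes the insecure format once, so run the conversion only on weights obtained from a trusted source.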

Prototype

from mx_rag.reranker.local import LocalReranker
LocalReranker(model_path, dev_id, k, use_fp16)

Parameters

Parameter: model_path
Data Type: String
Required/Optional: Required
Description: Model weight file directory. The path length cannot exceed 1024 characters. The path cannot be a soft link or a relative path.

  • Each file in the directory cannot exceed 10 GB, the directory depth cannot exceed 64 levels, and the total number of files cannot exceed 512.
  • Neither the running user's group nor other users can have write permission on the files in the directory.
  • All files within the directory, as well as the parent directory itself, must have their group ownership set to the running user.

The storage path cannot be in the following path list: ["/etc", "/usr/bin", "/usr/lib", "/usr/lib64", "/sys/", "/dev/", "/sbin", "/tmp"].

Parameter: dev_id
Data Type: Integer
Required/Optional: Optional
Description: ID of the device on which the model runs. The value range is [0, 63]. The default value is 0.

Parameter: k
Data Type: Integer
Required/Optional: Optional
Description: Number of most relevant results to keep after reranking. The value range is [1, 10000]. The default value is 1.

Parameter: use_fp16
Data Type: Bool
Required/Optional: Optional
Description: Whether to use FP16. The default value is True.
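The model_path constraints can be checked before constructing the reranker. The following is a minimal sketch of such a pre-check, not the library's own validation: the function name check_model_path is hypothetical, the forbidden list is assumed to match the listed directories and anything beneath them, and the ownership rule is not covered.

```python
import os
import stat

FORBIDDEN_PREFIXES = ("/etc", "/usr/bin", "/usr/lib", "/usr/lib64",
                      "/sys/", "/dev/", "/sbin", "/tmp")
MAX_PATH_LEN = 1024
MAX_FILE_SIZE = 10 * 1024 ** 3  # 10 GB
MAX_DEPTH = 64
MAX_FILES = 512


def check_model_path(model_path, forbidden=FORBIDDEN_PREFIXES):
    """Return a list of violated constraints (empty if none were found)."""
    problems = []
    if len(model_path) > MAX_PATH_LEN:
        problems.append("path longer than 1024 characters")
    if not os.path.isabs(model_path):
        problems.append("path is relative")
    if os.path.islink(model_path):
        problems.append("path is a soft link")
    norm = os.path.normpath(model_path)
    for prefix in forbidden:
        base = os.path.normpath(prefix)
        # Assumption: a path equal to or beneath a listed directory is rejected.
        if norm == base or norm.startswith(base + os.sep):
            problems.append(f"path is under {prefix}")
    if problems:
        return problems
    if not os.path.isdir(norm):
        return ["path is not a directory"]

    file_count = 0
    root_depth = norm.count(os.sep)
    for dirpath, _dirnames, filenames in os.walk(norm):
        if dirpath.count(os.sep) - root_depth > MAX_DEPTH:
            problems.append("directory nesting deeper than 64 levels")
            break
        for name in filenames:
            file_count += 1
            info = os.lstat(os.path.join(dirpath, name))
            if info.st_size > MAX_FILE_SIZE:
                problems.append(f"{name} is larger than 10 GB")
            # Group and other users must not have write permission.
            if info.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
                problems.append(f"{name} is writable by group or others")
    if file_count > MAX_FILES:
        problems.append("more than 512 files")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to fix every issue with a model directory in a single pass.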

Return Value

LocalReranker object.

Example

from paddle.base import libpaddle
from langchain_core.documents import Document
from mx_rag.reranker.local import LocalReranker

# Build the candidate documents to be reranked.
doc_1 = Document(
    page_content="I am Xiaohong,",
    metadata={"source": ""}
)
doc_2 = Document(
    page_content="I am Xiao Ming,",
    metadata={"source": ""}
)
docs = [doc_1, doc_2]

# Same as LocalReranker(model_path="path to model", dev_id=0).
rerank = LocalReranker.create(model_path="path to model", dev_id=0)

# Score each document against the query, then keep the top k documents.
scores = rerank.rerank('Hello', [doc.page_content for doc in docs])
res = rerank.rerank_top_k(docs, scores)
print(res)
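To illustrate what selecting the top k results by score involves, here is a standalone sketch in plain Python. It assumes that the k highest-scoring items are returned, best first; the function top_k_by_score and the scores are made up for illustration and do not reproduce the mx_rag implementation.

```python
def top_k_by_score(items, scores, k=1):
    """Return the k items with the highest scores, best first."""
    if len(items) != len(scores):
        raise ValueError("items and scores must have the same length")
    if not 1 <= k <= 10000:
        raise ValueError("k must be in [1, 10000]")
    # Sort indices by descending score; ties keep their original order.
    order = sorted(range(len(items)), key=lambda i: scores[i], reverse=True)
    return [items[i] for i in order[:k]]


docs = ["I am Xiaohong,", "I am Xiao Ming,"]
scores = [0.12, 0.87]  # made-up relevance scores
print(top_k_by_score(docs, scores, k=1))
```

With the default k of 1, only the single best-scoring item is returned, so pass a larger k when more candidates are needed.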