Class Introduction
Function
Configures the data structure of the similarity cache.
Prototype
from mx_rag.cache import SimilarityCacheConfig
SimilarityCacheConfig(vector_config, cache_config, emb_config, similarity_config, retrieval_top_k, clean_size, **kwargs)
Parameters
| Parameter | Data Type | Required/Optional | Description |
|---|---|---|---|
| vector_config | Dict[str, Any] | Required | Configures the vector database. For details, see Table 1. If a Faiss database is configured, load_local_index in MindFAISS is overwritten by data_save_folder and auto_save is set to False. Each string in the configuration cannot exceed 1024 characters, the length of each iterable sequence in the dictionary cannot exceed 1024, the dictionary length cannot exceed 1024, and the number of nested dictionary layers cannot exceed 2. |
| cache_config | String | Required | Configures the scalar database. The value can only be sqlite. |
| emb_config | Dict[str, Any] | Required | Configures the embedding model. For details, see Table 2. The dictionary length cannot exceed 1024, each string in the dictionary cannot exceed 1024 characters, and the number of nested dictionary layers cannot exceed 1. |
| similarity_config | Dict[str, Any] | Required | Configures the similarity calculation model. For details, see Table 3. The dictionary length cannot exceed 1024, each string in the dictionary cannot exceed 1024 characters, and the number of nested dictionary layers cannot exceed 1. |
| retrieval_top_k | Integer | Optional | Top k value used during similarity retrieval. The default value is 1. The value range is (0, 1000]. |
| clean_size | Integer | Optional | Number of cache records aged out when the number of cached records exceeds cache_size. The default value is 1. The value range is (0, cache_size]. |
| **kwargs | Any | Required | For details, see CacheConfig. |
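The size limits on the configuration dictionaries can be checked before the dictionaries are passed to SimilarityCacheConfig. The following is a minimal sketch, not the library's own validation. The helper name check_config_limits is hypothetical, and the sketch assumes that a flat dictionary counts as zero nested layers and that the dictionary length is the number of keys.

```python
from typing import Any, Dict

MAX_LEN = 1024  # limit quoted in the parameter table


def check_config_limits(config: Dict[str, Any], max_depth: int, _depth: int = 0) -> None:
    """Raise ValueError if config breaks the documented size limits.

    Hypothetical helper. Assumes a flat dictionary has zero nested layers.
    """
    if len(config) > MAX_LEN:
        raise ValueError("dictionary length exceeds 1024")
    for key, value in config.items():
        if isinstance(value, str):
            if len(value) > MAX_LEN:
                raise ValueError(f"string value of {key!r} exceeds 1024 characters")
        elif isinstance(value, dict):
            # Entering a nested dictionary adds one layer.
            if _depth + 1 > max_depth:
                raise ValueError(f"{key!r} is nested deeper than {max_depth} layers")
            check_config_limits(value, max_depth, _depth + 1)
        elif isinstance(value, (list, tuple, set)):
            if len(value) > MAX_LEN:
                raise ValueError(f"sequence {key!r} has more than 1024 elements")
```

Under these assumptions, vector_config would be checked with max_depth=2, and emb_config and similarity_config with max_depth=1.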
- This API uses the pickle module, which can be exploited by maliciously constructed data during unpickling. Ensure that data_save_folder is stored securely and that only trusted data is loaded.
- vector_config and cache_config must either both be None or both be set. If both are None, the cache behaves the same as the memory cache.
- For the SQLite database, the size of the flushed file cannot exceed 30 GB. For a vector database, the size of the flushed file cannot exceed 20 GB.
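Two of the notes above lend themselves to a pre-check before the configuration object is built. The sketch below is illustrative only and is not part of mx_rag: both function names are hypothetical, and the folder check assumes a POSIX system where "securely stored" is interpreted as "not writable by group or other users".

```python
import os
import stat
from typing import Any, Dict, Optional


def configs_are_paired(vector_config: Optional[Dict[str, Any]],
                       cache_config: Optional[str]) -> bool:
    """Return True when both values are None or both are set."""
    return (vector_config is None) == (cache_config is None)


def folder_is_private(path: str) -> bool:
    """Return True when path is a directory that group and other users cannot write to.

    Assumption: POSIX permission bits are a reasonable proxy for secure storage.
    """
    mode = os.stat(path).st_mode
    return stat.S_ISDIR(mode) and not (mode & (stat.S_IWGRP | stat.S_IWOTH))
```

A caller could refuse to load cached data when folder_is_private returns False for data_save_folder, since pickled files in a writable location could have been replaced by another user.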
Table 1 vector_config parameters
| Parameter | Data Type | Required/Optional | Description |
|---|---|---|---|
| **kwargs | Dict[str, Any] | Required | For details, see create_storage. |
| top_k | Integer | Optional | Number of top k records returned during similarity search. The default value is 5. |
| vector_save_file | String | Required | Flushing path. When vector_type is set to npu_faiss_db, this parameter overwrites the load_local_index parameter in MindFAISS. For milvus_db, this parameter does not take effect. |
Table 2 emb_config parameters
| Parameter | Data Type | Required/Optional | Description |
|---|---|---|---|
| x_dim | Integer | Optional | Number of dimensions of the embedding model. The default value is 0. |
| skip_emb | Bool | Optional | Whether to skip embedding. The default value is False. |
| **kwargs | Dict[str, Any] | Required | For details, see create_embedding. |
Table 3 similarity_config parameters
| Parameter | Data Type | Required/Optional | Description |
|---|---|---|---|
| score_min | Float | Optional | Minimum value of the similarity calculation range. The default value is 0.0. The value range is [0.0, 100.0]. |
| score_max | Float | Optional | Maximum value of the similarity calculation range. The default value is 1.0. The value range is [1.0, 100.0]. The value of score_max must be greater than or equal to that of score_min. |
| reverse | Bool | Optional | Relationship between the similarity score and similarity. The default value is False. |
| **kwargs | Dict[str, Any] | Required | For details, see create_reranker. |
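The range constraints on score_min and score_max can be expressed as a small check. This is a sketch of the documented rules only. The function name check_score_range is hypothetical and does not exist in mx_rag.

```python
def check_score_range(score_min: float = 0.0, score_max: float = 1.0) -> None:
    """Raise ValueError if the pair violates the documented ranges.

    Hypothetical helper mirroring the table: score_min in [0.0, 100.0],
    score_max in [1.0, 100.0], and score_max >= score_min.
    """
    if not 0.0 <= score_min <= 100.0:
        raise ValueError("score_min must be in [0.0, 100.0]")
    if not 1.0 <= score_max <= 100.0:
        raise ValueError("score_max must be in [1.0, 100.0]")
    if score_max < score_min:
        raise ValueError("score_max must be greater than or equal to score_min")
```

Note that the lower bound of 1.0 on score_max means a pair such as (0.0, 0.5) is rejected even though score_max exceeds score_min.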
Example
Example 1: npu_faiss_db + local_text_embedding + local_reranker
from mx_rag.cache import SimilarityCacheConfig
from mx_rag.cache import MxRAGCache

dim = 1024
dev = 1
similarity_config = SimilarityCacheConfig(
    vector_config={
        "vector_type": "npu_faiss_db",
        "x_dim": dim,
        "devs": [dev],
    },
    cache_config="sqlite",
    emb_config={
        "embedding_type": "local_text_embedding",
        "x_dim": dim,
        "model_path": "path_to_embedding_model",  # Embedding model path
        "dev_id": dev
    },
    similarity_config={
        "similarity_type": "local_reranker",
        "model_path": "path_to_reranker_model",  # Reranker path
        "dev_id": dev
    },
    retrieval_top_k=1,
    cache_size=1000,
    clean_size=20,
    similarity_threshold=0.86,
    data_save_folder="path_to_cache_save_folder",  # Flushing path
    disable_report=True
)
similarity_cache = MxRAGCache("similarity_cache", similarity_config)
Example 2: milvus_db + tei_embedding + tei_reranker
import getpass
from paddle.base import libpaddle
from mx_rag.cache import SimilarityCacheConfig
from mx_rag.cache import EvictPolicy
from mx_rag.cache import MxRAGCache
from mx_rag.utils import ClientParam
from pymilvus import MilvusClient

dim = 1024
client = MilvusClient(
    "https://x.x.x.x:port",
    user="xxx",
    password=getpass.getpass(),
    secure=True,
    client_pem_path="path_to/client.pem",
    client_key_path="path_to/client.key",
    ca_pem_path="path_to/ca.pem",
    server_name="localhost"
)
similarity_config = SimilarityCacheConfig(
    vector_config={
        "client": client,
        "vector_type": "milvus_db",
        "x_dim": dim,
        "collection_name": "mxrag_cache_123",  # Label of milvus_db
        "param": None
    },
    cache_config="sqlite",
    emb_config={
        "embedding_type": "tei_embedding",
        "url": "https://<ip>:<port>/embed",  # IP address and listening port of the tei_embedding service
        "client_param": ClientParam(ca_file="/path/to/ca.crt")
    },
    similarity_config={
        "similarity_type": "tei_reranker",
        "url": "https://<ip>:<port>/rerank",  # IP address and listening port of the tei_reranker service
        "client_param": ClientParam(ca_file="/path/to/ca.crt")
    },
    retrieval_top_k=1,
    cache_size=100,
    auto_flush=100,
    similarity_threshold=0.70,
    data_save_folder="path_to_cache_save_folder",
    disable_report=True,
    eviction_policy=EvictPolicy.FIFO
)
similarity_cache = MxRAGCache("similarity_cache", similarity_config)