Class Introduction
Function
Provides an openGauss-based vector database.
Prototype
from mx_rag.storage.vectorstore import OpenGaussDB
OpenGaussDB(engine, collection_name, search_mode, index_type, metric_type)
Parameters
| Parameter | Data Type | Required/Optional | Description |
|---|---|---|---|
| engine | Engine | Required | Engine instance. For details, see Engine. Only the openGauss dialect is allowed. NOTE: The engine is created and controlled by the user. Use a secure connection mode. |
| collection_name | String | Optional | Collection name. It cannot be empty, can contain at most 1024 characters, and must be a valid Python identifier. The default value is vectorstore. |
| search_mode | SearchMode | Optional | Retrieval mode. Three modes are currently supported: DENSE for dense retrieval (default), SPARSE for sparse retrieval, and HYBRID for hybrid retrieval. For details, see SearchMode. |
| index_type | String | Optional | Vector index type. IVFFLAT and HNSW (default) are currently supported. This parameter applies only to dense vectors in the dense and hybrid retrieval modes. Sparse vector retrieval always uses HNSW, which cannot be changed. |
| metric_type | String | Optional | Vector distance calculation mode. The value can be IP (default), L2, or COSINE. |
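The constraints on collection_name can be checked before a store is created. The following is a minimal sketch, assuming the three rules documented above (non-empty, at most 1024 characters, valid Python identifier) are the only ones that apply. The helper name is_valid_collection_name is hypothetical and is not part of mx_rag, which performs its own validation.

```python
MAX_COLLECTION_NAME_LENGTH = 1024  # limit stated in the parameter table


def is_valid_collection_name(name: str) -> bool:
    """Return True if name satisfies the documented collection_name rules.

    Hypothetical helper for illustration only; it is not an mx_rag API.
    """
    if not isinstance(name, str) or not name:
        return False  # the name must be a non-empty string
    if len(name) > MAX_COLLECTION_NAME_LENGTH:
        return False  # longer than the documented maximum
    # str.isidentifier() accepts letters, digits, and underscores,
    # and rejects names that start with a digit.
    return name.isidentifier()


print(is_valid_collection_name("vectorstore"))  # the documented default
print(is_valid_collection_name("my-docs"))      # contains a hyphen
```

Note that str.isidentifier() also accepts Python keywords such as class. Whether mx_rag rejects those is not stated here.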
Return Value
| Data Type | Description |
|---|---|
| OpenGaussDB | OpenGaussDB object. |
Example
import getpass
import numpy as np
from mx_rag.storage.vectorstore import OpenGaussDB, SearchMode
from sqlalchemy import URL, create_engine
# openGauss connection settings
username = "demo"
password = getpass.getpass()
host = "<host here>"
port = "<port here>"
database = "testdb"
# vector config
dim = 128
n_emb = 1000
url = URL.create(
    "opengauss+psycopg2",
    username=username,
    password=password,
    host=host,
    port=port,
    database=database
)
connect_args = {
    'sslmode': 'verify-full',
    'sslrootcert': "path_to root cert",
    'sslkey': "path_to key",
    'sslcert': "path_to cert",
    'sslpassword': getpass.getpass(prompt="cert key password:")
}
# create an engine
engine = create_engine(url, pool_size=20, max_overflow=10, pool_pre_ping=True, connect_args=connect_args)
# search_mode defaults to DENSE
# index_type defaults to HNSW and metric_type defaults to IP
dense_store = OpenGaussDB.create(
    engine=engine,
    dense_dim=dim
)
# add vectors
dense_embeddings = np.random.randn(n_emb, dim)
ids = list(range(n_emb))
dense_store.add(ids, dense_embeddings)
# search vectors
res = dense_store.search(dense_embeddings[:3].tolist(), k=3)
print(res)
# update the vector with ID 1 while it still exists
dense_store.update([1], dense_embeddings[:1])
# delete vectors
count = dense_store.delete(ids)
print(count)
# drop table
dense_store.drop_collection()
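
The search call in the example ranks results according to metric_type. For reference, the sketch below computes the three standard measures that the names IP, L2, and COSINE conventionally denote, in plain Python for two small vectors. It illustrates the mathematics only and makes no claim about how openGauss or mx_rag compute, scale, or order the scores they return. The function names are hypothetical.

```python
import math
from typing import Sequence


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """IP: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L2: Euclidean distance between the two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """COSINE: inner product divided by the product of the vector norms."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return inner_product(a, b) / (norm_a * norm_b)


query = [1.0, 0.0]
doc = [3.0, 4.0]
print(inner_product(query, doc))      # 1*3 + 0*4
print(l2_distance(query, doc))        # sqrt((1-3)^2 + (0-4)^2)
print(cosine_similarity(query, doc))  # 3 / (1 * 5)
```

For inner product and cosine similarity, larger values mean closer vectors. For L2 distance, smaller values mean closer vectors.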