full_text_search

Function

Performs a full-text search of the database using the BM25 algorithm. If enable_bm25 was set to False when the MilvusDocstore instance was created, this API is unavailable and returns an empty list.

Prototype

def full_text_search(query, top_k, drop_ratio_search, filter_dict)

Parameters

| Parameter | Data Type | Required/Optional | Description |
|---|---|---|---|
| query | String | Required | Text to retrieve. The length range is (0, 1000 × 1000]. |
| top_k | Integer | Optional | Maximum number of best-matching chunks to return. The value range is (0, 10000]. The default value is 3. If top_k is greater than the number of valid chunks found, only the valid chunks are returned. |
| drop_ratio_search | Float | Optional | Proportion of the smallest values to ignore during BM25 retrieval. Use it to balance search accuracy against performance: the ignored values are the ones that contribute least to the final score, so dropping them can speed up the search with minimal impact on accuracy. The value range is [0, 1). The default value is 0.2. For details, see Milvus Sparse Vector Indexes. |
| filter_dict | Dict | Required | Dictionary of retrieval criteria. Currently, results can be filtered only by document ID. The IDs are passed as a list, which can contain at most 1000 × 1000 entries. For example, to filter by document IDs 1, 2, and 4, pass {"document_id": [1, 2, 4]}. |
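
The documented ranges can be checked before a search is issued. The following is a minimal sketch, not part of the library: build_search_args and the three constants are hypothetical names introduced here, and the sketch assumes the parameters can be passed by the keyword names shown in the prototype.

```python
from typing import Any, Dict, List

# Upper bounds taken from the parameter descriptions above.
MAX_QUERY_LENGTH = 1000 * 1000
MAX_TOP_K = 10000
MAX_ID_COUNT = 1000 * 1000


def build_search_args(
    query: str,
    document_ids: List[int],
    top_k: int = 3,
    drop_ratio_search: float = 0.2,
) -> Dict[str, Any]:
    """Hypothetical helper: validate the arguments and return them as keyword arguments."""
    if not 0 < len(query) <= MAX_QUERY_LENGTH:
        raise ValueError("query length must be in (0, 1000 * 1000]")
    if not 0 < top_k <= MAX_TOP_K:
        raise ValueError("top_k must be in (0, 10000]")
    if not 0 <= drop_ratio_search < 1:
        raise ValueError("drop_ratio_search must be in [0, 1)")
    if len(document_ids) > MAX_ID_COUNT:
        raise ValueError("at most 1000 * 1000 document IDs are allowed")
    return {
        "query": query,
        "top_k": top_k,
        "drop_ratio_search": drop_ratio_search,
        # Only the document_id key is documented as a filter criterion.
        "filter_dict": {"document_id": list(document_ids)},
    }
```

Given an existing MilvusDocstore instance named docstore (its construction is outside the scope of this page), the call would then be docstore.full_text_search(**build_search_args("vector database", [1, 2, 4])).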

Return Value

| Data Type | Description |
|---|---|
| List[MxDocument] | If a result is found, a list of MxDocument class instances is returned. If no result is found, an empty list is returned. For details about MxDocument, see MxDocument. |
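
Because an empty list is returned both when nothing matches and when BM25 is disabled, callers may want to handle that case explicitly. The sketch below is illustrative only: search_chunks and StubDocstore are hypothetical names, the stub stands in for a real MilvusDocstore instance so that the example runs on its own, and keyword-argument passing is assumed from the prototype.

```python
import logging
from typing import Any, List

logger = logging.getLogger(__name__)


class StubDocstore:
    """Stand-in for a MilvusDocstore instance (hypothetical, for illustration only)."""

    def __init__(self, chunks: List[Any]) -> None:
        self._chunks = chunks

    def full_text_search(self, query, top_k, drop_ratio_search, filter_dict):
        # A real docstore would rank chunks with BM25; the stub returns the first top_k.
        return self._chunks[:top_k]


def search_chunks(docstore: Any, query: str, document_ids: List[int], top_k: int = 3) -> List[Any]:
    """Hypothetical wrapper: run a full-text search and log when nothing comes back."""
    results = docstore.full_text_search(
        query=query,
        top_k=top_k,
        drop_ratio_search=0.2,  # documented default
        filter_dict={"document_id": document_ids},
    )
    if not results:
        # The return value alone cannot distinguish "no match" from
        # "enable_bm25 was False when the docstore was created".
        logger.warning("Empty result: no chunk matched, or BM25 is disabled.")
    return results
```

With a real docstore, the elements of the returned list would be MxDocument instances.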