embed_documents

Function

Uses an embedding model to convert the input texts into vectors.

Prototype

def embed_documents(texts, batch_size=32)

Parameters

| Parameter | Data Type | Required/Optional | Description |
| --- | --- | --- | --- |
| texts | List[str] | Required | List of texts to embed. The list length must be in the range [1, 1000 × 1000], and the length of each string must be in the range [1, 128 × 1024 × 1024]. |
| batch_size | Integer | Optional | Batch size. The texts are embedded in groups of batch_size items at a time. The value range is [1, 1024]. The default value is 32. The largest usable value depends on the device's graphics memory. |
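The way batch_size groups texts can be illustrated with a minimal sketch. This is not the library's implementation: iter_batches and the three constants are hypothetical names introduced here, and the sketch assumes that string length is counted in characters with len(), which the description does not specify.

```python
from typing import Iterator, List

# Limits copied from the parameter descriptions above (names are hypothetical).
MAX_TEXTS = 1000 * 1000
MAX_TEXT_LEN = 128 * 1024 * 1024
MAX_BATCH_SIZE = 1024


def iter_batches(texts: List[str], batch_size: int = 32) -> Iterator[List[str]]:
    """Check the documented ranges, then yield texts in groups of batch_size."""
    if not 1 <= len(texts) <= MAX_TEXTS:
        raise ValueError("texts must contain between 1 and 1000 * 1000 items")
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError("batch_size must be in the range [1, 1024]")
    for text in texts:
        # Assumption: length is measured in characters, not bytes.
        if not 1 <= len(text) <= MAX_TEXT_LEN:
            raise ValueError("each text length must be in [1, 128 * 1024 * 1024]")
    # The last batch may be shorter than batch_size.
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


batches = list(iter_batches(["a", "b", "c", "d", "e"], batch_size=2))
```

With five texts and batch_size=2, the sketch produces three groups, the last holding a single text. A larger batch_size means fewer model calls but more graphics memory per call.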

Return Value

| Data Type | Description |
| --- | --- |
| List[Dict[int, float]] | List of vectors produced from texts, one per input text. |

For example, if texts is a list of length 4 and the embedding model outputs, for each text, a dictionary whose keys are token IDs (token_id) and whose values are token weights (token_weights), the final output is a list of 4 elements, each of which is a dictionary.
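The shape of the return value can be sketched with a toy stand-in for the model. toy_sparse_embed is a hypothetical function written only for this illustration: it derives fake token IDs from character codes and uses word counts as weights, which is not how a real tokenizer or embedding model behaves.

```python
from typing import Dict, List


def toy_sparse_embed(texts: List[str]) -> List[Dict[int, float]]:
    """Hypothetical stand-in that mimics the List[Dict[int, float]] return shape."""
    vectors: List[Dict[int, float]] = []
    for text in texts:
        weights: Dict[int, float] = {}
        for word in text.split():
            # Toy token ID from character codes; a real model uses its tokenizer.
            token_id = sum(ord(ch) for ch in word) % 1000
            # Toy weight: how many times the word occurs in the text.
            weights[token_id] = weights.get(token_id, 0.0) + 1.0
        vectors.append(weights)
    return vectors


result = toy_sparse_embed(["red apple", "green apple", "blue sky", "sky"])

print(len(result))  # one dictionary per input text
for vector in result:
    print(vector)   # maps token_id (int) to token weight (float)
```

Four input texts give a list of four dictionaries. Each dictionary holds only the tokens present in its text, so the dictionaries generally differ in size.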