chat
Function
Interacts with vision-language model (VLM) services to obtain the inference result of a VLM.
Prototype
def chat(image_url, prompt, sys_messages, role, llm_config)
Parameters
Parameter |
Data Type |
Required/Optional |
Description |
|---|---|---|---|
image_url |
Dict |
Required |
Dictionary containing the Base64 code of an image. The key is url, and the value is a string with image_base64 as the variable. In the example {"url": f"data:image/jpeg;base64,{image_base64}"}, image_base64 indicates the Base64 code of an image. The length range is [1, 4 × 1024 × 1024]. |
sys_messages |
List[dict] |
Optional |
System message. The list can contain up to 16 messages. Each dictionary within the list can have a maximum length of 16. The maximum length for dictionary key strings is 16, and the maximum length for value strings is 4 × 1024 × 1024. The default value is None. |
role |
String |
Optional |
Role of an inference request. The length range is [1, 16]. The default value is user. |
llm_config |
LLMParameterConfig |
Optional |
Parameters for calling an LLM. For details, see LLMParameterConfig. |
Return Value
Data Type |
Description |
|---|---|
String |
Summary of the VLM's description of the image content. |