chat_streamly

Function

Communicates with LLM services to obtain the streaming inference result of an LLM.

Prototype

def chat_streamly(query, sys_messages, role, llm_config)

Parameters

Parameter

Data Type

Required/Optional

Description

query

String

Required

Inference request text. The string length falls within the range of [1, 4 × 1024 × 1024].

sys_messages

List[dict]

Optional

System message. The list can contain up to 16 messages. Each dictionary within the list can have a maximum length of 16. The maximum length for dictionary key strings is 16, and the maximum length for value strings is 4 × 1024 × 1024. The default value is None.

role

String

Optional

Role of an inference request. The length range is [1, 16]. The default value is user.

llm_config

LLMParameterConfig

Optional

Parameters for calling an LLM. For details, see LLMParameterConfig.

Return Value

Data Type

Description

Iterator[str]

Streaming inference result of an LLM.