summarize

Function

Summarizes a list of texts by using an LLM.

Prototype

def summarize(texts, not_summarize_threshold, prompt)

Parameters

Parameter

Data Type

Required/Optional

Description

texts

List[str]

Required

List of input texts. The list length must be in the range (0, 1024], and the length of each text in the list must be in the range (0, 1024 × 1024].

not_summarize_threshold

Integer

Optional

Length threshold at or below which a text is not summarized. An LLM may fail to summarize a very short text or may produce an incorrect summary for it. If the length of a text is less than or equal to not_summarize_threshold, the model does not summarize it, and the original text is returned as its summary. The default value is 30, and the value range is (0, 1024 × 1024].

prompt

langchain_core.prompts.PromptTemplate

Optional

Prompt template used for summarization. The input_variables of the prompt must be ["text"], where text is the input text. The template length must be in the range (0, 1024 × 1024]. The query sent to the LLM is the prompt combined with a text, so the maximum valid query length depends on the MindIE configuration. For details, see the description of maxSeqLen in "Core Concepts and Configurations" > "Configuration Parameters (Serving)" in MindIE LLM Development Guide. Write the prompt in the same language as the text, or specify the language of the LLM answer in the prompt. Otherwise, the quality of the answer may degrade. Default value:

_SUMMARY_TEMPLATE = PromptTemplate(
    input_variables=["text"],
    template="""Use simple Chinese to summarize the following content, including as much key information as possible. The output should contain only the content information.\n\n{text}""",
)
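The documented ranges can be checked before summarize is called. The following is a minimal sketch, not part of the SDK: check_summarize_args is a hypothetical helper, the template is passed as a plain string rather than a PromptTemplate, lengths are assumed to be measured with len(), and raising ValueError is an assumption, because this page does not say how invalid arguments are reported.

```python
from string import Formatter

MAX_TEXTS = 1024        # upper bound of the texts list length: (0, 1024]
MAX_LEN = 1024 * 1024   # upper bound for text, template, and threshold


def check_summarize_args(texts, not_summarize_threshold=30, template=None):
    """Hypothetical pre-check that mirrors the documented parameter ranges."""
    if not 0 < len(texts) <= MAX_TEXTS:
        raise ValueError(f"texts must contain 1 to {MAX_TEXTS} items")
    for i, text in enumerate(texts):
        if not 0 < len(text) <= MAX_LEN:
            raise ValueError(f"length of texts[{i}] must be in (0, {MAX_LEN}]")
    if not 0 < not_summarize_threshold <= MAX_LEN:
        raise ValueError(f"not_summarize_threshold must be in (0, {MAX_LEN}]")
    if template is not None:
        if not 0 < len(template) <= MAX_LEN:
            raise ValueError(f"template length must be in (0, {MAX_LEN}]")
        # Collect the placeholder names used by the template string.
        fields = {name for _, name, _, _ in Formatter().parse(template)
                  if name is not None}
        if fields != {"text"}:
            raise ValueError("template must use exactly one variable: text")


# Passes silently: one text, default threshold, template with {text} only.
check_summarize_args(["A short document."], template="Summarize:\n\n{text}")
```

Running such a check on the caller side keeps out-of-range inputs from reaching the model service at all.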

Return Value

Data Type

Description

List[str]

List of summarized texts. A text whose length does not exceed not_summarize_threshold is returned unchanged.
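The parameter descriptions above imply the control flow sketched below. This is an illustration under stated assumptions, not the SDK implementation: summarize_sketch and fake_llm are hypothetical names, the LLM is modeled as a plain callable that receives the query string, and the prompt is a plain format string standing in for a PromptTemplate with input_variables=["text"].

```python
from typing import Callable, List

# Plain-string stand-in for a PromptTemplate whose only variable is "text".
TEMPLATE = "Summarize the following content in one sentence.\n\n{text}"


def summarize_sketch(texts: List[str],
                     llm: Callable[[str], str],
                     not_summarize_threshold: int = 30,
                     template: str = TEMPLATE) -> List[str]:
    """Hypothetical model of the documented behavior of summarize."""
    summaries = []
    for text in texts:
        if len(text) <= not_summarize_threshold:
            # Short text: skip the model and keep the original text.
            summaries.append(text)
        else:
            # The query is the prompt combined with the text.
            query = template.format(text=text)
            summaries.append(llm(query))
    return summaries


def fake_llm(query: str) -> str:
    """Hypothetical stand-in for a model call: truncates the text part."""
    text = query.split("\n\n", 1)[1]
    return text[:20] + "..."


results = summarize_sketch(["Too short to summarize.", "A" * 100], fake_llm)
# results[0] is the original 23-character text, returned unchanged.
# results[1] is whatever the stand-in model returned for the long text.
```

In this sketch the first input is at or below the default threshold of 30 and bypasses the model, while the second is sent to the model as a formatted query.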