generate_evaluate_data
Function
Generates a certain number of questions for each document in the list, producing an initial domain-specific evaluation dataset for subsequent manual filtering.
Prototype
def generate_evaluate_data(split_doc_list: list[str], generate_qd_prompt: str , question_number: int, batch_size: int)
Parameters
Parameter |
Data Type |
Required/Optional |
Description |
|---|---|---|---|
split_doc_list |
list[str] |
Required |
Original document list. The list length range is [1, 1000 × 1000], and the string length range is [1, 128 × 1024 × 1024]. |
generate_qd_prompt |
String |
Optional |
Prompt for generating an evaluation dataset, which can be modified based on the domain characteristics. The value range is (0, 1 × 1024 × 1024]. The default value is GENERATE_QD_PROMPT. |
question_number |
Integer |
Optional |
Number of questions generated for each original document chunk. A higher value ensures more comprehensive coverage. Increasing this value improves fine-tuning effect at the expense of longer time. The default value is 3. The value range is (0, 20]. |
batch_size |
Integer |
Optional |
Number of concurrent records when evaluation data is synthesized. The default value is 8. The value range is (0, 1024]. |
The GENERATE_QD_PROMPT is defined as follows:
GENERATE_QD_PROMPT = """Read an article and generate a related question.
Article: Climate change has severely altered marine ecosystems through escalating sea temperatures, rising sea levels, and ocean acidification. These shifts have fundamentally disrupted species distribution, ecosystem stability, and global fisheries. Consequently, in an era of accelerating global warming, the conservation of marine environments has become an urgent international priority.
Question: What are the main impacts of climate change on marine ecosystems?
Article: The retail sector represents another critical frontier for AI-driven transformation. By leveraging data analytics and machine learning algorithms, retailers can gain deeper insights into consumer behavior, emerging trends, and personal preferences. These technologies enable brands to optimize inventory management, refine recommendation engines, and sharpen marketing strategies—ultimately driving both sales growth and customer loyalty.
Question: How does AI help retailers improve customer experience and sales performance?
Ask {question_number} questions about the following article according to the preceding examples:
Article: {doc}
Output format: Number questions starting from 1 and do not include numbers after the colon in each entry.
Question 1:
...
""":