MxRAGCache and Automatic QA Generation

Sample Introduction

This example extends Knowledge Base Building and QA Retrieval with MxRAGCache and automatic QA generation. The automatic QA generation function supports Markdown parsing and saves its results to MxRAGCache. The example uses both the memory cache and the similarity cache.
Figure 1 Cache-based RAG SDK process
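The following is a minimal sketch of how a memory cache and a similarity cache can work together. It assumes that the memory cache matches on the exact query text and that the similarity cache matches on embedding similarity. The TwoLevelCache class, the toy bag-of-words embed function, and the 0.8 threshold are hypothetical and are not the MxRAGCache API.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding; a real setup would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TwoLevelCache:
    """Hypothetical illustration: exact-match lookup first, similarity lookup second."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.memory = {}   # exact query text -> answer
        self.similar = []  # list of (embedding, answer) pairs

    def put(self, query, answer):
        self.memory[query] = answer
        self.similar.append((embed(query), answer))

    def get(self, query):
        # Level 1: exact string match.
        if query in self.memory:
            return self.memory[query]
        # Level 2: closest stored query, accepted only above the threshold.
        q = embed(query)
        best_score, best_answer = 0.0, None
        for vec, answer in self.similar:
            score = cosine(q, vec)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None


cache = TwoLevelCache(threshold=0.8)
cache.put("what is the gaokao composition topic",
          "It asks whether quick answers leave us with fewer problems.")
exact = cache.get("what is the gaokao composition topic")    # exact hit
near = cache.get("what is the gaokao composition topic ?")   # similarity hit
miss = cache.get("how do I compile the retrieval operator")  # no hit
```

On a miss, the caller would run the full retrieval and generation pipeline and then store the answer with put so that later queries can hit either level.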

Prerequisites

  • You have downloaded and run Llama3-8B-Chinese-Chat in the MindIE container. (You can download the model from here.)
  • The RAG SDK container needs to access the config.json and tokenizer.json files in the path of the Llama3-8B-Chinese-Chat model to calculate the text token size.
  • You have completed containerized deployment on the host by referring to "Installing MindIE" > "Mode 3: Container Installation" in MindIE Installation Guide and started the service by referring to "Quick Start" > "Service Startup" in MindIE Motor Development Guide.
  • You have completed operations in Installing RAG SDK.
  • You have downloaded acge_text_embedding and bge-reranker-large and saved them to the model storage directory configured when the container is run in 2.a. Model download links:

Procedure

  1. Compile the retrieval operator to implement the retrieval function.
    cd $MX_INDEX_INSTALL_PATH/tools/ && python3 aicpu_generate_model.py -t <chip_type> && python3 flat_generate_model.py -d <dim> -t <chip_type>  && cp op_models/* $MX_INDEX_MODELPATH 
    • The MX_INDEX_INSTALL_PATH and MX_INDEX_MODELPATH variables have been configured in ~/.bashrc and do not need to be configured separately. For more details, see ~/.bashrc.
    • -d <dim> specifies the dimension of the vectors produced by the embedding model. The acge_text_embedding model outputs 1024-dimensional vectors, so set this parameter to -d 1024.
    • -t <chip_type> specifies the processor type. How to obtain the value depends on the product:
      • Atlas 300I Duo inference card: Run the npu-smi info command on the server where the Ascend AI Processor is installed, and then delete the last digit of Name. The remaining string is the value of <chip_type>.
      • Atlas 800I A2 inference server: Run the npu-smi info command on the server where the Ascend AI Processor is installed. The value of Name is the value of <chip_type>.
      • Atlas 800I A3 SuperPoD Server: Run the npu-smi info -t board -i 0 -c 0 command to obtain the NPU Name information. The value of <chip_type> is 910_<NPU Name>.
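The <chip_type> rules in step 1 can be expressed as a small helper. This is only a sketch: the function name chip_type_from_name, the product labels used as keys, and the sample Name strings are hypothetical placeholders and are not part of the RAG SDK. In practice, Name comes from the npu-smi output on your server.

```python
def chip_type_from_name(product, name):
    """Derive <chip_type> from the Name reported by npu-smi (hypothetical helper)."""
    if product == "Atlas 300I Duo":
        # Rule: delete the last digit of Name.
        if not name or not name[-1].isdigit():
            raise ValueError("expected Name to end with a digit")
        return name[:-1]
    if product == "Atlas 800I A2":
        # Rule: Name is used unchanged.
        return name
    if product == "Atlas 800I A3":
        # Rule: prefix the NPU Name with "910_".
        return "910_" + name
    raise ValueError("unknown product: " + product)


# Placeholder Name values, made up purely for illustration.
duo = chip_type_from_name("Atlas 300I Duo", "AAA3")
a2 = chip_type_from_name("Atlas 800I A2", "BBB4")
a3 = chip_type_from_name("Atlas 800I A3", "CCC")
```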
  2. Create a domain-specific knowledge document.

    Create a gaokao.md file in the UTF-8 format in the /home/HwHiAiUser directory. The file content is as follows:

    Composition Test of the 2024 National College Entrance Examination
    New Course Standard (I)
    Read the following materials and write a composition. (60 points)
    With the popularization of the Internet and artificial intelligence, more and more questions can be quickly answered. So, will we have fewer problems?
    How do you think about the above materials? Please write a composition no fewer than 800 words.
    Requirements: Select a proper angle and style to describe your opinions. Prepare your own title. Do not copy other articles, and do not disclose personal information.

    The training data cutoff of the selected model is earlier than 2024, so the model itself has no knowledge of the composition test of the 2024 National College Entrance Examination.

  3. Prepare rag_demo_cache_qa.py by referring to the Demo. In the code, modify default parameters such as the file path, the model path, and the IP address and port number of the foundation model as required. For details, see the README.md file.
  4. Execute the sample code.
    python3 rag_demo_cache_qa.py  --query "Describe the requirements of the composition test of the 2024 National College Entrance Examination."
  5. Run the sample code twice to obtain the result.
    # Both executions return the same result. The second execution is served from the cache, which significantly reduces the response time.
    {'query': 'Describe the requirements of the composition test of the 2024 National College Entrance Examination', 'result': 'The specific content of the composition test for the 2024 National College Entrance Examination has not been disclosed. Generally, the composition test will be published by the education department on the day of the examination or a period of time before the examination. We cannot provide you with the specific content of the examination.\n\nHowever, according to the information you provide, the question may center around the theme "With the popularization of the Internet and artificial intelligence, more and more questions can be quickly answered. So, will we have fewer problems?", requiring students to write a unique article no less than 800 words, with a proper perspective and style.\n\nIf you need further guidance or help, such as conceiving compositions, organizing ideas, and improving writing quality, we can provide some general suggestions.', 'source_documents': [{'metadata': {'source': '/home/HwHiAiUser/gaokao.md'}, 'page_content': 'Composition Test of the 2024 National College Entrance Examination\nRead the following materials and write a composition. (60 points)\nWith the popularization of the Internet and artificial intelligence, more and more questions can be quickly answered. So, will we have fewer problems?\nHow do you think about the above materials? Please write a composition no less than 800 words.\nRequirements: Select a proper angle and style to describe your opinions. Prepare your own title. Do not copy other articles, and do not disclose personal information.\n'}]}
    Duration: 0.0007343292236328125s
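
The effect of a cache hit on the duration can be reproduced in isolation with a small timing harness. This is a sketch under stated assumptions: slow_answer is a hypothetical stand-in for the retrieval and generation pipeline (the 0.05 s sleep simulates its latency), and the dictionary stands in for the cache. None of these names come from rag_demo_cache_qa.py.

```python
import time


def slow_answer(query):
    """Stand-in for the retrieval + LLM pipeline; the sleep simulates its latency."""
    time.sleep(0.05)
    return "answer to: " + query


_cache = {}


def cached_answer(query):
    """Return the cached answer when present; otherwise compute and store it."""
    if query not in _cache:
        _cache[query] = slow_answer(query)
    return _cache[query]


def timed(query):
    """Run one query and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = cached_answer(query)
    return result, time.perf_counter() - start


first_result, first_duration = timed("composition requirements")
second_result, second_duration = timed("composition requirements")
print(f"first run:  {first_duration:.6f}s")
print(f"second run: {second_duration:.6f}s")
```

The two results are identical, and the second duration covers only the cache lookup, which mirrors the behavior described in step 5.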