Class Introduction

Function

MxDocument is a custom document class that stores the data produced after a document has been loaded and parsed.

Prototype

from mx_rag.storage.document_store import MxDocument
class MxDocument(BaseModel):
    page_content: str
    metadata: dict
    document_name: str

Parameters

Parameter | Data Type | Required/Optional | Description
--- | --- | --- | ---
page_content | String | Required | Text after splitting. The length range is [0, 16 MB].
metadata | Dict | Optional | Metadata, for example, {'source': '/home/HwHiAiUser/gaokao.txt'}. The dictionary length cannot exceed 1024 characters. The length of each character string in the dictionary cannot exceed 128 × 1024 × 1024 characters. The number of nested dictionary layers cannot exceed 1.
document_name | String | Required | File name. The length range is [0, 1024].
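
The limits above can be checked before an MxDocument is constructed. The following is a minimal standalone sketch, not the mx_rag implementation: the helper check_document_fields and its constants are hypothetical. It assumes that "16 MB" means 16 × 1024 × 1024 characters, that "dictionary length" means the number of entries, and that a nesting limit of 1 means metadata values may not themselves be dictionaries. It leaves out the per-string length limit.

```python
# Hypothetical pre-check of the documented limits; mx_rag's own validation
# is not shown in this document and may differ.
MAX_PAGE_CONTENT = 16 * 1024 * 1024  # assumption: "16 MB" counted in characters
MAX_METADATA_ITEMS = 1024            # assumption: "dictionary length" = len(metadata)
MAX_DOCUMENT_NAME = 1024


def check_document_fields(page_content, metadata, document_name):
    """Return a list of violated limits; an empty list means every check passed."""
    problems = []
    if len(page_content) > MAX_PAGE_CONTENT:
        problems.append("page_content too long")
    if len(document_name) > MAX_DOCUMENT_NAME:
        problems.append("document_name too long")
    if len(metadata) > MAX_METADATA_ITEMS:
        problems.append("metadata has too many entries")
    for key, value in metadata.items():
        # Assumption: the top-level dict is the single allowed layer,
        # so any dict-valued entry counts as excess nesting.
        if isinstance(value, dict):
            problems.append(f"metadata[{key!r}] is a nested dict")
    return problems


print(check_document_fields("text", {"source": "/home/HwHiAiUser/gaokao.txt"}, "gaokao.txt"))  # → []
```

Returning a list of problems instead of raising on the first one lets a caller report every violated limit in a single pass.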

Example

from langchain_community.document_loaders import TextLoader
from mx_rag.storage.document_store import MxDocument

# Load the text file; load() returns a list of documents, so take the first one.
loader = TextLoader("/xxx/gaokao.txt", encoding="utf-8")
document = loader.load()[0]

# Wrap the loaded text and metadata in an MxDocument.
mx_document = MxDocument(page_content=document.page_content, metadata=document.metadata, document_name="gaokao.txt")