Class Introduction

Function

Class that parses Markdown files and returns their titles and contents.

Prototype

from mx_rag.cache import MarkDownParser
MarkDownParser(file_path, max_file_num)

Parameters

| Parameter | Data Type | Required/Optional | Description |
|---|---|---|---|
| file_path | String | Required | Path of the folder where the Markdown files are located. The path length cannot exceed 1024 characters. When the parse function is called, the following items are verified: the path cannot be a soft link or a relative path; the size of a Markdown file in the folder cannot exceed 10 MB, and the number of Markdown files cannot exceed max_file_num; the path cannot be in the path list ["/etc", "/usr/bin", "/usr/lib", "/usr/lib64", "/sys/", "/dev/", "/sbin", "/tmp"]. |
| max_file_num | Integer | Optional | Maximum number of Markdown files that can be parsed. The default value is 1000. The value range is [1, 10000]. |
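The checks listed for file_path can be run ahead of time so that a bad folder is caught before parse() is called. The following is a minimal sketch, not the mx_rag implementation: check_markdown_dir and its blocked_dirs argument are hypothetical names, and it assumes that 10 MB means 10 * 1024 * 1024 bytes, that a path equal to or below a listed directory counts as blocked, and that only files directly in the folder with a .md suffix are counted.

```python
import os

# Limits copied from the parameter description above.
MAX_PATH_LEN = 1024
MAX_FILE_SIZE = 10 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
BLOCKED_DIRS = ("/etc", "/usr/bin", "/usr/lib", "/usr/lib64",
                "/sys/", "/dev/", "/sbin", "/tmp")


def check_markdown_dir(file_path, max_file_num=1000, blocked_dirs=BLOCKED_DIRS):
    """Hypothetical helper: list the documented rules that file_path breaks."""
    problems = []
    if not 1 <= max_file_num <= 10000:
        problems.append("max_file_num must be in [1, 10000]")
    if len(file_path) > MAX_PATH_LEN:
        problems.append("path is longer than 1024 characters")
    if not os.path.isabs(file_path):
        problems.append("path must be absolute")
    if os.path.islink(file_path):
        problems.append("path must not be a soft link")

    # Assumption: the folder is blocked if it equals or sits below a listed path.
    normalized = os.path.normpath(file_path)
    for blocked in blocked_dirs:
        root = os.path.normpath(blocked)
        if normalized == root or normalized.startswith(root + os.sep):
            problems.append(f"path is inside blocked directory {blocked}")

    if not os.path.isdir(file_path):
        problems.append("path is not an existing folder")
        return problems

    # Assumption: only top-level entries ending in .md count as Markdown files.
    md_files = [n for n in os.listdir(file_path) if n.lower().endswith(".md")]
    if len(md_files) > max_file_num:
        problems.append(
            f"folder holds {len(md_files)} markdown files, limit is {max_file_num}"
        )
    for name in md_files:
        if os.path.getsize(os.path.join(file_path, name)) > MAX_FILE_SIZE:
            problems.append(f"{name} is larger than 10 MB")
    return problems
```

An empty list means none of the documented rules was violated under these assumptions. The library performs its own verification when parse() is called and may apply the rules differently.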

Return Value

| Data Type | Description |
|---|---|
| Tuple[List[str], List[str]] | Title and content lists of the parsed Markdown files. |
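Because the result is a pair of lists, a caller will often want to walk through them together. The sketch below assumes that the title at index i belongs to the content at index i, which the description above does not state explicitly. pair_sections is a hypothetical helper and the sample lists are made-up stand-ins for the output of parse().

```python
from typing import List, Tuple


def pair_sections(titles: List[str], contents: List[str]) -> List[Tuple[str, str]]:
    """Hypothetical helper: pair each title with the content at the same index."""
    # Refuse to pair silently if the index-alignment assumption cannot hold.
    if len(titles) != len(contents):
        raise ValueError("titles and contents differ in length")
    return list(zip(titles, contents))


# Made-up sample data standing in for: titles, contents = parser.parse()
titles = ["Install", "Usage"]
contents = ["Run the installer.", "Call the parser."]

for title, content in pair_sections(titles, contents):
    print(f"{title}: {content}")
```

The length check makes a mismatch visible instead of letting zip drop the unmatched tail.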

Example

from paddle.base import libpaddle
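from mx_rag.cache import MarkDownParser

# Absolute path of the folder that holds the .md files (placeholder value).
dir_path = "/path/to/markdown_dir"
parser = MarkDownParser(dir_path)
# parse() returns the title list and the content list.
titles, contents = parser.parse()
print(titles)
print(contents)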