Terms, Acronyms, and Abbreviations

| Term/Acronym/Abbreviation | Description |
| --- | --- |
| ACP | async checkpoint persistence |
| DPC | distributed parallel client |
| LLM | large language model |
| MemFS | memory file system |
| MindIO | Memory cache system that improves the read and write speed of training checkpoints |