Overview

The cache has two tiers: a memory cache (hash map, exact match) and a similarity cache (vector database, embeddings, similarity calculation). The memory cache answers a request when the user's question exactly matches a stored question. The similarity cache answers it when the question is similar to a stored one but not identical. In practice the two can be cascaded, typically by checking the memory cache first and falling back to the similarity cache on a miss.
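The cascade can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the `CascadedCache` class, the `toy_embed` function (a bag-of-words count standing in for a real embedding model), the in-memory list standing in for a vector database, and the `threshold` value. It is not the actual implementation, only one way the two tiers might fit together.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class CascadedCache:
    """Exact-match tier first, similarity tier as fallback (illustrative only)."""

    def __init__(self, threshold=0.8):
        self.exact = {}      # memory cache: question text -> answer
        self.entries = []    # similarity cache: list of (embedding, answer)
        self.threshold = threshold  # assumed cutoff; tune for a real model

    def put(self, question, answer):
        self.exact[question] = answer
        self.entries.append((toy_embed(question), answer))

    def get(self, question):
        # Tier 1: hash-map lookup, hits only on an identical question.
        if question in self.exact:
            return self.exact[question], "memory"
        # Tier 2: brute-force nearest neighbour over stored embeddings.
        query = toy_embed(question)
        best_score, best_answer = 0.0, None
        for embedding, answer in self.entries:
            score = cosine(query, embedding)
            if score > best_score:
                best_score, best_answer = score, answer
        if best_answer is not None and best_score >= self.threshold:
            return best_answer, "similarity"
        return None, "miss"
```

A caller would `put` each question and answer once, then call `get` on every incoming question. The second element of the returned tuple reports which tier answered, which helps when tuning the threshold: set it too low and unrelated questions share answers, set it too high and the similarity tier rarely hits.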