Description

  • For details about other user-side APIs, see "API Description" > "RESTful API Reference" > "RESTful APIs on the EndPoint Service Plane" in the MindIE LLM Development Guide.

    {ip} and {port} in the API format must meet the following requirements:

    • Preferably set {ip} to the {predict_ip} value in Startup Command. If that parameter is not configured there, set {ip} to the value of the predict_ip parameter in the ms_coordinator.json configuration file.
    • Preferably set {port} to the {predict_port} value in Startup Command. If that parameter is not configured there, set {port} to the value of the predict_port parameter in the ms_coordinator.json configuration file.
  • Only administrators can use Intra-cluster Communication APIs, and these APIs can be accessed only within the cluster.
  • Configure ulimit.
    Check the maximum number of open files allowed in the operating environment. Run the following command to view the current limit:
    ulimit -n

    The Coordinator communicates with users over HTTPS. Each inference request it receives creates a socket, which counts as an open file, so the number of open socket files grows with the number of concurrent inference requests. If the number of open files exceeds the upper limit, the Coordinator program fails to run.

    If the maximum number of files is too small, you are advised to set ulimit to 3 times the maximum number of concurrent requests. For example, if the maximum number of concurrent requests is 500, the recommended value is 1500 (3 × 500).

    Run the following command to set the ulimit value:
    ulimit -n 1500
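
The sizing rule above can be scripted so the limit is checked before the Coordinator starts. The following is a minimal sketch, not part of the product: the function name recommended_nofile and the variable MAX_CONCURRENCY are hypothetical, and the value 500 is only the example figure used above.

```shell
# Hypothetical helper: recommended open-file limit = 3 x max concurrent requests.
recommended_nofile() {
    max_concurrency="$1"
    echo $((3 * max_concurrency))
}

# Assumed example value; replace with your own peak concurrency.
MAX_CONCURRENCY=500
target=$(recommended_nofile "$MAX_CONCURRENCY")
current=$(ulimit -n)

# "unlimited" is already sufficient; otherwise compare numerically.
if [ "$current" != "unlimited" ] && [ "$current" -lt "$target" ]; then
    # Raising the soft limit fails if the hard limit (ulimit -Hn) is lower.
    ulimit -n "$target" || echo "could not raise limit to $target; check ulimit -Hn" >&2
fi

echo "open-file limit: $(ulimit -n) (recommended: $target)"
```

A limit set with ulimit applies only to the current shell and the processes started from it, so the Coordinator would need to be launched from the same shell session for the new value to take effect.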