• Containerized: Linux containers running an Ubuntu desktop; one container per Agent instance.
• Ready to use: Full Ubuntu desktop and isolated workspace, with a browser and common tools preinstalled.
• Secure isolation: Dedicated storage and network; CPU and memory quotas supported.
• Self-service: Access the desktop in a browser and configure agents, models, and IM channels on your own.
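The per-instance isolation and quota model above can be sketched with plain Docker flags. This is an illustration only, not the product's actual launch command: the image name `ubuntu-agent-desktop`, the volume name, and the ports are placeholders.

```shell
# Isolated bridge network so agent containers cannot see other workloads
docker network create agent-net

# One container per Agent instance, with its own storage and quotas
docker run -d \
  --name agent-desktop-01 \
  --network agent-net \
  --cpus 2 --memory 4g \              # CPU and memory quotas
  -v agent01-data:/home/agent \       # dedicated storage volume
  -p 8443:8443 \                      # browser access to the desktop
  ubuntu-agent-desktop:latest         # hypothetical image name
```

Each additional agent gets its own `docker run` with a distinct name, volume, and published port, so storage and network state never overlap between instances.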
• Containerized: Run LLM stacks such as Ollama and vLLM as standard container instances.
• GPU / NPU: Single-GPU, multi-GPU, or shared-GPU setups with MPS / HAMi partitioning; supports NVIDIA, AMD, and Huawei Ascend.
• Fast model delivery: Model artifacts pre-staged on hosts for sub-second attach; curated open-weight model datasets built in.
• Frameworks: vLLM, Ollama, and ComfyUI for inference, serving, and image generation.
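As a rough sketch of the pattern above, here is how a vLLM instance could be launched against model weights pre-staged on the host. The model path and port are illustrative assumptions; `vllm/vllm-openai` is vLLM's official container image, and `--tensor-parallel-size` shards the model across the exposed GPUs.

```shell
docker run -d --gpus '"device=0,1"' \
  -v /models:/models \                  # pre-staged model artifacts on the host
  -p 8000:8000 \
  vllm/vllm-openai:latest \
  --model /models/Qwen2.5-7B-Instruct \ # hypothetical local model directory
  --tensor-parallel-size 2              # shard across the two exposed GPUs
```

Because the weights are mounted from the host rather than pulled at startup, the container attaches to the model in seconds instead of downloading tens of gigabytes on first launch.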
• Benefits: no GPU capital expense; pay only for token usage.
• Benefits: data stays on the LAN; no public API token charges.
• Benefits: balance cost against the best available model quality.
• Pre-staged models and containers: instances ready in seconds
• Network and storage isolation: layered resource quotas
• GPU sharing and partitioning: maximize utilization
• Leading LLM frameworks: full API surface