
Moore Threads MCCX D800 384GB
The Moore Threads MCCX D800 is a high-performance 4U AI server purpose-built for the most demanding Large Language Model (LLM) workloads.
It integrates eight MTT S4000 GPUs based on the advanced MUSA architecture, providing a formidable computational foundation for both model training and high-throughput inference.
Equipped with dual Intel® Xeon® Gold 6430 processors and 1 TB of high-speed DDR5 memory, the MCCX D800 keeps data flowing smoothly between host and accelerators. Its standout feature is MTLink 1.0 interconnect technology combined with PCIe Gen5 peer-to-peer (P2P) transfers, delivering inter-card bandwidth of up to 240 GB/s. This allows the server to act as a core building block for large KUAE computing clusters with near-linear performance scaling.
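To put the quoted 240 GB/s inter-card bandwidth in context, a back-of-envelope estimate of one full gradient synchronization across the eight GPUs can be sketched with the classic ring all-reduce traffic formula. This is an idealized illustration only (no latency, no compute/communication overlap); the parameter counts and byte sizes are assumptions for the example.

```python
# Back-of-envelope ring all-reduce time on an 8-GPU node, assuming the
# quoted 240 GB/s inter-card bandwidth. Idealized: ignores link latency
# and any overlap of communication with computation.

def ring_allreduce_seconds(param_count, bytes_per_param, n_gpus, bandwidth_gbs):
    """Ring all-reduce moves 2*(N-1)/N of the payload through each GPU's link."""
    payload = param_count * bytes_per_param           # gradient bytes
    traffic = 2 * (n_gpus - 1) / n_gpus * payload     # bytes on the wire per GPU
    return traffic / (bandwidth_gbs * 1e9)

# Example: 70B parameters, fp16 gradients (2 bytes each), 8 cards, 240 GB/s:
t = ring_allreduce_seconds(70e9, 2, 8, 240)
print(f"{t:.2f} s per full gradient sync")  # ~1.02 s
```

In practice, frameworks such as Megatron-LM overlap this communication with backward-pass computation, so the wall-clock cost per step is typically much lower than this upper bound.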
With full CUDA compatibility via the MUSA software stack, the MCCX D800 offers a production-ready environment for deep learning frameworks such as PyTorch and Megatron-LM.
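As an illustration of how such an environment is typically targeted from PyTorch, here is a minimal device-selection sketch. It assumes Moore Threads' `torch_musa` PyTorch extension, which registers a `"musa"` device backend; the exact API surface is an assumption based on that project's published usage, and the sketch degrades gracefully to CUDA or CPU.

```python
# Hedged sketch: pick the best available accelerator at runtime.
# `torch_musa` is Moore Threads' PyTorch plugin; the "musa" device string
# and `torch.musa.is_available()` call are assumptions based on its
# published usage. Everything falls back gracefully if unavailable.
def pick_device():
    try:
        import torch
    except ImportError:
        return "cpu"  # no PyTorch installed at all
    try:
        import torch_musa  # noqa: F401  (importing registers the MUSA backend)
        if torch.musa.is_available():
            return "musa"
    except (ImportError, AttributeError):
        pass
    return "cuda" if torch.cuda.is_available() else "cpu"

print(pick_device())
```

Because the backend is selected by a device string, existing training scripts generally need only this device switch rather than a rewrite, which is the point of the CUDA-compatible MUSA stack.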
Application scenarios
LLM training from 70B to 130B parameters in KUAE clusters.
AI inference (LLaMA, GLM, Baichuan, GPT).
Video analytics and transcoding (up to 768 concurrent 1080p streams per server).
Cloud VDI platforms with vGPU/SR-IOV.