Nvidia calls its new DGX Spark the "smallest AI supercomputer in the world." The compact machine costs around $4,000 and isn’t meant for gamers, but for developers, researchers, and businesses that want to run large AI models locally—without using the cloud. Early reviews show mixed results.
The DGX Spark looks like a miniature version of Nvidia’s larger DGX systems, complete with the same design cues and gold side panels. Nvidia CEO Jensen Huang even presented the first unit symbolically to Elon Musk—a nod to when he handed over the original DGX‑1 to OpenAI in 2016.
Compact system with 128 GB of shared memory
Inside, the DGX Spark runs on Nvidia’s new GB10 chip, built on the Grace‑Blackwell architecture. It combines 20 Arm cores (10 Cortex‑X925 and 10 Cortex‑A725) with a Blackwell GPU, fabricated using TSMC’s 3‑nanometer process. CPU and GPU are directly connected via NVLink C2C.
Memory is the key feature: 128 GB of LPDDR5X with 273 GB/s bandwidth form a shared pool accessible by both CPU and GPU. Nvidia says this allows local execution of models with up to 200 billion parameters (at 4‑bit inference) or roughly 70 billion parameters during fine‑tuning.
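A quick back-of-the-envelope calculation shows why the 128 GB pool lines up with that 200-billion-parameter figure. The sketch below counts only the weights themselves and is an illustration, not an official sizing guide.

```python
# Rough sizing check: weights of a 200B-parameter model at 4-bit quantization.
# Assumption for illustration: KV cache, activations and OS overhead are ignored.
params = 200e9                    # 200 billion parameters
bytes_per_param = 0.5             # 4-bit inference = half a byte per weight
weights_gb = params * bytes_per_param / 1e9
print(f"Weights alone: ~{weights_gb:.0f} GB")   # ~100 GB of the 128 GB pool
# The remaining ~28 GB has to cover KV cache, activations and the OS,
# which is why roughly 200B parameters is about the practical ceiling.
```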
The system includes 6,144 CUDA cores, 192 fifth‑generation Tensor Cores, and a theoretical FP4 throughput of 1 petaFLOP. It also comes with a 4 TB NVMe SSD, four USB‑C ports, HDMI, 10‑Gigabit Ethernet, and two QSFP56 connectors for 200‑Gigabit networks with RDMA support. Multiple DGX Spark units can be linked together through those 200‑Gigabit interfaces to form small clusters.
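As a rough illustration of how two linked Sparks might be used together, the sketch below runs a trivial all-reduce with PyTorch's NCCL backend over the high-speed link. The launch command and settings are assumptions, not Nvidia's documented clustering workflow.

```python
# Minimal two-node sanity check over the 200-Gigabit link (illustrative only).
# Hypothetical launch on each Spark:
#   torchrun --nnodes=2 --nproc_per_node=1 --node_rank=<0|1> \
#            --master_addr=<spark-0-ip> --master_port=29500 allreduce_check.py
import torch
import torch.distributed as dist

dist.init_process_group(backend="nccl")   # NCCL can use RDMA over the QSFP56 ports
rank = dist.get_rank()
x = torch.ones(1, device="cuda") * rank
dist.all_reduce(x)                        # sum the tensor across both nodes
print(f"rank {rank}: all-reduce result = {x.item()}")
dist.destroy_process_group()
```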
Performance: not a speed demon, but dependable
According to tests by The Register, the DGX Spark isn’t optimized for raw speed. It can handle larger models than any current consumer GPU, but it runs them more slowly.
When fine‑tuning a Llama‑3.2 model with 3 billion parameters, the Spark took about 90 seconds per million tokens—roughly twice as long as an RTX 6000 Ada, which quickly hits its 48 GB VRAM limit.
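For context, a fine-tuning job of that kind might look like the minimal LoRA sketch below. It assumes the Hugging Face transformers, peft, and datasets packages plus access to the gated meta-llama/Llama-3.2-3B checkpoint, and it is not The Register’s actual benchmark harness.

```python
# Minimal LoRA fine-tuning sketch for a Llama-3.2-3B-class model (illustrative).
import torch
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from peft import LoraConfig, get_peft_model

model_id = "meta-llama/Llama-3.2-3B"          # 3B model class used in the test
tok = AutoTokenizer.from_pretrained(model_id)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# LoRA adapters keep optimizer state small, so the whole job fits comfortably
# in the 128 GB shared memory pool.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

texts = ["Example training sentence."] * 64   # placeholder corpus
ds = Dataset.from_dict({"text": texts}).map(
    lambda b: tok(b["text"], truncation=True, max_length=512), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=1,
                           num_train_epochs=1, bf16=True),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```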
For image generation using the FLUX.1 Dev model, the Spark produced one image in about 97 seconds, compared to 37 seconds on the RTX 6000 Ada. On the other hand, the Spark’s power draw remained relatively modest at 40‑45 watts at idle and around 200 watts under load, while the RTX 6000 Ada alone consumes about 300 watts.
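The image-generation workload is essentially a text-to-image run like the hedged sketch below, which assumes the diffusers library and access to the FLUX.1 Dev weights; the prompt, step count, and resolution are illustrative rather than the reviewers’ exact settings.

```python
# Illustrative FLUX.1 Dev text-to-image run (not the benchmark's exact setup).
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev",
                                    torch_dtype=torch.bfloat16).to("cuda")

image = pipe("a small gold-paneled workstation on a desk",
             num_inference_steps=28,          # typical default for FLUX.1 Dev
             guidance_scale=3.5,
             height=1024, width=1024).images[0]
image.save("flux_test.png")
```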
In language-model tests with llama.cpp and TensorRT‑LLM, throughput ranged from 14 to 49 tokens per second, depending on model size and batch configuration.
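Tokens-per-second figures like these are easy to reproduce locally. The sketch below uses the llama-cpp-python bindings with a hypothetical quantized GGUF file, so the model path and settings are placeholders, not the benchmark configuration.

```python
# Simple tokens-per-second measurement with llama-cpp-python (illustrative).
import time
from llama_cpp import Llama

llm = Llama(model_path="llama-3.1-8b-instruct-q4_k_m.gguf",  # hypothetical local file
            n_gpu_layers=-1,        # offload all layers to the GPU
            n_ctx=4096)

t0 = time.time()
out = llm("Explain unified memory in one paragraph.", max_tokens=256)
elapsed = time.time() - t0
n_tokens = out["usage"]["completion_tokens"]
print(f"{n_tokens / elapsed:.1f} tokens/s")
```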
The preinstalled DGX OS is built on Ubuntu 24.04 and comes with CUDA, Docker, drivers, and management tools. Nvidia Sync allows remote operation via VPN or SSH, making it easy to manage web interfaces and development environments from afar.
Competitors: Mac Studio, Strix Halo, and Jetson Thor
Nvidia’s main competition will likely come from Apple’s Mac Studio with the M4 chip and AMD’s upcoming Strix Halo systems. Both offer similar memory capacities but differ in software ecosystems: Nvidia relies on CUDA and its long‑optimized stack, while Apple uses Metal and AMD depends on ROCm.
The Register also points out Nvidia’s own Jetson Thor Developer Kit as an internal rival. Based on the same Blackwell architecture, it delivers twice the FP4 performance and includes 128 GB of memory for about the same price. However, Jetson Thor targets robotics and embedded systems, while the DGX Spark is clearly positioned as a developer platform.
A portable AI workstation?
ServeTheHome calls the DGX Spark a "game-changer for local AI development." With its large shared memory and the option of a high-bandwidth link to a second Spark, it can run large models locally without relying on the cloud, an interesting option for research, data analysis, or internal business applications. However, it is not suited to gaming or multimedia work.
The Register offers a more reserved take: the DGX Spark isn’t built for maximum speed but "it’s about doing everything well enough." For many users, that balance may be exactly what they need.
How large that audience is remains to be seen—and will likely depend on whether AMD can close the software gap. The GB10 chip itself is expected to appear in Windows PCs soon.