
Nvidia is rebuilding the DGX Spark software stack around open-source AI frameworks, turning the compact system into a local, on-prem AI node that lets developers and enterprises run powerful open-source AI workflows without relying on the cloud.
At CES 2026, Nvidia announced a software-only update for DGX Spark, repositioning the compact system from a standalone developer device into a local, on-prem AI compute node built around open-source AI ecosystems.
The update significantly expands native support for open-source AI frameworks and community-driven models, directly addressing software limitations highlighted in early reviews. Nvidia is adding support for PyTorch, vLLM, SGLang, llama.cpp, and LlamaIndex, alongside popular open and open-weight models from Qwen, Meta, Stability AI, and Wan.
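For teams evaluating that framework support, local inference follows the stock workflows of the projects themselves. The sketch below uses vLLM's standard offline Python API; the Qwen model identifier is illustrative, and its availability on DGX Spark is an assumption here, not something Nvidia has itemised.

```python
# Minimal sketch: offline inference with vLLM's stock Python API.
# Assumptions: vLLM is installed on the DGX Spark, and the model id
# "Qwen/Qwen3-30B-A3B" is illustrative rather than a confirmed SKU.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen3-30B-A3B")  # weights load into unified memory
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(
    ["Summarise the benefits of on-prem inference in two sentences."],
    params,
)
print(outputs[0].outputs[0].text)
```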
No new hardware is required. The software-only approach reduces custom integration work for organisations relying on open tools, while enabling smoother upgrades as frameworks and models evolve.
DGX Spark is powered by the GB10 Grace Blackwell Superchip, which combines CPU and GPU cores with 128GB of unified memory, letting large language models run locally without cloud dependence. Nvidia claims performance gains of up to 2.5× over launch, driven by TensorRT-LLM updates, tighter quantisation, and decoding optimisations.
In one example, Qwen3-235B more than doubled throughput when moving from FP8 to NVFP4 precision with speculative decoding enabled. Smaller but measurable gains were also reported for Qwen3-30B and Stable Diffusion 3.5 Large.
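To illustrate why speculative decoding lifts throughput, the toy sketch below implements the basic propose-and-verify loop: a cheap draft model suggests several tokens, and the target model validates them, keeping the longest agreeing prefix. This is a conceptual illustration only, not Nvidia's TensorRT-LLM implementation, and the greedy verification rule is a simplification of the sampling-based acceptance used in production engines.

```python
# Toy sketch of greedy speculative decoding. Not Nvidia's implementation:
# real engines (e.g. TensorRT-LLM) use probabilistic acceptance, but the
# throughput argument is the same -- one target-model pass can validate
# several draft tokens at once instead of generating one token per pass.
from typing import Callable, List

def speculative_decode(
    draft_next: Callable[[List[int]], int],   # cheap model: next-token fn
    target_next: Callable[[List[int]], int],  # expensive model: next-token fn
    prompt: List[int],
    max_new_tokens: int = 32,
    k: int = 4,                               # draft tokens per round
) -> List[int]:
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_new_tokens:
        # 1. The draft model cheaply proposes k candidate tokens.
        proposal, ctx = [], list(tokens)
        for _ in range(k):
            t = draft_next(ctx)
            proposal.append(t)
            ctx.append(t)
        # 2. The target model verifies the proposal; a real engine scores
        #    all k positions in one batched forward pass (the speed-up).
        accepted, ctx = 0, list(tokens)
        for t in proposal:
            if target_next(ctx) != t:
                break
            accepted += 1
            ctx.append(t)
        tokens.extend(proposal[:accepted])
        # 3. On a mismatch (or zero accepts), take one token from the
        #    target model so decoding always advances.
        if accepted < k:
            tokens.append(target_next(tokens))
    return tokens[: len(prompt) + max_new_tokens]
```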
The update introduces DGX Spark playbooks, which bundle tools, models, and setup guides into reusable, on-prem workflows. DGX Spark can also act as an external AI accelerator for other machines; Nvidia demonstrated this with MacBook Pro systems, where an AI video pipeline's runtime dropped from eight minutes to roughly one minute.
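In the accelerator pattern, a laptop simply treats the Spark as a network endpoint. The sketch below assumes an OpenAI-compatible server (for example, one started with `vllm serve`) is already running on the Spark; the hostname, port, and model name are placeholders, and Nvidia's playbooks may wire this up differently.

```python
# Hypothetical client-side view of "DGX Spark as external accelerator".
# Assumes an OpenAI-compatible server (e.g. launched with `vllm serve`)
# is reachable on the LAN; host, port, and model id are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://spark.local:8000/v1",  # the DGX Spark on the LAN
    api_key="not-needed-for-local",         # local servers often ignore this
)

resp = client.chat.completions.create(
    model="Qwen/Qwen3-30B-A3B",             # whatever the Spark is serving
    messages=[{"role": "user", "content": "Draft a shot list for a 30s clip."}],
)
print(resp.choices[0].message.content)
```

Because the endpoint speaks a widely adopted API shape, existing laptop-side tooling can offload work to the Spark without code changes beyond the base URL.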
A local Nsight Copilot enables CUDA assistance without sending code or data to the cloud, reinforcing data sovereignty. Overall, DGX Spark evolves into a flexible, open, and local-first AI node supporting laptops, workstations, and edge deployments.


