I finally loaded a 120B model, #nemotron3 super, onto my #DGXSpark. With all the stars aligned and goats sacrificed, I think this is the NVFP4 flavour. I'm using it to review patches I made earlier for evaluation.
So far I'm blown away by how _fast_ it is: I'm seeing ~20-25 tokens per second.
It's too soon to say whether this will replace my go-to model (qwen3.6-35B-A3B), but I'm looking forward to using it for my day job tasks.
I run two models, one on #StrixHalo and one on the Spark. A/B :)





