I finally loaded a 120B model, #nemotron3 super, onto my #DGXSpark. With all the stars aligned and goats sacrificed, I think this is the NVFP4 flavour. I'm using it to review patches I made earlier for evaluation.

So far I'm blown away by how _fast_ it is: I'm seeing ~20-25 tokens per second.

It's too soon to say whether this will replace my go-to model (qwen3.6-35B-A3B), but I'm looking forward to using it for my day-job tasks.

I run two models, one on #StrixHalo and one more on the Spark. A/B :)
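
For a rough sense of what a 120B model costs in memory at different precisions, here is a back-of-the-envelope sketch. It only counts weight storage and ignores the KV cache, activations, and runtime overhead. The ~4.5 bits per weight for NVFP4 is an assumption (4-bit values plus one shared 8-bit scale per 16-value block), and it assumes every weight is quantized the same way, which a real checkpoint may not do. The helper name `weight_footprint_gib` is made up for illustration.

```python
def weight_footprint_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB (weights only, no KV cache or overhead)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Assumption: 4-bit values plus one 8-bit scale shared by each 16-value block.
NVFP4_BITS = 4 + 8 / 16

if __name__ == "__main__":
    formats = [
        ("BF16", 16.0),
        ("FP8", 8.0),
        ("NVFP4 (assumed ~4.5 bits)", NVFP4_BITS),
    ]
    for name, bits in formats:
        print(f"{name}: {weight_footprint_gib(120, bits):.1f} GiB")
```

Treat the numbers as a lower bound: whatever the runtime reports after loading is the figure that matters.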

RT @ArtificialAnlys: As part of Jensen Huang's Computex keynote, NVIDIA announced the release of Nemotron 3 Ultra: with 550 billion parameters (55 billion active), it is the largest Nemotron 3 model to date and the most intelligent US open-weights model.

More at Arint.info

#AI #Computex #LLM #Nemotron3 #NVIDIA #OpenWeights #arint_info

https://x.com/ArtificialAnlys/status/2061304911565144230#m

Arint - SEO+KI (@[email protected])

NVIDIA already controls the hardware most AI models run on. Now they want a say in which models run on that hardware too.

Nemotron 3 Nano Omni is their latest move in that direction. It’s an omnimodal model that can handle text, images, video, and audio natively in one architecture.

The 30B total parameter count with 3B active makes it approachable for serious deployment without needing heavy hardware. https://firethering.com/nemotron-3-nano-omni-nvidia-omnimodal-model/

#ai #llms #technews #nemotron3 #nvidia #news #trending #genai

NVIDIA Built Nemotron 3 Nano Omni to Handle Everything. Here’s the Catch - Firethering

The architecture underneath it is genuinely unusual. And the benchmark numbers on document intelligence and video understanding are strong enough to take seriously. But there is a catch. Actually there are a few.

Firethering

Nemotron 3 Super pushes the frontier with 40 M supervised & alignment samples, leveraging a Mamba‑Transformer backbone and Mixture‑of‑Experts scaling. The model shows stronger agent reasoning, RL‑based fine‑tuning, and tighter AI alignment. Dive into the details to see how this LLM reshapes open‑source AI. #Nemotron3 #MixtureOfExperts #AIAlignment #SupervisedFineTuning

🔗 https://aidailypost.com/news/nemotron-3-super-incorporates-40-million-supervised-alignment-samples

🚀 New episode #229 dives into Google’s Gemini 3 Flash rollout, the latest ChatGPT app store debut, and Nemotron 3’s surprise. We also unpack OpenAI’s GPT‑5.2‑Codex roadmap, ElevenLabs voice upgrades, and what Meta’s next move could mean for open‑source AI. Tune in for the full breakdown! #Gemini3Flash #ChatGPT #Nemotron3 #GPT52Codex

🔗 https://aidailypost.com/news/lwiai-podcast-229-google-defaults-gemini-3-flash-chatgpt-app-launch

A user tested the Spark with the Nemotron3 Nano 30B model and reached an impressive batch throughput of ~1300 tokens/second with 200 concurrent requests. This performance is very promising compared with the previous generation and the B200. What do you think about comparing it with a 4x 3090 setup?

#AI #HieuNang #XuLyBatch #DGX #Spark #Nemotron3 #GPU #Performance #BatchProcessing

https://www.reddit.com/r/LocalLLaMA/comments/1ptp8lq/dgx_spark_and_batch_processing/
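
The batch figure above is an aggregate, so it helps to see what it implies per request. The sketch below divides the reported ~1300 tokens/second evenly across the 200 concurrent requests. The even split is an assumption, since a real scheduler will not serve every request at the same rate, and the function name `per_request_rate` is hypothetical.

```python
def per_request_rate(aggregate_tps: float, concurrent_requests: int) -> float:
    """Average tokens/second each request sees, assuming an even split."""
    if concurrent_requests <= 0:
        raise ValueError("concurrent_requests must be positive")
    return aggregate_tps / concurrent_requests


if __name__ == "__main__":
    # Figures quoted in the post: ~1300 tokens/s across 200 concurrent requests.
    rate = per_request_rate(1300, 200)
    print(f"{rate:.1f} tokens/s per request on average")
```

This is why batch numbers and single-stream numbers should not be compared directly: high aggregate throughput can coexist with a slow individual stream.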

Nvidia's new Nemotron 3 blends the Mamba hybrid sequence model with 31.6 B parameters, keeping only 3 B active per step. The design outperforms gpt‑oss‑20B and rivals Qwen3‑30B, while integrating LatentMoE and the Artificial Analysis Index. Open‑source fans, see how this hybrid architecture pushes efficiency and scaling. #Nvidia #Nemotron3 #MambaHybrid #LatentMoE

🔗 https://aidailypost.com/news/nvidias-nemotron-3-uses-mamba-hybrid-316b-params-3b-active-per-step

Nemotron 3 - Nvidia lands in open source and spits out tokens like never before

https://fed.brid.gy/r/https://korben.info/nvidia-nemotron-3-nano-modele-ia-open-source-2.html