I don't think people are shocked enough by this project: hard-coded inference models, hardware-printed models, call it what you like.

You should try the #Taalas online chatbot yourself to get an idea. I asked it to create a typical Python Flask project structure and scripts, and it did so in 0.002 seconds. It's currently running a Llama 3.1 8B model with no reasoning.

https://www.nextplatform.com/compute/2026/02/19/taalas-etches-ai-models-onto-transistors-to-rocket-boost-inference/4092140

https://chatjimmy.ai/

When this architecture hits a reasoning model, something that will happen this year, I'm sure the news will be all about it. Connect it to #claudecode, #opencode or similar and you have a recipe for a transformation of the same magnitude as the personal computer.

Taalas Etches AI Models Onto Transistors To Rocket Boost Inference

Adding big blocks of SRAM to collections of AI tensor engines, or better still, a waferscale collection of such engines, turbocharges AI inference, as has

nextplatform

Taalas just emerged from stealth with a claim that’s shaking the hardware world: 17,000 tokens per second on Llama 3.1 8B.

How? By physically etching the AI model directly into the silicon transistors. No HBM. No liquid cooling. Just raw, hardwired performance that is 10x faster and 20x cheaper than traditional GPU inference.

https://www.buysellram.com/blog/17000-tokens-second-is-taalas-hardwired-silicon-the-ultimate-solution-to-the-ai-memory-wall-and-hbm-shortage/

#AI #ArtificialIntelligence #AIHardware #DataCenter #MemoryWall #HBMShortage #InferenceFactory #HardcoreAI #ASIC #Taalas #NVIDIA #technology

17,000 Tokens/Second: Is Taalas’ Hardwired Silicon the Ultimate Solution to the AI Memory Wall and HBM Shortage?

Can Taalas’ 17,000 tokens/sec HC1 chip solve the AI Memory Wall? Discover why hardwired silicon is disrupting the HBM market and how it impacts GPU resale value.

BuySellRam

The Breakthrough: Taalas has unveiled the HC1 chip, achieving a massive 17,000 tokens/second on Llama 3.1 8B. It is roughly 10x faster and 20x cheaper than traditional GPU inference.

The “Hardwired” Secret: Unlike GPUs that load software, Taalas etches the AI model directly into the silicon transistors. By physically embedding the weights, they eliminate the need for High-Bandwidth Memory (HBM).

Solving the Memory Wall: By removing the “data movement” between external memory and the processor, Taalas bypasses the industry’s biggest bottleneck—the Memory Wall—and operates entirely on standard air cooling.

The Trade-off: The chip is model-specific. While it offers “insane” efficiency for stable, high-volume production (like 24/7 chatbots), it lacks the programmability and flexibility of a GPU.

Market Impact: The rise of these specialized “Inference Factories” actually increases the long-term value of your GPUs. Because GPUs are versatile and can be repurposed for any new model, they remain the “Gold Standard” for resale and training.

Demo LLM: chat jimmy
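
A rough back-of-the-envelope sketch of the "Memory Wall" point above: if every weight has to be streamed from external memory once per generated token, single-stream decode speed is capped at memory bandwidth divided by model size. The 16-bit weight size and the 3.35 TB/s bandwidth figure are illustrative assumptions, not numbers from the posts; the estimate ignores KV-cache traffic and batching, which lets GPUs serve many streams at a higher aggregate rate.

```python
def bandwidth_bound_tokens_per_second(params, bytes_per_param, bandwidth_bytes_per_s):
    """Ceiling on single-stream decode rate when all weights are read once per token."""
    model_bytes = params * bytes_per_param
    return bandwidth_bytes_per_s / model_bytes


def generation_time_seconds(n_tokens, tokens_per_second):
    """Time to emit n_tokens at a steady decode rate."""
    return n_tokens / tokens_per_second


PARAMS = 8e9                 # Llama 3.1 8B, as named in the posts
BYTES_PER_PARAM = 2          # assumption: 16-bit weights
ASSUMED_BANDWIDTH = 3.35e12  # assumption: ~3.35 TB/s of external memory bandwidth
CLAIMED_RATE = 17_000        # tokens/s, the figure claimed in the posts

ceiling = bandwidth_bound_tokens_per_second(PARAMS, BYTES_PER_PARAM, ASSUMED_BANDWIDTH)
reply_ms = generation_time_seconds(500, CLAIMED_RATE) * 1000

print(f"bandwidth-bound single-stream ceiling: {ceiling:.0f} tokens/s")
print(f"500-token reply at the claimed rate: {reply_ms:.1f} ms")
```

Under these assumptions the ceiling comes out at roughly 200 tokens/s per stream, which is why removing the weight traffic altogether is the interesting part of the claim.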

How Taalas "prints" LLM onto a chip? - Anurag's Blog

Applied for #TAALAS API access for a little thing I've been working on using Cerebras. Hope they accept 🤞

#llama #AI

✨ Taalas HC1 reaches 17,000 tokens per second
Weights in silicon instead of RAM: Taalas's strategy with HC1 to eliminate HBM, latency, and liquid cooling from AI inference.

https://gomoot.com/taalas-hc1-raggiunge-17-000-token-al-secondo/

#ai #hc1 #news #taalas #tech

It fits the pattern that, with the country having its most conservative government (#hallitus) in ages, #JariJolkkonen is preaching at the church service for the opening of the parliamentary session.

Attendance at the non-confessional event keeps growing. This time the speakers were #CMI CEO #JanneTaalas and #EvaBiaudet. The theme of the event is peace (#rauha) and democracy (#demokratia).

https://yle.fi/a/74-20141492

#eduskunta #uskonnonvapaus #uskonnottomuus #vakaumus #politiikka #yhteiskunta #uskonto #Jolkkonen #Biaudet #Taalas

MPs chose between the church and the non-confessional celebration: a picture shows the difference

The non-confessional opening celebration has grown in popularity, estimates Tero Suoniemi (Greens), one of the event's organizers.

Yle Uutiset

18 years of construction (instead of 6), 11 billion euros in construction costs (instead of 3), use of Russian (#russischer) fuel rods, risk of a worst-case nuclear accident (#SuperGAU), similar to Chernobyl (#Tschernobyl) and #Fukushima.

That is the truth about the new nuclear power plant (#Atomkraftwerk) in Finland (#Finnland), #Olkiluoto 3.

The head of the World Meteorological Organization (#Weltwetterorganisation, #WMO), Petteri #Taalas of Finland, is calling on Germany to reconsider its nuclear phase-out.

You see the problem yourself, right?

https://www.zeit.de/politik/ausland/2023-11/klimawandel-atomausstieg-weltklimakonferenz-weltwetterorganisation-petteri-taalas

World Meteorological Organization: WMO chief recommends that Germany reconsider its nuclear phase-out

The head of the World Meteorological Organization advises Germany to consider reversing its nuclear phase-out. Without nuclear power, he says, it will be difficult to manage the energy transition.

ZEIT ONLINE