How "Hardwired" AI Will Destroy Nvidia's Empire and Change the World (medium.com/mokrasar)
12 points by amelius 3 hours ago | hide | past | favorite | 8 comments



The foundation models themselves will be cheap to deploy, but we’ll still need general-purpose inference hardware to work alongside them, converting latent intermediate layers into useful, application-specific outputs. This may level off the demand for GPU/TPU hardware, though, by letting the biggest and most expensive layers move to silicon.
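A minimal sketch of the split this comment describes, under stated assumptions: a frozen backbone (standing in for layers etched into silicon) emits latent vectors, and a small application-specific head runs on general-purpose hardware. All function names, weights, and shapes here are illustrative, not Taalas's actual design.

```python
# Hypothetical split: "hardwired" backbone + software head.
# The weights below are made-up constants for illustration.

def matvec(w, x):
    """Dense layer: w is a list of rows, x a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def relu(x):
    return [max(0.0, v) for v in x]

# On a hardwired chip these weights would be literal wiring, not data
# fetched from memory; frozen module-level constants stand in for that.
BACKBONE_W = [[0.5, -0.2, 0.1],
              [0.3,  0.8, -0.5]]

def hardwired_backbone(x):
    """Fixed-function stage: cheap, immutable, produces latents."""
    return relu(matvec(BACKBONE_W, x))

def app_specific_head(latent, head_w):
    """Swappable stage: the part general-purpose hardware keeps doing."""
    return matvec(head_w, latent)

latent = hardwired_backbone([1.0, 2.0, 3.0])
out = app_specific_head(latent, [[1.0, -1.0]])
```

The point of the sketch: only `app_specific_head` changes per application, so only it needs programmable hardware; the expensive backbone can be frozen.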

This is still far from viable for actually useful models, like bigger MoE ones with much larger context windows. I mean, the technology is very promising, just like Cerebras, but we need to see whether they can keep this up as models evolve over the next few years. Extremely interesting nevertheless.

Is this a paid ad placement? I'm seeing a load of breathless "commentary" on Taalas and next to no serious discussion about whether their approach is even remotely scalable. A one-off tech demo using a comparatively ancient open source model is hardly going to be giving Jensen Huang sleepless nights.

I always thought that once we had the models figured out, getting the meat of them into an FPGA was probably the logical next step. They seem to have skipped that and gone straight to baking the program into an ASIC (ROM). Pretty wild.
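The closest software analogy to the FPGA-vs-ASIC distinction above: loading weights from storage at runtime (reconfigurable, like an FPGA bitstream) versus hard-coding them as constants in the program text (fixed forever, like mask ROM). A hedged sketch; the values and names are made up for illustration.

```python
# Two deployment styles for the same tiny model.
import io
import json

# Style 1: general-purpose/FPGA-like -- weights are data, fetched at
# runtime, so the same hardware can host a different model tomorrow.
def load_weights(fp):
    return json.load(fp)  # stand-in for streaming weights from memory

# Style 2: "hardwired"/ASIC-ROM-like -- weights are part of the program
# itself (the silicon). Immutable; changing them means a new chip.
ROM_WEIGHTS = (0.25, -0.75, 1.5)

def infer_rom(x):
    # No weight traffic at all: each multiply uses a baked-in constant.
    return sum(w * xi for w, xi in zip(ROM_WEIGHTS, x))

runtime_w = load_weights(io.StringIO("[0.25, -0.75, 1.5]"))
```

The trade-off the thread is circling: style 2 removes the memory-bandwidth bottleneck that dominates inference cost, at the price of zero flexibility once the model is frozen.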

It's crazy. In a few years we will be able to buy Qwen on a chip, doing 10K tokens per second.

Yeah, well, it might just come on your new laptop.

Or your phone.

Give me a 120B dense model on one of these and yeah my API use will probably drop.


