I'd second this wholeheartedly. Since building a custom agent setup to replace co...

dpoloncsak · 2026-01-20T20:56:44 1768942604

Yeah, on one of my first projects, one of my buddies asked, "Why aren't you using [ChatGPT 4.0] nano? It's 99% of the effectiveness at 10% of the price."

I've been using the smaller models ever since. Nano/mini, flash, etc.

sixtyj · 2026-01-20T21:12:51 1768943571

Yup.

I recently found out that Grok-4.1-fast has similar pricing (in cents) but a 10x larger context window (2M tokens instead of the ~128-200k of gpt-4-1-nano). And ~4% hallucination, the lowest in blind tests in LLM arena.

verdverm · 2026-01-20T22:02:24 1768946544

You use stuff from xAi and Elmo?

I'm unwilling to look past Musk's politics, immorality, and manipulation on a global scale

rudhdb773b · 2026-01-20T22:18:09 1768947489

Grok is the best general purpose LLM in my experience. Only Gemini is comparable. It would be silly to ignore it, and xAI is less evil than Google these days.

naught0 · 2026-01-22T04:11:18 1769055078

When's the last time Sundar Pichai did a Hitler salute or had his creation calling itself "Mecha Hitler"?

rudhdb773b · 2026-01-22T15:25:04 1769095504

In the big picture, those events are insignificant compared to the negative impacts on society from Google's trillion dollar advertising business and the associated destruction of privacy.

naught0 · 2026-01-23T03:26:23 1769138783

fair points, but we'll have to see now that grok is in the pentagon. sky's the limit

phainopepla2 · 2026-01-20T21:03:50 1768943030

I have been benchmarking many of my use cases, and the GPT Nano models have fallen completely flat on every single one except for very short summaries. I would call them 25% effective at best.

verdverm · 2026-01-20T22:23:34 1768947814

Flash is not a small model, it's still over 1T parameters. It's a hyper MoE aiui

I have yet to go back to small models. I'm waiting on the upstream feature, and my GPU provider has been seeing capacity issues, so I am sticking with the Gemini family for now.

walthamstow · 2026-01-20T21:04:59 1768943099

Flash Lite 2.5 is an unbelievably good model for the price

r_lee · 2026-01-20T20:50:55 1768942255

Plus, I've found that with "thinking" models overall, the thinking is more like memory than an actual perf boost. It might even be worse, because if it goes even slightly wrong in the "thinking" part, it'll then commit to that for the actual response.

verdverm · 2026-01-20T22:03:51 1768946631

for sure, the difference in the most recent model generations makes them far more useful for many daily tasks. This is the first gen with thinking as a significant mid-training focus and it shows

gemini-3-flash stands well above gemini-2.5-pro

PunchyHamster · 2026-01-21T21:06:52 1769029612

The LLM bubble will burst the second investors figure out how much a well-managed local model can do.

verdverm · 2026-01-22T04:54:57 1769057697

Except that

1. There is still a night-and-day difference

2. Local is slow af

3. The vast majority of people will not run their own models

4. I would have to spend more than $200 a month on frontier AI to come close to what any decent at-home AI rig would cost. Why would I not use frontier models at this point?
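
To put rough numbers on point 4, here is a minimal break-even sketch. The $200 a month figure is from the comment above; the rig price and the monthly power cost are made-up placeholders, since the thread gives neither, and the function name `breakeven_months` is just for illustration.

```python
def breakeven_months(rig_cost: float,
                     monthly_subscription: float,
                     monthly_power: float = 0.0) -> float:
    """Months until a one-time rig purchase is cheaper than paying monthly.

    Returns infinity when running the rig costs as much per month as the
    subscription, because the purchase then never pays for itself.
    """
    monthly_saving = monthly_subscription - monthly_power
    if monthly_saving <= 0:
        return float("inf")
    return rig_cost / monthly_saving


# Hypothetical inputs: only the 200.0 comes from the thread.
months = breakeven_months(rig_cost=6000.0,
                          monthly_subscription=200.0,
                          monthly_power=50.0)
print(f"break-even after {months:.0f} months")
```

This only compares cash outlay. It says nothing about points 1 to 3 (quality gap, speed, willingness to self-host), and it ignores hardware depreciation and resale value.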