2026-03-25 18:56:17

Why run models locally?

Typically for two main reasons - privacy and cost
Let’s explore the cost side with an example and see how the math works out
Let’s say you want to run an Autoresearch loop overnight like @karpathy
If you have access to an H100, you can run 100 experiments overnight using Opus-4.6 and the API cost will likely be in the $10-25 range
But most of us are not lucky enough to have access to an H100
We can still run 100 Autoresearch experiments on a MacBook for the same $10-25, but it won’t be apples to apples
The H100 will complete 50-100x more training steps over the same timeframe
So if you want to reproduce the same number of training steps, you could end up paying $1000+ in API costs, and of course it will take much longer than overnight
This wouldn’t be very smart since you can rent an H100 for far less and get the same job done faster
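
A rough sketch of that math in Python, using only the figures above ($10-25 per overnight run, 50-100x more training steps on the H100). It assumes API cost scales linearly with the number of training steps you need to match, which is a simplification, and the function name is made up for illustration:

```python
def scaled_api_cost(base_cost_usd, step_ratio):
    """Estimated API cost to match the H100's training steps.

    Assumption: API cost grows linearly with the number of training steps.
    """
    return base_cost_usd * step_ratio


# Figures from the post: $10-25 per overnight run, H100 does 50-100x more steps
low = scaled_api_cost(10, 50)
high = scaled_api_cost(25, 100)

print(f"${low:,} to ${high:,}")  # prints "$500 to $2,500"
```

Under that assumption the range works out to roughly $500 to $2,500, which is where the $1000+ figure sits.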
But it starts to paint a picture of why you’d want to run models locally - it lets you run experiments that would otherwise be cost-prohibitive for most people
It starts to level the playing field
I’m running Qwen3.5 9B on an older PC and it now makes sense to experiment with things I wouldn’t try if I were on the hook for the API costs
That’s a big unlock, and it will only open up further over time as models get better and smaller
