Benchmarking · Consumer GPUs · Open Data

LocoBench

I have X GB of VRAM -- what's the best model I can run? LocoBench benchmarks LLM inference across every consumer GPU VRAM tier, from 4 GB to 24 GB. The floor, not the ceiling.

See the Results · GitHub
VRAM Tiers

Every tier. Floor cards. Honest baselines.

Each VRAM tier is benchmarked on the worst-in-class GPU for that tier. If it runs here, it runs on your card.

4 GB · GTX 1050 Ti · 112 GB/s · Pascal
6 GB · GTX 1060 · 192 GB/s · Pascal
8 GB · RTX 2060 Super · 448 GB/s · Turing
12 GB · RTX 3060 · 360 GB/s · Ampere
24 GB · RTX 3090 · 936 GB/s · Reference ceiling

Nobody is answering this honestly.

Within my VRAM budget, which combination of model size and precision gives the best results?

A BF16 SmolLM2-1.7B and a Q4_K_M Qwen3-4B both fit in 4 GB. Which one actually wins? A full-precision 3B model and a quantized 7B model both fit in 8 GB. Which should you run?

Most benchmarks compare models under ideal conditions. LocoBench compares everything that fits within your actual hardware constraint -- full-precision small models against quantized larger models, head to head.
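The tradeoff above comes down to bytes per weight. A rough sketch of the arithmetic (illustrative only, not LocoBench's methodology; the bits-per-weight figures for the quantized formats are approximations, and real usage adds KV cache, activations, and runtime overhead on top of the weights):

```python
# Back-of-envelope VRAM estimate for model weights only.
# Bytes-per-parameter values are approximate assumptions, not exact format specs.
BYTES_PER_PARAM = {
    "bf16": 2.0,     # 16-bit brain float
    "q8_0": 1.06,    # ~8.5 bits/weight (approximate)
    "q4_k_m": 0.57,  # ~4.5 bits/weight (approximate)
}

def weight_gb(params_billion: float, fmt: str) -> float:
    """Approximate weight footprint in GiB for a parameter count and format."""
    return params_billion * 1e9 * BYTES_PER_PARAM[fmt] / 1024**3

# The 4 GB example from the text: both candidates fit, counting weights alone.
print(f"SmolLM2-1.7B @ BF16:   {weight_gb(1.7, 'bf16'):.1f} GB")
print(f"Qwen3-4B     @ Q4_K_M: {weight_gb(4.0, 'q4_k_m'):.1f} GB")
```

Both land under 4 GB on paper, which is exactly why the question needs a benchmark rather than arithmetic: the math says what fits, not which one answers better.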

Part of LocoLab -- frontier AI on a budget.