VRAM-tier benchmarking
Models are grouped by what actually fits your card — not by parameter count. Full-precision small models compete head-to-head against quantised larger ones within each tier.
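As a rough illustration of grouping by fit rather than by parameter count, the sketch below estimates a model's weight footprint from its parameter count and bits per weight, then picks the smallest tier that holds it. The tier sizes, the 1 GiB overhead allowance, and the model names are assumptions made for this example, not the benchmark's actual values, and the footprint formula ignores KV cache and activation memory.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical tier ceilings in GiB; the real tier boundaries are not stated here.
TIERS_GIB = [8, 12, 16, 24]


@dataclass
class Model:
    name: str
    params_b: float         # parameter count, in billions
    bits_per_weight: float  # 16 for unquantised 16-bit weights, about 4 for a 4-bit quant


def weight_gib(m: Model) -> float:
    """Approximate weight footprint in GiB: params * bits / 8 bytes."""
    return m.params_b * 1e9 * m.bits_per_weight / 8 / 2**30


def tier_for(m: Model, overhead_gib: float = 1.0) -> Optional[int]:
    """Smallest tier that holds the weights plus an assumed fixed overhead."""
    need = weight_gib(m) + overhead_gib
    for cap in TIERS_GIB:
        if need <= cap:
            return cap
    return None  # does not fit any listed tier


small_full = Model("small-16bit", params_b=3, bits_per_weight=16)
large_quant = Model("large-4bit", params_b=7, bits_per_weight=4)
```

Under these assumptions, `tier_for(small_full)` and `tier_for(large_quant)` both return the 8 GiB tier, so a 3B model at 16 bits and a 7B model at 4 bits would be compared directly.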
Consumer hardware only
No A100s. No datacenter setups. Every result comes from the GPUs on the bench — secondhand cards, real PCIe slots, honest baselines.
Quality, speed, and efficiency
Three lenses on every card and model: composite quality score, generation speed, and Pareto efficiency (bang per bit, meaning quality relative to file size). Not just which model wins, but which one is worth its file size.
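One way to read the Pareto lens is as a frontier over file size and quality: a model stays on the frontier only if no other model is both no larger and better. The sketch below assumes exactly those two axes; the tuple layout, the function name `pareto_front`, and the sample numbers are hypothetical and are not measured results.

```python
def pareto_front(results):
    """Return entries not dominated on (smaller file, higher quality).

    `results` is a list of (name, file_gib, quality) tuples.
    """
    front = []
    best_quality = float("-inf")
    # Walk from smallest to largest file; on equal size, best quality first.
    for name, size, quality in sorted(results, key=lambda r: (r[1], -r[2])):
        # Keep an entry only if it beats every smaller-or-equal-size entry.
        if quality > best_quality:
            front.append((name, size, quality))
            best_quality = quality
    return front


# Made-up example values, for illustration only.
results = [
    ("a-q4", 4.0, 0.62),
    ("b-16bit", 6.0, 0.58),
    ("c-q5", 5.0, 0.66),
    ("d-q8", 7.5, 0.70),
]
front_names = [name for name, _, _ in pareto_front(results)]
```

With this sample data, `b-16bit` drops off the frontier because `c-q5` is both smaller and higher scoring, while the other three each buy extra quality with extra size.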
Floor representatives
Each tier is benchmarked on the worst card that fits it, not the best. Results are honest lower bounds: if it works here, it works on anything better.