LLM Memory Calculator

Calculate how much VRAM you need to run any LLM. Covers fp32, fp16, int8, int4 quantization and KV cache.

Model weights: 140.00 GB
KV cache: 5.37 GB
Activations: 28.00 GB
Recommended VRAM: 209 GB
Total GPU memory needed: 173.37 GB + 20% headroom = 209 GB recommended
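
As a rough check, the recommendation can be reproduced by summing the three components and adding the 20% headroom. This is a minimal sketch; rounding up to the next whole GB is an assumption inferred from the figures shown (208.04 becoming 209), not something the page states.

```python
import math

# Component estimates reported by the calculator, in GB
weights_gb = 140.00
kv_cache_gb = 5.37
activations_gb = 28.00

HEADROOM = 0.20  # 20% safety margin, as stated in the summary line

total_gb = weights_gb + kv_cache_gb + activations_gb

# Assumption: the result is rounded up to the next whole GB
recommended_gb = math.ceil(total_gb * (1 + HEADROOM))

print(f"Total: {total_gb:.2f} GB, recommended: {recommended_gb} GB")
```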

GPU compatibility

GPU           VRAM      Fits?
RTX 3060      12 GB     ❌
RTX 3090      24 GB     ❌
RTX 4090      24 GB     ❌
A100 40GB     40 GB     ❌
A100 80GB     80 GB     ❌
H100 80GB     80 GB     ❌
H200 141GB    141 GB    ❌
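
The compatibility check can be sketched as a comparison of each card's VRAM against the requirement. The GPU list is copied from the table; the `fits` helper and the choice to compare against the 209 GB recommendation (rather than the 173.37 GB total) are assumptions for illustration. Either threshold gives the same result here, because the largest card listed has 141 GB.

```python
# VRAM per single GPU in GB, taken from the compatibility table
GPUS_GB = {
    "RTX 3060": 12,
    "RTX 3090": 24,
    "RTX 4090": 24,
    "A100 40GB": 40,
    "A100 80GB": 80,
    "H100 80GB": 80,
    "H200 141GB": 141,
}

REQUIRED_GB = 209  # recommended VRAM from the result above


def fits(vram_gb: float, required_gb: float = REQUIRED_GB) -> bool:
    """Hypothetical helper: True if one GPU alone can hold the model."""
    return vram_gb >= required_gb


for name, vram in GPUS_GB.items():
    verdict = "fits" if fits(vram) else "does not fit"
    print(f"{name:<12} {vram:>4} GB  {verdict}")
```

No single card in the list passes, so a model of this size would need multiple GPUs or a lower-precision quantization.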

Memory estimates are approximations. The KV cache is sized at fp16 width (2 bytes per value) for every precision except fp32. Activation memory is estimated at 20% of model weight memory. All calculations run in your browser, with no API calls.
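
The estimation rules in the note above can be sketched as follows. The function name, its parameters, and the KV cache formula (2 tensors per layer, full multi-head attention, decimal GB) are assumptions, not the calculator's actual source. The example inputs are hypothetical: the page does not say which model produced the figures, and these values were chosen only because they reproduce them.

```python
# Bytes per weight for each precision the calculator lists
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}
GB = 1e9  # decimal gigabytes (assumption; matches 70e9 params * 2 bytes = 140 GB)


def estimate_memory_gb(params, precision, n_layers, hidden_size,
                       context_len, batch_size=1):
    """Rough VRAM estimate in GB, split into weights, KV cache, activations."""
    weights = params * BYTES_PER_PARAM[precision] / GB

    # Per the note: KV cache uses fp16 width unless the model runs in fp32
    kv_bytes = 4 if precision == "fp32" else 2
    # Assumed formula: one K and one V tensor per layer, per token.
    # Models with grouped-query attention would need less than this.
    kv_cache = (2 * n_layers * hidden_size * context_len
                * batch_size * kv_bytes) / GB

    # Per the note: activations are taken as 20% of weight memory
    activations = 0.20 * weights

    return {"weights": weights, "kv_cache": kv_cache, "activations": activations}


# Hypothetical inputs: 70B parameters, fp16, 80 layers, hidden size 8192,
# 2048-token context
est = estimate_memory_gb(70e9, "fp16", n_layers=80, hidden_size=8192,
                         context_len=2048)
for part, gb in est.items():
    print(f"{part}: {gb:.2f} GB")
```

Switching `precision` to `"int4"` in this sketch shrinks the weights and activations to a quarter of the fp16 figures, while the KV cache stays the same because it is still sized at fp16 width.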

guide

How Much VRAM Do You Need to Run an LLM Locally?

Calculate GPU memory requirements for any LLM. Learn how model size, quantization (fp16, int8, int4), and KV cache affect VRAM needs.
