LLM Memory Calculator
Calculate how much VRAM you need to run any LLM. Covers fp32 and fp16 precision, int8 and int4 quantization, and the KV cache.
Model weights
140.00 GB
KV cache
5.37 GB
Activations
28.00 GB
Recommended VRAM
209 GB
Total GPU memory needed: 140.00 GB + 5.37 GB + 28.00 GB = 173.37 GB. Adding 20% headroom gives 208.04 GB, rounded up to 209 GB recommended.
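The total above can be reproduced with a short sketch. The function name `recommended_vram_gb` is hypothetical, and rounding up to a whole gigabyte is an assumption inferred from the displayed 209 GB figure, not a documented rule of the calculator.

```python
import math


def recommended_vram_gb(weights_gb, kv_cache_gb, activations_gb, headroom=0.20):
    """Sum the three memory components and add fractional headroom.

    Rounding up to a whole GB is an assumption that matches the
    209 GB figure shown on the page.
    """
    total = weights_gb + kv_cache_gb + activations_gb
    recommended = math.ceil(total * (1 + headroom))
    return total, recommended


# Component values taken from the calculator output above.
total, recommended = recommended_vram_gb(140.00, 5.37, 28.00)
print(f"{total:.2f} GB needed, {recommended} GB recommended")
```

Changing `headroom` shows how sensitive the recommendation is to the safety margin.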
GPU compatibility
| GPU | VRAM | Fits? |
|---|---|---|
| RTX 3060 | 12 GB | No |
| RTX 3090 | 24 GB | No |
| RTX 4090 | 24 GB | No |
| A100 40GB | 40 GB | No |
| A100 80GB | 80 GB | No |
| H100 80GB | 80 GB | No |
| H200 141GB | 141 GB | No |
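A minimal sketch of the fit check behind this table, assuming each GPU is compared on its own against the 209 GB recommended figure. Whether the calculator compares against the recommended value or the 173.37 GB total is not stated; the result is the same here because the largest card listed has 141 GB. The names `GPUS_GB` and `fits` are hypothetical.

```python
# VRAM capacities in GB, copied from the table above.
GPUS_GB = {
    "RTX 3060": 12,
    "RTX 3090": 24,
    "RTX 4090": 24,
    "A100 40GB": 40,
    "A100 80GB": 80,
    "H100 80GB": 80,
    "H200 141GB": 141,
}


def fits(vram_gb, required_gb):
    """Return True when a single GPU has at least the required memory."""
    return vram_gb >= required_gb


required_gb = 209  # assumption: compare against the recommended VRAM
results = {name: fits(vram, required_gb) for name, vram in GPUS_GB.items()}

for name, ok in results.items():
    print(f"{name}: {'fits' if ok else 'does not fit'}")
```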
Memory estimates are approximations. The KV cache calculation uses the fp16 byte width (2 bytes per value) for every precision except fp32. Activation memory is estimated at 20% of model weight memory. All calculations run in your browser; no API calls are made.
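The rules in this note can be sketched as follows. The page does not show the calculator's inputs or its exact KV cache formula, so everything model-specific here is an assumption: a 70-billion-parameter model at fp16 (consistent with the 140.00 GB shown), decimal gigabytes, and a standard multi-head-attention KV cache with 80 layers, a hidden size of 8192, and a 2048-token context. That shape happens to round to 5.37 GB, but the tool's actual settings may differ.

```python
# Bytes per weight for each supported precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}
GB = 1e9  # assumption: decimal gigabytes


def weights_gb(num_params, precision):
    """Weight memory: parameter count times bytes per parameter."""
    return num_params * BYTES_PER_PARAM[precision] / GB


def kv_cache_gb(num_layers, hidden_size, context_len, precision, batch_size=1):
    """KV cache size for plain multi-head attention (assumed formula).

    Per the note above, fp32 uses 4 bytes per value and every other
    precision falls back to the fp16 width of 2 bytes.
    """
    kv_bytes = 4 if precision == "fp32" else 2
    # Factor of 2: one key tensor and one value tensor per layer.
    return 2 * num_layers * hidden_size * context_len * batch_size * kv_bytes / GB


def activations_gb(model_weights_gb):
    """Activation memory, estimated at 20% of weight memory."""
    return 0.20 * model_weights_gb


# Hypothetical model shape, chosen for illustration only.
w = weights_gb(70e9, "fp16")
kv = kv_cache_gb(num_layers=80, hidden_size=8192, context_len=2048, precision="fp16")
act = activations_gb(w)
print(f"weights {w:.2f} GB, KV cache {kv:.2f} GB, activations {act:.2f} GB")
```

Switching `precision` to `"int4"` cuts the weight estimate to a quarter of the fp16 value while leaving the KV cache unchanged, which is the behavior the note describes.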
How Much VRAM Do You Need to Run an LLM Locally?
Calculate GPU memory requirements for any LLM. Learn how model size, quantization (fp16, int8, int4), and KV cache affect VRAM needs.