How It Works
Everything you need to know about GPU compatibility for AI image and video generation
What is Can My GPU Run?
Can My GPU Run is a free tool that instantly tells you which AI models your graphics card can run. Running AI image and video generation locally requires loading large neural network models into your GPU's VRAM. Different models need different amounts of memory, and not every GPU has enough.
Instead of guessing or reading through scattered documentation, select your GPU and immediately see compatibility ratings for over 120 models across image generation, video, upscaling, 3D, and audio.
How to Use
Select Your GPU
Choose your graphics card from the dropdown. We support 65+ GPUs from NVIDIA GeForce (RTX 5090 to GTX 1060), AMD Radeon (RX 9070 XT to RX 6600), and Apple Silicon (M1 to M4 Ultra).
Browse AI Models
Browse 120+ models organized by category. Use filters to narrow by type (Image, Video, 3D, Audio), search by name, or sort by rating, VRAM usage, or popularity.
Check Compatibility
Each model shows an instant S-to-F rating. Click any model for detailed format options, exact VRAM requirements, and setup recommendations for your GPU.
Understanding Ratings
Each model receives a compatibility rating based on how well it fits your GPU's VRAM capacity, compute generation, and available optimized formats.
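The exact scoring algorithm isn't spelled out here, but the idea can be sketched as a simple heuristic. The tier thresholds below are illustrative assumptions for this sketch only, not the tool's actual values, and the real rating also weighs compute generation and available formats:

```python
def rate_model(gpu_vram_gb: float, model_vram_gb: float) -> str:
    """Illustrative S-to-F tier from VRAM headroom alone.
    Thresholds are made up for this sketch; the real rating
    also considers compute generation and optimized formats."""
    if model_vram_gb <= 0:
        raise ValueError("model size must be positive")
    headroom = gpu_vram_gb / model_vram_gb
    if headroom >= 2.0:
        return "S"   # plenty of room for batches / high resolutions
    if headroom >= 1.5:
        return "A"
    if headroom >= 1.1:
        return "B"   # fits with a little to spare
    if headroom >= 0.9:
        return "C"   # tight; quantization recommended
    if headroom >= 0.5:
        return "D"   # needs heavy quantization or offload
    return "F"       # realistically won't run

print(rate_model(24, 12))  # → S (e.g. a 24 GB card on a 12 GB model)
```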
Where Does the Data Come From?
Every rating is based on real-world data, not guesswork. VRAM requirements are sourced from official model repositories on HuggingFace and GitHub, verified against community benchmarks and real hardware tests. GPU specifications come from manufacturer datasheets.
Ratings are updated regularly as new models are released, new quantization formats become available, and community benchmarks provide better data. If you spot an inaccuracy, let us know.
VRAM & Formats Explained
AI models can be loaded in different precision formats that trade memory usage against quality. Here's what each format means:
FP16 Full Precision
16-bit floating point. The standard format with maximum quality. Requires the most VRAM. Use this when your GPU has enough memory.
FP8 Half Memory
8-bit floating point. Cuts VRAM usage roughly in half with minimal visible quality loss. Supported on NVIDIA RTX 40-series and newer GPUs.
GGUF Flexible Compression
Quantized format with multiple levels (Q8, Q5, Q4, Q3). Lower levels use less VRAM. Q5 is the best balance of quality and memory for most users.
CPU Offload Last Resort
Moves part of the model to system RAM. Allows running models that exceed your VRAM, but generation is significantly slower because system RAM has far lower bandwidth than VRAM.
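The memory trade-off between these formats is mostly arithmetic on bits per weight. A rough sketch, using a hypothetical 6-billion-parameter model and an assumed 1.2x overhead factor for activations (real usage varies by pipeline and resolution):

```python
def estimate_vram_gb(params_billions: float, bits_per_weight: float,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate: weight storage only, times a fudge
    factor for activations and runtime overhead. Ballpark only."""
    bytes_weights = params_billions * 1e9 * bits_per_weight / 8
    return bytes_weights * overhead / 1e9

# A hypothetical 6B-parameter image model at each format.
# GGUF bit counts are approximate effective values.
for name, bits in [("FP16", 16), ("FP8", 8),
                   ("GGUF Q5", 5.5), ("GGUF Q4", 4.5)]:
    print(f"{name:8s} ~{estimate_vram_gb(6, bits):.1f} GB")
```

Halving the bits roughly halves the memory, which is why FP8 and GGUF open up cards that FP16 rules out.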
Why This Matters
Running AI generation locally on your own hardware gives you full control over your creative process. No cloud subscriptions, no usage limits, no sending your prompts to external servers. But the hardware requirements can be confusing — model sizes range from 1GB to 24GB+ of VRAM, and the same model can run differently depending on the format you choose.
Can My GPU Run cuts through this complexity. Instead of reading GitHub READMEs and community forums to piece together whether a model works on your GPU, you get an instant answer with format recommendations tailored to your hardware.
GPU Comparison
Can't decide between two GPUs? The comparison tool lets you pick any two graphics cards and see how they stack up across all 120+ AI models. You'll see:
Side-by-Side Overview
Compatibility counts, average scores, and a clear overall winner, so you can take in the whole picture at a glance.
Rating Distribution
Stacked bar charts showing how many models each GPU gets at S, A, B, C, D, and F tiers. More green = better GPU for AI workloads.
Model-by-Model Breakdown
Drill into every category and model to see exactly where one GPU outperforms the other, with visual comparison bars for each model.
To compare GPUs, select any GPU on the main page and click the Compare button, or choose two GPUs directly from the comparison page.
Frequently Asked Questions
What is VRAM and why does it matter for AI?
VRAM (Video Random Access Memory) is the dedicated memory on your graphics card. AI models must be loaded into VRAM to generate images or video. If a model requires more VRAM than your GPU has, it either won't run, will need quantization to reduce memory usage, or must partially offload to system RAM (which is much slower).
What's the difference between FP16 and FP8?
FP16 (16-bit floating point) is the standard full-precision format. FP8 (8-bit floating point) cuts memory usage roughly in half with minimal quality loss. FP8 is supported on NVIDIA RTX 40-series and newer GPUs. If your GPU supports FP8, it's often the best way to run larger models without visible quality degradation.
What is GGUF format?
GGUF is a quantization format developed by the llama.cpp community. It compresses AI models to various quality levels — Q8 (highest quality, ~50% of FP16 size), Q5 (good balance), Q4 (smaller), Q3 (smallest, lowest quality). GGUF allows models like Flux that normally need 12GB at FP16 to run on GPUs with just 4-6GB VRAM.
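The back-of-envelope arithmetic behind those numbers, assuming file size scales linearly with bits per weight (real GGUF files only approximate this, since "K-quant" levels also store scale metadata; the effective bit counts below are assumptions):

```python
fp16_gb = 12.0  # Flux at FP16, per the answer above
# Approximate effective bits per weight for each GGUF level.
levels = {"Q8": 8.5, "Q5": 5.5, "Q4": 4.5, "Q3": 3.4}
sizes = {q: fp16_gb * bits / 16 for q, bits in levels.items()}
for q, gb in sizes.items():
    print(f"{q}: ~{gb:.1f} GB")
```

This puts Q8 at roughly half the FP16 size and Q4-Q5 in the 3-4 GB range, consistent with the 4-6 GB figure once runtime overhead is added.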
What is CPU offloading?
CPU offloading splits a model between your GPU's VRAM and your system's RAM. This lets you run models that exceed your VRAM capacity, but generation becomes significantly slower because system RAM bandwidth (typically 50-80 GB/s) is much lower than VRAM bandwidth (typically 200-1000 GB/s). It's a last-resort option when no quantized format fits your GPU.
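The slowdown follows from those bandwidth numbers. A very rough per-step model, assuming each generation step must stream every weight once and ignoring compute time, PCIe limits, and caching (the 500 and 64 GB/s defaults are illustrative picks from the ranges above):

```python
def step_time_s(model_gb: float, vram_gb: float,
                vram_bw: float = 500.0, ram_bw: float = 64.0) -> float:
    """Ballpark per-step time: weights resident in VRAM stream at
    vram_bw, the offloaded remainder at ram_bw (both GB/s)."""
    in_vram = min(model_gb, vram_gb)
    offloaded = max(0.0, model_gb - vram_gb)
    return in_vram / vram_bw + offloaded / ram_bw

full = step_time_s(12, 24)   # model fits entirely in VRAM
split = step_time_s(12, 8)   # 4 GB offloaded to system RAM
print(f"{split / full:.0f}x slower")  # → 3x slower
```

Even offloading a third of the model triples the step time in this toy model, which is why offloading is a last resort rather than a general fix.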
How accurate are the ratings?
Ratings are based on real VRAM usage data from HuggingFace, GitHub, and community benchmarks. The algorithm considers your GPU's VRAM, compute generation, available formats, and bandwidth. While actual performance varies with resolution, batch size, and system config, the ratings provide a reliable compatibility baseline. We update the database regularly as new models and benchmarks appear.
Do I need a specific GPU brand?
No. Can My GPU Run supports NVIDIA GeForce (RTX 50-series down to GTX 1060), AMD Radeon (RX 9070 XT down to RX 6600), and Apple Silicon (M1 through M4 Ultra). NVIDIA GPUs have the broadest software support (CUDA), but AMD (ROCm) and Apple (MPS) work with many popular models. The tool accounts for platform-specific compatibility in its ratings.