RTX 5090
Pros
- Runs Nous Hermes 2 Mixtral 8x7B at Q4 natively
- 32 GB VRAM: about 6 GB of headroom over the 26 GB needed at Q4
5 consumer GPUs can run Nous Hermes 2 Mixtral 8x7B at Q4 natively. Precise VRAM thresholds and benchmarks below.
Prices and availability may change · affiliate link
llama.cpp 0.2.x · CUDA 12 · ROCm 6 · updated monthly
This model requires a Flagship GPU (48 GB+ VRAM)
Best picks by compatibility, VRAM headroom, and value — prices and availability may change.
Some links are Amazon affiliate links. We may earn a commission at no additional cost to you. The Amazon cookie may last up to 24 hours after the click.
| Quantization | VRAM needed | Disk space | Quality |
|---|---|---|---|
| FP16 (max quality) | 94 GB | 94 GB | Maximum |
| Q8 (high quality) | 47 GB | 47 GB | Near-lossless |
| Q4 (recommended, best balance) | 26 GB | 26 GB | Recommended |
| Q2 (minimum) | 13 GB | 13 GB | Quality loss |
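
The figures in the table follow roughly from parameter count times bits per weight. Here is a minimal sketch, assuming a weights-only estimate (no KV cache or runtime overhead) and nominal bit widths for each quantization level. `weights_gb` is an illustrative helper, not part of any library.

```python
PARAMS_BILLIONS = 47  # parameter count listed for this model

# Nominal bits per weight for each level (an assumption: real quantized
# formats also store scale data, so they use somewhat more than this).
NOMINAL_BITS = {"FP16": 16, "Q8": 8, "Q4": 4, "Q2": 2}


def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Weights-only size in GB, taking 1 GB as 1e9 bytes."""
    return params_billions * bits_per_weight / 8


for name, bits in NOMINAL_BITS.items():
    print(f"{name}: {weights_gb(PARAMS_BILLIONS, bits):.2f} GB")
```

Under these assumptions FP16 and Q8 come out at 94 GB and 47 GB, matching the table. The Q4 and Q2 estimates (23.5 GB and 11.75 GB) fall below the listed 26 GB and 13 GB. The gap is presumably quantization metadata and runtime overhead, which this sketch does not model.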

| Property | Value |
|---|---|
| Developer | Nous Research |
| Parameters | 47B |
| Context window | 32,768 tokens |
| License | Apache-2.0 |
| Use cases | chat, reasoning, roleplay |
| Released | 2024-01 |
Install with Ollama
ollama run nous-hermes2-mixtral:8x7b
Hugging Face
NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO
Nous Hermes 2 Mixtral 8x7B requires 26 GB VRAM at Q4. 5 consumer GPUs meet this threshold. GPUs below it, including 24 GB cards, must offload part of the model to system RAM, which adds significant latency.
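
Once the model has been pulled with the `ollama run` command above, it can also be queried from a script through Ollama's local HTTP API. The sketch below assumes a server on the default port 11434 and uses only the standard library. `build_request` and `generate` are illustrative helper names, not part of Ollama.

```python
import json
import urllib.request

# Default local Ollama endpoint (assumption: server runs on this machine
# with the default port).
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_TAG = "nous-hermes2-mixtral:8x7b"  # tag from the install command


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Hello")` only works while the Ollama server is running and the model has been downloaded. On a GPU below the Q4 threshold, expect the reply to take noticeably longer.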
5 GPUs run Q4 natively · 16 require offload
| GPU | VRAM | Compatibility | Est. speed |
|---|---|---|---|
| RTX 5090 | 32 GB | Optimal | n/a |
| M4 Ultra | 128 GB | Optimal | 34 tok/s |
| M3 Ultra | 192 GB | Optimal | 27 tok/s |
| M4 Max 48GB | 48 GB | Optimal | 16 tok/s |
| M4 Max 36GB | 36 GB | Optimal | n/a |
| RTX 4090 | 24 GB | Offload | n/a |
| RTX 5080 | 16 GB | Offload | n/a |
| RTX 4080 Super | 16 GB | Offload | n/a |
| RTX 5070 Ti | 16 GB | Offload | n/a |
| RTX 3090 | 24 GB | Offload | n/a |
| RX 7900 XTX | 24 GB | Offload | n/a |
| RTX 4070 Ti Super | 16 GB | Offload | n/a |
| RX 7900 XT | 20 GB | Offload | n/a |
| M4 Pro | 24 GB | Offload | n/a |
| RX 7800 XT | 16 GB | Offload | n/a |
| RX 6800 XT | 16 GB | Offload | n/a |
| RTX 4060 Ti 16GB | 16 GB | Offload | n/a |
| M3 Pro | 18 GB | Offload | n/a |
| M2 Pro | 16 GB | Offload | n/a |
| Arc A770 16GB | 16 GB | Offload | n/a |
| M1 Pro | 16 GB | Offload | n/a |
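
The Optimal/Offload split in the table is consistent with a simple threshold at the 26 GB Q4 requirement. The sketch below reproduces that labeling. The rule the site actually applies is not documented here, so treat the comparison as an assumption. It also treats Apple unified memory as if all of it were available as VRAM.

```python
REQUIRED_Q4_GB = 26  # Q4 requirement stated for this model


def compatibility(vram_gb: float, required_gb: float = REQUIRED_Q4_GB) -> str:
    """Return 'Optimal' if the model fits entirely in VRAM, else 'Offload'."""
    return "Optimal" if vram_gb >= required_gb else "Offload"


# A few rows from the table (VRAM in GB).
gpus = {"RTX 5090": 32, "M4 Max 36GB": 36, "RTX 4090": 24, "RX 7900 XT": 20}
for name, vram in gpus.items():
    print(f"{name}: {compatibility(vram)}")
```

Passing a different `required_gb` (for example 47 for Q8) shows which cards would still fit at a higher-quality quantization.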
Best picks by compatibility, VRAM headroom, and value — prices and availability may change.
- RTX 5090: 32 GB VRAM
- M4 Ultra: 128 GB VRAM
- M3 Ultra: 192 GB VRAM
Nous Hermes 2 Mixtral 8x7B can run on CPU without a dedicated GPU, which is unusual for a 47B model and possible because its mixture-of-experts design activates only a subset of the parameters for each token. On an i7-13700K with llama.cpp at Q4 it reaches 2 tok/s (slow but usable). A GPU is typically 4–6× faster; check the VRAM calculator for specifics.
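
To put those rates in perspective, the following sketch converts them into waiting time for a single reply. The 300-token reply length is a hypothetical value chosen for illustration. The CPU rate and the 4–6× range come from the paragraph above.

```python
CPU_TOK_PER_S = 2.0         # i7-13700K, llama.cpp Q4 (figure from the text)
GPU_SPEEDUP_RANGE = (4, 6)  # lower and upper bound of the quoted speedup
REPLY_TOKENS = 300          # hypothetical reply length, for illustration only


def seconds_for(tokens: int, tok_per_s: float) -> float:
    """Time in seconds to generate `tokens` tokens at a steady rate."""
    return tokens / tok_per_s


print(f"CPU: {seconds_for(REPLY_TOKENS, CPU_TOK_PER_S):.1f} s")
for factor in GPU_SPEEDUP_RANGE:
    rate = CPU_TOK_PER_S * factor
    print(f"GPU at {factor}x ({rate:.0f} tok/s): "
          f"{seconds_for(REPLY_TOKENS, rate):.1f} s")
```

With these numbers a 300-token reply takes 150 s on the CPU and between 25 s and 37.5 s at the quoted GPU speedups.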