Question 1

How much VRAM do I need for local LLM inference?

Accepted Answer

As a rule of thumb, a 4-bit quantised model needs roughly half its parameter count (in billions) in GB of VRAM, plus headroom for context. A 7B model fits comfortably in 8GB and a 13B model in 16GB. A 70B model needs ~40GB, which makes it the sweet spot for a single RTX 5090 (32GB) with partial offload to system RAM, or for a single RTX 6000 Pro Blackwell (96GB) if you want unconstrained context. For 180B+ models we'd recommend dual RTX 6000 Pro.
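
The rule of thumb above can be sketched as a small calculation. This is a rough illustration only: the function name `estimate_vram_gb` and the flat 2GB context allowance are assumptions made for the example, not measured figures, and real usage varies with context length and runtime.

```python
def estimate_vram_gb(params_billion, bits_per_weight=4, context_headroom_gb=2.0):
    """Rough VRAM estimate in GB: quantised weights plus a flat context allowance.

    context_headroom_gb is a placeholder assumption; long contexts need more.
    """
    # At 4 bits per weight, one billion parameters take about 0.5 GB.
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + context_headroom_gb


for size in (7, 13, 70):
    print(f"{size}B at 4-bit: ~{estimate_vram_gb(size):.1f} GB")
```

Raising `context_headroom_gb` for long-context work is what pushes a 70B model from about 35GB of weights towards the ~40GB figure quoted above.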

Question 2

Is the RTX 5090 better than the RTX 4090 for AI workloads?

Accepted Answer

Yes. The RTX 5090 has 32GB GDDR7 (vs 24GB GDDR6X on the 4090), substantially higher memory bandwidth, and Blackwell tensor cores with FP4 support. Practical AI throughput is 1.7–2.2× a 4090 on most modern frameworks, and the extra 8GB of VRAM keeps much more of a 4-bit 70B-class model (~40GB) on the GPU, so far less has to be offloaded to system RAM than on a 4090.

Question 3

Should I get a consumer RTX 5090 or a professional RTX 6000 Pro?

Accepted Answer

Choose the RTX 5090 if you need the best price-per-token for LLM inference (models up to 70B) and for SDXL workflows. Choose the RTX 6000 Pro Blackwell when you need 96GB VRAM in a single card (fine-tuning, very long context, large image batches), ECC memory for unattended training, or to put two cards in one chassis. Most teams running production AI in-house go RTX 6000 Pro; most prosumers and developers go RTX 5090.

Question 4

Do you support NVLink for multi-GPU training?

Accepted Answer

Blackwell consumer cards (RTX 5080/5090) do not support NVLink. The RTX 6000 Pro Blackwell supports peer-to-peer and large-pool memory aggregation in supported chassis — we configure these in workstation cases with appropriate PSU headroom (typically 1600W Platinum) and air or hybrid cooling.

Question 5

What CPU and RAM should I pair with an AI workstation?

Accepted Answer

For single-GPU inference an Intel Core Ultra 9 285K or AMD Ryzen 9 9950X with 64GB DDR5 is plenty. For training and multi-GPU we move to AMD Threadripper PRO (24+ cores, 8-channel ECC RAM) and 128–256GB. Storage matters too — we spec at least 2TB Gen5 NVMe for hot datasets and a second 4TB+ NVMe for model weights.

Question 6

What about local AI assistants and RAG workflows?

Accepted Answer

An RTX 5090 with 32GB VRAM happily runs Ollama, LM Studio, llama.cpp, vLLM and AnythingLLM with 70B-class models at usable speeds (10–20 tok/s). Pair it with 64–128GB system RAM for vector databases (Qdrant, Chroma) and you have a fully local RAG stack with no API costs and no data leaving your premises — popular with legal, medical and engineering clients.
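
To make the retrieval step of a local RAG stack concrete, here is a minimal sketch in plain Python. It is a toy under stated assumptions: the word-count `embed` function stands in for a real embedding model, the in-memory list stands in for a vector database such as Qdrant or Chroma, and all function names are hypothetical rather than any library's API.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase word counts. A real stack would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Assemble the text that would be sent to the local LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


docs = [
    "The warranty covers parts and labour for five years.",
    "Burn-in testing runs for 24 hours before dispatch.",
]
print(build_prompt("how long is the warranty", docs))
```

In a real deployment the prompt would be passed to a locally served model, so both the documents and the question stay on the workstation.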

Question 7

Do you offer support for AI workloads after delivery?

Accepted Answer

Yes — every CREATE AI workstation includes our 5-year hardware warranty (parts and labour). On AI-tier builds we also include a one-on-one onboarding session covering driver setup, CUDA/cuDNN, your chosen framework (PyTorch / TensorFlow / Ollama / ComfyUI) and basic performance tuning. Bespoke ongoing support is available on request.

Question 8

How long does an AI workstation take to build?

Accepted Answer

Current turnaround is 7–10 working days from order. Studio-tier builds (dual RTX 6000 Pro, Threadripper PRO) typically run 10–14 working days because of additional 24-hour burn-in and validation testing.

GPU	VRAM (GB)	AI TOPS (FP4)	Max practical LLM (Q4)	Tier
RTX 5060 Ti Blackwell · 16GB GDDR7	16	759	7B	Entry
RTX 5070 Ti Blackwell · 16GB GDDR7	16	1,406	13B	Mid
RTX 5080 Blackwell · 16GB GDDR7	16	1,801	13B	Mid-High
RTX 5090 Blackwell · 32GB GDDR7	32	3,352	70B	High
RTX 6000 Pro Blackwell · 96GB GDDR7 (ECC)	96	4,000	180B / fine-tune 13B	Pro
2× RTX 6000 Pro Blackwell · 2× 96GB (192GB total)	192 (combined)	8,000	405B / fine-tune 70B	Studio
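
The comparison table can also be read as a lookup: given a model size, find the lowest tier whose listed Q4 limit covers it. The sketch below copies the VRAM and model-size figures from the table; the `GPUS` list and the `smallest_card_for` helper are illustrative names, not part of any tool.

```python
# (card, VRAM in GB, largest practical Q4 model in billions of parameters),
# copied from the comparison table and ordered from lowest to highest tier.
GPUS = [
    ("RTX 5060 Ti", 16, 7),
    ("RTX 5070 Ti", 16, 13),
    ("RTX 5080", 16, 13),
    ("RTX 5090", 32, 70),
    ("RTX 6000 Pro Blackwell", 96, 180),
    ("2x RTX 6000 Pro Blackwell", 192, 405),
]


def smallest_card_for(params_billion):
    """Return the lowest-tier card whose listed Q4 limit covers the model, or None."""
    for name, _vram_gb, max_q4_billion in GPUS:
        if params_billion <= max_q4_billion:
            return name
    return None


for size in (7, 13, 70, 180, 405):
    print(f"{size}B (Q4): {smallest_card_for(size)}")
```

This only reflects the table's "max practical" column; it does not account for context length or fine-tuning, which need additional memory.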

Hand-built AI workstations for LLM, Stable Diffusion & ML.

AI GPU comparison · which card fits which model

Why CREATE PCs for AI

Specified by builders who run the workload

Up to dual RTX 6000 Pro Blackwell

5-year warranty (parts + labour)

24-hour burn-in before dispatch

AI-focused onboarding

UK hand-built, fully cable-managed