UK's No 1 Custom PC Builder
Rated Excellent by our Customers
3 Year Warranty

Ginger6 G6 Cobalt Max — RTX 5090 AI Workstation with CPU Offloading


Andy
★★★★★

"Kevin was brilliant at answering my questions and recommending the best machine for my budget. The computer was for my son and is now the envy of all his friends."

Johnny
★★★★★

"Nearly three years later with no issues I called for advice. They suggested I didn't need to upgrade and offered free ways to improve performance. Just great support."

Carol
★★★★★

"Bought our second PC from Ginger6. We had a few issues setting up and they called and assisted us — professional and patient throughout. Would highly recommend."

Rated Excellent
1,100+ verified reviews
★★★★★


A maximum-specification AI workstation with RTX 5090 32GB VRAM and 128GB system RAM for DeepSpeed ZeRO and PyTorch FSDP — training models beyond single-GPU VRAM capacity.


AMD Ryzen 9 9950X (16 Cores)
128GB DDR5 RAM
Nvidia RTX 5090 32GB VRAM
2TB NVMe + 4TB NVMe
Windows 11 Pro
3-Year Warranty

£7,211.69

£6,769.99

In stock

SKU: g6-cobalt-max


Get Expert Help
Talk to Your Builder
01902 714533 Email

Description

D2X — High VRAM Workstation

G6 Cobalt Max — CPU Offloading Workstation for DeepSpeed ZeRO, PyTorch FSDP, and 30B+ Models

The G6 Cobalt Max is built for AI and ML researchers who use DeepSpeed ZeRO-Offload or PyTorch FSDP CPU offloading to train models whose training state exceeds 32GB of VRAM on a single GPU. Its 128GB of DDR5 system RAM gives optimiser states, gradients, and model parameters somewhere to live when they are offloaded from the RTX 5090 during training, which extends the trainable model size well beyond what 32GB of VRAM supports on its own. The Ryzen 9 9950X's 16 cores and DDR5 memory bandwidth keep the offloading process from bottlenecking GPU utilisation. Priced at £6,769.99 as configured, built and stress-tested in Wolverhampton. The Cobalt Max features in our AI and ML workstations range.

The Cobalt Max and the G6 Cobalt use the same RTX 5090 32GB GPU. If your model fits within 32GB VRAM and you do not use ZeRO or FSDP CPU offloading, the Cobalt with 64GB system RAM is the correct machine; the Cobalt Max adds cost without benefit for that workflow.

Not sure whether you need the Cobalt Max or the Cobalt? Call Kevin on 01902 714533 — describe your training framework and model size and he will give you a straight answer.

24hr Stress Tested • 93% Five-Star Reviews • 3 Year Warranty • Since 2001 Building Custom PCs

  • 3-Year Warranty: parts, return postage, and lifetime support
  • 24-Hour Stress Test: every workstation tested under sustained professional load
  • Free UK Delivery: free mainland delivery, fully tracked
  • Lifetime Support: free UK phone support for the life of your machine
SPEC OVERVIEW

G6 Cobalt Max — Full Specification

RTX 5090 32GB, 128GB DDR5, and Ryzen 9 9950X — the maximum VRAM and maximum system RAM configuration for CPU offloading research workloads.

Processor
AMD Ryzen 9 9950X: 16 cores, 32 threads, AM5. CPU offloading puts real work on the host. DeepSpeed ZeRO-Offload and PyTorch FSDP move optimiser states and parameters to system RAM and back to the GPU as each step needs them, and with ZeRO-Offload the optimiser update itself runs on the CPU. That makes the workload both compute-intensive and bandwidth-intensive on the host side. The Ryzen 9 9950X's 16 cores and DDR5 memory bandwidth keep the offload process from bottlenecking GPU utilisation during training.
Memory
128GB DDR5: the capacity required for CPU offloading in DeepSpeed ZeRO and PyTorch FSDP at scale. ZeRO-Offload moves optimiser states and gradients to system RAM, which for a 30B parameter model in fp16 with an Adam optimiser comes to approximately 120GB on top of the model's VRAM footprint. 128GB provides this capacity. Stable at rated speed on X870; confirm with Kevin before deployment.
Graphics
Nvidia RTX 5090 32GB VRAM — 32GB holds the active model weights and activations on the GPU during the forward pass. With CPU offloading, the optimiser states and gradients are managed in system RAM between steps, extending the effective model size range significantly beyond what 32GB VRAM alone supports. Stock subject to availability — confirm with Kevin before ordering.
Primary Storage
2TB NVMe — OS, Python environment, PyTorch, DeepSpeed, and active training datasets. Fast NVMe read speeds reduce data loading time between epochs, which matters more in offloading configurations where each step involves data movement between GPU and system RAM. Keeping the training environment on a dedicated drive prevents I/O contention during long runs.
Model and Archive Drive
4TB NVMe — completed model weights, training checkpoints, fine-tuned model archives, and large dataset storage beyond the active training set. 4TB provides space for an active research archive including multiple model checkpoints and the full training dataset history without requiring external storage during ongoing work.
Motherboard
Gigabyte X870 Eagle WIFI7 (ATX, AM5): the X870 chipset provides PCIe 5.0 bandwidth and supports 128GB DDR5 at rated speed. Lower-tier chipsets are less consistent at this configuration. Wi-Fi 7 provides fast wireless connectivity. Confirm 128GB DDR5 stability with Kevin before deployment.
Case and Cooling
APNX C1 with 360mm Liquid Cooler and 1000W PSU — the APNX C1 provides the physical clearance and airflow capacity for the RTX 5090 in a case with strong visual character for a research lab or studio environment. The 360mm AIO keeps the Ryzen 9 9950X at rated frequency during the CPU-intensive offloading process. The 1000W Corsair RM1000e delivers stable power to the RTX 5090 at 575W TDP under sustained training load, running at roughly 80 percent of rated capacity, the efficient band for an 80+ Gold unit, with ATX 3.1 transient headroom for the card's power spikes. PSU model subject to confirmation with Kevin.
Windows
Windows 11 Pro — pre-installed and activated. WSL2 is supported for PyTorch, DeepSpeed, and Python development environments on Windows 11 Pro. Pro includes BitLocker encryption for research data and model weights, Remote Desktop for accessing the machine during overnight training runs, and domain join for institutional network environments. Drivers confirmed before dispatch.
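The "roughly 80 percent of rated capacity" figure quoted under Case and Cooling can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the 575W GPU figure and the 1000W PSU rating come from this listing, while the CPU figure (AMD's published 170W TDP) and the allowance for the rest of the system are assumptions, not measurements from this build.

```python
# Rough PSU loading check. Only the 575W GPU figure and the 1000W PSU rating
# come from the listing; the other two numbers are assumptions.

def psu_load_fraction(component_watts: dict[str, float], psu_watts: float) -> float:
    """Return total estimated draw as a fraction of the PSU's rated output."""
    if psu_watts <= 0:
        raise ValueError("psu_watts must be positive")
    return sum(component_watts.values()) / psu_watts


estimated_draw = {
    "rtx_5090": 575.0,       # TDP quoted in the listing
    "ryzen_9_9950x": 170.0,  # assumption: AMD's published TDP
    "rest_of_system": 55.0,  # assumption: drives, fans, pump, board, RAM
}

load = psu_load_fraction(estimated_draw, psu_watts=1000.0)
print(f"Estimated sustained load: {load:.0%} of rated capacity")
```

Swap in measured wall-power figures if you have them; short transient spikes from the GPU are not modelled here.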
SPECIFICATION RATIONALE

Why This Specification for CPU Offloading and Large Model Training

Every component in the Cobalt Max is chosen for the specific demands of DeepSpeed ZeRO and PyTorch FSDP offloading workloads, where system RAM capacity and CPU bandwidth determine whether the offloading process extends or limits training.

128GB system RAM: enables CPU offloading for models beyond 32GB VRAM

DeepSpeed ZeRO and PyTorch FSDP move optimiser states and gradients to system RAM, allowing models larger than 32GB VRAM to train on a single GPU. Without 128GB system RAM, the offload process itself becomes memory-limited at scale. This machine extends the effective training range significantly beyond what the standard G6 Cobalt with 64GB supports.

RTX 5090 + CPU offload: a practical path to large model training

Multi-GPU training with consumer cards does not pool VRAM — each GPU holds its own model shard. CPU offloading with a single RTX 5090 and 128GB system RAM achieves training of models beyond 32GB VRAM on a single machine, at lower complexity and cost than a multi-GPU setup. For researchers who need to extend beyond 32GB VRAM without the infrastructure overhead of multi-GPU training, this is the practical path.

Ryzen 9 9950X: offloading demands CPU memory bandwidth

CPU offloading is compute-intensive on the host side. DeepSpeed ZeRO and PyTorch FSDP move data between GPU and system RAM continuously during training — this data movement is bandwidth-intensive. The Ryzen 9 9950X's 16 cores and DDR5 memory bandwidth keep the offload process from bottlenecking GPU utilisation. An underpowered CPU extends the time the GPU spends waiting for the host between steps.

APNX C1: RTX 5090 capable with distinctive character

The APNX C1 provides RTX 5090-compatible airflow in a case with strong visual presence. For a researcher's workstation in a lab or studio, thermal performance is the reason for the case choice: the APNX C1 handles the RTX 5090's 575W TDP under sustained training load. The appearance is incidental.

SOFTWARE COMPATIBILITY

What the G6 Cobalt Max Handles

Indicative software performance at the G6 Cobalt Max specification. Workload descriptions are based on the Ryzen 9 9950X, 128GB DDR5, RTX 5090 32GB, and 2TB plus 4TB NVMe.

PyTorch FSDP
CPU offload training
Models larger than 32GB VRAM trained via FSDP CPU offloading — model parameters, gradients, and optimiser states offloaded to 128GB system RAM between compute steps, extending effective training range beyond VRAM alone.
DeepSpeed ZeRO
Distributed training with offload
ZeRO-Offload moves optimiser states and gradients to system RAM, reducing GPU memory footprint and enabling training of larger models on a single RTX 5090. 128GB system RAM provides capacity for the offloaded states at 30B+ parameter scale.
Hugging Face Transformers
Large model training
30B+ parameter models trained in fp16 mixed precision using the Transformers training API with DeepSpeed or FSDP integration. 128GB system RAM holds the offloaded states; 32GB VRAM holds the active compute graph during the forward and backward passes.
TensorFlow
Large dataset training pipelines
Large model training pipelines with intensive CPU-side preprocessing. 128GB system RAM handles large dataset preprocessing and model state management. 16-core Ryzen 9 maintains GPU saturation during data-intensive pipeline stages.

Performance descriptors are indicative. Actual performance depends on project complexity, settings, and system configuration. Kevin can advise on the right spec for your specific workflow.

THE COBALT MAX IN CONTEXT

CPU Offloading Extends the Effective Model Size. 128GB System RAM Is What Makes It Work.

DeepSpeed ZeRO-Offload and PyTorch FSDP are techniques for training models whose total parameter, gradient, and optimiser state footprint exceeds GPU VRAM capacity. The mechanism is straightforward: during training, the states that are not needed for the current computation step — the optimiser states between parameter updates, the gradients between backward and update steps — are moved from GPU VRAM to system RAM. The GPU is freed to hold only the parameters needed for the current forward and backward pass. System RAM holds the offloaded states until they are needed again. This allows a single RTX 5090 with 32GB VRAM to participate in training workloads whose full memory footprint significantly exceeds 32GB, as long as system RAM is large enough to hold the offloaded states.
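In DeepSpeed, the offloading described above is usually switched on through the training configuration rather than through changes to model code. The following is a minimal sketch of such a configuration, assuming ZeRO stage 2 with optimiser states offloaded to system RAM. The batch size and accumulation values are placeholders chosen for illustration, not recommendations for this machine.

```python
import json


def build_zero_offload_config(micro_batch_size: int = 1,
                              grad_accum_steps: int = 16) -> dict:
    """Build a DeepSpeed config: ZeRO stage 2 with optimiser states in CPU RAM."""
    return {
        # Placeholder values: tune to your model and sequence length
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "gradient_accumulation_steps": grad_accum_steps,
        "fp16": {"enabled": True},
        "zero_optimization": {
            "stage": 2,
            # Keep optimiser states in pinned system RAM instead of VRAM
            "offload_optimizer": {"device": "cpu", "pin_memory": True},
            "overlap_comm": True,
            "contiguous_gradients": True,
        },
    }


config = build_zero_offload_config()
config_json = json.dumps(config, indent=2)
print(config_json)
```

The resulting dictionary (or the same content saved as a JSON file) can be passed to `deepspeed.initialize(..., config=config)` or to the `deepspeed` argument of Hugging Face `TrainingArguments`.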

The reason 128GB system RAM is required — and not 64GB — is the scale of what is being offloaded at 30B+ parameter level. The optimiser states for a 30B parameter model in fp16 with an Adam optimiser occupy approximately 120GB in system RAM under ZeRO-2 offloading. 64GB is insufficient to hold these states alongside the operating system, the Python environment, and the data pipeline. 128GB provides the capacity for the full offloaded state at this model scale. The Ryzen 9 9950X's 16 cores and DDR5 memory bandwidth matter here too — the data movement between GPU and system RAM during training is CPU-bandwidth-bound, and an underpowered CPU on a slower memory platform introduces delays between steps that reduce effective GPU utilisation during the training loop. Kevin's conversation before the order confirms your model size, your training framework, and whether ZeRO-Offload or FSDP is in your pipeline. The 3-year warranty and post-delivery support apply from day one.
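The sizing argument above is bytes-per-parameter arithmetic, and it is worth running for your own configuration. The sketch below is a rough estimator under stated assumptions: 4 bytes of optimiser state per parameter reproduces the approximately 120GB figure quoted above (for example, two Adam moments held in 16-bit), while a setup that keeps fp32 master weights plus two fp32 Adam moments uses 12 bytes per parameter. Which case applies depends on your optimiser and precision settings, so treat the numbers as a starting point for the conversation with Kevin rather than a guarantee.

```python
GB = 1e9  # decimal gigabytes, to match the round figures quoted above


def offloaded_state_gb(params_billion: float, bytes_per_param: float) -> float:
    """System RAM needed for the offloaded state, in GB."""
    return params_billion * 1e9 * bytes_per_param / GB


def headroom_gb(installed_gb: float, state_gb: float) -> float:
    """RAM left over for the OS, Python environment, and data pipeline."""
    return installed_gb - state_gb


# Assumption: 4 bytes/param of optimiser state (matches the ~120GB figure)
state_16bit = offloaded_state_gb(30, bytes_per_param=4)
# Assumption: fp32 master weights + two fp32 Adam moments = 12 bytes/param
state_fp32 = offloaded_state_gb(30, bytes_per_param=12)

for label, state in [("4 B/param", state_16bit), ("12 B/param", state_fp32)]:
    for installed in (64, 128):
        print(f"{label}: state {state:.0f} GB, "
              f"headroom on {installed} GB = {headroom_gb(installed, state):.0f} GB")
```

A negative headroom means the offloaded state does not fit in that amount of system RAM at that model size and precision.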

NOT SURE IF THE COBALT MAX IS THE RIGHT SPEC?

Tell Kevin:

  1. The software you use most and the version
  2. Your typical file sizes or project scales
  3. Whether you need to run multiple applications simultaneously — and which ones
  4. Your approximate budget and whether this is for one machine or a team

No charge for the conversation. No pressure to buy.

WHAT OUR CUSTOMERS SAY

93% Five-Star Reviews on Trustpilot

4.9
★★★★★
Rated Excellent • 1,100+ Reviews • 93% Five-Star
Read reviews on Trustpilot

93% of Ginger6 customers leave five-star reviews. A research workstation at this level needs support that remains available after delivery. Kevin builds the Cobalt Max, stress-tests it at sustained load, and is reachable when training pipelines evolve or questions arise.

See all reviews
★★★★★

"I upgraded my PC to one that was Windows 11 compatible. I have been using it for about 3 months with no problems. The service from Ginger 6 has been great."

robert maidment, Verified Google Review
★★★★★

"Placed order, and received it earlier than expected. Windows and drivers already installed so computer was good to go right out of the box. Runs perfectly, have no complaints, only good things to say! Recommended!!"

Anonymous, Verified Reviews.io Review
★★★★★

"I have been using Ginger 6 since 2014 for gaming PCs for wife and myself. That's 5 purchased in total. Never had a technical issue with any of the builds and the only reason for new purchases is technical obsolescence. Highly recommend them."

Dangerous Toast, Verified Google Review
HOW YOUR WORKSTATION IS BUILT

Built by Hand in Wolverhampton

Every G6 Cobalt Max is assembled, configured, and tested by Kevin's team. The 128GB DDR5 memory configuration is verified at rated speed before the 24-hour test begins — an additional step that is specific to this machine.

01
Spec confirmed for your offloading framework and model scale

Before the build begins, the configuration is reviewed against your training framework, model architecture, and offloading setup. If you have spoken to Kevin, the spec is confirmed against your use of DeepSpeed ZeRO or PyTorch FSDP, your model size, and your sequence lengths. The Ryzen 9 9950X and X870 Eagle WIFI7 are verified for AM5 compatibility at 128GB DDR5. RTX 5090 stock is confirmed before the build begins. The dual NVMe layout is confirmed for your dataset and checkpoint requirements. Components are staged before assembly.

02
Hand-assembled with 128GB DDR5 memory profile verification

The Cobalt Max is assembled by hand in Wolverhampton. Inside the APNX C1, cables are routed to maintain clear airflow paths to the RTX 5090 and the 360mm radiator, reduce dust build-up around both NVMe drives, and keep future maintenance accessible. BIOS settings and DDR5 memory profiles are confirmed at 128GB before the 24-hour stress test begins — this is an additional verification step specific to the 128GB configuration, ensuring the RAM runs at rated speed and the memory subsystem is stable under the bandwidth demands of the offloading process. The 1000W PSU is confirmed for stable power delivery before the test starts.

03
24-hour test at sustained RTX 5090 and CPU offloading load

Every Cobalt Max runs under sustained GPU and CPU load for a full day before it ships. The test replicates the combined demand of an overnight training run with CPU offloading — the RTX 5090 held at sustained high utilisation at its 575W operating envelope, with the Ryzen 9 9950X managing the continuous data movement between GPU and system RAM. The 128GB DDR5 subsystem stability is confirmed under this sustained bandwidth demand. Windows 11 Pro, drivers, and both NVMe drives are confirmed before packaging.

24-HOUR STRESS TEST COVERS
  • Thermal behaviour under sustained RTX 5090 training load
  • Processor and graphics stability during extended use
  • Memory responsiveness and stability at 128GB under offloading bandwidth
  • Storage performance and consistency across both NVMe drives
  • BIOS and firmware stability
  • System stability under extended use
FREQUENTLY ASKED QUESTIONS

G6 Cobalt Max — Common Questions

What is DeepSpeed ZeRO-Offload, and when do I need it?

DeepSpeed ZeRO-Offload is a technique for training models whose total memory footprint exceeds GPU VRAM capacity. During training, the states that are not needed for the current computation step (the optimiser states between parameter updates, the gradients between the backward pass and the parameter update) are moved from GPU VRAM to system RAM. The GPU is then freed to hold only the parameters and activations needed for the current step. This allows training of models whose full state footprint significantly exceeds GPU VRAM, as long as system RAM is large enough to hold the offloaded states and the CPU is fast enough to manage the data movement without becoming the bottleneck. You need ZeRO-Offload when your model's combined parameter, gradient, and optimiser state footprint exceeds your GPU VRAM, typically when training 30B+ parameter models in fp16 on a single GPU. If your model fits within 32GB VRAM, you do not need ZeRO-Offload, and the G6 Cobalt with 64GB is the correct machine.

How much system RAM do I need for CPU offloading?

The answer depends on your model size and on which sharding and offloading configuration you use. As a reference point, the optimiser states for a 30B parameter model in fp16 with an Adam optimiser occupy approximately 120GB in system RAM under DeepSpeed ZeRO stage 2 offloading. PyTorch FSDP full sharding corresponds to ZeRO stage 3 and, with CPU offloading enabled, also moves parameters to system RAM, which adds to that figure. The operating system, Python environment, data loader, and other processes consume additional RAM. 128GB is the practical minimum to hold the offloaded states for a 30B parameter model alongside normal system processes. For models significantly above 30B, or for complex offloading configurations where more states are offloaded simultaneously, RAM requirements increase further. Call Kevin and describe your model size, parameter count, and offloading configuration before ordering, and he will confirm whether 128GB covers your specific setup.
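For the PyTorch FSDP route, CPU offloading is requested when the model is wrapped. The sketch below shows one way this might look, assuming PyTorch with CUDA is installed and a process group has already been initialised (for example by launching with `torchrun` on a single GPU). The helper name `wrap_with_cpu_offload` is hypothetical and exists only for illustration.

```python
def wrap_with_cpu_offload(model):
    """Wrap a torch.nn.Module in FSDP with full sharding and CPU offload.

    Assumes torch.distributed is already initialised and a CUDA device
    is available. Imports are local so this sketch can be read and
    imported on a machine without PyTorch installed.
    """
    import torch
    from torch.distributed.fsdp import (
        CPUOffload,
        FullyShardedDataParallel as FSDP,
        ShardingStrategy,
    )

    return FSDP(
        model,
        # Shard parameters, gradients, and optimiser states
        sharding_strategy=ShardingStrategy.FULL_SHARD,
        # Keep parameters (and their gradients) in system RAM between uses
        cpu_offload=CPUOffload(offload_params=True),
        device_id=torch.cuda.current_device(),
    )
```

With `offload_params=True` the optimiser step runs on the CPU against the offloaded parameters, which is where the host's core count and memory bandwidth come into play.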

What is the difference between the G6 Cobalt and the G6 Cobalt Max?

Both machines use the same RTX 5090 32GB GPU and Ryzen 9 9950X processor. The main difference is system RAM: the G6 Cobalt has 64GB DDR5, the Cobalt Max has 128GB DDR5. 128GB is required when your training pipeline uses DeepSpeed ZeRO-Offload or PyTorch FSDP CPU offloading to train models whose full state footprint exceeds 32GB VRAM, because 64GB system RAM is insufficient for the offloading process at 30B+ scale. If your model fits within 32GB VRAM and you do not use CPU offloading, the Cobalt with 64GB system RAM is the correct machine and the Cobalt Max adds cost without benefit for your workflow. The case also differs: the Cobalt uses a Corsair 5000D, the Cobalt Max uses an APNX C1.

Does CPU offloading slow down training?

Yes, CPU offloading adds latency compared to keeping the full training state in VRAM, because data movement between GPU and system RAM takes time that would otherwise be used for computation. The throughput reduction depends on the offloading configuration (ZeRO stage 2 with only optimiser state offloading has less overhead than full parameter offloading) and on the CPU and memory bandwidth available on the host side. The Ryzen 9 9950X's DDR5 memory bandwidth minimises the host-side latency in this data movement, but some throughput reduction compared to a pure-VRAM training configuration is unavoidable. The trade-off is that CPU offloading makes it possible to train models that would otherwise require multiple high-end GPUs or cloud instances with HBM-based VRAM. For researchers whose model genuinely exceeds 32GB VRAM, the throughput trade-off of offloading on a single RTX 5090 is typically preferable to the cost and complexity of the alternatives.
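To get a feel for the size of that overhead, a back-of-envelope transfer estimate can help. The sketch below assumes the GPU link runs at PCIe 5.0 x16 with roughly 63 GB/s of theoretical one-direction bandwidth, that fp16 gradients travel to the host and updated fp16 parameters travel back once per optimiser step, and that nothing overlaps with compute. Real runs overlap transfers, accumulate gradients over several micro-batches, and rarely reach theoretical bandwidth, so this is a rough illustrative bound rather than a prediction.

```python
PCIE5_X16_GB_PER_S = 63.0  # assumption: approximate theoretical peak, one direction


def transfer_seconds(num_bytes: float,
                     bandwidth_gb_per_s: float = PCIE5_X16_GB_PER_S) -> float:
    """Time to move num_bytes across the link at the given bandwidth."""
    return num_bytes / (bandwidth_gb_per_s * 1e9)


def per_step_transfer_seconds(params_billion: float,
                              bytes_per_value: int = 2) -> float:
    """Gradients GPU to host, plus updated parameters host to GPU, per optimiser step."""
    one_way_bytes = params_billion * 1e9 * bytes_per_value
    return transfer_seconds(one_way_bytes) + transfer_seconds(one_way_bytes)


step_overhead = per_step_transfer_seconds(30)  # 30B parameters, fp16 values
print(f"Estimated transfer time per optimiser step: {step_overhead:.2f} s")
```

Comparing this figure with your measured compute time per optimiser step gives a first indication of whether offloading will dominate the step or disappear into it.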

How long does the build take, and how is it delivered?

Build time is 3 to 5 working days from order confirmation, including the 24-hour stress test and the 128GB DDR5 memory profile verification applied before dispatch. Delivery to UK mainland addresses is free and fully tracked. RTX 5090 stock is subject to availability. Call Kevin on 01902 714533 before ordering to confirm current stock and the build timeline, particularly if you have a research project start date in mind.

GINGER6

Ready to Order the G6 Cobalt Max?

Ginger6 has been building custom workstations in Wolverhampton since 2001. Kevin confirms RTX 5090 stock and your offloading setup before the build, verifies the 128GB DDR5 configuration, stress-tests the machine at sustained training load, and is available after delivery. 93% five-star reviews. 3-year warranty with lifetime support.

Custom Options


Specifications

Additional Information

Processor: AMD Ryzen 9 9950X
Processor Type: AMD Ryzen 9
No of Cores: 16
Max Core Speed: 5.70GHz
CPU Cooler: 360mm ARGB AIO Liquid Cooler
Motherboard: Gigabyte X870 EAGLE WIFI7
Case: APNX Creator C1 Black
Power Supply: Corsair 1000W RM1000e 80+ Gold Full Modular
Memory Size: 128GB
Solid State Drive Size: 2TB
2nd SSD: 4TB
Graphics: Nvidia RTX 5090 32GB
Graphics Card Connections: DisplayPort (x3), HDMI
Audio: Realtek ALC (HD Audio)
LAN: 2.5GbE LAN, Wi-Fi 7
Ethernet: Realtek 2.5GbE
Wi-Fi: WiFi 7 (MediaTek MT7925 rev1.0 / Realtek RTL8922AE rev1.1)
Bluetooth: 5.4
Rear Connections: 2x USB-C, 1x USB 3.2 Gen2, 3x USB 3.2 Gen1, 4x USB 2.0
Front Panel Connections: 2x USB-A 3.x, 1x USB-C, HD Audio + Mic
USB2 Ports: 4
USB3 Ports: 6
USB-C Ports: 3
Operating System: Windows 11 Pro
Monitors: Optional (See Custom Options)
Warranty: 3 Year Bronze Warranty
