Home
PCs by Budget
£3000 Gaming PCs
Ginger6 G6 Cobalt Max — RTX 5090 AI Workstation with CPU Offloading

Ginger6 G6 Cobalt Max — RTX 5090 AI Workstation with CPU Offloading

Q: What is DeepSpeed ZeRO offloading and when do I need it?

DeepSpeed ZeRO-Offload moves optimiser states and gradients from GPU VRAM to system RAM during training, freeing VRAM for the parameters and activations needed for the current step. You need it when your model's combined parameter, gradient, and optimiser state footprint exceeds your GPU VRAM — typically when training 30B+ parameter models in fp16 on a single GPU. If your model fits within 32GB VRAM, you do not need ZeRO-Offload, and the G6 Cobalt with 64GB system RAM is the correct machine.

Q: How much system RAM do I need for PyTorch FSDP with an RTX 5090?

For FSDP with CPU offloading at 30B parameter scale in fp16 with an Adam optimiser, optimiser states occupy approximately 120GB in system RAM under ZeRO stage 2 offloading. 128GB is the practical minimum alongside normal system processes. For models significantly above 30B, requirements increase further. Call Kevin and describe your model size and offloading configuration before ordering.

Andy

★★★★★

"Kevin was brilliant at answering my questions and recommending the best machine for my budget. The computer was for my son and is now the envy of all his friends."

Johnny

★★★★★

"Nearly three years later with no issues I called for advice. They suggested I didn't need to upgrade and offered free ways to improve performance. Just great support."

Carol

★★★★★

"Bought our second PC from Ginger6. We had a few issues setting up and they called and assisted us — professional and patient throughout. Would highly recommend."

Rated Excellent

1,100+ verified reviews

★★★★★

Be the first to review this product

A maximum-specification AI workstation with RTX 5090 32GB VRAM and 128GB system RAM for DeepSpeed ZeRO and PyTorch FSDP — training models beyond single-GPU VRAM capacity.

AMD Ryzen 9 9950X (16 Cores)
128GB DDR5 RAM
Nvidia RTX 5090 32GB VRAM
2TB NVMe + 4TB NVMe
Windows 11 Pro
3-Year Warranty

£7,211.69

£6,769.99

In stock

SKU: g6-cobalt-max

Quantity:

Get Expert Help

Talk to Your Builder

01902 714533 Email

Description

============================================================ -->

D2X — High VRAM Workstation

G6 Cobalt Max — CPU Offloading Workstation for DeepSpeed ZeRO, PyTorch FSDP, and 30B+ Models

The G6 Cobalt Max is built for AI and ML researchers using DeepSpeed ZeRO-Offload or PyTorch FSDP CPU offloading to train models larger than 32GB VRAM on a single GPU. 128GB DDR5 system RAM provides the capacity for optimiser states, gradients, and model parameters to be offloaded from the RTX 5090 to system RAM during training — extending the effective model size range significantly beyond what 32GB VRAM alone supports. The Ryzen 9 9950X's 16 cores and DDR5 memory bandwidth keep the CPU offloading process from bottlenecking GPU utilisation. From £4200, built and stress-tested in Wolverhampton. The Cobalt Max features in our AI and ML workstations range.

The Cobalt Max and the G6 Cobalt use the same RTX 5090 32GB GPU. If your model fits within 32GB VRAM and you do not use ZeRO or FSDP CPU offloading, the Cobalt with 64GB system RAM is the correct machine — the Cobalt Max adds cost without benefit for that workflow. Call Kevin before ordering if you are uncertain whether your training pipeline requires CPU offloading.

Not sure whether you need the Cobalt Max or the Cobalt? Call Kevin on 01902 714533 — describe your training framework and model size and he will give you a straight answer.

View Full Specification Call Kevin: 01902 714533

24hr

Stress Tested

93%

Five-Star Reviews

3 Year

Warranty

Since 2001

Building Custom PCs

3-Year Warranty

Parts, return postage, and lifetime support

24-Hour Stress Test

Every workstation tested under sustained professional load

Free UK Delivery

Free mainland delivery, fully tracked

Lifetime Support

Free UK phone support for the life of your machine

SPEC OVERVIEW, CORRECT FOR 2026

G6 Cobalt Max — Full Specification

RTX 5090 32GB, 128GB DDR5, and Ryzen 9 9950X — the maximum VRAM and maximum system RAM configuration for CPU offloading research workloads.

Processor

AMD Ryzen 9 9950X — 16 cores, 32 threads, AM5. CPU offloading is compute-intensive on the host side. DeepSpeed ZeRO-Offload and PyTorch FSDP move optimiser states and parameters to system RAM, then move them back to GPU when needed for computation. This data movement is bandwidth-intensive. The Ryzen 9 9950X's 16 cores and DDR5 memory bandwidth keep the offload process from bottlenecking GPU utilisation during training.

Memory

128GB DDR5 — the capacity required for CPU offloading in DeepSpeed ZeRO and PyTorch FSDP at scale. ZeRO-Offload moves optimiser states and gradients to system RAM, which for a 30B parameter model in fp16 requires tens of GB of system RAM beyond the model VRAM footprint. 128GB provides this capacity. Stable at rated speed on X870 — confirm with Kevin before deployment.

Graphics

Nvidia RTX 5090 32GB VRAM — 32GB holds the active model weights and activations on the GPU during the forward pass. With CPU offloading, the optimiser states and gradients are managed in system RAM between steps, extending the effective model size range significantly beyond what 32GB VRAM alone supports. Stock subject to availability — confirm with Kevin before ordering.

Primary Storage

2TB NVMe — OS, Python environment, PyTorch, DeepSpeed, and active training datasets. Fast NVMe read speeds reduce data loading time between epochs, which matters more in offloading configurations where each step involves data movement between GPU and system RAM. Keeping the training environment on a dedicated drive prevents I/O contention during long runs.

Model and Archive Drive

4TB NVMe — completed model weights, training checkpoints, fine-tuned model archives, and large dataset storage beyond the active training set. 4TB provides space for an active research archive including multiple model checkpoints and the full training dataset history without requiring external storage during ongoing work.

Motherboard

Gigabyte X870 Eagle WIFI7 (ATX, AM5) — the X870 chipset provides PCIe 5.0 bandwidth and supports 128GB DDR5 at rated speed. 128GB DDR5 stability at rated speed on AM5 requires the X870 chipset — lower-tier chipsets are less consistent at this configuration. WIFI7 for fast wireless connectivity. Confirm 128GB DDR5 stability with Kevin before deployment.

Case and Cooling

APNX C1 with 360mm Liquid Cooler and 1000W PSU — the APNX C1 provides the physical clearance and airflow capacity for the RTX 5090 in a case with strong visual character for a research lab or studio environment. The 360mm AIO keeps the Ryzen 9 9950X at rated frequency during the CPU-intensive offloading process. The 1000W Corsair RM1000e delivers stable power to the RTX 5090 at 575W TDP under sustained training load, running at roughly 80 percent of rated capacity, the efficient band for an 80+ Gold unit, with ATX 3.1 transient headroom for the card's power spikes. PSU model subject to confirmation with Kevin.

Windows

Windows 11 Pro — pre-installed and activated. WSL2 is supported for PyTorch, DeepSpeed, and Python development environments on Windows 11 Pro. Pro includes BitLocker encryption for research data and model weights, Remote Desktop for accessing the machine during overnight training runs, and domain join for institutional network environments. Drivers confirmed before dispatch.

SPECIFICATION RATIONALE

Why This Specification for CPU Offloading and Large Model Training

Every component in the Cobalt Max is chosen for the specific demands of DeepSpeed ZeRO and PyTorch FSDP offloading workloads, where system RAM capacity and CPU bandwidth determine whether the offloading process extends or limits training.

SOFTWARE COMPATIBILITY

What the G6 Cobalt Max Handles

Confirmed software performance at the G6 Cobalt Max specification. Workload scales based on Ryzen 9 9950X, 128GB DDR5, RTX 5090 32GB, and 2TB plus 4TB NVMe.

PyTorch FSDP

CPU offload training

Models larger than 32GB VRAM trained via FSDP CPU offloading — model parameters, gradients, and optimiser states offloaded to 128GB system RAM between compute steps, extending effective training range beyond VRAM alone.

DeepSpeed ZeRO

Distributed training with offload

ZeRO-Offload moves optimiser states and gradients to system RAM, reducing GPU memory footprint and enabling training of larger models on a single RTX 5090. 128GB system RAM provides capacity for the offloaded states at 30B+ parameter scale.

Hugging Face Transformers

Large model training

30B+ parameter models trained in full precision using the Transformers training API with DeepSpeed or FSDP integration. 128GB system RAM holds the offloaded states, 32GB VRAM holds the active compute graph during the forward and backward passes.

TensorFlow

Large dataset training pipelines

Large model training pipelines with intensive CPU-side preprocessing. 128GB system RAM handles large dataset preprocessing and model state management. 16-core Ryzen 9 maintains GPU saturation during data-intensive pipeline stages.

Performance descriptors are indicative. Actual performance depends on project complexity, settings, and system configuration. Kevin can advise on the right spec for your specific workflow.

THE COBALT MAX IN CONTEXT

CPU Offloading Extends the Effective Model Size. 128GB System RAM Is What Makes It Work.

DeepSpeed ZeRO-Offload and PyTorch FSDP are techniques for training models whose total parameter, gradient, and optimiser state footprint exceeds GPU VRAM capacity. The mechanism is straightforward: during training, the states that are not needed for the current computation step — the optimiser states between parameter updates, the gradients between backward and update steps — are moved from GPU VRAM to system RAM. The GPU is freed to hold only the parameters needed for the current forward and backward pass. System RAM holds the offloaded states until they are needed again. This allows a single RTX 5090 with 32GB VRAM to participate in training workloads whose full memory footprint significantly exceeds 32GB, as long as system RAM is large enough to hold the offloaded states.

The reason 128GB system RAM is required — and not 64GB — is the scale of what is being offloaded at 30B+ parameter level. The optimiser states for a 30B parameter model in fp16 with an Adam optimiser occupy approximately 120GB in system RAM under ZeRO-2 offloading. 64GB is insufficient to hold these states alongside the operating system, the Python environment, and the data pipeline. 128GB provides the capacity for the full offloaded state at this model scale. The Ryzen 9 9950X's 16 cores and DDR5 memory bandwidth matter here too — the data movement between GPU and system RAM during training is CPU-bandwidth-bound, and an underpowered CPU on a slower memory platform introduces delays between steps that reduce effective GPU utilisation during the training loop. Kevin's conversation before the order confirms your model size, your training framework, and whether ZeRO-Offload or FSDP is in your pipeline. The 3-year warranty and post-delivery support apply from day one.

WHAT OUR CUSTOMERS SAY

93% Five-Star Reviews on Trustpilot

RATED EXCELLENT

4.9

★★★★★

Trustpilot • 1,100+ Verified Reviews

93% of Ginger6 customers leave five-star reviews. A research workstation at this level needs support that remains available after delivery. Kevin builds the Cobalt Max, stress-tests it at sustained load, and is reachable when training pipelines evolve or questions arise.

See all reviews

★★★★★

"I upgraded my PC to one that was Windows 11 compatible. I have been using it for about 3 months with no problems. The service from Ginger 6 has been great."

robert maidment, Verified Google Review

★★★★★

"Placed order, and received it earlier than expected. Windows and drivers already installed so computer was good to go right out of the box. Runs perfectly, have no complaints, only good things to say! Recommended!!"

Anonymous, Verified Reviews.io Review

★★★★★

"I have been using Ginger 6 since 2014 for gaming PCs for wife and myself. That's 5 purchased in total. Never had a technical issue with any of the builds and the only reason for new purchases is technical obsolescence. Highly recommend them."

Dangerous Toast, Verified Google Review

HOW YOUR WORKSTATION IS BUILT

Built by Hand in Wolverhampton

Every G6 Cobalt Max is assembled, configured, and tested by Kevin's team. The 128GB DDR5 memory configuration is verified at rated speed before the 24-hour test begins — an additional step that is specific to this machine.

Spec confirmed for your offloading framework and model scale

Before the build begins, the configuration is reviewed against your training framework, model architecture, and offloading setup. If you have spoken to Kevin, the spec is confirmed against your use of DeepSpeed ZeRO or PyTorch FSDP, your model size, and your sequence lengths. The Ryzen 9 9950X and X870 Eagle WIFI7 are verified for AM5 compatibility at 128GB DDR5. RTX 5090 stock is confirmed before the build begins. The dual NVMe layout is confirmed for your dataset and checkpoint requirements. Components are staged before assembly.

Hand-assembled with 128GB DDR5 memory profile verification

The Cobalt Max is assembled by hand in Wolverhampton. Inside the APNX C1, cables are routed to maintain clear airflow paths to the RTX 5090 and the 360mm radiator, reduce dust build-up around both NVMe drives, and keep future maintenance accessible. BIOS settings and DDR5 memory profiles are confirmed at 128GB before the 24-hour stress test begins — this is an additional verification step specific to the 128GB configuration, ensuring the RAM runs at rated speed and the memory subsystem is stable under the bandwidth demands of the offloading process. The 1000W PSU is confirmed for stable power delivery before the test starts.

24-hour test at sustained RTX 5090 and CPU offloading load

Every Cobalt Max runs under sustained GPU and CPU load for a full day before it ships. The test replicates the combined demand of an overnight training run with CPU offloading — the RTX 5090 held at sustained high utilisation at its 575W operating envelope, with the Ryzen 9 9950X managing the continuous data movement between GPU and system RAM. The 128GB DDR5 subsystem stability is confirmed under this sustained bandwidth demand. Windows 11 Pro, drivers, and both NVMe drives are confirmed before packaging.

24-HOUR STRESS TEST COVERS

Thermal behaviour under sustained RTX 5090 training load
Processor and graphics stability during extended use
Memory responsiveness and stability at 128GB under offloading bandwidth
Storage performance and consistency across both NVMe drives
BIOS and firmware stability
System stability under extended use

FREQUENTLY ASKED QUESTIONS

G6 Cobalt Max — Common Questions

Q what is deepspeed zero offloading and when do i need it +

DeepSpeed ZeRO-Offload is a technique for training models whose total memory footprint exceeds GPU VRAM capacity. During training, the states that are not needed for the current computation step — the optimiser states between parameter updates, the gradients between the backward pass and the parameter update — are moved from GPU VRAM to system RAM. The GPU is then freed to hold only the parameters and activations needed for the current step. This allows training of models whose full state footprint significantly exceeds GPU VRAM, as long as system RAM is large enough to hold the offloaded states and the CPU is fast enough to manage the data movement without becoming the bottleneck. You need ZeRO-Offload when your model's combined parameter, gradient, and optimiser state footprint exceeds your GPU VRAM — typically when training 30B+ parameter models in fp16 on a single GPU. If your model fits within 32GB VRAM, you do not need ZeRO-Offload, and the G6 Cobalt with 64GB is the correct machine.

Q how much system ram do i need for pytorch fsdp with an rtx 5090 +

The answer depends on your model size and which FSDP sharding strategy you use. With FSDP full sharding and CPU offloading enabled, the optimiser states for a 30B parameter model in fp16 with an Adam optimiser occupy approximately 120GB in system RAM under ZeRO stage 2 offloading. The operating system, Python environment, data loader, and other processes add additional RAM consumption. 128GB is the practical minimum to hold the offloaded states for a 30B parameter model alongside normal system processes. For models significantly above 30B, or for complex offloading configurations where more states are offloaded simultaneously, RAM requirements increase further. Call Kevin and describe your model size, parameter count, and offloading configuration before ordering — he will confirm whether 128GB covers your specific setup.

Q what is the difference between the g6 cobalt and the g6 cobalt max +

Both machines use the same RTX 5090 32GB GPU and Ryzen 9 9950X processor. The single difference is system RAM: the G6 Cobalt has 64GB DDR5, the Cobalt Max has 128GB DDR5. 128GB is required when your training pipeline uses DeepSpeed ZeRO-Offload or PyTorch FSDP CPU offloading to train models whose full state footprint exceeds 32GB VRAM — 64GB system RAM is insufficient for the offloading process at 30B+ scale. If your model fits within 32GB VRAM and you do not use CPU offloading, the Cobalt with 64GB system RAM is the correct machine and the Cobalt Max adds cost without benefit for your workflow. The case is also different: the Cobalt uses a Corsair 5000D, the Cobalt Max uses an APNX C1.

Q does cpu offloading slow down training compared to fitting the model in vram +

Yes, CPU offloading adds latency compared to keeping the full training state in VRAM, because data movement between GPU and system RAM takes time that would otherwise be used for computation. The throughput reduction depends on the offloading configuration — ZeRO stage 2 with only optimiser state offloading has less overhead than full parameter offloading — and on the CPU and memory bandwidth available on the host side. The Ryzen 9 9950X's DDR5 memory bandwidth minimises the host-side latency in this data movement, but some throughput reduction compared to a pure-VRAM training configuration is unavoidable. The trade-off is that CPU offloading makes it possible to train models that would otherwise require multiple high-end GPUs or cloud instances with HBM-based VRAM. For researchers whose model genuinely exceeds 32GB VRAM, the throughput trade-off of offloading on a single RTX 5090 is typically preferable to the cost and complexity of the alternatives.

Q how long does it take to build and deliver the g6 cobalt max +

Build time is 3 to 5 working days from order confirmation, including the 24-hour stress test and the 128GB DDR5 memory profile verification applied before dispatch. Delivery to UK mainland addresses is free and fully tracked. RTX 5090 stock is subject to availability. Call Kevin on 01902 714533 before ordering to confirm current stock and the build timeline, particularly if you have a research project start date in mind.

GINGER6

Ready to Order the G6 Cobalt Max?

Ginger6 has been building custom workstations in Wolverhampton since 2001. Kevin confirms RTX 5090 stock and your offloading setup before the build, verifies the 128GB DDR5 configuration, stress-tests the machine at sustained training load, and is available after delivery. 93% five-star reviews. 3-year warranty with lifetime support.

Configure & Buy Call Kevin: 01902 714533 Email [email protected]

Custom Options

£7,211.69

£6,769.99

£7,211.69

£6,769.99

Case *

Processor (CPU) *

Graphics *

Operating System *

Keyboard, Mouse & Headset Bundles *

Headset *

Speakers *

Warranty *

Build Time *

* Required Fields

Specifications

Additional Information

Processor	AMD Ryzen 9 9950X
Processor Type	AMD Ryzen 9
No of Cores	16
Max Core Speed	5.70GHz
CPU Cooler	360mm ARGB AIO Liquid Cooler
Motherboard	Gigabyte X870 EAGLE WIFI7
Case	APNX Creator C1 Black
Power Supply	Corsair 1000w RM1000e 80+ Gold Full Modular
Memory Size	128GB
Solid State Drive Size	2TB
2nd SSD	4TB
Graphics	Nvidia RTX 5090 32GB
Graphics Card Connections	Displayport (x3), HDMI
Audio	Realtek ALC (HD Audio)
LAN	2.5GB LAN, Wi-Fi 7
Ethernet	Realtek 2.5GbE
Wi-Fi	WiFi 7 (MediaTek MT7925 rev1.0 / Realtek RTL8922AE rev1.1)
Bluetooth	5.4
Connections	Rear: 2x USB-C, 1x USB 3.2 Gen2, 3x USB 3.2 Gen1, 4x USB 2.0
Front Panel Connections	2x USB-A 3.x, 1x USB-C, HD Audio + Mic
USB2 Ports	4
USB3 Ports	6
USB-C Ports	3
Operating System	Windows 11 Pro
Monitors	Optional (See Custom Options)
Warranty	3 Year Bronze Warranty

Reviews

Be the first to review this product

WRITE YOUR REVIEW

Write Your Own Review

You're reviewing: Ginger6 G6 Cobalt Max — RTX 5090 AI Workstation with CPU Offloading

How do you rate this product? *

*Nickname
*Summary of Your Review
*Review

	1 star	2 stars	3 stars	4 stars	5 stars
Price
Value
Quality
Customer Service

Ginger6 G6 Cobalt Max — RTX 5090 AI Workstation with CPU Offloading

Description

G6 Cobalt Max — CPU Offloading Workstation for DeepSpeed ZeRO, PyTorch FSDP, and 30B+ Models

G6 Cobalt Max — Full Specification

Why This Specification for CPU Offloading and Large Model Training

What the G6 Cobalt Max Handles

CPU Offloading Extends the Effective Model Size. 128GB System RAM Is What Makes It Work.

93% Five-Star Reviews on Trustpilot

Built by Hand in Wolverhampton

G6 Cobalt Max — Common Questions

Ready to Order the G6 Cobalt Max?

Custom Options

Specifications

Additional Information

Reviews

WRITE YOUR REVIEW

Write Your Own Review

You're reviewing: Ginger6 G6 Cobalt Max — RTX 5090 AI Workstation with CPU Offloading

How do you rate this product? *

Pages:

Information

Address

Contact

Ginger6 G6 Cobalt Max — RTX 5090 AI Workstation with CPU Offloading

Description

G6 Cobalt Max — CPU Offloading Workstation for DeepSpeed ZeRO, PyTorch FSDP, and 30B+ Models

G6 Cobalt Max — Full Specification

Why This Specification for CPU Offloading and Large Model Training

128GB system RAM: enables CPU offloading for models beyond 32GB VRAM

RTX 5090 + CPU offload: a practical path to large model training

Ryzen 9 9950X: offloading demands CPU memory bandwidth

APNX C1: RTX 5090 capable with distinctive character

What the G6 Cobalt Max Handles

CPU Offloading Extends the Effective Model Size. 128GB System RAM Is What Makes It Work.

Tell Kevin:

93% Five-Star Reviews on Trustpilot

Built by Hand in Wolverhampton

G6 Cobalt Max — Common Questions

Ready to Order the G6 Cobalt Max?

Custom Options

Specifications

Additional Information

Reviews

WRITE YOUR REVIEW

Write Your Own Review

You're reviewing: Ginger6 G6 Cobalt Max — RTX 5090 AI Workstation with CPU Offloading

How do you rate this product? *

Pages:

Information

Address

Contact