700+GB Fully Enclosed 10x GPU Mobile AI Build
The "Cat-Proof" MoE Monster: Running DeepSeek & Kimi K2 on a Dual-Gen 5090/3090 Cluster
The Mission
I was tasked with building a local powerhouse for a professional graphic designer. The goal? A system capable of running extra-large MoE models (specifically DeepSeek and Kimi K2) while simultaneously crushing lengthy video generation and rapid high-detail image synthesis.
The Constraints (and the "Cat Factor")
We had three major constraints:
Budget Efficiency: We wanted maximum potency without hitting the diminishing returns of enterprise gear. Going full RTX 6000 Ada Generation would have doubled the cost for gains the artist didn't strictly need.
Mobility: The system needs to be moved between rooms easily.
The Enclosure: The system lives under the same roof as multiple cats. Open-air mining frames were ruled out immediately—I needed a physical barrier between tens of thousands of dollars in hardware and curious paws.
The Solution: A Thermaltake Core W200 "Hack"
Most people suggest mining racks on wheels for these builds, but they are ugly, wobbly, and dangerous for pets. I found a better solution: the Thermaltake Core W200.
While intended as a dual-system enclosure, I utilized it differently:
I installed the motherboard upside-down in the secondary compartment.
This orientation allowed me to run risers into the main compartment, creating a massive cavity to mount the dense GPU cluster.
While it requires tight cable management and a few zip ties to secure the GPUs, it eliminates the "jank" of a mining rack and provides a sturdy, movable chassis.
The Hardware Strategy
We saved roughly $10k by avoiding an all-Enterprise stack, instead opting for a hybrid consumer-flagship approach:
2x RTX 5090s: These do the heavy lifting for the artist's creative workflow (video/image gen), providing the time savings that only the 50-series can offer.
Multiple RTX 3090s: These serve as the massive VRAM buffer required to load the heavy MoE models.
Power Config: To manage thermals, I power-limit the 3090s to 200–250W and the 5090s to 500W.
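For anyone wanting to script a power config like this, here is a minimal sketch, not the exact setup used on this build. It generates `nvidia-smi -i <index> -pl <watts>` commands per card and tallies the resulting VRAM and GPU power budget. Assumptions: the 8x 3090 count is inferred from the "10x GPU" total in the title, 225 W is simply the midpoint of the 200–250W range, the VRAM figures are the stock 24 GB and 32 GB specs, and every helper name here is hypothetical.

```python
import subprocess

# Per-model caps in watts. 500 W for the 5090s comes from the post; 225 W is
# an assumed midpoint of the 200-250 W range quoted for the 3090s.
POWER_CAPS_W = {"RTX 5090": 500, "RTX 3090": 225}

# Stock VRAM per card in GB (published specs).
VRAM_GB = {"RTX 5090": 32, "RTX 3090": 24}


def match_model(gpu_name, table):
    """Return the first table key contained in gpu_name, or None if no match."""
    for model in table:
        if model in gpu_name:
            return model
    return None


def plan_power_caps(gpus, caps=POWER_CAPS_W):
    """Build one `nvidia-smi -i <index> -pl <watts>` command per known GPU.

    gpus is a list of (index, name) tuples; unknown models are skipped.
    """
    commands = []
    for index, name in gpus:
        model = match_model(name, caps)
        if model is not None:
            commands.append(["nvidia-smi", "-i", str(index), "-pl", str(caps[model])])
    return commands


def totals(gpus, caps=POWER_CAPS_W, vram=VRAM_GB):
    """Return (total VRAM in GB, total capped GPU power in W) for known GPUs."""
    total_vram = 0
    total_power = 0
    for _, name in gpus:
        model = match_model(name, caps)
        if model is not None:
            total_vram += vram[model]
            total_power += caps[model]
    return total_vram, total_power


def apply(commands, dry_run=True):
    """Print each command; run it only when dry_run is False (needs root)."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Hypothetical inventory: 2x 5090 plus 8x 3090 (count inferred from "10x GPU").
inventory = [(0, "NVIDIA GeForce RTX 5090"), (1, "NVIDIA GeForce RTX 5090")]
inventory += [(i, "NVIDIA GeForce RTX 3090") for i in range(2, 10)]

apply(plan_power_caps(inventory))  # dry run: prints the commands only
print(totals(inventory))
```

Under those assumptions the tally comes to 256 GB of VRAM and a 2,800 W GPU power budget before counting the CPU, fans, and pumps. Calling `apply(..., dry_run=False)` would actually set the limits, which requires root and typically has to be repeated after a reboot.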
RAW POWERRR!
The Caveat
Because three of the RTX 3090s are AIO hybrids, I had to mount their radiators on the main compartment's fan rails. This physically obstructs the door slightly, meaning the glass panel must remain cracked open to allow for exhaust.
Note: If these were blower or air-cooled cards, the case could be fully sealed 100% of the time.
Performance & Acoustics
With 12x 140mm fans installed, airflow is surprisingly excellent.
Thermals: GPU temps stay well within their operating range under load.
Noise: Despite the fan count and high-power GPUs, the acoustics are impressive. Without a decibel meter, I can't give exact numbers, but it doesn't sound much louder than a standard high-end gaming rig.
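To put numbers on the thermals in a build like this, one option is to poll `nvidia-smi` and flag any card that runs hot. The following is a rough sketch rather than the monitoring used here: the sample rows are made up for illustration (not measurements from this system), the 80 °C threshold is an arbitrary placeholder, and the function names are hypothetical.

```python
import subprocess

# Real nvidia-smi query flags; "nounits" drops the " W" suffix from power.draw.
QUERY = ["nvidia-smi",
         "--query-gpu=index,name,temperature.gpu,power.draw",
         "--format=csv,noheader,nounits"]


def parse_readings(csv_text):
    """Parse nvidia-smi CSV rows into dicts (index, name, temp_c, power_w)."""
    readings = []
    for line in csv_text.strip().splitlines():
        index, name, temp, power = [field.strip() for field in line.split(",")]
        readings.append({"index": int(index), "name": name,
                         "temp_c": int(temp), "power_w": float(power)})
    return readings


def hot_gpus(readings, limit_c):
    """Return the indices of GPUs at or above limit_c degrees Celsius."""
    return [r["index"] for r in readings if r["temp_c"] >= limit_c]


def read_live():
    """Query the driver once; requires nvidia-smi on PATH."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_readings(out)


# Made-up sample rows for illustration only, not measurements from this build.
sample = """0, NVIDIA GeForce RTX 5090, 68, 480.12
1, NVIDIA GeForce RTX 3090, 61, 224.50
2, NVIDIA GeForce RTX 3090, 84, 231.07"""

print(hot_gpus(parse_readings(sample), limit_c=80))
```

On a live system, `hot_gpus(read_live(), limit_c=...)` would do the same check against real readings. The parser assumes every field is numeric, so a card that reports `[N/A]` for power draw would need extra handling.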
This build proves you don't need a server rack or a wobbly mining frame to run massive local LLMs. With a little creativity regarding case orientation, you can have a (mostly) enclosed, cat-proof, and mobile supercomputer.