EMARQUE AI Server

Name: EMARQUE AI Server
Brand: EMARQUE AI
SKU: ai-server
Availability: InStock

8× NVIDIA RTX PRO 6000 Blackwell Server Edition (96 GB GDDR7 each) in a 4U PCIe rackmount reference platform. Available through EMARQUE.

EMARQUE AI Server — built by EMARQUE in Malaysia

768 GB GDDR7 max

8 PCIe Gen5 cards

4U, air-cooled

Key features

Configuration overview.

Manufacturer-defined features from the published datasheet.

Eight cards, one chassis

Eight NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs in a single 4U rackmount — 768 GB of GDDR7 across PCIe Gen5, enough headroom to serve 70B-class models to 100–500 concurrent users without leaving the rack.
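As a rough sanity check on the 70B-class figure, the sketch below estimates weight memory at common precisions against the 768 GB pool. The bytes-per-parameter values follow from the formats themselves; the function names are illustrative, and KV cache, activations, and runtime overhead are deliberately left out, so real headroom is smaller than the numbers suggest.

```python
# Rough weight-memory estimate only; KV cache, activations, and runtime
# overhead are NOT modelled.
GPU_COUNT = 8
GB_PER_GPU = 96  # GDDR7 per card, from the configuration above

# Bytes per parameter for common weight formats.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}


def total_pool_gb() -> int:
    """Total GPU memory across all cards, in GB."""
    return GPU_COUNT * GB_PER_GPU


def weight_gb(params_billions: float, fmt: str) -> float:
    """Weight footprint in GB (1 GB = 1e9 bytes) for a model of the given size."""
    return params_billions * BYTES_PER_PARAM[fmt]


if __name__ == "__main__":
    for fmt in BYTES_PER_PARAM:
        need = weight_gb(70, fmt)
        print(f"70B @ {fmt}: {need:.0f} GB of weights vs {total_pool_gb()} GB pool")
```

Because the cards are independent PCIe devices, a model larger than 96 GB at the chosen precision has to be sharded across cards by the serving framework; the pool is not one contiguous address space.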

Datacentre-grade chassis, EMARQUE delivered

Giga Computing G493-class reference platform — the same hardware that ships into hyperscaler labs — supplied through EMARQUE with local commissioning, warranty handling, and Tier-1 support in Malaysia.

Built to run multi-tenant

MIG-style partitioning per card lets you carve the server into isolated inference workloads — separate teams, separate models, separate quotas on shared hardware. No noisy-neighbour fights, full observability via the BMC.

Storage and network that match the GPUs

Up to 60 TB U.2 NVMe on PCIe Gen5 keeps the eight GPUs fed; dual 25 GbE on-board with optional 100 GbE or InfiniBand for scale-out clusters. No PCIe contention, no storage starvation under sustained load.

Operates inside Malaysian DC reality

Validated for 200–240 V AC, with dual hot-swap 2 kW Titanium PSUs, redundant cooling, and an ASPEED BMC for remote management. Runs reliably in Malaysian data centre environments without bespoke power conditioning.

Configurable without a redesign

GPU count (2 / 4 / 8), memory (256 GB – 2 TB), storage (8 – 60 TB), networking (25 / 100 GbE / InfiniBand), and CPU choice (AMD EPYC or Intel Xeon) — all selectable at quote without changing the chassis or rebuilding the BOM.
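To make the option ranges concrete, here is a minimal sketch that checks a requested build against the limits listed above. The class, field names, and option strings are hypothetical, and 2 TB is taken as 2048 GB for the check; the manufacturer's configuration matrix remains the authority on which combinations are actually orderable.

```python
from dataclasses import dataclass

# Option ranges taken from the list above; identifiers are illustrative only.
GPU_COUNTS = {2, 4, 8}
NETWORKS = {"25GbE", "100GbE", "InfiniBand"}
CPUS = {"AMD EPYC", "Intel Xeon"}
MEMORY_GB_RANGE = (256, 2048)  # assumption: 2 TB treated as 2048 GB
STORAGE_TB_RANGE = (8, 60)


@dataclass(frozen=True)
class ServerConfig:
    gpu_count: int
    memory_gb: int
    storage_tb: int
    network: str
    cpu: str

    def problems(self) -> list[str]:
        """Return a list of out-of-range selections (empty if none)."""
        issues = []
        if self.gpu_count not in GPU_COUNTS:
            issues.append("gpu_count must be 2, 4, or 8")
        if not MEMORY_GB_RANGE[0] <= self.memory_gb <= MEMORY_GB_RANGE[1]:
            issues.append("memory_gb must be between 256 and 2048")
        if not STORAGE_TB_RANGE[0] <= self.storage_tb <= STORAGE_TB_RANGE[1]:
            issues.append("storage_tb must be between 8 and 60")
        if self.network not in NETWORKS:
            issues.append("network must be 25GbE, 100GbE, or InfiniBand")
        if self.cpu not in CPUS:
            issues.append("cpu must be AMD EPYC or Intel Xeon")
        return issues


if __name__ == "__main__":
    print(ServerConfig(8, 1024, 30, "100GbE", "AMD EPYC").problems())
```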

Architecture

Under the hood.

The four sub-systems that determine real-workload behaviour. We tune each before delivery.

GPU complex

Up to 8 × NVIDIA RTX PRO 6000 Blackwell Server Edition (96 GB GDDR7 each)
24,064 CUDA cores · 752 5th-gen Tensor cores · 188 4th-gen RT cores per GPU
FP4 / FP6 / FP8 inference acceleration (5th-gen Tensor)
PCIe Gen5 ×16 per card — no NVLink on this generation of RTX PRO

CPU + memory

Dual AMD EPYC 9745 (128 cores, Zen 5c 'Turin Dense') or 9755 (128 cores, Zen 5 'Turin')
Alternative: Dual Intel Xeon Platinum 8570 / Granite Rapids
Up to 256 PCIe Gen5 lanes total across both sockets
Up to 2 TB ECC DDR5-6400 across 24 DIMM slots (RDIMM)

Storage & networking

Up to 60 TB U.2 / U.3 NVMe (PCIe Gen5)
RAID 0/1/10 across NVMe; optional hot-swap SAS/SATA bays for archive
Dual 25 GbE on-board (SFP28 or RJ45)
Optional 100 GbE Mellanox / Broadcom NICs · InfiniBand HDR for cluster scale-out

Power, cooling, management

Redundant 2 × 2000 W 80+ Titanium PSUs (200–240 V AC)
Hot-swap fans across the chassis
ASPEED BMC with Redfish API, iKVM, signed firmware updates
Manufacturer factory QA per Giga Computing standard
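Redfish exposes management resources as JSON under the standard `/redfish/v1/` service root, with collections listing their members by `@odata.id`. The sketch below parses such a collection payload. The sample response and the member ID `Self` are made up for illustration; the IDs and resources a given BMC firmware actually reports will differ, and fetching the payload (HTTPS plus authentication against the BMC address) is left out.

```python
import json

# Standard Redfish path for the computer-systems collection.
SYSTEMS_PATH = "/redfish/v1/Systems"


def member_uris(collection_json: str) -> list[str]:
    """Extract member resource URIs from a Redfish collection payload."""
    doc = json.loads(collection_json)
    return [member["@odata.id"] for member in doc.get("Members", [])]


# Hypothetical sample payload; a real BMC returns its own member IDs.
SAMPLE = json.dumps(
    {
        "@odata.id": SYSTEMS_PATH,
        "Members@odata.count": 1,
        "Members": [{"@odata.id": "/redfish/v1/Systems/Self"}],
    }
)

if __name__ == "__main__":
    print(member_uris(SAMPLE))
```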

Next step

Tell us your workload. EMARQUE sizes the AI Server and sends a quote.

Supported workloads

Reference workload categories.

Workload categories documented in the manufacturer's reference materials. Sizing is confirmed with your technical team during scoping.

Departmental RAG

Internal knowledge chat for 500 users

Run a private RAG stack with a 70B-class model against your document store, code repos, and ticketing system. Eight GPUs split four ways gives four logical inference endpoints sized for 100+ concurrent users each — the right shape for finance, legal, engineering, and ops to share one server.
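One common way to realise the four-way split is to pin each endpoint's serving process to a pair of cards through the `CUDA_VISIBLE_DEVICES` environment variable. The sketch below only computes the groupings; the helper names are hypothetical, the team labels come from the paragraph above, and how each process is launched is left to the serving stack.

```python
def split_gpus(gpu_count: int, groups: int) -> list[list[int]]:
    """Partition GPU indices 0..gpu_count-1 into equal contiguous groups."""
    if groups <= 0 or gpu_count % groups != 0:
        raise ValueError("gpu_count must divide evenly into groups")
    size = gpu_count // groups
    return [list(range(i * size, (i + 1) * size)) for i in range(groups)]


def visible_devices(gpus: list[int]) -> str:
    """Format a GPU group as a CUDA_VISIBLE_DEVICES value."""
    return ",".join(str(g) for g in gpus)


TEAMS = ["finance", "legal", "engineering", "ops"]

if __name__ == "__main__":
    for team, gpus in zip(TEAMS, split_gpus(8, len(TEAMS))):
        print(f"{team}: CUDA_VISIBLE_DEVICES={visible_devices(gpus)}")
```

With two 96 GB cards per endpoint, each endpoint has 192 GB of GPU memory to hold its model and cache.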

Production inference

24/7 multi-tenant model serving

TGI / vLLM / Triton serving multiple fine-tuned variants of a base model. PCIe Gen5 isolation per card means no NVLink coherence overhead — the right architecture when each request fits on one GPU and you want predictable per-tenant throughput rather than tightly coupled training.

Vision + voice

Real-time multi-stream processing

Eight independent cards parallelise cleanly across camera or audio streams: object detection on fifty 4K streams, speech-to-text on 200 concurrent calls, or pose estimation across a factory floor. Each stream gets a dedicated GPU slice with consistent latency.
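A minimal sketch of how streams might be spread over the cards, assuming simple round-robin placement. The function name is hypothetical, and a production scheduler would also weigh per-stream resolution and model cost rather than just counting streams.

```python
from collections import defaultdict


def assign_streams(stream_count: int, gpu_count: int = 8) -> dict[int, list[int]]:
    """Round-robin stream IDs over GPU indices so load differs by at most one."""
    plan: dict[int, list[int]] = defaultdict(list)
    for stream_id in range(stream_count):
        plan[stream_id % gpu_count].append(stream_id)
    return dict(plan)


if __name__ == "__main__":
    for gpu, streams in sorted(assign_streams(50).items()):
        print(f"GPU {gpu}: {len(streams)} streams")
```

For 50 streams on eight cards this places six or seven streams on each card.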

Mixed workload

Inference today, fine-tune at night

Run production inference on six cards during business hours, reallocate to LoRA / QLoRA fine-tuning runs on all eight overnight. The BMC + Redfish API makes the rebalance scriptable; no separate dev cluster required.
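The day and night split can be captured as a small schedule function that an orchestration script could consult. This is a sketch under stated assumptions: business hours of 09:00 to 18:00 are invented for the example, the two cards not serving inference during the day are shown as unassigned because the text does not say how they are used, and all names are hypothetical.

```python
TOTAL_CARDS = 8
DAY_INFERENCE_CARDS = 6
BUSINESS_START_HOUR = 9   # assumption: not specified in the text
BUSINESS_END_HOUR = 18    # assumption: not specified in the text


def allocation(hour: int) -> dict[str, int]:
    """Return how many cards each role gets at the given local hour (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    if BUSINESS_START_HOUR <= hour < BUSINESS_END_HOUR:
        return {
            "inference": DAY_INFERENCE_CARDS,
            "fine_tune": 0,
            "unassigned": TOTAL_CARDS - DAY_INFERENCE_CARDS,
        }
    # Overnight: all cards go to LoRA / QLoRA fine-tuning runs.
    return {"inference": 0, "fine_tune": TOTAL_CARDS, "unassigned": 0}


if __name__ == "__main__":
    for hour in (8, 9, 17, 18):
        print(hour, allocation(hour))
```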

Full spec sheet

Every line documented at quotation.

Configurable. Final BOM, GPU mix, RAM and storage, and networking topology are confirmed in writing at quotation.

Reference platform: Giga Computing G493-class (4U PCIe) — Supermicro / ASUS equivalents on request
GPU: Up to 8 × NVIDIA RTX PRO 6000 Blackwell Server Edition · 96 GB GDDR7 with ECC
GPU memory bandwidth: 1.6 TB/s per card
GPU TDP: 300 W passive (server-optimised) — 600 W active variant available
GPU interconnect: PCIe Gen5 ×16 per card (no NVLink — independent cards)
Total GPU memory pool: Up to 768 GB across 8 cards
CPU: Dual AMD EPYC 9745 (128 cores) or Intel Xeon Platinum 8570
Memory: Up to 2 TB ECC DDR5-6400 (24 channels, dual-socket)
Primary storage: Up to 60 TB U.2 / U.3 NVMe (PCIe Gen5, RAID 0/1/10)
Bulk storage: Up to 200 TB enterprise HDD (RAID Z2) — optional
Networking: Dual 25 GbE on-board (SFP28) · 100 GbE optional
Power: Redundant 2 × 2000 W 80+ Titanium PSUs, hot-swappable (200–240 V AC)
Form factor: 4U rackmount, 19" rails included
Cooling: Air-cooled with redundant hot-swap fans
Management: ASPEED BMC with Redfish API, iKVM, signed firmware
OS: Ubuntu Server 24.04, RHEL 9, Windows Server 2025
Warranty: Manufacturer warranty (Giga Computing / OEM) · NVIDIA GPU warranty
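For rack planning, the per-card figures in the sheet can be rolled up to chassis level. The sketch below multiplies out the listed per-card memory bandwidth and the listed 300 W GPU TDP; function names are illustrative. The bandwidth total is a sum of independent per-card figures rather than a shared pool, and the power total covers GPUs only, so CPU, memory, storage, and fan draw must be added separately.

```python
GPU_COUNT = 8
BANDWIDTH_TBPS_PER_GPU = 1.6  # from the spec sheet
GPU_TDP_W = 300               # passive Server Edition figure from the spec sheet


def aggregate_bandwidth_tbps(gpus: int = GPU_COUNT) -> float:
    """Sum of per-card memory bandwidth in TB/s (cards are independent)."""
    return gpus * BANDWIDTH_TBPS_PER_GPU


def gpu_power_w(gpus: int = GPU_COUNT) -> int:
    """GPU-only power at TDP in watts; excludes the rest of the system."""
    return gpus * GPU_TDP_W


if __name__ == "__main__":
    print(f"Aggregate GPU memory bandwidth: {aggregate_bandwidth_tbps():.1f} TB/s")
    print(f"GPU power at TDP: {gpu_power_w()} W")
```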

FAQ

Common questions about the AI Server

What is the NVIDIA RTX PRO 6000 Blackwell Server Edition?

Server-optimised variant of the RTX PRO 6000 Blackwell — same Blackwell silicon and 96 GB GDDR7 memory, with passive cooling (300 W TDP) designed to be cooled by the server chassis airflow. Targets enterprise rackmount deployments where the active 600 W variant would be impractical.

What reference platform does EMARQUE use?

EMARQUE supplies the AI Server on the Giga Computing G493-class 4U PCIe reference platform validated for the RTX PRO 6000 SE thermal envelope. Equivalent reference designs from Supermicro or ASUS are available on request. Final BOM is documented at quotation.

Is this configuration customisable?

Yes. GPU count (2 / 4 / 8 cards), memory (256 GB – 2 TB), storage (8 – 60 TB NVMe), networking (25 GbE / 100 GbE / InfiniBand), and OS are all configurable. CPU choice (AMD EPYC vs Intel Xeon) is customer-selectable within the manufacturer's published configuration matrix.

What's the warranty?

Manufacturer warranty applies — Giga Computing / OEM warranty on the chassis and components; NVIDIA's warranty entitlement on the RTX PRO 6000 SE GPUs. EMARQUE handles local warranty claim processing through the manufacturer's authorised channel.

What support does EMARQUE provide locally?

EMARQUE handles in-country delivery, customs, commissioning, acceptance testing, and Tier-1 support response in Malaysia. Tier-2/3 escalation routes to the manufacturer per the warranty entitlement. Optional service contracts for extended response SLAs and on-site engineering visits can be added.

If I need NVLink-coherent multi-GPU or HGX-class, what's the upgrade path?

Step into NVIDIA DGX B200 or NVIDIA DGX B300 (or HGX B200 / B300 OEM platforms from Dell, Giga Computing, or Supermicro). Those configurations are presented on the respective NVIDIA DGX product pages — each shows both NVIDIA-branded DGX systems and HGX OEM alternatives.

Also in this class

NVIDIA DGX B200

Foundation for the AI Center of Excellence — eight NVIDIA B200 Tensor Core GPUs in a unified, air-cooled DGX system.

Request configuration & quotation.

Manufacturer specifications and warranty terms apply. EMARQUE issues a formal quotation through your Key Account Manager.

Talk to EMARQUE

Tell us about your workload.

Model size, concurrency, latency budget, deployment site. EMARQUE returns a quote in MYR within one Malaysian business day, sized to the workload — not the salesperson’s quota.

Key Account Manager: +6012 627 2280
Request for Quotation: business@emarque.co
