Skip to content
EMARQUE.AI
    HomeProductsAI Work 100
AvailableEMARQUE built

AI Work 100

Entry-level dual-GPU AI workstation — single or dual NVIDIA RTX 5090 (32 GB GDDR7) on AMD Ryzen 9 9950X. EMARQUE-assembled in Malaysia.

From RM 12,000 — RM 28,000 depending on configuration
AI Work 100 — built by EMARQUE in Malaysia
2RTX 5090 GPUs
192 GBDDR5 max
8 TBGen5 NVMe
Key features

Configuration overview.

Manufacturer-defined features from the published datasheet.

Two RTX 5090s on a desk

Single or dual NVIDIA RTX 5090 GPUs — 32 GB of GDDR7 each, 64 GB pooled — enough headroom to load and serve a 70B-class quantized model entirely on local hardware. No cloud round-trip, no per-token billing, no data leaving the office.

AMD Ryzen 9 9950X — 16 cores at desk volume

Sixteen Zen 5 cores at 5.7 GHz boost handle data preprocessing, vector indexing, and parallel evaluations without bottlenecking the GPUs. Same CPU class as production servers, in a chassis that draws under 1.2 kW under load.

Quiet under sustained load

Validated to sub-45 dBA idle and sub-55 dBA under multi-hour training runs — comparable to a high-end gaming rig, half the volume of a server chassis. Runs in shared office space without acoustic complaints from neighbours.

Up to 192 GB DDR5 + 8 TB Gen5 NVMe

Memory and storage headroom large enough to hold the model, the working dataset, and the experiment cache simultaneously. RAID 0/1 NVMe configuration means model load times measure in seconds, not minutes.

EMARQUE-assembled in Malaysia

Built and QA-tested locally by the EMARQUE Lab — multi-point QA on CPU / GPU / memory / disk, benchmark report shipped with the unit. Warranty and parts handled in-country; no international RMA queue.

Standard 240 V wall outlet

No special electrical work — runs from a standard Malaysian 13 A wall socket. UPS recommended for graceful shutdown but not required. Sits under a desk or on a side table; no rack, no DC, no facilities request to procurement.

Architecture

Under the hood.

The four sub-systems that determine real-workload behaviour. We tune each before delivery.

GPU + compute
  • 1 or 2 × NVIDIA RTX 5090 (Blackwell, 32 GB GDDR7 each, PCIe Gen5 ×16)
  • 21,760 CUDA cores · 680 5th-gen Tensor cores · 170 4th-gen RT cores per GPU
  • AMD Ryzen 9 9950X — 16 cores / 32 threads Zen 5, 5.7 GHz boost
  • Up to 192 GB DDR5-6400 across 4 DIMM slots (non-ECC consumer platform)
Storage + networking
  • Up to 8 TB Gen5 NVMe (RAID 0 or 1) for primary
  • Optional 32 TB SATA SSD or HDD for bulk dataset storage
  • 2.5 GbE on-board · optional 10 GbE PCIe add-in card
  • USB-C 40 Gbps for external Thunderbolt storage or capture
Power, cooling, acoustics
  • 1200 W 80+ Platinum PSU (single 13 A wall outlet, 240 V AC)
  • Air-cooled with thermal validation under dual-GPU load
  • Sub-45 dBA idle · sub-55 dBA under sustained inference
  • Tower chassis — sits at the desk, no rack required
Next step

Tell us your workload. EMARQUE sizes the AI Work 100 and sends a quote.

Is this for you?

Client guidance — AI Work 100.

How EMARQUE scopes this system: who it suits, when to pick it, when to pick something else, and what we add beyond the hardware.

Who it's for

Individual researchers, developers, and small teams who want a real on-prem AI workstation under the desk — EMARQUE-built, locally supported, without server-class price or noise.

Choose this when

  • You're running 7B–70B class open-weight models locally for a small team.
  • Single or dual RTX 5090 (32 GB GDDR7 each) gives you enough headroom and you don't need NVLink coherence.
  • Office colocation — a quiet tower workstation on a standard wall outlet.

Pick something else when

  • You need to serve more than ~10 concurrent users in production — step up to AI PRO 500.
  • Your fine-tunes need 4+ GPUs or you want NVLink coherent memory — that's AI PRO 500 or AI Server territory.
  • The workload is heavily I/O-bound on bulk storage rather than GPU memory.

Best-fit workloads

  • Individual developers and researchers running 7B–70B class models locally
  • Small-team RAG prototypes — up to 10 concurrent users on a 13B model
  • LoRA / QLoRA fine-tuning on consumer-class GPUs
  • Cost-sensitive entry into on-prem AI — quiet office deployment

Model & memory fit

Two RTX 5090s give 64 GB of GPU memory across both cards. Comfortably runs 7B–13B at FP16, and 30B–70B with INT4 / INT8 quantization on a single card. Tensor parallelism across the two cards is over PCIe (no NVLink).

Deployment shape

Tower workstation on a standard 240 V wall outlet. Air-cooled, sub-55 dBA under load. 2.5 GbE on-board with optional 10 GbE add-in. No special HVAC or rack needed.

Alongside the rest of the lineup

Sits between DGX Spark (single-superchip, ≤ 70B quantized) and AI PRO 500 (multi-GPU pedestal). Cheaper than DGX Spark on a per-GPU-memory basis when dual RTX 5090 is selected; step up to AI PRO 500 when departmental concurrency grows.

Upgrade path

Step up to AI PRO 500 when you need 4 GPUs, ECC memory, or quieter office operation under sustained departmental load. Step to the EMARQUE AI Server (RTX PRO 6000 SE config) for production rackmount deployments.

What EMARQUE adds beyond the hardware

  • EMARQUE-assembled in Klang Valley with thermal validation on the dual-GPU configuration.
  • Customer-selectable GPU count (1 or 2) and storage capacity.
  • Pre-installed runtime (Ollama, vLLM, or your choice) validated before delivery.
  • Local warranty handling and locally-stocked parts replacement.
Supported workloads

Reference workload categories.

Workload categories documented in the manufacturer's reference materials. Sizing is confirmed with your technical team during scoping.

Individual researcher

Local fine-tuning + inference on 70B-class models

Load a quantized 70B model, run LoRA / QLoRA adapter training on a domain dataset, evaluate the result — all on the dual-GPU pool. Iterate as fast as your patience, not as fast as your cloud credit limit refreshes.

Small-team RAG

Internal chat for up to 10 concurrent users

Run a private RAG stack — embedder, vector store, 13B–34B chat model — for a team of 5–10 people querying internal docs. Throughput holds steady; the company's IP never reaches a third-party API.

Prototype before scaling

Validate the on-prem AI thesis before buying a server

Build the RAG pipeline, prove the latency targets, demonstrate the cost model — then scale into the AI PRO 500 or AI Server when team and traffic justify it. The AI Work 100 is the cheapest path to operational confidence.

Developer rig

Daily-driver AI coding assistant + experiments

Run a local code-completion model (StarCoder 2, Qwen Coder, DeepSeek Coder) alongside the IDE — sub-100ms suggestions with no cloud dependency. Spare GPU cycles power evening experiments without renting a cloud GPU.

Full spec sheet

Every line documented at quotation.

Configurable. Final BOM, GPU mix, RAM and storage, and networking topology are confirmed in writing at quotation.

GPU
1 or 2 × NVIDIA RTX 5090 (32 GB GDDR7 each)
CPU
AMD Ryzen 9 9950X — 16 cores / 32 threads
Memory
Up to 192 GB DDR5-6400 (4 DIMM)
Primary storage
Up to 8 TB Gen5 NVMe (RAID 0 / 1)
Bulk storage
Up to 32 TB SATA SSD or HDD optional
Cooling
Air-cooled with dual-GPU thermal validation
Acoustics
Sub-45 dBA idle, sub-55 dBA under sustained load
OS
Ubuntu 24.04 LTS, Windows 11 Pro
Networking
2.5 GbE on-board, 10 GbE optional add-in
Form factor
Tower workstation, standard 240 V wall outlet
FAQ

Common questions about AI Work 100

Can it really run a 70B model?

Yes, with quantization — Llama 3 70B at 4-bit (Q4_K_M GGUF) fits in ~40 GB of VRAM and runs well on dual RTX 5090s. Full FP16 70B does not fit; for that you want the AI PRO 500 with RTX PRO 6000 Blackwell. The AI Work 100 is sized for 7B–34B at full precision and 70B quantized.

Single GPU or dual? Which should I buy?

Single 5090 for individual research on ≤34B models or 70B quantized at lower throughput. Dual 5090 doubles inference throughput, enables larger context windows via tensor parallelism, and is required for serving 5+ concurrent users on a 13B+ model. EMARQUE recommends starting dual if the budget supports it — the upgrade cost later includes labour and re-validation.

How does this compare to a cloud GPU at $1.50/hour?

A dual-5090 AI Work 100 amortises against ~18 months of continuous cloud GPU rental at typical Malaysian developer usage. The break-even shifts shorter when you count data egress, idle-time billing, and the time cost of provisioning. Most teams see payback in 9–12 months; air-gapped workloads see it immediately because cloud isn't an option.

Is the consumer-grade Ryzen 9 + non-ECC memory a problem?

For inference and short-run fine-tuning, no — modern consumer DDR5 with on-die ECC and the Ryzen 9 9950X's IO die have proven reliable for AI workloads under EMARQUE's multi-point QA. If you require server-grade ECC RDIMMs and PCIe lane count for sustained 24/7 multi-user serving, the AI PRO 500 on Threadripper PRO is the right step up.

What software comes pre-installed?

Ubuntu 24.04 LTS or Windows 11 Pro at the customer's choice, with NVIDIA driver stack, CUDA 12.6+, PyTorch, vLLM, llama.cpp, and a benchmark suite pre-installed and validated against your stated workload at delivery. Each unit ships with a written benchmark report against your reference model.

Warranty and support?

Standard EMARQUE warranty (3-year parts and labour on EMARQUE-assembled components; NVIDIA's warranty on the GPUs). Priority phone + email support from EMARQUE in Malaysia, with optional Priority Care add-on for on-site response in Klang Valley.

Upgrade path if my team grows?

Two paths: (1) add a second AI Work 100 and load-balance across them at the application layer, or (2) consolidate into an AI PRO 500 (single workstation, up to 4 × RTX PRO 6000 Blackwell, ECC memory, dedicated AI-optimised cooling). EMARQUE handles trade-in and migration; the AI Work 100 retains residual value for a secondary research or developer role.

Request configuration & quotation.

Manufacturer specifications and warranty terms apply. EMARQUE issues a formal quotation through your Key Account Manager.

02Talk to EMARQUE

Tell us about your workload.

Model size, concurrency, latency budget, deployment site. EMARQUE returns a quote in MYR within one Malaysian business day, sized to the workload — not the salesperson’s quota.

  1. 01

    Key Account Manager

    +6012 627 2280
  2. 02

    Request for Quotation

    business@emarque.co