LAB ONLINE · RACK A4 · 8× H100 · DRAW 2.4kW
Where the machines live.
An operational AI lab, wide open. GPUs spinning in a London data centre, a fleet of agents running experiments, every kernel launch and cache hit tailed in public. This page is the front door. Everything below is live.
// 00 · primary rack
Rack A4 · 8× H100
The pool that runs the inference batch. Each card pulls ~312 W under load. The rack peaks at 2.4 kW with all eight at full tilt. Direct-to-chip liquid cooling, A + B redundant feeds.
// 01 · provisioned hardware
The metal underneath
What the lab runs on. Not stock photos. These are the actual specs the inference pool is sized against.
// 02 · the floor
Data centre · LON-04 · floor 2
6 rows · 30 racks · 240 U usable. Cold aisle 18°C, hot aisle 32°C. PDUs at each end run A+B redundant feeds.
// 03 · live telemetry
Inference signals
Six core readouts from the inference pipeline. Refresh cadence ≈ 1s. Bands tuned to a typical 8-GPU pool under normal demand.
// 04 · runtime trace
System output, tailed
A continuous read on what the lab is doing right now: agent turns, GPU jobs, edge cache, moderation review.
// 05 · compute fleet
32-node host map
Per-node temperature and load. Click any tile for full diagnostics. Critical-state cells pulse red and auto-page the on-call console.
32 nodes · 88% utilised
// 06 · network
Edge topology & flow
Inference traffic fans out across 308 Cloudflare colos. Live RTT, packet flow, and active session count.
// 07 · experiments
Active research
Long-running experiments inside the lab. Each runs on a dedicated GPU slice. Results stream to the artefact store.
Recursive critique tower
Two-model debate loop. Researcher proposes, critic refutes, three rounds, judge ranks. Tracking inter-rater agreement vs. round depth.
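A minimal sketch of the loop's control flow, assuming the researcher, critic, and judge are plain callables and that the judge ranks the per-round proposals. The name `run_debate` and the three callable signatures are hypothetical, and the stubs below stand in for real model calls.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces; real model clients would sit behind these.
Proposer = Callable[[str, List[str]], str]     # (question, critiques so far) -> proposal
Critic = Callable[[str, str], str]             # (question, proposal) -> critique
Judge = Callable[[str, List[str]], List[int]]  # (question, proposals) -> indices, best first


def run_debate(
    question: str,
    propose: Proposer,
    critique: Critic,
    judge: Judge,
    rounds: int = 3,
) -> Tuple[List[str], List[str], List[int]]:
    """Run propose -> critique for `rounds` rounds, then ask the judge to rank."""
    proposals: List[str] = []
    critiques: List[str] = []
    for _ in range(rounds):
        # The researcher sees every critique raised so far.
        proposal = propose(question, critiques)
        proposals.append(proposal)
        critiques.append(critique(question, proposal))
    ranking = judge(question, proposals)
    return proposals, critiques, ranking


# Stub models, for illustration only.
def stub_propose(question: str, critiques: List[str]) -> str:
    return f"draft v{len(critiques) + 1}"


def stub_critique(question: str, proposal: str) -> str:
    return f"objection to {proposal}"


def stub_judge(question: str, proposals: List[str]) -> List[int]:
    # Assumes later drafts are better; a real judge would score each proposal.
    return list(reversed(range(len(proposals))))


proposals, critiques, ranking = run_debate(
    "example question", stub_propose, stub_critique, stub_judge
)
print(proposals, ranking)
```

Recording the ranking after each round, rather than only at the end, would be one way to relate agreement to round depth.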
SDXL stylebank explorer
Sweeping 64 ControlNet conditioning combos × 12 schedulers. Sampling 4 images per combo. Saving thumbnail grids for human review.
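The size of that sweep can be sketched as a plain Cartesian product. This assumes a "combo" means one conditioning × scheduler pair, which the text does not spell out. The integer ids and the name `sweep_jobs` are placeholders, not the lab's real job schema.

```python
from itertools import product

N_CONDITIONINGS = 64   # conditioning combos, from the text
N_SCHEDULERS = 12      # schedulers, from the text
IMAGES_PER_COMBO = 4   # samples per combo, from the text


def sweep_jobs(n_cond=N_CONDITIONINGS, n_sched=N_SCHEDULERS, per_combo=IMAGES_PER_COMBO):
    """Yield one (conditioning_id, scheduler_id, sample_idx) tuple per image."""
    for cond_id, sched_id in product(range(n_cond), range(n_sched)):
        for sample_idx in range(per_combo):
            yield (cond_id, sched_id, sample_idx)


jobs = list(sweep_jobs())
print(len(jobs))  # 64 * 12 * 4 = 3072 images under the assumption above
```

If "combo" instead means a conditioning combo alone, the total drops to 64 × 4 = 256 images per scheduler pass.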
Whisper diarisation eval
Comparing Whisper large-v3-turbo against pyannote 3.1 on a 14-speaker meeting corpus. Measuring DER, JER, and word-attribution accuracy.
Embedding cache half-life
Production trace replay against a TTL'd embed cache. Looking at hit-rate vs. evict policy: LRU, LFU, ARC, random.
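A minimal sketch of what such a replay might look like, assuming the trace is a time-ordered list of `(timestamp, key)` pairs and that an expired entry counts as a miss. Only LRU and random eviction are shown; LFU and ARC are left out for brevity. The function name `replay_hit_rate` and its parameters are hypothetical.

```python
import random
from collections import OrderedDict


def replay_hit_rate(trace, capacity, ttl, policy="lru", seed=0):
    """Replay (timestamp, key) pairs against a TTL'd cache and return the hit rate.

    policy "lru" evicts the least recently used key; "random" evicts a
    uniformly random key. Entries older than `ttl` are treated as misses.
    """
    rng = random.Random(seed)
    cache = OrderedDict()  # key -> insert timestamp, least recently used first
    hits = 0
    total = 0
    for ts, key in trace:
        total += 1
        inserted = cache.get(key)
        if inserted is not None and ts - inserted <= ttl:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
            continue
        # Miss (absent or expired): drop any stale copy, then make room.
        cache.pop(key, None)
        if len(cache) >= capacity:
            if policy == "lru":
                cache.popitem(last=False)
            elif policy == "random":
                del cache[rng.choice(list(cache))]
            else:
                raise ValueError(f"unknown policy: {policy}")
        cache[key] = ts
    return hits / total if total else 0.0


# Toy trace, for illustration only; a real run would replay production requests.
toy_trace = [(0, "a"), (1, "b"), (2, "a"), (3, "c"), (4, "b"), (5, "a")]
for name in ("lru", "random"):
    print(name, replay_hit_rate(toy_trace, capacity=2, ttl=100, policy=name))
```

Running the same trace through each policy at a fixed capacity and TTL gives directly comparable hit rates.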
Tool-router fine-tune
LoRA on Llama-8B → tool-call routing classifier. Train on 1.2M synthetic dispatches. Eval on a hand-labelled 4k set.
Lab-wide drift watch
Continuous K-S test on response-length and tool-choice distributions. Alerts if today's drift exceeds 2σ from a 30-day baseline.
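One way to read that rule, sketched with the standard library only: compute a two-sample K-S statistic per day, then flag a day whose statistic sits more than 2σ above the mean of the previous 30 daily statistics. That reading of "2σ from a 30-day baseline" is an assumption, and the names `ks_statistic` and `drift_alert` are hypothetical.

```python
from bisect import bisect_right
from statistics import mean, stdev


def ks_statistic(a, b):
    """Two-sample K-S statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    # Both ECDFs are step functions, so checking every sample point is enough.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def drift_alert(today_score, baseline_scores, n_sigma=2.0):
    """True if today's score exceeds the baseline mean by more than n_sigma std devs."""
    mu = mean(baseline_scores)
    sigma = stdev(baseline_scores)  # needs at least two baseline points
    return today_score > mu + n_sigma * sigma


# Illustrative numbers only, not lab data.
reference_lengths = [1, 2, 3, 4]
todays_lengths = [3, 4, 5, 6]
score = ks_statistic(reference_lengths, todays_lengths)
print(score, drift_alert(score, [0.10, 0.12, 0.08, 0.11, 0.09]))
```

The K-S statistic assumes an ordered sample, so applying it to a categorical signal such as tool choice would first need a numeric encoding or a different test.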
// 08 · agent fleet
Autonomous workers
Six long-running agents share a memory store and a tool registry. Each one owns a slice of recurring work.
Researcher
Hunts source material across web, papers, and the org's own docs, then produces digestible briefs with citations.
Editor
Takes drafts and applies the house style: tone, hierarchy, claim-checking. Flags every unsourced sentence.
Planner
Breaks an arbitrary goal into a runnable plan with explicit dependencies. Owns the kanban for the lab itself.
Critic
Tries to break whatever the rest of the fleet ships. Adversarial probes, regression replays, sanity checks.
Librarian
Owns memory hygiene. Compacts the long-term store, deduplicates, tags new entries, evicts stale ones.
Forecaster
Time-series brain. Watches the metrics and predicts trouble: thermal events, quota burn, traffic spikes.
// engage
Pick a surface to open up.
This is a working lab, not a brochure. Every section above is a live readout. The deeper systems live behind their own dashboards.