Rig Architecture
A GeForce NOW "rig" is a Windows VM running on bare-metal NVIDIA GPU servers. Each rig hosts a single game session at a time.
Hardware Layer
Rigs run on NVIDIA's nvmetal bare-metal infrastructure. The example rig from the backup:
| Property | Value |
|---|---|
| Hostname | np-ams6-br2-016-b.np-ams6.nvmetal.net |
| Zone | NP-AMS-06 (Amsterdam) |
| GPU | NVIDIA L40 (full) |
| Instance Type | gl40g_1.br25_2xlarge |
| Platform | NGN Platform v2.1 |
| Hypervisor | Xen |
| OS | Windows 11 (QCOW2 image) |
Virtualization
Rigs run as Xen VMs. The framework includes:
- LocalXen / RemoteXen: VM lifecycle management (start, stop, restart)
- PCI passthrough: GPU pinned directly to the VM
- Xen tools: `xenstore_client.exe` for reading VM metadata
- Hostname sync from the Xen VM name via `wmic computersystem`
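The hostname sync step could be sketched roughly as follows. This assumes the Xen VM name has already been read (the exact `xenstore_client.exe` invocation is not given here, so it is left out) and only builds the `wmic computersystem` rename command line. The helper name `build_rename_command` is hypothetical.

```python
from typing import Optional


def build_rename_command(current_name: str, vm_name: str) -> Optional[str]:
    """Return a wmic command line that renames the machine, or None if in sync.

    ASSUMPTION: vm_name was obtained from Xen beforehand; how the real
    framework reads it is not shown in this document.
    """
    # Windows computer names are case-insensitive, so compare accordingly.
    if current_name.lower() == vm_name.lower():
        return None
    return (
        f'wmic computersystem where name="{current_name}" '
        f'call rename name="{vm_name}"'
    )
```

A rename issued this way only takes effect after the next reboot, so a caller would typically schedule one when the function returns a command.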
Zone Architecture
Zones are geographic clusters of rigs:
Zone: NP-AMS-06 (Amsterdam)
├── Provision Managers (PM): 10.192.17.9, 10.192.96.9, ...
├── Game Seat Gateway (GSG): gsg.np-ams-06.svc.cluster.local:443
├── DNS Cache: 10.223.136.74
├── KMS Host: consumerkms.nvidiangn.net:1688
├── Seat Pool: GS-gl40g_1.br25_2xlarge
└── Logging Server: 10.223.251.221
Zone Properties
Stored in Redis and accessible via asgard_util_zone_properties:
| Section | Contents |
|---|---|
| `ZoneProperties` | Zone name, mode (gaming/pro), region |
| `NetworkTopology` | IP ranges, subnets, routing |
| `GameMachine` | GPU type, instance config |
| `NATVM` | NAT virtual machine settings |
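As an illustration of reading one of these sections, here is a minimal sketch. It assumes each section is stored as a Redis hash under a key of the form `zone:<Section>`; that key layout, the field names, and the function `load_zone_section` are hypothetical, since the document only says the properties live in Redis. The client can be any object with a redis-py style `hgetall()` method.

```python
from typing import Dict, Mapping, Protocol

# The four sections listed in the table above.
SECTIONS = ("ZoneProperties", "NetworkTopology", "GameMachine", "NATVM")


class HashReader(Protocol):
    """Anything with a redis-py style hgetall(), such as redis.Redis."""

    def hgetall(self, name: str) -> Mapping: ...


def _text(value) -> str:
    # redis-py returns bytes unless decode_responses=True is set.
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def load_zone_section(client: HashReader, section: str) -> Dict[str, str]:
    """Fetch one zone property section as a plain str -> str dict."""
    if section not in SECTIONS:
        raise ValueError(f"unknown zone property section: {section}")
    # ASSUMPTION: hypothetical key layout "zone:<section>".
    raw = client.hgetall(f"zone:{section}")
    return {_text(k): _text(v) for k, v in raw.items()}
```

Rejecting unknown section names up front keeps a typo from silently returning an empty dict, which would otherwise look like a zone with no properties.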
Machine Roles
| Role | Description |
|---|---|
| `awsseats` | Game seat VMs |
| `natvm` | NAT/gateway VMs |
| `pm` | Provision Manager nodes |
| `redis` | State database nodes |
| `storage` | Storage servers |
State Database
Every rig connects to a Redis instance (port 6399) for:
- Configuration storage
- Session state tracking
- Zone property caching
- Service registration
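Because all four uses depend on that connection, a rig could probe it before doing anything else. The sketch below only checks that a TCP connection to port 6399 can be opened; it does not speak the Redis protocol. The function name `redis_reachable` and the 2-second default timeout are illustrative choices, not something taken from the rig software.

```python
import socket

REDIS_PORT = 6399  # port stated above


def redis_reachable(host: str, port: int = REDIS_PORT, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the state database can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```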
Seat Pool & Instance Types
The seat pool GS-gl40g_1.br25_2xlarge determines rig capabilities:
| Instance Type | GPU Fraction | Use Case |
|---|---|---|
| `gl40g_1.*_2xlarge` | Full L40 | Premium tier (4K, HDR, 120fps) |
| `gl40g_2.*_large` | Half L40 | Standard tier |
| `gl40g_4.*_small` | Quarter L40 | Free tier |
| `ga10g_2.*_large` | Half A10G | Standard tier |
| `gt10_2.*_medium` | Half T10 | Legacy tier |
Fractional (half or quarter) GPU instances carry restrictions such as:
- AV1 encoding disabled on half GPUs
- Video encoder perf check skipped
- Lower resolution caps
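The wildcard patterns in the instance type table can be matched with shell-style globbing. The following sketch classifies a seat pool name by GPU model, fraction, and tier. The `InstanceClass` type and `classify` function are hypothetical names, and stripping a leading `GS-` from the seat pool name is an assumption based on the single example `GS-gl40g_1.br25_2xlarge`.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class InstanceClass:
    pattern: str     # glob pattern from the instance type table
    gpu: str         # GPU model
    fraction: float  # share of the physical GPU given to the rig
    tier: str

    @property
    def is_fractional(self) -> bool:
        return self.fraction < 1.0


INSTANCE_CLASSES = (
    InstanceClass("gl40g_1.*_2xlarge", "L40", 1.0, "Premium"),
    InstanceClass("gl40g_2.*_large", "L40", 0.5, "Standard"),
    InstanceClass("gl40g_4.*_small", "L40", 0.25, "Free"),
    InstanceClass("ga10g_2.*_large", "A10G", 0.5, "Standard"),
    InstanceClass("gt10_2.*_medium", "T10", 0.5, "Legacy"),
)


def classify(seat_pool: str) -> Optional[InstanceClass]:
    """Map a seat pool or instance type name to its table row, if any."""
    # ASSUMPTION: seat pool names are the instance type with a "GS-" prefix.
    instance_type = seat_pool[3:] if seat_pool.startswith("GS-") else seat_pool
    for cls in INSTANCE_CLASSES:
        if fnmatchcase(instance_type, cls.pattern):
            return cls
    return None
```

For example, `classify("GS-gl40g_1.br25_2xlarge")` matches the first row, so `is_fractional` is false and none of the fractional-GPU restrictions would apply.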
L1 Validation Tests
Before a rig accepts sessions, l1test.py validates:
- Environment variables (`AG_LOGS`, `AG_HOME`)
- Required services installed:
  - `nvcloudinit`
  - `KioskPwdChanger`
  - `seatinitservice`
  - `NvContainerLocalSystem`
- User accounts (`kiosk`, `xen`)
- GPU present and drivers installed
- Network connectivity (DHCP/ping)
- Rollback state clean
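The general shape of such a pre-flight validation can be sketched as a list of independent checks whose results are aggregated. This is not the actual contents of l1test.py: `CheckResult`, `check_env_vars`, `run_checks`, and `rig_ready` are hypothetical names, and only the environment variable check is filled in because it needs no Windows-specific calls.

```python
from typing import Callable, List, Mapping, NamedTuple


class CheckResult(NamedTuple):
    name: str
    passed: bool
    detail: str


# Variables named in the validation list above.
REQUIRED_ENV_VARS = ("AG_LOGS", "AG_HOME")


def check_env_vars(env: Mapping[str, str]) -> CheckResult:
    """Fail if any required variable is unset or empty."""
    missing = [name for name in REQUIRED_ENV_VARS if not env.get(name)]
    detail = "missing: " + ", ".join(missing) if missing else "all set"
    return CheckResult("environment variables", not missing, detail)


def run_checks(checks: List[Callable[[], CheckResult]]) -> List[CheckResult]:
    """Run every check, even after a failure, so the report is complete."""
    return [check() for check in checks]


def rig_ready(results: List[CheckResult]) -> bool:
    """A rig should accept sessions only if every check passed."""
    return all(result.passed for result in results)
```

A caller might build the list as `run_checks([lambda: check_env_vars(os.environ), ...])`, adding one callable per item in the list above.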
Crash Detection & Recovery
A Windows Scheduled Task (CrashDetector.xml) runs detector.exe at boot:
- Monitors for crash dumps: `nvstreamer.exe`, `nvcontainer.exe`, `NvRtcStreamer`
- Collects `.dmp` files from:
  - `C:\ProgramData\NVIDIA Corporation\Crashdumps\`
  - `C:\Windows\MEMORY.DMP`
  - `C:\Windows\minidump\`
  - `C:\asgard\logs\AutoOnboarder\`
- Reports crashes via telemetry events:
  - `NGS_NvStreamerCrash`
  - `NGS_NvContainerCrash`
  - `NGS_KernelCrash`
  - `NGS_CTMTCrash`
  - `NGS_TasCrash`
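The collect-and-classify flow could look roughly like the sketch below. It is not the logic of detector.exe. Two assumptions are baked in: dump file names begin with the crashing process name, and `MEMORY.DMP` or anything under a `minidump` directory counts as a kernel crash. The events `NGS_CTMTCrash` and `NGS_TasCrash` are left unmapped because nothing here says what triggers them.

```python
from pathlib import Path
from typing import Iterable, List, Optional, Tuple

# ASSUMPTION: prefix-to-event mapping inferred from the names alone.
EVENT_BY_PREFIX = {
    "nvstreamer": "NGS_NvStreamerCrash",
    "nvcontainer": "NGS_NvContainerCrash",
}
KERNEL_EVENT = "NGS_KernelCrash"


def classify_dump(path: Path) -> Optional[str]:
    """Guess the telemetry event for a dump file, or None if unknown."""
    name = path.name.lower()
    if name == "memory.dmp" or path.parent.name.lower() == "minidump":
        return KERNEL_EVENT
    for prefix, event in EVENT_BY_PREFIX.items():
        if name.startswith(prefix):
            return event
    return None


def find_dumps(roots: Iterable[Path]) -> List[Tuple[Path, Optional[str]]]:
    """Collect .dmp files from files or directories; skip missing paths."""
    found = []
    for root in roots:
        if root.is_file():
            candidates = [root]
        elif root.is_dir():
            candidates = sorted(p for p in root.rglob("*") if p.is_file())
        else:
            continue
        for path in candidates:
            if path.suffix.lower() == ".dmp":
                found.append((path, classify_dump(path)))
    return found
```

Accepting both files and directories as roots matters because one of the listed locations, `C:\Windows\MEMORY.DMP`, is a single file rather than a folder.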
Windows Licensing
Rigs use KMS activation:
KMS Host: consumerkms.nvidiangn.net:1688
The rearmWindows.py script:
- Checks license grace period via `slmgr.vbs`
- Rearms when remaining time < 15 days
- Disables Windows Activation Technologies (`WatAdminSvc`)
- Triggers reboot after rearm
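The 15-day rearm decision can be sketched as a pure function over the text that `slmgr.vbs` prints. This is not the actual rearmWindows.py. It assumes the output (for example from `cscript //nologo slmgr.vbs /dli`) contains a remaining-time figure written as `<N> minute(s)`; the function names are hypothetical.

```python
import re
from typing import Optional

REARM_THRESHOLD_DAYS = 15  # threshold stated above
MINUTES_PER_DAY = 24 * 60

# ASSUMPTION: the remaining time appears in the output as "<N> minute(s)".
_REMAINING = re.compile(r"(\d+)\s+minute\(s\)")


def remaining_days(slmgr_output: str) -> Optional[float]:
    """Extract the remaining grace period in days, or None if not found."""
    match = _REMAINING.search(slmgr_output)
    if match is None:
        return None
    return int(match.group(1)) / MINUTES_PER_DAY


def should_rearm(slmgr_output: str,
                 threshold_days: float = REARM_THRESHOLD_DAYS) -> bool:
    """True when the remaining time is known and below the threshold."""
    days = remaining_days(slmgr_output)
    # If the output cannot be parsed, do nothing rather than rearm blindly.
    return days is not None and days < threshold_days
```

When `should_rearm` returns true, the remaining steps from the list above would follow: run the rearm, disable `WatAdminSvc`, and reboot.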
Performance Monitoring
GridPerf Templates
Windows Performance Monitor templates (GridPerfMonTemplate.xml) track:
- Process metrics: `nvstreamer`, `NvRtcStreamer`, `NvGridSvc`, `apptracer`
- GPU counters: temperature, clock speeds, utilization, frame buffer usage
- System: memory, disk, CPU, network
GPU Performance Counters
Custom NVIDIA counter manifest (nvPerfProvider.man):
| Counter | Unit |
|---|---|
| Temperature | Degrees Celsius |
| Graphics Clock | MHz |
| Memory Clock | MHz |
| Graphics Utilization | % |
| Frame Buffer Utilization | % |
| Video Utilization | % |
| Fan/Cooler Level | % |