Rig Architecture

A GeForce NOW "rig" is a Windows VM running on bare-metal NVIDIA GPU servers. Each rig hosts a single game session at a time.

Hardware Layer

Rigs run on NVIDIA's nvmetal bare-metal infrastructure. The example rig from the backup:

Property        Value
Hostname        np-ams6-br2-016-b.np-ams6.nvmetal.net
Zone            NP-AMS-06 (Amsterdam)
GPU             NVIDIA L40 (full)
Instance Type   gl40g_1.br25_2xlarge
Platform        NGN Platform v2.1
Hypervisor      Xen
OS              Windows 11 (QCOW2 image)

Virtualization

Rigs run as Xen VMs. The framework includes:

  • LocalXen / RemoteXen: VM lifecycle management (start, stop, restart)
  • PCI passthrough: GPU pinned directly to the VM
  • Xen tools: xenstore_client.exe for reading VM metadata
  • Hostname sync: the Windows hostname is set from the Xen VM name via wmic computersystem
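The hostname sync step can be pictured as a small helper that decides whether a rename is needed and builds the wmic command line. This is a minimal sketch, not the framework's code: the function names are hypothetical, the placeholder name WIN-ABC123 is invented, and using the short host label as the VM name is an assumption. How the VM name is read (for instance through xenstore_client.exe) is left to the caller because the source does not show that invocation.

```python
def needs_rename(current_name: str, vm_name: str) -> bool:
    """Windows hostnames are case-insensitive, so compare case-folded."""
    return current_name.casefold() != vm_name.casefold()


def build_rename_command(current_name: str, vm_name: str) -> str:
    """Build a `wmic computersystem ... call rename` command line.

    Hypothetical helper: the real framework's exact invocation is not
    shown in the source. The command is only built here, never executed.
    """
    if not vm_name:
        raise ValueError("vm_name must not be empty")
    return (
        f'wmic computersystem where name="{current_name}" '
        f'call rename name="{vm_name}"'
    )


# Example with an assumed VM name (short label of the documented hostname).
if needs_rename("WIN-ABC123", "np-ams6-br2-016-b"):
    print(build_rename_command("WIN-ABC123", "np-ams6-br2-016-b"))
```

A rename made this way takes effect only after a reboot, so a real script would presumably schedule one afterwards.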

Zone Architecture

Zones are geographic clusters of rigs:

Zone: NP-AMS-06 (Amsterdam)
├── Provision Managers (PM): 10.192.17.9, 10.192.96.9, ...
├── Game Seat Gateway (GSG): gsg.np-ams-06.svc.cluster.local:443
├── DNS Cache: 10.223.136.74
├── KMS Host: consumerkms.nvidiangn.net:1688
├── Seat Pool: GS-gl40g_1.br25_2xlarge
└── Logging Server: 10.223.251.221
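
For readers who want to work with this layout programmatically, the tree above can be modeled as a small record. This is an illustrative sketch only: the Zone class, its field names, and split_endpoint are hypothetical and do not come from the rig software. The Provision Manager list holds just the two addresses shown, since the source truncates the rest.

```python
from dataclasses import dataclass
from typing import List, Tuple


def split_endpoint(endpoint: str) -> Tuple[str, int]:
    """Split a 'host:port' string into (host, port)."""
    host, sep, port = endpoint.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {endpoint!r}")
    return host, int(port)


@dataclass(frozen=True)
class Zone:
    """Hypothetical model of the zone tree; field names are assumptions."""
    name: str
    provision_managers: List[str]
    game_seat_gateway: str
    dns_cache: str
    kms_host: str
    seat_pool: str
    logging_server: str


ams = Zone(
    name="NP-AMS-06",
    provision_managers=["10.192.17.9", "10.192.96.9"],  # list is truncated in the source
    game_seat_gateway="gsg.np-ams-06.svc.cluster.local:443",
    dns_cache="10.223.136.74",
    kms_host="consumerkms.nvidiangn.net:1688",
    seat_pool="GS-gl40g_1.br25_2xlarge",
    logging_server="10.223.251.221",
)

gsg_host, gsg_port = split_endpoint(ams.game_seat_gateway)
```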

Zone Properties

Zone properties are stored in Redis and accessible via asgard_util_zone_properties:

Section           Contents
ZoneProperties    Zone name, mode (gaming/pro), region
NetworkTopology   IP ranges, subnets, routing
GameMachine       GPU type, instance config
NATVM             NAT virtual machine settings

Machine Roles

Role       Description
awsseats   Game seat VMs
natvm      NAT/gateway VMs
pm         Provision Manager nodes
redis      State database nodes
storage    Storage servers

State Database

Every rig connects to a Redis instance (port 6399) for:

  • Configuration storage
  • Session state tracking
  • Zone property caching
  • Service registration
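
A quick way to confirm that a rig can reach its state database is a plain TCP probe on port 6399. The sketch below uses only the Python standard library and checks reachability, nothing more: it does not speak the Redis protocol. The function name redis_port_open and the host in the usage line are assumptions, since the source does not say where the Redis instance lives.

```python
import socket


def redis_port_open(host: str, port: int = 6399, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False


# Usage (host is a placeholder assumption):
# redis_port_open("127.0.0.1")
```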

Seat Pool & Instance Types

The seat pool GS-gl40g_1.br25_2xlarge determines rig capabilities:

Instance Type        GPU Fraction   Use Case
gl40g_1.*_2xlarge    Full L40       Premium tier (4K, HDR, 120fps)
gl40g_2.*_large      Half L40       Standard tier
gl40g_4.*_small      Quarter L40    Free tier
ga10g_2.*_large      Half A10G      Standard tier
gt10_2.*_medium      Half T10       Legacy tier
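
The naming pattern in the table can be read mechanically. The sketch below is an inference from the table rather than a documented scheme: it assumes the digit after the underscore is the number of seats sharing one GPU (1 = full, 2 = half, 4 = quarter), and that a seat pool name is the instance type prefixed with GS-. The parser, its field names, and the example name gl40g_4.br25_small are hypothetical.

```python
import re
from dataclasses import dataclass
from fractions import Fraction

# family _ divisor . variant _ size, e.g. gl40g_1.br25_2xlarge
_PATTERN = re.compile(
    r"^(?P<family>[a-z0-9]+)_(?P<divisor>\d+)\.(?P<variant>[^_]+)_(?P<size>[a-z0-9]+)$"
)


@dataclass(frozen=True)
class InstanceType:
    family: str            # e.g. "gl40g"
    gpu_fraction: Fraction  # share of one physical GPU (assumed 1/divisor)
    variant: str           # e.g. "br25"
    size: str              # e.g. "2xlarge"


def parse_instance_type(name: str) -> InstanceType:
    """Parse an instance type or seat pool name (assumed 'GS-' prefix)."""
    if name.startswith("GS-"):
        name = name[3:]
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type: {name!r}")
    divisor = int(match["divisor"])
    if divisor == 0:
        raise ValueError("GPU divisor must be positive")
    return InstanceType(
        family=match["family"],
        gpu_fraction=Fraction(1, divisor),
        variant=match["variant"],
        size=match["size"],
    )


premium = parse_instance_type("GS-gl40g_1.br25_2xlarge")
```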

Half and quarter GPU instances carry restrictions such as:

  • AV1 encoding disabled on half GPUs
  • Video encoder perf check skipped
  • Lower resolution caps

L1 Validation Tests

Before a rig accepts sessions, l1test.py validates:

  1. Environment variables (AG_LOGS, AG_HOME)
  2. Required services installed:
    • nvcloudinit
    • KioskPwdChanger
    • seatinitservice
    • NvContainerLocalSystem
  3. User accounts (kiosk, xen)
  4. GPU present and drivers installed
  5. Network connectivity (DHCP/ping)
  6. Rollback state clean
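
The first two checks in the list above can be sketched as pure functions plus a tiny runner. This is not the contents of l1test.py, which the source does not reproduce. The function names are hypothetical, and the set of installed services is passed in by the caller instead of being queried from Windows.

```python
from typing import Callable, Dict, Iterable, List, Mapping

REQUIRED_ENV = ("AG_LOGS", "AG_HOME")
REQUIRED_SERVICES = (
    "nvcloudinit",
    "KioskPwdChanger",
    "seatinitservice",
    "NvContainerLocalSystem",
)


def missing_env_vars(environ: Mapping[str, str]) -> List[str]:
    """Return required variables that are unset or empty."""
    return [name for name in REQUIRED_ENV if not environ.get(name)]


def missing_services(installed: Iterable[str]) -> List[str]:
    """Return required services absent from `installed` (case-insensitive)."""
    lowered = {name.lower() for name in installed}
    return [name for name in REQUIRED_SERVICES if name.lower() not in lowered]


def run_checks(checks: Dict[str, Callable[[], List[str]]]) -> Dict[str, List[str]]:
    """Run every check and return only the failing ones with their findings."""
    failures: Dict[str, List[str]] = {}
    for name, check in checks.items():
        problems = check()
        if problems:
            failures[name] = problems
    return failures


# Example run against made-up inputs; an empty result means the rig passes.
failures = run_checks({
    "env": lambda: missing_env_vars({"AG_LOGS": "a", "AG_HOME": "b"}),
    "services": lambda: missing_services([]),
})
```

Returning the list of problems rather than a bare boolean keeps the failure report specific about which variable or service is missing.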

Crash Detection & Recovery

A Windows Scheduled Task (CrashDetector.xml) runs detector.exe at boot:

  • Monitors for crash dumps from nvstreamer.exe, nvcontainer.exe, and NvRtcStreamer
  • Collects .dmp files from:
    • C:\ProgramData\NVIDIA Corporation\Crashdumps\
    • C:\Windows\MEMORY.DMP
    • C:\Windows\minidump\
    • C:\asgard\logs\AutoOnboarder\
  • Reports crashes via telemetry events:
    • NGS_NvStreamerCrash
    • NGS_NvContainerCrash
    • NGS_KernelCrash
    • NGS_CTMTCrash
    • NGS_TasCrash
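
The collection step can be approximated with a short directory scan. The sketch below is a guess at the behavior, not a decompilation of detector.exe: the function name find_dumps is hypothetical, and searching subdirectories recursively is an assumption. The paths are the ones listed above; MEMORY.DMP is handled as a single file and the others as directories.

```python
from pathlib import Path
from typing import Iterable, List

DUMP_LOCATIONS = (
    r"C:\ProgramData\NVIDIA Corporation\Crashdumps",
    r"C:\Windows\MEMORY.DMP",
    r"C:\Windows\minidump",
    r"C:\asgard\logs\AutoOnboarder",
)


def find_dumps(locations: Iterable[str] = DUMP_LOCATIONS) -> List[Path]:
    """Return every .dmp file found at the given files or directories."""
    found: List[Path] = []
    for location in locations:
        path = Path(location)
        if path.is_file():
            candidates = [path]
        elif path.is_dir():
            # Recursive search is an assumption about the real tool.
            candidates = [p for p in path.rglob("*") if p.is_file()]
        else:
            continue  # location does not exist on this machine
        # Compare suffixes case-insensitively so MEMORY.DMP also matches.
        found.extend(p for p in candidates if p.suffix.lower() == ".dmp")
    return sorted(found)
```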

Windows Licensing

Rigs use KMS activation:

KMS Host: consumerkms.nvidiangn.net:1688

The rearmWindows.py script:

  • Checks license grace period via slmgr.vbs
  • Rearms when remaining time < 15 days
  • Disables Windows Activation Technologies (WatAdminSvc)
  • Triggers reboot after rearm
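
The rearm decision reduces to parsing the remaining grace period and comparing it with the 15-day threshold. The sketch below is not rearmWindows.py itself. It assumes the script reads English-locale slmgr.vbs output containing a line of the form "Time remaining: N minute(s) (D day(s))"; that line format and the function names are assumptions.

```python
import re
from typing import Optional

REARM_THRESHOLD_DAYS = 15
MINUTES_PER_DAY = 24 * 60

# Matches the minute count in a line such as
# "Time remaining: 43200 minute(s) (30 day(s))" (assumed format).
_REMAINING = re.compile(r"(\d+)\s+minute\(s\)")


def remaining_minutes(slmgr_output: str) -> Optional[int]:
    """Extract the remaining grace period in minutes, if reported."""
    match = _REMAINING.search(slmgr_output)
    return int(match.group(1)) if match else None


def should_rearm(slmgr_output: str,
                 threshold_days: int = REARM_THRESHOLD_DAYS) -> bool:
    """True when the reported grace period is below the threshold."""
    minutes = remaining_minutes(slmgr_output)
    if minutes is None:
        return False  # no grace period reported, so nothing to act on
    return minutes < threshold_days * MINUTES_PER_DAY
```

With the default threshold, an output reporting 14400 minutes (10 days) triggers a rearm, while 43200 minutes (30 days) does not.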

Performance Monitoring

GridPerf Templates

Windows Performance Monitor templates (GridPerfMonTemplate.xml) track:

  • Process metrics: nvstreamer, NvRtcStreamer, NvGridSvc, apptracer
  • GPU counters: Temperature, clock speeds, utilization, frame buffer usage
  • System: Memory, disk, CPU, network

GPU Performance Counters

Custom NVIDIA counter manifest (nvPerfProvider.man):

Counter                    Unit
Temperature                Degrees Celsius
Graphics Clock             MHz
Memory Clock               MHz
Graphics Utilization       %
Frame Buffer Utilization   %
Video Utilization          %
Fan/Cooler Level           %

admindesk.top — Reversed & documented from Asgard rig backups and GCIS plugin binaries.