
How We Gave Every Sandbox a GPU Without SR-IOV

How we delivered hardware-class GPU acceleration to thousands of ephemeral sandboxes without relying on SR-IOV — and what we learned about scheduling, isolation, and density along the way.

Alexander Spring • 6 min read • Jun 14, 2026

SR-IOV is how datacenter GPUs share themselves across virtual machines. Consumer cards don’t have it. We got around it anyway. Here’s how, and why we had to.

How we got here

Driver rents out real browsers. Plain unmodified Chrome, real Windows, real graphics, IPs leased from real ISPs. Our customers use them for brand protection, marketplace monitoring, ecommerce and product data, and background-screening across the web.

That means running on some of the most heavily protected sites on the internet: Shopee, Naver, Shein, and marketplaces like them. Sites where anything pretending to be a real computer gets caught in seconds.

Most tools attempt realism in software. Patched browsers, spoofed fingerprints, residential proxies. We go the other way. A real machine never has to convince anyone of anything. Its browser, OS, graphics, and network all tell the same story.

Our Gen 1 architecture ran around 60 browsers per server, all on one Windows OS. It worked because nothing was mixed. One locale, one time zone, one region for the whole box. A real machine in New York has New York time and a New York ISP address.

So Gen 1 had a rule: one box, one locale. The browsers held up, hitting state-of-the-art success rates. But the architecture had four problems:

Geography. One country per box. Serving a new market meant racking new hardware in that market.
Resources. Chrome is hungry. Sixty browsers fighting over one OS meant a heavy session could slow down all the others.
Isolation. Browsers shared a kernel, an OS, and its state. A crash or a leak in one place could touch everything.
Scale. ~60 browsers was the ceiling. A shared OS doesn’t get denser, it gets more fragile as more Chrome sessions fight for resources.

The fix is obvious on paper, and it fixes all four at once: run dozens of small, fully separate Windows machines per server, each with its own location and network. Except every one of those machines needs a real GPU. Software rendering is an instant giveaway, and if WebGL output doesn’t match a real GPU, you get blocked.

That’s where everyone hits the wall.

Three ways to put a GPU in a VM

Passthrough (DDA). Detach the card from the host and hand the whole thing to one VM. One card, one VM. Every additional VM gets software rendering and gets caught.
Partitioning (SR-IOV). The card splits itself into slices and the hypervisor deals them to guests. The clean way to share. Also a datacenter feature. Consumer cards don’t have it. Even if you pay up, a datacenter GPU is itself a tell: real shoppers don’t browse on server hardware. On top of that, GPU vendors charge hefty license fees for this feature.
Paravirtualization (GPU-PV). The host keeps the card and the driver. Each VM gets a virtual adapter that forwards graphics calls to the host, where the real driver runs them. The card is never split, so SR-IOV stops mattering. Same plumbing WSL2 and Windows Sandbox use.
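The trade-off between the three approaches fits in a few lines. This is a minimal sketch, not our scheduler: `Card`, `supports_sriov`, and `viable_modes` are hypothetical names, and real capability detection is vendor-specific.

```python
from dataclasses import dataclass
from enum import Enum


class SharingMode(Enum):
    PASSTHROUGH = "passthrough (DDA)"
    SRIOV = "partitioning (SR-IOV)"
    GPU_PV = "paravirtualization (GPU-PV)"


@dataclass(frozen=True)
class Card:
    name: str
    supports_sriov: bool  # hypothetical flag; real detection is vendor-specific


def viable_modes(card: Card, vm_count: int) -> list[SharingMode]:
    """Return the modes that can give each of vm_count VMs a real GPU."""
    modes = []
    if vm_count <= 1:
        # Passthrough hands the whole card to exactly one VM.
        modes.append(SharingMode.PASSTHROUGH)
    if card.supports_sriov:
        # The card slices itself; only cards with the feature qualify.
        modes.append(SharingMode.SRIOV)
    # The host keeps the card and guests forward their graphics calls.
    modes.append(SharingMode.GPU_PV)
    return modes
```

For a consumer card and a few hundred VMs, only the paravirtualized mode is left standing, which is the situation the rest of this post deals with.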

The third option, GPU-PV, works on consumer cards. The catch: GPU-PV is bare. The VM gets a graphics adapter and nothing else. It doesn’t even have a driver. You have to copy the host’s driver files into it, and keep them matched every time the host updates.
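One way to check that a guest's driver copy still matches the host is to compare content hashes of the two directory trees. The sketch below assumes both trees are reachable as ordinary paths. The function names `manifest` and `drift` are hypothetical, and it illustrates the idea rather than describing our tooling.

```python
import hashlib
from pathlib import Path


def manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            result[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return result


def drift(host: dict[str, str], guest: dict[str, str]) -> dict[str, list[str]]:
    """List the files the guest must add, replace, or delete to match the host."""
    shared = host.keys() & guest.keys()
    return {
        "missing": sorted(host.keys() - guest.keys()),
        "stale": sorted(f for f in shared if host[f] != guest[f]),
        "extra": sorted(guest.keys() - host.keys()),
    }
```

A guest is in sync when all three lists are empty. Anything else means the copy has fallen behind the host.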

The tooling that exists for this was built for one thing: a sandbox on your own desktop. One user, one or two VMs, most of the effort spent giving it a screen. Microsoft says the same in their docs: fine for one VM you own, not for hosting customers.

We needed the exact opposite. No screen, just correct rendering. Hundreds of VMs per box instead of one or two run by a single trusted user. GPU driver updates pushed to every live VM at once. And a different country, network, and locale per VM, something the single-user world never needed.

That’s where Gen 2 comes in.

Cells

We call our unit a cell. A complete, throwaway Windows machine. It boots in seconds, carries its own identity and network in the country you assign, and tears down clean when done. One server runs hundreds of them.

The lifecycle is backwards from a normal VM. No GPU exists at creation. The cell boots first. The shared GPU is attached to the live machine afterwards. Instead of handing the card to one VM, the host lends it to all of them.
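The ordering can be written down as a small state machine. This is a sketch of the sequence described above, with hypothetical names (`CellState`, `Cell.advance`). It makes no hypervisor calls and only enforces that the GPU attaches to a cell that is already running.

```python
from enum import Enum, auto


class CellState(Enum):
    CREATED = auto()       # VM defined, no GPU adapter present
    BOOTED = auto()        # guest OS is up, still no GPU
    GPU_ATTACHED = auto()  # shared host GPU lent to the running cell
    DESTROYED = auto()     # torn down


# Legal transitions: a cell can never go from CREATED straight to GPU_ATTACHED.
ALLOWED = {
    CellState.CREATED: {CellState.BOOTED, CellState.DESTROYED},
    CellState.BOOTED: {CellState.GPU_ATTACHED, CellState.DESTROYED},
    CellState.GPU_ATTACHED: {CellState.DESTROYED},
    CellState.DESTROYED: set(),
}


class Cell:
    def __init__(self, name: str) -> None:
        self.name = name
        self.state = CellState.CREATED

    def advance(self, target: CellState) -> None:
        """Move to target, rejecting any transition outside ALLOWED."""
        if target not in ALLOWED[self.state]:
            raise ValueError(
                f"{self.name}: {self.state.name} -> {target.name} not allowed"
            )
        self.state = target
```

Calling `advance(CellState.GPU_ATTACHED)` on a freshly created cell raises, which is the rule: boot first, GPU second.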

A cell thinks it has a normal GPU. In reality, every piece of graphics work it does is forwarded down to the host, the real card does the work, and the result comes back up. The cell never touches the hardware, but it gets the card at full strength: video decoding, the complete feature set, everything a real machine would have. Not the stripped-down virtual display that detectors look for.

Getting that working for one VM is the easy part. Making it survive a fleet is the rest of Gen 2. Every cell carries a copy of the host’s graphics driver, and the two have to match. The day the host updates, hundreds of live cells need the same update or their graphics break.
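At fleet level the invariant is simple to state: after a host update, no live cell may carry an older driver copy. Here is a rough sketch of that check, assuming each cell reports a version string. `CellRecord`, `stale_cells`, `push_update`, and the `install` callback are all hypothetical stand-ins for whatever actually copies files into a guest.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CellRecord:
    name: str
    driver_version: str  # version of the driver copy inside the cell


def stale_cells(host_version: str, cells: list[CellRecord]) -> list[CellRecord]:
    """Cells whose driver copy no longer matches the host."""
    return [c for c in cells if c.driver_version != host_version]


def push_update(
    host_version: str,
    cells: list[CellRecord],
    install: Callable[[CellRecord, str], None],
) -> list[str]:
    """Install the host's version into every stale cell; return their names."""
    updated = []
    for cell in stale_cells(host_version, cells):
        install(cell, host_version)          # hypothetical: copies driver files
        cell.driver_version = host_version   # recorded only after install returns
        updated.append(cell.name)
    return updated
```

Once `push_update` returns, `stale_cells` should come back empty. If it doesn't, some cell is about to lose its graphics.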

Cells also never need a screen, so we threw out everything display-related and kept only the part that renders.

What the website sees

The first question anyone asks: doesn’t the virtual adapter give you away?

No. The cell’s graphics calls run on the host’s real driver and real hardware. Query the GPU from inside a cell and you get the actual card. WebGL renderer string, Direct3D adapter, canvas output, video decode. All produced by a genuine consumer GPU, because it is a genuine consumer GPU.

There’s no virtual GPU string to catch, because there’s no virtual GPU doing the rendering.
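For a sense of what a renderer-string check looks like from the other side, here is a deliberately naive sketch. The markers are well-known software renderer names (Chrome's SwiftShader, Mesa's llvmpipe, Windows' Basic Render Driver). The list is not exhaustive, real detectors look at far more than one string, and `looks_software_rendered` is a hypothetical name.

```python
# Substrings that commonly appear in renderer strings of software rasterizers.
SOFTWARE_MARKERS = (
    "swiftshader",
    "llvmpipe",
    "microsoft basic render driver",
)


def looks_software_rendered(renderer: str) -> bool:
    """True if the renderer string names a known software rasterizer."""
    lowered = renderer.lower()
    return any(marker in lowered for marker in SOFTWARE_MARKERS)
```

A string naming a physical consumer card passes this kind of check. In a cell, that string comes from the host's real driver rather than from a spoof.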

What cells share

Each cell sits behind a real VM boundary: own kernel, own network stack, own locale, own exit IP. The one thing every cell shares is the host GPU driver.
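Because each cell owns its locale, time zone, and exit IP, coherence can be checked per cell instead of per box. The sketch below is illustrative only: the `EXPECTED` table is a tiny hypothetical sample, and `CellIdentity` and `mismatches` are names invented for this example.

```python
from dataclasses import dataclass

# Hypothetical reference data; a real table would cover every served country.
EXPECTED = {
    "US": {
        "time_zones": {"America/New_York", "America/Chicago", "America/Los_Angeles"},
        "locales": {"en-US"},
    },
    "KR": {"time_zones": {"Asia/Seoul"}, "locales": {"ko-KR"}},
}


@dataclass(frozen=True)
class CellIdentity:
    country: str          # country assigned to the cell
    time_zone: str        # OS time zone inside the cell
    locale: str           # OS and browser locale
    exit_ip_country: str  # country the exit IP geolocates to


def mismatches(identity: CellIdentity) -> list[str]:
    """Name every field that contradicts the cell's assigned country."""
    expected = EXPECTED.get(identity.country)
    if expected is None:
        return [f"no reference data for {identity.country}"]
    problems = []
    if identity.time_zone not in expected["time_zones"]:
        problems.append("time_zone")
    if identity.locale not in expected["locales"]:
        problems.append("locale")
    if identity.exit_ip_country != identity.country:
        problems.append("exit_ip_country")
    return problems
```

An empty list means the cell tells one story. A Seoul time zone behind a US identity would come back as `["time_zone"]`.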

The payoff

Gen 2 closed all four Gen 1 problems at once.

Geography. One box now presents real identities across many countries at the same time.
Resources. Every Chrome has its own machine with a hard resource boundary.
Isolation. A crash is one bad cell, reset in a few seconds. Egress is enforced per cell at the OS level.
Scale. Around 60 browsers per server became hundreds. Same consumer cards we already owned.

Wrapping up

Three ways to put a GPU in a VM. Passthrough serves one guest. SR-IOV serves many, on datacenter cards with license fees. Paravirtualization serves many on consumer cards, but ships built for a single local sandbox.

This architecture distributes a single consumer GPU into hundreds of real environments, each with coherent driver state, its own network, and its own country.

Everyone else is trying to make fake browsers look more real. We made browsers that pass at scale, because they’re real.

Alexander Spring

Founder

Alex works on browsers and web data APIs at Driver. His focus is on the research and development of innovative infrastructure solutions for web automation.

