If there is one thing that is common across all devices that natively run SteamOS, it’s that the graphics chips inside them don’t have access to a great deal of VRAM. That’s a bit of a problem with some games, but thanks to the genius of a handful of coders, a solution has been rolled out in the latest 3.8.20 beta version of Valve’s operating system.
As noted by TechPowerUp, one of the coders in question is Natalie Vock, who describes their role as “an independent contractor for Valve on RADV, the community-developed open-source Vulkan driver for AMD GPUs.” Earlier this year, Vock published a blog post on a solution to issues with how Linux handles VRAM allocation and usage, so it’s not in the least bit surprising that Valve implemented this in SteamOS.
As I’m sure you all know, discrete GPUs have a pool of memory dedicated entirely to them (aka VRAM). However, that’s not the only memory they get to play around with, and there is an additional pool within the system memory called a Graphics Translation Table (GTT for short). Although this is managed by the OS, it is fully visible to the GPU.
The main problem with the GTT is that the GPU accesses it over the device’s PCIe interface, which has way less bandwidth (plus worse overall latencies) than the dedicated VRAM. That means if the data the GPU has requested is located within the GTT, then the performance at that stage will tank.
Games and other applications handle this by ensuring that what the GPU actually needs resides in the VRAM, but if there’s more data in total than there’s space available in the dedicated pool, the operating system will move stuff about (i.e. carry out an eviction process) to keep important stuff in VRAM and the excess in the GTT.
You can probably see why evictions are the real problem here. First of all, the OS has no idea what data is or isn’t important, and because graphics memory is virtualised for the GPU, the GPU doesn’t know either. Secondly, constantly shifting data around between memory pools is a recipe for a performance stall, so an ‘allocation strategy’ was developed for Linux’s memory manager.
“Instead of specifying VRAM as the only acceptable domain to place the allocation in, every VRAM allocation request would specify both ‘VRAM’ and ‘GTT’ as possible memory domains,” explains Vock. “The kernel would interpret this as VRAM being preferred, but if there was no space, GTT was an acceptable fallback and the kernel wouldn’t try to kick out other VRAM memory to make space.”
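To make that fallback behaviour concrete, here is a toy Python model of it. This is only a sketch of the idea Vock describes, not kernel or driver code: the Pool class, the place() function, and the pool sizes are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """A hypothetical memory pool with a fixed capacity in MB."""
    name: str
    capacity_mb: int
    used_mb: int = 0

    def try_alloc(self, size_mb: int) -> bool:
        # Succeed only if the request fits; never evict anything to make room.
        if self.used_mb + size_mb <= self.capacity_mb:
            self.used_mb += size_mb
            return True
        return False


def place(size_mb, domains):
    """Return the name of the first pool in `domains` with room, or None."""
    for pool in domains:
        if pool.try_alloc(size_mb):
            return pool.name
    return None


# Illustrative sizes only: an 8 GB card and 16 GB of GTT space.
vram = Pool("VRAM", 8192)
gtt = Pool("GTT", 16384)

# Every request lists VRAM first (preferred) and GTT second (fallback).
placements = [place(size, [vram, gtt]) for size in (6000, 2000, 1000)]
print(placements)
```

The third request doesn’t fit in what is left of VRAM, so it quietly lands in the GTT rather than forcing anything else out, which is exactly the trade-off the strategy makes.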
However, as Vock discovered while testing VRAM usage with Cyberpunk 2077 on a GPU with 8 GB of dedicated memory, the OS was using 1,370 MB of GTT, even though the game was only using 6,105 MB of VRAM. Other background applications were using some of the VRAM too, but Linux essentially wasn’t being firm enough about ensuring that almost all of the VRAM was allocated to Cyberpunk 2077.
The solution? Linux control groups (cgroups for short). This is a feature of the Linux kernel that gives you fine control over how resources can be allocated, prioritized, and managed for user-defined groups of tasks or processes. With cgroups, you can ‘protect’ memory from being evicted or set limits to how much memory can be used before evictions kick in.
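As a rough illustration of what ‘protecting’ memory means, the sketch below models eviction that only ever takes memory a group holds above its protected amount. It is a simplified toy, not the kernel’s actual logic: the Group class, the protected_mb field, and the numbers are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """A hypothetical control group tracking its graphics memory in MB."""
    name: str
    vram_mb: int        # VRAM currently held by the group
    protected_mb: int   # VRAM shielded from eviction
    gtt_mb: int = 0     # memory that has been pushed out to the GTT


def evict(groups, needed_mb):
    """Move up to `needed_mb` from VRAM to GTT, touching only unprotected memory.

    Returns the number of MB actually freed.
    """
    freed = 0
    for group in groups:
        if freed >= needed_mb:
            break
        # Only the part of the group's usage above its protection is fair game.
        evictable = max(0, group.vram_mb - group.protected_mb)
        take = min(evictable, needed_mb - freed)
        group.vram_mb -= take
        group.gtt_mb += take
        freed += take
    return freed


# Illustrative numbers: the game is protected, the browser is not.
game = Group("game", vram_mb=6000, protected_mb=7000)
browser = Group("browser", vram_mb=1500, protected_mb=0)

freed = evict([game, browser], needed_mb=1000)
print(freed, game.vram_mb, browser.vram_mb, browser.gtt_mb)
```

Even though the game is first in the list, its usage sits below its protected amount, so the whole 1,000 MB comes out of the browser’s share instead.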
What was missing was a VRAM cgroup controller, but thanks to a group effort by Vock, Maarten Lankhorst from Intel, and Maxime Ripard from Red Hat, the resulting kernel patch (aided by dmemcg-booster) was the fix Vock was aiming for. They also went on to develop some additional kernel patches to help the kernel understand gaming scenarios better, from a memory perspective.
Vock’s blog post clearly shows the improvement: Cyberpunk 2077’s VRAM usage went from 6,105 to 7,395 MB, and its GTT usage dropped from 1,370 to 650 MB. With more of the data in the appropriate place, the game is now less susceptible to stutters and long pauses.
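For a sense of scale, the quick calculation below works through those four figures. It assumes the VRAM and GTT numbers both refer to the game’s own memory, which is how the results read but is still an assumption, and the variable names are purely illustrative.

```python
# Figures quoted above, in MB.
vram_before, vram_after = 6105, 7395
gtt_before, gtt_after = 1370, 650

# How much moved in each direction.
vram_gain = vram_after - vram_before
gtt_drop = gtt_before - gtt_after

# Share of the combined (VRAM + GTT) footprint that sits in VRAM.
share_before = vram_before / (vram_before + gtt_before)
share_after = vram_after / (vram_after + gtt_after)

print(f"VRAM gain: {vram_gain} MB, GTT drop: {gtt_drop} MB")
print(f"Share in VRAM: {share_before:.1%} -> {share_after:.1%}")
```

On these numbers, the proportion of the measured footprint living in VRAM rises from roughly 82% to roughly 92%.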
Now, you might be wondering if a Steam Deck is going to greatly benefit from this. The answer is probably not, and that’s because it uses an integrated GPU, so both ‘VRAM’ and the GTT are hosted within the system memory. There might be some gains to be had from not having to shift memory pointers to locations outside of the dedicated graphics memory, but they’ll be minor at best.
But for the Steam Machine and any other device running SteamOS with a discrete GPU, the work of Vock et al. is an absolute boon. Other than trying out the new beta OS, the only thing left for us to do is wish that Microsoft had the same level of dedication to solving its memory management issues.
