Nvidia’s presentation about Shader Execution Reordering (SER) during the RTX 40 Series announcement probably didn’t mean much to most gamers. It was a nice bullet point that promised better performance and added to the list of new things that next-gen graphics cards will support. However, when you look at what it is, what it does, and how it enables future improvements, it’s pretty huge for game development.
So to explain why it’s huge, let’s talk about the way that processors and GPUs work. Put simply, GPUs are really good at doing the same thing many times at once. If you throw 1,000 identical tasks at a GPU, it will get them done faster than if you threw 500 different tasks at it. Think of it like this: in a factory, does it make sense to have every worker drive in every type of screw, changing drill bits between Phillips head, flat head, and hex each time? Or does it make more sense to assign one worker to each type, so that they can drive in one screw after another without constantly swapping out equipment?
Nvidia Shader Execution Reordering in games
It’s the same principle with GPUs. To give an example from video games, one texture repeated 1,000 times across a level will load much more quickly than 1,000 individual textures. So, let’s bring it back to what NVIDIA is doing with Shader Execution Reordering.
Let’s go back to that factory. SER would be like a foreman whose job is to group similar tasks together: Phillips head parts with Phillips head, flat head with flat head, and so on. The point is to make sure that each task is handled as quickly as possible.
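To make the foreman idea a little more concrete, here is a rough Python sketch. It is a toy model, not how NVIDIA's hardware or any real API works: the `warp_passes` cost function, the `reorder` helper, and the screw-themed shader names are all made up for illustration. The one real-world detail is that NVIDIA GPUs run threads in lockstep groups of 32 called warps. The sketch assumes a warp needs one pass for every different kind of task it contains, so a warp full of mixed tasks costs more than a warp full of identical ones.

```python
import random

WARP_SIZE = 32  # NVIDIA GPUs execute threads in lockstep groups of 32


def warp_passes(tasks, warp_size=WARP_SIZE):
    """Toy cost model (an assumption for illustration): each warp needs
    one pass per distinct kind of task it contains."""
    total = 0
    for start in range(0, len(tasks), warp_size):
        warp = tasks[start:start + warp_size]
        total += len(set(warp))
    return total


def reorder(tasks):
    """The 'foreman': put identical tasks next to each other."""
    return sorted(tasks)


random.seed(0)
# Hypothetical shader names, borrowed from the factory analogy.
shaders = ["phillips", "flat", "hex", "torx"]
tasks = [random.choice(shaders) for _ in range(1024)]

unsorted_cost = warp_passes(tasks)
sorted_cost = warp_passes(reorder(tasks))
print("passes without reordering:", unsorted_cost)
print("passes with reordering:   ", sorted_cost)
```

In this model the reordered list needs far fewer passes, because almost every warp ends up holding a single kind of task. Real ray tracing workloads are much messier, but the grouping principle is the same.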
Doing this for ray tracing is very important, because it’s not something a developer could do on their own. Ray tracing generates millions of tasks every second, so no one is capable of ordering them manually. In other parts of game development, like the texture example from before, a developer can manage things. You could set up a tiling texture where it makes sense, like for bricks on a wall, so that you manually have 1,000 copies of the same texture repeating rather than 1,000 unique brick textures.
What NVIDIA Shader Execution Reordering does, then, is let the GPU order the ray tracing tasks itself. According to NVIDIA, this translates for the end user into up to twice the shader performance and up to 25% better framerates with ray tracing turned on. And this is a technology that can be improved upon in the future. We may reach a point where RTX could realistically be used in high-framerate competitive games surprisingly soon. Mainstream feasibility is always an important factor in any real-time effect, which is why SER is a big deal for ray tracing in gaming.
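To get a feel for what a 25% framerate improvement means in practice, here is a small back-of-the-envelope calculation in Python. The baseline framerates are hypothetical examples, not benchmark results, and the 25% figure is simply taken at face value as a best case.

```python
def with_uplift(fps, uplift=0.25):
    """Framerate after applying a fractional uplift (0.25 means 25%)."""
    return fps * (1 + uplift)


def frame_time_ms(fps):
    """Time budget per frame, in milliseconds, at a given framerate."""
    return 1000.0 / fps


# Hypothetical baseline framerates, chosen only for illustration.
for base in (30, 60, 120):
    boosted = with_uplift(base)
    print(
        f"{base} fps -> {boosted:.1f} fps "
        f"({frame_time_ms(base):.2f} ms -> {frame_time_ms(boosted):.2f} ms per frame)"
    )
```

A 25% higher framerate works out to each frame taking 20% less time, which is the kind of headroom that matters when a game is trying to hold a high framerate with ray tracing on.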