I want to implement a flexible Ubershader system, with deferred shading. My current idea is to create shaders out of modules, which deal with certain features, such as FlatTexture, BumpTexture, Displacement Mapping, etc. There are also little modules which decode color, do tone mapping, etc. This has the advantage that I can replace certain types of modules if the GPU doesn't support them, so I can adapt to the current GPU capabilities. I am not sure if this design is good. I fear I could make a bad design choice now and pay for it later.
My question is where do I find resources, examples, articles about how to implement a shader management system effectively? Does anyone know how the big game engines do this?
Answer
A semi-common approach is to make what I call shader components, similar to what I think you're calling modules.
The idea is similar to a post-processing graph. You write chunks of shader code that each declare their necessary inputs and generated outputs, along with the code that actually works on them. You have a graph which denotes which chunks to apply in any situation (whether this material needs a bump mapping component, whether the deferred or forward component is enabled, etc.).
You can now take this graph and generate shader code from it. This mostly means "pasting" the chunks' code into place, with the graph having ensured they're in the necessary order already, and then pasting in the shader inputs/outputs as appropriate (in GLSL, this means defining your "global" in, out, and uniform variables).
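To make the ordering step a bit more concrete, here is a minimal sketch in Python. It assumes each chunk just declares the names it reads and writes; the `Chunk` class and `order_chunks` function are hypothetical names for illustration, not taken from any engine. A plain topological sort (Kahn's algorithm) puts producers ahead of their consumers:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical shader component: a name plus the values it reads and writes."""
    name: str
    inputs: tuple = ()
    outputs: tuple = ()


def order_chunks(chunks):
    """Return chunks so that every producer precedes its consumers."""
    # Assumption: each output name is written by exactly one chunk.
    producer = {out: c.name for c in chunks for out in c.outputs}
    by_name = {c.name: c for c in chunks}
    # deps[name] = chunks that must run first. Inputs nobody produces are
    # treated as external shader inputs and create no dependency.
    deps = {
        c.name: {producer[i] for i in c.inputs
                 if i in producer and producer[i] != c.name}
        for c in chunks
    }
    ordered = []
    ready = sorted(name for name, d in deps.items() if not d)
    while ready:
        name = ready.pop(0)
        ordered.append(by_name[name])
        for other in sorted(deps):
            if name in deps[other]:
                deps[other].discard(name)
                if not deps[other]:
                    ready.append(other)
    if len(ordered) != len(chunks):
        raise ValueError("cycle between shader chunks")
    return ordered


chunks = [
    Chunk("lighting", inputs=("normal", "albedo"), outputs=("color",)),
    Chunk("bump", inputs=("uv",), outputs=("normal",)),
    Chunk("texture", inputs=("uv",), outputs=("albedo",)),
]
print([c.name for c in order_chunks(chunks)])
# ['bump', 'texture', 'lighting']
```

A real system would also need to deal with per-stage graphs (vertex vs fragment) and with several chunks writing the same value, which this sketch ignores.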
This is not the same as an ubershader approach. Ubershaders are where you put all the code needed for everything into a single set of shaders, maybe using #ifdefs and uniforms and the like to turn features on and off when compiling or running them. I personally despise the ubershader approach, but some rather impressive AAA engines use them (Crytek in particular comes to mind).
You can handle the shader chunks in several ways. The most advanced way - and useful if you plan to support GLSL, HLSL, and the consoles - is to write a parser for a shader language (probably as close to HLSL/Cg or GLSL as you can for maximum "understandability" by your devs) that can then be used for source-to-source translations. Another approach is to just wrap up shader chunks in XML files or the like, e.g.
<![CDATA[
output.color = vec4(input.color.r, 0, 0, 1);
]]>
Note that with this approach you could make multiple code sections for different APIs or even version the code section (so you can have a GLSL 1.20 version and a GLSL 3.30 version). Your graph can even automatically exclude shader chunks which have no compatible code section, so you get semi-graceful degradation on older hardware (something like normal mapping is simply left out on hardware that can't support it, without the programmer needing to do a bunch of explicit checks).
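As a rough illustration of that degradation logic, the sketch below assumes each chunk carries a table of code sections keyed by the minimum GLSL version they need. The data layout and the names `pick_section` and `select_chunks` are made up for this example:

```python
def pick_section(sections, available_version):
    """Pick the newest code section the current GLSL version can compile.

    `sections` maps a minimum GLSL version (120, 330, ...) to source text.
    Returns None when no section is compatible.
    """
    usable = [version for version in sections if version <= available_version]
    return sections[max(usable)] if usable else None


def select_chunks(chunks, available_version):
    """Keep (name, code) for chunks that have a compatible section; drop the rest."""
    selected = []
    for name, sections in chunks:
        code = pick_section(sections, available_version)
        if code is not None:
            selected.append((name, code))
    return selected


# Hypothetical chunk table: flat texturing has an old and a new variant,
# normal mapping only exists for newer GLSL.
CHUNKS = [
    ("flat_texture", {
        120: "color = texture2D(diffuse_map, uv);",
        330: "color = texture(diffuse_map, uv);",
    }),
    ("normal_mapping", {
        330: "normal = texture(normal_map, uv).xyz * 2.0 - 1.0;",
    }),
]

for name, code in select_chunks(CHUNKS, available_version=120):
    print(name, "->", code)
```

On a GLSL 1.20 context only the flat texture chunk survives, which is the "just excluded" behaviour described above; nothing else in the pipeline needs an explicit capability check.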
The XML sample can then generate something similar to (apologies if this is invalid GLSL, it's been a while since I've subjected myself to that API):
layout (location=0) in vec4 input_color;
layout (location=0) out vec4 output_color;
struct Input {
    vec4 color;
};
struct Output {
    vec4 color;
};
void main() {
    Input input;
    input.color = input_color;
    Output output;
    // Source: example.shader
    #line 5
    output.color = vec4(input.color.r, 0, 0, 1);
    output_color = output.color;
}
You could be a little smarter and generate more "efficient" code, but honestly any shader compiler that isn't total crap is going to remove the redundancies from that generated code for you. Maybe newer GLSL lets you put the file name in #line commands now, too, but I know older versions are very deficient and don't support that.
If you have multiple chunks, their inputs (which aren't supplied as an output by an ancestor chunk in the tree) are concatenated into the input block, as are outputs, and the code is just concatenated. A little extra work is done to ensure stages match up (vertex vs fragment) and that vertex attribute input layouts "just work". Another nice benefit with this approach is that you can write explicit uniform and input attribute binding indices which are unsupported in older versions of GLSL and handle these in your shader generation/binding library. Likewise you can use the metadata in setting up your VBOs and glVertexAttribPointer calls to ensure compatibility and that everything "just works."
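The concatenation described above could look roughly like the following Python sketch. Everything in it is an assumption for illustration: chunks are plain dicts that are already in dependency order, an input fed by an earlier chunk is copied over from the output struct, and only outputs that no later chunk consumes are written to render targets. It names the struct instances `fs_in` and `fs_out` because `input` and `output` are reserved words in current GLSL, and it targets `#version 330`, where explicit locations on fragment outputs are available.

```python
def generate_fragment_shader(chunks, version=330):
    """Concatenate ordered chunks into a single GLSL fragment shader string.

    Each chunk is a dict with "name", "inputs" and "outputs" (both mapping
    a field name to a GLSL type) and "code" (the chunk body).
    """
    produced = {}     # every output written so far: name -> type
    consumed = set()  # outputs that a later chunk reads back in
    external = {}     # inputs no earlier chunk produced: name -> type
    in_fields = {}    # all fields of the Input struct
    body = []
    for chunk in chunks:
        for name, glsl_type in chunk["inputs"].items():
            in_fields[name] = glsl_type
            if name in produced:
                consumed.add(name)
                body.append(f"    fs_in.{name} = fs_out.{name};")
            else:
                external[name] = glsl_type
        body.append(f"    // Source: {chunk['name']}")
        body += ["    " + line for line in chunk["code"].strip().splitlines()]
        for name, glsl_type in chunk["outputs"].items():
            produced[name] = glsl_type
            consumed.discard(name)  # freshly rewritten, so final again
    final = {n: t for n, t in produced.items() if n not in consumed}

    src = [f"#version {version}"]
    src += [f"in {t} in_{n};" for n, t in external.items()]
    src += [f"layout(location = {i}) out {t} out_{n};"
            for i, (n, t) in enumerate(final.items())]
    # Note: GLSL rejects empty structs, so this assumes at least one
    # input and one output overall.
    src += ["struct Input {"] + [f"    {t} {n};" for n, t in in_fields.items()] + ["};"]
    src += ["struct Output {"] + [f"    {t} {n};" for n, t in produced.items()] + ["};"]
    src += ["void main() {", "    Input fs_in;", "    Output fs_out;"]
    src += [f"    fs_in.{n} = in_{n};" for n in external]
    src += body
    src += [f"    out_{n} = fs_out.{n};" for n in final]
    src.append("}")
    return "\n".join(src) + "\n"


example = [
    {"name": "red_only", "inputs": {"color": "vec4"}, "outputs": {"color": "vec4"},
     "code": "fs_out.color = vec4(fs_in.color.r, 0, 0, 1);"},
    {"name": "darken", "inputs": {"color": "vec4"}, "outputs": {"color": "vec4"},
     "code": "fs_out.color = vec4(fs_in.color.rgb * 0.5, fs_in.color.a);"},
]
print(generate_fragment_shader(example))
```

The same metadata (the `external` table here) is what you would feed into your attribute binding code, so the shader text and the vertex layout come from one source of truth.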
Unfortunately there is no good cross-API library like this already. Cg comes kinda close, but it has crap support for OpenGL on AMD cards and can be exceedingly slow if you use any but the most basic code generation features. The DirectX effects framework works too but of course has zero support for any language besides HLSL. There are some incomplete/buggy libraries for GLSL that mimic the DirectX libraries, but given their state the last time I checked I'd just write my own.
The ubershader approach just means defining "well-known" preprocessor directives for certain features and then recompiling for different materials with different configurations. For example, for any material with a normal map you can define USE_NORMAL_MAPPING=1 and then in your pixel-stage ubershader just have:
#if USE_NORMAL_MAPPING
vec4 normal;
// all your normal mapping code
#else
vec4 normal = normalize(in_normal);
#endif
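On the tooling side, turning a material's feature set into those directives can be as simple as the Python sketch below. The flag list and function names are hypothetical (only USE_NORMAL_MAPPING comes from the example above). As far as I know, GLSL's preprocessor treats an undefined identifier inside #if as an error rather than as 0, so the sketch defines every known flag explicitly, and it places the defines after the #version line, which has to come first.

```python
# Hypothetical list of flags the ubershader understands.
KNOWN_FEATURES = ("USE_NORMAL_MAPPING", "USE_SPECULAR")


def feature_defines(enabled):
    """Map every known flag to 1 or 0 so `#if FLAG` never sees an undefined name."""
    return {name: int(name in enabled) for name in KNOWN_FEATURES}


def with_defines(source, defines):
    """Insert #define lines right after the #version line (or at the top)."""
    block = [f"#define {name} {value}" for name, value in sorted(defines.items())]
    lines = source.splitlines()
    # Assumption: if there is a #version directive, it is on the first line.
    at = 1 if lines and lines[0].lstrip().startswith("#version") else 0
    return "\n".join(lines[:at] + block + lines[at:]) + "\n"


ubershader = "#version 330\nvoid main() {}\n"
print(with_defines(ubershader, feature_defines({"USE_NORMAL_MAPPING"})))
```

Each material then compiles the same ubershader text with its own define block in front.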
A big problem here is handling this for precompiled HLSL, where you need to precompile all combinations in use. Even with GLSL you need to be able to properly generate a key of all the preprocessor directives in use to avoid recompiling/caching identical shaders. Using uniforms can reduce the complexity, but unlike the preprocessor, uniforms don't reduce instruction count and can still have some minor impact on performance.
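One way to build such a key, sketched in Python under the assumption that a permutation is fully described by the shader source plus a dict of defines: sort the defines so that ordering doesn't matter, then hash. `permutation_key`, `ShaderCache` and `permutations_in_use` are hypothetical names, and `compile_fn` stands in for whatever actually compiles a shader in your engine.

```python
import hashlib


def permutation_key(source, defines):
    """Stable key for (source, defines), independent of dict insertion order."""
    canonical = ";".join(f"{name}={value}" for name, value in sorted(defines.items()))
    digest = hashlib.sha256()
    digest.update(source.encode("utf-8"))
    digest.update(b"\0")  # separator so source and defines cannot run together
    digest.update(canonical.encode("utf-8"))
    return digest.hexdigest()


class ShaderCache:
    """Compile each distinct permutation once and reuse it afterwards."""

    def __init__(self, compile_fn):
        self._compile_fn = compile_fn
        self._compiled = {}

    def get(self, source, defines):
        key = permutation_key(source, defines)
        if key not in self._compiled:
            self._compiled[key] = self._compile_fn(source, defines)
        return self._compiled[key]


def permutations_in_use(materials):
    """Unique define sets across all materials: the list to precompile offline."""
    unique = {tuple(sorted(defines.items())) for defines in materials}
    return [dict(items) for items in sorted(unique)]


compiled = []
cache = ShaderCache(lambda src, defs: compiled.append(defs) or len(compiled))
cache.get("void main() {}", {"USE_NORMAL_MAPPING": 1, "USE_SPECULAR": 0})
cache.get("void main() {}", {"USE_SPECULAR": 0, "USE_NORMAL_MAPPING": 1})
print(len(compiled))  # the second request hits the cache
```

For the precompiled case, running something like `permutations_in_use` over your material database at build time gives you the combinations that actually occur, which is usually far fewer than every possible on/off combination.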
Just to be clear, both approaches (as well as just manually writing a ton of variations of shaders) are all used in the AAA space. Use whichever works best for you.