I am trying to find a way to organize my ever-growing number of shader techniques/functions (I am coding in sm_3.0). One way is to do this:
float4 PS_Crossroads(PS_INPUT input, uniform bool left_right) : COLOR0
{
    if (left_right)
        GoLeft();
    else
        GoRight();
    ...
}

technique LEFT
{
    pass Pass1
    {
        VertexShader = compile vs_3_0 VertSh();
        PixelShader = compile ps_3_0 PS_Crossroads(true);
    }
}

technique RIGHT
{
    pass Pass1
    {
        VertexShader = compile vs_3_0 VertSh();
        PixelShader = compile ps_3_0 PS_Crossroads(false);
    }
}

My question is: will this cause a branch at runtime, or will the compiler be smart enough to split it into two separate shaders, one per technique?
Thanks.
P.S. OK, I won't post this as an answer, because we already have an answer by someone much more experienced than me, but from my humble tests (on different PCs) and my humble Google research, this approach (using uniform constants) is efficient and indeed forces the compiler to create a different shader version for each branch, removing flow control completely from the final compiled shader. I checked the compiled shader code with NVIDIA Nsight.
Answer
In HLSL, creating separate techniques that pass different compile-time values into a shader function will definitely generate efficient code. Optimizing away control flow due to compile-time constants is implemented in Microsoft's HLSL bytecode compiler, which means it doesn't matter which GPU or drivers you have; the optimization is already done by the time the shader gets to the driver's shader compiler. (Unless you disable optimization in the HLSL compiler - in which case, the optimization may very well still be done at the driver level.)
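As a rough illustration of what that optimization amounts to, here is a minimal sketch of the source form next to hand-written equivalents of the two specialized shaders. The PS_INPUT layout, the bodies and return values of GoLeft and GoRight, and the names PS_Left and PS_Right are all assumptions made up for this example; the question does not show them.

```hlsl
// Hypothetical input layout; the original post does not show PS_INPUT.
struct PS_INPUT
{
    float2 uv : TEXCOORD0;
};

// Hypothetical stand-ins for the two code paths.
float4 GoLeft(PS_INPUT input)  { return float4(input.uv, 0.0f, 1.0f); }
float4 GoRight(PS_INPUT input) { return float4(0.0f, input.uv, 1.0f); }

// Source form: one function, with the path chosen by a uniform parameter
// that each `compile` statement binds to a literal.
float4 PS_Crossroads(PS_INPUT input, uniform bool left_right) : COLOR0
{
    if (left_right)
        return GoLeft(input);
    else
        return GoRight(input);
}

// Approximately what `compile ps_3_0 PS_Crossroads(true)` reduces to
// once the constant condition has been folded away.
float4 PS_Left(PS_INPUT input) : COLOR0
{
    return GoLeft(input);
}

// Approximately what `compile ps_3_0 PS_Crossroads(false)` reduces to.
float4 PS_Right(PS_INPUT input) : COLOR0
{
    return GoRight(input);
}
```

If the folding happens as described, the disassembly for each technique should contain only the instructions of its own path, with no flow-control instructions coming from left_right.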
In GLSL, where there is no IHV-independent compiler and bytecode language, things are less certain. Shader compilers are generally pretty aggressive about optimizing away compile-time constants in any form, and a good mature GLSL compiler will have no trouble with this. But there are some platforms/drivers where the GLSL compiler doesn't do a good job, which led Unity developer Aras Pranckevičius to develop a standalone GLSL optimizer.