I'm a bit confused. The official documentation (http://msdn.microsoft.com/en-us/library/windows/desktop/bb509588(v=vs.85).aspx) says that ddx(input) is the partial derivative of the input with respect to the "screen-space x-coordinate."
My calculus is fine, but how can it tell where the input value comes from? If I passed it a function, fine, I can imagine what it would do, but the derivative of a single number is always zero...?
Is it just a scaling? Like, this would get scaled to one tenth of its size at this distance, so it returns one tenth? But then it doesn't need an input...
Can someone give a quick explanation of what's actually happening?
Answer
Internally, GPUs never run one instance of a pixel shader at a time. At the finest level of granularity, they are always running 32-64 pixels at the same time using a SIMD architecture. Within this, the pixels are further organized into 2x2 quads, so each group of 4 consecutive pixels in the SIMD vector corresponds to a 2x2 block of pixels on screen.
Derivatives are calculated by taking differences between the pixels in a quad. For instance, ddx will subtract the values in the pixels on the left side of the quad from the values on the right side, and ddy will subtract the bottom pixels from the top ones. The differences can then be returned as the derivative to all four pixels in the quad.

Since the pixel shader is running in SIMD, it's guaranteed that the corresponding value is in the same register at the same time for all the pixels in the quad. So whatever expression or value you put into ddx or ddy, it will be evaluated in all four pixels of the quad, then the values from different pixels subtracted as described above.
So taking the derivative of a constant value will give zero (as you'd expect from calculus, right?) because it's the same constant value in all four pixels.
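To make the mechanics concrete, here is a minimal sketch (plain Python, not real GPU code; the function names are made up for illustration) of how a 2x2 quad produces a derivative by differencing neighboring pixels, and why a constant quad yields zero:

```python
# Simulate derivative computation over a 2x2 pixel quad.
# q[y][x] holds the value of the expression as evaluated at that pixel.

def quad_ddx(q):
    """Horizontal difference: right pixel minus left pixel.
    The same value is returned to all four pixels in the quad."""
    return q[0][1] - q[0][0]

def quad_ddy(q):
    """Vertical difference: bottom pixel minus top pixel.
    (The sign convention varies between APIs and screen-space conventions.)"""
    return q[1][0] - q[0][0]

# A quad where the value varies across the screen:
varying = [[1.0, 3.0],
           [5.0, 7.0]]
print(quad_ddx(varying))  # 2.0 -- change per pixel in x
print(quad_ddy(varying))  # 4.0 -- change per pixel in y

# A quad holding the same constant in every pixel:
constant = [[5.0, 5.0],
            [5.0, 5.0]]
print(quad_ddx(constant))  # 0.0 -- derivative of a constant is zero
```

The key point the sketch shows: the "input" to ddx is not a single number in isolation, it is whatever value that expression took in each of the four pixels of the quad, so the hardware always has neighboring samples to difference.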
Also note that there are "coarse" and "fine" derivatives, ddx_coarse/ddy_coarse and ddx_fine/ddy_fine. An explanation of the distinction is given here. Just plain ddx/ddy are aliases for the coarse versions.
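The coarse/fine distinction can also be sketched in the same toy model. This is an assumption-laden illustration, not a hardware specification: the coarse version computes one difference and broadcasts it to the whole quad, while the fine version gives each row its own horizontal difference (and symmetrically, each column its own vertical difference):

```python
# Toy model of coarse vs. fine derivatives over a 2x2 quad.
# q[y][x] is the value of the expression at that pixel.

def ddx_coarse_quad(q):
    """One horizontal difference for the whole quad (assumed behavior):
    every pixel receives the same value."""
    d = q[0][1] - q[0][0]
    return [[d, d], [d, d]]

def ddx_fine_quad(q):
    """Per-row horizontal differences: the top and bottom rows of the
    quad can receive different derivative values."""
    top = q[0][1] - q[0][0]
    bot = q[1][1] - q[1][0]
    return [[top, top], [bot, bot]]

q = [[0.0, 1.0],
     [0.0, 3.0]]
print(ddx_coarse_quad(q))  # [[1.0, 1.0], [1.0, 1.0]]
print(ddx_fine_quad(q))    # [[1.0, 1.0], [3.0, 3.0]]
```

With a quad whose value changes at different rates in each row, the coarse result is uniform across the quad while the fine result differs between rows, which is exactly the extra precision the fine variants buy.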
BTW, the reason this functionality exists is that GPUs internally have to take derivatives of texture coordinates in order to do mipmap selection and anisotropic filtering. Since the hardware needs the capability anyway (you can use any arbitrary expression for texture coordinates in a shader), it was easy enough to also expose it to shader programmers directly.
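As a rough illustration of that mipmap-selection use, here is a simplified sketch (assumed, idealized formula for a square texture with isotropic filtering; real hardware behavior is more involved) of turning texture-coordinate derivatives into a mip level:

```python
import math

def mip_level(ddx_uv, ddy_uv, tex_size):
    """Pick a mip level from UV derivatives (simplified model).

    ddx_uv, ddy_uv: (du, dv) per-pixel derivatives of the texture
    coordinates, as a quad-difference would produce them.
    tex_size: size in texels of a square texture.
    """
    # Scale the UV-space derivatives into texel units.
    dx = (ddx_uv[0] * tex_size, ddx_uv[1] * tex_size)
    dy = (ddy_uv[0] * tex_size, ddy_uv[1] * tex_size)
    # Footprint of one pixel in texel space: the larger gradient wins.
    rho = max(math.hypot(*dx), math.hypot(*dy))
    # One texel per pixel -> level 0; doubling the footprint adds a level.
    return max(0.0, math.log2(rho)) if rho > 0.0 else 0.0

# One screen pixel covers exactly one texel of a 256x256 texture:
print(mip_level((1/256, 0.0), (0.0, 1/256), 256))  # 0.0

# The texture is minified 2x, so each pixel covers two texels:
print(mip_level((2/256, 0.0), (0.0, 2/256), 256))  # 1.0
```

This is why the derivative hardware exists in the first place: the sampler needs these gradients for every texture fetch, so exposing them to shader code was nearly free.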