Wednesday, March 30, 2016

opengl - 3D Camera Rotation


Please forgive me, but I need help. I've been stuck on this for a few weeks now and I'm making no progress; everywhere I look I see a different answer, and everything I try doesn't work. I've had enough tips and advice — now I really just need someone to give me the answer so I can work backwards from it, because I can't understand this.



What has made this subject most confusing is the way everyone uses a different set of conventions or rules, and their answers are based on their own conventions without defining what they are.


So here is the set of conventions I've formed based on what seems to be most common and logical:



  1. Right-hand rule for axes.

  2. Positive Y is up, Positive Z is towards the viewer, Positive X is to the right.

  3. Row Major matrixes, transposed when sent to shaders.



    • Pitch: rotation about the X axis

    • Yaw: rotation about the Y axis


    • Roll: rotation about the Z axis



  4. Rotation order: Roll, Pitch, Yaw (is this correct? can someone check me on this?)

  5. Positive rotation values, looking down from the positive end of an axis, result in clockwise rotation.

  6. Default direction for 0 rotation across all axes is a vector pointing down to negative Y.
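(As an aside for anyone reading along: question 2 below, a rotation(x, y, z) function, can be sketched under roughly these conventions by composing the three axis rotations in the stated roll, pitch, yaw order. This is a minimal Python sketch using the standard right-handed matrices; note that convention 5 is the opposite sign of the usual mathematical one — counter-clockwise when viewed from the positive end — so you may need to negate the angles to match it exactly.)

```python
import math

def rot_x(a):
    # pitch: right-handed rotation about the X axis, angle in radians
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0],
            [0, c, -s],
            [0, s,  c]]

def rot_y(a):
    # yaw: right-handed rotation about the Y axis
    c, s = math.cos(a), math.sin(a)
    return [[ c, 0, s],
            [ 0, 1, 0],
            [-s, 0, c]]

def rot_z(a):
    # roll: right-handed rotation about the Z axis
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0],
            [s,  c, 0],
            [0,  0, 1]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation(pitch, yaw, roll):
    # apply roll first, then pitch, then yaw; for column vectors (v' = R v)
    # that means the matrix product is yaw * pitch * roll
    return mat_mul(rot_y(yaw), mat_mul(rot_x(pitch), rot_z(roll)))
```

For example, a +90° yaw turns a vector pointing down -Z to point down -X in this convention.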


Given those conventions (by all means correct me if they are wrong!), how does one:



  • Write a LookAt function? (lookAt(vector position, vector eyefocus, vector up))


  • Calculate a rotation matrix? (rotation(x, y, z))


I've been trying to answer these two questions myself for over three weeks. I've re-written my LookAt and rotation matrix functions at least 30 times, tested dozens of methods, read material on hundreds of websites and many answered questions, and copied other people's code, and nothing I've made so far has worked. Everything has produced the wrong result — some attempts have produced hilariously bizarre outputs not even close to a correct rotation.


I've been working on this every night with the exception of last night because I was getting so frustrated with the repeated failure that I had to stop and take a break.


Please, just show me what the correct method is so I can work backwards from it and figure out how it works. I'm just not getting the correct answer, and this is driving me a little crazy!


I'm writing in Java, but I'll take code written in any language. Most of my 3D rendering code is actually working quite brilliantly; it's just the maths I can't understand.


UPDATE: SOLVED


Thank you for your help! I now have a working LookAt function that I actually understand, and I couldn't be happier (if anyone would like to see it, by all means ask).


I did try again to create a rotation matrix based on pitch/yaw/roll variables, and it again seemed to fail, so I've decided to stop trying to use Euler angles for the freelook camera, as they seem ill-suited for the role. Instead I'm going to create a quaternion class — I might have better luck going down that path. Otherwise I'll use the pitch/yaw as spherical coordinates and rely on the new working LookAt function for rotation.
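(For anyone curious about the quaternion route mentioned above, the core of it fits in a few lines. This is a generic illustrative sketch in Python — not the poster's actual class — using the common (w, x, y, z) layout and the q·v·q⁻¹ rotation formula.)

```python
import math

def quat_from_axis_angle(axis, angle):
    # unit quaternion (w, x, y, z) for a rotation of `angle` radians
    # about the unit vector `axis`
    s = math.sin(angle / 2)
    return (math.cos(angle / 2), axis[0] * s, axis[1] * s, axis[2] * s)

def quat_mul(a, b):
    # Hamilton product
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def quat_rotate(q, v):
    # rotate vector v by unit quaternion q: q * (0, v) * conjugate(q)
    qv = (0.0, v[0], v[1], v[2])
    qc = (q[0], -q[1], -q[2], -q[3])
    w, x, y, z = quat_mul(quat_mul(q, qv), qc)
    return (x, y, z)

# freelook usage sketch: yaw about the world Y axis
yaw = quat_from_axis_angle((0, 1, 0), math.pi / 2)
```

Composing yaw and pitch quaternions with quat_mul avoids the gimbal issues Euler angles run into.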


If anyone else is facing a similar problem and wants to ask me questions, feel free to.



At least I'm not stuck anymore, thanks for the help!



Answer



What you are looking for can be found in this very good explanation: http://www.songho.ca/opengl/gl_transform.html


But since I found it somewhat confusing without hand-holding, I will try to explain it here.


At this point you need to consider five coordinate systems and how they relate to each other: the window coordinates, the normalized device coordinates, the eye coordinates, the world coordinates, and the object coordinates.


The window coordinates can be seen as the "physical" pixels on your screen. They are the coordinates that the windowing system refers to, and if you operate at your monitor's native resolution, they are actually individual pixels. The window coordinate system is 2D and integer-valued, and is relative to your window: x+ is to the right and y+ is down, with the origin at the top left corner. You encounter these when you, for example, call glViewport.


The second set is the normalized device coordinates. These refer to the space set up by the active viewport. The visible area of the viewport goes from -1 to +1 and thus has the origin in the center. Here x+ is to the right, y+ is up, and z+ is "out" of the scene. This is what you describe in your conventions 1 and 2.


You have no control over how you get from normalized device coordinates to window coordinates; this is done implicitly for you. The only control you have is through glViewport or similar.
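That implicit mapping is simple to state, though: glViewport configures a linear map from the [-1, +1] NDC range onto the viewport rectangle. A sketch (in Python, using GL's bottom-left window-coordinate convention for this stage rather than the windowing system's top-left one):

```python
def ndc_to_window(x_ndc, y_ndc, vp_x, vp_y, vp_w, vp_h):
    # the fixed-function viewport transform: [-1, +1] in each NDC axis
    # maps linearly onto the viewport rectangle (vp_x, vp_y, vp_w, vp_h)
    x_win = vp_x + (x_ndc + 1) / 2 * vp_w
    y_win = vp_y + (y_ndc + 1) / 2 * vp_h
    return (x_win, y_win)
```

For an 800×600 viewport at the origin, the NDC origin (0, 0) lands at the pixel center (400, 300).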


When working with OpenGL, your final result will always be in normalized device coordinates. As a result you need to worry about how to get your scene rendered into these. If you set the projection and model-view matrices to the identity matrix, you can draw directly in these coordinates; this is done, for example, when applying full-screen effects.


Next are the eye coordinates. This is the world as seen from the camera. The origin is therefore at the camera, and the same axis alignments as in the device coordinates apply.



To get from eye coordinates to device coordinates you build the projection matrix. The simplest is the orthographic projection, which just scales the values appropriately. The perspective projection is more complicated and involves simulating perspective.
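To make "scales the values appropriately" concrete, here is a sketch of the standard OpenGL-style orthographic matrix in Python (row-major, matching the conventions in the question): it maps the eye-space box spanned by the six clipping planes onto the [-1, +1] NDC cube.

```python
def ortho(left, right, bottom, top, near, far):
    # maps x in [left, right]   -> [-1, +1]
    #      y in [bottom, top]   -> [-1, +1]
    #      z in [-near, -far]   -> [-1, +1]  (camera looks down -z)
    return [
        [2 / (right - left), 0, 0, -(right + left) / (right - left)],
        [0, 2 / (top - bottom), 0, -(top + bottom) / (top - bottom)],
        [0, 0, -2 / (far - near), -(far + near) / (far - near)],
        [0, 0, 0, 1],
    ]
```

Each row is just a scale plus an offset — no division by depth, which is exactly what distinguishes it from the perspective projection.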


Finally you have the world coordinate system. This is the coordinate system in which your world is defined, and your camera is part of this world. Here it is important to note that the axis orientations are just as you define them. If you prefer z+ as up, that is totally fine.


To get from world coordinates to eye coordinates you define the view matrix. This can be done with something like lookAt. What this matrix does is "move" the world so that the camera is at the origin, looking down the -z axis.


Computing the view matrix is surprisingly simple: you need to undo the camera's transformation. You basically need to build the following matrix:


$$ M = \begin{pmatrix} x_1 & y_1 & z_1 & -p_1 \\ x_2 & y_2 & z_2 & -p_2 \\ x_3 & y_3 & z_3 & -p_3 \\ 0 & 0 & 0 & 1 \end{pmatrix} $$


The x, y and z vectors can be taken directly from the camera. In the case of lookAt you would derive them from the eye, target and up values, like so:


$$ z = \operatorname{normalize}(eye - target) \\ x = \operatorname{normalize}(up \times z) \\ y = z \times x $$


But if you happen to have these values just lying around you can just take them as they are.


Getting p is a bit more tricky: it is not the position in world coordinates but the position in camera coordinates. A simple workaround is to initialize two matrices, one with only x, y and z, and a second one with -eye, and multiply them together. The result is the view matrix.


Here is how this may look in code:



mat4 lookat(vec3 eye, vec3 target, vec3 up)
{
    vec3 zaxis = normalize(eye - target);
    vec3 xaxis = normalize(cross(up, zaxis));
    vec3 yaxis = cross(zaxis, xaxis);

    // rotation part: the camera's basis vectors
    mat4 orientation(
        xaxis[0], yaxis[0], zaxis[0], 0,
        xaxis[1], yaxis[1], zaxis[1], 0,
        xaxis[2], yaxis[2], zaxis[2], 0,
        0,        0,        0,        1);

    // translation part: move the world by -eye
    mat4 translation(
        1,       0,       0,       0,
        0,       1,       0,       0,
        0,       0,       1,       0,
        -eye[0], -eye[1], -eye[2], 1);

    return orientation * translation;
}
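A quick way to sanity-check any lookAt implementation: the view matrix should map the eye position to the origin, and the target should land on the negative z axis. A direct Python translation (written here in the column-vector convention v' = M·v, i.e. the basis vectors sit in the rows and the last column holds the translation — the transpose of the layout above):

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def look_at(eye, target, up):
    # same construction as above: orthonormal camera basis from
    # eye/target/up, then translation -R*eye folded into the last column
    z = normalize([e - t for e, t in zip(eye, target)])
    x = normalize(cross(up, z))
    y = cross(z, x)
    return [
        [x[0], x[1], x[2], -sum(x[i] * eye[i] for i in range(3))],
        [y[0], y[1], y[2], -sum(y[i] * eye[i] for i in range(3))],
        [z[0], z[1], z[2], -sum(z[i] * eye[i] for i in range(3))],
        [0, 0, 0, 1],
    ]
```

If either check fails in your own code, the usual culprits are a swapped eye/target subtraction or a forgotten transpose when uploading to the shader.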




And finally, for the sake of completeness, you also have the object coordinate system. This is the coordinate system in which meshes are stored. With the help of the model matrix, the mesh coordinates are converted into the world coordinate system. In practice, the model and view matrices are combined into the so-called model-view matrix.
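That combination is just a matrix product. A tiny sketch (Python, column-vector convention, with a made-up model matrix that places a mesh at world position (5, 0, 0)):

```python
def mat4_mul(A, B):
    # standard 4x4 row-major matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# hypothetical model matrix: translate the mesh to (5, 0, 0) in the world
model = [[1, 0, 0, 5],
         [0, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]

# identity view = camera sitting at the world origin looking down -Z
view = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1]]

# one matrix takes object coordinates all the way to eye coordinates
model_view = mat4_mul(view, model)
```

A vertex at the mesh's local origin then comes out at (5, 0, 0) in eye space, as expected.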

