I see questions come up quite often that have this underlying issue, but they're all caught up in the particulars of a given feature or tool. Here's an attempt to create a canonical answer we can refer users to when this comes up - with lots of animated examples! :)
Let's say we're making a first-person camera. The basic idea is it should yaw to look left & right, and pitch to look up & down. So we write a bit of code like this (using Unity as an example):
void Update() {
    float speed = lookSpeed * Time.deltaTime;
    // Yaw around the y axis using the player's horizontal input.
    transform.Rotate(0f, Input.GetAxis("Horizontal") * speed, 0f);
    // Pitch around the x axis using the player's vertical input.
    transform.Rotate(-Input.GetAxis("Vertical") * speed, 0f, 0f);
}
or maybe
// Construct a quaternion or a matrix representing incremental camera rotation.
Quaternion rotation = Quaternion.Euler(
    -Input.GetAxis("Vertical") * speed,
    Input.GetAxis("Horizontal") * speed,
    0f);
// Fold this change into the camera's current rotation.
transform.rotation *= rotation;
And it mostly works, but over time the view starts to get crooked. The camera seems to be turning on its roll axis (z) even though we only told it to rotate on the x and y!
This can also happen if we're trying to manipulate an object in front of the camera - say it's a globe we want to turn to look around:
The same problem - after a while the North pole starts to wander away to the left or right. We're giving input on two axes but we're getting this confusing rotation on a third. And it happens whether we apply all our rotations around the object's local axes or the world's global axes.
In many engines you'll also see this in the inspector - rotate the object in the world, and suddenly numbers change on an axis we didn't even touch!
So, is this an engine bug? How do we tell the program we don't want it adding extra rotation?
Does it have something to do with Euler angles? Should I use Quaternions or Rotation Matrices or Basis Vectors instead?
No, this isn't an engine bug or an artifact of a particular rotation representation (those can happen too, but this effect applies to every system that represents rotations, quaternions included).
You've discovered a real fact about how rotation works in three-dimensional space, and it departs from our intuition about other transformations like translation:
When we compose rotations on more than one axis, the result we get isn't just the total/net value we applied to each axis (as we might expect for translation). The order in which we apply the rotations changes the result, as each rotation moves the axes on which the next rotations get applied (if rotating about the object's local axes), or the relationship between the object and the axis (if rotating about the world's axes).
The changing of axis relationships over time can confuse our intuition about what each axis is "supposed" to do. In particular, certain combinations of yaw and pitch rotations give the same result as a roll rotation!
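To make that concrete, here is a minimal sketch using Unity's Quaternion type. The class and method names (RollFromPitchAndYaw, PitchYawPitch, DifferenceFromPureRoll) are made up for illustration; only Quaternion.Euler, Quaternion.Angle and the * operator come from Unity. It composes three local rotations, none of which touches the z axis, and compares the result against a pure roll.

```csharp
using UnityEngine;

// Hypothetical helper, for illustration only.
public static class RollFromPitchAndYaw
{
    // Compose three local-space rotations: pitch down 90°, yaw right 90°, pitch up 90°.
    // Multiplying on the right applies each step about the object's current local axes.
    public static Quaternion PitchYawPitch()
    {
        Quaternion pitchDown = Quaternion.Euler(90f, 0f, 0f);
        Quaternion yawRight  = Quaternion.Euler(0f, 90f, 0f);
        Quaternion pitchUp   = Quaternion.Euler(-90f, 0f, 0f);
        return pitchDown * yawRight * pitchUp;
    }

    // Angle in degrees between the composed rotation and a pure 90° roll about z.
    public static float DifferenceFromPureRoll()
    {
        return Quaternion.Angle(PitchYawPitch(), Quaternion.Euler(0f, 0f, 90f));
    }
}
```

DifferenceFromPureRoll() should come out at approximately zero: we only ever asked for pitch and yaw, yet the net result is a 90° roll.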
You can verify that each step is rotating correctly about the axis we requested: there's no engine glitch or artifact in our notation interfering with or second-guessing our input. The spherical (or hyperspherical / quaternion) nature of rotation just means our transformations "wrap around" onto each other. They may be orthogonal locally, for small rotations, but as they pile up we find they're not globally orthogonal.
This is most dramatic and clear for 90-degree turns like those above, but the wandering axes creep in over many small rotations too, as demonstrated in the question.
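The small-step version can be checked numerically too. The sketch below is an assumption-laden experiment, not part of any engine API: SmallStepDrift and RollAfterCircling are hypothetical names, and the "stirring" input pattern is just one convenient way to feed in lots of tiny pitch and yaw steps. It accumulates them locally, as in the question, then measures how far the object's right vector has tilted away from level.

```csharp
using UnityEngine;

// Hypothetical helper, for illustration only.
public static class SmallStepDrift
{
    // Simulate "stirring" the view in small circles using only local pitch and yaw,
    // then report (in degrees) how far the right vector has tilted out of the horizontal plane.
    public static float RollAfterCircling(int loops, int stepsPerLoop, float degreesPerStep)
    {
        Quaternion rotation = Quaternion.identity;
        for (int i = 0; i < loops * stepsPerLoop; i++)
        {
            float phase = 2f * Mathf.PI * i / stepsPerLoop;
            // Tiny pitch/yaw step; the requested roll is always exactly zero.
            Quaternion step = Quaternion.Euler(
                Mathf.Sin(phase) * degreesPerStep,
                Mathf.Cos(phase) * degreesPerStep,
                0f);
            rotation *= step; // local-space accumulation, as in the question
        }
        // With no roll, the right vector would stay level (y == 0).
        Vector3 right = rotation * Vector3.right;
        return Mathf.Asin(Mathf.Clamp(right.y, -1f, 1f)) * Mathf.Rad2Deg;
    }
}
```

Calling something like SmallStepDrift.RollAfterCircling(10, 60, 1f) should report a clearly nonzero tilt, even though every individual step was a pure pitch/yaw of about one degree.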
So, what do we do about it?
If you already have a pitch-yaw rotation system, one of the quickest ways to eliminate unwanted roll is to change one of the rotations to operate on the global or parent transformation axes instead of the object's local axes. That way you can't get cross-contamination between the two - one axis remains absolutely controlled.
Here's the same sequence of pitch-yaw-pitch that became a roll in the example above, but now we apply our yaw around the global Y axis instead of the object's local Y axis:
So we can fix the first-person camera with the mantra "Pitch Locally, Yaw Globally":
void Update() {
    float speed = lookSpeed * Time.deltaTime;
    transform.Rotate(0f, Input.GetAxis("Horizontal") * speed, 0f, Space.World);
    transform.Rotate(-Input.GetAxis("Vertical") * speed, 0f, 0f, Space.Self);
}
If you're compounding your rotations using multiplication, you'd flip the left/right order of one of the multiplications to get the same effect:
// Yaw happens "over" the current rotation, in global coordinates.
Quaternion yaw = Quaternion.Euler(0f, Input.GetAxis("Horizontal") * speed, 0f);
transform.rotation = yaw * transform.rotation; // yaw on the left.
// Pitch happens "under" the current rotation, in local coordinates.
Quaternion pitch = Quaternion.Euler(-Input.GetAxis("Vertical") * speed, 0f, 0f);
transform.rotation = transform.rotation * pitch; // pitch on the right.
(The specific order will depend on the multiplication conventions in your environment, but left = more global / right = more local is a common choice.)
This is equivalent to storing up the net total yaw and total pitch you want as float variables, then always applying the net result all at once, constructing a single new orientation quaternion or matrix from these angles alone (provided you keep totalPitch clamped):
// Construct a new orientation quaternion or matrix from Euler/Tait-Bryan angles.
var newRotation = Quaternion.Euler(totalPitch, totalYaw, 0f);
// Apply it to our object.
transform.rotation = newRotation;
or equivalently...
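// Form a view vector using total pitch & yaw as spherical coordinates.
// Mathf.Sin/Cos expect radians, while Quaternion.Euler above takes degrees.
float pitchRad = totalPitch * Mathf.Deg2Rad;
float yawRad = totalYaw * Mathf.Deg2Rad;
Vector3 forward = new Vector3(
    Mathf.Cos(pitchRad) * Mathf.Sin(yawRad),
    -Mathf.Sin(pitchRad), // negated: in Unity, positive pitch about x looks down
    Mathf.Cos(pitchRad) * Mathf.Cos(yawRad));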
// Construct an orientation or view matrix pointing in that direction.
var newRotation = Quaternion.LookRotation(forward, new Vector3(0, 1, 0));
// Apply it to our object.
transform.rotation = newRotation;
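Putting the accumulated-angle version together, a complete camera script might look roughly like the sketch below. The component name PitchYawLook, the ClampPitch helper, the default lookSpeed and the 89° pitchLimit are all assumptions chosen for illustration; the sign on the vertical input follows the snippets above.

```csharp
using UnityEngine;

// Hypothetical component name; attach to the camera object.
public class PitchYawLook : MonoBehaviour
{
    public float lookSpeed = 90f;   // degrees per second at full input (assumed default)
    public float pitchLimit = 89f;  // assumed limit; stops just short of straight up/down

    float totalYaw;
    float totalPitch;

    void Update()
    {
        float speed = lookSpeed * Time.deltaTime;

        // Accumulate net angles instead of compounding rotations.
        totalYaw += Input.GetAxis("Horizontal") * speed;
        totalPitch -= Input.GetAxis("Vertical") * speed;
        totalPitch = ClampPitch(totalPitch, pitchLimit);

        // Rebuild the orientation from scratch every frame; roll is always zero.
        transform.rotation = Quaternion.Euler(totalPitch, totalYaw, 0f);
    }

    // Keep pitch within [-limit, +limit] degrees so the view can't flip over the top.
    public static float ClampPitch(float pitch, float limit)
    {
        return Mathf.Clamp(pitch, -limit, limit);
    }
}
```

Because the orientation is rebuilt from two numbers each frame, there is no accumulated state in which roll could hide.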
Using this global/local split, the rotations don't have a chance to compound and influence each other, because they're applied to independent sets of axes.
The same idea can help if it's an object in the world that we want to rotate. For an example like the globe, we'd often want to invert it and apply our yaw locally (so it always spins around its poles) and pitch globally (so it tips toward/away from our view, rather than toward/away from Australia, wherever it's pointing).
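For the globe, the inverted split might be sketched like this. GlobeSpinner and Turn are hypothetical names, the input signs are a guess you may want to flip, and the code assumes the viewer's left-right axis lines up with the world x axis (true for a camera looking along world z).

```csharp
using UnityEngine;

// Hypothetical component name; attach to the globe.
public class GlobeSpinner : MonoBehaviour
{
    public float turnSpeed = 90f; // degrees per second at full input (assumed default)

    void Update()
    {
        float speed = turnSpeed * Time.deltaTime;
        transform.rotation = Turn(
            transform.rotation,
            -Input.GetAxis("Horizontal") * speed,
            Input.GetAxis("Vertical") * speed);
    }

    // Yaw on the right = local (spin about the globe's own poles).
    // Pitch on the left = global (tip about the world x axis).
    public static Quaternion Turn(Quaternion current, float yawDegrees, float pitchDegrees)
    {
        Quaternion yaw = Quaternion.Euler(0f, yawDegrees, 0f);
        Quaternion pitch = Quaternion.Euler(pitchDegrees, 0f, 0f);
        return pitch * current * yaw;
    }
}
```

With this arrangement the pole axis can only tip toward or away from the viewer, so the North pole should no longer wander off to the left or right.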
Limitations
This global/local hybrid strategy isn't always the right fix. For example, in a game with 3D flight/swimming, you might want to be able to point straight up / straight down and still have full control. But with this setup you'll hit gimbal lock - your yaw axis (global up) becomes parallel to your roll axis (local forward), and you have no way to look left or right without twisting.
What you can do instead in cases like this is to use pure local rotations like we started with in the question above (so your controls feel the same no matter where you're looking), which will initially let some roll creep in - but then we correct for it.
For example, we can use local rotations to update our "forward" vector, then use that forward vector together with a reference "up" vector to construct our final orientation (using, for example, Unity's Quaternion.LookRotation method, or manually constructing an orthonormal matrix from these vectors). By controlling the up vector, we control the roll or twist.
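A minimal sketch of that correction, assuming we want to snap the roll away in one go: keep the forward direction the local rotations produced, and rebuild the orientation against a reference up vector. RollCorrection and RemoveRoll are hypothetical names; Quaternion.LookRotation is the Unity call mentioned above.

```csharp
using UnityEngine;

// Hypothetical helper, for illustration only.
public static class RollCorrection
{
    // Keep the direction we're looking, but re-derive the twist from referenceUp.
    // Note: the result is poorly defined when forward is parallel to referenceUp.
    public static Quaternion RemoveRoll(Quaternion orientation, Vector3 referenceUp)
    {
        Vector3 forward = orientation * Vector3.forward;
        return Quaternion.LookRotation(forward, referenceUp);
    }
}
```

In a camera script this could be applied after the local rotations, e.g. transform.rotation = RollCorrection.RemoveRoll(transform.rotation, Vector3.up);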
For the flight/swimming example, you'll want to apply these corrections gradually over time. If it's too abrupt, the view can lurch in a distracting way. Instead, you can use the player's current up vector and hint it toward the vertical, frame-by-frame, until their view levels out. Applying this during a turn can sometimes be less nauseating than twisting the camera while the player's controls are idle.
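One way to sketch the gradual version is to nudge the current up vector toward the vertical by a limited angle each frame, then rebuild the orientation from the unchanged forward vector and that hinted up. GradualLeveling, LevelTowards and the maxDegrees parameter are assumptions for illustration; Vector3.RotateTowards and Quaternion.LookRotation are existing Unity calls.

```csharp
using UnityEngine;

// Hypothetical helper, for illustration only.
public static class GradualLeveling
{
    // Move the up hint at most maxDegrees toward targetUp, keeping forward as-is.
    public static Quaternion LevelTowards(Quaternion orientation, Vector3 targetUp, float maxDegrees)
    {
        Vector3 forward = orientation * Vector3.forward;
        Vector3 currentUp = orientation * Vector3.up;
        Vector3 hintedUp = Vector3.RotateTowards(
            currentUp, targetUp, maxDegrees * Mathf.Deg2Rad, 0f);
        return Quaternion.LookRotation(forward, hintedUp);
    }
}
```

Called once per frame with something like maxDegrees = levelSpeed * Time.deltaTime (levelSpeed being a tuning value of your own), this eases the horizon back to level rather than snapping it. You could also scale maxDegrees by the player's turn input so the correction happens mostly while they're already turning.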