Friday, April 26, 2019

exporting bind and keyframe bone poses from blender to use in OpenGL


EDIT: I decided to reformulate the question in much simpler terms to see if someone can give me a hand with this.


Basically, I'm exporting meshes, skeletons and actions from blender into an engine of sorts that I'm working on. But I'm getting the animations wrong. I can tell the basic motion paths are being followed but there's always an axis of translation or rotation which is wrong. I think the problem is most likely not in my engine code (OpenGL-based) but rather in either my misunderstanding of some part of the theory behind skeletal animation / skinning or the way I am exporting the appropriate joint matrices from blender in my exporter script.


I'll explain the theory, the engine animation system and my blender export script, hoping someone might catch the error in either or all of these.


The theory: (I'm using column-major ordering since that's what I use in the engine cause it's OpenGL-based)



  • Assume I have a mesh made up of a single vertex v, along with a transformation matrix M which takes the vertex v from the mesh's local space to world space. That is, if I was to render the mesh without a skeleton, the final position would be gl_Position = ProjectionMatrix * M * v.

  • Now assume I have a skeleton with a single joint j in bind / rest pose. j is actually another matrix. A transform from j's local space to its parent space which I'll denote Bj. if j was part of a joint hierarchy in the skeleton, Bj would take from j space to j-1 space (that is to its parent space). However, in this example j is the only joint, so Bj takes from j space to world space, like M does for v.

  • Now further assume I have a a set of frames, each with a second transform Cj, which works the same as Bj only that for a different, arbitrary spatial configuration of join j. Cj still takes vertices from j space to world space but j is rotated and/or translated and/or scaled.



Given the above, in order to skin vertex v at keyframe n. I need to:



  1. take v from world space to joint j space

  2. modify j (while v stays fixed in j space and is thus taken along in the transformation)

  3. take v back from the modified j space to world space


So the mathematical implementation of the above would be: v' = Cj * Bj^-1 * v. Actually, I have one doubt here.. I said the mesh to which v belongs has a transform M which takes from model space to world space. And I've also read in a couple textbooks that it needs to be transformed from model space to joint space. But I also said in 1 that v needs to be transformed from world to joint space. So basically I'm not sure if I need to do v' = Cj * Bj^-1 * v or v' = Cj * Bj^-1 * M * v. Right now my implementation multiples v' by M and not v. But I've tried changing this and it just screws things up in a different way cause there's something else wrong.



  • Finally, If we wanted to skin a vertex to a joint j1 which in turn is a child of a joint j0, Bj1 would be Bj0 * Bj1 and Cj1 would be Cj0 * Cj1. But Since skinning is defined as v' = Cj * Bj^-1 * v , Bj1^-1 would be the reverse concatenation of the inverses making up the original product. That is, v' = Cj0 * Cj1 * Bj1^-1 * Bj0^-1 * v



Now on to the implementation (Blender side):


Assume the following mesh made up of 1 cube, whose vertices are bound to a single joint in a single-joint skeleton:


enter image description here


Assume also there's a 60-frame, 3-keyframe animation at 60 fps. The animation essentially is:



  • keyframe 0: the joint is in bind / rest pose (the way you see it in the image).

  • keyframe 30: the joint translates up (+z in blender) some amount and at the same time rotates pi/4 rad clockwise.

  • keyframe 59: the joint goes back to the same configuration it was in keyframe 0.


My first source of confusion on the blender side is its coordinate system (as opposed to OpenGL's default) and the different matrices accessible through the python api.



Right now, this is what my export script does about translating blender's coordinate system to OpenGL's standard system:


    # World transform: Blender -> OpenGL
worldTransform = Matrix().Identity(4)
worldTransform *= Matrix.Scale(-1, 4, (0,0,1))
worldTransform *= Matrix.Rotation(radians(90), 4, "X")

# Mesh (local) transform matrix
file.write('Mesh Transform:\n')
localTransform = mesh.matrix_local.copy()
localTransform = worldTransform * localTransform

for col in localTransform.col:
file.write('{:9f} {:9f} {:9f} {:9f}\n'.format(col[0], col[1], col[2], col[3]))
file.write('\n')

So if you will, my "world" matrix is basically the act of changing blenders coordinate system to the default GL one with +y up, +x right and -z into the viewing volume. Then I also premultiply (in the sense that it's done by the time we reach the engine, not in the sense of post or pre in terms of matrix multiplication order) the mesh matrix M so that I don't need to multiply it again once per draw call in the engine.


About the possible matrices to extract from Blender joints (bones in Blender parlance), I'm doing the following:




  • For joint bind poses:


    def DFSJointTraversal(file, skeleton, jointList):


    for joint in jointList:
    bindPoseJoint = skeleton.data.bones[joint.name]
    bindPoseTransform = bindPoseJoint.matrix_local.inverted()

    file.write('Joint ' + joint.name + ' Transform {\n')
    translationV = bindPoseTransform.to_translation()
    rotationQ = bindPoseTransform.to_3x3().to_quaternion()
    scaleV = bindPoseTransform.to_scale()
    file.write('T {:9f} {:9f} {:9f}\n'.format(translationV[0], translationV[1], translationV[2]))

    file.write('Q {:9f} {:9f} {:9f} {:9f}\n'.format(rotationQ[1], rotationQ[2], rotationQ[3], rotationQ[0]))
    file.write('S {:9f} {:9f} {:9f}\n'.format(scaleV[0], scaleV[1], scaleV[2]))

    DFSJointTraversal(file, skeleton, joint.children)
    file.write('}\n')


Note that I'm actually grabbing the inverse of what I think is the bind pose transform Bj. This is so I don't need to invert it in the engine. Also note I went for matrix_local, assuming this is Bj. The other option is plain "matrix", which as far as I can tell is the same only that not homogeneous.





  • For joint current / keyframe poses:


    for kfIndex in keyframes:
    bpy.context.scene.frame_set(kfIndex)
    file.write('keyframe: {:d}\n'.format(int(kfIndex)))
    for i in range(0, len(skeleton.data.bones)):
    file.write('joint: {:d}\n'.format(i))

    currentPoseJoint = skeleton.pose.bones[i]
    currentPoseTransform = currentPoseJoint.matrix


    translationV = currentPoseTransform.to_translation()
    rotationQ = currentPoseTransform.to_3x3().to_quaternion()
    scaleV = currentPoseTransform.to_scale()
    file.write('T {:9f} {:9f} {:9f}\n'.format(translationV[0], translationV[1], translationV[2]))
    file.write('Q {:9f} {:9f} {:9f} {:9f}\n'.format(rotationQ[1], rotationQ[2], rotationQ[3], rotationQ[0]))
    file.write('S {:9f} {:9f} {:9f}\n'.format(scaleV[0], scaleV[1], scaleV[2]))

    file.write('\n')



Note that here I go for skeleton.pose.bones instead of data.bones and that I have a choice of 3 matrices: matrix, matrix_basis and matrix_channel. From the descriptions in the python API docs I'm not super clear which one I should choose, though I think it's the plain matrix. Also note I do not invert the matrix in this case.


The implementation (Engine / OpenGL side):


My animation subsystem does the following on each update (I'm omitting parts of the update loop where it's figured out which objects need update and time is hardcoded here for simplicity):


static double time = 0;
time = fmod((time + elapsedTime),1.);
uint16_t LERPKeyframeNumber = 60 * time;
uint16_t lkeyframeNumber = 0;
uint16_t lkeyframeIndex = 0;
uint16_t rkeyframeNumber = 0;
uint16_t rkeyframeIndex = 0;


for (int i = 0; i < aClip.keyframesCount; i++) {
uint16_t keyframeNumber = aClip.keyframes[i].number;
if (keyframeNumber <= LERPKeyframeNumber) {
lkeyframeIndex = i;
lkeyframeNumber = keyframeNumber;
}
else {
rkeyframeIndex = i;
rkeyframeNumber = keyframeNumber;

break;
}
}

double lTime = lkeyframeNumber / 60.;
double rTime = rkeyframeNumber / 60.;
double blendFactor = (time - lTime) / (rTime - lTime);

GLKMatrix4 bindPosePalette[aSkeleton.jointsCount];
GLKMatrix4 currentPosePalette[aSkeleton.jointsCount];


for (int i = 0; i < aSkeleton.jointsCount; i++) {
F3DETQSType& lPose = aClip.keyframes[lkeyframeIndex].skeletonPose.joints[i];
F3DETQSType& rPose = aClip.keyframes[rkeyframeIndex].skeletonPose.joints[i];

GLKVector3 LERPTranslation = GLKVector3Lerp(lPose.t, rPose.t, blendFactor);
GLKQuaternion SLERPRotation = GLKQuaternionSlerp(lPose.q, rPose.q, blendFactor);
GLKVector3 LERPScaling = GLKVector3Lerp(lPose.s, rPose.s, blendFactor);

GLKMatrix4 currentTransform = GLKMatrix4MakeWithQuaternion(SLERPRotation);

currentTransform = GLKMatrix4TranslateWithVector3(currentTransform, LERPTranslation);
currentTransform = GLKMatrix4ScaleWithVector3(currentTransform, LERPScaling);

GLKMatrix4 inverseBindTransform = GLKMatrix4MakeWithQuaternion(aSkeleton.joints[i].inverseBindTransform.q);
inverseBindTransform = GLKMatrix4TranslateWithVector3(inverseBindTransform, aSkeleton.joints[i].inverseBindTransform.t);
inverseBindTransform = GLKMatrix4ScaleWithVector3(inverseBindTransform, aSkeleton.joints[i].inverseBindTransform.s);

if (aSkeleton.joints[i].parentIndex == -1) {
bindPosePalette[i] = inverseBindTransform;
currentPosePalette[i] = currentTransform;

}

else {
bindPosePalette[i] = GLKMatrix4Multiply(inverseBindTransform, bindPosePalette[aSkeleton.joints[i].parentIndex]);
currentPosePalette[i] = GLKMatrix4Multiply(currentPosePalette[aSkeleton.joints[i].parentIndex], currentTransform);
}

aSkeleton.skinningPalette[i] = GLKMatrix4Multiply(currentPosePalette[i], bindPosePalette[i]);
}


Finally, this is my vertex shader:


#version 100

uniform mat4 modelMatrix;
uniform mat3 normalMatrix;
uniform mat4 projectionMatrix;
uniform mat4 skinningPalette[6];
uniform lowp float skinningEnabled;



attribute vec4 position;
attribute vec3 normal;
attribute vec2 tCoordinates;
attribute vec4 jointsWeights;
attribute vec4 jointsIndices;

varying highp vec2 tCoordinatesVarying;
varying highp float lIntensity;

void main()

{
tCoordinatesVarying = tCoordinates;

vec4 skinnedVertexPosition = vec4(0.);
for (int i = 0; i < 4; i++) {
skinnedVertexPosition += jointsWeights[i] * skinningPalette[int(jointsIndices[i])] * position;
}

vec4 skinnedNormal = vec4(0.);
for (int i = 0; i < 4; i++) {

skinnedNormal += jointsWeights[i] * skinningPalette[int(jointsIndices[i])] * vec4(normal, 0.);
}


vec4 finalPosition = mix(position, skinnedVertexPosition, skinningEnabled);
vec4 finalNormal = mix(vec4(normal, 0.), skinnedNormal, skinningEnabled);

vec3 eyeNormal = normalize(normalMatrix * finalNormal.xyz);
vec3 lightPosition = vec3(0., 0., 2.);
lIntensity = max(0.0, dot(eyeNormal, normalize(lightPosition)));




gl_Position = projectionMatrix * modelMatrix * finalPosition;
}

The result is that the animation displays wrong in terms of orientation. That is, instead of bobbing up and down it bobs in and out (along what I think is the Z axis according to my transform in the export clip). And the rotation angle is counterclockwise instead of clockwise.


If I try with a more than one joint, then it's almost as if the second joint rotates in it's own different coordinate space and does not follow 100% its parent's transform. Which I assume it should from my animation subsystem which I assume in turn follows the theory I explained for the case of more than one joint.


Any thoughts?




No comments:

Post a Comment

Simple past, Present perfect Past perfect

Can you tell me which form of the following sentences is the correct one please? Imagine two friends discussing the gym... I was in a good s...