This post is about bridging Faceshift, a markerless facial expression tracking application, with the 2D animation engine Live2D. The two programs communicate over the network through TCP/IP: Faceshift streams its capture data on port 33433 by default. Faceshift also comes with a sample client application that demonstrates how to use its generic C++ parser to read the binary stream. On the client side, we will use the DirectX sample demo of Live2D, since both environments fit the Windows C++ ecosystem.
Although the stream also provides data on the mesh and vertex structure, we only need the 3D transformation of the main facial features. This data is handled by the fsTrackingData class, found in the fsbinarystream header of the sample project. Here is what we will use:
Head rotation as a pseudo-quaternion (a simple x, y, z, w structure).
Head translation as a 3D vector (x and y origin at the center of the viewport).
Eye pitch and yaw as two floats (in degrees).
On the Live2D side, we need to tweak the update function of the sample's LAppModel class to inject the streamed transformation. Note that the Live2D SDK allows us to smooth raw data with its addToParamFloat method. This method seems to apply a tween-like smoothing to the input data, which we can control further with a weight. All that remains is to convert the quaternion rotation to Euler angles, and we have our range set up for the head.
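The quaternion-to-Euler step can be sketched as follows. This is a minimal sketch under stated assumptions, not the bridge's actual code: `Quat` and `EulerDeg` are hypothetical stand-ins for the streamed rotation structure, and the axis convention (rotation composed as Z, then Y, then X) is assumed because the post does not state which one Faceshift uses. The angles may need to be swapped or negated for a given rig.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical stand-in for the x, y, z, w rotation structure of the stream.
struct Quat { float x, y, z, w; };

// Rotation angles around each axis, in degrees.
struct EulerDeg { float aboutX, aboutY, aboutZ; };

// Converts a quaternion to Euler angles, assuming the common Z-Y-X
// composition order. The convention is an assumption, not taken from
// the Faceshift documentation.
EulerDeg quatToEulerDeg(Quat q) {
    // Normalize first so that slightly noisy input still gives valid angles.
    const float n = std::sqrt(q.x * q.x + q.y * q.y + q.z * q.z + q.w * q.w);
    if (n <= 0.0f) {
        return {0.0f, 0.0f, 0.0f};  // degenerate input: report no rotation
    }
    q.x /= n; q.y /= n; q.z /= n; q.w /= n;

    const float radToDeg = 180.0f / 3.14159265358979f;

    // Clamp the sine term to [-1, 1] to keep asin defined near the poles.
    float sinY = 2.0f * (q.w * q.y - q.z * q.x);
    sinY = std::max(-1.0f, std::min(1.0f, sinY));

    EulerDeg e;
    e.aboutX = std::atan2(2.0f * (q.w * q.x + q.y * q.z),
                          1.0f - 2.0f * (q.x * q.x + q.y * q.y)) * radToDeg;
    e.aboutY = std::asin(sinY) * radToDeg;
    e.aboutZ = std::atan2(2.0f * (q.w * q.z + q.x * q.y),
                          1.0f - 2.0f * (q.y * q.y + q.z * q.z)) * radToDeg;
    return e;
}
```

The resulting angles would then be passed to the model parameters inside the update function, after scaling them to whatever range the model expects.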
If we want to follow eye blinking and mouth movement, we also have to read the marker position buffer, which is streamed in the same frame.
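One possible way to turn marker positions into a blink value is to measure the distance between an upper and a lower eyelid marker and normalize it by the distance measured when the eye is fully open. The sketch below only illustrates that idea: `Vec3`, `markerDistance`, `eyeOpenness` and the calibrated `openDistance` are hypothetical names, and which entries of the marker buffer correspond to the eyelids is not covered here.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical 3D marker position.
struct Vec3 { float x, y, z; };

// Euclidean distance between two markers.
float markerDistance(const Vec3& a, const Vec3& b) {
    const float dx = a.x - b.x;
    const float dy = a.y - b.y;
    const float dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Returns 0 for a closed eye and 1 for a fully open eye.
// openDistance is the lid-to-lid distance measured during calibration
// with the eye wide open (an assumed calibration step).
float eyeOpenness(const Vec3& upperLid, const Vec3& lowerLid, float openDistance) {
    if (openDistance <= 0.0f) {
        return 0.0f;  // no valid calibration: treat the eye as closed
    }
    const float t = markerDistance(upperLid, lowerLid) / openDistance;
    return std::max(0.0f, std::min(1.0f, t));
}
```

The same ratio approach could be applied to a pair of lip markers to drive a mouth-open parameter.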
The Live2D SDK documentation gives no detail about the range or unit of the parameters added to the animation manager, so calibrating this bridge relies on trial and error until the scene works. In a real-case scenario, however, we would rely on a game-AI-like engine to manage the streamed data together with a default, natural background animation, based on state.
From a real-world production point of view, we may wonder whether the training setup required by Faceshift could be bypassed by using an OpenCV facial expression tracking algorithm. Such an approach does not rely on a per-user profile to capture facial expressions, so it should be the option to choose for any digital signage installation that maps the face of a random audience member.
For more details on the bridge implementation, please use our contact page.