Multi-Layer Skeleton Fitting

The body model is a multi-layer kinematic skeleton. Layer 1 consists of 10 bone segments connected by 11 joints, with 24 degrees of freedom in total. Layer 2 refines the layer-1 structure with more detailed arm and leg representations. Each arm and leg segment introduces two new degrees of freedom: the elbow or knee angle, and the rotation around the layer-1 segment. The elbow and knee angles are fully determined by the cosine theorem (law of cosines) at each time step. The volumetric extents of the arms and legs are modeled by point samples taken from cylindrical volumes around the layer-2 arm and leg segments; these are henceforth called cylinder samples.
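As an illustration of how the cosine theorem fixes an elbow or knee angle, the following minimal sketch computes the interior joint angle from the two layer-2 segment lengths and the distance between the outer endpoints (for example shoulder and hand). The function name, parameter names, and the clamping of noisy distances are assumptions made for this example and are not taken from the system itself.

```python
import math


def joint_angle(upper_len, lower_len, end_dist):
    """Interior angle (radians) at an elbow or knee via the law of cosines.

    upper_len: length of the upper segment (e.g. shoulder to elbow)
    lower_len: length of the lower segment (e.g. elbow to hand)
    end_dist:  distance between the outer endpoints (e.g. shoulder to hand)
    All names are hypothetical; they only illustrate the geometry.
    """
    if upper_len <= 0 or lower_len <= 0:
        raise ValueError("segment lengths must be positive")
    # Assumption: clamp the measured distance to the reachable range so that
    # tracking noise cannot produce an invalid triangle.
    d = min(max(end_dist, abs(upper_len - lower_len)), upper_len + lower_len)
    cos_angle = (upper_len**2 + lower_len**2 - d**2) / (2.0 * upper_len * lower_len)
    # Guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_angle)))
```

A fully stretched limb (end distance equal to the sum of the segment lengths) gives an angle of pi, and the angle decreases as the endpoints move closer together.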


Figure: 1: Skeleton layer 1 (l), Skeleton layer 2 (r)

Fitting the skeleton to the motion data is a three-step process. Starting with the visual hull at time step t, the computed 3D feature locations at time t, and the model parameters at time t-1, the algorithm recovers the joint parameters at time t (see Fig. 2).

In the first step, the orientation of the upper body is determined. The torso voxels are interpreted as a 3D data set for which the eigenvectors of the covariance matrix are computed. These vectors, the principal components, are mutually orthogonal and oriented along the directions of maximal variation. They correspond to the spine direction, the direction of the link between the shoulders, and a third direction orthogonal to the first two. In this way, the torso orientation and the shoulder positions can be found.
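The principal component computation can be sketched with NumPy as follows. This is a minimal example, not the system's implementation: the function name and the input layout (an N x 3 array of voxel centers) are assumptions, and the mapping of the sorted axes to spine, shoulder link, and third direction presumes that the torso varies most along the spine and second most across the shoulders.

```python
import numpy as np


def torso_axes(voxel_centers):
    """Return (centroid, axes) for an (N, 3) array of torso voxel centers.

    axes is a 3x3 array whose rows are unit principal directions, sorted by
    decreasing variance. Assumed interpretation: row 0 ~ spine direction,
    row 1 ~ shoulder link direction, row 2 ~ remaining orthogonal direction.
    """
    pts = np.asarray(voxel_centers, dtype=float)
    centroid = pts.mean(axis=0)
    # 3x3 covariance matrix of the centered voxel positions.
    cov = np.cov(pts - centroid, rowvar=False)
    # eigh is meant for symmetric matrices; it returns eigenvalues in
    # ascending order and the eigenvectors as columns.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    return centroid, eigvecs[:, order].T
```

The sign of each eigenvector is arbitrary, so a real system would need an additional rule (for example continuity with the previous time step) to keep the axes from flipping between frames.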


Figure: 3: Principal components of torso voxels (l), skeleton fitted to principal components (r)

In the next step the layer-1 skeleton is fitted to the tracked 3D hand, head and knee locations. The lengths of the layer-1 arm and leg segments are appropriately rescaled.

In the third step, the layer-2 elbow and knee angles are determined by the cosine theorem. The four remaining rotational degrees of freedom around the layer-1 segments are determined by volume registration. Within a symmetric search neighborhood, the value of a match score function is sampled. This function reports a numerical score for the quality of fit between the cylinder samples and the visual hull voxels.
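The volume registration for one arm can be sketched as below. Everything specific in this example is an assumption made for illustration: the visual hull is represented as a set of integer voxel indices, the cylinder samples are placed on the segment axis and on one ring around it, the match score is the fraction of samples inside occupied voxels, and the search neighborhood is centered on the rotation angle of the previous time step. The actual score function and sampling pattern of the system may differ.

```python
import numpy as np


def perpendicular_basis(axis):
    """Two unit vectors orthogonal to `axis` and to each other."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    helper = np.array([1.0, 0.0, 0.0]) if abs(axis[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(axis, helper)
    u /= np.linalg.norm(u)
    return u, np.cross(axis, u)


def cylinder_samples(p0, p1, radius, n_len=8, n_ang=6):
    """Samples on the axis and on one ring of a cylinder around segment p0-p1."""
    p0, p1 = np.asarray(p0, dtype=float), np.asarray(p1, dtype=float)
    u, v = perpendicular_basis(p1 - p0)
    pts = []
    for t in np.linspace(0.0, 1.0, n_len):
        center = p0 + t * (p1 - p0)
        pts.append(center)
        for a in np.linspace(0.0, 2.0 * np.pi, n_ang, endpoint=False):
            pts.append(center + radius * (np.cos(a) * u + np.sin(a) * v))
    return np.array(pts)


def elbow_position(shoulder, hand, upper_len, lower_len, phi):
    """Elbow location for rotation angle phi about the shoulder-hand axis."""
    shoulder, hand = np.asarray(shoulder, dtype=float), np.asarray(hand, dtype=float)
    axis = hand - shoulder
    d = np.linalg.norm(axis)
    axis = axis / d
    # Offset along the axis and radial distance of the elbow (law of cosines).
    a = (upper_len**2 - lower_len**2 + d**2) / (2.0 * d)
    h = np.sqrt(max(upper_len**2 - a**2, 0.0))
    u, v = perpendicular_basis(axis)
    return shoulder + a * axis + h * (np.cos(phi) * u + np.sin(phi) * v)


def arm_samples(shoulder, hand, upper_len, lower_len, phi, radius):
    """Cylinder samples of both layer-2 arm segments for a given rotation."""
    elbow = elbow_position(shoulder, hand, upper_len, lower_len, phi)
    return np.vstack([cylinder_samples(shoulder, elbow, radius),
                      cylinder_samples(elbow, hand, radius)])


def match_score(samples, hull, voxel_size):
    """Fraction of samples that fall into occupied visual hull voxels."""
    idx = np.floor(np.asarray(samples) / voxel_size).astype(int).tolist()
    return sum(tuple(i) in hull for i in idx) / len(idx)


def best_rotation(shoulder, hand, upper_len, lower_len, hull, voxel_size,
                  phi_prev, half_width=np.pi / 4, steps=21, radius=0.05):
    """Sample the score in a symmetric window around phi_prev; keep the best."""
    best_phi, best_score = phi_prev, -1.0
    for phi in np.linspace(phi_prev - half_width, phi_prev + half_width, steps):
        samples = arm_samples(shoulder, hand, upper_len, lower_len, phi, radius)
        score = match_score(samples, hull, voxel_size)
        if score > best_score:
            best_phi, best_score = phi, score
    return best_phi, best_score
```

Because the score is evaluated on continuous sample positions rather than on voxel indices, the best angle is not restricted to the voxel grid, which is consistent with fitting at sub-voxel resolution.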

Results


Figure: 4.1: 3 example frames with skeleton layer 3 fitted to the current motion state of the person. The red spheres mark the 3D locations of tracked body parts.


Figure: 4.2: Example visual hull with fitted layer-2 skeleton rendered into a virtual model of the acquisition room.

In contrast to the earlier system version, the detection of upper body orientation and shoulder position enables the recovery of less constrained motion sequences. Fitting the layer-2 skeleton by means of a quality-of-fit overlap function allows the system to determine arm and leg positions at sub-voxel resolution, so the approach can handle a coarsely quantized visual hull. Detailed descriptions of the experimental results can be found in the Publications section.

The following sequence shows the results of our model fitting with PCA-based torso orientation computation applied to an example motion sequence: movie_flight. Further results can be seen in movie2.


Figure: 2: Model fitting algorithm overview