FACEngine ID Pose Estimation Accuracy

 
A process unique to Animetrics, Avatar Generation produces a 3-dimensional model of a subject: highly complex mathematical algorithms analyze 2-dimensional images and, through a process of smooth deformations, generate realistic, accurate, fully structured and defined 3-dimensional avatars. The generated 3-dimensional model can be easily reoriented, vastly improving identification accuracy with off-pose data. Animetrics tests and validates the accuracy of this avatar generation and pose normalization by employing synthetic images generated through the manual landmarking of facial imagery, which is rendered at a predetermined pose. We then analyze the projected feature coordinates of the synthetic image, feed those values back into the avatar generation module, and compare the known original values with the newly generated ones.
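The Animetrics renderer and camera model are not described here, so the following is a minimal NumPy sketch (all names hypothetical) of the validation round trip described above: rotate manually landmarked 3-dimensional feature points to a predetermined pose, project them into the image plane, and compare the pose recovered by the module under test against the known original.

    import numpy as np

    def rotation_matrix(rx, ry, rz):
        """Compose a rotation from Euler angles in degrees about X, Y, Z."""
        rx, ry, rz = np.radians([rx, ry, rz])
        Rx = np.array([[1, 0, 0],
                       [0, np.cos(rx), -np.sin(rx)],
                       [0, np.sin(rx),  np.cos(rx)]])
        Ry = np.array([[ np.cos(ry), 0, np.sin(ry)],
                       [0,           1, 0         ],
                       [-np.sin(ry), 0, np.cos(ry)]])
        Rz = np.array([[np.cos(rz), -np.sin(rz), 0],
                       [np.sin(rz),  np.cos(rz), 0],
                       [0,           0,          1]])
        return Rz @ Ry @ Rx

    def project(points_3d, pose_deg, focal=1000.0):
        """Rotate 3-D landmarks to a known pose and project to the image plane.

        A simple pinhole projection stands in for the actual renderer."""
        rotated = points_3d @ rotation_matrix(*pose_deg).T
        depth = rotated[:, 2] + focal        # camera placed focal units from the head
        return focal * rotated[:, :2] / depth[:, None]

    # Hypothetical round trip: project manually landmarked fiducials at a
    # predetermined pose, feed the 2-D coordinates to the avatar generation
    # module under test, and compare its recovered pose with the known one.
    landmarks_3d = np.array([[  0.0, 30.0, 20.0],   # illustrative fiducial points
                             [-30.0, 10.0,  5.0],
                             [ 30.0, 10.0,  5.0]])
    known_pose = (5.0, 20.0, 0.0)                   # degrees about X, Y, Z
    projected_2d = project(landmarks_3d, known_pose)
    # estimated_pose = estimate_pose_from_2d(projected_2d)  # module under test
    # error = np.subtract(estimated_pose, known_pose)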
 
Figure 1 below shows results comparing a subset of the anatomically defined fiducial marks superimposed over the manually defined features. Green points are taken from the single merged geometry, projected into the projective plane. Red points show the manually defined features. We have studied the rigid motion estimation by generating simulated databases from the FRGC with known rigid motions of up to 30 degrees in the X, Y, and Z directions. The figure below shows the accuracy of the resulting rigid motion estimates. We observe on the order of 10^-2 degree bias in Y and Z rotations throughout the full range of simulated rotations, and sub-degree bias in X rotations, with under 2 degrees standard deviation in X and under 0.25 degrees standard deviation in Y and Z.
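As an illustration of how the bias and standard deviation figures above could be tabulated, here is a short sketch over simulated per-image rotation errors (hypothetical data; the real errors come from comparing estimated poses against the known rigid motions applied to the FRGC-derived synthetic databases):

    import numpy as np

    rng = np.random.default_rng(0)
    # Simulated per-image rotation errors in degrees (columns: X, Y, Z),
    # standing in for (estimated pose - known simulated rotation).
    errors = rng.normal(loc=[0.5, 0.01, 0.01], scale=[2.0, 0.25, 0.25],
                        size=(1000, 3))

    bias = errors.mean(axis=0)           # systematic offset per axis
    std = errors.std(axis=0, ddof=1)     # spread per axis
    for axis, b, s in zip("XYZ", bias, std):
        print(f"{axis}: bias {b:+.3f} deg, std {s:.3f} deg")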
 

 


Figure 1: Standard deviation error ellipses shown to scale on the exploded heads, depicting the RMSE of the generated avatar geometries. An avatar was generated for each image. The red ellipses show the RMSEs of the fiducial points in the image plane, compared to the manual features.

 
Evaluation

From an initial dataset of 1190 images, 1174 images were successfully detected with 2DIFA. Of the 1174 images used for validation of the Fine Pose Estimation, only 5 produced a failed result (a rotation error of >22° on any axis), and an additional 28 produced degraded but acceptable results (a rotation error of >10° and ≤22° on any axis), for an overall success rate of 97.2%. The success rate against the entire dataset, including failures to properly localize the head, is 95.9%.
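The reported rates follow directly from the counts above; a minimal sketch of the arithmetic, using the thresholds just defined:

    detected, total = 1174, 1190
    failed, degraded = 5, 28

    successes = detected - failed - degraded   # 1141 images within +/-10 deg on all axes
    print(f"{successes / detected:.1%}")       # 97.2% of detected images
    print(f"{successes / total:.1%}")          # 95.9% of the entire dataset,
                                               # counting head-localization failures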
 
Four of the five fine pose estimation failures occurred in the mobile camera dataset, and all five failures were in Y-axis estimation. As the histogram in Figure 2 below shows, there was a far wider distribution of Y-axis rotations in the mobile data than in the FRGC datasets.




Figure 2: Histogram of baseline Y-rotation (yaw) distribution for three of the datasets used in the Rigid Motion Estimation Evaluation

Individual Datasets

Controlled Dataset

The controlled data is high-resolution, carefully collected imagery which, as expected, produced highly precise results. There were zero failures to detect the face in this set, and rotation estimation for every image on all three axes was within the acceptable range of +/-10°.

Table 1: Detailed Results for Rotation Estimation

Dataset          | Failure / Degraded | Head Detection Success | Fine Pose Estimation Success | X Std. Dev. ° | Y Std. Dev. ° | Z Std. Dev. °
Controlled (608) | 0                  | 100%                   | 100%                         | 4.38          | 1.71          | 0.66

Figure 3 below shows the distribution of errors in the rotation estimation of the Controlled data. The percentages for each bin are shown below the bin labels for each axis. Bins to the right of 0 are positive rotations, and left of 0 are negative rotations.



Figure 3: Histogram of rotation errors along each axis, X (pitch), Y (yaw), and Z (roll).

 
Uncontrolled Dataset

The uncontrolled data produced 16 failures to detect the face out of 508 total images. This is mitigated by the fact that 13 of the 16 failures were on heads whose scale was within +/-10% of the limit of the designed scale range of the Coarse Pose Estimator.



Figure 4: Two images of the same subject in the same hallway from the FRGC Uncontrolled data. The image on the right produces a successful detection whereas the image on the left produces a failure. Based on the image width of 2272 pixels, the head in the left image is well below the 6.25% minimum scale of 142 pixels.
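The scale rule implied by the caption reduces to a simple check. The sketch below assumes the Coarse Pose Estimator requires the head to span at least 6.25% of the image width (the figure quoted in the caption); the detector's exact rule is not documented here.

    def head_within_scale(head_width_px, image_width_px, min_scale=0.0625):
        """True if the head is large enough for the Coarse Pose Estimator.

        min_scale = 6.25% of image width, per the Figure 4 caption; the
        exact rule used by the detector is an assumption here."""
        return head_width_px >= min_scale * image_width_px

    # For the 2272-pixel-wide FRGC Uncontrolled images the threshold is
    # 0.0625 * 2272 = 142 pixels.
    print(head_within_scale(100, 2272))   # False: well below the 142-pixel minimum
    print(head_within_scale(180, 2272))   # True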

 
The fine pose estimation of the FRGC uncontrolled data produced positive results.  Excluding the 16 head detection failures, there was only one instance where the automatic rigid motion estimation produced a result more than 22 degrees from the ground truth (in this case, on the Y-axis); that particular image also produced threshold errors (>10°) in the X and Z axes.  Of the 491 remaining images, there were 15 instances of variation from the ground truth greater than 10° (but <22°), each on a single axis (6 in X, 8 in Y, and 1 in Z).  Excluding the failed result, the 491 successes and partial errors produced standard deviations in line with the FRGC controlled data.
 
Table 2: Uncontrolled Results for Rotation Estimation
 

Dataset                | Failure | Partially Degraded | Head Det. Rate | Fine Pose Estimation Success / Nonfailure | X Std. Dev. ° | Y Std. Dev. ° | Z Std. Dev. °
Uncontrolled (492/508) | 1       | 15                 | 96.9%          | 96.5% / 99.8%                             | 3.12          | 2.38          | 1.19

Figure 5 below shows the distribution of errors in the rotation estimation of the Uncontrolled data.  The percentages for each bin are shown below the bin labels for each axis.  Bins to the right of 0 are positive rotations, and left of 0 are negative rotations.  Error percentages listed above each histogram are for variances that fell between 10° and 22° from the ground truth result.



Figure 5: Histogram of rotation errors along each axis, X (pitch), Y (yaw), and Z (roll).  The percentages for each bin are shown below the bin labels for each axis.  Bins to the right of 0 are positive rotations, and left of 0 are negative rotations.

 
Mobile Camera Data

A mobile camera was used to capture additional data. The camera had a resolution of 640 x 480 and sent imagery as an MPEG-4 stream which was later rendered out to individual JPEG images. This data produced a larger number of errors, including 4 instances in Y where the rotation error was greater than 22 degrees.  The data (although all collected indoors) was captured without concern for ambient lighting conditions, open windows, etc., and with only limited concern for head rotation, and is in that regard more “uncontrolled” than the FRGC Uncontrolled dataset.  For example, Figure 6 below shows the increased pose variation present in the mobile data:


Figure 6: An example of the mobile data used in the evaluation.  In this particular image, the ground truth data provided rotations of X: -1.40°, Y: -29.23°, Z: -0.68°, compared to automatically detected rotations of X: -0.85°, Y: -22.04°, Z: 0.16°.  The image on the right displays a pose-normalized rendering of the face in the original image on the left.
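The capture pipeline described above (a 640 x 480 MPEG-4 stream rendered out to individual JPEG frames) can be reproduced with a few lines of OpenCV; the file name below is hypothetical, and the original tooling is not specified.

    import cv2

    # Render an MPEG-4 stream out to individual JPEG images, one per frame.
    capture = cv2.VideoCapture("mobile_capture.mp4")   # hypothetical file name
    frame_index = 0
    while True:
        ok, frame = capture.read()
        if not ok:                                     # end of stream
            break
        cv2.imwrite(f"frame_{frame_index:05d}.jpg", frame)
        frame_index += 1
    capture.release()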

 
In spite of the additional failures, X and Z rotations remained stable.  There was a fairly significant increase in rotation error in the Y-axis, as shown in the table below.  However, nearly 95% of the images analyzed in this set returned at least partially successful results.
 
Table 3: Mobile Camera Results for Rotation Estimation
 

Dataset        | Failure  | Partially Degraded | Head Detection Success | Fine Pose Estimation Success / Nonfailure | X Std. Dev. ° | Y Std. Dev. ° | Z Std. Dev. °
Mobile (74/74) | 4 (5.4%) | 13 (17.6%)         | 100%                   | 94.6% / 82.4%                             | 3.55          | 4.65          | 2.10

Figure 7 below shows the distribution of errors in the rotation estimation of the mobile data.  The percentages for each bin are shown below the bin labels for each axis.  Bins to the right of 0 are positive rotations, and left of 0 are negative rotations.  Error percentages listed above each histogram are for variances that fell between 10° and 22° from the ground truth result.



Figure 7: Histogram of rotation errors along each axis, X (pitch), Y (yaw), and Z (roll) for the mobile data. The percentages for each bin are shown below the bin labels for each axis. Bins to the right of 0 are positive rotations, and left of 0 are negative rotations.




Copyright ©2006 - ANIMETRICS, INC - All Rights Reserved