Benchmark Results
Facial Recognition
Facial recognition accuracy (NIST Standards)
The table below shows Face SDK quality metrics based on the National Institute of Standards and Technology (NIST) Standards:
| NIST Face Recognition Vendor Test (FRVT) 1:1 | Score |
|---|---|
| VISA* True Acceptance Rate (@FAR 1E-6) | 99.70% |
| Mugshot* True Acceptance Rate (@FAR 1E-5) | 99.76% |
| Border* True Acceptance Rate (@FAR 1E-6) | 99.75% |
| WILD* True Acceptance Rate (@FAR 1E-4) | 96.94% |
* – description of the image types from the table:
- VISA – Full Frontal image type. The images have a size of 252x300 pixels. The mean interocular distance (IOD) is 69 pixels.
- Mugshot – Full Frontal image type. The images are of variable sizes. The mean IOD is 113 pixels.
- Border – The images are taken with a camera, which is oriented by an attendant toward a cooperating subject. This is done under time constraints, so there are roll, pitch and yaw angle variations. Background illumination is sometimes strong, so the face is under-exposed. There is some perspective distortion due to close-range imaging. Some faces are partially cropped.
- Wild – The images include many photojournalism-style images. Resolution varies very widely. The images are very unconstrained, with wide yaw and pitch pose variation. Faces can be occluded, for example by hair or hands.
Facial recognition accuracy on extended LFW dataset with Wild images
For this test, we used an extended LFW dataset (the LFW dataset plus our internal dataset). The set of mismatched pairs was enlarged and LFW errors were fixed to get accurate measurements at low FAR.
| FAR | 9.300 TAR (%) | 12.30 TAR (%) | 12.50 TAR (%) | 12.100 TAR (%) | 12.1000 TAR (%) | 
|---|---|---|---|---|---|
| 1e-4 | 98.7 | 98.9 | 99.2 | 99.4 | 99.5 | 
| 1e-5 | 98.0 | 98.1 | 98.7 | 99.1 | 99.4 | 
| 1e-6 | 96.6 | 96.7 | 97.9 | 98.5 | 99.3 | 
| 1e-7 | 93.2 | 94.3 | 96.3 | 97.5 | 98.9 | 
| 1e-8 | 32.2 | 88.0 | 89.5 | 94.1 | 97.7 | 
| 1e-9 | 10.5 | 82.0 | 80.9 | 84.9 | 93.9 | 
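The TAR values at a fixed FAR in the table above can be reproduced from the similarity scores of matched and mismatched pairs. The following is a minimal sketch, not the procedure used for these measurements. It assumes that a higher score means a better match. The function name `tar_at_far` and the score arrays are illustrative and are not part of Face SDK.

```python
import numpy as np

def tar_at_far(genuine_scores, impostor_scores, far):
    """Return (TAR, threshold) at the requested false acceptance rate.

    Assumes a higher score means a better match. A pair is accepted
    when its score is strictly greater than the threshold.
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    # Sort mismatched-pair scores from highest to lowest.
    impostor = np.sort(np.asarray(impostor_scores, dtype=float))[::-1]
    # Number of mismatched pairs that may be accepted without exceeding FAR.
    allowed = int(np.floor(far * impostor.size))
    if allowed >= impostor.size:
        threshold = -np.inf
    else:
        # The highest mismatched-pair score that must still be rejected.
        threshold = impostor[allowed]
    tar = float(np.mean(genuine > threshold))
    return tar, threshold
```

With fewer than 1/FAR mismatched pairs, `allowed` is 0 and the threshold is simply the highest mismatched score, so the estimate becomes unreliable. This is why a large set of mismatched pairs is needed for measurements at low FAR.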
ROC-curve

Template extraction speed using CPU and GPU
Desktop
| Method | GPU (NVIDIA GTX 1070) | CPU (Core i5-9400 4.0 GHz) |
|---|---|---|
| 12v1000 | 47 ms | 442 ms |
| 9v300 | 10 ms | 292 ms |
| 12v100 | 8 ms | 49 ms |
| 12v50 | 6 ms | 21 ms |
| 12v30 | 5 ms | 12 ms |
Mobile
| Method | CPU (Qualcomm Snapdragon 845) |
|---|---|
| 12v1000 | 7968 ms |
| 9v300 | 1960 ms |
| 12v100 | 801 ms |
| 12v50 | 270 ms |
| 12v30 | 150 ms |
Note: The speed test was performed using Google Pixel 3.
Facial recognition speed for CPU
- Intel Xeon E5-2683 v4 GHz*
- CPU Core i7 4.5 GHz*
Intel Xeon E5-2683 v4*
| Recognition method | Template generation (ms) | Accelerated matching 1:N, N = 10^4 (ms) | Accelerated matching 1:N, N = 10^6 (ms) | Accelerated matching 1:N, N = 10^7 (ms) | Matching 1:1 (ms) |
|---|---|---|---|---|---|
| 6.7 | 40 (45**) | 0.25 | 12.1 | 126 | 0.04 |
| 7.7 | 170 (180**) | 0.25 | 12.1 | 126 | 0.04 |
| 8.7 | 20 (20**) | 0.25 | 12.1 | 126 | 0.04 |
| 9.30 | 55 | 0.18 | 12.0 | 117 | 0.04 |
| 9.300 | 402 | 0.18 | 12.0 | 117 | 0.04 |
| 9.1000 | 1092 | 0.18 | 12.0 | 117 | 0.04 |
| 9.30mask | 38 | 0.18 | 12.0 | 117 | 0.04 |
| 9.300mask | 263 | 0.18 | 12.0 | 117 | 0.04 |
| 9.1000mask | 469 | 0.18 | 12.0 | 117 | 0.04 |
| 10.30 | 44 | 0.18 | 12.0 | 117 | 0.04 |
| 10.100 | 71 | 0.18 | 12.0 | 117 | 0.04 |
| 10.1000 | 1061 | 0.18 | 12.0 | 117 | 0.04 |
| 11.1000 | 1396 | 0.22 | 15.0 | 151 | 0.04 |
| 12.30 | 23*** | 0.20 | 15.0 | 120 | 0.04 |
| 12.50 | 37*** | 0.20 | 15.0 | 120 | 0.04 |
| 12.100 | 79*** | 0.20 | 15.0 | 120 | 0.04 |
| 12.1000 | 647*** | 0.40 | 20.0 | 170 | 0.04 |
Core i7 4.5 GHz*
| Recognition method | Template generation (ms) | Accelerated matching 1:N, N = 10^4 (ms) | Accelerated matching 1:N, N = 10^6 (ms) | Accelerated matching 1:N, N = 10^7 (ms) | Matching 1:1 (ms) |
|---|---|---|---|---|---|
| 6.7 | 40 (45**) | 0.25 | 12.1 | 126 | 0.04 |
| 7.7 | 170 (180**) | 0.25 | 12.1 | 126 | 0.04 |
| 8.7 | 20 (20**) | 0.25 | 12.1 | 126 | 0.04 |
| 9.30 | 30 | 0.18 | 12.0 | 117 | 0.04 |
| 9.300 | 260 (125***) | 0.18 | 12.0 | 117 | 0.04 |
| 9.1000 | 730 (305***) | 0.18 | 12.0 | 117 | 0.04 |
| 9.30mask | 20 | 0.18 | 12.0 | 117 | 0.04 |
| 9.300mask | 160 (79***) | 0.18 | 12.0 | 117 | 0.04 |
| 9.1000mask | 290 (144***) | 0.18 | 12.0 | 117 | 0.04 |
| 10.30 | 24 (16***) | 0.18 | 12.0 | 117 | 0.04 |
| 10.100 | 40 (24***) | 0.18 | 12.0 | 117 | 0.04 |
| 10.1000 | 690 (355***) | 0.18 | 12.0 | 117 | 0.04 |
| 11.1000 | 865 (425***) | 0.22 | 15.0 | 151 | 0.04 |
| 12.30 | 10*** | 0.20 | 15.0 | 120 | 0.04 |
| 12.50 | 18*** | 0.20 | 15.0 | 120 | 0.04 |
| 12.100 | 41*** | 0.20 | 15.0 | 120 | 0.04 |
| 12.1000 | 412*** | 0.40 | 20.0 | 170 | 0.04 |
* – characteristics specified in this table are given for a single-core CPU.
** – template extraction time when processing_less_memory_consumption was set to true in the FacerecService.createRecognizer call for creating a Recognizer.
*** – template extraction time using the AVX2 instruction set (see Facial Recognition).
Note:
- Accelerated search time is given for k=1. For larger values of k, the time increases, up to the search time without acceleration.
- Accelerated search is implemented only for the recognition methods 6.5, 6.6, 6.7, 7.3, 7.6, 7.7, 8.6, 8.7, 9.30, 9.300, 9.1000, 10.30, 10.100, 10.1000, 11.1000, 12.30, 12.50, 12.100, 12.1000.
- To achieve this speed, the templates in the index must be arranged in the order of their creation (by using the Recognizer.processing or Recognizer.loadTemplate method).
- To achieve higher speed, use GPU (see GPU Usage).
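As a rough illustration of how the figures above combine, the latency of one identification request can be estimated as template generation time plus accelerated 1:N search time. The sketch below copies a few rows from the first (Intel Xeon) table. It assumes the three database sizes are 10^4, 10^6 and 10^7 templates, and it ignores face detection time. The names `TEMPLATE_MS`, `SEARCH_MS` and `identification_ms` are illustrative only.

```python
# Template generation time (ms), single core with AVX2, from the Intel Xeon table.
TEMPLATE_MS = {"12.30": 23, "12.100": 79, "12.1000": 647}

# Accelerated 1:N search time (ms) for k=1, keyed by database size N.
SEARCH_MS = {
    "12.30": {10**4: 0.20, 10**6: 15.0, 10**7: 120},
    "12.100": {10**4: 0.20, 10**6: 15.0, 10**7: 120},
    "12.1000": {10**4: 0.40, 10**6: 20.0, 10**7: 170},
}

def identification_ms(method, n_templates):
    """Estimate one 1:N request: template generation plus accelerated search."""
    return TEMPLATE_MS[method] + SEARCH_MS[method][n_templates]

print(identification_ms("12.100", 10**6))   # 79 ms + 15.0 ms
print(identification_ms("12.1000", 10**7))  # 647 ms + 170 ms
```

For these methods, template generation dominates the total even at N = 10^7.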
Memory Characteristics
| Recognition method | Serialized template size (Bytes) | Template size in RAM (Bytes) | Memory consumption* (MB) | 
|---|---|---|---|
| 6.7 | 536 | 636 | 105 (85**) | 
| 7.7 | 536 | 636 | 195 (163**) | 
| 8.7 | 536 | 636 | 52 (40**) | 
| 9.30 | 280 | 380 | 155 | 
| 9.300 | 280 | 380 | 210 | 
| 9.1000 | 280 | 380 | 290 | 
| 10.30 | 280 | 380 | 160 | 
| 10.100 | 280 | 380 | 180 | 
| 10.1000 | 280 | 380 | 270 | 
| 11.1000 | 296 | 396 | 480 | 
| 12.30 | 230 | 330 | 160 | 
| 12.50 | 261 | 361 | 170 | 
| 12.100 | 337 | 437 | 180 | 
| 12.1000 | 400 | 500 | 440 | 
* – the amount of memory consumed doesn't depend on the number of the Recognizer objects created by this method
** – memory consumption when processing_less_memory_consumption was set to true in the FacerecService.createRecognizer call for Recognizer creation
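To size a template database, the "Template size in RAM" column can be multiplied by the number of templates. The sketch below is only an estimate. It copies a few rows from the table, counts 1 MB as 10^6 bytes, and ignores any search index overhead as well as the per-method memory consumption from the last column. The names `TEMPLATE_RAM_BYTES` and `templates_ram_mb` are illustrative.

```python
# "Template size in RAM (Bytes)" for a few recognition methods, from the table above.
TEMPLATE_RAM_BYTES = {
    "9.300": 380,
    "11.1000": 396,
    "12.30": 330,
    "12.100": 437,
    "12.1000": 500,
}

def templates_ram_mb(method, n_templates):
    """Estimate the RAM (in MB, 1 MB = 10**6 bytes) needed to hold n_templates."""
    return TEMPLATE_RAM_BYTES[method] * n_templates / 10**6

print(templates_ram_mb("12.100", 10**6))   # one million templates of method 12.100
print(templates_ram_mb("12.1000", 10**7))  # ten million templates of method 12.1000
```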
Face Detection
Face Detection speed
- CPU Intel Xeon E5-2683 v4 (Single-Core)
- GPU (NVIDIA GTX 1080 Ti)
- CPU Core i7 4.5 GHz (Single-Core)
CPU Intel Xeon E5-2683 v4 (Single-Core), capture time (ms)
| Configuration file | 640x480, 1 face | 640x480, 4 faces | 1280x720, 1 face | 1280x720, 4 faces | 1920x1080, 1 face | 1920x1080, 4 faces |
|---|---|---|---|---|---|---|
| common_capturer_blf_fda_auto.xml | 9-43 | 17-52 | 17-50 | 17-55 | 33-69 | 25-57 | 
| common_capturer_blf_fda_back.xml | 43 | 52 | 50 | 55 | 67 | 57 | 
| common_capturer_blf_fda_front.xml | 9 | 17 | 17 | 17 | 33 | 25 | 
| common_capturer_refa_fda_a.xml | 1001 | 996 | 777 | 792 | 864 | 858 | 
| common_capturer_uld_fda.xml (min_size=150) | 19 | 28 | 24 | 29 | 36 | 30 | 
| common_capturer_uld_fda.xml (min_size=90) | 88 | 97 | 93 | 100 | 109 | 102 | 
| common_capturer_uld_fda.xml (min_size=50) | 377 | 385 | 375 | 385 | 402 | 397 | 
| safety_city_q1.xml | 99 | 107 | 100 | 112 | 110 | 117 | 
| safety_city_q2.xml | 376 | 383 | 379 | 385 | 399 | 393 | 
| remote_identification_q1.xml | 1000 | 1001 | 773 | 791 | 863 | 858 | 
| remote_identification_q2.xml | 44 | 51 | 50 | 55 | 66 | 56 | 
| access_control_system_one_face_q1.xml | 44 | 52 | 50 | 56 | 66 | 56 | 
| access_control_system_one_face_q2.xml | 43 | 52 | 50 | 55 | 67 | 55 | 
| access_control_system_one_face_q3.xml | 88 | 96 | 94 | 98 | 107 | 102 | 
| access_control_system_several_faces_q1.xml | 1004 | 1007 | 773 | 788 | 863 | 858 | 
| access_control_system_several_faces_q2.xml | 87 | 95 | 92 | 98 | 106 | 101 | 
| common_capturer4_fda.xml | 23 | 35 | 49 | 108 | 94 | 143 | 
| common_capturer4_fda_with_angles.xml | 481 | 697 | 414 | 768 | 312 | 410 | 
| common_capturer4_mesh.xml | 27 | 60 | 57 | 142 | 102 | 175 | 
| common_capturer4_mesh_with_angles.xml | 492 | 732 | 424 | 788 | 321 | 443 | 
| common_capturer4_fda_singleface.xml | 31 | - | 60 | - | 112 | - | 
| common_capturer4_mesh_singleface.xml | 34 | - | 68 | - | 120 | - | 
GPU (NVIDIA GTX 1080 Ti), capture time (ms)
| Configuration file | 640x480, 1 face | 640x480, 4 faces | 1280x720, 1 face | 1280x720, 4 faces | 1920x1080, 1 face | 1920x1080, 4 faces |
|---|---|---|---|---|---|---|
| common_capturer_blf_fda_auto.xml | 4-5 | 10-12 | 6-8 | 13-14 | 17-20 | 24-27 | 
| common_capturer_blf_fda_back.xml | 5 | 12 | 8 | 14 | 20 | 27 | 
| common_capturer_blf_fda_front.xml | 4 | 10 | 6 | 13 | 17 | 24 | 
| common_capturer_refa_fda_a.xml | 236 | 240 | 229 | 235 | 170 | 176 | 
| common_capturer_uld_fda.xml (min_size=150) | 4 | 10 | 5 | 11 | 13 | 20 | 
| common_capturer_uld_fda.xml (min_size=90) | 14 | 21 | 17 | 23 | 26 | 34 | 
| common_capturer_uld_fda.xml (min_size=50) | 27 | 34 | 27 | 35 | 47 | 49 | 
CPU Core i7 4.5 GHz (Single-Core), capture time (ms)
| Configuration file | 640x480, 1 face | 640x480, 4 faces | 1280x720, 1 face | 1280x720, 4 faces | 1920x1080, 1 face | 1920x1080, 4 faces |
|---|---|---|---|---|---|---|
| common_capturer4_fda.xml | 13 | 25 | 34 | 49 | 81 | 103 | 
| common_capturer4_fda_with_angles.xml | 282 | 387 | 260 | 356 | 273 | 370 | 
| common_capturer4_mesh.xml | 18 | 47 | 39 | 72 | 87 | 735 | 
| common_capturer4_mesh_with_angles.xml | 291 | 415 | 268 | 383 | 281 | 398 | 
| common_capturer_blf_fda_auto.xml | 6-30 | 12-36 | 8-32 | 14-38 | 19-44 | 26-51 | 
| common_capturer_blf_fda_back.xml | 30 | 36 | 32 | 38 | 44 | 51 | 
| common_capturer_blf_fda_front.xml | 6 | 12 | 8 | 14 | 19 | 26 | 
| common_capturer_refa_fda_a.xml | 644 | 650 | 512 | 518 | 580 | 586 | 
| common_capturer_uld_fda.xml (min_size=150) | 12 | 18 | 13 | 19 | 21 | 28 | 
| common_capturer_uld_fda.xml (min_size=90) | 58 | 70 | 60 | 73 | 77 | 91 | 
| common_capturer_uld_fda.xml (min_size=50) | 253 | 272 | 253 | 273 | 281 | 302 | 
| common_capturer4_fda_singleface.xml | 16 | - | 51 | - | 123 | - | 
| common_capturer4_mesh_singleface.xml | 23 | - | 58 | - | 129 | - | 
Note: Actual capture time may vary depending on the image content.
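Because capture time depends on image content, it can be useful to repeat the measurement on your own images. The helper below is a generic timing sketch and is not how the figures above were obtained. `measure_ms` is an illustrative name, and `func` stands for any callable that runs detection on one image, such as a hypothetical wrapper around your capturer call.

```python
import statistics
import time

def measure_ms(func, inputs, repeats=5):
    """Time func(x) for every input; return (min, median, max) in milliseconds."""
    timings = []
    for x in inputs:
        for _ in range(repeats):
            start = time.perf_counter()
            func(x)
            timings.append((time.perf_counter() - start) * 1000.0)
    return min(timings), statistics.median(timings), max(timings)
```

Reporting the minimum and maximum alongside the median mirrors the ranges given for `common_capturer_blf_fda_auto.xml`, where the time depends on the image.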
Capturer configuration files
| File | Detector | Set of points | Angles (roll/yaw/pitch) | Description and use case |
|---|---|---|---|---|
| common_capturer4_fda.xml | lbf | fda | [-30;30][-60;60][-60;60] | Frontal face detector. | 
| common_capturer4_fda_with_angles.xml | lbf | fda | [-90;90][-60;60][-60;60] | Frontal face detector. Adapted for a wide range of head rotation angles. | 
| common_capturer4_fda_with_angles_noise.xml | lbf | fda | [-90;90][-60;60][-60;60] | Frontal face detector. Adapted for a wide range of head rotation angles. Suitable for images with high noise level. | 
| common_capturer4_fda_singleface.xml | lbf | fda | [-30;30][-60;60][-60;60] | Only one frontal face is detected. | 
| common_capturer4_fda_singleface_with_angles.xml | lbf | fda | [-90;90][-75;75][-60;60] | Only one frontal face is detected. The detector is adapted for a wide range of head rotation angles. | 
| common_capturer4_fda_singleface_with_angles_noise.xml | lbf | fda | [-90;90][-75;75][-60;60] | Only one frontal face is detected. The detector is adapted for a wide range of head rotation angles. Suitable for images with high noise level. | 
| common_capturer4_lbf.xml | lbf | doublelbf | [-30;30][-60;60][-60;60] | Frontal face detector. | 
| common_capturer4_lbf_singleface.xml | lbf | doublelbf | [-30;30][-60;60][-60;60] | Only one frontal face is detected. | 
| common_capturer4_mesh.xml | lbf | mesh | [-30;30][-60;60][-60;60] | Frontal face detector that allows you to get a 3D face mask. | 
| common_capturer4_mesh_with_angles.xml | lbf | mesh | [-90;90][-60;60][-60;60] | Frontal face detector adapted for a wide range of head rotation angles that allows you to get a 3D face mask. | 
| common_capturer4_mesh_with_angles_noise.xml | lbf | mesh | [-90;90][-60;60][-60;60] | Frontal face detector adapted for a wide range of head rotation angles and suitable for images with high noise level. The detector allows you to get a 3D face mask. | 
| common_capturer4_mesh_singleface.xml | lbf | mesh | [-30;30][-60;60][-60;60] | Only one frontal face is detected. The detector allows you to get a 3D face mask. | 
| common_capturer4_mesh_singleface_with_angles.xml | lbf | mesh | [-90;90][-75;75][-60;60] | Only one frontal face is detected. The detector is adapted for a wide range of head rotation angles and allows you to get a 3D face mask. | 
| common_capturer4_mesh_singleface_with_angles_noise.xml | lbf | mesh | [-90;90][-75;75][-60;60] | Only one frontal face is detected. The detector is adapted for a wide range of head rotation angles, suitable for images with high noise level and allows you to get a 3D face mask. | 
| common_capturer_blf_fda_front.xml | blf | fda | [-70;70][-90;90][-70;70] | Detection of large face images (the face should take up most of the frame size). Suitable for detection of masked faces. | 
| common_capturer_blf_fda_back.xml | blf | fda | [-70;70][-90;90][-70;70] | Detection of several faces or small face images. Suitable for detection of masked faces. | 
| common_capturer_blf_fda_auto.xml | blf | fda | [-70;70][-90;90][-70;70] | Detection of large and small face images (the parameters `resolution_width` and `min_face_size` should be specified in the configuration file). Suitable for detection of masked faces. | 
| common_capturer_refa_fda_a.xml | refa | fda | [-70;70][-90;90][-70;70] | Face detector recommended for use in expert systems. The detector provides face detection with the largest coverage of rotation angles and maximum quality (including masked faces). | 
| common_capturer_uld_fda.xml | uld | fda | [-70;70][-90;90][-70;70] | Detection of large and small face images. Suitable for detection of masked faces. |
| common_video_capturer_fda.xml | lbf | fda | [-30;30][-60;60][-60;60] | Frontal face video tracker (RGB only). | 
| common_video_capturer_lbf.xml | lbf | singlelbf | [-30;30][-60;60][-60;60] | Frontal face video tracker (RGB only). | 
| common_video_capturer_mesh.xml | lbf | mesh | [-30;30][-60;60][-60;60] | Frontal face video tracker (RGB only) that allows you to get a 3D face mask. | 
| fda_tracker_capturer.xml | lbf | fda | [-30;30][-60;60][-60;60] | Frontal face video tracker. | 
| fda_tracker_capturer.w.xml | lbf | fda | [-30;30][-60;60][-60;60] | Frontal face video tracker that can be used in case of insufficient lighting. Note that false detections can occur a bit more often in this case. | 
| fda_tracker_capturer_mesh.xml | lbf | fda | [-30;30][-60;60][-60;60] | Frontal face video tracker that allows you to get a 3D face mask. | 
| fda_tracker_capturer_fake_detector.xml | lbf | fda | [-30;30][-60;60][-60;60] | Detection speed is higher because only fitter is used (no detector). Suitable only if a face takes up most of the image size. | 
| fda_tracker_capturer_blf.xml | blf | fda | [-30;30][-60;60][-60;60] | Frontal face video tracker. Suitable for detection of masked faces. | 
| fda_tracker_capturer_refa_a.xml | refa | fda | [-70;70][-90;90][-70;70] | Frontal face video tracker. Recommended for use in expert systems. The tracker provides face detection with the largest coverage of rotation angles and maximum quality (including masked faces). | 
| fda_tracker_capturer_uld_fda.xml | uld | fda | [-70;70][-90;90][-70;70] | Frontal face video tracker that can be used to detect faces of different sizes. Suitable for detection of masked faces. |
| manual_capturer_fda.xml | lbf | fda | [-30;30][-60;60][-60;60] | Eye points should be manually specified. The remaining points are calculated based on the eye points. | 
| manual_capturer_mesh.xml | lbf | mesh | [-30;30][-60;60][-60;60] | Eye points should be manually specified. The remaining points are calculated based on the eye points, which allows you to get a 3D face mask. |
| video_worker_fdatracker_refa_fda.xml | refa | fda | [-70;70][-90;90][-70;70] | Face detector recommended for use in expert systems. The detector provides face detection with the largest coverage of rotation angles and maximum quality (including masked faces). | 
Processing Blocks
Face detection speed
- CPU Intel Xeon E5-2683 v4 (Single-Core)
| Type | Modification | Version | Detection time (ms), 640x480, 4 faces | Detection time (ms), 1280x720, 4 faces | Detection time (ms), 1920x1080, 4 faces |
|---|---|---|---|---|---|
| FACE_DETECTOR | uld (min_size=150) | 1 | 7 | 8 | 9 |
| FACE_DETECTOR | uld (min_size=100) | 1 | 37 | 40 | 43 |
| FACE_DETECTOR | uld (min_size=50) | 1 | 193 | 192 | 203 |
| FACE_DETECTOR | ssyv | 1 | 153 | 153 | 157 |
| FACE_DETECTOR | ssyv | 2 | 47 | 49 | 54 |
| FACE_DETECTOR | ssyv | 3 | 96 | 99 | 104 |
| FACE_DETECTOR | blf_front | 1 | 4 | 6 | 11 |
| FACE_DETECTOR | blf_back | 1 | 12 | 14 | 20 |
| HUMAN_BODY_DETECTOR | ssyv | 1 | 238 | 241 | 244 |
| OBJECT_DETECTOR | ssyx | 1 | 2027 | 2023 | 2028 |
Face estimation speed
- CPU Intel Xeon E5-2683 v4 (Single-Core)
| Type | Modification | Version | Estimation time (ms) |
|---|---|---|---|
| LIVENESS_ESTIMATOR | v4 | 1 | 733 |
| LIVENESS_ESTIMATOR | | 1 | 41 |
| EMOTION_ESTIMATOR | heavy | 1 | 29 |
| AGE_ESTIMATOR | heavy | 1 | 2 |
| AGE_ESTIMATOR | heavy | 2 | 6 |
| AGE_ESTIMATOR | light | 1 | 2 |
| AGE_ESTIMATOR | light | 2 | 6 |
| GENDER_ESTIMATOR | heavy | 1 | 2 |
| GENDER_ESTIMATOR | heavy | 2 | 6 |
| GENDER_ESTIMATOR | light | 1 | 2 |
| GENDER_ESTIMATOR | light | 2 | 6 |
| MASK_ESTIMATOR | light | 1 | 2 |
| MASK_ESTIMATOR | light | 2 | 2 |
| QUALITY_ASSESSMENT_ESTIMATOR | assessment | 1 | 65 |
| QUALITY_ASSESSMENT_ESTIMATOR | estimation | 1 | 97 |
| HUMAN_POSE_ESTIMATOR | heavy | 1 | 195 |
| FACE_FITTER | tddfa_faster | 1 | 3 |
| FACE_FITTER | tddfa | 1 | 7 |
| FACE_FITTER | mesh | 1 | 7 |
Estimation accuracy of processing blocks
- QUALITY_ASSESSMENT_ESTIMATOR
- LIVENESS_ESTIMATOR
- AGE_ESTIMATOR
- GENDER_ESTIMATOR
QUALITY_ASSESSMENT_ESTIMATOR
| Modification | Percentage of worst samples discarded by total_score | FNMR | FPR |
|---|---|---|---|
| assessment | 1 | 0.0097 | 1.98E-07 |
| assessment | 5 | 0.0087 | 2.00E-07 |
| assessment | 10 | 0.0081 | 2.11E-07 |
| estimation | 1 | 0.0079 | 1.93E-07 |
| estimation | 5 | 0.0062 | 1.93E-07 |
| estimation | 10 | 0.0059 | 1.97E-07 |
LIVENESS_ESTIMATOR
| Modification | BPCER | APCER |
|---|---|---|
| | 0.19 | 0.27 |
| v4 | 0.028 | 0.133 |
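The liveness figures above use the presentation attack detection metrics from ISO/IEC 30107-3. APCER is the share of attack presentations accepted as bona fide, and BPCER is the share of bona fide presentations rejected as attacks. A minimal sketch of both rates, with an illustrative function name and toy labels that are not taken from the benchmark:

```python
def apcer_bpcer(is_attack, predicted_attack):
    """Compute (APCER, BPCER) from ground-truth and predicted attack flags."""
    pairs = list(zip(is_attack, predicted_attack))
    attack_preds = [pred for truth, pred in pairs if truth]
    bona_fide_preds = [pred for truth, pred in pairs if not truth]
    # Attacks that were classified as bona fide (missed attacks).
    apcer = sum(1 for pred in attack_preds if not pred) / len(attack_preds)
    # Bona fide presentations that were classified as attacks (false alarms).
    bpcer = sum(1 for pred in bona_fide_preds if pred) / len(bona_fide_preds)
    return apcer, bpcer

truth = [True, True, True, True, False, False]
preds = [True, True, True, False, False, True]
print(apcer_bpcer(truth, preds))  # one of four attacks missed, one of two bona fide rejected
```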
AGE_ESTIMATOR
| Modification | Version | Accuracy, y/o (mean absolute error) |
|---|---|---|
| heavy | 1 | 4.7 |
| heavy | 2 | 3.5 |
| light | 1 | 5.5 |
| light | 2 | 4.9 |
GENDER_ESTIMATOR
| Modification | Version | Accuracy, % |
|---|---|---|
| heavy | 1 | 96 |
| heavy | 2 | 97 |
| light | 1 | 95 |
| light | 2 | 96 |
Estimation of Face Attributes
| Face Attributes | Accuracy |
|---|---|
| Gender | 95% |
| Emotion | 80% |
| Age | 3.95 y/o (mean absolute error) |
| No mask (face without a mask) | 99% |
| Has mask (face with a mask) | 97% |
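The age accuracy is reported as an error in years rather than as a percentage. Assuming the metric is the mean absolute error between the true and the estimated age, it can be computed as in the sketch below. The function name and the sample ages are illustrative.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference, in years, between true and estimated ages."""
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)

print(mean_absolute_error([30, 40], [33, 36]))  # errors of 3 and 4 years
```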