Face detection
Overview
The face detection process in Face SDK consists of two stages: detecting faces and determining face landmarks.
Face detection
Face detection in Face SDK is performed using a set of detectors. A detector is an algorithm of the libfacerec library that uses neural networks to detect faces in images. A detector returns the coordinates of a bounding rectangle (bbox) around each detected face.

Detectors
The following modifications are currently available:
| Modification | Version | Face SDK version | Default parameters | CPU detection time (ms)*, 640x480 | CPU detection time (ms)*, 1280x720 | CPU detection time (ms)*, 1920x1080 |
|---|---|---|---|---|---|---|
| uld | 1 | 3.19 | precision_level=1, confidence_threshold=0.7, coarse_confidence_threshold=0.3 | 7 | 7 | 8 |
| uld | 1 | 3.19 | precision_level=2, confidence_threshold=0.7, coarse_confidence_threshold=0.3 | 37 | 38 | 40 |
| uld | 1 | 3.19 | precision_level=3, confidence_threshold=0.7, coarse_confidence_threshold=0.3 | 194 | 187 | 197 |
| ssyv | 1 | 3.19 | confidence_threshold=0.5, iou_threshold=0.5 | 151 | 150 | 152 |
| ssyv | 2 | 3.19 | confidence_threshold=0.5, iou_threshold=0.5 | 46 | 46 | 47 |
| ssyv | 3 | 3.19 | confidence_threshold=0.5, iou_threshold=0.5 | 96 | 94 | 96 |
| ssyv | 4 | 3.20 | confidence_threshold=0.5, iou_threshold=0.5 | 1517 | 1506 | 1502 |
| ssyv_light | 1 | 3.24 | confidence_threshold=0.5, iou_threshold=0.5 | 11 | 11 | 12 |
| blf_front | 1 | 3.19 | confidence_threshold=0.67, iou_threshold=0.5 | 3 | 5 | 9 |
| blf_back | 1 | 3.19 | confidence_threshold=0.67, iou_threshold=0.5 | 11 | 13 | 18 |
The default modification is ssyv.
Detector parameters
- confidence_threshold: detection confidence threshold.
- iou_threshold: threshold for the intersection over union (IOU) metric, which determines whether two bboxes refer to the same face. For example, with a threshold of 0.5, two bboxes with an IOU greater than 0.5 are considered to belong to the same face.
- coarse_confidence_threshold (uld only): coarse detection confidence threshold. During detection, the detector creates a set of bboxes, each with a confidence value (a number from 0 to 1 that shows the degree of confidence that the bbox contains a face). These bboxes are passed to the NMS (non-maximum suppression) algorithm, which determines the intersections (matches) between bboxes. The coarse_confidence_threshold parameter cuts off bboxes with low confidence, which reduces the number of calculations performed by the NMS algorithm.
- precision_level (uld only): the level of precision. The value is in the range [1, 3]; the higher the value, the higher the accuracy and the lower the speed. The default value is 1.
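To make the interaction of the three thresholds concrete, here is a minimal Python sketch of IOU and a greedy NMS pass. It is an illustration only, not the libfacerec implementation: the function names are hypothetical, and the order of the steps (coarse cut, then NMS, then the final confidence cut) is an assumption based on the parameter descriptions above.

```python
def iou(a, b):
    """Intersection over union of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(candidates, coarse_confidence_threshold=0.3,
                      confidence_threshold=0.7, iou_threshold=0.5):
    """candidates: list of (bbox, confidence) pairs. Returns the kept pairs."""
    # 1. Coarse cut: drop weak candidates before the pairwise comparisons.
    pool = [c for c in candidates if c[1] >= coarse_confidence_threshold]
    # 2. Greedy NMS: visit boxes from most to least confident and keep a box
    #    only if it does not overlap an already kept box too much.
    pool.sort(key=lambda c: c[1], reverse=True)
    kept = []
    for bbox, conf in pool:
        if all(iou(bbox, k[0]) <= iou_threshold for k in kept):
            kept.append((bbox, conf))
    # 3. Final cut (assumed order): report only confident detections.
    return [k for k in kept if k[1] >= confidence_threshold]


# Hypothetical candidates: two boxes on the same face, one separate face,
# and one weak candidate that the coarse cut removes.
candidates = [
    ([10, 10, 110, 110], 0.9),
    ([12, 12, 112, 112], 0.8),
    ([200, 200, 260, 260], 0.75),
    ([300, 300, 320, 320], 0.1),
]
detections = filter_detections(candidates)
```

In this sketch the second box is suppressed because its IOU with the first exceeds 0.5, so two detections remain. Raising coarse_confidence_threshold shrinks the pool that NMS has to compare pairwise, which is where the saving in calculations comes from.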
Examples of detector operation under different conditions are presented below. You can customize the detection threshold (confidence_threshold) and other parameters when creating a processing block.
[Image comparison: BLF (confidence_threshold=0.6) and ULD (precision_level=3, confidence_threshold=0.7)]
[Image comparison: ULD (confidence_threshold=0.4) and ULD (confidence_threshold=0.7)]
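Setting custom thresholds when creating a processing block could look like the sketch below. It assumes that detector parameters are passed as top-level keys of the config, next to unit_type and modification; the source does not confirm this layout. make_detector_config is a hypothetical helper, not part of the SDK, and the [0, 1] range check on thresholds is likewise an assumption.

```python
def make_detector_config(modification="ssyv", **params):
    """Build a FACE_DETECTOR config dict with optional parameter overrides."""
    allowed = {"confidence_threshold", "iou_threshold",
               "coarse_confidence_threshold", "precision_level"}
    unknown = set(params) - allowed
    if unknown:
        raise ValueError(f"unknown detector parameters: {sorted(unknown)}")
    # These two parameters are documented as applying to uld only.
    uld_only = {"coarse_confidence_threshold", "precision_level"} & set(params)
    if uld_only and modification != "uld":
        raise ValueError(f"{sorted(uld_only)} apply to the uld modification only")
    if "precision_level" in params and params["precision_level"] not in (1, 2, 3):
        raise ValueError("precision_level must be 1, 2 or 3")
    for name in ("confidence_threshold", "iou_threshold",
                 "coarse_confidence_threshold"):
        if name in params and not 0.0 <= params[name] <= 1.0:
            raise ValueError(f"{name} must be in [0, 1]")
    return {"unit_type": "FACE_DETECTOR", "modification": modification, **params}


config = make_detector_config("uld", precision_level=3, confidence_threshold=0.7)
# With the SDK available (assumed usage):
# faceDetector = service.create_processing_block(config)
```

Validating the values before they reach the SDK keeps configuration mistakes, such as a precision_level passed to a non-uld detector, easy to spot.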
Face detector specification
- Input: the input Context must contain an image in binary format.
{
    "image" : {
        "format": "NDARRAY",
        "blob": "data pointer",
        "dtype": "uint8_t",
        "shape": [height, width, channels]
    }
}
- Output: after the processing block has run, an objects array is added to the Context. Each object contains the coordinates of the bounding rectangle, the detection confidence, the class, and the identifier of the object in that array.
{
    "image" : {},
    "objects": [{
        "id": {"type": "long", "minimum": 0},
        "class": "face",
        "confidence": {"type": "double",  "minimum": 0,  "maximum": 1},
        "bbox": [x1, y1, x2, y2]
    }]
}
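Reading the detector output could be sketched as follows. The example works on a plain Python dict shaped like the output specification above; whether the SDK Context object supports the same access pattern is an assumption, and confident_faces and the sample values are hypothetical. The units of the bbox coordinates are not stated here, so the sketch does not interpret them.

```python
def confident_faces(io_data, min_confidence=0.5):
    """Return (id, confidence, bbox) for each face object above min_confidence."""
    faces = []
    for obj in io_data.get("objects", []):
        if obj.get("class") != "face":
            continue
        x1, y1, x2, y2 = obj["bbox"]
        # A bbox whose corners are swapped or collapsed is treated as an error.
        if x2 <= x1 or y2 <= y1:
            raise ValueError(f"degenerate bbox for object {obj['id']}")
        if obj["confidence"] >= min_confidence:
            faces.append((obj["id"], obj["confidence"], (x1, y1, x2, y2)))
    return faces


# Hypothetical detector output with two faces.
sample_output = {
    "image": {},
    "objects": [
        {"id": 0, "class": "face", "confidence": 0.93, "bbox": [0.1, 0.2, 0.4, 0.6]},
        {"id": 1, "class": "face", "confidence": 0.30, "bbox": [0.5, 0.5, 0.7, 0.8]},
    ],
}
faces = confident_faces(sample_output)
```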
Face landmarks
There are three sets of face landmarks: fda, mesh, tddfa.
- The tddfa set contains 68 facial points.
- The mesh set contains 470 3D facial points. We recommend using it to get a 3D face mask.
The following modifications are currently available:
| Modification | Version | Face SDK version | Detection time CPU (ms)* | Detection time GPU (ms)** | 
|---|---|---|---|---|
| fda | 1 | 3.23 | 3 | 3 | 
| tddfa_faster | 1 | 3.19 | 2 | 1 | 
| tddfa | 1 | 3.19 | 6 | 2 | 
| mesh | 1 | 3.19 | 6 | 3 | 
** GPU: NVIDIA GTX 10xx series
The default modification is tddfa_faster.
Fitter specification
- Input: the input Context must contain an image in binary format and the objects array from the Face Detector.
{
    "image" : {
        "format": "NDARRAY",
        "blob": "data pointer",
        "dtype": "uint8_t",
        "shape": [height, width, channels]
    },
    "objects": [{
        "id": {"type": "long", "minimum": 0},
        "class": "face",
        "confidence": {"type": "double",  "minimum": 0,  "maximum": 1},
        "bbox": [x1, y1, x2, y2]
    }]
}
- Output: after the processing block has run, keypoints are added to each object: 21 key face points and the points from the tddfa or mesh set.
{
    "keypoints": {
        "left_eye_brow_left":   {"proj" : [x, y]},
        "left_eye_brow_up":     {"proj" : [x, y]},
        "left_eye_brow_right":  {"proj" : [x, y]},
        "right_eye_brow_left":  {"proj" : [x, y]},
        "right_eye_brow_up":    {"proj" : [x, y]},
        "right_eye_brow_right": {"proj" : [x, y]},
        "left_eye_left":        {"proj" : [x, y]},
        "left_eye":             {"proj" : [x, y]},
        "left_eye_right":       {"proj" : [x, y]},
        "right_eye_left":       {"proj" : [x, y]},
        "right_eye":            {"proj" : [x, y]},
        "right_eye_right":      {"proj" : [x, y]},
        "left_ear_bottom":      {"proj" : [x, y]},
        "nose_left":            {"proj" : [x, y]},
        "nose":                 {"proj" : [x, y]},
        "nose_right":           {"proj" : [x, y]},
        "right_ear_bottom":     {"proj" : [x, y]},
        "mouth_left":           {"proj" : [x, y]},
        "mouth":                {"proj" : [x, y]},
        "mouth_right":          {"proj" : [x, y]},
        "chin":                 {"proj" : [x, y]},
        "points": [{"proj": [x, y]}]
    }
}
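As an illustration of how the keypoints structure above can be consumed, the sketch below computes the distance between the eye centers and the extent of the named keypoints for one face object. It operates on a plain dict with the same shape as the specification; the helper names and sample coordinates are hypothetical, and the result is in whatever units the proj coordinates use.

```python
import math


def eye_distance(face_object):
    """Distance between the left_eye and right_eye keypoints."""
    keypoints = face_object["keypoints"]
    lx, ly = keypoints["left_eye"]["proj"]
    rx, ry = keypoints["right_eye"]["proj"]
    return math.hypot(rx - lx, ry - ly)


def named_keypoints_extent(face_object):
    """Smallest [x1, y1, x2, y2] rectangle that encloses the named keypoints."""
    # "points" holds the full landmark set as a list, so it is skipped here.
    projections = [value["proj"]
                   for name, value in face_object["keypoints"].items()
                   if name != "points"]
    xs = [p[0] for p in projections]
    ys = [p[1] for p in projections]
    return [min(xs), min(ys), max(xs), max(ys)]


# Hypothetical face object with a reduced set of keypoints.
face = {
    "keypoints": {
        "left_eye": {"proj": [1.0, 2.0]},
        "right_eye": {"proj": [4.0, 6.0]},
        "nose": {"proj": [2.5, 5.0]},
        "points": [{"proj": [1.0, 2.0]}],
    }
}
distance = eye_distance(face)
extent = named_keypoints_extent(face)
```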
Example of face detection and estimation of face landmarks
Create Processing Blocks
Create the detector and fitter processing block objects using the FacerecService.createProcessingBlock method, passing a Context container with the required parameters as an argument.
- C++
auto detectorConfigCtx = service->createContext();
detectorConfigCtx["unit_type"] = "FACE_DETECTOR";
detectorConfigCtx["modification"] = "ssyv";
auto fitterConfigCtx = service->createContext();
fitterConfigCtx["unit_type"] = "FACE_FITTER";
pbio::ProcessingBlock faceDetector = service->createProcessingBlock(detectorConfigCtx);
pbio::ProcessingBlock faceFitter = service->createProcessingBlock(fitterConfigCtx);
- Python
detectorConfigCtx = {
    "unit_type": "FACE_DETECTOR",
    "modification": "ssyv",
}
fitterConfigCtx = {
    "unit_type": "FACE_FITTER"
}
faceDetector = service.create_processing_block(detectorConfigCtx)
faceFitter = service.create_processing_block(fitterConfigCtx)
- Flutter
Map<String, dynamic> configCtx = {
    "unit_type": "FACE_DETECTOR",
    "modification": "ssyv",
};
ProcessingBlock faceDetector = service.createProcessingBlock(configCtx);
See also: Processing Block configurable parameters.
Run face detection
Pass the Context with a binary image into the Detector Processing Block:
- C++
ioData["image"] = imgCtx;
faceDetector(ioData);
- Python
ioData["image"] = imageCtx
faceDetector(ioData)
- Flutter
ioData["image"].placeValues(imageContext);
ioData = faceDetector.process(ioData);
The result of face detection is stored in the passed Context container according to the specification of the Processing Block.
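The image Context passed to the detector follows the input specification (format, blob, dtype, shape). A minimal Python sketch of assembling it from raw pixel bytes is shown below. make_image_ctx is a hypothetical helper, and passing the pixels as a bytes object in blob is an assumption about the Python binding; the channel order expected by the SDK is not stated in this section.

```python
def make_image_ctx(pixels: bytes, height: int, width: int, channels: int = 3) -> dict:
    """Build the "image" dict for a uint8 image stored row by row."""
    expected = height * width * channels
    if len(pixels) != expected:
        raise ValueError(
            f"expected {expected} bytes for shape "
            f"[{height}, {width}, {channels}], got {len(pixels)}"
        )
    return {
        "format": "NDARRAY",
        "blob": pixels,        # assumption: raw bytes stand in for the data pointer
        "dtype": "uint8_t",
        "shape": [height, width, channels],
    }


# A black 2x3 image with three channels, used only to exercise the helper.
imageCtx = make_image_ctx(bytes(2 * 3 * 3), height=2, width=3)
ioData = {"image": imageCtx}
```

Checking the byte count against the shape up front catches a mismatched height, width, or channel count before the data reaches the detector.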
Run fitting of face landmarks
Pass the Context container returned by the Face Detector to the fitter Processing Block:
- C++
faceFitter(ioData);
- Python
faceFitter(ioData)
- Flutter
ioData = faceFitter.process(ioData);
The resulting Context can be passed to the methods that estimate age, gender, quality, and Liveness (see Face Estimation), and to Recognizer.processing to create a template (see Face Recognition).