Face Capturing
Overview
Face capturing includes the following stages: face detection, face landmark fitting, and head rotation angle calculation.
Face Detection
Face detection in Face SDK is performed using a set of detectors. The detector is a libfacerec library algorithm that applies neural networks to detect faces in images and video streams. The detection result is the coordinates of the bounding box (bbox) around the detected face.

After the face is detected, to optimize further operations, the image is automatically cropped according to the calculated coordinates of the bounding box (bbox). Fitting of anthropometric points and calculating of head rotation angles are performed on the cropped image (face crop).
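The cropping step can be pictured with a minimal pure-Python sketch. It assumes an image stored as a list of pixel rows and a bbox given as (x, y, width, height) in pixels. The helper crop_to_bbox is hypothetical and only illustrates the idea; Face SDK performs the cropping internally.

```python
def crop_to_bbox(image, bbox):
    """Return the part of `image` covered by `bbox`, clamped to the image borders.

    image: list of rows, each row a list of pixel values (hypothetical layout).
    bbox:  (x, y, width, height) in pixels (hypothetical layout).
    """
    img_h = len(image)
    img_w = len(image[0]) if img_h else 0
    x, y, w, h = bbox
    # Clamp the box so that a face partially outside the frame does not fail.
    left, top = max(0, x), max(0, y)
    right, bottom = min(img_w, x + w), min(img_h, y + h)
    return [row[left:right] for row in image[top:bottom]]


# 4x4 test image whose pixel value encodes its position: value = 10 * row + column.
image = [[10 * r + c for c in range(4)] for r in range(4)]
face_crop = crop_to_bbox(image, (1, 1, 2, 2))
print(face_crop)
```

Later stages then work only on face_crop, which is smaller than the full frame.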
Detectors
Currently, the following detectors are available:
- LBF: An outdated detector, not recommended for use;
- BLF: A detector that provides higher quality and faster detection than LBF for faces of a medium size and larger (including masked faces). On Android you can use GPU acceleration (enabled by default);
- REFA: A detector that is slower than the LBF and BLF detectors but guarantees a better detection quality for faces of various sizes (including masked faces). Recommended for use in expert systems;
- ULD: A new detector that is faster than REFA. This detector lets you detect faces of various sizes (including masked faces).
The tables below show how the detectors work in various conditions. The detection threshold (score_threshold), the minimum size of a detected face (min_size), and other parameters can be configured in the capturer configuration files located in the conf folder of the Face SDK distribution kit (see the Capturer Configuration Parameters section).
[Image table comparing detectors: BLF (score_threshold=0.6), REFA (min_size=0.2, score_threshold=0.89), ULD (min_size=10, score_threshold=0.7)]
[Image table comparing thresholds: ULD (score_threshold=0.4), ULD (score_threshold=0.7), REFA (score_threshold=0.89)]
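To make the effect of score_threshold and min_size more concrete, here is a minimal sketch of how such parameters could filter candidate detections. It is an illustration only: the dictionary layout and the filter_detections helper are hypothetical, min_size is assumed to be measured in pixels, and the actual filtering is done inside the detector.

```python
def filter_detections(detections, score_threshold, min_size):
    """Keep detections that are confident enough and large enough."""
    kept = []
    for det in detections:
        if det["score"] < score_threshold:
            continue  # rejected: confidence below the threshold
        if min(det["width"], det["height"]) < min_size:
            continue  # rejected: face smaller than the minimum size
        kept.append(det)
    return kept


# Hypothetical candidates: sizes in pixels, score in [0, 1].
candidates = [
    {"name": "large confident face", "width": 120, "height": 120, "score": 0.95},
    {"name": "tiny confident face", "width": 8, "height": 8, "score": 0.90},
    {"name": "large uncertain face", "width": 60, "height": 60, "score": 0.50},
]
strict = filter_detections(candidates, score_threshold=0.7, min_size=10)
relaxed = filter_detections(candidates, score_threshold=0.4, min_size=10)
print([d["name"] for d in strict])
print([d["name"] for d in relaxed])
```

A lower threshold keeps more faces but is also more likely to let false detections through.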
Fitting of Face Landmarks
Face landmarks in Face SDK are determined using fitters. A fitter is an algorithm of the libfacerec library that positions a set of anthropometric points with 2D/3D coordinates linked to a specific detected face. Several types of fitters that use different sets of face landmarks are described below.

Face Landmarks
Note: You can learn how to display face landmarks and head rotation angles in our tutorial.
There are five sets of face landmarks: esr, singlelbf, doublelbf, fda, and mesh.
- The esr set is our first set that was the only set available in previous SDK versions. The esr set contains 47 points.
- The singlelbf and doublelbf sets provide higher accuracy than esr. The singlelbf set contains 31 points. The doublelbf set contains 101 points. The doublelbf set actually consists of two concatenated sets: the last 31 points of doublelbf duplicate the singlelbf set (in the same order).
- The fda provides high accuracy in a wide range of facial angles (up to the full profile), in contrast to the previous sets. So we recommend that you use detectors with this set. However, recognition algorithms still require face samples to be close to frontal. The fda set contains 21 points.
- At the moment, the mesh set is the newest. It contains 470 3D points of a face. We recommend that you use this set to get a 3D face mesh.
| Set of points | RawSample.getLeftEye returns | RawSample.getRightEye returns |
|---|---|---|
| fda | point 7 | point 10 |
| esr | point 16 | point 17 |
| singlelbf | point 29 | point 30 |
| doublelbf (first 70 points; the remaining 31 points are taken from singlelbf) | point 68 | point 69 |
| mesh | point 468 | point 469 |
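The set sizes and eye point numbers listed above can be collected into a small lookup table. The sketch below assumes that point numbers are zero-based indices into the landmark list (which is consistent with mesh having 470 points and eye points 468 and 469). LANDMARK_SETS, eye_points and split_doublelbf are hypothetical helpers, not SDK functions.

```python
# Number of points per set and the eye point numbers listed above.
LANDMARK_SETS = {
    "esr":       {"points": 47,  "left_eye": 16,  "right_eye": 17},
    "singlelbf": {"points": 31,  "left_eye": 29,  "right_eye": 30},
    "doublelbf": {"points": 101, "left_eye": 68,  "right_eye": 69},
    "fda":       {"points": 21,  "left_eye": 7,   "right_eye": 10},
    "mesh":      {"points": 470, "left_eye": 468, "right_eye": 469},
}


def eye_points(landmarks, set_name):
    """Pick the left and right eye points out of a full landmark list."""
    info = LANDMARK_SETS[set_name]
    if len(landmarks) != info["points"]:
        raise ValueError(f"{set_name} expects {info['points']} points, got {len(landmarks)}")
    return landmarks[info["left_eye"]], landmarks[info["right_eye"]]


def split_doublelbf(landmarks):
    """Split doublelbf into its first 70 points and the trailing singlelbf copy."""
    if len(landmarks) != LANDMARK_SETS["doublelbf"]["points"]:
        raise ValueError("doublelbf expects 101 points")
    return landmarks[:70], landmarks[70:]


# Fake landmarks: point i has coordinates (i, i), so results are easy to check.
fake = [(float(i), float(i)) for i in range(101)]
own_part, singlelbf_part = split_doublelbf(fake)
print(len(own_part), len(singlelbf_part))
print(eye_points(fake, "doublelbf"))
```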
Iris Landmarks
In addition to the standard set of face landmarks, you can get iris landmarks - an extended set of eye points, which includes points of pupils and eyelids. The returned vector contains 40 points for the left and right eyes in the order shown in the image below. For each eye 20 points are returned: the first 5 points refer to the pupil (its center and points on the circle), the remaining 15 points form the contour of the eyelids. A rendering example is available in demo (C++/Java/C#).
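The layout of the returned vector can be unpacked as in the sketch below. It assumes that the 20 points of the left eye come first, followed by the 20 points of the right eye; this order is an assumption and should be checked against the reference image. split_iris_landmarks is a hypothetical helper, not part of the SDK.

```python
POINTS_PER_EYE = 20  # 20 points are returned for each eye
PUPIL_POINTS = 5     # the first 5 points of each eye describe the pupil


def split_iris_landmarks(points):
    """Split the 40-point vector into pupil and eyelid points for each eye."""
    if len(points) != 2 * POINTS_PER_EYE:
        raise ValueError(f"expected 40 points, got {len(points)}")
    eyes = {}
    # Assumption: the left eye comes first in the vector.
    for i, name in enumerate(("left", "right")):
        eye = points[i * POINTS_PER_EYE:(i + 1) * POINTS_PER_EYE]
        eyes[name] = {
            "pupil": eye[:PUPIL_POINTS],   # center and points on the circle
            "eyelid": eye[PUPIL_POINTS:],  # remaining 15 contour points
        }
    return eyes


# Integers stand in for real points so that the slicing is easy to follow.
eyes = split_iris_landmarks(list(range(40)))
print(eyes["left"]["pupil"])
print(len(eyes["right"]["eyelid"]))
```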

Head Rotation Angles
Head rotation angles are calculated relative to the observation axis. The result of this stage is three head rotation angles: pitch, yaw, and roll. The accuracy of these angles depends on the set of anthropometric points used.
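One common use of the three angles is to check whether a face is close to frontal before passing it on. The sketch below is only an illustration: it assumes the angles are given in degrees, and both the is_near_frontal helper and the 15-degree limits are hypothetical values chosen for the example, not SDK defaults.

```python
def is_near_frontal(yaw, pitch, roll, max_yaw=15.0, max_pitch=15.0, max_roll=15.0):
    """Return True if all three angles are within the given limits.

    The limits are hypothetical; tune them for your own use case.
    """
    return abs(yaw) <= max_yaw and abs(pitch) <= max_pitch and abs(roll) <= max_roll


print(is_near_frontal(yaw=5.0, pitch=-3.0, roll=1.0))
print(is_near_frontal(yaw=40.0, pitch=0.0, roll=0.0))
```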

Face Tracking (Optional)
To track faces in video streams, you need to use trackers. The tracker is an algorithm of the libfacerec library that allows tracking face positions from frame to frame. As a result, a person's track is formed (a sequence of video stream frames, which belong to the same person). Each track is assigned a unique identifier (track_id) that doesn't change until the face is lost.
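The idea of a track as a sequence of frames sharing one track_id can be sketched as follows. The input layout (a list of per-frame lists of (track_id, bbox) pairs) and the build_tracks helper are hypothetical; in the SDK the tracker assigns the identifiers for you.

```python
from collections import defaultdict


def build_tracks(frames):
    """Group per-frame detections into tracks keyed by track_id.

    frames: list where item i holds the (track_id, bbox) pairs seen on frame i.
    Returns {track_id: [(frame_index, bbox), ...]}.
    """
    tracks = defaultdict(list)
    for frame_index, detections in enumerate(frames):
        for track_id, bbox in detections:
            tracks[track_id].append((frame_index, bbox))
    return dict(tracks)


frames = [
    [(0, (10, 10, 50, 50))],                         # frame 0: one face
    [(0, (12, 10, 50, 50)), (1, (200, 40, 60, 60))], # frame 1: a second face appears
    [(1, (205, 42, 60, 60))],                        # frame 2: the first face is lost
]
tracks = build_tracks(frames)
for track_id, items in sorted(tracks.items()):
    print(track_id, [frame_index for frame_index, _ in items])
```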
Face Capturing
Creating Capturer Object
To detect faces, first create a Capturer class object. Creating the object requires a capturer configuration file, which specifies the type of detector and the set of anthropometric points (for more details, see Capturer Configuration files). In the capturer configuration file you can also configure other detection parameters that affect the quality and speed of the entire algorithm. The name of the capturer configuration file indicates the detector and the set of anthropometric points used. For example, common_capturer_blf_fda_front.xml uses the blf detector and the fda set of points.
For tracking you can use capturer configuration files of two types:
- common_video_capturer - Provides higher speed, but lower quality compared to fda_tracker_capturer.
- fda_tracker_capturer - Provides higher quality, but lower speed compared to common_video_capturer.
All available configuration files are stored in the conf folder of the Face SDK distribution kit.
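The naming convention described above (for example, common_capturer_blf_fda_front.xml) can be read programmatically. The sketch below is a hypothetical helper that looks for known detector and point set names among the underscore-separated parts of a file name. Not every configuration file necessarily follows this pattern, so a missing part is reported as None.

```python
KNOWN_DETECTORS = ("lbf", "blf", "refa", "uld")
KNOWN_POINT_SETS = ("esr", "singlelbf", "doublelbf", "fda", "mesh")


def describe_capturer_config(filename):
    """Guess the detector and point set from a capturer configuration file name."""
    stem = filename.rsplit(".", 1)[0]  # drop the .xml extension
    tokens = stem.lower().split("_")
    detector = next((t for t in tokens if t in KNOWN_DETECTORS), None)
    point_set = next((t for t in tokens if t in KNOWN_POINT_SETS), None)
    return {"detector": detector, "point_set": point_set}


print(describe_capturer_config("common_capturer_blf_fda_front.xml"))
```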
Example 1
Create a Capturer object using the FacerecService.createCapturer method and specify the name of the capturer configuration file as an argument.
C++:
pbio::Capturer::Ptr capturer = service->createCapturer("common_capturer4.xml");
C#:
Capturer capturer = service.createCapturer("common_capturer4_lbf.xml");
Java:
final Capturer capturer = service.createCapturer(service.new Config("common_capturer4_lbf.xml"));
Python:
capturer = service.create_capturer(Config("common_capturer4_lbf.xml"))
Example 2
If you need to override some parameter values of the capturer configuration file when creating a Capturer, follow the steps below:
- Create a FacerecService.Config object and specify the name of the configuration file as an argument.
- Override the parameter values using the Config.overrideParameter method.
- Create a Capturer object using the FacerecService.createCapturer method and pass the FacerecService.Config object as an argument.
C++:
pbio::FacerecService::Config capturer_config("common_capturer4.xml");
capturer_config.overrideParameter("min_size", 200);
pbio::Capturer::Ptr capturer = service->createCapturer(capturer_config);
C#:
FacerecService.Config capturerConfig = new FacerecService.Config("common_capturer4_lbf.xml");
capturerConfig.overrideParameter("min_size", 200);
Capturer capturer = service.createCapturer(capturerConfig);
Java:
FacerecService.Config capturerConfig = service.new Config("common_capturer4_lbf.xml");
capturerConfig.overrideParameter("min_size", 200);
final Capturer capturer = service.createCapturer(capturerConfig);
Python:
capturer_config = Config("common_capturer4_lbf.xml")
capturer_config.override_parameter("min_size", 200)
capturer = service.create_capturer(capturer_config)
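When several parameters need to be overridden, it can be convenient to keep them in a dictionary and apply them in one place. The sketch below shows one possible way to do this in Python. The apply_overrides helper is hypothetical, and RecordingConfig is only a stand-in that makes the example runnable without the SDK; with the real SDK you would pass a Config object instead.

```python
def apply_overrides(config, overrides):
    """Call config.override_parameter(name, value) for every entry in `overrides`."""
    for name, value in overrides.items():
        config.override_parameter(name, value)
    return config


class RecordingConfig:
    """Stand-in for the SDK Config object; it only records the overrides."""

    def __init__(self, name):
        self.name = name
        self.overrides = {}

    def override_parameter(self, name, value):
        self.overrides[name] = value


config = apply_overrides(
    RecordingConfig("common_capturer4_lbf.xml"),
    {"min_size": 200, "max_size": 800},
)
print(config.overrides)
```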
Example 3
Some parameter values can be changed in an already created Capturer object using the Capturer.setParameter method.
C++:
pbio::Capturer::Ptr capturer = service->createCapturer("common_capturer4.xml");
capturer->setParameter("min_size", 200);
capturer->setParameter("max_size", 800);
// capturer->capture(...);
// ...
capturer->setParameter("min_size", 100);
capturer->setParameter("max_size", 400);
// capturer->capture(...);
C#:
Capturer capturer = service.createCapturer("common_capturer4_lbf.xml");
capturer.setParameter("min_size", 200);
capturer.setParameter("max_size", 800);
// capturer.capture(...);
// ...
capturer.setParameter("min_size", 100);
capturer.setParameter("max_size", 400);
// capturer.capture(...);
Java:
Capturer capturer = service.createCapturer(service.new Config("common_capturer4_lbf.xml"));
capturer.setParameter("min_size", 200);
capturer.setParameter("max_size", 800);
// capturer.capture(...);
// ...
capturer.setParameter("min_size", 100);
capturer.setParameter("max_size", 400);
// capturer.capture(...);
Python:
capturer = service.create_capturer(Config("common_capturer4_lbf.xml"))
capturer.set_parameter("min_size", 200)
capturer.set_parameter("max_size", 800)
# capturer.capture(...)
# ...
capturer.set_parameter("min_size", 100)
capturer.set_parameter("max_size", 400)
# capturer.capture(...)
Capturer Configuration Parameters
The following parameters of the capturer configuration files can be changed using the FacerecService.Config.overrideParameter method:
- coarse_score_threshold: Coarse detection confidence threshold. During detection, the detector creates a set of bboxes, each of which has a score value (a number from 0 to 1 indicating the degree of confidence that a face is in the bbox). Bboxes with scores are processed by the nms algorithm, which determines intersections (matches) between bboxes. The coarse_score_threshold parameter allows cutting off bboxes with a low score, which reduces the number of calculations performed by the nms algorithm.
- score_threshold: Detection confidence threshold.
- max_processed_width and max_processed_height: (for trackers) These parameters limit the size of the image passed to the internal detector of new faces.
- min_size and max_size: Minimum and maximum size of a face for detection (for trackers: the size is defined for an image already downscaled according to the max_processed_width and max_processed_height restrictions).
- min_neighbors: An integer detector parameter, used to reject false detections. You can change this parameter based on the situation. For example, you can increase the value if a large number of false detections is observed, or decrease the value if a large number of faces is not detected. If you aren't sure, we recommend you not change this parameter.
- use_advanced_multithreading: improves performance when working in multithreaded mode.
- nms_iou_threshold: Analogue of the min_neighbors parameter, used in most current detectors.
- min_detection_period: (for tracking) A real number that means the minimum time (in seconds) between two runs of the internal detector. A zero value means ‘no restrictions’. The parameter is used to reduce the processor load. Large values increase the latency in detection of new faces.
- max_detection_period: (for tracking) An integer that means the maximum time (in frames) between two runs of the internal detector. A zero value means ‘no restrictions’. For example, if you process a video offline, you can set the value to 1 so as not to miss a single person.
- max_occlusion_time_wait: (for tracking) A real number in seconds. When face occlusion is detected, the tracker holds the face position and tries to track it on new frames during this time.
- fda_max_bad_count_wait: An integer. When fda_tracker detects a decline in the face quality, the algorithm tries to track this face with the general purpose tracker (instead of the fda method designed and tuned for faces) for at most fda_max_bad_count_wait frames.
- base_angle: An integer: 0, 1, 2, or 3. Camera orientation angle: 0 means standard (default), 1 means +90 degrees, 2 means -90 degrees, 3 means 180 degrees. When you change the camera orientation, you need to set the new orientation value for this parameter, otherwise the detection quality will decrease.
- fake_detections_cnt: An integer. Number of start positions to search for a face using video_worker_fdatracker_fake_detector.xml. A start position is a fixed position of the face in the image. You can set the coordinates of a start position if you are sure that there is a face in the given area of the image. The image with the marked start position goes to the fake detector, which sends the image directly to the fitter. It is assumed that the image already contains a face, which means that you can immediately proceed to the fitting of anthropometric points.
- fake_detections_period: An integer. Each start position will be used once in fake_detections_period frames.
- fake_rect_center_xN, fake_rect_center_yN, fake_rect_angleN, fake_rect_sizeN: Real numbers for parameters of start positions. N is from 0 to fake_detections_cnt - 1 inclusive. fake_rect_center_xN: x coordinate of the center relative to the image width. fake_rect_center_yN: y coordinate of the center relative to the image height. fake_rect_angleN: roll angle in degrees. fake_rect_sizeN: size relative to max(image width, image height).
- downscale_rawsamples_to_preferred_size: An integer, 1 means enabled, 0 means disabled. Default value is enabled. When enabled, Capturer downscales each sample to the suitable size (RawSample.downscaleToPreferredSize) to reduce memory consumption. However, it decreases the system performance. We recommend that you disable downscale_rawsamples_to_preferred_size and use RawSample.downscaleToPreferredSize manually for RawSamples that you need to save or keep in RAM for a long time.
- iris_enabled: Get an extended set of eye points. 1 means enabled (the returned vector contains eye points), 0 means disabled (the returned vector is empty).
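To show how coarse_score_threshold, nms_iou_threshold and score_threshold could interact, here is a generic greedy non-maximum suppression sketch. It is not the SDK's implementation: the box layout (x1, y1, x2, y2), the order of the three steps and the helper names are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def postprocess(candidates, coarse_score_threshold, nms_iou_threshold, score_threshold):
    """candidates: list of (box, score) pairs. Returns the surviving pairs."""
    # Step 1: the coarse threshold drops weak boxes so that nms has less work.
    boxes = [c for c in candidates if c[1] >= coarse_score_threshold]
    # Step 2: greedy nms keeps the best box and drops boxes that overlap it too much.
    boxes.sort(key=lambda c: c[1], reverse=True)
    kept = []
    for box, score in boxes:
        if all(iou(box, kept_box) <= nms_iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    # Step 3: the final confidence threshold is applied to the remaining boxes.
    return [(box, score) for box, score in kept if score >= score_threshold]


candidates = [
    ((0, 0, 100, 100), 0.90),      # a face
    ((5, 5, 105, 105), 0.80),      # the same face, slightly shifted box
    ((200, 200, 260, 260), 0.75),  # another face
    ((400, 400, 420, 420), 0.10),  # noise, removed by the coarse threshold
]
result = postprocess(candidates, coarse_score_threshold=0.3,
                     nms_iou_threshold=0.5, score_threshold=0.7)
print(result)
```

Raising coarse_score_threshold shortens step 2, while nms_iou_threshold controls how much two boxes may overlap before they are treated as the same face.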
Starting Capturing
You can pass an image to the detector in two ways:
- Pass the data of the encoded image in JPG, PNG, TIF or BMP format to the Capturer.capture method
- Pass the data of the decoded image to the Capturer.capture method, using the RawImage class
The captured face is stored in a RawSample object.
C++:
// read an image from a file
cv::Mat image;
image = cv::imread(image_path, cv::IMREAD_COLOR);
// create RawImage object
cv::Mat input_image;
cv::cvtColor(image, input_image, cv::COLOR_BGR2RGB);
pbio::RawImage input_rawimg(input_image.cols, input_image.rows, pbio::RawImage::Format::FORMAT_RGB, input_image.data);
// run detection
std::vector<pbio::RawSample::Ptr> samples = capturer->capture(input_rawimg);
C#:
// read an image from a file
OpenCvSharp.Mat image = OpenCvSharp.Cv2.ImRead(image_path);
// create RawImage object
byte[] byte_image = new byte[image.Total() * image.Type().Channels];
Marshal.Copy(image.DataStart, byte_image, 0, (int)byte_image.Length);
RawImage input_rawimg = new RawImage(image.Width, image.Height, RawImage.Format.FORMAT_BGR, byte_image);
// run detection
List<RawSample> samples = capturer.capture(input_rawimg);
Java:
// read an image from a file
byte[] byte_image;
byte_image = readImage(image_path);
// create RawImage object
RawImage input_rawimg = new RawImage(width, height, RawImage.Format.FORMAT_YUV_NV21, byte_image);
// run detection
Vector<RawSample> samples = capturer.capture(input_rawimg);
Python:
# read an image from a file
image = cv2.imread(image_path)
# create RawImage object
input_rawimg = RawImage(image.shape[1], image.shape[0], Format.FORMAT_BGR, image.tobytes())
# run detection
samples = capturer.capture(input_rawimg)
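When you pass a decoded image, the byte buffer has to match the declared width, height and pixel format. The sketch below computes the expected buffer size for the formats used in the examples above, based on the standard memory layouts of these formats (1 byte per pixel for GRAY, 3 for RGB/BGR, a full-size Y plane plus a 2x2 subsampled interleaved VU plane for NV21). The expected_buffer_size helper and the use of plain strings for format names are hypothetical; whether the SDK performs such a check itself is not covered here.

```python
BYTES_PER_PIXEL = {"FORMAT_GRAY": 1, "FORMAT_RGB": 3, "FORMAT_BGR": 3}


def expected_buffer_size(width, height, fmt):
    """Return the number of bytes a decoded image of this size and format occupies."""
    if fmt == "FORMAT_YUV_NV21":
        # Y plane (width * height) plus interleaved V/U samples, one pair per 2x2 block.
        return width * height + 2 * ((width + 1) // 2) * ((height + 1) // 2)
    return width * height * BYTES_PER_PIXEL[fmt]


def check_buffer(data, width, height, fmt):
    """Raise ValueError if the buffer length does not match the declared geometry."""
    expected = expected_buffer_size(width, height, fmt)
    if len(data) != expected:
        raise ValueError(f"{fmt} {width}x{height} needs {expected} bytes, got {len(data)}")


print(expected_buffer_size(640, 480, "FORMAT_BGR"))
print(expected_buffer_size(640, 480, "FORMAT_YUV_NV21"))
check_buffer(bytes(640 * 480 * 3), 640, 480, "FORMAT_BGR")  # passes silently
```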
For detection combined with tracking you can also call the Capturer.resetHistory method to delete all frames and faces from the history and start tracking on a new video sequence.
Capturing Result
RawSample is an interface object that stores the capturing result. The following operations can be done using RawSample methods:
- Get a sample id (RawSample.getID) (only for detection combined with tracking);
- Get the detection confidence score (for BLF, REFA, ULD detectors). To do this, call the RawSample.getScore() method. As a result, you'll receive a float number in the range of [0, 1];
- Get a face rectangle (RawSample.getRectangle), angles (RawSample.getAngles), left / right eye (RawSample.getLeftEye/RawSample.getRightEye), face landmarks (RawSample.getLandmarks), only if the face is frontal;
- Get an extended set of eye points, which includes points of pupils and eyelids (RawSample.getIrisLandmarks());
- Downscale an internal face image to suitable size (RawSample.downscaleToPreferredSize);
- Serialize an object in a binary stream (RawSample.save or RawSample.saveWithoutImage). You can deserialize it later using FacerecService.loadRawSample or FacerecService.loadRawSampleWithoutImage;
- Normalize a face image with subsequent cropping (see Face Normalization).
A RawSample object can also be passed to the methods of age, gender, quality and liveness estimation (see Face Estimation, test_facecut, test_videocap), or to Recognizer.processing to create a template (see Facial Recognition, test_identify).
Face Normalization
Face normalization refers to the rotation of a non-frontal face to a frontal position. It is needed for better handling of face recognition and other operations with detected faces. A face can be normalized by one of the following RawSample methods:
- RawSample.cutFaceImage: the normalized face is saved to the specified stream (for example, to a file); the encoding format is selected via RawSample.ImageFormat
- RawSample.cutFaceRawImage: the normalized face is returned in the RawImage format, which stores the non-encoded image pixels in the RGB/BGR/GRAY format (the format is selected via RawImage.Format)
Examples of using RawSample.cutFaceRawImage:
C++:
auto raw_image_crop = sample->cutFaceRawImage(
    pbio::RawImage::Format::FORMAT_BGR,
    pbio::RawSample::FACE_CUT_FULL_FRONTAL);
cv::Mat img_crop(raw_image_crop.height, raw_image_crop.width, CV_8UC3, (void*) raw_image_crop.data);
C#:
RawImage raw_image_crop = sample.cutFaceRawImage(RawImage.Format.FORMAT_BGR, RawSample.FaceCutType.FACE_CUT_FULL_FRONTAL);
OpenCvSharp.Mat img_crop = new OpenCvSharp.Mat(raw_image_crop.height, raw_image_crop.width, OpenCvSharp.MatType.CV_8UC3, raw_image_crop.data);
Java:
RawImage raw_image_crop = sample.cutFaceRawImage(RawImage.Format.FORMAT_RGB, faceCutType);
int[] pixels = raw_image_crop.getPixels(RawImage.PixelFormat.BITMAP_ARGB_8888);
Bitmap img_crop = Bitmap.createBitmap(pixels, raw_image_crop.width, raw_image_crop.height, Bitmap.Config.ARGB_8888);
Python:
raw_image_crop = sample.cut_face_raw_image(Format.FORMAT_GRAY)
img_crop = np.frombuffer(raw_image_crop.data, dtype=np.uint8).reshape([raw_image_crop.height, raw_image_crop.width])
Available face normalization types (RawSample.FaceCutType):
- FACE_CUT_BASE: basic normalization (any sample type);
- FACE_CUT_FULL_FRONTAL: ISO/IEC 19794-5 Full Frontal (only frontal sample type). It is used for saving face images in electronic biometric documents;
- FACE_CUT_TOKEN_FRONTAL: ISO/IEC 19794-5 Token Frontal (only frontal sample type).
To preview the normalized face, call the RawSample.getFaceCutRectangle method by specifying the normalization type. As a result, you'll get four points – the corners of the rectangle that will be used for cropping.
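Because the four returned points may describe a rotated rectangle, it can be useful to compute its side lengths and the axis-aligned box that encloses it, for example to draw a preview. The sketch below assumes the corners are given as (x, y) pairs listed in order around the rectangle; the helper names are hypothetical and the corner order should be checked against the SDK reference.

```python
import math


def enclosing_box(corners):
    """Return (left, top, width, height) of the axis-aligned box around the corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    left, top = min(xs), min(ys)
    return left, top, max(xs) - left, max(ys) - top


def side_lengths(corners):
    """Return the lengths of the four sides, assuming corners are listed in order."""
    sides = []
    for i in range(4):
        (x1, y1), (x2, y2) = corners[i], corners[(i + 1) % 4]
        sides.append(math.hypot(x2 - x1, y2 - y1))
    return sides


# A square rotated by 45 degrees, used as a stand-in for a real face cut rectangle.
corners = [(10, 0), (20, 10), (10, 20), (0, 10)]
print(enclosing_box(corners))
print([round(s, 2) for s in side_lengths(corners)])
```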
See the examples of face capturing and normalization for different programming languages in the Samples section.