Version: 3.21.0 (latest)

Components

Face SDK consists of components that perform the main functional tasks: detection, estimation of faces and human pose, facial recognition, and processing of video streams. Face SDK components are implemented as processing blocks of the Processing Block API and/or as Legacy API objects.

Detection of faces, bodies and objects

Face Detector

Face Detector is a component for detecting faces in images. The result of the component's work is a list of detected faces with the following attributes:

Component implementation:

Note: For face detection in video or time-ordered image sequences, we recommend using Video Engine.

Body Detector

Body Detector is used to detect human bodies in images, which makes it possible to detect people in the frame even when their faces are not discernible. The detection result is a bounding box (bbox) around the detected body.

Component implementation:

Object Detector

Object Detector is used to detect objects of multiple classes in images.

The detection result is a bounding box (bbox) around each detected object together with its object class: "body", "bicycle", "car", "motorcycle", "bus", "train", "truck", "traffic_light", "fire_hydrant", "stop_sign", "bird", "cat", "dog", "horse", "sheep", "cow", "bear", "backpack", "umbrella", "handbag", "suitcase", "sports_ball", "baseball_bat", "skateboard", "tennis_racket", "bottle", "wine_glass", "cup", "fork", "knife", "laptop", "phone", "book", "scissors".
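As an illustration of working with such results, the following minimal sketch filters and counts detections by class. The `Detection` record and its field names are hypothetical and are not part of the Face SDK API; only the class labels come from the list above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical detection record: a class label plus bbox corners."""
    label: str
    x1: float
    y1: float
    x2: float
    y2: float


def filter_by_class(detections, wanted):
    """Keep only detections whose class label is in `wanted`."""
    wanted = set(wanted)
    return [d for d in detections if d.label in wanted]


def count_by_class(detections):
    """Count how many detections were reported for each class label."""
    return Counter(d.label for d in detections)


# Made-up sample data standing in for a detector's output.
detections = [
    Detection("car", 10, 20, 110, 90),
    Detection("dog", 200, 150, 260, 220),
    Detection("car", 300, 40, 420, 130),
]

vehicles = filter_by_class(detections, {"car", "bus", "truck"})
counts = count_by_class(detections)
```

Here `vehicles` holds the two "car" detections and `counts` maps each label to the number of times it appears.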

Component implementation:

Estimation of faces and human pose

Face SDK provides a set of estimators that analyze the faces returned by the Face Detector component, including demographic attributes such as gender and age.

Gender-Age Estimator

Gender-Age Estimator is used for estimating the gender and age of people based on their face images.

Component implementation:

Emotions Estimator

Emotions Estimator is used for estimating the prevailing emotional state of a person:

  • Happy
  • Surprised
  • Neutral
  • Angry
  • Disgusted
  • Sad
  • Scared

Component implementation:

Quality Estimator

Quality Estimator is used to estimate face image quality. The result is a list of face images, each with a detailed quality score.

Component implementation:

Mask Estimator

Mask Estimator determines whether a mask is present in the face image.

Component implementation:

Eyes Openness Estimator

Eyes Openness Estimator is used to estimate the state of the eyes in a face image. The component returns an “open” or “closed” verdict for the right eye and for the left eye.

Component implementation:

Liveness Estimators

Liveness components are used to assess whether the face detected in an image or video is real or fake. These components protect against malicious actions (spoofing attacks) that use a printed face image, a photo or video of a face shown on the screen of a mobile device or monitor, or various kinds of masks (paper, silicone, etc.).

Active Liveness Estimator analyzes certain human actions according to the check script, for example: “blink”, “smile”, “turn your head”.
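The idea of a check script can be sketched as follows. This is only an illustration under assumptions: the function name and the representation of actions as strings are hypothetical, and the real estimator works on video frames rather than on ready-made action labels.

```python
def run_check_script(script, observed_actions):
    """Return True if every scripted action was observed, in script order.

    `script` is the ordered list of required actions (hypothetical format).
    `observed_actions` is the ordered list of actions recognized so far;
    unrelated actions between the required ones are ignored.
    """
    step = 0
    for action in observed_actions:
        if step < len(script) and action == script[step]:
            step += 1
    return step == len(script)


script = ["blink", "smile", "turn your head"]
passed = run_check_script(
    script, ["neutral", "blink", "neutral", "smile", "turn your head"]
)
failed = run_check_script(script, ["smile", "blink"])
```

In this sketch the check passes only when the actions occur in the scripted order, so `passed` is true and `failed` is false.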

Component implementation:

2D / RGB Liveness Estimator assesses face liveness in an RGB image. The check requires only that the face appear in the camera's field of view.

Component implementation:

3D / Depth Liveness Estimator protects against attempts to use an image instead of a real face by analyzing the face surface using a depth map obtained from a 3D (RGBD) sensor.

Component implementation:

IR Liveness Estimator determines whether a human face is real based on an image taken from an infrared camera, combined with a color image.

Component implementation:

Human Pose Estimator

Human Pose Estimator is a component used to estimate human body skeleton keypoints in the image.

Component implementation:

Facial recognition

Face SDK provides components and algorithms to recognize faces. This functionality is based on operations with biometric face templates.

Encoder

Encoder extracts a biometric face template from a face image received from Face Detector.

A biometric face template is a unique set of biometric features extracted from a face image. Templates are used to compare two face images and to determine their degree of similarity.

A biometric face template has the following key characteristics:

  • It does not contain personal data
  • It cannot be used to restore a face image
  • It can be serialized and saved to file, database, or sent over a network
  • It can be indexed, which accelerates face template matching by using a special index for a batch of face templates.

Face SDK has several algorithms with different speed and accuracy characteristics to cover a range of use cases, from low-powered embedded devices to expert face recognition systems.

Extracting a biometric template is one of the most computation-heavy operations, so Face SDK provides the ability to use a GPU accelerator to increase performance.
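To illustrate the serialization property listed among the template characteristics, here is a minimal sketch that treats a template as a plain list of floats and round-trips it through bytes. The binary layout (a little-endian count followed by float32 values) and both function names are assumptions made for the example; the actual Face SDK template format is not described here and should be treated as opaque.

```python
import struct


def serialize_template(features):
    """Pack features as a uint32 count followed by float32 values.

    Hypothetical layout, little-endian, with no padding.
    """
    return struct.pack(f"<I{len(features)}f", len(features), *features)


def deserialize_template(blob):
    """Inverse of serialize_template: read the count, then the values."""
    (count,) = struct.unpack_from("<I", blob, 0)
    return list(struct.unpack_from(f"<{count}f", blob, 4))


# Values chosen to be exactly representable in float32.
template = [0.5, -1.25, 2.0]
blob = serialize_template(template)
restored = deserialize_template(blob)
```

The resulting `blob` is an ordinary `bytes` object, so it can be written to a file, stored in a database column, or sent over a network.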

Component implementation:

Matcher

Matcher performs the following comparison operations on templates created by Encoder:

  • Verification 1:1 - compares two biometric templates (faces) with each other and estimates how closely they match.
  • Identification 1:N - compares one biometric template (face) against a set of other templates (faces), searches for matches, and estimates how closely they match.

When comparing face templates, Matcher calculates the difference between the biometric features of the faces. The result is a measure of similarity between the face images and the probability that they belong to the same person.

Templates extracted using different algorithms have different properties and cannot be compared with each other.
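The two comparison modes and the algorithm restriction can be sketched as follows. This is an illustration under assumptions, not the Face SDK matcher: cosine similarity is used as a stand-in metric, and the `Template` class, its `algorithm` tag, the function names, and the threshold value are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """Hypothetical template: an algorithm tag plus a feature vector."""
    algorithm: str
    features: tuple


def similarity(a, b):
    """Cosine similarity, used here only as a stand-in metric."""
    if a.algorithm != b.algorithm:
        raise ValueError("templates from different algorithms are not comparable")
    if len(a.features) != len(b.features):
        raise ValueError("feature vectors differ in length")
    dot = sum(x * y for x, y in zip(a.features, b.features))
    norm_a = math.sqrt(sum(x * x for x in a.features))
    norm_b = math.sqrt(sum(y * y for y in b.features))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero-length feature vector")
    return dot / (norm_a * norm_b)


def verify(a, b, threshold=0.8):
    """Verification 1:1: do two templates match? Threshold is arbitrary."""
    return similarity(a, b) >= threshold


def identify(probe, gallery, top_k=1):
    """Identification 1:N: rank gallery entries by similarity to the probe."""
    scored = [(name, similarity(probe, t)) for name, t in gallery.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


probe = Template("algo_a", (1.0, 0.0))
gallery = {
    "alice": Template("algo_a", (1.0, 0.0)),
    "bob": Template("algo_a", (0.0, 1.0)),
}
best_match = identify(probe, gallery)[0][0]
```

The linear scan in `identify` is the simplest possible search; an index over the template batch, as mentioned for Encoder, would replace it when the gallery is large.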

Component implementation:

Video stream processing

Video Engine

Video Engine is used for real-time processing of video streams. This component solves the following tasks:

  • Face detection and tracking
  • Facial recognition (optional)
  • Liveness checking (optional)
  • Estimation of age, gender, and emotions (optional)

Video Engine works in a multi-stream mode. Each stream is a sequence of images (frames) obtained from one source (for example, a camera or video).

All streams are processed by Video Engine at the same time. Streams, frames, and detected faces in the frame are assigned their own identifiers. During face tracking, a face track is formed across the sequence of stream images, and each track is also identified by its own ID (track_id).

The set of identifiers allows you to accurately track the generated events for each stream. To handle the events, Video Engine implements a callback interface that provides event data.
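A callback-based event flow of this kind can be sketched as follows. Only `track_id` is named in the text above; the `FaceEvent` and `EventRouter` classes and the `stream_id` and `frame_id` field names are hypothetical and do not represent the actual Video Engine interface.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceEvent:
    """Hypothetical event carrying the identifiers described above."""
    stream_id: int
    frame_id: int
    track_id: int


class EventRouter:
    """Minimal callback registry: subscribers are called for every event."""

    def __init__(self):
        self._callbacks = []

    def subscribe(self, callback):
        self._callbacks.append(callback)

    def emit(self, event):
        for callback in self._callbacks:
            callback(event)


# Collect the distinct face tracks seen in each stream.
tracks_per_stream = defaultdict(set)


def on_face(event):
    tracks_per_stream[event.stream_id].add(event.track_id)


router = EventRouter()
router.subscribe(on_face)

# Made-up events: two streams, with the same face tracked across frames.
for event in [
    FaceEvent(stream_id=0, frame_id=1, track_id=7),
    FaceEvent(stream_id=0, frame_id=2, track_id=7),
    FaceEvent(stream_id=1, frame_id=1, track_id=3),
    FaceEvent(stream_id=0, frame_id=3, track_id=8),
]:
    router.emit(event)
```

Because every event carries its stream and track identifiers, the handler can attribute each detection to the right source without any shared state between streams.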

Component implementation: