Version: 3.9.0

Video Stream Processing

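The VideoWorker interface object is used to track faces in video streams, create templates, match them with a database, estimate age, gender, and emotions, estimate liveness, and perform short time identification. Each of these tasks is described in the sections below.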

The VideoWorker object handles thread control and synchronization; you only need to provide decoded video frames and register a few callback functions.

See an example of using VideoWorker in video_recognition_demo.
Learn how to detect and track faces in a video stream in our tutorial Face Detection and Tracking in a Video Stream.
Learn how to recognize faces in a video stream in our tutorial Face Recognition in a Video Stream.

Tracking Faces

Note: Learn how to detect masked faces in our tutorial.

VideoWorker can be created with FacerecService.createVideoWorker.

Examples

// Load the VideoWorker configuration and limit the number of search results.
pbio::FacerecService::Config video_worker_config("video_worker_lbf.xml");
video_worker_config.overrideParameter("search_k", 3);

// Create VideoWorker; the variables passed here are described after the example.
pbio::VideoWorker::Ptr video_worker = service->createVideoWorker(
    pbio::VideoWorker::Params()
        .video_worker_config(video_worker_config)
        .recognizer_ini_file(recognizer_config)
        .streams_count(streams_count)
        .processing_threads_count(processing_threads_count)
        .matching_threads_count(matching_threads_count)
        .age_gender_estimation_threads_count(age_gender_estimation_threads_count)
        .emotions_estimation_threads_count(emotions_estimation_threads_count)
        .short_time_identification_enabled(enable_sti)
        .short_time_identification_distance_threshold(sti_recognition_threshold)
        .short_time_identification_outdate_time_seconds(sti_outdate_time)
);

Where:

  • video_worker_config – path to the VideoWorker configuration file, or a FacerecService.Config object
  • video_worker_params – parameters of the VideoWorker constructor (the pbio::VideoWorker::Params object in the example above)
  • recognizer_config – the configuration file for the recognizer used (see Face Identification)
  • streams_count – the number of video streams; a tracking stream is created for each stream
  • processing_threads_count – the number of threads for template creation. These threads are common to all video streams and they distribute resources evenly across all video streams regardless of their workload (except for the video streams without faces in the frame)
  • matching_threads_count – the number of threads for comparison of templates created from video streams with the database. Like processing threads, they distribute the workload evenly across all video streams
  • age_gender_estimation_threads_count – the number of threads for age and gender estimation. Like processing threads, they distribute the workload evenly across all video streams
  • emotions_estimation_threads_count – the number of threads for emotions estimation. Like processing threads, they distribute the workload evenly across all video streams
  • enable_sti – the flag enabling short time identification
  • sti_recognition_threshold – the recognition distance threshold for short time identification
  • sti_outdate_time – time period in seconds for short time identification

Currently, there are three configuration files with the tracking method from common_video_capturer.xml:

  • video_worker.xml with the esr points set
  • video_worker_lbf.xml with the singlelbf points set
  • video_worker_fda.xml with the fda points set

and three configuration files with the tracking method from fda_tracker_capturer.xml:

  • video_worker_fdatracker.xml with the fda points set
  • video_worker_fdatracker_fake_detector.xml with the fda points set
  • video_worker_fdatracker_blf_fda.xml with the fda points set

(see Anthropometric Points, Capturer Class Reference).

If VideoWorker is used only for face tracking, create it with matching_threads_count=0 and processing_threads_count=0; the standard Face Detector license is used in this case. To create Face Detector for one stream, specify the streams_count=1 parameter.

To provide video frames, call VideoWorker.addVideoFrame. This method is thread-safe, so you can provide frames from different threads, one created for each video stream, without additional synchronization. The method returns an integer frame id that identifies this frame in the callbacks.

You have to use two callbacks for face tracking:

  • VideoWorker.TrackingCallbackU provides the tracking results. This callback is called every time a frame has been processed by the tracking conveyor. The Tracking callback with frame_id equal to X is called no earlier than the moment VideoWorker.addVideoFrame returns the value X + N - 1, where N is the value returned by VideoWorker.getTrackingConveyorSize. Tracking callbacks with the same stream_id are called in ascending frame_id order. Therefore, if a callback with stream_id = 2 and frame_id = 102 is received immediately after a callback with stream_id = 2 and frame_id = 100, the frame with frame_id = 101 was skipped for video stream 2. Most of the samples are created from the frame_id frame, but some samples can be obtained from previous frames. Use the RawSample.getFrameID method to determine which frame a sample actually belongs to. To subscribe to this callback, use the VideoWorker.addTrackingCallbackU method. To unsubscribe from this callback, use the VideoWorker.removeTrackingCallback method, passing the callback_id returned by VideoWorker.addTrackingCallbackU.

  • VideoWorker.TrackingLostCallbackU returns the best sample and face template when tracking is lost (for example, when a person leaves the frame). The best sample can be empty if the weak_tracks_in_tracking_callback configuration parameter is enabled. It is guaranteed that this is the last callback for the pair <stream_id, track_id> (track_id is equal to sample.getID() for a sample given in any VideoWorker callback). That is, after this callback, no Tracking, MatchFound or TrackingLost callback for this stream_id can contain a sample with the same track_id identifier. It is also guaranteed that for each pair <stream_id, track_id>, which was mentioned in the Tracking callback, there is exactly one TrackingLost callback, except for the tracks removed during VideoWorker.resetStream – the TrackingLost callback won't be called for these tracks. Use the return value of VideoWorker.resetStream to release the memory allocated for these tracks. To subscribe to this callback, use the VideoWorker.addTrackingLostCallbackU method. To unsubscribe from this callback, use the VideoWorker.removeTrackingLostCallback method by providing the callback_id that you received from the VideoWorker.addTrackingLostCallbackU method.
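
The ordering guarantee of the Tracking callback can be used to count skipped frames. The following is a minimal sketch of that bookkeeping, independent of the SDK: SkippedFrameCounter and onTrackingCallback are hypothetical names, and the code assumes, as in the example above, that a gap between consecutive frame_id values of one stream means skipped frames.

```cpp
#include <cassert>
#include <map>

// Hypothetical helper (not part of the SDK): counts frames skipped per stream,
// based on the frame_id values seen in successive Tracking callbacks.
class SkippedFrameCounter {
public:
    // Call with the stream_id and frame_id taken from each Tracking callback.
    // Returns the number of frames skipped since the previous callback
    // for the same stream (0 for the first callback of a stream).
    int onTrackingCallback(int stream_id, int frame_id) {
        auto it = last_frame_id_.find(stream_id);
        if (it == last_frame_id_.end()) {
            last_frame_id_[stream_id] = frame_id;
            return 0;
        }
        // Callbacks with the same stream_id arrive in ascending frame_id order.
        assert(frame_id > it->second);
        const int skipped = frame_id - it->second - 1;
        it->second = frame_id;
        return skipped;
    }

private:
    std::map<int, int> last_frame_id_;  // stream_id -> last seen frame_id
};
```

In a real application the two arguments would be copied from the data passed to the Tracking callback, and the counter would be owned by the code that registered the callback.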

Note: Exceptions thrown in the callbacks are caught and rethrown in the VideoWorker.checkExceptions member function. Therefore, call VideoWorker.checkExceptions periodically to check for errors.
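
The capture-and-rethrow behavior can be pictured with the standard std::exception_ptr mechanism. This is only an illustration of the general pattern, not the SDK's implementation; CallbackRunner and its run method are hypothetical, and the sketch is single-threaded (a multithreaded version would need to guard the stored exception with a mutex).

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <utility>

// Hypothetical illustration (not the SDK's implementation): exceptions thrown
// in callbacks are stored and rethrown later from a check method.
class CallbackRunner {
public:
    // Runs a user callback; any exception is captured instead of propagating.
    void run(const std::function<void()>& callback) {
        assert(static_cast<bool>(callback));  // an empty std::function is a usage error
        try {
            callback();
        } catch (...) {
            if (!pending_) {
                pending_ = std::current_exception();  // keep the first error only
            }
        }
    }

    // Rethrows the stored exception, if any, in the caller's context.
    void checkExceptions() {
        std::exception_ptr error;
        std::swap(error, pending_);  // clear the stored error before rethrowing
        if (error) {
            std::rethrow_exception(error);
        }
    }

private:
    std::exception_ptr pending_;
};
```

With this pattern an error raised inside a callback stays invisible until the check method is called, which is why the check has to be called regularly from the application code.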

WARNING: Do not call the methods that change the state of VideoWorker inside the callbacks in order to avoid a deadlock. That is, only the VideoWorker.getMethodName and VideoWorker.getStreamsCount member functions are safe for calling in callbacks.

Creating Templates

If template creation is required in addition to detection, create VideoWorker with matching_threads_count=0 and processing_threads_count>0; the Video Engine Standard license is used in this case. To create Video Engine Standard for one stream, specify the parameters streams_count=1, processing_threads_count=1, matching_threads_count=0.

You can disable / enable the creation of templates for a specific video stream using the VideoWorker.disableProcessingOnStream and VideoWorker.enableProcessingOnStream member functions. At start, template creation is enabled for all video streams.

VideoWorker.TemplateCreatedCallbackU provides template generation results. This callback is called whenever a template is created within the VideoWorker. It is guaranteed that this callback will be called after at least one Tracking callback and before a TrackingLost callback with the same stream_id and track_id (track_id = sample->getID()). To subscribe to this callback, use the VideoWorker.addTemplateCreatedCallbackU method. To unsubscribe from this callback, use the VideoWorker.removeTemplateCreatedCallback method by providing the callback_id that you received from the VideoWorker.addTemplateCreatedCallbackU method.

Recognizing Faces

If face tracking, template creation, and matching with the database are all required, create VideoWorker with matching_threads_count>0 and processing_threads_count>0; the Video Engine Extended license is used in this case. To create Video Engine Extended for one stream, specify the parameters streams_count=1, processing_threads_count=1, matching_threads_count=1.
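
The relation between thread counts and licenses stated in this document (tracking only, template creation, recognition) can be summarized in a small helper. This is a sketch for illustration only: licenseForThreadCounts is a hypothetical function, not an SDK call, and it simply restates the three combinations described in the text.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the SDK): names the license that this
// document associates with a given combination of thread counts.
std::string licenseForThreadCounts(int processing_threads_count,
                                   int matching_threads_count) {
    assert(processing_threads_count >= 0 && matching_threads_count >= 0);
    if (processing_threads_count == 0 && matching_threads_count == 0) {
        return "Face Detector";          // face tracking only
    }
    if (processing_threads_count > 0 && matching_threads_count == 0) {
        return "Video Engine Standard";  // tracking and template creation
    }
    if (processing_threads_count > 0 && matching_threads_count > 0) {
        return "Video Engine Extended";  // tracking, templates, and matching
    }
    return "";  // matching without processing is not described in this document
}
```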

Use the VideoWorker.setDatabase member function to set up or change the database. It can be called at any time.

VideoWorker.MatchFoundCallbackU returns the result of matching with the database. When a template is created for a tracked face, it is compared with each template from the database, and if the distance to the closest element is less than the distance_threshold specified in this element, a match is registered. This callback is called after N consecutive matches with elements belonging to the same person.

You can set the <not_found_match_found_callback> tag to 1 to enable this callback after N consecutive not-found hits (i.e. when the closest element is beyond its distance_threshold). In this case, match_result of the first element in VideoWorker.MatchFoundCallbackData.search_results will be at zero distance, and the person_id and element_id identifiers will be equal to VideoWorker.MATCH_NOT_FOUND_ID. The number N is set in the configuration file in the <consecutive_match_count_for_match_found_callback> tag.

It is guaranteed that this callback will be called after at least one Tracking callback and before a TrackingLost callback with the same stream_id and track_id (track_id = sample->getID()). To subscribe to this callback, use the VideoWorker.addMatchFoundCallbackU method. To unsubscribe from this callback, use the VideoWorker.removeMatchFoundCallback method by providing the callback_id that you received from the VideoWorker.addMatchFoundCallbackU method. The maximum number of elements returned in VideoWorker.MatchFoundCallbackData.search_results is set in the configuration file in the search_k tag and can be changed with the FacerecService.Config.overrideParameter method, for example: video_worker_config.overrideParameter("search_k", 3);
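
The rule of N consecutive matches with the same person can be sketched as a small counter kept per track. This models only the counting logic described above and is not the SDK's implementation; ConsecutiveMatchCounter and onMatch are hypothetical names, and person_id stands for the identifier of the closest database person for the latest template.

```cpp
#include <cassert>

// Hypothetical sketch (not the SDK's implementation) of the rule
// "a match is valid after N consecutive matches with the same person".
class ConsecutiveMatchCounter {
public:
    // n corresponds to consecutive_match_count_for_match_found_callback.
    explicit ConsecutiveMatchCounter(int n) : n_(n) { assert(n > 0); }

    // person_id: the person matched for the latest template of one track.
    // Returns true once the last N results all refer to the same person_id.
    bool onMatch(long long person_id) {
        if (count_ > 0 && person_id == last_person_id_) {
            ++count_;
        } else {
            last_person_id_ = person_id;  // a different person restarts the run
            count_ = 1;
        }
        return count_ >= n_;
    }

private:
    int n_;
    int count_ = 0;
    long long last_person_id_ = 0;
};
```

A match with a different person resets the run, so a track that alternates between two candidates never reaches the threshold.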

Estimation of age, gender, and emotions

To estimate age and gender, specify the parameter age_gender_estimation_threads_count > 0. To estimate emotions, specify the parameter emotions_estimation_threads_count > 0. The information about age, gender, and emotions is returned in VideoWorker.TrackingCallbackU. The information about emotions is constantly updated. The information about age and gender is updated only if there is a sample of better quality. By default the estimation of age, gender, and emotions is enabled after you create VideoWorker.

To disable estimation of age, gender, and emotions on a specified stream, use the following methods:

  • VideoWorker.disableAgeGenderEstimationOnStream (age and gender)
  • VideoWorker.disableEmotionsEstimationOnStream (emotions)

To enable estimation of age, gender, and emotions on a specified stream again, use the following methods:

  • VideoWorker.enableAgeGenderEstimationOnStream (age and gender)
  • VideoWorker.enableEmotionsEstimationOnStream (emotions)

Liveness Estimation

Active Liveness

To enable this type of liveness estimation, set the enable_active_liveness parameter in the VideoWorker configuration file to 1. All faces for identification will then be subjected to several checks. The following checks are available (set in the LivenessChecks structure):

  • SMILE: smile
  • BLINK: blink
  • TURN_UP: turn your head up
  • TURN_DOWN: turn your head down
  • TURN_RIGHT: turn your head to the right
  • TURN_LEFT: turn your head to the left
  • PERSPECTIVE: face position check (move your face closer to the camera)

Note: to use the SMILE check, you must specify the number of streams in the emotions_estimation_threads_count parameter of the VideoWorker object.

Between the checks, the face should be in a neutral position (in front of the camera). The order of the checks can be random (this mode is selected by default), or set during initialization (by creating a list of non-repeating checks). See an example of setting a checklist in the video_recognition_demo demo program (C++/C#/Java/Python).

The check status is returned in the active_liveness_result attribute of TrackingCallback. This attribute contains the following fields:

  • verdict: status of the current check (the ActiveLiveness.Verdict object)
  • type: type of check (the ActiveLiveness.LivenessChecks object, see the description above)
  • progress_level: degree of confidence for check passing - a number in the range of [0,1]

The ActiveLiveness.Verdict object contains the following fields:

  • ALL_CHECKS_PASSED: all checks passed
  • CURRENT_CHECK_PASSED: current check passed
  • CHECK_FAIL: check failed
  • WAITING_FACE_ALIGN: waiting for neutral face position
  • NOT_COMPUTED: liveness is not estimated
  • IN_PROGRESS: check is in progress
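
An application typically turns the verdict and progress_level into a hint for the user. The sketch below shows one way to do this under stated assumptions: the Verdict enum is a local, hypothetical mirror of the values listed above (real code would use the ActiveLiveness.Verdict type from the SDK), and livenessHint and its messages are invented for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical local mirror of the verdict values listed above; real code
// would use the ActiveLiveness.Verdict type provided by the SDK instead.
enum class Verdict {
    ALL_CHECKS_PASSED,
    CURRENT_CHECK_PASSED,
    CHECK_FAIL,
    WAITING_FACE_ALIGN,
    NOT_COMPUTED,
    IN_PROGRESS
};

// Builds a user-facing hint from a verdict and a progress_level in [0, 1].
std::string livenessHint(Verdict verdict, float progress_level) {
    assert(progress_level >= 0.0f && progress_level <= 1.0f);
    switch (verdict) {
        case Verdict::ALL_CHECKS_PASSED:    return "liveness confirmed";
        case Verdict::CURRENT_CHECK_PASSED: return "check passed, wait for the next one";
        case Verdict::CHECK_FAIL:           return "check failed";
        case Verdict::WAITING_FACE_ALIGN:   return "look straight at the camera";
        case Verdict::NOT_COMPUTED:         return "liveness is not estimated";
        case Verdict::IN_PROGRESS:
            // Show the progress as a whole percentage.
            return "in progress: " +
                   std::to_string(static_cast<int>(progress_level * 100)) + "%";
    }
    return "unknown verdict";
}
```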

Short Time Identification

Short time identification (STI) is used to recognize a track as a person who was in front of a camera a short time ago, even if this person is not in the database and even if matching is disabled. For example, if a person is detected, tracked, lost, and then detected and tracked again within one minute, the two tracks are considered to belong to the same person.

If short time identification is enabled, VideoWorker matches the tracks where a face is lost with other tracks where a face was lost no more than sti_outdate_time seconds ago. Matched tracks are grouped as an sti_person. The ID of this group (sti_person_id) is returned in VideoWorker.TrackingLostCallbackU. The value of sti_person_id is equal to the track_id value of the first element that formed the sti_person group. When an sti_person group exceeds the specified period sti_outdate_time, VideoWorker.StiPersonOutdatedCallbackU is called.

Short time identification does not affect the usage of the license. To use this function, there should be at least one thread for template creation (processing_threads_count>0).
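
The grouping of lost tracks into sti_person groups can be pictured with the following simplified model. It is a sketch under several assumptions and not the SDK's algorithm: a single number stands in for a face template, the absolute difference stands in for the recognition distance, and a group's lifetime is assumed to restart whenever a new track joins it. StiGrouper, StiPerson, and onTrackLost are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical model of one sti_person group.
struct StiPerson {
    int sti_person_id;      // track_id of the first track that formed the group
    double descriptor;      // stand-in for a face template
    double last_lost_time;  // time (seconds) when the group last gained a track
};

// Hypothetical sketch (not the SDK's algorithm) of short time identification.
class StiGrouper {
public:
    StiGrouper(double distance_threshold, double outdate_time_seconds)
        : distance_threshold_(distance_threshold),
          outdate_time_(outdate_time_seconds) {
        assert(distance_threshold > 0 && outdate_time_seconds > 0);
    }

    // Called when a track is lost; returns the sti_person_id for this track.
    int onTrackLost(int track_id, double descriptor, double now) {
        // Drop groups that are older than the outdate period.
        const double outdate_time = outdate_time_;
        persons_.erase(
            std::remove_if(persons_.begin(), persons_.end(),
                           [now, outdate_time](const StiPerson& p) {
                               return now - p.last_lost_time > outdate_time;
                           }),
            persons_.end());
        // Join an existing group if the descriptor is close enough.
        for (StiPerson& p : persons_) {
            if (std::abs(p.descriptor - descriptor) < distance_threshold_) {
                p.last_lost_time = now;
                return p.sti_person_id;
            }
        }
        // Otherwise this track starts a new group named after its track_id.
        persons_.push_back(StiPerson{track_id, descriptor, now});
        return track_id;
    }

private:
    double distance_threshold_;
    double outdate_time_;
    std::vector<StiPerson> persons_;
};
```

In this model the two constructor arguments play the roles of sti_recognition_threshold and sti_outdate_time from the VideoWorker parameters.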

Detailed Info about VideoWorker Configuration Parameters

The following parameters from the configuration file can be changed with the FacerecService.Config.overrideParameter method:
  • max_processed_width and max_processed_height – to limit the size of the image that is submitted to the internal detector of new faces.
  • min_size and max_size – minimum and maximum face size for detection (the size is defined for the image already downscaled under max_processed_width and max_processed_height). For the REFA detector you can specify relative values; the absolute values are then computed relative to the image width.
  • min_neighbors – an integer detector parameter. Please note that large values require greater detection confidence. You can change this parameter based on the situation, for example, increase the value if a large number of false detections are observed, and decrease the value if a large number of faces are not detected. Do not change this setting if you are not sure.
  • min_detection_period – a real number that means the minimum time (in seconds) between two runs of the internal detector. A zero value means ‘no restrictions’. It is used to reduce the processor load. Large values increase the latency in detecting new faces.
  • max_detection_period – an integer that means the max time (in frames) between two runs of the internal detector. A zero value means ‘no restrictions’. For example, if you are processing a video offline, you can set the value to 1 so as not to miss a single person.
  • consecutive_match_count_for_match_found_callback – an integer that means the number of consecutive matches of a track with the same person from the database to consider this match valid (see also VideoWorker.MatchFoundCallbackU).
  • recognition_yaw_min_threshold, recognition_yaw_max_threshold, recognition_pitch_min_threshold and recognition_pitch_max_threshold – real numbers that mean the restrictions on the face orientation to be used for recognition.
  • min_tracking_face_size – a real number, the minimum size of a tracked face.
  • max_tracking_face_size – a real number, the maximum size of a tracked face; a non-positive value removes this limitation.
  • min_template_generation_face_size – a real number, the minimum face size for template generation; smaller faces are marked as weak = true.
  • single_match_mode – an integer, 1 means that the single match mode is enabled, 0 means that it is disabled. If this mode is on, a track that has been matched with a person from the database no longer generates templates and is not matched with the database again.
  • delayed_samples_in_tracking_callback – an integer, 1 means enabled, 0 means disabled. The internal detector, which detects new faces, cannot process every frame in time, so some samples may be received with a delay. If the value is 1, all delayed samples are passed to TrackingCallbackFunc. Otherwise, delayed samples are passed only when the callback order would otherwise be violated (i.e. a track sample must appear at least once in TrackingCallback before TrackingLostCallback is called for this track). Use the RawSample.getFrameID method to determine which frame a sample actually belongs to.
  • weak_tracks_in_tracking_callback – an integer, 1 means enabled, 0 means disabled. By default this flag is disabled and samples with the flag of weak = true are not passed to the Tracking callback if at the time of their creation there were no samples with the same track_id (track_id = sample.getID()) with the flag of weak=false. If weak_tracks_in_tracking_callback is enabled then all samples are passed to the Tracking callback, so the TrackingLost callback can be called with best_quality_sample = NULL.
  • search_k – an integer, which means the maximum number of elements returned to the VideoWorker.MatchFoundCallbackU callback, i.e. this is the k parameter, passed internally in the Recognizer.search method.
  • processing_queue_size_limit – an integer that means the max count of samples in the queue for template creation. It is used to limit the memory consumption in cases, where new faces appear faster than templates are created.
  • matching_queue_size_limit – an integer that means the max count of templates in the queue for matching with database. It is used to limit the memory consumption in cases, where new templates appear faster than they are matched with the database (this can happen in case of a very large database).
  • recognizer_processing_less_memory_consumption – an integer that will be used as a value of the processing_less_memory_consumption flag in the FacerecService.createRecognizer method to create an internal recognizer.
  • not_found_match_found_callback – an integer, 1 means enabled, 0 means disabled, calls VideoWorker.MatchFoundCallbackU after N consecutive mismatches.
  • depth_data_flag – an integer, 1 turns on depth frame processing to confirm face liveness (that is, that the face belongs to a real person) during face recognition, 0 turns this mode off. All overridden parameters with the name prefix depth_liveness. are forwarded to the DepthLivenessEstimator config with the prefix removed. So to override the paramname parameter for depth liveness, override depth_liveness.paramname for VideoWorker. See FacerecService.Config.overrideParameter.
  • timestamp_distance_threshold_in_microsecs – maximum allowed distance between the timestamps of a color image and a corresponding depth frame in microseconds; used if depth_data_flag is set.
  • max_frames_number_to_synch_depth – maximum queue size when synchronizing the depth data; used if depth_data_flag is set.
  • max_frames_queue_size – maximum queue size of frames used by the tracker; when depth_data_flag is set, the recommended value is max(3, rgbFPS / depthFPS).
  • offline_work_i_e_dont_use_time – an integer, 1 means enabled, 0 means disabled. Default value is disabled. When enabled, the check with max_detector_confirm_wait_time is not performed, and max_occlusion_count_wait is used instead of max_occlusion_time_wait.
  • max_occlusion_time_wait – a real number in seconds. When the tracker detects face occlusion, it holds the face position and tries to track it on new frames during this time.
  • max_occlusion_count_wait – an integer. Means the same as max_occlusion_time_wait but in this case time is measured in frames instead of seconds. Used only when offline_work_i_e_dont_use_time is enabled.
  • fda_max_bad_count_wait – an integer. When fda_tracker detects decline in the face quality, it tries to track that face with the general purpose tracker (instead of the fda method designed and tuned for faces) during at most fda_max_bad_count_wait frames.
  • base_angle – an integer: 0, 1, 2, or 3. Set camera orientation: 0 means standard (default), 1 means +90 degrees, 2 means -90 degrees, 3 means 180 degrees.
  • fake_detections_cnt – an integer. Number of start positions to search a face using video_worker_fdatracker_fake_detector.xml.
  • fake_detections_period – an integer. Each start position will be used once in fake_detections_period frames.
  • fake_rect_center_xN, fake_rect_center_yN, fake_rect_angleN, fake_rect_sizeN – real numbers. Parameters of start positions. N ranges from 0 to fake_detections_cnt - 1 inclusive. fake_rect_center_xN – x coordinate of a center relative to the image width. fake_rect_center_yN – y coordinate of a center relative to the image height. fake_rect_angleN – roll angle in degrees. fake_rect_sizeN – size relative to max(image width, image height).
  • downscale_rawsamples_to_preferred_size – an integer, 1 means enabled, 0 means disabled. The default value is enabled. When enabled, VideoWorker downscales each sample to the suitable size (see RawSample.downscaleToPreferredSize) in order to reduce memory consumption. However, it decreases the performance. It's recommended to disable downscale_rawsamples_to_preferred_size and use RawSample.downscaleToPreferredSize manually for RawSamples that you need to save or keep in RAM for a long time.
  • squeeze_match_found_callback_groups – an integer, 1 means enabled, 0 means disabled. Default value is disabled. When the track gets the first N consecutive matches with the same person from the database, all N matches will be reported (where N = value of consecutive_match_count_for_match_found_callback), i.e. there will be N consecutive VideoWorker.MatchFoundCallbackU calls. But if squeeze_match_found_callback_groups is enabled, only one match will be reported – the one with the minimal distance to the database.
  • debug_log_enabled – an integer, 1 means enabled, 0 means disabled. Default value is disabled. This can also be enabled by setting the environment variable FACE_SDK_VIDEOWORKER_DEBUG_LOG_ENABLED to 1. If enabled, VideoWorker logs the result of its work in std::clog.
  • need_stable_results – an integer, 1 means enabled, 0 means disabled. Default value is disabled. The idea is to produce the same results when working with the same data. It disables a few optimizations that can produce slightly different results due to the unstable order of multithreaded computations. It also enables offline_work_i_e_dont_use_time, sets the max_detection_period value to 1, and disables the frame skip (the limit set by max_frames_queue_size).
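
Two of the rules from the list above lend themselves to a short worked example: the forwarding of depth_liveness. overrides with the prefix removed, and the recommended max_frames_queue_size of max(3, rgbFPS / depthFPS). The helpers below are hypothetical and only restate these rules; they are not SDK functions, and the use of integer division for the FPS ratio is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper (not part of the SDK): returns the name under which a
// VideoWorker override is forwarded to the DepthLivenessEstimator config,
// or an empty string if the override is not forwarded.
std::string depthLivenessParamName(const std::string& overridden_name) {
    const std::string prefix = "depth_liveness.";
    if (overridden_name.compare(0, prefix.size(), prefix) == 0) {
        return overridden_name.substr(prefix.size());  // prefix removed
    }
    return "";
}

// Hypothetical helper: recommended max_frames_queue_size when depth_data_flag
// is set, i.e. max(3, rgbFPS / depthFPS). Integer division is an assumption.
int recommendedMaxFramesQueueSize(int rgb_fps, int depth_fps) {
    assert(rgb_fps > 0 && depth_fps > 0);
    return std::max(3, rgb_fps / depth_fps);
}
```

For example, a 30 FPS color stream paired with a 5 FPS depth stream gives a recommended queue size of 6, while equal frame rates give the minimum of 3.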