Version: 3.23.1

GPU usage

Face recognition is computationally intensive, so the Face SDK modules that run deep learning algorithms support GPU acceleration.

In this section you'll learn

  • which Face SDK modules GPU acceleration is available for
  • how to enable GPU acceleration
  • timing characteristics for Face SDK modules with CPU and GPU usage
  • possible errors during GPU usage, and relevant solutions.

Desktop

System requirements for GPU usage

Currently, GPU acceleration is available for the following modules (single GPU mode only):

  • Recognition methods (12v30, 12v50, 12v100, 12v1000, 11v1000, 10v30, 10v100, 10v1000, 9v30mask, 9v300mask, 9v1000mask) (see Facial Recognition)
  • Detectors (BLF, REFA, ULD) (see Face Capturing)

To run models on a GPU, edit the appropriate recognizer configuration file: change the use_cuda parameter from 0 to 1.
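The edit can also be scripted. The following is a minimal sketch, assuming the configuration file stores each parameter as a plain XML element such as `<use_cuda>0</use_cuda>`; that layout, the helper name `set_config_param`, and the sample text are illustrative assumptions, not part of the Face SDK API.

```python
import re


def set_config_param(xml_text: str, name: str, value: int) -> str:
    """Return xml_text with the element <name>...</name> set to value.

    Assumes (hypothetically) that the parameter is stored as a simple
    XML element with no attributes and no nested elements.
    """
    tag = re.escape(name)
    pattern = re.compile(rf"(<{tag}>)[^<]*(</{tag}>)")
    new_text, count = pattern.subn(rf"\g<1>{value}\g<2>", xml_text)
    if count == 0:
        raise KeyError(f"parameter {name!r} not found in configuration text")
    return new_text


# Hypothetical configuration fragment used only for illustration.
sample = "<config>\n  <use_cuda>0</use_cuda>\n  <gpu_index>0</gpu_index>\n</config>"
updated = set_config_param(sample, "use_cuda", 1)
print(updated)
```

In practice the text would be read from and written back to the recognizer configuration file; keeping a backup copy of the original file is advisable.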

Note: To run processing on CUDA 10.1, add the use_legacy field with a value of 1 to the object configuration file.

GPU acceleration runs on a single GPU, by default the one with index 0. The GPU index can be changed in either of the following ways:

  • via the gpu_index parameter in the recognizer configuration file
  • via the CUDA_VISIBLE_DEVICES environment variable (see more info about CUDA Environment Variables)
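As an illustration of the environment variable option, the sketch below sets CUDA_VISIBLE_DEVICES from Python before any CUDA-based library is initialized in the process. The helper name `select_gpu` is hypothetical. Note that CUDA renumbers the visible devices from 0, so when a single device is exposed this way it is seen as device 0 inside the process.

```python
import os


def select_gpu(physical_index: int) -> None:
    """Expose only one physical GPU to CUDA for this process.

    Must be called before the first CUDA-based library is loaded;
    changing the variable afterwards has no effect on an already
    initialized CUDA runtime.
    """
    if physical_index < 0:
        raise ValueError("GPU index must be non-negative")
    os.environ["CUDA_VISIBLE_DEVICES"] = str(physical_index)


# Example: use the second physical GPU (index 1).
select_gpu(1)
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

The same effect can be achieved by exporting the variable in the shell before launching the application.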

Test results

The table below shows the time spent on extraction of one biometric template using CPU and GPU:

| Method | GPU | CPU |
|---|---|---|
| 12v1000 | 47 ms | 442 ms |
| 9v300 | 10 ms | 292 ms |
| 12v100 | 8 ms | 49 ms |
| 12v50 | 6 ms | 21 ms |
| 12v30 | 5 ms | 12 ms |

Note: NVIDIA GeForce GTX 1070 and Intel Core i5-9400 4.0GHz were used for the speed test.
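To make the comparison easier to read, the speedup for each method can be computed as CPU time divided by GPU time. The sketch below uses only the desktop figures from the table above; the variable names are illustrative.

```python
# Template extraction time in milliseconds: (GPU, CPU), taken from the desktop table.
timings_ms = {
    "12v1000": (47, 442),
    "9v300": (10, 292),
    "12v100": (8, 49),
    "12v50": (6, 21),
    "12v30": (5, 12),
}

# Speedup = CPU time / GPU time; values above 1 mean the GPU is faster.
speedups = {method: cpu / gpu for method, (gpu, cpu) in timings_ms.items()}

for method, factor in speedups.items():
    print(f"{method}: {factor:.1f}x faster on GPU")
```

On this hardware the gain ranges from about 2.4x for 12v30 to about 29x for 9v300, so the heavier methods benefit most from GPU acceleration.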

Troubleshooting

| Error | Solution |
|---|---|
| Assertion failed (Cannot open shared object file libtensorflow.so.2) | Make sure the library file libtensorflow.so.2 is in the same directory as the libfacerec.so library you are using |
| Assertion failed (Cannot open shared object file tensorflow.dll) | Make sure the library file tensorflow.dll is in the same directory as the facerec.dll library you are using |
| Slow initialization | Increase the default JIT cache size: `export CUDA_CACHE_MAXSIZE=2147483647` (see JIT Caching) |
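The first two errors can be checked before the SDK is loaded. The following is a minimal sketch of such a check, using only the library file names mentioned above; the helper name `tensorflow_lib_present` is hypothetical, and the temporary directory stands in for a real installation path.

```python
import tempfile
from pathlib import Path

# TensorFlow library expected next to each Face SDK library (file names from the table above).
COMPANIONS = {
    "libfacerec.so": "libtensorflow.so.2",
    "facerec.dll": "tensorflow.dll",
}


def tensorflow_lib_present(facerec_lib: Path) -> bool:
    """Return True if the TensorFlow library sits in the same directory as facerec_lib."""
    companion = COMPANIONS[facerec_lib.name]
    return (facerec_lib.parent / companion).is_file()


# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    lib = Path(tmp) / "libfacerec.so"
    lib.touch()
    before = tensorflow_lib_present(lib)   # TensorFlow library not yet in place
    (Path(tmp) / "libtensorflow.so.2").touch()
    after = tensorflow_lib_present(lib)    # TensorFlow library now alongside
    print(before, after)
```

Running a check like this at startup gives a clearer message than the assertion failure raised when the library cannot be opened.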

Android

Currently, GPU acceleration is available for the following modules:

GPU usage can be enabled or disabled via the use_mobile_gpu flag in the configuration files of the Capturer, Recognizer, and VideoWorker objects (in the VideoWorker configuration file, the flag enables the GPU for detectors). Mobile GPU support is enabled by default (the value is 1). To disable GPU usage, set the use_mobile_gpu flag to 0.

Test results

The table below shows the time spent on extraction of one biometric template using CPU and GPU:

| Method | CPU | GPU |
|---|---|---|
| 9v1000 | 3660 ms | 610 ms |
| 9v300 | 1960 ms | 280 ms |
| 9v30 | 170 ms | 70 ms |

Note: The speed test was performed using Google Pixel 3.

Note: GPU acceleration doesn't work on some Android devices.