Version: 1.0.0

Troubleshooting

Error occurred when generating offline license request

Problem: When you run the command ./setup/activate-lic-server.sh --generate-offline, you may get the following error:

ERR Missing file path for offline activation request file! Specify path using '--offline-request' option.

Solution: Make sure that the platform.secrets.json and license-server.settings.cfg files contain values for the license-secret, license_key and license_server_address variables.
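The check described above can be sketched as a small shell function. This is a minimal sketch, not part of the Platform tooling: the function name check_license_vars is hypothetical, and because the source does not say which variable lives in which file, it searches every file passed to it. It only confirms that each variable name appears, not that its value is valid.

```shell
#!/bin/sh
# Hypothetical helper: report which of the given files mention each
# required variable name. It checks for the name only, not for a valid value.
check_license_vars() {
    if [ "$#" -eq 0 ]; then
        echo "usage: check_license_vars FILE..." >&2
        return 2
    fi
    missing=0
    for key in license-secret license_key license_server_address; do
        # -l lists files containing the fixed string (-F); -s hides
        # errors about unreadable or nonexistent files.
        found=$(grep -l -s -F -e "$key" -- "$@" | tr '\n' ' ')
        if [ -n "$found" ]; then
            echo "OK       $key (in: ${found% })"
        else
            echo "MISSING  $key"
            missing=1
        fi
    done
    return "$missing"
}
```

Example call, assuming both files are located in ./cfg (the source confirms this location only for platform.secrets.json):

 check_license_vars ./cfg/platform.secrets.json ./cfg/license-server.settings.cfg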

Error with nvidia-device-plugin when checking cluster components

Problem: When you run the command kubectl get all --all-namespaces, you may see the following error:

Error: failed to start container "nvidia-device-plugin-ctr": Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: nvml error: driver/library version mismatch: unknown


Solution:

  1. For information about your graphics card and the available drivers, run the following command:
 ubuntu-drivers devices
  2. In the example output below, the system has a "GeForce GTX 1050 Ti" graphics card, and the recommended driver is "nvidia-driver-515":
 == /sys/devices/pci0000:00/0000:00:10.0 ==
 modalias : pci:v000010DEd00001C82sv00001458sd00003764bc03sc00i00
 vendor : NVIDIA Corporation
 model : GP107 [GeForce GTX 1050 Ti]
 manual_install: True
 driver : nvidia-driver-510-server - distro non-free
 driver : nvidia-driver-450-server - distro non-free
 driver : nvidia-driver-390 - distro non-free
 driver : nvidia-driver-520 - distro non-free
 driver : nvidia-driver-418-server - distro non-free
 driver : nvidia-driver-515-server - distro non-free
 driver : nvidia-driver-515 - distro non-free recommended
 driver : nvidia-driver-510 - distro non-free
 driver : nvidia-driver-470-server - distro non-free
 driver : nvidia-driver-470 - distro non-free
 driver : xserver-xorg-video-nouveau - distro free builtin
  3. To install the recommended driver, run the command:
 sudo apt install nvidia-driver-515
  4. After installing the driver, check the status of the graphics card with the nvidia-smi monitoring tool:
 nvidia-smi
  5. To view the installed driver version, run the command:
 cat /proc/driver/nvidia/version
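To follow up on the version check in the last step, the sketch below compares the version of the currently loaded NVIDIA kernel module with the version of the module installed on disk. It rests on assumptions not stated in the original: the function names are hypothetical, the kernel module is assumed to be named nvidia, and a difference between the two versions is treated only as a hint that the newly installed driver has not been loaded yet.

```shell
#!/bin/sh
# Hypothetical helper: print the first dotted version number
# (major.minor or major.minor.patch) found on stdin.
extract_driver_version() {
    grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# Hypothetical helper: compare the loaded kernel module version with the
# version of the module installed on disk (module name "nvidia" is assumed).
check_nvidia_versions() {
    proc_file=/proc/driver/nvidia/version
    if [ ! -r "$proc_file" ]; then
        echo "NVIDIA kernel module does not appear to be loaded"
        return 2
    fi
    loaded=$(extract_driver_version < "$proc_file")
    on_disk=$(modinfo -F version nvidia 2>/dev/null)
    echo "loaded kernel module: ${loaded:-unknown}"
    echo "module on disk:       ${on_disk:-unknown}"
    # Succeeds only when both versions are known and identical.
    [ -n "$loaded" ] && [ "$loaded" = "$on_disk" ]
}
```

Example call:

 check_nvidia_versions || echo "versions differ or could not be determined"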

Error occurred when deploying the Platform in a cluster

Problem: Not all services start when you run the command ./setup/deploy.sh.

Solution: View the log of the platform-postgres-dep pod using the command:

 kubectl logs -f <full name of pod>

If the output shows an error about an incorrect database name or credentials, check that the credentials entered in the ./cfg/platform.secrets.json file are correct.
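Finding the full pod name and reading its log can be combined in one step. The following is a minimal sketch under assumptions not stated in the original: the function names are hypothetical, the pod name is assumed to start with platform-postgres-dep, and the pod is assumed to run in the namespace currently selected for kubectl.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods -o name` output on stdin
# and print the names of pods that start with the given prefix.
filter_pods() {
    prefix=$1
    sed -n "s|^pod/\(${prefix}.*\)\$|\1|p"
}

# Hypothetical helper: print the last 50 log lines of every matching pod
# in the current namespace.
show_postgres_logs() {
    kubectl get pods -o name | filter_pods platform-postgres-dep |
    while read -r pod; do
        echo "== $pod =="
        kubectl logs --tail=50 "$pod"
    done
}
```

Example call:

 show_postgres_logs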

Problems connecting to BAF Server

The connection to BAF Server may be refused because of the firewall in your OS.

Solution: If the firewall uses the default settings, open all firewall ports. If the firewall uses custom settings, open ports 80, 443, 8080, and 8090.
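For the custom-settings case, the sketch below generates the rules for the listed ports. It assumes that the firewall is ufw and that the ports use TCP, neither of which is stated in the original; adapt it if your OS uses a different firewall. The function name print_ufw_rules is hypothetical. It prints the commands instead of running them, so that the rules can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print one `ufw allow` command per port.
# Nothing is executed; arguments that are not valid port numbers are skipped.
print_ufw_rules() {
    for port in "$@"; do
        case $port in
            ''|*[!0-9]*)
                echo "skipping invalid port: $port" >&2
                continue
                ;;
        esac
        if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
            echo "skipping out-of-range port: $port" >&2
            continue
        fi
        echo "ufw allow ${port}/tcp"
    done
}
```

To review the rules, and then apply them:

 print_ufw_rules 80 443 8080 8090
 print_ufw_rules 80 443 8080 8090 | sudo sh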