Version: 1.16.2

Troubleshooting

Error occurred when generating offline license request

Problem: running the command ./cli.sh license-server generate-offline may return the following error:

ERR Missing file path for offline activation request file! Specify path using ‘--offline-request’ option.

Solution: ensure that the files platform.secrets.json and license-server.settings.cfg contain values for license-secret, license_key and license_server_address variables.
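
A quick way to confirm that the three setting names are present before rerunning the command is sketched below. The function name missing_license_settings is hypothetical, and the check assumes that each name appears literally in one of the files. It only looks for the names and does not validate their values.

```shell
# Print every expected setting name that appears in none of the given files.
# Assumption: each name is written literally in one of the files.
missing_license_settings() {
  if [ "$#" -eq 0 ]; then
    echo "usage: missing_license_settings FILE..." >&2
    return 2
  fi
  for key in license-secret license_key license_server_address; do
    # -q: no output, -s: ignore unreadable files, -F: match a fixed string
    if ! grep -qsF -e "$key" "$@"; then
      echo "$key"
    fi
  done
}
```

For example, if both files are stored in ./cfg (the location of license-server.settings.cfg is an assumption here), run:

 missing_license_settings ./cfg/platform.secrets.json ./cfg/license-server.settings.cfg

Any name that is printed is missing from both files.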

Error with nvidia-device-plugin when checking cluster components

Problem: running the command kubectl get all --all-namespaces may show the following error:

Error: failed to start container "nvidia-device-plugin-ctr": Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: nvml error: driver/library version mismatch: unknown

Solution:

  1. For information about your graphics card and available drivers, run the following command:
 ubuntu-drivers devices
 In the example output below, the system has a "GeForce GTX 1050 Ti" graphics card, and the recommended driver is "nvidia-driver-515":
 == /sys/devices/pci0000:00/0000:00:10.0 ==
modalias : pci:v000010DEd00001C82sv00001458sd00003764bc03sc00i00
vendor : NVIDIA Corporation
model : GP107 [GeForce GTX 1050 Ti]
manual_install: True
driver : nvidia-driver-510-server - distro non-free
driver : nvidia-driver-450-server - distro non-free
driver : nvidia-driver-390 - distro non-free
driver : nvidia-driver-520 - distro non-free
driver : nvidia-driver-418-server - distro non-free
driver : nvidia-driver-515-server - distro non-free
driver : nvidia-driver-515 - distro non-free recommended
driver : nvidia-driver-510 - distro non-free
driver : nvidia-driver-470-server - distro non-free
driver : nvidia-driver-470 - distro non-free
driver : xserver-xorg-video-nouveau - distro free builtin
  2. To install the recommended driver, run the command:
 sudo apt install nvidia-driver-515
  3. After installing the driver, check the status of the graphics card with the nvidia-smi monitoring tool:
 nvidia-smi
  4. To view the driver version, run the command:
 cat /proc/driver/nvidia/version
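
The "driver/library version mismatch" message means that the loaded NVIDIA kernel module and the installed driver files report different versions. This commonly persists after a driver installation until the new kernel module is loaded, typically by rebooting. The sketch below compares the two versions. The function names first_version and check_nvidia_versions are hypothetical, and the module name "nvidia" is an assumption about the installed driver package.

```shell
# Print the first dotted version number (such as 515.65.01) found on stdin.
first_version() {
  grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# Compare the NVIDIA kernel module that is currently loaded with the one
# installed on disk. Assumption: the kernel module is named "nvidia".
check_nvidia_versions() {
  loaded=$(first_version < /proc/driver/nvidia/version)
  installed=$(modinfo -F version nvidia 2>/dev/null)
  echo "loaded kernel module: ${loaded:-unknown}"
  echo "installed on disk:    ${installed:-unknown}"
  if [ -n "$loaded" ] && [ "$loaded" = "$installed" ]; then
    echo "versions match"
  else
    echo "versions differ or could not be read"
    return 1
  fi
}
```

Run check_nvidia_versions after installing the driver. If the versions differ, reboot the machine and run it again.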

Error occurred when deploying the Platform in a cluster

Problem: not all services start after running the command ./setup/deploy.sh.

Solution: view the log of the platform-postgres-dep pod using the command:

 kubectl logs -f <full name of pod>

If the output shows an error about an incorrect database name or authorization data, check that the credentials in the ./cfg/platform.secrets.json file are correct.
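
The kubectl logs command needs the full pod name, which includes a generated suffix. The sketch below filters a pod listing by name prefix. The function name pods_with_prefix is hypothetical, and the usage example assumes that the pods run in the current namespace (add -n <namespace> to the kubectl call otherwise).

```shell
# Print the names of pods whose name starts with the given prefix.
# Reads the output of `kubectl get pods --no-headers` on stdin,
# where the pod name is the first column.
pods_with_prefix() {
  awk -v prefix="$1" 'index($1, prefix) == 1 { print $1 }'
}
```

Usage:

 kubectl get pods --no-headers | pods_with_prefix platform-postgres-dep

Pass the printed name to kubectl logs -f.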

Error occurred when uploading images to external registry

Problem: uploading images may fail with the following error:

The push refers to repository [<DOCKER_REGISTRY_SERVER>/<IMAGE>]
Get "<DOCKER_REGISTRY_SERVER>/v2/": x509: certificate signed by unknown authority

Solution: create or edit the file /etc/docker/daemon.json and add your DOCKER_REGISTRY_SERVER to the insecure-registries list:

{
"insecure-registries" : [ "<DOCKER_REGISTRY_SERVER>" ]
}

Restart the Docker service using the command below:

sudo systemctl restart docker

Possible testing errors and solutions

When an error occurs during testing, the system returns a result in the following format:

Error: <error type>
Error message: <error message>

<error type> indicates the type of error that occurred, and <error message> provides more specific information about it.

Commands for debugging services are described in section 3.2.

Combinations of errors and messages with possible solutions are listed below.

ConnectionError:

  • <urlopen error Wrong url format: asdasd> – the URL format is invalid. Enter a valid address.

  • <urlopen error [Errno -2] Name or service not known> / <urlopen error [Errno 111] Connection refused> – the URL points to an unavailable service. Check the entered address, and make sure that OMNI Platform is deployed correctly and is accessible from the outside. If you access the platform by domain name, check that the /etc/hosts file contains an entry that maps the domain to the IP address of the deployed OMNI Platform.

  • HTTP Error 405: Not Allowed – make sure that the URL you entered leads to OMNI Platform and not to a third-party service.

  • HTTP Error 502: Bad Gateway / HTTP Error 503: Service Temporarily Unavailable – make sure the backend-dep service is deployed.
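
The ConnectionError checks above can be sketched as a small script that resolves the host from the URL and prints the HTTP status code the server returns. The function names url_host and check_platform_url are hypothetical. The host extraction is deliberately simple and does not handle IPv6 literals or credentials embedded in the URL.

```shell
# Extract the host part from a URL by dropping the scheme, path and port.
url_host() {
  rest="${1#*://}"
  rest="${rest%%/*}"
  echo "${rest%%:*}"
}

# Resolve the host through the system resolver (which normally consults
# /etc/hosts as well as DNS), then print the HTTP status code for the URL.
check_platform_url() {
  host=$(url_host "$1")
  getent hosts "$host" || { echo "cannot resolve $host"; return 1; }
  curl -sS -o /dev/null --max-time 10 -w '%{http_code}\n' "$1"
}
```

Call check_platform_url with the same URL that was passed to the tests. A resolution failure corresponds to the "Name or service not known" case, and a 405, 502 or 503 status code corresponds to the HTTP errors listed above.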

PlatformError:

  • connection to server at "localhost" (::1), port 5432 failed: Connection refused Is the server running on that host and accepting TCP/IP connections? – make sure the database is available and working properly.

  • Authorization error – make sure that the user email and password you entered are correct.

  • Wrong answer from server. JSON can not decoded – make sure that the URL you entered leads to OMNI Platform and not to a third-party service.

  • License has not been leased yet – make sure the license server is running and OMNI Platform has access to it. Also, check that the license is activated correctly.

  • Low quality photo – check that the service responsible for calculating the quality of photos is available and working correctly.

  • Profile not searched – make sure that the service responsible for searching the database of profiles is available and working correctly.

If you have difficulty with the above errors, or encounter any other errors or messages that cannot be debugged and resolved on the spot, please contact technical support.

Problems with connection to OMNI Platform server

The connection to the OMNI Platform server can be refused by the firewall in your OS.

Solution: if the firewall is used with the default settings, open all firewall ports. If the firewall settings are customized, open ports 8080, 80, 443 and 8090.
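
As an illustration, the sketch below prints the commands that would open the listed ports. It assumes that the firewall is ufw and that the ports carry TCP traffic. Neither assumption is stated in this guide, so adapt the commands to your firewall. The names PLATFORM_PORTS and print_ufw_rules are hypothetical.

```shell
# Ports listed in this section.
PLATFORM_PORTS="8080 80 443 8090"

# Print one `ufw allow` command per port so the rules can be reviewed
# before they are run. Assumption: ufw is the firewall and the ports use TCP.
print_ufw_rules() {
  for port in $PLATFORM_PORTS; do
    echo "sudo ufw allow ${port}/tcp"
  done
}
```

Run print_ufw_rules, review the printed commands, execute them, and then check the result with sudo ufw status.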

The nginx load balancer does not start when installing and configuring the Kubernetes cluster

  1. Add a rule to iptables with the following command:
sudo iptables -A INPUT -p tcp -m tcp --dport 6443 -j ACCEPT
  2. Repeat the commands from the section Install and Configure a Cluster.
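
Running the -A command more than once appends duplicate rules. A minimal sketch that adds the rule only when it is missing is shown below. It uses the iptables -C option, which checks whether a matching rule already exists. The function name ensure_apiserver_rule and the IPTABLES variable are hypothetical.

```shell
# Command used to call iptables; override it if sudo is not needed.
IPTABLES=${IPTABLES:-sudo iptables}

# Append the rule for port 6443 only if an identical rule is not present.
ensure_apiserver_rule() {
  # -C returns success when the rule exists; otherwise -A appends it.
  $IPTABLES -C INPUT -p tcp -m tcp --dport 6443 -j ACCEPT 2>/dev/null ||
    $IPTABLES -A INPUT -p tcp -m tcp --dport 6443 -j ACCEPT
}
```

Rules added this way are usually not kept after a reboot unless they are saved separately.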

Error "Trial license has expired!"

To fix this error, follow all the steps in the section Update a License.