Getting error: The NVIDIA driver was unable to open 'libnvidia-glvkspirv.so.440.59'
I've followed these instructions, and I'm able to run nvidia-smi
inside of the ade env, as well as running ade start --update --enter
without errors.
However when I run this step:
In the same terminal window, start the LGSVL simulator:
/opt/lgsvl/simulator
I'm seeing this error when I run ade$ ./opt/lgsvl/simulator:
The NVIDIA driver was unable to open 'libnvidia-glvkspirv.so.440.59'. This library is required at run time.
In the Player.log, I see:
Desktop is 1920 x 1080 @ 144 Hz
[Vulkan init] extensions: count=15
[Vulkan init] extensions: name=VK_KHR_device_group_creation, enabled=0
[Vulkan init] extensions: name=VK_KHR_display, enabled=1
[Vulkan init] extensions: name=VK_KHR_external_fence_capabilities, enabled=0
[Vulkan init] extensions: name=VK_KHR_external_memory_capabilities, enabled=0
[Vulkan init] extensions: name=VK_KHR_external_semaphore_capabilities, enabled=0
[Vulkan init] extensions: name=VK_KHR_get_physical_device_properties2, enabled=0
[Vulkan init] extensions: name=VK_KHR_get_surface_capabilities2, enabled=0
[Vulkan init] extensions: name=VK_KHR_surface, enabled=1
[Vulkan init] extensions: name=VK_KHR_xcb_surface, enabled=0
[Vulkan init] extensions: name=VK_KHR_xlib_surface, enabled=1
[Vulkan init] extensions: name=VK_EXT_acquire_xlib_display, enabled=0
[Vulkan init] extensions: name=VK_EXT_debug_report, enabled=0
[Vulkan init] extensions: name=VK_EXT_debug_utils, enabled=0
[Vulkan init] extensions: name=VK_EXT_direct_mode_display, enabled=0
[Vulkan init] extensions: name=VK_EXT_display_surface_counter, enabled=0
Vulkan error VK_ERROR_INCOMPATIBLE_DRIVER (-9) file: ./Runtime/GfxDevice/vulkan/VKContext.cpp, line: 333
Vulkan error./Runtime/GfxDevice/vulkan/VKContext.cpp:333
Vulkan detection: 0
No supported renderers found, exiting
(Filename: ./PlatformDependent/LinuxStandalone/main.cpp Line: 639)
I was able to work around the issue by running the lgsvl simulator outside the docker container on the host itself. Since the docker container is launched with --net=host, the lgsvl running on the host can still reach the ros2 bridge on localhost:9090. But I'd still be interested in knowing why the simulator works on the host but not in the container.
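One way to narrow down the host-versus-container difference is to check whether the library named in the error is visible to the dynamic linker in each environment. The sketch below is only a diagnostic aid: it assumes ldconfig is available and that the driver library would be registered in the linker cache, and lib_in_cache is a hypothetical helper name, not part of any NVIDIA or ADE tooling.

```shell
#!/bin/sh
# Library name copied from the error message above.
LIB="libnvidia-glvkspirv.so.440.59"

# lib_in_cache LIBNAME [CACHE_LISTING]
# Succeeds if LIBNAME appears in the dynamic linker cache listing.
# CACHE_LISTING defaults to the output of `ldconfig -p`.
lib_in_cache() {
    listing="${2:-$(ldconfig -p 2>/dev/null)}"
    printf '%s\n' "$listing" | grep -F -q -- "$1"
}

if lib_in_cache "$LIB"; then
    echo "$LIB is visible to the dynamic linker"
else
    echo "$LIB is NOT visible to the dynamic linker"
fi
```

Running it once on the host and once inside ade shows whether the two environments differ. If the library is listed on the host but not in the container, the driver version can match in both places while the Vulkan library is still missing in the container.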
When you run nvidia-smi both inside and outside ade, what driver version do you see in the upper-right corner of the output? Does either of these match the version you see when you run grep "X Driver" /var/log/Xorg.0.log? How did you install your NVIDIA driver and libvulkan? There have been several updates recently, and if you install an update and try to run the simulator without restarting, the NVIDIA driver will throw errors similar to this.

Both inside and outside of ade, I'm seeing Driver Version: 440.59, which matches the version in /var/log/Xorg.0.log. Unfortunately I installed the drivers using a non-standard approach: apt-get install system76-driver-nvidia, based on these instructions. I also tried restarting the machine and I'm still seeing the "The NVIDIA driver was unable to open 'libnvidia-glvkspirv.so.440.59'. This library is required at run time." error. Again, the simulator does work on the host, and I'm using the lgsvlsimulator-linux64-2020.01 version. How do I get the simulator version inside ade to compare with this?

The version that's used in ADE is in the image name in .aderc-lgsvl. Currently, that should be 2020.01, the same that you are using outside the container. The Pop!_OS driver should be fine (I love Pop!_OS, BTW - the guys and girls from System76 are awesome!). We just got 2020.03-rc1 and I'm working on an update.

Based on this issue, it looks like this is a problem with Vulkan, which is not yet supported in containers. I'm checking to see if it can be disabled in LGSVL.
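
To read the simulator version from the image name, it should be enough to print the lines of .aderc-lgsvl that mention lgsvl. This is a rough sketch: the exact layout of the file is not shown in this thread, list_lgsvl_images is a hypothetical helper name, and the ADERC variable is an illustrative way to point at the file, not something ADE itself reads.

```shell
#!/bin/sh
# list_lgsvl_images FILE
# Prints every line of FILE that mentions "lgsvl", prefixed with its line number.
list_lgsvl_images() {
    grep -n -i 'lgsvl' "$1"
}

# Assumption: the script runs from the directory that holds .aderc-lgsvl.
# ADERC is only a convenience variable for this sketch.
ADERC="${ADERC:-.aderc-lgsvl}"
if [ -r "$ADERC" ]; then
    list_lgsvl_images "$ADERC"
else
    echo "cannot read $ADERC; run this from the directory that contains it"
fi
```

The tag at the end of the printed image name can then be compared with the version of the simulator installed on the host.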
Can you try running sudo apt update && sudo apt install libvulkan1 inside ADE and see if this solves the problem?

Nice! I'm actually on Ubuntu 18 since I didn't want to diverge from what most people are using. I tried to install libvulkan1 inside ADE, but it told me it was already installed. If you say Vulkan doesn't work inside containers, why aren't most autoware.auto developers hitting the same issue? One update: after updating ADE to ade-lgsvl 2020.03-native-bridge, now when I start /opt/lgsvl/simulator inside ADE it fails immediately with "aborted", and the unity3d player log shows: Caught fatal signal - signo:11 code:1 errno:0 addr:(nil). Obtained 1 stack frames. #0 (nil) in (Unknown).

Please be aware that the instructions for how to launch the simulator have changed with the "native bridge" version. Check the LGSVL page for details.
Do you have the Nvidia Docker driver installed? See these instructions. I only just learned that ade launches with GPU support enabled by default - something I intend to change with a merge request against ade-cli as soon as I can make time.

Yeah, I have the nvidia-container-toolkit installed and when I run docker run --gpus all nvidia/cuda:10.0-base nvidia-smi it shows the same output I see on the host. It's on Driver Version: 440.59 and CUDA Version: 10.2. Also, I did start the simulator with RMW_IMPLEMENTATION=rmw_cyclonedds_cpp /opt/lgsvl/simulator but it still crashes.
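
A possible follow-up check: nvidia-smi working in a container only shows that the NVIDIA runtime's utility capability is exposed, while Vulkan and OpenGL applications need the graphics capability, which is selected through the NVIDIA_DRIVER_CAPABILITIES environment variable. The sketch below, run inside ade, reports whether that variable requests graphics. Whether ADE sets the variable, and whether this explains the crash here, are open questions; has_graphics_cap is a hypothetical helper name.

```shell
#!/bin/sh
# has_graphics_cap CAPS
# Succeeds if the comma-separated capability list contains "graphics" or "all".
has_graphics_cap() {
    case ",$1," in
        *,graphics,*|*,all,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Read the variable the NVIDIA container runtime uses to select driver libraries.
caps="${NVIDIA_DRIVER_CAPABILITIES:-}"
echo "NVIDIA_DRIVER_CAPABILITIES=${caps:-<unset>}"

if has_graphics_cap "$caps"; then
    echo "graphics capability requested"
else
    echo "graphics capability not requested"
fi
```

If the variable is unset or lists only compute and utility, the container may not receive the driver's Vulkan libraries even though nvidia-smi reports the correct driver version.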