Why does action-ros-ci fail with missing rmw when running on RHEL?
So I'm trying to add an RHEL CI pipeline to ros-controls/ros2_controllers. I made a ROS AlmaLinux container in which I installed ROS and a few rosdep dependencies. I then created a working CI pipeline (job titled: rhel) that runs rosdep and colcon to build and test. That can be seen here:
rhel:
  name: RHEL
  runs-on: ubuntu-latest
  container: jaronl/ros:galactic-alma
  steps:
    - uses: actions/checkout@v2
      with:
        path: src/ros2_controllers
    - run: |
        rosdep init
        rosdep update
        rosdep install -iy --from-path src/ros2_controllers
        source /opt/ros/$ROS_DISTRO/setup.bash
        colcon build
        colcon test
I wanted to take it a step further and see if I could get it running with action-ros-ci, in case the ros-controls group wants to customize it. To do that, I forked action-ros-ci and modified the action so it can use dnf as an alternative to apt-get; however, when it gets to colcon build, I get an error about not having a ROS middleware installed. The CI file for that run can be seen here:
rhel:
  name: RHEL
  runs-on: ubuntu-latest
  container: jaronl/ros:galactic-alma
  steps:
    - uses: jaron-l/action-ros-ci@test_RHEL
      with:
        target-ros2-distro: ${{ env.ROS_DISTRO }}
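To see what the container provides before action-ros-ci runs, one thing I could do is add a debug step ahead of the action (a sketch of my own, not part of action-ros-ci itself) that records the installed rmw packages and the environment the action will inherit:

```yaml
rhel:
  name: RHEL
  runs-on: ubuntu-latest
  container: jaronl/ros:galactic-alma
  steps:
    # Extra debug step (my addition): list any installed rmw packages and
    # dump the environment before the action runs.
    - run: |
        dnf list installed | grep -i rmw || true
        env | sort
    - uses: jaron-l/action-ros-ci@test_RHEL
      with:
        target-ros2-distro: ${{ env.ROS_DISTRO }}
```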
The failed run job can be seen publicly, but I copy the relevant output below.
What I don't get is what the difference is between the working CI job (manually calling colcon build) and the failing one (using action-ros-ci). They both check out ros2_controllers, update the package manager cache, use rosdep to install dependencies, and then run colcon build and test. If someone could point out the difference between the two CI pipelines, that would be great.
Variations I've tried:
- I verified that the same packages are installed with rosdep; they are obviously using the same container.
- Using the exact same colcon build arguments in the working pipeline does not reproduce the error.
- Using the exact same rosdep install arguments does not reproduce the error either.
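Another way to pin down the difference would be to dump the environment right before colcon build in each job, save the dumps as artifacts, and diff the ROS-related variables. This is a debugging sketch of my own (the file names are made up, not from either pipeline):

```shell
# In the working (script) job, just before `colcon build`:
env | sort > env-script.txt

# In the action-ros-ci job (e.g. via a step added before the action):
env | sort > env-action.txt

# Locally, after downloading both artifacts, compare the ROS-related
# variables. Here both dumps come from the same shell, so the diff is
# empty; in CI any difference in ROS/AMENT/RMW variables would show up.
diff env-script.txt env-action.txt | grep -E 'ROS|AMENT|RMW' || true
```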
As additional context, ros-controls/ros2_controllers has a build error on RHEL, but the error I'm looking for is in joint_state_trajectory_controller. An example of that is here.
Failed action-ros-ci output:
/usr/bin/bash -c source /opt/ros/galactic/setup.sh && colcon build --event-handlers console_cohesion+ --symlink-install
Starting >>> forward_command_controller
Starting >>> diff_drive_controller
--- output: forward_command_controller
-- The C compiler identification is GNU 8.5.0
-- The CXX compiler identification is GNU 8.5.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found ament_cmake: 1.1.4 (/opt/ros/galactic/share/ament_cmake/cmake)
-- Found PythonInterp: /usr/bin/python3 (found suitable version "3.6.8", minimum required is "3")
-- Using PYTHON_EXECUTABLE: /usr/bin/python3
-- Override CMake install command with custom implementation using symlinks instead of copying resources
-- Found controller_interface: 1.1.0 (/opt/ros/galactic/share/controller_interface/cmake)
-- Found ...
The error makes it seem like none of the RMW packages are installed. I can't seem to reproduce this locally, either. Building your Dockerfile locally, I can see that rmw_cyclonedds_cpp is getting installed, which is what I expected. Could you rebuild your image and give this another shot?
I just rebuilt my image and re-ran it, as you can see here: https://github.com/jaron-l/ros2_contr...; however, it fails in the same way. I think this is to be expected, because if it were a package issue, the script method would fail as well. I'm pretty sure this is an issue with how action-ros-ci runs its CLI commands, but I just can't find the problem there.