Change the C++ compiler in ROS to use with OpenACC and CUDA

asked 2018-11-14 12:28:14 -0500

billyDong

I'm trying to compile a ROS package with the PGI compiler, which supports OpenACC, because I want to parallelize some code.

This works with standard C++ code using the pgcc / pgc++ compilers, so I tried to compile a simple ROS package with them. Here is the source code:

#include <ros/ros.h>
#include <iostream>
#include "std_msgs/String.h"
#include <sstream>


int main(int argc, char **argv)
{
  ros::init(argc, argv, "pgi_test_node");

  ros::NodeHandle n;

 ros::Publisher chatter_pub = n.advertise<std_msgs::String>("chatter", 1000);

 ros::Rate loop_rate(10);

 int count = 0;

 while(ros::ok()) {

     std_msgs::String msg;

     // Build the message text from the loop counter
     std::stringstream ss;
     ss << "hello world " << count;
     msg.data = ss.str();


   ROS_INFO("%s", msg.data.c_str());

   chatter_pub.publish(msg);

   ros::spinOnce();

   loop_rate.sleep();
   ++count;
 }

return 0;
}

And here is my CMakeLists.txt. I've tried a lot of things, but the main changes are the compiler and its flags at the beginning.

cmake_minimum_required(VERSION 2.8.3)
project(pgi_test)

## Compile as C++11, supported in ROS Kinetic and newer
add_compile_options(-std=c++11)
SET(CMAKE_C_COMPILER /opt/pgi/linux86-64/18.4/bin/pgcc) 
SET(CMAKE_CXX_COMPILER /opt/pgi/linux86-64/18.4/bin/pgc++) 



# flags
add_definitions("-DENABLE_SSE")
SET(CMAKE_CXX_FLAGS
   "${SSE_FLAGS}  -O3 -std=c++11 -ta=tesla:cuda9.1 -acc -Minfo=accel"
)



## Find catkin macros and libraries
## if COMPONENTS list like find_package(catkin REQUIRED COMPONENTS xyz)
## is used, also find other catkin packages
find_package(catkin REQUIRED
roscpp
rospy
std_msgs
genmsg
)

## System dependencies are found with CMake's conventions
#find_package(Boost REQUIRED COMPONENTS system)

find_package(Boost REQUIRED COMPONENTS system thread filesystem)
INCLUDE_DIRECTORIES( ${Boost_INCLUDE_DIR} )

find_package( CUDA REQUIRED )
include_directories(
  ${catkin_INCLUDE_DIRS}
  ${CUDA_INCLUDE_DIRS}
)

SET(BOOST_HAS_FLOAT128)

## Uncomment this if the package has a setup.py. This macro ensures
## modules and global scripts declared therein get installed
## See http://ros.org/doc/api/catkin/html/user_guide/setup_dot_py.html
# catkin_python_setup()

################################################
## Declare ROS messages, services and actions ##
################################################

## To declare and build messages, services or actions from within this
## package, follow these steps:
## * Let MSG_DEP_SET be the set of packages whose message types you use in
##   your messages/services/actions (e.g. std_msgs, actionlib_msgs, ...).
## * In the file package.xml:
##   * add a build_depend tag for "message_generation"
##   * add a build_depend and a exec_depend tag for each package in MSG_DEP_SET
##   * If MSG_DEP_SET isn't empty the following dependency has been pulled in
##     but can be declared for certainty nonetheless:
##     * add a exec_depend tag for "message_runtime"
## * In this file (CMakeLists.txt):
##   * add "message_generation" and every package in MSG_DEP_SET to
##     find_package(catkin REQUIRED COMPONENTS ...)
##   * add "message_runtime" and every package in MSG_DEP_SET to
##     catkin_package(CATKIN_DEPENDS ...)
##   * uncomment the add_*_files sections below as needed
##     and list every .msg/.srv/.action file to be processed
##   * uncomment the generate_messages entry below
##   * add every package in MSG_DEP_SET to generate_messages(DEPENDENCIES ...)

## Generate messages in the 'msg' folder
# add_message_files(
#   FILES
#   Message1.msg
#   Message2.msg
# )

## Generate services in the 'srv' folder
# add_service_files(
#   FILES
#   Service1.srv
#   Service2.srv
# )

## Generate actions in the 'action' folder
# add_action_files(
#   FILES
#   Action1.action
#   Action2.action
# )

## Generate added messages and services with any dependencies listed here
# generate_messages(
#   DEPENDENCIES
#   std_msgs  # Or other packages containing msgs
# )

################################################
## Declare ROS dynamic reconfigure parameters ##
################################################

## To declare and ...

2 Answers


answered 2018-11-15 05:14:55 -0500

billyDong

updated 2018-11-15 06:32:07 -0500

I managed to compile and run it successfully, but I discourage this approach, so I will not close the question until I find a better way to solve the problem. A normal system update will undo these changes, and I'm not sure whether they will cause problems later when I need to use the Boost library. Anyway, here's my approach:

As I said, the first step is to add this define at the very beginning of your code, before the ROS include:

    #define __CUDACC__
    #include <ros/ros.h>

With this define in place, a normal catkin_make still gives two errors. For the first one, open:

    sudo nano /usr/include/boost/type_traits/is_floating_point.hpp

Replace this line:

    #if defined(BOOST_HAS_FLOAT128)

with:

    #if defined(BOOST_HAS_FLOAT128) && !defined(__PGI)

For the remaining errors, open:

    sudo nano /usr/include/boost/core/swap.hpp

Comment out every line containing:

    BOOST_GPU_ENABLED

There should be 3 lines.

With these changes the code compiled and ran. I added a pi calculation example to compare the speed on the CPU and the GPU, in case someone wants to test it:

#define __CUDACC__
#include <ros/ros.h>
#include <cstdio>
#include <iostream>
#include "std_msgs/String.h"
#include <sstream>

#define N 2000000000
#define vl 1024

int main(int argc, char **argv)
{

  ros::init(argc, argv, "pgi_test_node");

  ros::NodeHandle n;

  ros::Publisher chatter_pub = n.advertise<std_msgs::String>("chatter", 1000);

  ros::Rate loop_rate(10);

  int count = 0;
  while (ros::ok())
  {

    std_msgs::String msg;

    std::stringstream ss;
    ss << "hello world " << count;
    msg.data = ss.str();

    ROS_INFO("%s", msg.data.c_str());

    // Midpoint-rule estimate of pi; the reduction sums the partial results
    double pi = 0.0;
    long long i;

    #pragma acc parallel vector_length(vl)
    #pragma acc loop reduction(+:pi)
    for (i=0; i<N; i++) {
      double t= (double)((i+0.5)/N);
      pi +=4.0/(1.0+t*t);
    }

    printf("pi=%11.10f\n", pi/N);

    chatter_pub.publish(msg);

    ros::spinOnce();

    loop_rate.sleep();
    ++count;
  }


  return 0;
}

Timers are not even necessary: if you just comment out the pragmas, the loop runs on the CPU and you can clearly see the difference.
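If you do want numbers rather than a visual comparison, the same loop can be wrapped with std::chrono. This is only a minimal sketch: the names TimedPi and timed_pi are made up for illustration, and it uses the combined "parallel loop" directive instead of the two separate pragmas above. A compiler without OpenACC support should simply ignore the pragma and run the loop serially, so the same function can be built both ways.

```cpp
#include <chrono>

// Result of one timed run: the pi estimate and the wall-clock seconds it took.
// (Hypothetical helper type, not part of the original node.)
struct TimedPi {
  double pi;
  double seconds;
};

// Midpoint-rule estimate of pi = integral of 4/(1+t^2) over [0,1], using n slices.
// With an OpenACC compiler (e.g. pgc++ -acc) the loop is offloaded; other
// compilers ignore the unknown pragma and run it on the CPU.
TimedPi timed_pi(long long n) {
  const auto start = std::chrono::steady_clock::now();

  double sum = 0.0;
  #pragma acc parallel loop reduction(+:sum)
  for (long long i = 0; i < n; ++i) {
    const double t = (i + 0.5) / n;
    sum += 4.0 / (1.0 + t * t);
  }

  const auto stop = std::chrono::steady_clock::now();
  const std::chrono::duration<double> elapsed = stop - start;
  return {sum / n, elapsed.count()};
}
```

Inside the node loop you could call timed_pi(N) and print both fields with ROS_INFO. Keep in mind that the first accelerated call usually also pays for device initialization, so it is probably fairer to compare a later iteration.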


answered 2018-11-14 12:52:14 -0500

john.j.oneill

This doesn't directly answer your question, but in the past I have made my CUDA-specific code a custom external C++ library, which you then link to from catkin. I don't know how different pgcc is from gcc, so you will possibly run into ABI issues, but according to a quick search it seems like it might work okay.
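A rough sketch of what that split could look like, assuming the accelerated part can be reduced to a plain function. The file names and pi_kernel_estimate are hypothetical, and both pieces are shown in one listing only for brevity. Giving the function C linkage keeps C++ name mangling and standard-library types out of the boundary, which is where compiler ABI differences tend to show up.

```cpp
// --- pi_kernel.h: the only header the ROS node would need to include --------
// Hypothetical interface with C linkage and only built-in types.
extern "C" double pi_kernel_estimate(long long n);

// --- pi_kernel.cpp: would be built separately, e.g. with pgc++ -acc ---------
// No ROS or Boost headers are included here, so the accelerator compiler
// never has to parse them.
extern "C" double pi_kernel_estimate(long long n) {
  double sum = 0.0;
  #pragma acc parallel loop reduction(+:sum)
  for (long long i = 0; i < n; ++i) {
    const double t = (i + 0.5) / n;   // midpoint of slice i in [0,1]
    sum += 4.0 / (1.0 + t * t);
  }
  return sum / n;
}
```

The node itself would then be compiled by the default compiler and just call pi_kernel_estimate(N). Linking it will presumably also need the accelerator compiler's runtime libraries, which depends on the installation.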
