
Change the C++ compiler in ROS to use with OpenACC and CUDA

asked 2018-11-14 12:28:14 -0600

billyDong

I'm trying to compile a ROS package with the PGI compiler, which supports OpenACC, because I want to parallelize some code.

This works with standard C++ code using the pgcc / pgc++ compilers, so I tried to compile a simple ROS package with them. Here is the source code:

#include <ros/ros.h>
#include <iostream>
#include "std_msgs/String.h"
#include <sstream>


int main(int argc, char **argv)
{
  ros::init(argc, argv, "pgi_test_node");

  ros::NodeHandle n;

  ros::Publisher chatter_pub = n.advertise<std_msgs::String>("chatter", 1000);

  ros::Rate loop_rate(10);

  int count = 0;

  while (ros::ok()) {

    std_msgs::String msg;

    // Build the message text from the loop counter
    std::stringstream ss;
    ss << "hello world " << count;
    msg.data = ss.str();

    ROS_INFO("%s", msg.data.c_str());

    chatter_pub.publish(msg);

    ros::spinOnce();

    loop_rate.sleep();
    ++count;
  }

  return 0;
}

And here is my CMakeLists.txt. I've tried a lot of things, but the main change is probably the compiler and its flags at the beginning.

cmake_minimum_required(VERSION 2.8.3)
project(pgi_test)

## Compile as C++11, supported in ROS Kinetic and newer
add_compile_options(-std=c++11)
SET(CMAKE_C_COMPILER /opt/pgi/linux86-64/18.4/bin/pgcc) 
SET(CMAKE_CXX_COMPILER /opt/pgi/linux86-64/18.4/bin/pgc++) 



# flags
add_definitions("-DENABLE_SSE")
SET(CMAKE_CXX_FLAGS
   "${SSE_FLAGS}  -O3 -std=c++11 -ta=tesla:cuda9.1 -acc -Minfo=accel"
)



## Find catkin macros and libraries
## if COMPONENTS list like find_package(catkin REQUIRED COMPONENTS xyz)
## is used, also find other catkin packages
find_package(catkin REQUIRED
roscpp
rospy
std_msgs
genmsg
)

## System dependencies are found with CMake's conventions
#find_package(Boost REQUIRED COMPONENTS system)

find_package(Boost REQUIRED COMPONENTS system thread filesystem)
INCLUDE_DIRECTORIES( ${Boost_INCLUDE_DIR} )

find_package( CUDA REQUIRED )
include_directories(
  ${catkin_INCLUDE_DIRS}
  ${CUDA_INCLUDE_DIRS}
)

SET(BOOST_HAS_FLOAT128)

## Uncomment this if the package has a setup.py. This macro ensures
## modules and global scripts declared therein get installed
## See http://ros.org/doc/api/catkin/html/user_guide/setup_dot_py.html
# catkin_python_setup()

################################################
## Declare ROS messages, services and actions ##
################################################

## To declare and build messages, services or actions from within this
## package, follow these steps:
## * Let MSG_DEP_SET be the set of packages whose message types you use in
##   your messages/services/actions (e.g. std_msgs, actionlib_msgs, ...).
## * In the file package.xml:
##   * add a build_depend tag for "message_generation"
##   * add a build_depend and a exec_depend tag for each package in MSG_DEP_SET
##   * If MSG_DEP_SET isn't empty the following dependency has been pulled in
##     but can be declared for certainty nonetheless:
##     * add a exec_depend tag for "message_runtime"
## * In this file (CMakeLists.txt):
##   * add "message_generation" and every package in MSG_DEP_SET to
##     find_package(catkin REQUIRED COMPONENTS ...)
##   * add "message_runtime" and every package in MSG_DEP_SET to
##     catkin_package(CATKIN_DEPENDS ...)
##   * uncomment the add_*_files sections below as needed
##     and list every .msg/.srv/.action file to be processed
##   * uncomment the generate_messages entry below
##   * add every package in MSG_DEP_SET to generate_messages(DEPENDENCIES ...)

## Generate messages in the 'msg' folder
# add_message_files(
#   FILES
#   Message1.msg
#   Message2.msg
# )

## Generate services in the 'srv' folder
# add_service_files(
#   FILES
#   Service1.srv
#   Service2.srv
# )

## Generate actions in the 'action' folder
# add_action_files(
#   FILES
#   Action1.action
#   Action2.action
# )

## Generate added messages and services with any dependencies listed here
# generate_messages(
#   DEPENDENCIES
#   std_msgs  # Or other packages containing msgs
# )

################################################
## Declare ROS dynamic reconfigure parameters ##
################################################

## To declare and ...

2 Answers


answered 2018-11-14 12:52:14 -0600

john.j.oneill

This doesn't directly answer your question, but in the past I have put my CUDA-specific code in a separate external C++ library and then linked to it from catkin. I don't know how different pgcc is from gcc, so you may run into ABI issues, but a quick search suggests it might work okay.
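To illustrate the external-library idea, here is a minimal sketch of what the accelerated side could look like. The file name and the function `accel_saxpy` are hypothetical, not something from the question. The assumption is that this file is built on its own with pgc++, while the ROS node is built with the normal compiler and only sees the C-style declaration.

```cpp
// accel_lib.cpp: hypothetical library source, built separately from catkin.
#include <cstddef>

// C linkage and plain pointer arguments keep C++ name mangling and
// standard-library types out of the boundary between the two compilers.
extern "C" void accel_saxpy(float a, const float* x, float* y, std::size_t n)
{
    // OpenACC directive: a compiler without OpenACC enabled ignores it
    // and runs the loop serially on the CPU.
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The node would then declare `extern "C" void accel_saxpy(float, const float*, float*, std::size_t);` and link against the library. Keeping the interface this narrow is one way to limit the ABI issues mentioned above, though whether it is enough for a given PGI and gcc pairing would need to be tested.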


answered 2018-11-15 05:14:55 -0600

billyDong

updated 2018-11-15 06:32:07 -0600

I managed to compile and run it successfully, but I discourage this approach, so I will not close the question until I find a better way to solve the problem. A normal system update will undo these changes, and I'm not sure whether they will have consequences later when I need to use the Boost library. Anyway, here's my approach:

As I said, first you need to add this line at the beginning of your code:

    #define __CUDACC__
    #include <ros/ros.h>

A normal catkin_make would give two errors with this approach. For the first one:

    sudo nano /usr/include/boost/type_traits/is_floating_point.hpp

Replace this line:

    #if defined(BOOST_HAS_FLOAT128)

with:

    #if defined(BOOST_HAS_FLOAT128) && !defined(__PGI)

For the following errors:

    sudo nano /usr/include/boost/core/swap.hpp

Comment out every line containing:

    BOOST_GPU_ENABLED

There should be 3 lines.
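The `!defined(__PGI)` guard above relies on a macro that the PGI compiler predefines. As a small sketch of the same preprocessor test used in your own code rather than in a system header, the hypothetical helper below reports which compiler built the file. `__clang__` and `__GNUC__` are the usual Clang and GCC macros; `__PGI` is checked first as a precaution, in case the compiler also defines GNU compatibility macros.

```cpp
#include <string>

// Hypothetical helper: names the compiler based on its predefined macros.
std::string compiler_name()
{
#if defined(__PGI)
    return "PGI";
#elif defined(__clang__)
    return "Clang";
#elif defined(__GNUC__)
    return "GCC";
#else
    return "unknown";
#endif
}
```

Printing `compiler_name()` once at node startup is a quick way to confirm that catkin really picked up pgc++ and not the default compiler.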

With these changes the code compiled and ran. I added a pi calculation example to compare the speed on CPU and GPU, in case someone wants to test it:

#define __CUDACC__
#include <ros/ros.h>
#include <cstdio>
#include <iostream>
#include "std_msgs/String.h"
#include <sstream>

#define N 2000000000
#define vl 1024

int main(int argc, char **argv)
{

  ros::init(argc, argv, "pgi_test_node");

  ros::NodeHandle n;

  ros::Publisher chatter_pub = n.advertise<std_msgs::String>("chatter", 1000);

  ros::Rate loop_rate(10);

  int count = 0;
  while (ros::ok())
  {

    std_msgs::String msg;

    std::stringstream ss;
    ss << "hello world " << count;
    msg.data = ss.str();

    ROS_INFO("%s", msg.data.c_str());

    double pi = 0.0f;
    long long i;

    #pragma acc parallel vector_length(vl)
    #pragma acc loop reduction(+:pi)
    for (i=0; i<N; i++) {
      double t= (double)((i+0.5)/N);
      pi +=4.0/(1.0+t*t);
    }

    printf("pi=%11.10f\n", pi/N);

    chatter_pub.publish(msg);

    ros::spinOnce();

    loop_rate.sleep();
    ++count;
  }


  return 0;
}

Timers are not even necessary: if you comment out the pragmas, the loop runs on the CPU and you can clearly see the difference.
