Profiling an application
This readme is a step-by-step guide to using VTune to profile a ROS node.
General Overview
In order to profile your application, you will need to do the following:
- Edit the application CMakeLists.txt to compile it correctly for profiling
- Edit the application to add any user instrumentation
- Edit the application launch file to launch it under the profiler with the desired profile type
- Edit the launch point of the application to lengthen the timeouts
- Configure the PC to allow user profiling
- Run the node
- View the profiler results with the VTune GUI
We will go through these step by step here.
CMakeLists.txt
A few changes to how your application binary is built allow it to be profiled.
# README: This is an example CMakeLists.txt for adding a target that you intend to profile. It
# mixes the regular catkin setup for ROS with the extra settings needed for the
# actual profiling. Each part is explained below.
# Regular ROS setup, as required for all ROS nodes.
cmake_minimum_required(VERSION 3.13) # target_link_directories (used below) requires CMake 3.13 or newer.
project(profiler_example)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED YES)
find_package(catkin REQUIRED COMPONENTS
roscpp
)
set(THREADS_PREFER_PTHREAD_FLAG ON) # Must be set before find_package(Threads) to take effect. We prefer pthreads on
# our system because it is Linux, and we want to ensure that we use pthreads, which
# are profilable. Very high chance we are using them anyway.
find_package(Threads REQUIRED) # Threads may or may not be required. For the profiler, the target that you profile
# will need to be linked against threads if it uses threads. For basically any ROS node
# this will be the case because of subscription queues etc.
catkin_package()
add_executable(test_profiler # We add our executable as per usual, with all the translation units included.
thread_profile_example.cpp
)
# The following 2 lines are very important if you wish to use the user instrumentation API detailed here:
# https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top/api-support/instrumentation-and-tracing-technology-apis/instrumentation-and-tracing-technology-api-reference.html
# They let you include Intel's headers and then link against Intel's libs. For very basic example usage, see thread_profile_example.cpp.
target_include_directories(test_profiler PRIVATE /opt/intel/oneapi/vtune/latest/sdk/include)
target_link_directories(test_profiler PRIVATE /opt/intel/oneapi/vtune/latest/sdk/lib64)
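# Optional sketch, not part of the original example: warn at configure time if the ITT header is not where the
# two paths above assume it is. This only checks for ittnotify.h in the SDK include directory used above.
if(NOT EXISTS /opt/intel/oneapi/vtune/latest/sdk/include/ittnotify.h)
  message(WARNING "ittnotify.h not found under /opt/intel/oneapi/vtune/latest/sdk/include. Is VTune installed there?")
endif()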
# Each of the compiler options and libs we are linking against are quite important, so they will be detailed one at a time.
target_compile_options(test_profiler PRIVATE
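-g # The most important: this compiles your application with debug symbols, which the profiler uses to map samples back to source.
-fno-omit-frame-pointer # This keeps the frame pointer in every function, which makes it easier for VTune to reconstruct call stacks.
-D_LINUX # This defines the _LINUX preprocessor macro. It is suggested by VTune; I don't know what it is used for.
-fno-asm # In GCC this disables the asm, inline and typeof keywords (only typeof in C++). I think this stops you from ending up with no code trace.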
)
target_link_libraries(test_profiler PRIVATE
ittnotify # This is important to link against if you are using the user API, see above include directories
dl # This allows the dynamic linking of the user API.
pthread # This or the below must be linked if you are using ...
Would you happen to know how to do this with CMake? If so: Catkin is a set of macros and functions for CMake. You should be able to just use the CMake workflow for VTune.
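As a rough illustration of that plain-CMake workflow, here is a minimal sketch that bundles the VTune settings quoted in the question into one INTERFACE target, so any node can opt in with a single target_link_libraries call. The names vtune_profiling and VTUNE_SDK_DIR are hypothetical. The SDK path, compile flags and libraries are taken from the question, and linking ${catkin_LIBRARIES} is an assumption about what the node needs. The library list in the question is cut off, so this one may be incomplete.

```cmake
# Sketch only: vtune_profiling and VTUNE_SDK_DIR are made-up names for illustration.
cmake_minimum_required(VERSION 3.13)  # target_link_directories needs CMake 3.13+
project(profiler_example)

find_package(catkin REQUIRED COMPONENTS roscpp)
set(THREADS_PREFER_PTHREAD_FLAG ON)   # has to be set before find_package(Threads)
find_package(Threads REQUIRED)
catkin_package()

# Path taken from the question; adjust it if VTune is installed elsewhere.
set(VTUNE_SDK_DIR /opt/intel/oneapi/vtune/latest/sdk CACHE PATH "VTune SDK location")

# An INTERFACE library builds nothing itself; it only carries usage requirements.
add_library(vtune_profiling INTERFACE)
target_include_directories(vtune_profiling INTERFACE ${VTUNE_SDK_DIR}/include)
target_link_directories(vtune_profiling INTERFACE ${VTUNE_SDK_DIR}/lib64)
target_compile_options(vtune_profiling INTERFACE -g -fno-omit-frame-pointer)
target_link_libraries(vtune_profiling INTERFACE ittnotify dl Threads::Threads)

# The node itself: ordinary catkin usage plus the profiling bundle.
add_executable(test_profiler thread_profile_example.cpp)
target_include_directories(test_profiler PRIVATE ${catkin_INCLUDE_DIRS})
target_link_libraries(test_profiler PRIVATE ${catkin_LIBRARIES} vtune_profiling)
```

Keeping the profiling settings in their own target means the catkin part of the file stays as it would be for any other node, and dropping vtune_profiling from the last line removes the profiling flags again (the source must then not include the ITT headers).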