Voice commands / speech to and from robot? [closed]

asked 2011-02-21 14:48:22 -0600

evanmj

I have used the sound_play package with festival to synthesize voices to make my robot "talk", but I would also like to be able to command the robot by voice.

Basically, I feel like someone has used a tool like CMU Sphinx with ROS, but I am unable to find any examples.

Closed for the following reason: the question is answered, right answer was accepted (by SL Remy)
close date 2017-07-21 00:33:44.806765

10 Answers

answered 2011-02-21 14:55:31 -0600

Eric Perko

updated 2011-04-06 06:20:37 -0600

It's quite experimental and definitely not documented, but we have been using PocketSphinx to do speech recognition with ROS. See the cwru_voice package for source.

If you run the voice.launch file (after changing some of the hardcoded model paths appropriately in whichever node it launches), you should be able to get certain keywords out on the "chatter" topic. As an example, voice.launch should recognize a command to "Open the Door" or "Go to the hallway" and output a keyword on the chatter topic. If you do try it out and have problems, let me know as you would be the first outside our lab to try it that I know of.
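
As a rough illustration of the phrase-to-keyword step described above (not the actual cwru_voice code), the sketch below maps a recognized utterance to a single keyword that a node could then publish on the "chatter" topic. The phrase table, the keyword names, and the function `utterance_to_keyword` are all hypothetical.

```python
# Hypothetical phrase table; the real cwru_voice vocabulary is not shown here.
PHRASE_KEYWORDS = {
    "open the door": "door",
    "go to the hallway": "hallway",
}


def utterance_to_keyword(utterance):
    """Return the keyword for a recognized utterance, or None if unknown."""
    # Normalize case and whitespace, since recognizers often emit upper case.
    normalized = " ".join(utterance.lower().split())
    for phrase, keyword in PHRASE_KEYWORDS.items():
        if phrase in normalized:
            return keyword
    return None


if __name__ == "__main__":
    print(utterance_to_keyword("OPEN THE DOOR"))
    print(utterance_to_keyword("what time is it"))
```

In a ROS node, the returned keyword would be wrapped in a string message and published; that wiring is left out here so the sketch runs without ROS installed.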

Stanford also has a speech package in their repository. EDIT: Thanks to @fergs for finding the Stanford package.

UPDATE: Make sure to take a look at Scott's answer below for a nice tutorial and demo code for getting speech recognition up and running for your own uses.


Thanks. I'll give it a go when I get a chance. I found another implementation of some sort here:
evanmj (2011-02-21 15:13:16 -0600)
I hadn't seen that package before. Thanks for the info.
Eric Perko (2011-02-21 15:18:45 -0600)
Here's the sail-ros-pkg version you were probably thinking about:
fergs (2011-02-22 03:38:31 -0600)

answered 2011-04-05 17:20:55 -0600

Scott

It took me a bit longer to get this done and documented than I thought it would, but as promised, here is the tutorial and files for how to do basic robot speech recognition based on pocketsphinx. It also includes a handy wav file player based on sfml and derived from Garratt Gallagher's kinect piano playing code. The wav/audio player can also be used separately from the speech recognition code as a convenient and dependable way to play audio files. Instead of uploading the tutorial here, I have created a google.code page. You will find the code samples under the downloads page and the tutorial details on the wiki page. The link to the tutorial home is at...

If anyone creates some great new code derived from this and extends or improves this work, please create a google.code site of your own (very easy) and post the link to your work and code samples here so others can benefit from it as well. Best Regards, -Scott


Cool! One thing I see is that you might want to include a link to the actual cwru_voice package. The demo script still requires it (the roslib.load_manifest("cwru_voice") line), so you should either create your own package to depend on (and keep the script in) or add a note about getting cwru_voice.
Eric Perko (2011-04-05 17:30:12 -0600)
I'm wondering if you tried using to play your sound files and found something you didn't like that made you go with the wave player you wrote?
Eric Perko (2011-04-05 17:30:19 -0600)
Hi, fergs wrote a program without the GUI that I found really nice when coupled with my script, which is really only the if/else statement that publishes to speak and chatter. I found I could let it run continuously; when it has a match, the robot will speak.
Peter Heim (2011-04-05 22:48:19 -0600)
Here is the link to Fergs' program ( I will post a link to my code later
Peter Heim (2011-04-05 22:51:43 -0600)
Thanks for the reply, Eric. Yes, I tried sound_play and I could get it to do the "computer speech" which is the code, but when I tried to use the code which is the section that should play a wav, it would just do a quick click sound and not play right. I worked on that for about two days researching why it was doing that, found a few suggestions that did not work, and then dug into Garratt's code for his piano demo to see how he did it and stumbled across the
Scott (2011-04-05 23:48:26 -0600)
sfml code and found that to be more dependable for me.
Scott (2011-04-05 23:49:42 -0600)
Reminder: For those that like the tutorial I created, please don't forget to vote for it! If people find this useful I will post tutorials on other topics as well. (Thanks, Scott)
Scott (2011-04-06 04:59:24 -0600)

answered 2011-04-06 12:20:36 -0600

fergs

We released a pocketsphinx package at Albany several weeks ago:

This is basically the same gstreamer demo, but we've removed the GUI (it now just does continuous dictation, although we've also added ROS services to start/stop the recognizer), added parameters for setting the language model and dictionary, and added rosdep configurations so that you can install pocketsphinx itself using the rosdep tool. Parameter names and topics are listed on the ROS wiki page.


I am new to ROS and have been using the pocketsphinx package that you mentioned above. I have managed to install it and also to get a sample lm and dic file through the Sphinx Knowledge Base Tool. However, I am unable to set the path to them. Please help.

dexter05 (2016-08-25 22:40:35 -0600)

answered 2017-03-02 10:58:15 -0600

gorinars

We recently provided a very simple example of using pocketsphinx-python to control turtlebot.

Compared to similar projects that I've found (if I missed some better ones, please let me know) so far the advantages are:

  • removed GStreamer dependency
  • support for the latest CMU Sphinx decoder (pocketsphinx-5prealpha) with its latest features and models for several languages
  • key phrase spotting mode that allows continuous listening and filtering out-of-vocabulary words and noises
  • simple code (one python script) and tutorial for the beginners
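
To make the key phrase spotting item more concrete, here is a minimal sketch of generating a keyword list for pocketsphinx. It assumes the keyword list format of one phrase per line followed by a detection threshold between slashes; the phrases, the threshold values, and the helper `format_keyword_list` are hypothetical and are not taken from the example project.

```python
def format_keyword_list(thresholds):
    """Build keyword list text: one 'phrase /threshold/' entry per line."""
    lines = []
    for phrase, threshold in thresholds.items():
        # Lower-case the phrase so it matches a lower-case dictionary;
        # this is an assumption about the dictionary in use.
        lines.append("{} /{:g}/".format(phrase.lower(), threshold))
    return "\n".join(lines) + "\n"


# Hypothetical phrases and thresholds; real values need tuning by experiment.
KEYPHRASES = {
    "move forward": 1e-20,
    "stop": 1e-10,
}

if __name__ == "__main__":
    print(format_keyword_list(KEYPHRASES), end="")
```

Lower thresholds generally make a phrase easier to trigger at the cost of more false alarms, so each phrase usually needs its own value.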

I'd welcome any suggestions on how to integrate this properly into the ROS community.


I tried this on my turtlebot and it worked for me! Thank you! The recognition is pretty poor, but I'm sure it can be improved.

danie11am (2017-04-28 19:16:55 -0600)

answered 2016-06-03 01:32:31 -0600

Kei Okada

For speech recognition, you can also use the following ROS nodes:

web based ->


android based ->

answered 2011-02-27 04:37:57 -0600

Scott

Excellent. Thanks for the information. I will check out the link and post back with tips for others, or with any problems I run into, once I get a chance to dig in and try it. Much appreciated. Best Regards, -Scott

answered 2011-02-22 19:30:34 -0600

KoenBuys

TUM also has a speech recognition package.

answered 2011-02-25 07:12:50 -0600

Peter Heim

Here is a link to the online tool that will generate a dic file:


answered 2011-02-24 08:27:14 -0600

Scott

Thanks for the post. Very cool. I was able to get this to work as well. I tried to update the dictionary with a few additional words. I want to be able to say TURN LEFT, TURN RIGHT, STOP, BACK UP, and some other basic directional commands. Any suggestions for how to update the code to allow for that would be appreciated.
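
One possible way to handle such directional commands once the recognizer outputs them is a simple lookup table, sketched below. This is only an illustration under assumptions: the table `COMMAND_VELOCITIES`, the function `command_to_velocity`, and the speed values are hypothetical, and the sign convention (positive angular velocity turns left) follows the usual ROS convention.

```python
# Hypothetical command table: (linear m/s, angular rad/s).
# The speeds are placeholders and would need tuning for a real robot.
COMMAND_VELOCITIES = {
    "turn left": (0.0, 0.5),
    "turn right": (0.0, -0.5),
    "stop": (0.0, 0.0),
    "back up": (-0.2, 0.0),
}


def command_to_velocity(text):
    """Return (linear, angular) for a recognized command, or None."""
    # Recognizers often emit upper-case text with irregular spacing.
    normalized = " ".join(text.lower().split())
    return COMMAND_VELOCITIES.get(normalized)


if __name__ == "__main__":
    for spoken in ("TURN LEFT", "BACK UP", "DANCE"):
        print(spoken, "->", command_to_velocity(spoken))
```

Unrecognized phrases return None, so the caller can ignore them or stop the robot as a safe default.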

Here is a cut and paste of what I added to the 1495.dic file.





Thanks, -Scott


I'll have to check with my colleague on some of this. I believe there is a tool online somewhere that he uses to generate the models for speech recognition. When I find it, I'll update my answer with that info. You could take a look at the "motoric.launch" file for a "verbal joystick" interface.
Eric Perko (2011-02-24 09:00:30 -0600)

answered 2011-02-22 18:54:06 -0600

Peter Heim

I just tried it and it works fine; the voice recognition works for both me and my 9 year old son. I previously used voice recognition with the leaf robot (MS sapi5 ?). This works just as well for my voice and much better for my son's voice.

Asked: 2011-02-21 14:48:22 -0600

Seen: 9,238 times

Last updated: Mar 02 '17