pocketsphinx will work. I've done it that way. But of course the goal gets sent to move_base, not to RVIZ.

In my case, I created a node that takes the desired goal location from pocketsphinx as a string ("door", "kitchen", etc.), looks up the pose information in a table, and then sends that pose as a goal to the navigation stack. I discussed sending the goal in an answer to this post: https://answers.ros.org/question/259418/sending-goals-to-navigation-stack-using-code/#259427
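
The table lookup itself can be sketched in a few lines of plain Python. This is only an illustration of the idea, not my actual node: the location names and the (x, y, yaw) values are made-up placeholders that you would replace with poses taken from your own map.

```python
# Hypothetical lookup table: place name -> (x, y, yaw) in the map frame.
# All numbers are placeholders, not values from a real map.
LOCATIONS = {
    "door": (0.0, 0.0, 0.0),
    "kitchen": (3.5, 1.2, 1.57),
    "coffee table": (1.0, -2.0, 3.14),
}


def lookup_pose(recognized_text, table=LOCATIONS):
    """Return the stored (x, y, yaw) for a recognized phrase, or None."""
    # Lowercase and collapse whitespace so "  KITCHEN " matches "kitchen".
    key = " ".join(recognized_text.lower().split())
    return table.get(key)
```

Returning None for unknown phrases lets the caller ignore anything the recognizer hears that is not a stored place name.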

*Update after follow-on question:*

In the comment below you ask how to do this in reality, but if you have it working on the turtlebot then you should be able to extend it to move_base as described above. It depends on how you want to do the voice commands, or rather what type of commands you want to give. Do you want to:

  1. give it a goal location using coordinates and an orientation, like "go to ten dot 23 X 2 dot 1 Y 2 dot 5 Z"?
  2. give predefined places in the map names and then simply say the name (as I describe above), like "kitchen"?
  3. say "forward", "back" and essentially teleop it around with your voice?

You may approach it differently depending on what you want to do. It sounds like you want autonomous navigation, so I think option 3 is out and you'll need to decide between options 1 and 2.

I'll assume option 2, since option 1 really makes no sense to me even though it is what you asked for.

Step 1 - add the names of the predefined locations to pocketsphinx

You'll use the voice_cmd.launch file to start pocketsphinx. voice_cmd.lm and voice_cmd.dic will both need updating with the new words you need pocketsphinx to recognize. Here is my voice_cmd.dic file, which shows pocketsphinx how to pronounce the words. This file includes words that did not come with the tutorial.

I got the phonetic spellings from this web page: http://www.speech.cs.cmu.edu/cgi-bin/cmudict

BACK    B AE K
BACKWARD    B AE K W ER D
FORWARD F AO R W ER D
FULL    F UH L
HALF    HH AE F
HALT    HH AO L T
LEFT    L EH F T
MOVE    M UW V
RIGHT   R AY T
SPEED   S P IY D
STOP    S T AA P
COFFEE  K AA F IY
COFFEE  K AO F IY
TABLE   T EY B AH L
KITCHEN K IH CH AH N
ROBOT   R OW B AA T
DINING  D AY N IH NG
CANCEL  K AE N S AH L
REBECCA R AH B EH K AH
LASER   L EY Z ER
ON  AA N
OFF AO F
SING    S IH NG
FRONT   F R AH N T
DOOR    D AO R
THE DH AH
BAR B AA R
REMOTE  R IH M OW T
CONTROL K AH N T R OW L
WAKE    W EY K
SLEEP   S L IY P
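
Since the dictionary and the language model have to agree on the vocabulary, a small check can save some trial and error. The sketch below is my own illustration, not part of pocketsphinx: it parses the text of a .dic file and reports which words from a list have no pronunciation entry. The function names are made up for this example.

```python
def load_dictionary(dic_text):
    """Parse .dic text into {WORD: [list of phone lists]}."""
    entries = {}
    for line in dic_text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, phones = parts[0], parts[1:]
        # A word may be listed more than once with different pronunciations.
        entries.setdefault(word, []).append(phones)
    return entries


def missing_words(entries, needed):
    """Return the words in `needed` that have no dictionary entry."""
    return [word for word in needed if word not in entries]
```

For example, UP appears in the language model posted further down but has no entry in the dictionary above, so `missing_words(load_dictionary(text), ["DOOR", "UP"])` would flag UP.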

The new words also need to be added to the .lm file, and I'll tell you right now that the LM file is tricky. I'm just going to post the entire file as I use it because I can't fully explain how to set it up. I had to play around quite a bit as I added words to the dictionary to keep pocketsphinx working. If you have questions about this file, please google pocketsphinx instead of asking for my help. As you add words to the list you need to update the ngram counts in the \data\ section. It wasn't always clear to me where to add the words, and never clear what numbers I should use before and after the words (in the ARPA format these are a log10 probability and a backoff weight), but the way I did it seems to work OK.

Language model created by QuickLM on Sat Mar 19 20:15:00 EDT 2011
Copyright (c) 1996-2000
Carnegie Mellon University and Alexander I. Rudnicky

This model based on a corpus of 14 sentences and 13 words
The (fixed) discount mass is 0.5

\data\
ngram 1=33
ngram 2=37
ngram 3=26

\1-grams:
-0.8451 </s> -0.3010
-0.8451 <s> -0.2074
-1.6902 BACK -0.2341
-1.6902 BACKWARD -0.2341
-1.6902 FORWARD -0.2341
-1.9912 FULL -0.2921
-1.9912 HALF -0.2921
-1.9912 HALT -0.2341
-1.6902 LEFT -0.2341
-1.2923 MOVE -0.2543
-1.6902 RIGHT -0.2341
-1.6902 SPEED -0.2341
-1.9912 STOP -0.2341
-1.9912 COFFEE -0.2341
-1.9912 TABLE -0.2341
-1.9912 DINING -0.2341
-1.9912 ROBOT -0.2341
-1.9912 KITCHEN -0.2341
-1.9912 CANCEL -0.2341
-1.9912 REBECCA -0.2341
-1.9912 LASER -0.2341
-1.9912 ON -0.2341
-1.9912 OFF -0.2341
-1.9912 SING -0.2341
-1.9912 REMOTE -0.2341
-1.9912 CONTROL -0.2341
-1.9912 FRONT -0.2341
-1.9912 DOOR -0.2341
-1.9912 THE -0.2341
-1.9912 BAR -0.2341
-1.9912 WAKE -0.2341
-1.9912 SLEEP -0.2341
-1.9912 UP -0.2341

\2-grams:
-1.4472 <s> BACK 0.0000
-1.4472 <s> BACKWARD 0.0000
-1.4472 <s> FORWARD 0.0000
-1.4472 <s> FULL 0.0000
-1.4472 <s> HALF 0.0000
-1.4472 <s> HALT 0.0000
-1.4472 <s> LEFT 0.0000
-0.7482 <s> MOVE 0.0000
-1.4472 <s> RIGHT 0.0000
-1.4472 <s> STOP 0.0000
-1.4472 <s> COFFEE 0.0000
-1.4472 <s> DINING 0.0000
-1.4472 <s> ROBOT 0.0000
-1.4472 <s> KITCHEN 0.0000
-0.3010 BACK </s> -0.3010
-0.3010 BACKWARD </s> -0.3010
-0.3010 FORWARD </s> -0.3010
-0.3010 FULL SPEED 0.0000
-0.3010 HALF SPEED 0.0000
-0.3010 HALT </s> -0.3010
-0.3010 LEFT </s> -0.3010
-1.0000 MOVE BACK 0.0000
-1.0000 MOVE BACKWARD 0.0000
-1.0000 MOVE FORWARD 0.0000
-1.0000 MOVE LEFT 0.0000
-1.0000 MOVE RIGHT 0.0000
-0.3010 RIGHT </s> -0.3010
-0.3010 SPEED </s> -0.3010
-0.3010 STOP </s> -0.3010
-1.0000 COFFEE TABLE 0.0000
-1.0000 DINING TABLE 0.0000
-1.0000 LASER OFF 0.0000
-1.0000 LASER ON 0.0000
-0.3010 REMOTE CONTROL -0.3010
-1.0000 FRONT DOOR 0.0000
-1.0000 THE BAR 0.0000
-1.0000 WAKE UP 0.0000


\3-grams:
-0.3010 <s> BACK </s>
-0.3010 <s> BACKWARD </s>
-0.3010 <s> FORWARD </s>
-0.3010 <s> FULL SPEED
-0.3010 <s> HALF SPEED
-0.3010 <s> HALT </s>
-0.3010 <s> LEFT </s>
-1.0000 <s> MOVE BACK
-1.0000 <s> MOVE BACKWARD
-1.0000 <s> MOVE FORWARD
-1.0000 <s> MOVE LEFT
-1.0000 <s> MOVE RIGHT
-0.3010 <s> RIGHT </s>
-0.3010 <s> STOP </s>
-1.0000 <s> COFFEE TABLE
-1.0000 <s> DINING TABLE
-0.3010 FULL SPEED </s>
-0.3010 HALF SPEED </s>
-0.3010 MOVE BACK </s>
-0.3010 MOVE BACKWARD </s>
-0.3010 MOVE FORWARD </s>
-0.3010 MOVE LEFT </s>
-0.3010 MOVE RIGHT </s>
-0.3010 COFFEE TABLE </s>
-0.3010 DINING TABLE </s>
-0.3010 REMOTE CONTROL </s>

\end\
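
Because the counts in the \data\ section have to be updated by hand whenever lines are added, it may help to verify them with a script. The following is a rough sketch under the assumption that the file follows the layout shown above (a \data\ header with `ngram N=count` lines, then `\N-grams:` sections, then \end\). The function name is hypothetical.

```python
import re


def count_ngrams(lm_text):
    """Return (declared, actual) dicts mapping n-gram order -> entry count."""
    declared, actual = {}, {}
    section = None  # order of the \N-grams: section being read, if any
    for raw in lm_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = re.match(r"ngram\s+(\d+)\s*=\s*(\d+)$", line)
        if header:
            declared[int(header.group(1))] = int(header.group(2))
            continue
        start = re.match(r"\\(\d+)-grams:$", line)
        if start:
            section = int(start.group(1))
            actual[section] = 0
            continue
        if line.startswith("\\"):
            section = None  # \data\ or \end\ marker
            continue
        if section is not None:
            actual[section] += 1  # one n-gram entry per non-blank line
    return declared, actual
```

If `declared` and `actual` differ for any order, the header count for that order needs adjusting.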

Pocketsphinx will publish the recognized text on the /recognizer/output topic. You need a node that subscribes to that topic, looks up the location, and sends the corresponding goal to move_base, as I mentioned in the original answer.
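
A minimal sketch of such a node is below. Treat it as a starting point under several assumptions rather than my actual code: it assumes ROS 1 with rospy, actionlib and move_base_msgs installed, that the recognizer publishes std_msgs/String, that move_base runs under the default "move_base" action name, and that goals are given in the "map" frame. The node name and the `locations` argument (a dict mapping a phrase to x, y, yaw) are made up for the example.

```python
import math


def yaw_to_quaternion(yaw):
    """Return (x, y, z, w) for a rotation of `yaw` radians about the z axis."""
    return (0.0, 0.0, math.sin(yaw / 2.0), math.cos(yaw / 2.0))


def run_node(locations):
    """Listen for recognized phrases and send matching poses to move_base."""
    # ROS imports live inside the function so the helper above can be used
    # and tested on a machine without ROS installed.
    import rospy
    import actionlib
    from std_msgs.msg import String
    from move_base_msgs.msg import MoveBaseAction, MoveBaseGoal

    rospy.init_node("voice_goal_sender")  # hypothetical node name
    client = actionlib.SimpleActionClient("move_base", MoveBaseAction)
    client.wait_for_server()

    def on_speech(msg):
        # Normalize case and whitespace before looking the phrase up.
        name = " ".join(msg.data.lower().split())
        if name not in locations:
            rospy.loginfo("No stored pose for '%s'", name)
            return
        x, y, yaw = locations[name]
        qx, qy, qz, qw = yaw_to_quaternion(yaw)
        goal = MoveBaseGoal()
        goal.target_pose.header.frame_id = "map"  # assumed global frame
        goal.target_pose.header.stamp = rospy.Time.now()
        goal.target_pose.pose.position.x = x
        goal.target_pose.pose.position.y = y
        goal.target_pose.pose.orientation.x = qx
        goal.target_pose.pose.orientation.y = qy
        goal.target_pose.pose.orientation.z = qz
        goal.target_pose.pose.orientation.w = qw
        client.send_goal(goal)

    rospy.Subscriber("/recognizer/output", String, on_speech)
    rospy.spin()
```

You would call something like `run_node({"kitchen": (3.5, 1.2, 1.57)})` from your node script, with real poses from your map in place of these placeholder numbers.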

If you need help with that part of the project after looking through the question/answer I linked in the original answer, that's probably best asked as a new question.