
mathematical calculation using speech

asked 2012-11-09 19:58:11 -0600

this post is marked as community wiki

This post is a wiki. Anyone with karma >75 is welcome to improve it.

I want my robot to do simple calculations, such as addition or subtraction, on my voice command. How should I proceed? The code I am trying is as follows:

#!/usr/bin/env python

import roslib; roslib.load_manifest('pi_speech_tutorial')
import rospy

from time import gmtime, strftime
from geometry_msgs.msg import Twist
from std_msgs.msg import String
from sound_play.libsoundplay import SoundClient
from math import copysign
import re

Small = { 'zero': 0, 'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5, 'six': 6, 'seven': 7, 'eight': 8, 'nine': 9, 'ten': 10, 'eleven': 11, 'twelve': 12, 'thirteen': 13, 'fourteen': 14, 'fifteen': 15, 'sixteen': 16, 'seventeen': 17, 'eighteen': 18, 'nineteen': 19, 'twenty': 20, 'thirty': 30, 'forty': 40, 'fifty': 50, 'sixty': 60, 'seventy': 70, 'eighty': 80, 'ninety': 90 }

Magnitude = { 'thousand': 1000, 'million': 1000000, 'billion': 1000000000, 'trillion': 1000000000000, 'quadrillion': 1000000000000000, 'quintillion': 1000000000000000000, 'sextillion': 1000000000000000000000, 'septillion': 1000000000000000000000000, 'octillion': 1000000000000000000000000000, 'nonillion': 1000000000000000000000000000000, 'decillion': 1000000000000000000000000000000000, }

class NumberException(Exception):
    def __init__(self, msg):
        Exception.__init__(self, msg)


class TalkBack:
    def __init__(self):
        rospy.on_shutdown(self.cleanup)

        self.voice = rospy.get_param("~voice", "voice_don_diphone")
        self.wavepath = rospy.get_param("~wavepath", "")

        # Create the sound client object
        self.soundhandle = SoundClient()

        # Announce that we are ready for input
        self.soundhandle.playWave(self.wavepath + "/R2D2a.wav")
        self.soundhandle.say("Ready", self.voice)

        rospy.loginfo("Say commands...")

        # Subscribe to the recognizer output
        rospy.Subscriber('/recognizer/output', String, self.talkback)

        # A mapping from keywords to commands.
        self.keywords_to_command = {'plus': ['addition', 'add'],
                                    'minus': ['subtraction', 'subtract'],
                                    'divide': ['division', 'by'],
                                    'multiply': ['multiplication', 'into']}

        # rospy.loginfo("Ready to receive voice commands")

    def get_command(self, data):
        # Return the first command whose keywords appear in the recognized text.
        for (command, keywords) in self.keywords_to_command.items():
            for word in keywords:
                if data.find(word) > -1:
                    return command

    def text2num(self, s):
        # Convert number words such as "twenty one" to an integer.
        a = re.split(r"[\s-]+", s)
        n = 0
        g = 0
        for w in a:
            x = Small.get(w, None)
            if x is not None:
                g += x
            elif w == "hundred":
                g *= 100
            else:
                x = Magnitude.get(w, None)
                if x is not None:
                    n += g * x
                    g = 0
                else:
                    raise NumberException("Unknown number: "+w)
        return n + g

    def talkback(self, msg):
        command =
        rospy.loginfo("Command: " +

        if 'plus' in command.split():
            # Split the phrase around the operator and convert each side.
            var1, var2 = command.split("plus")
            var1 = self.text2num(var1.strip())
            var2 = self.text2num(var2.strip())
            var3 = var1 + var2
            self.soundhandle.say(str(var3), self.voice)


    def cleanup(self):
        # Called by rospy when the node shuts down.
        rospy.loginfo("Shutting down talkback node...")


if __name__ == "__main__":
    rospy.init_node('talkback')
    TalkBack()
    rospy.spin()



3 Answers


answered 2012-11-10 04:31:42 -0600


Interesting question. You can refer to this tutorial.

For your scenario, I think you need to do the following:

1. Construct the language model for keywords such as "plus" and "minus".
2. Launch the recognizer and have it use your language model.
3. Compute the value from the parsed output.
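
As a rough illustration of step 1, the word list for the language model could be generated rather than typed by hand. This is only a sketch: it assumes the language-model tool accepts a plain-text corpus with one phrase per line, and the vocabulary, the function names, and the file name are all made up for the example.

```python
# Hypothetical vocabulary; extend it with the words your robot should understand.
NUMBER_WORDS = ["zero", "one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine", "ten"]
OPERATOR_WORDS = ["plus", "minus"]


def build_corpus(numbers, operators):
    """Return phrases of the form '<number> <operator> <number>'."""
    lines = []
    for op in operators:
        for left in numbers:
            for right in numbers:
                lines.append("%s %s %s" % (left, op, right))
    return lines


def write_corpus(path, lines):
    """Write one phrase per line to a text file (assumed corpus format)."""
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
```

Calling `write_corpus("calculator_corpus.txt", build_corpus(NUMBER_WORDS, OPERATOR_WORDS))` would produce the text file to upload.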

For each part, I can give you some hints:

1. To construct the language model, write the words you need in a text file and upload it to this tool.

2. To launch the recognizer with the language model, you can adapt this command after you go through the tutorial I mentioned:

roslaunch pocketsphinx robocup.launch

3. To compute the value, you need a simple calculator. You can either write one using Flex and Bison or simply use someone else's calculator.
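
For expressions as simple as "two plus three", a full parser generator may be more than is needed. Below is a minimal sketch in plain Python, not a definitive implementation. It assumes the recognizer emits lowercase number words with exactly one operator word between them, and the operator vocabulary ("plus", "minus", "times", "over") and the function names are assumptions chosen for the example.

```python
import re

# Number words the recognizer is assumed to emit.
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19, "twenty": 20, "thirty": 30, "forty": 40,
    "fifty": 50, "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}

# Spoken operator keyword (assumed vocabulary) -> function of two numbers.
OPERATORS = {
    "plus": lambda a, b: a + b,
    "minus": lambda a, b: a - b,
    "times": lambda a, b: a * b,
    "over": lambda a, b: a / b,
}


def words_to_number(words):
    """Convert a list such as ['one', 'hundred', 'five'] to 105."""
    if not words:
        raise ValueError("missing operand")
    total = 0
    for word in words:
        if word in UNITS:
            total += UNITS[word]
        elif word == "hundred":
            total *= 100
        else:
            raise ValueError("unknown number word: " + word)
    return total


def evaluate(phrase):
    """Evaluate a phrase of the form '<number> <operator> <number>'."""
    words = re.split(r"[\s-]+", phrase.lower().strip())
    for i, word in enumerate(words):
        if word in OPERATORS:
            left = words_to_number(words[:i])
            right = words_to_number(words[i + 1:])
            return OPERATORS[word](left, right)
    raise ValueError("no operator keyword in: " + phrase)
```

The operator word is what tells the program whether the problem is an addition or a subtraction: the phrase is split at that word, and each side is converted to a number separately. This sketch only handles numbers below one thousand.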

If you want the robot to speak the answer, refer to the last part of the tutorial.



I have read that tutorial and my language model is also complete, but I cannot work out how the robot will recognize whether it is an addition or a subtraction problem.

karan4515 (2012-11-11 19:55:26 -0600)

A simple method is to map the recognized output to keywords and then perform the calculation. One possible implementation is to store the recognized output in a string and send that string to a service that parses it and performs the calculation. The work of recognizing "plus" is done in the service.

Po-Jen Lai (2012-11-11 20:19:58 -0600)

"Send this string to a service which can parse this string and perform calculation"? Which services are you referring to, sir?

karan4515 (2012-11-15 21:53:01 -0600)

I mean write one yourself; it is not that hard. Programs that perform this function are already available on the web. Just look for one and add a service to that program.

Po-Jen Lai (2012-11-15 23:08:11 -0600)

answered 2012-11-10 04:14:07 -0600

SL Remy

There are many possible ways that this can be accomplished. I can outline one path.

Develop (download) an API like Sphinx, Dragon, or Julius to perform speech-to-text conversion.

Develop (download) an API like Festival or espeak to perform text-to-speech conversion.

Develop (download) a node that uses each of these to perform text-to-speech and speech-to-text. sound_play and pocketsphinx may be all you need.

Develop (download) a node to parse a std_msgs/String and determine the calculation to be performed. Once that is known, you could make a service call (as shown in the tutorials) to generate the answer. Once the answer is acquired, it can be published (again as a std_msgs/String), and the text-to-speech node would provide the audible response.
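
The parsing node described in the last step might look roughly like the sketch below. It is only an outline under several assumptions: the answer is computed inline rather than through a service call, the node name `voice_calculator` and the topic `/calculator/answer` are hypothetical, and the stand-in calculator only understands single digits with "plus" or "minus". The input topic `/recognizer/output` is the one used in the question.

```python
try:
    import rospy
    from std_msgs.msg import String
except ImportError:
    # Lets the parsing logic be tried on a machine without ROS installed.
    rospy = None

# Stand-in vocabulary; a real node would use a fuller number parser.
DIGITS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
          "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9}
OPERATIONS = {"plus": lambda a, b: a + b,
              "minus": lambda a, b: a - b}


def answer_for(text):
    """Return the answer as a string for '<digit> <operator> <digit>', else None."""
    words = text.lower().split()
    if len(words) != 3:
        return None
    left, op, right = words
    if left not in DIGITS or right not in DIGITS or op not in OPERATIONS:
        return None
    return str(OPERATIONS[op](DIGITS[left], DIGITS[right]))


def main():
    rospy.init_node("voice_calculator")  # hypothetical node name
    # Hypothetical output topic; a text-to-speech node would subscribe to it.
    answers = rospy.Publisher("/calculator/answer", String, queue_size=10)

    def on_speech(msg):
        # msg.data holds the recognized phrase as plain text.
        answer = answer_for(msg.data)
        if answer is not None:
            answers.publish(String(data=answer))

    rospy.Subscriber("/recognizer/output", String, on_speech)
    rospy.spin()


if __name__ == "__main__" and rospy is not None:
    main()
```

Keeping `answer_for` free of ROS calls means the calculation can be checked on its own before the node is wired to the recognizer and the speech output.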

Good luck!


answered 2012-11-10 04:11:22 -0600

joq

To answer the ROS part of your question: use the audio_common stack. It defines the audio_common_msgs/AudioData message, and provides capture and playback commands.

The hard part of your question is how to do speech recognition. That is a large topic, and this may not be the best forum for answering it, although some people here probably know something about it.

Perhaps you should join one of the speech recognition forums.




Asked: 2012-11-09 19:58:11 -0600

Seen: 634 times

Last updated: Nov 25 '12