Speech recognition using Google Voice

wjb711 posted on 2013-9-29 16:18:44

#!/bin/bash

echo "Recording... Press Ctrl+C to Stop."
# Record from the microphone on ALSA card 1 and convert the audio to 16 kHz FLAC
arecord -D "plughw:1,0" -q -f cd -t wav | ffmpeg -loglevel panic -y -i - -ar 16000 -acodec flac file.flac > /dev/null 2>&1

echo "Processing..."
# Post the recording to Google and keep the 12th quote-delimited field of the reply (the recognized text)
wget -q -U "Mozilla/5.0" --post-file file.flac --header "Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=en-us&client=chromium" | cut -d\" -f12 > stt.txt

echo -n "You Said: "
cat stt.txt

rm file.flac > /dev/null 2>&1

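Save the script above as a file named speech2text.sh and plug in a webcam that has a built-in microphone.
Speak English; when luck is on your side, it really does transcribe what you said.

Most of the time, though, the request gets blocked by the firewall. A pity for such a good free service.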

Reposted from
http://blog.oscarliang.net/raspberry-pi-voice-recognition-works-like-siri/

Raspberry Pi Speech Recognition Introduction

This tutorial demonstrates how to use voice recognition on the Raspberry Pi. By the end of this demonstration, we should have a working application that understands a spoken question and answers it aloud.

This is going to be a simple project because free APIs are available for every goal we want to achieve. The application converts our spoken question into text, processes the query to obtain an answer, and finally turns the answer from text into speech. I will divide this demonstration into four parts:
[*]speech to text
[*]query processing
[*]text to speech
[*]Putting Them Together

Raspberry Pi Voice Recognition For Home Automation

This has been a very popular topic since the Raspberry Pi came out. With the help of this tutorial, it should be quite easy to achieve. I have an idea of combining the speech recognition ability on the Raspberry Pi with its powerful digital/analog I/O hardware to build a useful voice control system, which could also be adopted in robotics and home automation. This will be covered in the next couple of blog posts.

Hardware and Preparation

http://blog.oscarliang.net/wp-content/uploads/2013/06/IMAG0683.jpg
You can use a USB microphone, but I don’t have one, so I am using the built-in mic on my webcam. It worked straight away without any driver installation or configuration.

http://blog.oscarliang.net/wp-content/uploads/2013/06/8601p.jpg
Of course, you need the Raspberry Pi as well.

http://blog.oscarliang.net/wp-content/uploads/2013/06/images.jpg
You will also need an internet connection on your Raspberry Pi.

Speech To Text

Speech recognition can be achieved in many ways on Linux (and therefore on the Raspberry Pi), but personally I think the easiest way is to use Google’s voice recognition API. I have to say, the accuracy is very good, given that I have a strong accent. To make sure recording is set up, first install ffmpeg:

sudo apt-get install ffmpeg

To use Google’s voice recognition API, I use the following bash script. You can simply copy it and save it as “speech2text.sh”:

#!/bin/bash

echo "Recording... Press Ctrl+C to Stop."
# Record from the microphone on ALSA card 1 and convert the audio to 16 kHz FLAC
arecord -D "plughw:1,0" -q -f cd -t wav | ffmpeg -loglevel panic -y -i - -ar 16000 -acodec flac file.flac > /dev/null 2>&1

echo "Processing..."
# Post the recording to Google and keep the 12th quote-delimited field of the reply (the recognized text)
wget -q -U "Mozilla/5.0" --post-file file.flac --header "Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=en-us&client=chromium" | cut -d\" -f12 > stt.txt

echo -n "You Said: "
cat stt.txt

rm file.flac > /dev/null 2>&1
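
The `cut -d\" -f12` step in speech2text.sh keeps the twelfth quote-delimited field of Google's reply. As a rough sketch of what that selection amounts to, the Python 3 snippet below parses a reply as JSON instead. The reply shape used here (a `hypotheses` list whose entries carry an `utterance`) is an assumption inferred from the field position, and the sample string and the name `extract_utterance` are made up for illustration.

```python
import json


def extract_utterance(response_text):
    """Return the first recognized utterance, or "" if none is present.

    Assumed reply shape (not confirmed by the post):
    {"status": ..., "id": ..., "hypotheses": [{"utterance": ..., "confidence": ...}]}
    """
    try:
        data = json.loads(response_text)
    except ValueError:
        # Empty or non-JSON reply, e.g. when the request was blocked
        return ""
    if not isinstance(data, dict):
        return ""
    hypotheses = data.get("hypotheses") or []
    if not hypotheses:
        return ""
    return hypotheses[0].get("utterance", "")


# Hypothetical sample reply, for illustration only
sample = '{"status":0,"id":"abc","hypotheses":[{"utterance":"hello world","confidence":0.9}]}'

print(extract_utterance(sample))
# The same field that cut -d\" -f12 would select (index 11 when counting from 0)
print(sample.split('"')[11])
```

Parsing the JSON has the advantage of returning an empty string instead of an unrelated field when the reply is empty or has a different layout.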


speech2text.sh starts recording and saves the audio in a FLAC file. You can stop the recording by pressing Ctrl+C. The audio file is then sent to Google for conversion, and the returned text is saved in a file called “stt.txt”. Finally, the audio file is deleted.

Make the script executable:

chmod +x speech2text.sh

Then run it:

./speech2text.sh

The screenshot shows some tests I did.
http://blog.oscarliang.net/wp-content/uploads/2013/06/1.png

Query Processing

Processing the query is just like “Googling” a question, except that we want a single answer back for each question. Wolfram Alpha seems to be a good choice here. There is a Python interface library for it, which makes our life much easier, but you need to install it first.

Installing Wolframalpha Python Library

Download the package from https://pypi.python.org/pypi/wolframalpha and unzip it somewhere. Then install setuptools and build the package:

sudo apt-get install python-setuptools
sudo easy_install pip
sudo python setup.py build

Finally, run the installation:

sudo python setup.py install

Getting the APP_ID

To get a unique Wolfram Alpha AppID, sign up for a Wolfram Alpha Application ID. You should then be signed in to the Wolfram Alpha Developer Portal. On the My Apps tab, click the “Get an AppID” button and fill out the “Get a New AppID” form. Use any application name and description you like, then click the “Get AppID” button.

Wolfram Alpha Python Interface

Save this Python script as “queryprocess.py”:

#!/usr/bin/python

import wolframalpha
import sys

# Get a free API key here http://products.wolframalpha.com/api/
# This is a fake ID, go and get your own, instructions on my blog.
app_id='HYO4TL-A9QOUALOPX'

client = wolframalpha.Client(app_id)

# Join the command-line words into one query, skipping the script name
query = ' '.join(sys.argv[1:])
res = client.query(query)

# Assumption: the second pod holds the answer (the pod index is missing in this copy)
if len(res.pods) > 1:
    texts = ""
    pod = res.pods[1]
    if pod.text:
        texts = pod.text
    else:
        texts = "I have no answer for that"
    # drop non-ASCII characters to avoid encoding errors when printing
    texts = texts.encode('ascii', 'ignore')
    print texts
else:
    print "Sorry, I am not sure."
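
queryprocess.py does two small things besides calling Wolfram Alpha: it builds the query from the command-line words and it picks one pod's text as the answer. The Python 3 sketch below isolates those two steps so they can be tried without an AppID. `Pod`, `build_query`, and `first_answer` are hypothetical names, the pod data is invented, and the idea that the leading pod only restates the input is an assumption rather than something the post states.

```python
from collections import namedtuple

# Stand-in for a result pod; only a "text" attribute is assumed, as in the script.
Pod = namedtuple("Pod", ["title", "text"])


def build_query(argv):
    """Join the command-line words, skipping argv[0] (the script name)."""
    return " ".join(argv[1:])


def first_answer(pods, fallback="I have no answer for that"):
    """Return the text of the first pod after the leading one that has any text."""
    for pod in pods[1:]:
        if pod.text:
            return pod.text
    return fallback


# Invented data, for illustration only
pods = [
    Pod("Input interpretation", "capital of France"),
    Pod("Result", "Paris, France"),
]

print(build_query(["queryprocess.py", "capital", "of", "France"]))
print(first_answer(pods))
```

Scanning the remaining pods instead of reading a fixed index avoids an error when a reply contains only one pod, and skips pods that carry an image but no text.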


You can test queryprocess.py as shown in the screenshot below.
http://blog.oscarliang.net/wp-content/uploads/2013/06/2.png

Text To Speech

The processed query returns an answer as text. What we need to do now is turn that text into audible speech. There are a few options available, such as Cepstral or Festival, but I chose Google’s speech service because of its excellent quality.

First of all, to play audio we need to install mplayer:

sudo apt-get install mplayer

The following simple bash script downloads the MP3 file via the URL and plays it. Save it as “text2speech.sh”:

#!/bin/bash
# Join all arguments with "+" (via IFS) and let mplayer stream the spoken result
say() { local IFS=+; /usr/bin/mplayer -ao alsa -really-quiet -noconsolecontrols "http://translate.google.com/translate_tts?tl=en&q=$*"; }
say $*


Make it executable:

chmod +x text2speech.sh

To test it, you can try:

./text2speech.sh "My name is Oscar and I am testing the audio."

Google Text To Speech Text Length Limitation

Although it’s very kind of Google to share this great service, there is a limit on the length of the message. I think it’s around 100 characters.

To work around this, here is an upgraded bash script that breaks the text into multiple parts, each no longer than 100 characters, so that every part can be played successfully. I adapted an existing script to fit our application.

#!/bin/bash

# Split the input into chunks shorter than 100 characters, breaking between words
INPUT=$*
STRINGNUM=0
ary=($INPUT)
for key in "${!ary[@]}"
do
    SHORTTMP[$STRINGNUM]="${SHORTTMP[$STRINGNUM]} ${ary[$key]}"
    LENGTH=${#SHORTTMP[$STRINGNUM]}

    if [[ "$LENGTH" -lt "100" ]]; then
        SHORT[$STRINGNUM]=${SHORTTMP[$STRINGNUM]}
    else
        # The chunk would get too long: start a new one with the current word
        STRINGNUM=$(($STRINGNUM+1))
        SHORTTMP[$STRINGNUM]="${ary[$key]}"
        SHORT[$STRINGNUM]="${ary[$key]}"
    fi
done

# Join the words of a chunk with "+" (via IFS) and play it
say() { local IFS=+; /usr/bin/mplayer -ao alsa -really-quiet -noconsolecontrols "http://translate.google.com/translate_tts?tl=en&q=$*"; }
for key in "${!SHORT[@]}"
do
    # Left unquoted on purpose so each word becomes a separate argument
    say ${SHORT[$key]}
done
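
The chunking idea in the script above can be sketched in a few lines of Python 3, which may be easier to experiment with than the bash arrays. This is only an illustration under stated assumptions: `split_text` and `build_tts_url` are hypothetical helper names, the 100-character limit is the rough figure mentioned in the text, and the URL and its `tl` and `q` parameters are copied from text2speech.sh.

```python
from urllib.parse import urlencode

# URL taken from text2speech.sh
TTS_URL = "http://translate.google.com/translate_tts"


def split_text(text, limit=100):
    """Split text into chunks of at most `limit` characters, breaking between words.

    A single word longer than `limit` is kept whole as its own chunk.
    """
    chunks = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit or not current:
            current = candidate
        else:
            # Adding this word would exceed the limit: close the chunk
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def build_tts_url(chunk, lang="en"):
    """Build the request URL for one chunk; urlencode turns spaces into '+'."""
    return TTS_URL + "?" + urlencode({"tl": lang, "q": chunk})


if __name__ == "__main__":
    answer = "word " * 50
    for part in split_text(answer):
        print(len(part), build_tts_url(part))
```

Encoding the query with `urlencode` also escapes characters such as `&` or `?` that would otherwise be read as part of the URL syntax.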


Putting It Together

For all of these scripts to work together, we have to call them from another script. I call this “main.sh”:

#!/bin/bash

echo "Recording... Press Ctrl+C to Stop."

# Record the question and write the recognized text to stt.txt
./speech2text.sh

QUESTION=$(cat stt.txt)
echo "Me: $QUESTION"

# Ask Wolfram Alpha and capture the printed answer
ANSWER=$(python queryprocess.py $QUESTION)
echo "Robot: $ANSWER"

# Read the answer aloud
./text2speech.sh $ANSWER


I have also updated “speech2text.sh” and removed all of its “echo” commands:

#!/bin/bash

arecord -D "plughw:1,0" -q -f cd -t wav | ffmpeg -loglevel panic -y -i - -ar 16000 -acodec flac file.flac > /dev/null 2>&1
wget -q -U "Mozilla/5.0" --post-file file.flac --header "Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=en-us&client=chromium" | cut -d\" -f12 > stt.txt
rm file.flac > /dev/null 2>&1


Finally, make “main.sh” executable, run it, and have a silly conversation with your computer:

chmod +x main.sh
./main.sh

The End

That’s the end of the Raspberry Pi Voice Recognition tutorial, but it’s just the beginning of the fun! You can now modify this project and turn it into something really cool; let me know what you come up with. In the next project, I will use the speech-to-text feature to build a voice control system that controls an Arduino board and, even better, a robot.

Have fun.
