Sophie will be able to listen to your voice commands, execute them and give you back a spoken reply!
For this, we’re going to use a BeagleBoard (a BeagleBone Black in this case), a USB sound card, a microphone and a set of speakers.
Step 1:
Log in to your wit.ai account (this is an easy step: you can simply log in using your GitHub credentials).
There, you will need to create your first app (I’m calling my assistant Sophie) and a set of intents. For this version, I’m only expecting Sophie to respond to wiki and weather queries, so you will need to create two intents:
1. wiki beirut
2. weather byblos
and then you will need to specify the entity type for each (location for “byblos” in the weather intent, and wikipedia_search_query for the wiki intent).
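Once trained, wit.ai will hand back the intent and entities it detected in each utterance. As a rough illustration (the exact shape depends on the wit.ai API version you are on), a query like “weather byblos” comes back looking something like this:

{
  "_text": "weather byblos",
  "outcomes": [
    {
      "intent": "weather",
      "entities": {
        "location": [ { "value": "byblos" } ]
      },
      "confidence": 0.93
    }
  ]
}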
Step 2:
Create the wiki and weather connectors in scriptr.io.
For the weather connector, you can reuse the code from the weather machine (it requires a free AccuWeather account and a scriptr script).
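If you would rather not copy the whole thing over, a minimal weather connector looks roughly like the sketch below; it assumes the AccuWeather “locations” and “currentconditions” REST endpoints and that scriptr exposes incoming parameters on request.parameters (adapt to whatever the weather machine code actually does):

// minimal weather connector sketch (adapt to the weather machine code)
var http = require("http");
var apikey = "YOUR_ACCUWEATHER_KEY"; // from your free accuweather account

// 1. resolve the location name (e.g. "byblos") to an accuweather location key
var searchRes = http.request({
  "url": "http://dataservice.accuweather.com/locations/v1/cities/search",
  "params": { "apikey": apikey, "q": request.parameters.location }
});
var locationKey = JSON.parse(searchRes.body)[0].Key;

// 2. fetch the current conditions for that key
var condRes = http.request({
  "url": "http://dataservice.accuweather.com/currentconditions/v1/" + locationKey,
  "params": { "apikey": apikey }
});
var conditions = JSON.parse(condRes.body)[0];

// 3. return a sentence Sophie can speak back
return "The weather in " + request.parameters.location + " is " +
  conditions.WeatherText + ", " + conditions.Temperature.Metric.Value + " degrees";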
For wiki, we’re going to use the MediaWiki API and get back the extracts in JSON mode: https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&explaintext=&titles=Beirut
The code will look like this:

// wiki connector: query the MediaWiki extracts API for the requested title
var http = require("http");

var requestObject = {
  "url": "https://en.wikipedia.org/w/api.php",
  "params": {
    "format": "json",
    "action": "query",
    "prop": "extracts",
    "explaintext": ""
  }
};

// "search" holds the request parameter mapped from wit.ai's wikipedia_search_query entity
requestObject.params.titles = search;
var res = http.request(requestObject);
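From there, the extract still has to be pulled out of the response before it can be spoken. A minimal sketch, assuming scriptr exposes the raw response body as res.body (the query result is keyed by page id, which varies per article, hence the loop):

var data = JSON.parse(res.body);
var pages = data.query.pages;
var extract = "";
for (var pageId in pages) { // e.g. pages["36922"] for Beirut
  extract = pages[pageId].extract;
}
return extract;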
Step 3:
Get the wit.ai scriptr client from our github repo and configure it to point to your connectors.
I’ve called the wiki connector “wit/wiki” and the weather connector “wit/weather”, so we will need to configure the wit.ai client to use these:
var mapping = {
  "wit_weather": {
    "script": "wit/weather",
    "params": { "location": "location" }
  },
  "wit_wiki": {
    "script": "wit/wiki",
    "params": { "wikipedia_search_query": "search" }
  }
};
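With this in place, the client routes each wit.ai outcome by looking up the intent in the mapping, renaming the entities to the parameter names the connector expects, and invoking the script. Conceptually it does something like the sketch below; the real routing is done by the client from the repo, and invoke() is just a stand-in for however it calls a scriptr script:

// hypothetical routing sketch; the github client does this for real
function dispatch(outcome) {
  var entry = mapping["wit_" + outcome.intent]; // e.g. "wit_weather"
  var params = {};
  for (var entityName in entry.params) {
    // rename the wit.ai entity to the connector's parameter name,
    // e.g. "wikipedia_search_query" becomes "search"
    params[entry.params[entityName]] = outcome.entities[entityName][0].value;
  }
  return invoke(entry.script, params); // stand-in for the actual script call
}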
Step 4:
Create the scriptr API that will handle all requests from your device.
This script needs to be able to do the following (there’s a rough sketch of the whole flow after the list):
1. receive an mp3 file
2. post it to wit.ai and get back the intent / entities
3. use the wit.ai client to process the wit.ai request and call the appropriate connector (based on the mappings)
4. get the response from the wit.ai client and pass it along to a text-to-speech service (tts-api.com), which gives us back a URL pointing to an mp3 version of the response
The scripts involved are wit-config, wit-reply, wit-wiki and wit-weather.
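Putting it together, the main script looks roughly like this. This is a sketch rather than the repo’s wit-reply verbatim: it assumes wit.ai’s /speech endpoint (POST with an audio/mpeg body and a Bearer token), the dispatch() idea from step 3, and a hypothetical getSpeechUrl() wrapper around tts-api.com:

// hypothetical end-to-end sketch of the main API script
var http = require("http");

// 1. the mp3 arrives with the request; how you read it depends on scriptr's file handling
var audio = request.rawBody; // hypothetical accessor for the uploaded bytes

// 2. post it to wit.ai's speech endpoint to get the intent and entities back
var witRes = http.request({
  "url": "https://api.wit.ai/speech",
  "method": "POST",
  "headers": {
    "Authorization": "Bearer YOUR_WIT_TOKEN",
    "Content-Type": "audio/mpeg"
  },
  "body": audio
});
var outcome = JSON.parse(witRes.body).outcomes[0];

// 3. route the outcome to the right connector via the mapping from step 3
var answer = dispatch(outcome);

// 4. turn the text answer into speech and hand the mp3 url back to the device
return getSpeechUrl(answer); // hypothetical wrapper around tts-api.com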
Step 5:
Configure the hardware side, which consists of a python script that will:
1. open the microphone and listen while the sound stays under a certain threshold
2. record the voice segment that goes over that threshold
3. encode the wave file into an mp3 file
4. upload the encoded file to a scriptr script
5. fetch the url that contains the voice response to the query
6. download the response
7. play it on your speakers
To do so, we’re going to use python (pre-installed), pyaudio (apt-get install python-pyaudio), requests (pip install requests), lame (apt-get install lame) to encode the output into mp3, and mplayer (apt-get install mplayer) to play back the text-to-speech’d response, plus some pyaudio code quickly fetched from forums (e.g. Stack Overflow) that shows how to listen for and record commands.
Finally, we will wrap all of this in a continuous loop, like this:
print("please speak a command into the microphone") record_to_file(path) print("done - result written to " + path) call(["lame", path, "command.mp3"]) postToScriptr("command.mp3") playmp3("out.mp3")
Step 6:
Meet Sophie!
Feel free to share with us any other functionality you might teach her!