Thursday 12 January 2017

Alice - A Jarvis-like Digital Assistant with Smart Home Automation

Demo:

(Demo video embedded here.)

If the video above doesn't play, here is the link to the demo video.

Being lazy, I always wished for something like Jarvis (for those who don't know, the AI assistant from Iron Man) that would do whatever I say, and also because it would be so cool to have something like that. And honestly, with all the IoT platforms and advances in AI (with many companies providing APIs and libraries), it has become quite easy to build one for yourself. And I did just that.

When one of our co-founders last visited India, he brought an Amazon Echo for the team in India. It is such a cool gadget. I was immediately fascinated by it and wanted to build one myself. So I started exploring and researching where and how to get started. I looked at various IoT devices like the Raspberry Pi, Arduino, Electric Imp (we use this one at my current company, in a vibration sensor we build), etc. I also looked up some examples to get started. Initially my idea was to just build a voice-controlled home automation device, but after looking at the Amazon Echo I decided I could add other functionality to it too. So I bought a Raspberry Pi, an Arduino, a microphone, speakers, a relay module, wires, etc. for the automation. Not being an electronics guy, I had to look at some examples for putting the electronics together to make the whole automation setup work.

Setting up the Raspberry Pi was easy, as it came with an SD card with NOOBS pre-installed. I followed this tutorial to set up the Pi. Next, we want to control the Arduino from the Raspberry Pi. There are various ways to do that, but I chose the simple one: connecting the Raspberry Pi and the Arduino with a USB serial cable and using Nanpy to set up serial communication between them. Once that was done, I connected the relay module to the Arduino as illustrated in the following figure.

(The relay has two modes of connection: Normally Open and Normally Closed. I used the Normally Open mode, in which the switch is open by default, i.e. not connected. In this setup we connect the power line to the common connection point, i.e. the middle terminal, and the neutral line to the Normally Open terminal.) I connected only a fan and a light in my room to it, but more devices can be connected.
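
To give an idea of how the Pi drives the relay through the Arduino, here is a minimal Nanpy sketch. The serial device path and the Arduino pin number are assumptions for illustration; they depend on how your relay module is wired and how the Arduino shows up on the Pi.

    # Minimal Nanpy sketch: toggle one relay channel from the Raspberry Pi.
    # Assumptions: the Arduino runs the Nanpy firmware and appears as
    # /dev/ttyACM0, and the relay's IN pin is wired to Arduino digital pin 7.
    from nanpy import ArduinoApi, SerialManager
    import time

    connection = SerialManager(device='/dev/ttyACM0')
    arduino = ArduinoApi(connection=connection)

    RELAY_PIN = 7  # digital pin driving the relay's IN line
    arduino.pinMode(RELAY_PIN, arduino.OUTPUT)

    # Many relay boards are active-low: LOW energises the coil and closes the switch.
    arduino.digitalWrite(RELAY_PIN, arduino.LOW)   # appliance on
    time.sleep(5)
    arduino.digitalWrite(RELAY_PIN, arduino.HIGH)  # appliance off

Each additional appliance just needs another relay channel driven by another Arduino pin.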

Once I had the hardware ready, it was time to break down the software side of the problem. The following diagram explains my system.

(System architecture diagram.)

As shown above, the home assistant is divided into different modules. I wanted the assistant to be completely hands-free, so I needed continuous speech recognition. However, using an online API for continuous speech is not practical, so I needed some sort of trigger to start the speech recognition. I used an offline library that listens for a keyword ("Alice" in my case). Whenever it recognizes the keyword, it executes a callback that triggers the rest of the process. I then start the speech recognition, which listens for a command and sends it to the API for speech-to-text conversion. Once the text is received, the intent parser module identifies the intent and the entities in the natural-language text, and the corresponding intent handler handles the request. The intent parser and handler modules are extensible, so we can add more features (like booking a cab, ordering food, etc.) by simply creating an intent parser and an intent handler for the feature.
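
To make the flow concrete, here is a rough sketch of the pipeline, assuming the Python SpeechRecognition library for the speech-to-text step. The hotword callback, the handler names, and the keyword-matching intent parser are simplified placeholders for illustration, not the exact code of the assistant.

    # Rough sketch of the pipeline: hotword callback -> record command ->
    # speech-to-text -> intent parsing -> intent handler.
    import speech_recognition as sr

    recognizer = sr.Recognizer()

    def listen_for_command():
        """Record one command and send it to an online speech-to-text API
        (Google Web Speech here, via the SpeechRecognition library)."""
        with sr.Microphone() as source:
            recognizer.adjust_for_ambient_noise(source)
            audio = recognizer.listen(source)
        return recognizer.recognize_google(audio)

    # Each feature contributes an intent (just a keyword here, for simplicity)
    # and a handler, so adding a feature means adding one more pair.
    def handle_light(command):
        turn_on = 'on' in command
        print('light ->', 'on' if turn_on else 'off')  # drive the relay here

    def handle_weather(command):
        print('fetching the weather...')               # call a weather API here

    INTENTS = {
        'light': handle_light,
        'weather': handle_weather,
    }

    def dispatch(command):
        """Tiny intent parser: pick the handler whose keyword appears in the text."""
        for keyword, handler in INTENTS.items():
            if keyword in command.lower():
                return handler(command)
        print("Sorry, I didn't understand:", command)

    def on_hotword_detected():
        """Callback fired by the offline keyword spotter when it hears 'Alice'."""
        dispatch(listen_for_command())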

There are, however, a few issues with this application. The speech recognition API doesn't work well if the microphone is at some distance from the speaker, and it also struggles when there is background noise (like when music is playing in the background). To overcome this to some extent, I added a feature to run the system in web-server mode. In this mode, the Raspberry Pi runs a web server and listens for text commands, which are then parsed by the intent parser and processed by the intent handler. The client sends a text command to the web server, so any device such as a computer or a smartphone (although I haven't built a mobile app) can record speech, convert it to text, and send the text command. For the demo, I created a client for the computer that does exactly that.
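
Here is a minimal sketch of the web-server mode, assuming Flask on the Pi and the dispatch() function from the sketch above; the /command route, the module name, and port 5000 are illustrative choices, not the actual ones.

    # Minimal sketch of web-server mode: the Pi accepts text commands over HTTP
    # and feeds them into the same intent parser/handler pipeline.
    from flask import Flask, request, jsonify

    # 'assistant_pipeline' is a hypothetical module holding the dispatch()
    # function from the pipeline sketch above.
    from assistant_pipeline import dispatch

    app = Flask(__name__)

    @app.route('/command', methods=['POST'])
    def command():
        text = request.get_json().get('text', '')
        dispatch(text)  # same intent parsing/handling as in the voice pipeline
        return jsonify({'status': 'ok', 'received': text})

    if __name__ == '__main__':
        # Listen on all interfaces so phones/laptops on the LAN can reach the Pi.
        app.run(host='0.0.0.0', port=5000)

A client on a laptop or phone then records speech locally, converts it to text, and POSTs it, for example: curl -X POST -H "Content-Type: application/json" -d '{"text": "turn on the light"}' http://<pi-address>:5000/command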

It was a fun little exercise and I got to learn a lot. Here is the link to the code for the assistant. Please feel free to check it out and use it as you like (and possibly help improve it).