Thursday 12 January 2017

Alice - Digital Assistant like Jarvis with Smart home automation

Demo:

If the video above has issues, here is the link to the demo video.

Being lazy, I always wished I had something like Jarvis (for those who don't know what I am talking about: the AI assistant from Iron Man) that would do whatever I say, and also because it would be so cool to have something like that. Honestly, with all the IoT platforms and advances in AI (with many companies providing APIs and libraries), it has become quite easy to build one for yourself. And I did just that. When one of our co-founders last visited India, he got an Amazon Echo for the team here. It is such a cool gadget; I was immediately fascinated by it and wanted to build one myself. So I started exploring and researching where and how to get started. I looked at various IoT devices like the Raspberry Pi, Arduino, and Electric Imp (we use the Imp at my current company in a vibration sensor we build), and I also looked up some examples to get started. Initially my idea was just to build a voice-controlled home automation device, but after looking at the Amazon Echo I decided I could add other functionality too. So I bought a Raspberry Pi, an Arduino, a microphone, speakers, a relay module, wires, and so on for the automation. Not being an electronics guy, I had to look at some examples for putting the electronics together to make the whole automation setup work.

Setting up the Raspberry Pi was easy, as it came with an SD card with NOOBS pre-installed. I followed this tutorial to set up the Pi. Next, I wanted to control the Arduino from the Raspberry Pi. There are various ways to do that, but I used the simplest one: connecting the Raspberry Pi and the Arduino with a serial USB cable and using Nanpy to set up serial communication between them. Once that was done, I connected the relay module to the Arduino as illustrated in the following figure.
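The post uses Nanpy, a Python library that drives the Arduino's pins directly from the Pi. Purely as an illustration in Ruby, one could instead have the Arduino run a tiny firmware that accepts one-line text commands over the serial link; the protocol below is invented for this sketch and is not what Nanpy does.

```ruby
# Sketch only: assumes a hypothetical Arduino firmware that accepts
# text commands like "SET <pin> <0|1>" over the serial connection.
def relay_command(pin, on)
  "SET #{pin} #{on ? 1 : 0}\n"
end

# With the serialport gem the command would then be written out like:
#   require 'serialport'
#   port = SerialPort.new('/dev/ttyACM0', 9600)  # assumed device path
#   port.write(relay_command(7, true))           # relay on pin 7 -> on
puts relay_command(7, true)
```

The point is just that the Pi sends a small, well-defined message per switch, and the Arduino toggles the corresponding relay pin.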

(The relay has two modes of connection: Normally Open and Normally Closed. I used Normally Open, in which the switch is open, i.e. not connected, by default. In this mode we connect the power line to the common connection point, i.e. the middle terminal, and the neutral line to the Normally Open terminal.) I connected only the fan and light in my room to it, but more devices can be connected.

Once I had the hardware ready, it was time to break down the software problem. The following diagram explains my system.
As shown above, the assistant is divided into different modules. I wanted it to be completely hands-free, so I needed continuous speech recognition. However, using an online API for continuous speech recognition is not practical. So I needed some sort of trigger to start the speech recognition, and used an offline library that listens for a keyword (Alice, in my case). Whenever it recognizes the keyword, it executes a callback that triggers the rest of the pipeline: I start speech recognition, which listens for a command and sends it to an API for speech-to-text conversion. Once the text is received, the intent parser module identifies the intent and the entities in the natural-language text, and the corresponding intent handler handles the request. The intent parser and handler modules are extensible, so more features (like booking a cab or ordering food) can be added by simply writing an intent parser and an intent handler for the feature.
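The extensible parser/handler split can be sketched in a few lines of Ruby; the class, pattern, and handler names here are invented for illustration and are not the project's actual code.

```ruby
# Minimal sketch of an extensible intent registry: each feature registers a
# regex (the parser) and a block (the handler); named captures are the entities.
class IntentRegistry
  def initialize
    @intents = []
  end

  # Register an intent as a pattern plus a handler block.
  def register(name, pattern, &handler)
    @intents << { name: name, pattern: pattern, handler: handler }
  end

  # Find the first matching intent and hand the extracted entities to it.
  def handle(text)
    @intents.each do |intent|
      if (m = intent[:pattern].match(text))
        return intent[:handler].call(m.named_captures)
      end
    end
    "Sorry, I didn't understand that."
  end
end

registry = IntentRegistry.new
registry.register(:switch, /turn (?<state>on|off) the (?<device>\w+)/i) do |e|
  "Switching #{e['state']} the #{e['device']}"
end

puts registry.handle("Alice, turn on the fan")   # Switching on the fan
```

Adding a new feature then means registering one more pattern/handler pair, without touching the rest of the pipeline.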

There are a few issues with this application, though. The speech recognition API doesn't work properly if the microphone is some distance from the source, or if there is background noise (like music playing in the background). To overcome this to some extent, I added a feature to run the system in web server mode. In this mode, the Raspberry Pi runs a web server and listens for text commands, which are then parsed by the intent parser and processed by the intent handler. The client sends a text command to the web server, so any device, like a computer or smartphone (although I haven't built a mobile app), can record speech, convert it to text, and send the text command. For the demo, I created a client for the computer that does exactly that.
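The client side of web server mode is just an HTTP request carrying the recognized text. As a sketch, here is how such a request could be built in Ruby; the host, port, route, and query parameter are assumptions for illustration, not the project's actual endpoints.

```ruby
# Sketch: build the URL a client would hit to send a text command to the Pi.
# The /command route and q parameter are invented for this example.
require 'uri'

def command_uri(host, text)
  URI::HTTP.build(host: host, port: 8080, path: '/command',
                  query: URI.encode_www_form(q: text))
end

# Sending it would then be a one-liner:
#   require 'net/http'
#   Net::HTTP.get(command_uri('raspberrypi.local', 'turn on the light'))
puts command_uri('raspberrypi.local', 'turn on the light')
```

Any device that can do speech-to-text locally and issue a GET request can act as the client.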

It was a fun little exercise and I got to learn a lot. Here is the link to the code for the assistant. Please feel free to check it out and use it as you like (and possibly help improve it).

Saturday 4 June 2016

Interesting undocumented Facebook API to identify friends in any personal photos

So last week I was casually browsing my Facebook feed and looking at some random photo posted by a friend. Hovering over the picture, Facebook displayed a rectangle over my friend's face and showed "Do you want to tag X?". This caught my attention: so Facebook does image recognition to not only detect faces but also identify whose they are. I wondered how FB does it, so I opened Chrome DevTools to dig into how the whole thing works.



So it seems that FB renders a bunch of div blocks that define the face boxes, with the identity suggestions for each face box in a data-recognizeduids attribute. It also had another block of HTML with the names of the identified users.

OK, so now the question is where this data comes from in the first place. It seems it is loaded when one clicks on a photo, so it must be fetched later via some AJAX call to the server. To find out which call, I switched to the Network tab, which shows all the traffic between the browser and FB's servers. It was all about finding the needle in a haystack, as there were hundreds of requests. So I started looking for the ones with relevant names, and struck gold when I found a request to "/ajax/pagelet/generic.php/PhotoViewerInitPagelet".

Then I searched the response for relevant content and found the above block of HTML in it. The params to the route looked encoded; after decoding them, it seemed the request sends a bunch of params along with the id of the photo. The request looks something like this:

https://www.facebook.com/ajax/pagelet/generic.php/PhotoViewerInitPagelet?dpr=2&ajaxpipe=1&ajaxpipe_token=<changed>&no_script_path=1&data={"fbid":"<changed>","set":"<changed>","type":"3","theater":null,"firstLoad":true,"ssid":<changed>,"av":"<changed>"}&__user=1&__a=1&__dyn=<changed>&__req=jsonp_2&__be=0&__pc=PHASED:DEFAULT&__rev=11111&__adt=2

Most of the params don't make much sense except for fbid, which looks like the id of the photo. So I played around with removing unnecessary params to find the ones the API actually needs, and found that it needed only the fbid and 2-3 other constant params. Chrome has a pretty neat feature that lets you copy a request as a cURL command with all its headers and params, so you can run it from the command line as-is. Using that, I replicated the request and got the same response, with the HTML data containing the information about the users. So I wondered if this API could be used to find friends in any photo, even one not uploaded to FB.
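The stripped-down request can be sketched in Ruby like this. Which extra params beyond fbid were actually required was an observation from testing at the time and may have changed; the ones kept here are purely illustrative.

```ruby
# Sketch: rebuild the pagelet request with only a minimal param set.
# The '__a' and 'ajaxpipe' params are illustrative assumptions.
require 'uri'
require 'json'

def photo_pagelet_uri(fbid)
  URI::HTTPS.build(
    host: 'www.facebook.com',
    path: '/ajax/pagelet/generic.php/PhotoViewerInitPagelet',
    query: URI.encode_www_form('__a' => 1, 'ajaxpipe' => 1,
                               'data' => JSON.generate(fbid: fbid))
  )
end

# The real call also needs the logged-in session cookies copied from the
# browser, which is exactly what Chrome's "copy as cURL" gives you.
puts photo_pagelet_uri('1234567890')
```

Replaying the request is then a matter of attaching those session headers and issuing the GET.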

But for the API to work, the photo needs to be on FB. So I thought: what if we temporarily upload a photo to FB without publishing it, and then use the API? To upload photos to FB I used the FB Graph API. I needed these photos to be uploaded temporarily and not published. The Graph photo upload API allows uploading images with the options "temporary" => true and "published" => false, so the images are uploaded temporarily and don't show up in the feed. I created an FB app, Photo Friend Finder, which uses the publish_actions and user_photos permissions to get access to the photo API. Once the photo is uploaded, the API returns the id of the photo, which is then passed to the PhotoViewerInitPagelet API above, which returns HTML containing the face suggestion details. All that was left was to parse the response to extract the data, which is easily done with an HTML parser; I used Nokogiri in Ruby for that.
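The parsing step can be sketched without Nokogiri too. The snippet below runs a plain regex over a made-up fragment shaped like the face-box divs (the real markup is more involved; the sample HTML here is invented for illustration).

```ruby
# Sketch of extracting the suggested friend ids from the face-box markup.
# The sample below is an invented stand-in for FB's actual HTML.
sample = '<div class="faceBox" data-recognizeduids="[100001,100002]"></div>' \
         '<div class="faceBox" data-recognizeduids="[100003]"></div>'

def recognized_uids(html)
  html.scan(/data-recognizeduids="\[([^\]]*)\]"/)
      .flat_map { |m| m[0].split(',').map(&:strip) }
end

puts recognized_uids(sample).inspect   # ["100001", "100002", "100003"]
```

With Nokogiri the same thing would be a CSS selector over the div blocks followed by reading the data attribute.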

I put together a small Ruby script that does all of this. You just run the script with the path to an image as an input param, and the script returns the number of people in the photo and identifies the friends. One limitation I found is that the API only suggests people in the photo who are on your FB friend list. I used the Koala gem to access the FB Graph API, and the JSON and Nokogiri libraries to parse the API responses.

Here is the link to the script.

Script in action.

It was a lot of fun exploring and putting together the script.

Sunday 29 March 2015

Series Torrent Downloader to Dropbox

A blog post after a looong time. I like to watch movies and series a lot. I started following so many series that it became difficult to keep track of when each one airs and which episode I am on. It was almost like: Monday one, Tuesday another, Wednesday a third one. And after that, search for it and download it from somewhere ( ;) torrents). So for a long time I had this idea that there should be an application that automatically takes care of everything: keeping track of all my series, downloading the next aired episode from torrents to my Dropbox, and notifying me after that. But I didn't find time to build one. Finally, though, I was able to put something together.

It all started when I came across the Popcorn app, which streams movies and series from torrents. I got curious about how it works, and came across the torrent-stream library that the app uses. It basically takes a torrent and generates a stream out of it, which you can write to a file or play in a video player. So I thought: what if I could upload the stream to some cloud storage? Dropbox provides a chunked upload feature that takes chunks of a file, so you can break a very large file into chunks and upload them, or resume a partial upload. So I took the chunks from the stream provided by the library and uploaded them to Dropbox. It seems Google Drive provides an API for this too, but for now I support only Dropbox; as I am open sourcing it, someone might be interested in extending it to other cloud storage services. So now I had the portion that uploads a torrent to Dropbox. This library is in Node.js, but I am not so proficient in JS, so for the other part I used Ruby. I wrote a service that monitors all the series you specify in the configs and downloads them to Dropbox using the Node.js script. It uses the TV maze API to get the next episode of each series it needs to download. It essentially runs like a cron: every 2 hours it gets the next episode of a series from the API, checks whether that episode is out for download, records it in a DB store if not (I used Redis as my DB store), and sleeps. The next time it runs, it checks whether the next episode in the DB store is available; if it is, it downloads the file to Dropbox and updates the next episode to download in the DB store.
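The episode bookkeeping at the heart of that loop can be sketched as pure logic, with an in-memory hash standing in for the Redis store; the method names and date format here are invented for the sketch.

```ruby
# Sketch of the series-tracking logic: remember the next episode's airdate,
# and on each run decide whether it is out yet. A hash stands in for Redis.
require 'date'

store = {}  # series name => airdate (ISO string) of the next episode

def remember_next(store, series, airdate)
  store[series] = airdate
end

# Is the recorded next episode of this series out yet?
def due_for_download?(store, series, today = Date.today)
  airdate = store[series]
  airdate && Date.parse(airdate) <= today
end

remember_next(store, 'Sherlock', '2017-01-01')
puts due_for_download?(store, 'Sherlock', Date.new(2017, 1, 12))  # true
```

In the real service, a "true" here kicks off the Node.js download script and the stored airdate is advanced to the following episode from the TV maze API.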

Once both scripts were done, I thought of deploying them to my (free) Heroku instance. The challenge with that was how to trigger the task at specific times of day. Heroku does provide worker dynos, but I wasn't sure whether it offers a free add-on for scheduling scripts. So I came up with this: I run a web server (I used Sinatra) with a continuous background thread that runs the task, sleeps for x hours, and then runs again, essentially behaving like a scheduled task. With this technique I could also add a route that simply dumps the task's logs, to monitor whether everything went well. There was one more small issue with Heroku: it shuts down your app after some period of inactivity (I think 15 minutes). I cannot afford the task shutting down, as I have no way to tell whether it was killed in the middle of a download. So to keep it up all the time, I used a service (Pingdom) that pings the app every 5 minutes. Pingdom is meant for tracking the uptime of a service; ironically, I used it to keep my app up all the time.
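The poor-man's scheduler pattern described above, a background thread that runs the task, sleeps, and repeats, can be sketched in plain Ruby (the `runs` limit is added only so the sketch terminates; the real loop runs forever while Sinatra serves requests).

```ruby
# Sketch of the scheduler: run the given task, sleep for the interval, repeat.
# 'runs' bounds the loop for demonstration; in the app it would be infinite.
def run_periodically(interval, runs = Float::INFINITY)
  Thread.new do
    count = 0
    while count < runs
      yield            # the actual download-check task goes here
      count += 1
      sleep interval
    end
  end
end

ticks = []
t = run_periodically(0.01, 3) { ticks << Time.now }
t.join               # in the real app the web server keeps the process alive
puts ticks.length    # 3
```

Because the thread lives inside the web process, a ping every few minutes is enough to keep both the server and the scheduler alive on Heroku's free tier.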

I have open sourced it on GitHub. You can check it out, help me improve it, and use it at your own risk. It can be deployed in both modes: as a web server on Heroku, or as a cron on some other cloud instance. To use it you need to create an app on your Dropbox, authorize it, and get an access token, then add the access token to dnode.js (the Node.js script that actually downloads the torrent to Dropbox). If you want to be notified, you can also create a Mailgun account, create a key, and add it to constants.yml; it is used to send emails to the address you specify. Otherwise you can comment out the mail-sending line. You also need a Redis node in the cloud, which is used as the DB store; I used the Redis Labs service, which provides a 25 MB node for free. And to add the series you want to monitor, add them to the list in constants.yml.
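For a rough idea of what the configuration might look like, here is a sketch of reading a series list from a YAML file; the key names shown are assumptions for illustration, not necessarily the project's exact constants.yml schema.

```ruby
# Sketch: load the monitored-series list from a YAML config.
# The keys below are illustrative, not the project's exact schema.
require 'yaml'

config_text = <<~YAML
  mailgun_key: "key-xxxx"
  series:
    - "Game of Thrones"
    - "Sherlock"
YAML

config = YAML.safe_load(config_text)
puts config['series'].inspect   # ["Game of Thrones", "Sherlock"]
```

The service would iterate over `config['series']` on each run and check each entry against the TV maze API.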




Sunday 5 January 2014

Gmail Notifier for Mac OSX

Recently I came across a gem called terminal-notifier, which lets you send Mac OS X user notifications; it also comes with a command-line tool. I was thinking of an interesting use case for it when an idea struck: how about an application that notifies me when I receive a mail containing specific text, or from a specific sender?
Now I'll give a little background on how and why this app came about. In our office, on any special occasion, people get chocolates or sweets, put them in the cafeteria, and then mail a common group to let people know. As it happens, I don't have the habit of checking my mail very frequently, and I often miss important mails at the right time. So how about a script that checks for mails containing certain words or from a specific sender?
The task was simple: check for new emails and, if they match the given criteria (in my case, if the mail subject or body contains the words "sweets" or "chocolates"), send a user notification. To get the emails I used the gmail gem. This gem needs your username and password, but storing them as plaintext in the script is not smart from a security point of view. So I looked for a simple and reliable way to store and retrieve my username and password, and used the OS X Keychain to store the sensitive information, retrieving it with the command-line tool "security". First, to store the username and password in the OS X Keychain, use:

security add-internet-password -a username -w password -s http://mail.google.com

And then to retrieve it use,

security 2>&1 find-internet-password -ga username | grep password | cut -d '"' -f2
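The grep/cut pipeline above can also be done in Ruby: find the password line in the `security` output and take the field between the first pair of quotes. The sample output below is illustrative, not a verbatim capture.

```ruby
# Sketch: the Ruby equivalent of `grep password | cut -d '"' -f2`.
def extract_password(security_output)
  line = security_output.lines.find { |l| l.include?('password') }
  line && line.split('"')[1]
end

# Illustrative sample shaped like the tool's output:
sample = <<~OUT
  keychain: "/Users/me/Library/Keychains/login.keychain"
  password: "s3cret"
OUT
puts extract_password(sample)   # s3cret
```

In the script this would wrap a backtick call to the `security` command so the plaintext never lives in the source file.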

With the username and password in hand, I logged into the Gmail account with:

gmail = Gmail.new(username, password)

After logging in, I search the unread mails for the query terms in both the subject and the body. Once I find a mail containing a search term, I fire a notification. As the gem didn't support notification sounds, I used a system sound file and played it with the afplay command:

afplay -v 4 -q 1 /System/Library/Sounds/Glass.aiff

and then fire the user notification by calling:

TerminalNotifier.notify("Subject : #{content_to_display}", :title => 'Title')

and then mark the mail as read so that it doesn't come up the next time the cron runs. Finally, to continuously scan my inbox, I set up the script as a cron job that runs every minute:
*/1 * * * * ruby gmail_notifier.rb
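The matching rule at the heart of the script can be separated out as a small sketch: notify when the subject or body contains any of the watch words (the helper name and word list are mine, for illustration).

```ruby
# Sketch of the core matching rule: does the mail mention any watch word?
WATCH_WORDS = %w[sweets chocolates]

def interesting_mail?(subject, body)
  haystack = "#{subject} #{body}".downcase
  WATCH_WORDS.any? { |w| haystack.include?(w) }
end

puts interesting_mail?('Sweets in the cafeteria!', 'Help yourselves')  # true
puts interesting_mail?('Weekly report', 'Numbers attached')            # false
```

In the full script, a true result triggers the afplay sound and the TerminalNotifier call shown above.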

You can find the complete script on GitHub here.

Wednesday 26 December 2012

A simple No-SQL key-value db using self-modifying ruby script: An interesting application of Ruby's Reflection API

I am primarily a Java developer. I recently started learning Ruby and was amazed at how easy it is to develop in and how elegant the language is. The easiest way to learn any programming language is to build simple applications that exercise its various features. While going through those features I came across reflection and metaprogramming in Ruby, a very powerful capability: it's amazing how the eval function lets you write and execute Ruby code dynamically. While thinking of an application for this feature, I came up with the following.
It is a simple NoSQL key-value DB that stores data in a Ruby hash. The first line of the script declares the hash. The script then reads its own source code and uses eval to execute that first line, reconstructing the hash. Depending on the function called (GET/SET), it either retrieves the value associated with a key or sets a new key/value pair in the hash. Finally, it writes the updated hash back into its own source code.
Here is the source code for the ruby script:
hash={"1"=>"Narendra", "2"=>"Mangesh", "3"=>"Viru", "4"=>"Virendra", "5"=>"Genh"}
z=[]
err_msg="ruby #{__FILE__} <GET/SET> <key_to_search/key_to_set> <not_required/value_to_set>"
# Error out unless we got at least a command and a key, and the command is GET or SET
if ARGV.length<2 || (ARGV[0]!="GET" && ARGV[0]!="SET")
  puts err_msg
  exit
end
if ARGV[0]=="SET" && ARGV.length!=3
  puts err_msg
  exit
end
# Read this script's own source, then eval the first line to rebuild the hash
File.open(__FILE__,'r'){|f| f.each_line {|l| z<<l}}
c=eval(z[0])
if ARGV[0]=="GET"
  puts c[ARGV[1]] if c.include?(ARGV[1])
elsif ARGV[0]=="SET"
  c[ARGV[1]]=ARGV[2]
  # Replace the first line with the updated hash and write the script back out
  z[0]="hash="+c.inspect+"\n"
  File.open(__FILE__,'w'){|f| z.each{|x| f<<x}}
end

Thursday 13 December 2012

Hacking the Little Alchemy game with only Chrome in less than an hour

Today a friend introduced me to a cute little but addictive game named Little Alchemy.
After playing around with it for a few minutes and seeing how long it took to find new elements, I wondered how long it would take to find them all. Not having the patience to play the whole game to find out, I thought: why not do it hacker style? So I popped open the Chrome Web Inspector and looked around in the Network tab to see what data is sent by the Little Alchemy web page. I noticed it stores all its data in the application cache and retrieves it from there. Looking through the JS files for something interesting, I struck gold when I found the logic in the alchemy.js file. It makes AJAX calls to two files, /base/names.json and /base/base.json. Looking into these files, I found that the combinations of all the elements are stored as arrays in base.json, and the names of the elements are stored in names.json. After finding this, it was a piece of cake to hack together some JavaScript to display the combinations for all elements. So I opened the JavaScript console, put together the following code to print all the combinations, and dumped the output to an HTML file. You can check the output here.

And here is the JavaScript code:

var base, names, i, j;
// Fetch both JSON files synchronously so the data is ready before the loop runs
// (the original async calls only worked because they were pasted into the console)
$.ajax({
  type: "GET",
  url: "http://littlealchemy.com/base/base.json",
  async: false
}).done(function(data) {
  base = data;
});
$.ajax({
  type: "GET",
  url: "http://littlealchemy.com/base/names.json",
  async: false
}).done(function(data) {
  names = data;
});
// Skip the first four entries (the starting elements) and print each combination
for (i = 4; i < base.base.length; i++) {
  for (j = 0; j < base.base[i].length; j++) {
    // No name entry for the first element of the pair: nothing printable
    if (names.lang[Number(base.base[i][j][0])] == undefined) {
      console.log(names.lang[i] + " doesn't have any combination");
      continue;
    }
    console.log(names.lang[Number(base.base[i][j][0])] + " + " + names.lang[Number(base.base[i][j][1])] + " => " + names.lang[i]);
  }
}


Thursday 16 August 2012

So damn true: "Necessity the mother of building cool stuff"

A few days back, a hardware issue with my hard drive left it irreparable, and all my data was lost, including software, movies, and music. (Thank god I use GitHub to host all my projects.)

Now, the loss of movies and software is no big deal, but rebuilding one's song collection is difficult, as everyone has a particular taste in music. Luckily for me, all that music was stored on my iPod. But as we all know, Apple doesn't let you copy music files from an iPod to a computer. So I looked for a way to get my music from the iPod to the PC. I opened the iPod in USB storage mode and looked at where the files are stored. I found all the files, but they had random (unrecognizable) names. For starters, I copied all those files to my PC. Then I looked for a way to give the files understandable names; renaming around 1000 songs by listening to each one is not the way a geek would do it. So I wrote a Java app that scanned through all the MP3 files, read the ID3 tags to get the song titles, and renamed all the songs. It scanned all the files and did the task in a few minutes. Then I thought: why not modify it a little so it scans your entire computer for MP3 files, renames them using the song title, and stores them in a single directory, categorized into subdirectories by album name or artist name? This way, duplicates of a single MP3 are also removed.
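The Java utility reads ID3 tags; as a sketch of the same idea in Ruby, an ID3v1 tag is simply the last 128 bytes of the file: the marker "TAG" followed by a 30-byte title, 30-byte artist, and 30-byte album, padded with NULs or spaces (the demo file below is fabricated to show it working).

```ruby
# Sketch: read an ID3v1 tag from the last 128 bytes of an MP3 file.
def id3v1_tag(path)
  tail = File.binread(path)[-128, 128] or return nil
  return nil unless tail.start_with?('TAG')
  {
    title:  tail[3, 30].delete("\x00").strip,
    artist: tail[33, 30].delete("\x00").strip,
    album:  tail[63, 30].delete("\x00").strip
  }
end

# Fabricate a tiny file with an ID3v1 trailer to demonstrate:
File.binwrite('demo.mp3', "\x00" * 10 + 'TAG' +
  'My Song'.ljust(30) + 'Some Artist'.ljust(30) + 'Some Album'.ljust(30) +
  "\x00" * 35)
puts id3v1_tag('demo.mp3')[:title]   # My Song
```

Renaming and categorizing is then just building a path from the artist or album field and moving the file there.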

So Mp3Manager is a console utility in Java that takes as input the name of the root directory in which to store all your music files, and the type of categorization (album name or artist) to use for the subdirectories. It is available on GitHub.

Read the README.md to learn how to use it. Let me know if you like it, or if you would like any modifications.