Wednesday 26 December 2012

A simple No-SQL key-value db using self-modifying ruby script: An interesting application of Ruby's Reflection API

I am basically a java developer and I recently started learning ruby and was amazed at the ease in which you can develop in it and elegance of the language. The easiest way to learn any programming language is to develop simple applications in it that use various features of language. So when going through the various features I came across Reflection and Metaprogramming in Ruby. A very powerful feature in ruby. It amazing how the eval function allows one to write and execute ruby code dynamically. So while thinking of an application using this feature I came up with the following application.
It is a simple No-SQL key-value db that stores data in hash in ruby. The first line declares a hash. Then the following part of script reads its own code and uses eval function to execute it declare the hash. Then depending upon the function called GET/SET it either retrieves the value associated with the key or sets a new key/value in hash. Then it simply stores the new hash in the source code.
Here is the source code for the ruby script:
hash={"1"=>"Narendra", "2"=>"Mangesh", "3"=>"Viru", "4"=>"Virendra", "5"=>"Genh"}
z=[]
err_msg="ruby #{__FILE__} <GET/SET> <key_to_search/key_to_set> <not_required/value_to_set>"
if ARGV.length<2 && (ARGV[0]!="GET" || ARGV[0]!="SET")
  puts err_msg
  exit
end
if ARGV[0]=="SET" && ARGV.length!=3
  puts err_msg
  exit
end
File.open(__FILE__,'r'){|f| f.each_line {|l| z<<l}}
c=eval(z[0])
if ARGV[0]=="GET"
  puts c[ARGV[1]] if c.include?(ARGV[1])
elsif ARGV[0]=="SET"
  c[ARGV[1]]=ARGV[2]
  z[0]="hash="+c.inspect+"\n"
  File.open(__FILE__,'w'){|f| z.each{|x| f<<x}}
end

Thursday 13 December 2012

Hacking the Little Alchemy game with only Chrome in less than an hour

Today my friend introduced me to this cute little but addictive game named Little Alchemy.
So playing around with it for few minutes and looking at the time it took to find new elements I wondered how much time it would take to find all. Not having the patience to play whole game to determine all elements I thought why not do it the hacker style. So popped open the Chrome Web Inspector and looked around in the network tab to see what all data is sent by littlealchemy web page. I noticed it stores all data in application cache and retrieves from there. Looking around the js files for something interesting I struck gold when I found the logic in alchemy.js file. It seemed to make ajax call to 2 files /base/names.json and /base/base.json. Then I looked into these files and found that the mapping of all the elements was stored in base.json in array forms and all the names of elements were stored in names.json. After finding this it was a piece of cake to hack together a javascript code to display the combinations for all elements. So then I opened the javascript console and put together this piece of code to print the combinations for all elements and dumped it to a html file. You can check the output here.

And here is the piece of javascript code

var base,names,i,j;
$.ajax({
          type: "GET",
          url: "http://littlealchemy.com/base/base.json",
          }).done(function( data ) {
          base=data;
});
$.ajax({
          type: "GET",
          url: "http://littlealchemy.com/base/names.json",
          }).done(function( data ) {
          names=data;
});
for(i=4;i<base.base.length;i++){
   for(j=0;j<base.base[i].length;j++){
      if(names.lang[Number(base.base[i][j][0])]==undefined){
         console.log(names.lang[i]+ " doesn't have any combination");
         continue;
      }
      console.log(names.lang[Number(base.base[i][j][0])]+" + "+names.lang[Number(base.base[i][j][1])]+" => "+names.lang[i]);
   }
}


Thursday 16 August 2012

So damn true: "Necessity the mother of building cool stuff"

Few days back, some hardware issue with my hard drive left it irrepairable and all my data was lost including softwares, movies, music. (Thank god I use github to host all my projects).

Now loss of movies and softwares is no big deal. But finding one's song collection is difficult as one has a particular taste for music. But good for me all that music is stored in my IPod. But as we all know Apple doesn't allow one to copy music files from IPod to computer. So what I did was look around some way to get my music from IPod to PC. I opened the IPod in USB storage mode and looked at where the files are stored. I found all the files but they were stored with random (unrecognizable) names in it. So for starters I copied all those files to my PC. Then I looked for a way to get the files some understandable names. But naming around 1000 songs listening to each one is not the way a geek would do it. So then I wrote this java app that scanned through all mp3 files and read the ID3 tags to get the song title and renamed all songs. Cool so it scanned all files and did the task in few minutes. Then I thought why not modify it a little and let it scan through your entire computer and look for mp3 files, rename them using song title and store them in a single directory where the songs are categorized in subdirectories using the album  name or artist name. In this way all the duplicates of a single mp3 would also be removed.

So this Mp3Manager is a console utility in java that takes as input the name of root directory to store all your music files and type of categorization to use to store the music files in directories(album name or artist). It is available on github

Read the readme.md to know how to use it. Let me know if you like it or if you would like any modifications to it.

Sunday 22 July 2012

Fuzzy matching autocomplete library with inbuilt standalone http server in java

Past couple of days I had this idea in mind of implementing a autocomplete that uses fuzzy matching. For fuzzy matching of a partial string with all the strings in dictionary it uses the Levenshtein distance. Now finding the Levenshtein distance of given string with each string in dictionary is very inefficient. So I used the idea described here. Converted it to java and modified it to fit the needs.

It is available in 2 modes. First is the Http server mode in which you run the standalone server and call it directly from your ajax script.It returns data in json format with decreasing scores. For this you need to configure it by providing the file name containing the words for suggestion. To use it run the server and then use this url to get the results   http://server-ip/autoc?word=word_for_autocomplete&tf=max_typos_allowed
Specify the word_for_autocomplete and the max_typos_allowed parameter.

The second is you can embed it directly in your application by first creating trie and then calling the getResult() method in CHandler class. See the source code for better understanding. A sample main method is given which uses the second method.

You can find the source code for the project here.

If you like the idea and want to improve it or port it to other languages fork me github.

Sunday 8 July 2012

Find a word in 2D matrix

I read about this problem in TC forums.

How to efficiently search for a word in a 2D matrix of letters , given the conditions that , we can start at any position and for every position we can go to only the adjacent positions (vertically,horizontally and diagonally ,i.e in all 8 directions).
Example- searching topcoder in this matrix :
t o p t
g f c q
t y o d
d f r e



My solution to this problem is a simple recursive DFS. Along with yes or no it highlights the word itself in the matrix by displaying the word and '-' at all other locations.  My implementation is as shown below.
You can also find it here


import java.util.*;
import java.io.*;
public class Main{
        public static int flag=0,m,n,dx[]={0,0,1,-1,1,-1,-1,1};
        public static int dy[]={1,-1,0,0,1,-1,1,-1};
        public static char wrd[];
        static class S{char a;int p,x,y;
                S(){}
                S(char b,int pos,int c,int d){a=b;p=pos;x=c;y=d;}
        }
        public static LinkedList<S>vis=new LinkedList<S>();
        public static LinkedList<S>init=new LinkedList<S>();
  public static Stack<S>fc=new Stack<S>();
        public static char mat[][];
        public static void main(String args[]) throws Exception{
                BufferedReader br=new BufferedReader(new InputStreamReader(System.in));
                m=Integer.parseInt(br.readLine());
                n=Integer.parseInt(br.readLine());
                int i,j;
    long a=System.currentTimeMillis();
                mat=new char[m][n];
    wrd=br.readLine().toCharArray();
                for(i=0;i<m;i++) mat[i]=br.readLine().toCharArray();
                for(i=0;i<m;i++) for(j=0;j<n;j++) if(mat[i][j]==wrd[0]){init.add(new S(wrd[0],0,i,j));}
                for(i=0;i<init.size();i++) {solve(init.get(i));if(flag==1) break;}
    if(flag==0) {System.out.println("NO");return;}
    for(i=0;i<n;i++) for(j=0;j<n;j++) mat[i][j]='-';
    S tp=null;
    for(;!fc.isEmpty();) {tp=fc.pop();mat[tp.x][tp.y]=tp.a;}
    for(i=0;i<m;i++) {
     for(j=0;j<n;j++) System.out.print(mat[i][j]);
     System.out.println();
    }
    System.out.println((System.currentTimeMillis()-a)+"ms");
        }
        public static void solve(S s){
   int i,j;
   if(s.p==wrd.length-1){System.out.println("YES");fc.push(s);flag=1;return;}
            for(i=0;i<8;i++) 
    if((s.x+dx[i])>0&&(s.y+dy[i])>0&&(s.x+dx[i])<m&&(s.y+dy[i])<n&&mat[s.x+dx[i]][s.y+dy[i]]==wrd[s.p+1]) 
     solve(new S(wrd[s.p+1],(s.p+1),(s.x+dx[i]),(s.y+dy[i])));
            if(s.p>=0&&s.p<wrd.length&&flag==1){fc.push(s);}
        }
}
21
21
topcoder
ataabmtdvegybznbzqwtb
odiurtrdutatureiortzn
godytqoqdxjwvxbnxwaxm
jhhxicpsqzfaxcctcefck
lcjsoideqlcwevkovrfyj
padervwrtjjthnjpbtuth
qacjgkaektdewmhsnyyfg
zbxbigjdgrythhgdmudef
crufjakjerwwsdffliywd
vqzbnkmfjtgrwkdgkodas
naohuyhfgofgfjuhjrvaa
mjgfdskuifyiumyjgleaw
lrepwoiepodsobtbdjdcf
hfueydfughyuicavsfosh
iovhjuhodjitgzzcaxctg
hjgsiugfyiusfgioqzpeb
jbhsfgudfghdhfghgsoqv
fgyhdutsfyirtyiytrtac
gjihysoduftfuihfytadx
gyrwydfiuzadytiudfufs
jgiosjtisodfstotisryt
YES
---------------------
---------------------
---------------------
---------------------
---------------------
---------------------
---------------------
---------------------
---------------------
---------------------
-----------------r---
------------------e--
------------------d--
------------------o--
------------------c--
------------------p--
------------------o--
------------------t--
---------------------
---------------------
---------------------
125ms

Trie datastructure

Trie is a very useful data structure. It finds many uses in string based applications like dictionary, autocomplete, spell check etc.
This blog post is not a tutorial on trie but I am simply sharing my trie implementation in java here.
You could find many other implementations on the internet but I really struggled understanding the other implementations especially the function of getting complete words matching a prefix. So I decided to write my own. Here is my implementation.
You can also find it here

class Node{
	public char d;public boolean e;
	public LinkedList<Node>c;
	public Node(){}
	public Node(char a){d=a;e=Boolean.FALSE;c=new LinkedList<Node>();}
	public Node sub(char a){
		int i;
		if(c.isEmpty()) return null;
		for(i=0;i<c.size();i++)
			if(c.get(i).d==a) return c.get(i);
		return null;
	}
}
class Trie{
	Node root;
	public Trie(){
		root=new Node(' ');
	}
	public void add(String a){
		Node n,cur=root;int len=a.length(),i,j;
		if(len==0) cur.e=Boolean.TRUE;
		else{
			for(i=0;i<len;i++){
				if(cur.sub(a.charAt(i))==null) break;
				cur=cur.sub(a.charAt(i));
			}
			while(i<len){
				n=new Node(a.charAt(i));
				cur.c.add(n);
				cur=cur.sub(a.charAt(i));
				i++;
			}
			cur.e=Boolean.TRUE;
		}
	}
	public boolean search(String a){
		Node cur=root;int i,j,len=a.length();
		for(i=0;i<len;i++) {
			if(cur.sub(a.charAt(i))==null) return Boolean.FALSE;
			cur=cur.sub(a.charAt(i));
		}
		if(i==len) return Boolean.TRUE;
		return Boolean.FALSE;
	}
	public boolean searchw(String a){
		Node cur=root;int i,j,len=a.length();
		for(i=0;i<len;i++) {
			if(cur.sub(a.charAt(i))==null) return Boolean.FALSE;
			cur=cur.sub(a.charAt(i));
		}
		return cur.e;
	}
	public String next(String pre){
		Node cur=root;int i,len=pre.length();
		for(i=0;i<len;i++) {
			if(cur.sub(pre.charAt(i))==null) return "";
			cur=cur.sub(pre.charAt(i));
		}
		String res="";
		if(i==len) {
			for(Node a:cur.c)
				res=res+String.valueOf(a.d);
			return res;
		}
		return "";
	}
	public LinkedList<Node> getnode(String pre){
		Node cur=root;int i,j,len=pre.length();
		for(i=0;i<len;i++) {
			if(cur.sub(pre.charAt(i))==null) return null;
			cur=cur.sub(pre.charAt(i));
		}
		if(i==len) return cur.c;
		return null;
	}
	public Vector<String>all=new Vector<String>();
	String tm;
	public Vector<String> getStrings(String pre){
		rec(pre);
		return all;
		//for(String f:all) System.out.println(f);
		//System.out.println(all.size());
	}
	public void rec(String pr){
		LinkedList<Node>nod=getnode(pr);
		for(Node n:nod){
			tm=pr+String.valueOf(n.d);
			if(n.e) {all.add(tm);}
			else {
				rec(tm);
			}
		}
	}
}

Levenshtein Distance:(How big the social network for a word is?)


Last week I came across this problem on codeeval.com 

Levenshtein Distance 

Description:

Two words are friends if they have a Levenshtein distance of 1 (For details see http://en.wikipedia.org/wiki/Levenshtein_distance). That is, you can add, remove, or substitute exactly one letter in word X to create word Y. A word’s social network consists of all of its friends, plus all of their friends, and all of their friends’ friends, and so on. Write a program to tell us how big the social network for the word 'hello' is, using this word list https://raw.github.com/codeeval/Levenshtein-Distance-Challenge/master/input_levenshtein_distance.txt

Input sample:

Your program should accept as its first argument a path to a filename.The input file contains the word list. This list is also available at https://raw.github.com/codeeval/Levenshtein-Distance-Challenge/master/input_levenshtein_distance.txt.

Output sample:

Print out how big the social network for the word 'hello' is. e.g. The social network for the word 'abcde' is 4846.
Now there are many ways of solving this as you might find on the internet. Most of them depend on finding the Levenshtein distance for calculating the neighbors of a word. But the problem I found that was it takes O(n2) time to calculate Levenshtein  distance(edit distance using dp) between 2 words. After this it all gets down to doing a DFS starting with 'hello'. This approach takes a lot of time to run. 
My approach for overcoming the problem with calculating the neighbors is as follows. Now we know that in this we need only the words that are at a distance of 1. Now these words are the once that can be obtained either by deleting a character, or replacing a character. So I simply simulated it using for loops and generating all possible words formed by deleting or adding or changing a letter and look it up in dictionary. For efficient storing of dictionary and lookup I used 2 dimensions. ie. An array of HashSet where HashSet at index i stores only words of length i.
You can find the full program here
Here is the logic for getting all words at distance 1:

public static void addwords(String wrd){
	int i,len=wrd.length();
	char z;
	StringBuffer sb;
	String b="";
	for(i=0;i<len;i++){
		sb=new StringBuffer(wrd);
		sb=sb.deleteCharAt(i);b=sb.toString();
		if(h[len-1].contains(b)){
			st.add(b);h[len-1].remove(b);
		}
	}
	for(i=0;i<=len;i++){
		for(z='a';z<='z';z++){
			sb=new StringBuffer(wrd);
			sb=sb.insert(i,z);
			b=sb.toString();
			if(h[len+1].contains(b)){
				st.add(b);h[len+1].remove(b);
			}
		}
	}
	for(i=0;i<len;i++){
		for(z='a';z<='z';z++){
			sb=new StringBuffer(wrd);
			sb.setCharAt(i,z);
			b=sb.toString();
			if(h[len].contains(b)){
				st.add(b);h[len].remove(b);
			}
		}
	}
}

Tuesday 5 June 2012

AddAGram: A Puzzle challenge from ITA Software


Last week I came across this page where ITA Software posted a few puzzle challenges it used in its hiring process. I found the one named AddAGram very interesting and then decided to solve it.
Here is the question:


An "add-a-gram" is a sequence of words formed by starting with a 3-letter word, adding a letter and rearranging to form a 4-letter word, and so on. For example, here are add-a-grams of the words "CREDENTIALS" and "ANACHRONISM":
ail + s =
sail + n =
nails + e =
aliens + t =
salient + r =
entrails + c =
clarinets + e =
interlaces + d =
CREDENTIALS (length 11)
mar + c =
cram + h =
march + s =
charms + o =
chromas + n =
monarchs + i =
harmonics + a =
maraschino + n =
ANACHRONISM (length 11)


Test your own credentials: given the dictionary found here, what is the longest add-a-gram?


 Initially I thought it would be solvable using a trie by inserting the words in dictionary in a trie and then simply searching the trie for the longest path down with each node after 3 node which is an end node ie. each node after third node marked as end of some word. But then I realized that it was wrong after half way coding it. Then I went back to square one and rethought the problem and then realized that it can be solvable only by searching. So then I came up with the following approach. I started with the longest word in the dictionary and then found the next word that is at a distance of 1 deletion from the previous word then pushed all those on a stack and for each word in the stack searched for next word at a distance of 1. So I basically created a graph from the words in dictionary t runtime and ran a DFS on it. The biggest hurdle or problem with this approach is searching through the whole dictionary for a word of that length and then backtracking when error occurs is very costly. So then I came up with a simple optimization technique to reduce the amount of words to be searched by storing the words in a array of  vectors where vector at index i stores words of length i so dividing the access in 2 dimensions gives access to the all words of length i directly by calling dict[i].get(). Then finally we can rebuild the original sequence in the visited list of nodes.
It is written in java and it takes ~20 secs to run wit the dictionary of words given here.


Here is only the main logic part of the code:

You can find the complete program here


public class AddAGram{
 public static Vector<String>dic[];
 public static Stack<String>s;
 public static HashSet<String>vis=new HashSet<String>();
 public static boolean check(char []l,char []s){
  int i,j,len=l.length;
  for(i=0;i<s.length;i++)
   for(j=0;j<l.length;j++)
    if(s[i]==l[j]&&l[j]!=0) {s[i]=0;l[j]=0;len--;}
  if(len==1) return true;
  return false;
 }
 static void addAll(String str){
  int l=str.length()-1,tot=dic[l].size();
  String other="";
  for(int i=0;i<tot;i++){
   other=dic[l].elementAt(i);
   if(check(str.toCharArray(),other.toCharArray())&&!vis.contains(other))
    s.push(other);
  }
 }
 static char diff(String small,String large){
  StringBuffer sb=new StringBuffer(large);
  for(int i=0;i<small.length();i++)
   sb=sb.deleteCharAt(sb.indexOf(String.valueOf(small.charAt(i))));
  return sb.charAt(0);
 }
 public static void generateOriginal(String start,String end){
  LinkedList<String>list=new LinkedList<String>();
  for(String a:vis) list.add(a);
  String next="",prev=start;
  list.remove(start);
  int k,j=end.length();
  for(int i=4;i<=j;){
   for(k=0;k<list.size();k++){
    if(list.get(k).length()==i){
     if(check(list.get(k).toCharArray(),prev.toCharArray())){
      next=list.remove(k);i++;
      System.out.println(prev+"+"+diff(prev,next)+"="+next);
      prev=next;break;
     }
    }
   }
  }
 }
 public static void main(String args[]) throws Exception{
  dic=new Vector[30];
  String str;int i,j,k;
  for(i=0;i<30;i++) dic[i]=new Vector<String>();
  InputReader sc=new InputReader(new FileInputStream("WORD.LST"));
  long init=System.currentTimeMillis();
  while(!sc.isExhausted()){
   str=sc.readLine();dic[str.length()].add(str);
  }
  int z=0;
  String word="",tmp="";Vector<String>v;
  for(i=29;i>2;i--){
   v=dic[i];
   for(z=0;z<v.size();z++){
    word=v.elementAt(z);
    s=new Stack<String>();
    vis.clear();
    s.push(word);
    while(!s.empty()){
     tmp=s.pop();
     addAll(tmp);
     vis.add(tmp);
     if(tmp.length()==3) {
      generateOriginal(tmp,word);
      System.out.println("Time taken : "+(System.currentTimeMillis()-init));
      return;
     }
    }
   }
  }
 }
}