Showing posts from July, 2012

python code to compute jaccard index

Computing the Jaccard index (Jaccard similarity coefficient) is easy. Here is my first Python implementation of the Jaccard index:

    def compute_jaccard_index(set_1, set_2):
        return len(set_1.intersection(set_2)) / float(len(set_1.union(set_2)))

But we can make it more efficient. If you think for a moment, you will find that we don't really need to build the union set, only its cardinality, which equals len(set_1) + len(set_2) minus the size of the intersection. So this code works better:

    def compute_jaccard_index(set_1, set_2):
        n = len(set_1.intersection(set_2))
        return n / float(len(set_1) + len(set_2) - n)
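
One thing neither version handles is two empty sets, where the denominator is zero and the division raises ZeroDivisionError. Here is a minimal sketch of a guarded variant with a small worked example. The name jaccard_index and the sample sets are made up for illustration, and returning 1.0 for two empty sets is just one possible convention (an assumption, not something the definition settles):

```python
def jaccard_index(set_1, set_2):
    """Jaccard index computed from cardinalities, guarded against two empty sets."""
    n = len(set_1 & set_2)  # size of the intersection
    # |A union B| = |A| + |B| - |A intersect B|, so no union set is built
    union_size = len(set_1) + len(set_2) - n
    if union_size == 0:
        # Assumption: two empty sets are treated as identical (index 1.0).
        return 1.0
    return n / float(union_size)


a = {"apple", "banana", "cherry"}
b = {"banana", "cherry", "date", "fig"}

# The intersection has 2 elements and the union has 5, so the index is 2 / 5.
print(jaccard_index(a, b))  # 0.4
```

If you would rather treat two empty sets as having nothing in common, return 0.0 in the guarded branch instead.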