Python code to compute Jaccard index

Computing the Jaccard index (Jaccard similarity coefficient) is easy: it is the size of the intersection of two sets divided by the size of their union. Here is my first Python implementation of the Jaccard index:

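def compute_jaccard_index(set_1, set_2):
    # Size of the intersection divided by size of the union.
    # Raises ZeroDivisionError if both sets are empty.
    return len(set_1.intersection(set_2)) / float(len(set_1.union(set_2)))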
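
To make the formula concrete, here is a small worked example. The two sample sets are made up purely for illustration; the function is the same one shown above, repeated so the snippet runs on its own.

```python
def compute_jaccard_index(set_1, set_2):
    # Size of the intersection divided by size of the union
    return len(set_1.intersection(set_2)) / float(len(set_1.union(set_2)))

# Hypothetical sample sets, chosen only for illustration
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

# The intersection is {3, 4} (2 items); the union is {1, 2, 3, 4, 5, 6} (6 items)
score = compute_jaccard_index(a, b)
print(score)  # 2 / 6, roughly 0.333
```

Identical sets give 1.0 and sets with nothing in common give 0.0.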



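But we can make it more efficient. If you think for a moment, you will find that we don't really need to build the union set, only its cardinality. The size of the union equals len(set_1) + len(set_2) minus the size of the intersection, so this code avoids constructing the extra set: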

def compute_jaccard_index(set_1, set_2):
    # n is the size of the intersection
    n = len(set_1.intersection(set_2))
    # Size of the union = len(set_1) + len(set_2) - n
    return n / float(len(set_1) + len(set_2) - n)
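
As a quick sanity check, the sketch below compares the two approaches on random sets. The function names jaccard_with_union and jaccard_with_counts are hypothetical, used only to keep both versions side by side. Returning 1.0 for two empty sets is my assumption, not part of the code above, which would raise ZeroDivisionError in that case.

```python
import random

def jaccard_with_union(set_1, set_2):
    # First approach: build the union set and take its length
    return len(set_1 & set_2) / float(len(set_1 | set_2))

def jaccard_with_counts(set_1, set_2):
    # Second approach: derive the union size from the three counts
    n = len(set_1 & set_2)
    denominator = len(set_1) + len(set_2) - n
    if denominator == 0:
        # Assumption: treat two empty sets as identical
        return 1.0
    return n / float(denominator)

random.seed(0)  # fixed seed so the check is repeatable
for _ in range(1000):
    a = set(random.sample(range(50), random.randint(1, 20)))
    b = set(random.sample(range(50), random.randint(1, 20)))
    # Both versions divide the same two integers, so the results match exactly
    assert jaccard_with_union(a, b) == jaccard_with_counts(a, b)
```

Whether the saving matters depends on how large the sets are; for small sets the difference is unlikely to be noticeable.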


Comments

mafinar said…
I had stumbled upon your blog quite a few times in the past but never really had the opportunity to contact you or comment on it. It's really cool that somebody from Bangladesh is blogging about Python. A Python enthusiast myself, I am also planning on doing something similar. Cool to have you beat me to it by years! I hope to have a chat with you soon. Keep up the good work!
Tamim Shahriar said…
Good to know that you liked my Python blog. You can check my other website too (http://cpbook.subeen.com) and recommend it to newbies.

As you are a Python enthusiast, you can join this group: https://www.facebook.com/groups/pythonbd/.

Thanks.
