Posts

Showing posts from April, 2008

Convert text to ASCII and ASCII to text - Python code

Python has a very simple way to convert text to ASCII and ASCII to text. To find the ASCII value of a character use the ord() function. Example: >>> ord('B') 66 To get a character from it's ASCII value use the chr() function. Example: >>> chr(65) 'A' Here is a program to get the list of all characters along with their values: # program to print the list of characters and their ASCII values for value in range(0, 255): print "ASCII Value:", value, "\t", "Character:", chr(value), "\n" Here is another simple ASCII encoding-decoding program: # Convertion from text to ASCII codes message = raw_input("Enter message to encode: ") print "Decoded string (in ASCII):" for ch in message: print ord(ch), print "\n\n" # Convertion from ASCII codes to text message = raw_input("Enter ASCII codes: ") decodedMessage = "" for item in message.split(): decodedMessage += chr

PyHP - Python Hypertext Preprocessor

Couple of months ago, I was trying to embed Python code into HTML. After given the problem to solve, I was planning for a solution and looking into Python libraries. Suddenly I thought, why not spend some time in google to see if there already exists something similar. Actually I was looking for something very simple and I got one!!! It's called PyHP - Python Hypertext Preprocessor. Thank God, I didn't have to reinvent the wheel, as it was almost the similar thing what I was thinking to build. It's a small, fast, hypertext preprocessor in my favorite language - Python. And of course it is open source. The only thing I needed to change in the code is, I made the code to print output in a html file rather than in console. It was just a simple tweak. Download and play with PyHP - Python Hypertext Preprocessor . You can use it as a standalone program, or use it with Apache's mod-python. Enjoy Programming, Enjoy Python!

Copy Compound object lists using deepcopy

Sometimes you might get into trouble while copying a compound object list to another in Python. If this is so, then you should use the copy module of Python. To know details about shallow and deep copy operations in Python please visit http://docs.python.org/lib/module-copy.html To copy simple lists without using copy module of python, you can use the method described in another post of my blog http://love-python.blogspot.com/2008/04/how-to-copy-list-in-python.html One of my friend found that this method wasn't working for him while copying a graph into another (he used adjacency matrix to represent graph), then he tried deepcopy and got rid of the problem.

Locate Firefox Cache - Python Code

Here I share my python code to locate firefox cache directory. # program to locate firefox cache directory in Linux from subprocess import Popen, PIPE name = raw_input('Enter the top-level direcotry ( example: /home ): ') li = Popen(['find', name, '-iname', 'cache'], stdout=PIPE).stdout.read().split('\n') flag = 1 for item in li: if item.find('firefox') != -1: print 'Cache Directory: ', item flag = 0 break if flag: for item in li: if item.find('mozilla') != -1: print 'Cache Directory: ', item break It will work on Linux (*nix). To use it in windows with minimum modification, you need to install UnxUtil in your machine. Please visit http://speed-dev.blogspot.com/2008/03/unxutil-run-unix-linux-command-in.html to know more about UnxUtil.

Adjacency Matrix (Graph) in Python

Adjacency matrix is a widely used graph data structure used to store graph. Couple of days ago I had to write a Python code that reads user input from console and stores the graph in an adjacency matrix. I used List for this purpose. Here is my code:- def graph_input(): V = int(raw_input("Enter the number of Nodes: ")) E = int(raw_input("Enter the number of Edges: ")) print "Enter Edges with weights:\n" adj_matrix = [] # vertex numbering starts from 0 for i in range(0, V): temp = [] for j in range(0, V): temp.append(0) adj_matrix.append(temp) for i in range(0, E): s = raw_input() u, v, w = s.split() u = int(u) v = int(v) w = int(w) adj_matrix[v][u] = adj_matrix[u][v] = w # print the adjacency matrix for i in range (0, V): print adj_matrix[i] return adj_matrix graph = graph_input() Let me know if you have any other idea to repres

Python Web Frameworks - A Short List

Python is heavily used for web application development. Web frameworks make your life easier while developing a web application - doesn't matter it's small, medium or huge. Here I give a short list of popular python web frameworks: Django - Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design. Pylons - Pylons is a lightweight web framework emphasizing flexibility and rapid development. TurboGears - the rapid Web development megaframework web.py - web.py is a web framework for python that is as simple as it is powerful. I have little idea about Pylons. But haven't looked at other python web frameworks. So I am not going for any comparison among the python web frameworks :-) For a more detail list of python web framework please visit here: http://wiki.python.org/moin/WebFrameworks

get content (html source) of an URL by HTTP POST method in Python

To retrieve or get content (html source) of an URL, sometimes we need to POST some values. Here I show you a sample python code that uses POST. import urllib, urllib2, time url = 'http://www.example.com' # write ur URL here values = {'key1' : 'value1', #write ur specific key/value pair 'key2' : 'value2', 'key3' : 'value3', } try: data = urllib.urlencode(values) req = urllib2.Request(url, data) response = urllib2.urlopen(req) the_page = response.read() print the_page except Exception, detail: print "Err ", detail Hope it will be useful for you while writing web crawler/spider, specially where values must be submitted using HTTP POST method to get or extract content from an URL. Please write your comments.

Python code to scrape email address using regular expression

Programmers when learn writing web spiders or crawlers, try to write a script to parse/collect email address from website. Here I post a class in Python that can harvest email address. It uses regular expression. import re class EmailScraper(): def __init__(self): self.emails = [] def reset(self): self.emails = [] def collectAllEmail(self, htmlSource): "collects all possible email addresses from a string, but still it can miss some addresses" #example: t.s@d.com email_pattern = re.compile("[-a-zA-Z0-9._]+@[-a-zA-Z0-9_]+.[a-zA-Z0-9_.]+") self.emails = re.findall(email_pattern, htmlSource) def collectEmail(self, htmlSource): "collects all emails that starts with mailto: in the html source string" #example: <a href="mailto:t.s@d.com"> email_pattern = re.compile("<a\s+href=\"mailto:([a-zA-Z0-9._@]*)\">", re.IGNORECASE

Read FireFox Cache with Python

Do you know the easiest way to view the disk cache data of your firefox browser? Just paste "about:cache?device=disk" (without the quotes) in the address bar of your firefox browser. If you see it for the first time, you will be amazed. Anyway, I had to write a program that does similar thing (locate and read firefox cache and show cache data) in Python. First I tried to automate the about:cache mechanism, but couldn't make it. Later I went for processing raw cache data. In the Cache directory, the files _CACHE_001_, _CACHE_002_, and _CACHE_003_ contain a mix of data and metadata. The metadata contains information such as the URL requested, timestamps, size, and other details. You can view the files (I used VI in my Ubuntu to view those). The file structure of each _CACHE_00n file is as follows: ______________________________ | | | 4096 byte free | | block bit-map | |------------------------------| |

How to copy a list in Python?

Let me tell about a common mistake many python beginners do while trying to copy a list to another list. Suppose to copy a list listA to listB, they use listB = listA , now if you change something in one list, the other list is changed automatically as they refer to same list - actually listB points to listA! So the proper way should be listB = [] listB.extend(listA) Here I paste some experiments I made: >>> listA = [1, 2, 3, 4, 5] >>> listB = listA >>> listC = [] >>> listC.extend(listA) >>> listB [1, 2, 3, 4, 5] >>> listC [1, 2, 3, 4, 5] >>> listA [1, 2, 3, 4, 5] >>> listB[4] = 0 >>> listB [1, 2, 3, 4, 0] >>> listA [1, 2, 3, 4, 0] >>> listC [1, 2, 3, 4, 5] >>> listC[4] = 10 >>> listC [1, 2, 3, 4, 10] >>> listA [1, 2, 3, 4, 0] >>> listB [1, 2, 3, 4, 0] >>> Hope you get the point!

Get the ASCII string for Decimal number in your data - reformat with Python

Once while writing a spider for a site, I came across some data that were not ASCII, rather had the ASCII value in the format '&#123' ... so I wrote a function myself to reformat the data (replace those with their corresponding ASCII values). But I had a feel that it was not Pythonic. So I searched Google, and found the following function that does the same task in a Pythonic way. Unfortunately I forgot from where I got it :( def asciirepl(self, match): "Return the ascii string for a decimal number" s = match.group() value = int(s[2:-1]) if value > 255: return '' return chr(value) def reformat_content(self, data): p = re.compile(r'&#(\d+);') return p.sub(self.asciirepl, data)

How to think like a (Python) programmer - Free eBook

I just found this book couple of days ago. Though it won't be very useful for me to learn Python (as I already familiar with Python), I think it's an excellent book for Python newbies even with no prior programming experience. It can also be used for teaching a one semester programming course - the author wrote the book for this purpose! The book is divided into 19 chapters that covers programming fundamentals, python fundamentals, python data structures, object oriented programming, and the last chapter introduces to GUI programming with Python (Tkinter). The book is released under the GNU Free Documentation License! You can get the it from here . Download is free! Post your comments on the book.