Get remote file size through HTTP

A couple of days ago I wrote this code.

The requirement is to get the size of a remote file over HTTP. For example, the Python program needs to find the size of the file http://abc.com/dir/file1.mp3

Now here is a very stupid solution for this:

import urllib2

url = 'http://abc.com/dir/file1.mp3'
usock = urllib2.urlopen(url)
data = usock.read()
size = len(data) # size in bytes
size = size / 1024.0 # in KB (Kilo Bytes)
size = size / 1024.0 # size in MB (Mega Bytes)
...

The stupidity is in the line data = usock.read(), which downloads the whole file just to get its size! This was the first solution that came to my mind. But I soon realized that the file size can be read from the HTTP response header. Here is a much better solution:

import urllib2

url = 'http://abc.com/dir/file1.mp3'
usock = urllib2.urlopen(url)
size = usock.info().get('Content-Length')
if size is None:
    size = 0
size = float(size) # in bytes
size = size / 1024.0 # in KB (Kilo Bytes)
size = size / 1024.0 # size in MB (Mega Bytes)
...
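The snippets above are Python 2. As a rough sketch of the same parsing and unit conversion for Python 3, the helpers below take the raw Content-Length value and turn it into a readable size. The names content_length_bytes and format_size are made up for this illustration, and returning None (rather than 0) for a missing header is an assumption, chosen so that "unknown size" can be told apart from an empty file.

```python
def content_length_bytes(header_value):
    """Parse a Content-Length header value (hypothetical helper).

    Returns the size in bytes as an int, or None when the header is
    missing, non-numeric, or negative.
    """
    if header_value is None:
        return None
    try:
        size = int(header_value.strip())
    except ValueError:
        return None
    return size if size >= 0 else None


def format_size(num_bytes):
    """Format a byte count with 1024-based units (hypothetical helper)."""
    if num_bytes is None:
        return "unknown"
    size = float(num_bytes)
    for unit in ("bytes", "KB", "MB", "GB"):
        # Stop at the first unit that fits, or at the largest unit listed.
        if size < 1024.0 or unit == "GB":
            return "%.2f %s" % (size, unit)
        size /= 1024.0
```

With these, the value returned by usock.info().get('Content-Length') could be passed through content_length_bytes and then format_size instead of dividing by 1024.0 by hand.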

Comments

Anonymous said…
Hi. Good idea to use the header. One word of warning though: a web server does not have to send the Content-Length header; it's not mandatory.
Unknown said…
Hi, your solution still isn't very good as you're still putting unnecessary load on the server. Instead of an HTTP GET request, use HTTP HEAD which will return the same response header without actually transmitting any part of the file. That's exactly the purpose of HEAD.
Tamim Shahriar said…
Steve / Marina, thanks for your comments. Can you give a code example that calls HTTP HEAD in Python?
Ben said…
# Call HTTP HEAD in Python
import httplib

conn = httplib.HTTPConnection("www.abc.com")
conn.request("HEAD", "/dir/file1.mp3")
res = conn.getresponse()
fileSize = res.getheader('content-length')  # a string, or None if the header is absent
# or res.getheaders() for all headers
conn.close()
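
In Python 3, urllib2 became urllib.request and httplib became http.client, so the HEAD approach might look roughly like the sketch below. The function name head_content_length and the 10-second default timeout are assumptions made for this example, not something from the post. It returns None when the server sends no usable Content-Length, since that header is optional.

```python
import urllib.request


def head_content_length(url, timeout=10):
    """Return the Content-Length reported for url, or None (hypothetical helper)."""
    # method="HEAD" asks the server for the headers only, so the file
    # body is not transferred.
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        value = response.headers.get("Content-Length")
    if value is None:
        return None
    try:
        return int(value.strip())
    except ValueError:
        # Treat a malformed header the same as a missing one.
        return None
```

A call such as head_content_length('http://abc.com/dir/file1.mp3') would give the size in bytes when the server reports it. Some servers do not answer HEAD requests properly, so falling back to a GET request may still be needed.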
