So you can do the same from your Python code. Just add
('Accept-Encoding', 'gzip,deflate') to the request headers. Check the following code chunk:
opener = urllib2.build_opener()
opener.addheaders = [('Referer', referer),
                     ('User-Agent', uagent),
                     ('Accept-Encoding', 'gzip,deflate')]
usock = opener.open(url)
url = usock.geturl()
data = decode(usock)
usock.close()
return data
Note the decode() function used in the code. Yes, you have to decode the content (if it's compressed).
def decode(page):
    # The server reports any compression it applied in Content-Encoding.
    encoding = page.info().get("Content-Encoding")
    if encoding in ('gzip', 'x-gzip', 'deflate'):
        content = page.read()
        if encoding == 'deflate':
            data = StringIO.StringIO(zlib.decompress(content))
        else:
            data = gzip.GzipFile('', 'rb', 9, StringIO.StringIO(content))
        page = data.read()
    # If the response was not compressed, page is still the unread response object here.
    return page
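The code above is Python 2 (urllib2 and StringIO). For readers on Python 3, here is a rough sketch of the same idea, not a drop-in replacement. The names decode_body and fetch are made up for illustration, and the fallback to a raw deflate stream is an assumption about how some servers behave, not something the code above does.

```python
import gzip
import urllib.request
import zlib


def decode_body(encoding, content):
    """Return the decompressed bytes for a given Content-Encoding value."""
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(content)
    if encoding == "deflate":
        try:
            # Standard case: deflate data wrapped in a zlib header.
            return zlib.decompress(content)
        except zlib.error:
            # Assumption: the server sent a raw deflate stream with no zlib header.
            return zlib.decompress(content, -zlib.MAX_WBITS)
    # No (or unknown) encoding: hand the bytes back untouched.
    return content


def fetch(url, referer, uagent):
    """Request url with compression enabled and return the decoded bytes."""
    opener = urllib.request.build_opener()
    opener.addheaders = [("Referer", referer),
                         ("User-Agent", uagent),
                         ("Accept-Encoding", "gzip,deflate")]
    with opener.open(url) as response:
        encoding = response.headers.get("Content-Encoding")
        return decode_body(encoding, response.read())
```

Keeping the decompression in a function that takes plain bytes makes it easy to try out without hitting the network, for example decode_body('gzip', gzip.compress(b'some text')).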
You can also have a look at this page from the book - Dive Into Python: http://diveintopython.org/http_web_services/gzip_compression.html
6 comments:
AWESUM !!!! ....
this is a boost ...
thnx for d code :)
Thanks for your comment.
Some corrections for the case where the response is not compressed:
encoding = page.info().get("Content-Encoding")
if encoding in ('gzip', 'x-gzip', 'deflate'):
    content = page.read()
    if encoding == 'deflate':
        data = StringIO.StringIO(zlib.decompress(content))
    else:
        data = gzip.GzipFile('', 'rb', 9, StringIO.StringIO(content))
    content = data.read()
else:
    content = page.read()
return content
+1 for Andrey's change
Also, you should import these libraries:
import gzip
import StringIO
import zlib
It's awesome... but it may only work on Python 2.
For Python 3:
fetch = opener.open(request)
data = gzip.decompress(fetch.read())
data = str(data, 'utf-8')
This will work... cheers