Simple web crawler / scraper tutorial using the Requests module in Python
Let me show you how to use the Requests Python module to write a simple web crawler / scraper. Let's define the problem first. On this page, http://cpbook.subeen.com/p/blog-page_11.html, I am publishing some programming problems. I shall write a script to get the links (URLs) of those problems.

So, let's start. First, make sure you can get the content of the page. To do this, write the following code:

    import requests

    def get_page(url):
        r = requests.get(url)
        print r.status_code
        with open("test.html", "w") as fp:
            fp.write(r.text)

    if __name__ == "__main__":
        url = 'http://cpbook.subeen.com/p/blog-page_11.html'
        get_page(url)

Now run the program:

    $ python cpbook_crawler.py
    200
    Traceback (most recent call last)...
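For comparison, here is a minimal sketch of the same fetch-and-save step, split into two small functions so that each part can be checked on its own. The names `fetch_html` and `save_html`, the `get` parameter, and the 10-second timeout are my own illustrative choices and are not part of the script above. The sketch writes the file through `io.open` with an explicit UTF-8 encoding, which is available on both Python 2 and Python 3.

```python
import io


def fetch_html(url, get=None, timeout=10):
    """Return the body of `url` as text, raising on HTTP error statuses.

    `get` is a hypothetical hook: pass any callable with the same shape as
    requests.get to try the function without touching the network.
    """
    if get is None:
        # Imported lazily so a stand-in fetcher can be used without requests.
        import requests
        get = requests.get
    response = get(url, timeout=timeout)
    # Raises requests.HTTPError for 4xx/5xx responses instead of saving them.
    response.raise_for_status()
    return response.text


def save_html(text, path):
    """Write `text` to `path` as UTF-8."""
    # io.open takes an encoding argument on both Python 2 and Python 3.
    with io.open(path, "w", encoding="utf-8") as fp:
        fp.write(text)
```

To use it on the page from this tutorial, you could call `save_html(fetch_html('http://cpbook.subeen.com/p/blog-page_11.html'), 'test.html')`.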