Scrape Macy's deals using Beautiful Soup

Let me show you a tiny real example of how to use Python's bs4 (Beautiful Soup version 4) module. Say we want to collect information about the hot deals from Macy's. The URL is here. Sure, you can see all the info on one page and copy-paste it, but that's not our purpose. First, get the content of the page using the cute requests module.
import requests

url = 'http://bit.ly/19zWmQT'
r = requests.get(url)
html_content = r.text
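The fetch above assumes the request always succeeds. Here is a minimal sketch of a slightly more careful version. The helper name fetch_html and the 10-second timeout are my own choices for illustration, not part of the original code.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page body as text, or raise if the request fails."""
    # timeout stops the call from hanging forever on a slow server
    response = requests.get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx/5xx responses
    response.raise_for_status()
    return response.text
```

You would then write html_content = fetch_html(url) instead of the three lines above. A failed download now shows up as an exception instead of an error page being silently handed to the parser.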
Now start cooking the soup:
from bs4 import BeautifulSoup

soup = BeautifulSoup(html_content, 'html.parser')  # name the parser explicitly
Now look at the HTML source of the page. You will see that each offer is a list item (li) that has 'offer' as one of its CSS class names. So you can write the code in the following way:
offer_list = soup('li', 'offer')
Or you can write:
offer_list = soup.find_all('li', 'offer')
Another way to write this is:
offer_list = soup.select('li.offer')
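To check that the three spellings really select the same elements, you can try them on a small snippet first. The markup below is invented for illustration and is only a guess at the general shape of the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, made up for this example; the real page may differ.
sample_html = """
<ul>
  <li class="offer featured"><h3>Deal A</h3></li>
  <li class="offer"><h3>Deal B</h3></li>
  <li class="other"><h3>Not a deal</h3></li>
</ul>
"""

sample_soup = BeautifulSoup(sample_html, 'html.parser')

by_call = sample_soup('li', 'offer')               # calling the soup is shorthand for find_all
by_find_all = sample_soup.find_all('li', 'offer')  # a string second argument matches a CSS class
by_select = sample_soup.select('li.offer')         # CSS selector syntax

# Collect the heading text of every matched list item
titles = [li.h3.text for li in by_call]
print(titles)
```

All three pick up the two li elements whose class list contains 'offer', including the one that also carries a second class name.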
Now run this loop:
for offer in offer_list:
    title = offer.find('h3').text
    link = offer.find('a')['href']  # named link so the page url above is not overwritten
    description = offer.find('span').text  # first span inside the offer
    promo_code = offer.find('span', class_='promo-code').text
    promo_date = offer.find('span', class_='end-date').text
    print(title, link, description, promo_date, promo_code)
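The loop above prints nothing when offer_list is empty, and it raises an error if any offer lacks one of the expected tags. That can happen if the page markup differs from what is described here, or if the offers are filled in by JavaScript, which requests does not run. Below is a rough sketch of a more forgiving version. The helper names text_of and parse_offers and the sample markup are hypothetical, made up for this example.

```python
from bs4 import BeautifulSoup


def text_of(parent, name, css_class=None):
    """Return the stripped text of the first matching child, or None."""
    if css_class is None:
        node = parent.find(name)
    else:
        node = parent.find(name, class_=css_class)
    return node.get_text(strip=True) if node is not None else None


def parse_offers(html):
    """Return one dict per li.offer element; missing fields become None."""
    soup = BeautifulSoup(html, 'html.parser')
    offers = []
    for offer in soup.select('li.offer'):
        link = offer.find('a')
        offers.append({
            'title': text_of(offer, 'h3'),
            'url': link.get('href') if link is not None else None,
            'promo_code': text_of(offer, 'span', 'promo-code'),
            'end_date': text_of(offer, 'span', 'end-date'),
        })
    return offers


# Hypothetical markup; note that it has no end-date span.
sample_html = """
<li class="offer">
  <h3>Extra 20% off</h3>
  <a href="/shop/sale">Shop now</a>
  <span class="promo-code">SAVE20</span>
</li>
"""

offers = parse_offers(sample_html)
if not offers:
    # An empty result usually means the selector no longer matches the page
    print('No li.offer elements found; the page markup may have changed.')
for item in offers:
    print(item)
```

With the real page you would call parse_offers(html_content). If the result is empty, print part of html_content and compare it with what the browser shows.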
You are done! :)

Comments

Unknown said…
I tried using this code with no luck. Specifically, I tried writing 'offer_list' in all three ways, and for each one the 'for loop' simply produces nothing. And I have Beautiful Soup loaded.

Unknown said…
I tried using this code with no luck. Specifically, when I ran the 'for loop' I got no output. And I do have Beautiful Soup loaded and running.
