Python, BeautifulSoup or LXML - Parsing image URLs from HTML using CSS tags?

Using lxml, you might do something like this:

    import feedparser
    import lxml.html as lh
    import urllib2

    # Import feed for parsing
    d = feedparser.parse("http://feeds.boston.com/boston/bigpicture/index")

    # Print feed name
    print d['feed']['title']

    # Determine number of posts and set range maximum
    posts = len(d['entries'])

    # Collect post URLs
    for post in d['entries']:
        link = post['link']
        print('Parsing {0}'.format(link))
        doc = lh.parse(urllib2.urlopen(link))
        imgs = doc.xpath('//img[@class="bpImage"]')
        for img in imgs:
            print(img.attrib['src'])

This is perfect. Thank you very much. – tyebud Nov 23 '10 at 17:28.
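For readers who want to try the idea without a network connection or lxml installed, here is a minimal sketch that uses only the Python 3 standard library (`html.parser`) rather than lxml. It is a swapped-in technique, not the method from the answer above. The `ImageSrcCollector` class name, the sample markup, and the example.com URLs are all made up for illustration; the real Big Picture markup may differ.

```python
from html.parser import HTMLParser


class ImageSrcCollector(HTMLParser):
    """Collect the src of every <img> whose class list contains a target class.

    Hypothetical helper for illustration; not part of lxml or BeautifulSoup.
    """

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # Only <img> tags are of interest.
        if tag != "img":
            return
        attributes = dict(attrs)
        # A class attribute may hold several space-separated class names.
        classes = (attributes.get("class") or "").split()
        if self.target_class in classes and attributes.get("src"):
            self.urls.append(attributes["src"])


# Assumed sample markup, standing in for a downloaded post page.
sample_html = """
<div class="bpBoth">
  <a name="photo1"></a>
  <img src="http://example.com/photos/one.jpg" class="bpImage" />
  <img src="http://example.com/static/logo.png" class="logo" />
  <img src="http://example.com/photos/two.jpg" class="bpImage wide" />
</div>
"""

collector = ImageSrcCollector("bpImage")
collector.feed(sample_html)

for url in collector.urls:
    print(url)
```

One difference worth noting: this sketch also matches an image whose class attribute lists several classes (such as "bpImage wide"), whereas an XPath test of the form @class="bpImage" only matches when the attribute is exactly that string.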

The code you have posted looks for all a elements with the bpImage class. But your example has the bpImage class on the img element, not the a. You just need to do:

    soup.find("img", {"class": "bpImage"})

Haha. Of course. So that returns the URL with tags. Is there some way to strip those down to just the URL? – tyebud Nov 23 '10 at 17:10.
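On getting just the URL: a BeautifulSoup tag can be indexed like a dictionary to read a single attribute, so the image address is the tag's src attribute. The following is a small sketch, assuming BeautifulSoup 4 (the `bs4` package) on Python 3 and made-up sample markup with example.com URLs; the real page is not reproduced in this thread.

```python
from bs4 import BeautifulSoup

# Assumed sample markup for illustration only.
html = (
    '<div>'
    '<img src="http://example.com/photo1.jpg" class="bpImage" />'
    '<img src="http://example.com/logo.png" class="logo" />'
    '<img src="http://example.com/photo2.jpg" class="bpImage" />'
    '</div>'
)

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag; indexing it returns one attribute value.
first_tag = soup.find("img", {"class": "bpImage"})
first_url = first_tag["src"]

# find_all() returns every matching tag, so collect the src of each one.
all_urls = [img["src"] for img in soup.find_all("img", {"class": "bpImage"})]

print(first_url)
print(all_urls)
```

If an image might lack a src attribute, `img.get("src")` returns None instead of raising a KeyError.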

Using pyparsing to search for tags is fairly intuitive:

    from pyparsing import makeHTMLTags, withAttribute

    imgTag, notused = makeHTMLTags('img')

    # only retrieve tags with class='bpImage'
    imgTag.setParseAction(withAttribute(**{'class': 'bpImage'}))

    for img in imgTag.searchString(html):
        print img.src

I have searched high and low for a decent explanation of how BeautifulSoup or LXML work. Granted, their documentation is great, but for someone like myself, a Python/programming novice, it is difficult to decipher what I am looking for. Anyway, as my first project, I am using Python to parse an RSS feed for post links. I have accomplished this with Feedparser.

My plan is to then scrape each post's images. For the life of me though, I cannot figure out how to get either BeautifulSoup or LXML to do what I want! I have spent hours reading through the documentation and googling to no avail, so I am here.

The following is a line from the Big Picture (my scrapee). I am trying to find all instances with that CSS class. Well, it doesn't return anything.

I'm sure I'm overlooking something trivial so I greatly appreciate your patience. Thank you very much for your responses!

