Parsing HTML with Python 2.7 - HTMLParser, SGMLParser, or Beautiful Soup?

I use and would recommend lxml and pyquery for parsing HTML. I had to write a web scraping bot a few months ago, and of all the popular alternatives I tried, including HTMLParser and BeautifulSoup, I went with lxml and the syntactic sugar of pyquery. I haven't tried SGMLParser, though.
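For a rough idea of what the lxml route looks like, here is a minimal sketch that pulls the links out of a page. The sample markup and the helper name extract_links are made up for illustration, and it uses plain XPath rather than pyquery, so treat it as one possible starting point rather than the way the bot above was written.

```python
import lxml.html

# Made-up sample page; note the second <li> is never closed.
SAMPLE_HTML = """
<html><body>
  <h1>Parsers</h1>
  <ul>
    <li><a href="/lxml">lxml</a></li>
    <li><a href="/pyquery">pyquery</a>
  </ul>
</body></html>
"""


def extract_links(html):
    """Return (link text, href) pairs for every <a> that has an href."""
    doc = lxml.html.fromstring(html)
    return [
        (a.text_content().strip(), a.get("href"))
        for a in doc.xpath("//a[@href]")
    ]


links = extract_links(SAMPLE_HTML)
for text, href in links:
    print("%s -> %s" % (text, href))
```

pyquery wraps the same lxml tree in a jQuery-like selector syntax, so the parsing behaviour should be the same and only the query style changes.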

BeautifulSoup in particular is for dirty HTML as found in the wild. It will parse any old thing, but is slow.
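As a small illustration of the "parse any old thing" point, here is a sketch that feeds deliberately broken markup to Beautiful Soup. It assumes the bs4 package (Beautiful Soup 4) with the built-in "html.parser" backend; the sample string and the helper name extract_links are invented for the example.

```python
from bs4 import BeautifulSoup

# Invented messy input: unclosed <p> and <b>, an unquoted attribute,
# and a final <a> that is never closed.
MESSY_HTML = (
    '<p>Unclosed paragraph <b>bold text'
    '<p>Second <a href="/docs">docs</a> <a href=/faq>faq'
)


def extract_links(html):
    """Return (link text, href) pairs, tolerating broken markup."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text().strip(), a.get("href"))
        for a in soup.find_all("a", href=True)
    ]


links = extract_links(MESSY_HTML)
for text, href in links:
    print("%s -> %s" % (text, href))
```

If speed matters, Beautiful Soup 4 can also be told to use lxml as its backend by passing "lxml" instead of "html.parser", provided lxml is installed.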

I can't really give you an answer, but what I can give you is a way to a solution: find the angle that you relate to or that piques your interest. A good paper is one that people get drawn into because it reaches them in some way. As for me, when I think of WWII, I think of the Holocaust and the effect it had on the survivors, their families, and those who stood by and did nothing until it was too late.
