Figured this out using html5lib, with this answer as a starting point. Here's a version of what I ended up with. It does the same thing as the BeautifulSoup code I started with above, except that it works properly for the integer case I described.
You are doing it wrong (tm). BeautifulSoup is not meant to be used like that. Take a look at this instead: http://code.activestate.com/recipes/52281-strip-tags-and-javascript-from-html-page-leaving-o/ This recipe removes invalid tags, whereas it sounds like you want to keep them in but escaped.
Should be a pretty easy modification.
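To give a rough idea of what "keep them in but escaped" could look like, here is a minimal sketch. It is not the linked recipe and not the asker's code. It uses the standard library's `html.parser` instead, and the `ALLOWED_TAGS` whitelist, the `EscapeDisallowedTags` class, and the `escape_disallowed` helper are all names made up for illustration. Attributes on allowed tags are passed through untouched, so don't treat this as a complete sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist; adjust to whatever tags you consider valid.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "br"}


class EscapeDisallowedTags(HTMLParser):
    """Re-emit allowed tags verbatim and HTML-escape every other tag."""

    def __init__(self):
        # Keep entity/char references as separate events so they can be
        # written back out exactly as they appeared.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _emit_tag(self, tag, raw):
        self.out.append(raw if tag in ALLOWED_TAGS else escape(raw))

    def handle_starttag(self, tag, attrs):
        # Note: attributes of allowed tags are NOT filtered here.
        self._emit_tag(tag, self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/>; emit once, not as start + end.
        self._emit_tag(tag, self.get_starttag_text())

    def handle_endtag(self, tag):
        self._emit_tag(tag, "</%s>" % tag)

    def handle_data(self, data):
        # Escape stray "<", ">" and "&" in text; leave quotes alone.
        self.out.append(escape(data, quote=False))

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        # Comments are shown as escaped text rather than dropped.
        self.out.append(escape("<!--%s-->" % data))


def escape_disallowed(html_text):
    parser = EscapeDisallowedTags()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(escape_disallowed("<b>hi</b><script>x</script>"))
```

The only real change from a strip-the-tags approach is in `_emit_tag`: instead of discarding a tag that isn't whitelisted, it writes the tag back out through `escape()`, so the tag shows up as visible text.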
I can't really give you an answer, but what I can give you is a way to a solution: you have to find the angle that you relate to or that piques your interest. A good paper is one that people get drawn into because it reaches them in some way. As for me, when I think of WWII, I think of the Holocaust and the effect it had on the survivors, their families, and those who stood by and did nothing until it was too late.