Removing Tags from HTML Parsed with BeautifulSoup?

Use the ".text" property:

    d['name'] = line.find('div', {'class':'torrentname'}).find('a').text

Or do a join on findAll(text=True):

    anchor = line.find('div', {'class':'torrentname'}).find('a')
    d['name'] = ''.join(anchor.findAll(text=True))

This doesn't work. It doesn't keep the spaces in an example like this: Ubuntu Linux. It comes out as UbuntuLinux. – FlowofSoul Aug 29 '10 at 4:24

I have updated the answer with an additional option. – Matt Austin Aug 29 '10 at 5:29

Thanks so much, that works great! Could you explain how that second line of code works? – FlowofSoul Aug 29 '10 at 15:29

The BeautifulSoup documentation says the text argument allows you to "search for NavigableString objects instead of Tags". findAll returns a Python list, which can then be joined together (.join) to form one string. crummy.com/software/BeautifulSoup/documentation.html – Matt Austin Aug 29 '10 at 4:46
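To make the explanation above concrete, here is a minimal sketch of the join approach. It assumes the newer bs4 package, where find_all(string=True) is the equivalent of the findAll(text=True) call in the answer; the HTML snippet and the variable names are made up for illustration and are not taken from the asker's page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup shaped like the asker's case: two <strong> tags
# separated by a single space inside the link.
html = (
    '<div class="torrentname">'
    '<a href="#"><strong>Ubuntu</strong> <strong>Linux</strong></a>'
    '</div>'
)

soup = BeautifulSoup(html, "html.parser")
anchor = soup.find("div", {"class": "torrentname"}).find("a")

# string=True matches text nodes (NavigableString objects) instead of tags,
# so the result is a list of the text pieces found under the anchor,
# including the whitespace node between the two <strong> tags.
pieces = anchor.find_all(string=True)

# Joining the pieces with an empty separator rebuilds the visible text.
name = "".join(pieces)
print(name)
```

Because the whitespace between the tags is itself a text node, it survives the join, which is why this variant keeps the space that the commenter saw disappear.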

I'm new to Python and I'm using BeautifulSoup to parse a website and then extract data. But because of the strong HTML tags, it returns None. Is there a way to extract the strong tags and then use .string, or is there a better way? I have tried using BeautifulSoup's extract() function but I couldn't get it to work.

Edit: I just realized that my solution does not work if there are two sets of strong tags, as the space between the words is left out. What would be a way to fix this problem?
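The two symptoms described in the question can be reproduced with a short sketch. It assumes the bs4 package and a made-up snippet in which the two strong tags have no whitespace between them; get_text with a separator is a bs4 feature and is offered here as one possible fix, not as what the asker's BeautifulSoup version necessarily supported.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two <strong> tags with nothing between them.
html = "<a><strong>Ubuntu</strong><strong>Linux</strong></a>"
anchor = BeautifulSoup(html, "html.parser").a

# .string is only defined when a tag has exactly one child, so with two
# <strong> children it is None.
single = anchor.string

# Joining the text nodes works, but nothing separates the two words.
glued = "".join(anchor.find_all(string=True))

# get_text accepts a separator placed between the text pieces;
# strip=True trims whitespace from each piece first.
spaced = anchor.get_text(" ", strip=True)

print(single, glued, spaced)
```

The separator approach inserts a space between every pair of text nodes, so it may add spaces in places where the original markup had none.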

