How to find links and modify HTML using BeautifulSoup in Python?

This should be easy in Beautiful Soup. Something like:

    from BeautifulSoup import BeautifulSoup
    from BeautifulSoup import Tag

    count = 1
    links_dict = {}
    soup = BeautifulSoup(text)
    for link_tag in soup.findAll('a'):
        if link_tag['href'] and len(link_tag['href']) > 0:
            links_dict[count] = link_tag['href']
            newTag = Tag(soup, "a", link_tag.attrs)
            newTag.insert(0, ''.join([''.join(link_tag.contents), "%s" % str(count)]))
            link_tag.replaceWith(newTag)
            count += 1

Result of executing this on your text:

    >>> soup
    this if <a href="http://www.foo.com">foo1</a>
    this if <a href="http://www.bar.com">bar2</a>
    >>> links_dict
    {1: u'http://www.foo.com', 2: u'http://www.bar.com'}

The only problem I can foresee with this solution is if your link text contains subtags; then you couldn't do ''.join(link_tag.contents); instead you would need to navigate to the rightmost text element.
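For the subtag case mentioned above, here is a minimal sketch of "navigating to the rightmost text element". It assumes the newer bs4 package rather than the BeautifulSoup 3 module used in the answer, and both the helper name append_to_rightmost_text and the sample markup are made up for illustration:

```python
from bs4 import BeautifulSoup  # assumption: bs4, not the BeautifulSoup 3 module used above


def append_to_rightmost_text(tag, suffix):
    """Append suffix to the last text node inside tag, however deeply nested."""
    strings = list(tag.strings)  # snapshot first, since the tree is modified below
    if strings:
        last = strings[-1]
        last.replace_with(last + suffix)
    else:
        tag.append(suffix)  # no text at all, so the suffix becomes the only child


# Hypothetical sample: part of the link text is wrapped in a <b> subtag.
soup = BeautifulSoup('<a href="http://www.foo.com">this is <b>foo</b></a>', "html.parser")
append_to_rightmost_text(soup.a, "1")
print(soup)
```

Because the counter is attached to the last text node, it ends up inside the <b> tag here, which may or may not be what you want; calling tag.append(suffix) instead would put it after the subtag.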

@danben +1 for the effort. Actually this is like the code I made before asking the question. It does not work because you end up with something like this if foo and this is not what I want. – systempuntoout May 24 '10 at 21:42

@systempuntoout: edited; the current code is working for me. – danben May 24 '10 at 21:57

@danben do you think it is possible to change the node's content without recreating a new tag? – systempuntoout May 25 '10 at 9:14

I was not able to do that, and the documentation suggests that it is not possible. Why is creating a new Tag undesirable? – danben May 25 '10 at 12:34

@danben uhm, because I could have other attributes besides href; a rel="nofollow" for example. Please have a look at this other question: stackoverflow.com/questions/2904542/… – systempuntoout May 25 '10 at 18:01
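On the question of changing a node's content without recreating the tag: a rough sketch, assuming the newer bs4 package (not the BeautifulSoup 3 module discussed in this thread), where appending a string to an existing tag leaves its attributes alone. The input markup below is hypothetical and only mirrors the foo/bar example:

```python
from bs4 import BeautifulSoup  # assumption: bs4 rather than BeautifulSoup 3

# Hypothetical input; the second link carries an extra rel attribute.
text = (
    '<p>this if <a href="http://www.foo.com">foo</a></p>'
    '<p>this if <a href="http://www.bar.com" rel="nofollow">bar</a></p>'
)

soup = BeautifulSoup(text, "html.parser")
links_dict = {}
count = 1

# href=True skips anchors that have no href attribute at all.
for link_tag in soup.find_all("a", href=True):
    if not link_tag["href"]:
        continue  # skip empty hrefs, as the length check in the answer does
    links_dict[count] = link_tag["href"]
    # Appending a string adds a text node to the existing tag, so its
    # attributes (href, rel, ...) are left untouched.
    link_tag.append(str(count))
    count += 1

print(soup)
print(links_dict)
```

Since no new Tag is built, there is nothing to copy over, so a rel="nofollow" on the original link survives as-is.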
