Python BeautifulSoup: Automatically tracking content table rows and columns?

Here's how I'd tackle it:

    from BeautifulSoup import BeautifulSoup

    # no whitespace between the tags, so .contents holds only elements
    doc = '''<table><tr><td>Item</td><td>Description</td><td>Part No.</td><td>Color</td></tr><tr><td>Toaster</td><td>2-Slice</td><td>#25713</td><td>Chrome</td></tr></table>'''

    soup = BeautifulSoup(doc)
    # find the table element in the HTML document
    table = soup.find("table")
    # grab the top row
    firstRow = table.contents[0]
    # find how many columns there are
    numberOfColumns = len(firstRow.contents)
    restOfRows = table.contents[1:]
    for row in restOfRows:
        for x in range(0, numberOfColumns):
            print "column data: %s" % row.contents[x].string

That will extract the table element from any document, then find the number of columns based on the first row. Finally, it loops through the rest of the rows, printing out the data in each row.

Useful link to the BS docs: http://www.crummy.com/software/BeautifulSoup/documentation.html
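For anyone on Python 3, here is a rough port of the same idea. It is only a sketch: it assumes the bs4 package (BeautifulSoup 4) is installed, the sample markup is an assumed reconstruction of a simple two-row table, and table_rows is a hypothetical helper name. It asks for the tr and td/th elements by name instead of indexing .contents, so whitespace between tags does not get in the way.

```python
# Assumption: Python 3 with BeautifulSoup 4 installed (pip install beautifulsoup4).
from bs4 import BeautifulSoup

# Assumed sample markup: one header row and one data row.
doc = """
<table>
  <tr><td>Item</td><td>Description</td><td>Part No.</td><td>Color</td></tr>
  <tr><td>Toaster</td><td>2-Slice</td><td>#25713</td><td>Chrome</td></tr>
</table>
"""


def table_rows(html):
    """Return the first table in html as a list of rows of cell text."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        return []
    rows = []
    for tr in table.find_all("tr"):
        # Header cells (th) and data cells (td) are treated alike.
        cells = tr.find_all(["td", "th"])
        rows.append([cell.get_text(strip=True) for cell in cells])
    return rows


rows = table_rows(doc)
header, body = rows[0], rows[1:]
# The column count comes from the first row, as in the answer above.
number_of_columns = len(header)

for row in body:
    for value in row[:number_of_columns]:
        print("column data: %s" % value)
```

Note that find_all("tr") also picks up rows of nested tables, so this only suits flat tables like the sample.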

When I run this code I get the same error I keep getting with my own code: AttributeError: 'NavigableString' object has no attribute 'contents', raised on the line numberOfColumns = len(firstRow.contents). I am using Python 2.6.6, if that matters. – a25bedc5-3d09-41b8-82fb-ea6c353d75ae Apr 7 at 16:10

That error suggests that the firstRow variable is just a text node. Print out firstRow and see what gets returned. – Shakakai Apr 8 at 1:35
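To illustrate the text-node point, here is a small sketch of what is probably happening. It assumes Python 3 with bs4 (BeautifulSoup 4) installed, and the markup is a made-up two-column example. When there is a newline right after the opening table tag, the first child of the table is that newline, a NavigableString, rather than a row.

```python
# Assumption: Python 3 with BeautifulSoup 4 installed (pip install beautifulsoup4).
from bs4 import BeautifulSoup, NavigableString, Tag

# Made-up markup with a newline between <table> and the first <tr>.
html = "<table>\n<tr><td>Item</td><td>Color</td></tr>\n</table>"
table = BeautifulSoup(html, "html.parser").find("table")

first_child = table.contents[0]
print(repr(first_child))  # the whitespace text node that follows <table>
is_text_node = isinstance(first_child, NavigableString)

# Keep only element children, skipping whitespace text nodes.
element_children = [child for child in table.contents if isinstance(child, Tag)]
first_row = element_children[0]
number_of_columns = len(first_row.find_all(["td", "th"]))
print(first_row.name, number_of_columns)
```

Filtering on Tag (or simply calling find_all("tr")) avoids indexing into a text node by accident.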

Here is how you do it with HTQL:

    import htql

    doc = ''' Item Description Part No. Color Toaster 2-Slice #25713 Chrome '''
    query = "..{item=&tx; desc=&tx | item'Item'}"

    for item, desc in htql.HTQL(doc, query):
        print(item, desc)

Grab the table (my table is always contained in the only div on the page). For each row, record the row and column index when I find one of the desired field names. For each column, grab the text value.
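The steps above could be sketched roughly like this. It assumes Python 3 with bs4 (BeautifulSoup 4) installed; the markup, the WANTED field names, and the extract_fields helper are all hypothetical, made up for illustration.

```python
# Assumption: Python 3 with BeautifulSoup 4 installed (pip install beautifulsoup4).
from bs4 import BeautifulSoup

# Hypothetical page: a single div containing the table.
html = """
<div>
  <table>
    <tr><td>Item</td><td>Description</td><td>Part No.</td><td>Color</td></tr>
    <tr><td>Toaster</td><td>2-Slice</td><td>#25713</td><td>Chrome</td></tr>
  </table>
</div>
"""

WANTED = {"Item", "Part No."}  # hypothetical field names to look for


def extract_fields(html, wanted):
    """Locate each wanted field name and collect the values below it."""
    soup = BeautifulSoup(html, "html.parser")
    # Step 1: grab the table inside the only div on the page.
    table = soup.find("div").find("table")
    grid = [
        [cell.get_text(strip=True) for cell in tr.find_all(["td", "th"])]
        for tr in table.find_all("tr")
    ]

    # Step 2: record (row index, column index) of the first match per name.
    positions = {}
    for r, row in enumerate(grid):
        for c, text in enumerate(row):
            if text in wanted and text not in positions:
                positions[text] = (r, c)

    # Step 3: for each located column, grab the text of the rows below it.
    values = {}
    for name, (r, c) in positions.items():
        values[name] = [row[c] for row in grid[r + 1:] if c < len(row)]
    return positions, values


positions, values = extract_fields(html, WANTED)
print(positions)
print(values)
```

Tracking the indices this way means the field names do not have to sit in the first row or in a fixed column order.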

