This should be pretty straightforward if you have a chunk of HTML to parse with BeautifulSoup. The general idea is to navigate to your table using the findChildren method, then get the text value inside each cell with the string property.

    >>> from BeautifulSoup import BeautifulSoup
    >>>
    >>> html = """
    ... <html>
    ... <body>
    ...     <table>
    ...         <th><td>column 1</td><td>column 2</td></th>
    ...         <tr><td>value 1</td><td>value 2</td></tr>
    ...     </table>
    ... </body>
    ... </html>
    ... """
    >>>
    >>> soup = BeautifulSoup(html)
    >>> tables = soup.findChildren('table')
    >>>
    >>> # This will get the first (and only) table. Your page may have more.
    >>> my_table = tables[0]
    >>>
    >>> # You can find children with multiple tags by passing a list of strings
    >>> rows = my_table.findChildren(['th', 'tr'])
    >>>
    >>> for row in rows:
    ...     cells = row.findChildren('td')
    ...     for cell in cells:
    ...         value = cell.string
    ...         print "The value in this cell is %s" % value
    ...
    The value in this cell is column 1
    The value in this cell is column 2
    The value in this cell is value 1
    The value in this cell is value 2
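The session above is written for Python 2 and the old BeautifulSoup 3 import. As a rough sketch of the same walk on Python 3, assuming the bs4 package (beautifulsoup4) is installed, something like the following should work. The sample markup here uses conventional `<tr><th>` header cells rather than the markup from the answer, and `table_rows` with its `table_index` parameter are names made up for this illustration, not part of the original answer or of the library.

```python
from bs4 import BeautifulSoup

# Sample markup, invented for this sketch.
HTML = """
<html>
<body>
    <table>
        <tr><th>column 1</th><th>column 2</th></tr>
        <tr><td>value 1</td><td>value 2</td></tr>
    </table>
</body>
</html>
"""


def table_rows(html, table_index=0):
    """Return the rows of one table as lists of cell text.

    table_index is 0-based and counts <table> tags in document order.
    """
    # "html.parser" is the parser bundled with Python; no extra install needed.
    soup = BeautifulSoup(html, "html.parser")
    tables = soup.find_all("table")
    table = tables[table_index]  # raises IndexError if there is no such table

    rows = []
    for tr in table.find_all("tr"):
        # A list of names matches any of them, so header and data cells both count.
        cells = tr.find_all(["th", "td"])
        rows.append([cell.get_text(strip=True) for cell in cells])
    return rows


if __name__ == "__main__":
    for row in table_rows(HTML):
        for value in row:
            print("The value in this cell is %s" % value)
```

Returning plain lists of strings keeps the parsing step separate from whatever you do with the values afterwards. Note that `find_all("tr")` searches all descendants, so rows of any table nested inside the chosen one would be included too.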
That was the trick! Code worked and I should be able to modify it as needed. Many thanks.
One last question. I can follow the code except for when you search the table for children th and tr. Is that simply searching my table and returning both the table header and table rows?
If I only wanted the table rows, could I simply search for tr only? Many thanks again! – Btibert3 Jan 6 '10 at 2:19

Yes, .findChildren(['th', 'tr']) searches for elements with tag type th or tr. If you just want to find tr elements you would use .findChildren('tr') (note: not a list, just the string). – jgeewax Jan 8 '10 at 22:15
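To make the string-versus-list distinction concrete, here is a small sketch, assuming bs4 on Python 3, where `find_all` is the current spelling of findChildren. The sample markup is invented for illustration. It also shows why passing two separate strings is a trap: the second positional argument is treated as a CSS class filter, not as another tag name.

```python
from bs4 import BeautifulSoup

# Invented sample: one header row and one data row.
HTML = "<table><tr><th>h</th></tr><tr><td>v</td></tr></table>"

soup = BeautifulSoup(HTML, "html.parser")
table = soup.find("table")

# A single string matches exactly that tag name.
rows_only = table.find_all("tr")

# A list matches any of the names, returned in document order.
rows_and_headers = table.find_all(["th", "tr"])
names = [tag.name for tag in rows_and_headers]

# Two positional strings mean "th tags whose class is 'tr'",
# which matches nothing in this sample.
th_with_class_tr = table.find_all("th", "tr")

print(len(rows_only), names, len(th_with_class_tr))
```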
It may be basic, but I really am trying to learn how to read an HTML table. I can read it into OpenOffice and it says that it is Table #11. It seems like BeautifulSoup is the preferred choice, but can anyone provide insight as to how I would grab a particular table and grab all of the rows?
I have taken a look at the documentation for the module, but I still am having trouble getting my head around it. Many of the examples that I have found online appear to do more than I need. Any help you can provide will be greatly appreciated!
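For comparison, and only as a sketch that swaps BeautifulSoup for the standard library's `html.parser`, the "grab one particular table and all of its rows" task could look like the following on Python 3. The class and function names are hypothetical, and it is an assumption that a label such as "Table #11" counts tables from 1 in document order; check that against your page. The sketch expects cells to be closed with explicit `</td>` or `</th>` tags and skips text inside nested tables.

```python
from html.parser import HTMLParser


class TableRowCollector(HTMLParser):
    """Collect the rows of the n-th <table> (1-based) as lists of cell text."""

    def __init__(self, table_number):
        super().__init__()
        self.table_number = table_number
        self.tables_seen = 0    # <table> start tags seen so far
        self.in_target = False  # True while inside the requested table
        self.depth = 0          # 1 = target table itself, 2+ = nested tables
        self.rows = []
        self.cell = None        # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables_seen += 1
            if self.in_target:
                self.depth += 1
            elif self.tables_seen == self.table_number:
                self.in_target = True
                self.depth = 1
        elif self.in_target and self.depth == 1:
            if tag == "tr":
                self.rows.append([])
            elif tag in ("td", "th") and self.rows:
                self.cell = []

    def handle_data(self, data):
        # Only keep text that belongs directly to a cell of the target table.
        if self.cell is not None and self.depth == 1:
            self.cell.append(data)

    def handle_endtag(self, tag):
        if not self.in_target:
            return
        if tag == "table":
            self.depth -= 1
            if self.depth == 0:
                self.in_target = False
        elif tag in ("td", "th") and self.cell is not None and self.depth == 1:
            self.rows[-1].append("".join(self.cell).strip())
            self.cell = None


def rows_of_table(html, table_number):
    """Return the rows of the table_number-th table, or [] if it is absent."""
    parser = TableRowCollector(table_number)
    parser.feed(html)
    parser.close()
    return parser.rows


# Invented sample with two tables.
SAMPLE = (
    "<table><tr><td>x</td></tr></table><p>text</p>"
    "<table><tr><th>name</th><th>qty</th></tr>"
    "<tr><td> apple </td><td>3</td></tr></table>"
)

if __name__ == "__main__":
    for row in rows_of_table(SAMPLE, 2):
        print(row)
```

For real-world pages with sloppy markup, BeautifulSoup is likely the more forgiving choice; this version mainly shows what the table, row, and cell walk amounts to.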