Scrape data from HTML pages using Java, output to database?

First, get familiar with an HTML DOM parser for Java, such as JTidy. It cleans up the HTML and gives you a DOM tree from which you can extract the elements you want. Once you have the data you need, you can use JDBC to insert it into the database.
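Here is a rough sketch of those two steps. To keep it runnable without extra dependencies, it parses with the JDK's built-in XML DOM parser instead of JTidy, so it assumes the page is already well-formed XHTML. With JTidy you would obtain the same kind of org.w3c.dom.Document from its Tidy class, and the traversal would stay the same. The class name LinkExtractor, the links table, and its url column are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class LinkExtractor {

    /** Returns the href of every <a> element, in document order. */
    static List<String> extractLinks(String xhtml) {
        Document doc;
        try {
            // Assumption: the input is well-formed XHTML. For messy real-world
            // HTML, a tidying parser such as JTidy would build the Document.
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse page", e);
        }

        List<String> links = new ArrayList<>();
        NodeList anchors = doc.getElementsByTagName("a");
        for (int i = 0; i < anchors.getLength(); i++) {
            Element anchor = (Element) anchors.item(i);
            if (anchor.hasAttribute("href")) {
                links.add(anchor.getAttribute("href"));
            }
        }
        return links;
    }

    /** Inserts each link into a hypothetical table: links(url VARCHAR). */
    static void storeLinks(Connection conn, List<String> links) throws SQLException {
        String sql = "INSERT INTO links (url) VALUES (?)";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            for (String link : links) {
                stmt.setString(1, link);
                stmt.addBatch(); // queue the row; nothing is sent yet
            }
            stmt.executeBatch(); // send all queued rows in one go
        }
    }
}
```

You would get the Connection from DriverManager.getConnection with whatever JDBC URL and driver your database uses. The PreparedStatement keeps scraped text from being interpreted as SQL.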

I successfully used the Lobo browser API in a project that scraped HTML pages. The Lobo project ships a full browser, but you can also use the API behind it on its own quite easily. It executes JavaScript too, so if a script manipulates the DOM, those changes show up in the DOM you inspect.

So, in short, the API allows you to mimic a browser, including working with cookies and the like.

I can't really give you an answer, but I can point you toward one: find the angle that you relate to or that piques your interest. A good paper is one that draws people in because it reaches them in some way. As for me, when I think of WWII, I think of the Holocaust and the effect it had on the survivors, their families, and those who stood by and did nothing until it was too late.
