The easiest and crudest way to do this would be to extract the top N terms (keywords) from each page. This could be as simple as taking the top N terms by frequency, excluding stop words such as 'a', 'the' and 'an' in English. This gives you a feature set for each page. Then compare the top terms between pages for overlaps.
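As a rough illustration of the crude approach, here is a minimal sketch in Python. The tiny stop-word list, the function names `top_terms` and `keyword_overlap`, and the choice of Jaccard overlap as the comparison measure are all assumptions made for the example, not something prescribed above.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list (assumption); a real list would be much longer.
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "to", "in",
              "is", "are", "for", "on", "with", "at"}

def top_terms(text, n=10):
    """Return the set of the n most frequent non-stop-word terms in text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return {term for term, _ in counts.most_common(n)}

def keyword_overlap(text_a, text_b, n=10):
    """Jaccard overlap (0.0 to 1.0) between the top-n term sets of two pages."""
    a, b = top_terms(text_a, n), top_terms(text_b, n)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

page_a = "running shoes and running sneakers for the trail"
page_b = "trail running shoes are great"
print(keyword_overlap(page_a, page_b))
```

A score near 1.0 suggests the pages share most of their top terms; where to put the "related" threshold is something you would have to tune on your own pages.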
You could use WordNet to match synonyms of your terms, e.g. 'sneakers' and 'trainers'. If you have some degree of keyword overlap, then the pages are in some way related.
EDIT: A better way to derive a feature set of keywords for each page would be to extract the statistically significant words for each page.
You can do this by acquiring or compiling a list of n-grams (1, 2 and 3 words long) from a reference text (e.g. Wikipedia), then computing the n-grams for the words and phrases on your page and comparing their frequency of occurrence with that of the same n-grams in the reference set. If n-grams on your page occur more frequently than you would expect given the reference corpus, then they are likely to be statistically significant for that page. The hard part is acquiring or compiling the reference n-gram set that you need to compare against the n-grams on your web pages; it needs to be big enough to be statistically viable.
You can acquire Google's n-gram corpus, or possibly build your own from freely downloadable sources such as Wikipedia. Others may have published a freely available n-gram set if you search around.
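The comparison against a reference corpus could be sketched roughly as follows. This is only an illustration: the function names, the `min_ratio` and `min_count` thresholds, and the add-one smoothing are assumptions, and a plain frequency ratio stands in for a proper significance test (a log-likelihood or chi-squared test would be more rigorous). The reference counts here would come from whatever n-gram set you managed to acquire.

```python
import re
from collections import Counter

def ngrams(text, max_n=3):
    """Yield every 1- to max_n-word n-gram in text as a space-joined string."""
    words = re.findall(r"[a-z']+", text.lower())
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])

def significant_ngrams(page_text, reference_counts, reference_total,
                       min_ratio=5.0, min_count=2):
    """Return (ngram, ratio) pairs for n-grams that are relatively at least
    min_ratio times more frequent on the page than in the reference corpus."""
    page_counts = Counter(ngrams(page_text))
    page_total = sum(page_counts.values())
    results = []
    for gram, count in page_counts.items():
        if count < min_count:
            continue  # too rare on the page to say anything about
        page_freq = count / page_total
        # Add-one smoothing so n-grams unseen in the reference do not divide by zero.
        ref_freq = (reference_counts.get(gram, 0) + 1) / (reference_total + 1)
        ratio = page_freq / ref_freq
        if ratio >= min_ratio:
            results.append((gram, ratio))
    # Most over-represented n-grams first.
    return sorted(results, key=lambda pair: pair[1], reverse=True)

# Toy reference corpus for demonstration only; a real one must be far larger.
reference_text = "the cat sat on the mat and the dog sat on the rug " * 50
reference_counts = Counter(ngrams(reference_text))
reference_total = sum(reference_counts.values())

page = "the quantum computer beat the classical computer"
for gram, ratio in significant_ngrams(page, reference_counts, reference_total):
    print(gram, round(ratio, 1))
```

Common words such as "the" appear often on the page but just as often in the reference, so their ratio stays near 1 and they drop out, which is the point of comparing against a reference corpus rather than using raw frequency.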
I haven't used it much personally, but I've heard that the NLTK (Natural Language Toolkit) library can be a great help for these kinds of language analysis tasks. They have a lot of nice documentation and tutorials online, in addition to plenty of language corpora and other datasets to get you started.
I can't really give you an answer, but what I can give you is a way to a solution: you have to find the angle that you relate to or that piques your interest. A good paper is one that people get drawn into because it reaches them in some way. As for me, when I think of WWII, I think of the Holocaust and the effect it had on the survivors, their families, and those who stood by and did nothing until it was too late.