Constituents

It sounds like you want to identify the sentence's constituents, which are groups of words that operate as a single unit according to the grammar of a language. In fact, when linguists are trying to discover a language's grammar, they do it in part by looking at movement. Movement, as in your example, is when a group of words can be moved to a different position in a sentence while still preserving the meaning of the sentence.

Constituents can be individual words, phrases, or even larger groups such as whole clauses. Within a sentence, they have a nested hierarchical structure. For instance, the first example sentence you gave could be analyzed as:

(S (PP (IN On) (NP (NNP March) (CD 1))) (NP (PRP he)) (VP (VBD was) (VP (VBN born))))

The whole sentence is made up of a prepositional phrase, followed by a noun phrase, and then a verb phrase.
The prepositional phrase can be further decomposed into a unit consisting of the single word 'On' followed by a noun phrase.

Phrase Structure Parsers

To find constituents automatically, you will probably want to use a phrase structure parser. There are many such parsers to choose from that are available as open source, including:

Stanford Parser (Java)
Berkeley Parser (Java)
Charniak Parser (C++)
Bikel Parser (a reimplemented and improved version of the Collins parser, written in Java)
Collins Parser (C++)
OpenNLP Parser (Java)
SharpNLP Parser (C#)

The Stanford and Berkeley parsers are probably the easiest to install and use. According to Cer et al. 2010, the most accurate parsers are Berkeley and Charniak. The Bikel parser is slower and less accurate than the others.

Online Demo

There's an online demo for the Stanford parser here.
I used the demo to produce the parse given above of your example sentence.

A Note About Deletion

Within each constituent, there will be a head word. For example, take the noun phrase:

(NP (DT The) (JJ big) (JJ blue) (NN ball))

The head word here is the noun 'ball', and it is modified by the adjectives 'big' and 'blue'. If this noun phrase were embedded in a sentence, you could delete those modifiers and still have something that is consistent with, but less specific than, the meaning of the original sentence.

Within noun phrases, you can generally delete the adjectives, nouns that are not the head, and nested prepositional phrases. Within verb phrases and complete clauses, things get trickier, since deleting material that serves as an argument to the verb can completely change the interpretation of a sentence. For example, deleting 'the book' from 'He sold Jim the book' results in 'He sold Jim', where Jim is no longer the recipient but the thing being sold.
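
To make the bracketed notation and the deletion idea concrete, here is a minimal sketch in plain Python. It is not how any of the parsers listed above work internally; it only reads a parse that a parser has already produced. The function names (parse_brackets, words, constituents, drop_adjectives) are made up for this illustration, and the deletion rule covers only the simplest case mentioned above: adjectives (JJ) directly inside a noun phrase (NP).

```python
def parse_brackets(text):
    """Read a bracketed parse into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())       # nested constituent
            else:
                children.append(tokens[pos])  # leaf word
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return read()


def words(node):
    """Return the words covered by a node, in order."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1] for w in words(child)]


def constituents(node):
    """Yield (label, text) for every bracketed unit, outermost first."""
    if isinstance(node, str):
        return
    yield node[0], " ".join(words(node))
    for child in node[1]:
        yield from constituents(child)


def drop_adjectives(node):
    """Remove JJ children of NP nodes; everything else is kept as is."""
    if isinstance(node, str):
        return node
    label, children = node
    kept = [
        drop_adjectives(child)
        for child in children
        if not (label == "NP" and not isinstance(child, str) and child[0] == "JJ")
    ]
    return (label, kept)


sentence = ("(S (PP (IN On) (NP (NNP March) (CD 1))) (NP (PRP he)) "
            "(VP (VBD was) (VP (VBN born))))")
for label, text in constituents(parse_brackets(sentence)):
    print(label, "->", text)

noun_phrase = parse_brackets("(NP (DT The) (JJ big) (JJ blue) (NN ball))")
print(" ".join(words(drop_adjectives(noun_phrase))))  # The ball
```

Deleting verb arguments safely would need more than a label check like this, for the reason given in the 'He sold Jim' example.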
OpenNLP may do some of this for you. Phrase chunking and parsing should help you with this. However, this is not a particularly simple problem, and algorithms will tend to get confused as sentence structure becomes more complex and ambiguous.
You should sometimes be able to reorder phrases within a sentence and maintain meaning.
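
As a rough illustration of reordering, the sketch below moves a sentence-initial prepositional phrase to the end of the sentence. It assumes the top-level phrases are already available as (label, text) pairs, which is a made-up format standing in for whatever a chunker or parser returns, and the function names are hypothetical. It does not fix capitalization or punctuation, and it makes no attempt to check that the meaning is actually preserved.

```python
def move_initial_pp_to_end(chunks):
    """Move a sentence-initial prepositional phrase (PP) to the end.

    `chunks` is a list of (label, text) pairs for the top-level phrases,
    an assumed format standing in for a parser's or chunker's output.
    """
    if len(chunks) > 1 and chunks[0][0] == "PP":
        return chunks[1:] + chunks[:1]
    return list(chunks)


def to_text(chunks):
    """Join the phrase texts back into a single string."""
    return " ".join(text for _, text in chunks)


original = [("PP", "On March 1"), ("NP", "he"), ("VP", "was born")]
print(to_text(move_initial_pp_to_end(original)))  # he was born On March 1
```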
I can't really give you an answer, but what I can give you is a way to a solution: you have to find the angle that you relate to or that piques your interest. A good paper is one that people get drawn into because it reaches them in some way. As for me, when I think of WWII, I think of the Holocaust and the effect it had on the survivors, their families, and those who stood by and did nothing until it was too late.