How do I create my own training corpus for stanford tagger?


To train the PoS tagger, see this mailing list post, which is also included in the JavaDocs for the MaxentTagger class. The JavaDocs for the edu.stanford.nlp.tagger.maxent.Train class specify the training format: the training file should contain one word and one tag per line, separated by a space or a tab.

Each sentence should end in an EOS word-tag pair. (Actually, I'm not entirely sure that is still the case, but it probably won't hurt. -wmorgan).
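As a rough illustration of the format described above, here is a minimal Python sketch that turns tagged sentences into one word-tag pair per line. The function names, the sample sentence, and the exact form of the EOS pair ("EOS" as both word and tag) are assumptions made for illustration, not something the JavaDocs quoted here confirm. Since it is unclear whether the EOS pair is still required, it can be switched off.

```python
from typing import Iterable, List, Optional, Tuple

Sentence = List[Tuple[str, str]]


def to_training_lines(
    sentences: Iterable[Sentence],
    sep: str = "\t",
    eos_pair: Optional[Tuple[str, str]] = ("EOS", "EOS"),  # assumed form of the EOS pair
) -> List[str]:
    """Return one 'word<sep>tag' line per token, with an optional EOS pair after each sentence."""
    if sep not in (" ", "\t"):
        raise ValueError("separator must be a single space or a tab")
    lines: List[str] = []
    for sentence in sentences:
        for word, tag in sentence:
            # Whitespace inside a token would make the line ambiguous.
            for item in (word, tag):
                if not item or any(ch.isspace() for ch in item):
                    raise ValueError(f"invalid word or tag: {item!r}")
            lines.append(f"{word}{sep}{tag}")
        if eos_pair is not None:
            lines.append(f"{eos_pair[0]}{sep}{eos_pair[1]}")
    return lines


def write_training_file(sentences: Iterable[Sentence], path: str, **kwargs) -> None:
    """Write the training lines to a UTF-8 text file, one pair per line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(to_training_lines(sentences, **kwargs)) + "\n")


if __name__ == "__main__":
    # Hypothetical sample data using Penn Treebank style tags.
    sample = [[("The", "DT"), ("cat", "NN"), ("sleeps", "VBZ"), (".", ".")]]
    for line in to_training_lines(sample):
        print(line)
```

Passing eos_pair=None leaves out the sentence-final pair, which makes it easy to try both variants against the tagger version you have.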


For the Stanford Parser, use Penn Treebank format, and see Stanford's FAQ for the exact commands to use. The JavaDocs for the LexicalizedParser class also give appropriate commands, particularly:

    java -mx1500m edu.stanford.nlp.parser.lexparser.LexicalizedParser -v \
        -train trainFilesPath fileRange -saveToSerializedFile serializedGrammarFilename
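If you want to launch the parser training from a script, the command can be assembled as an argument list. The sketch below uses only the flags from the command quoted in this answer. The function name and the sample values for the training path, file range, and output file are hypothetical placeholders. It also assumes the Stanford Parser classes are already on the Java classpath, which the command quoted here does not show.

```python
import shlex
from typing import List


def build_parser_train_command(
    train_files_path: str,
    file_range: str,
    serialized_grammar: str,
    max_heap: str = "1500m",
) -> List[str]:
    """Assemble the LexicalizedParser training command as an argument list."""
    return [
        "java",
        f"-mx{max_heap}",
        "edu.stanford.nlp.parser.lexparser.LexicalizedParser",
        "-v",
        "-train",
        train_files_path,
        file_range,
        "-saveToSerializedFile",
        serialized_grammar,
    ]


if __name__ == "__main__":
    # Placeholder arguments; substitute your own treebank path, range and output name.
    command = build_parser_train_command("trainFilesPath", "fileRange", "serializedGrammarFilename")
    print(" ".join(shlex.quote(part) for part in command))
    # To actually run it: subprocess.run(command, check=True)
```

Building a list instead of a single string avoids shell quoting problems when a path contains spaces.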

