Pure statistical, or Natural Language Processing engine?

LingPipe is probably worth a look as a complete NLP tool. However, if all you need to do is find verbs and nouns and stem them, you could just 1) tokenize the text, 2) run a POS tagger, and 3) run a stemmer. The Stanford tools can do this for multiple languages, I believe, and NLTK would be a quick way to try it out. Be careful about going after only bare verbs and nouns, though: what do you do about noun phrases and multiword nouns?

Ideally an NLP package can handle this, but a lot depends on the domain you are working in. Unfortunately, much of NLP comes down to how good your data is.

You're probably looking for the Snowball project, which has developed stemmers for a number of different languages.

If you're looking for Java code, I can recommend Stanford's set of tools. Their POS tagger works for English, German, Chinese, and Arabic (though I have only used it for English) and includes an English-only lemmatizer. The tools are all free, accuracy is pretty high, and the speed is not too bad for a Java-based solution; the main drawbacks are occasionally flaky APIs and high memory use.

I had a good experience with TreeTagger: ims.uni-stuttgart.de/projekte/corplex/Tr... It's easy to use, faster than Stanford's tagger, and is among the better stemmers/taggers out there. It does everything in one pass: tokenization, stemming, and tagging.

Interesting, but it has a commercial license. I was hoping for something free. – Inge Henriksen Jul 10 at 16:07.
