How to enable stemming when searching using lucene.net?

To do this you need to write your own analyzer class. This is relatively straightforward. Here is the one that I am using. It combines stop word filtering, Porter stemming and (this may be too much for your needs) stripping of accents from characters.

    using System.Collections;
    using Lucene.Net.Analysis;

    /// <summary>
    /// An analyzer that implements a number of filters. Including porter stemming,
    /// Diacritic stripping, and stop word filtering.
    /// </summary>
    public class CustomAnalyzer : Analyzer
    {
        /// <summary>
        /// A rather short list of stop words that is fine for basic search use.
        /// </summary>
        private static readonly string[] stopWords = new[]
        {
            "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "000", "$", "£",
            "about", "after", "all", "also", "an", "and", "another", "any", "are", "as", "at",
            "be", "because", "been", "before", "being", "between", "both", "but", "by",
            "came", "can", "come", "could", "did", "do", "does", "each", "else",
            "for", "from", "get", "got", "has", "had", "he", "have", "her", "here",
            "him", "himself", "his", "how", "if", "in", "into", "is", "it", "its", "just",
            "like", "make", "many", "me", "might", "more", "most", "much", "must", "my",
            "never", "now", "of", "on", "only", "or", "other", "our", "out", "over",
            "re", "said", "same", "see", "should", "since", "so", "some", "still", "such",
            "take", "than", "that", "the", "their", "them", "then", "there", "these",
            "they", "this", "those", "through", "to", "too", "under", "up", "use", "very",
            "want", "was", "way", "we", "well", "were", "what", "when", "where", "which",
            "while", "who", "will", "with", "would", "you", "your",
            "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m",
            "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"
        };

        // Stop word set built once per analyzer instance.
        private Hashtable stopTable;

        /// <summary>
        /// Creates an analyzer with the default stop word list.
        /// </summary>
        public CustomAnalyzer() : this(stopWords) {}

        /// <summary>
        /// Creates an analyzer with the passed in stop words list.
        /// </summary>
        public CustomAnalyzer(string[] stopWords)
        {
            stopTable = StopFilter.MakeStopSet(stopWords);
        }

        public override TokenStream TokenStream(string fieldName, System.IO.TextReader reader)
        {
            // Lowercase and tokenize, drop stop words, strip accents, then stem.
            // Uses stopTable so that a list passed to the constructor takes effect.
            return new PorterStemFilter(
                new ISOLatin1AccentFilter(
                    new StopFilter(new LowerCaseTokenizer(reader), stopTable)));
        }
    }
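To check what the analyzer does to a piece of text, you can print the tokens it produces. This is only a sketch, and it assumes the older Lucene.Net 2.x API that the answer appears to target, where `TokenStream.Next()` returns a `Token` (or null at the end) and `Token.TermText()` returns its text. Later releases replaced these calls. The class name `AnalyzerDemo`, the field name "body" and the sample sentence are made up for illustration.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis;

public static class AnalyzerDemo
{
    public static void Main()
    {
        // CustomAnalyzer is the class from the answer above.
        Analyzer analyzer = new CustomAnalyzer();

        // "body" is a hypothetical field name; this analyzer ignores it.
        TokenStream tokens = analyzer.TokenStream(
            "body", new StringReader("The cafés were searching"));

        // Assumed 2.x-style iteration: Next() returns null when exhausted.
        Token token;
        while ((token = tokens.Next()) != null)
        {
            Console.WriteLine(token.TermText());
        }
    }
}
```

Stop words should be absent from the printed terms, and the remaining terms should appear lowercased, unaccented and stemmed.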

Thanks, I will try this. – lucene user Jul 31 '09 at 3:53

+1 thanks Jack, just what I was looking for. If I could I'd mark this as the answer! – andy Mar 13 '11 at 22:41

You can use Snowball or PorterStemFilter. See the Java Analyzer documentation as a guide to combining different Filters/Tokenizers/Analyzers. Note that you have to use the same analyzer for indexing and retrieval, so stemming has to be applied at indexing time as well, not only when searching.
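As a rough illustration of using the same analyzer on both sides, the sketch below passes one analyzer instance to both the index writer and the query parser. It assumes the Lucene.Net 2.x-era API (`IndexWriter(directory, analyzer, create)`, `QueryParser(field, analyzer)`, `Hits`, `Field.Index.TOKENIZED`); newer versions changed these signatures. `StemmingAnalyzer`, `StemmingDemo`, the "body" field and the sample text are hypothetical names used only here.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;

// Hypothetical minimal analyzer: lowercases, then applies Porter stemming.
public class StemmingAnalyzer : Analyzer
{
    public override TokenStream TokenStream(string fieldName, TextReader reader)
    {
        return new PorterStemFilter(new LowerCaseTokenizer(reader));
    }
}

public static class StemmingDemo
{
    public static void Main()
    {
        // One analyzer instance shared by indexing and searching.
        Analyzer analyzer = new StemmingAnalyzer();
        RAMDirectory directory = new RAMDirectory();

        // Index time: terms are stemmed before they are written to the index.
        IndexWriter writer = new IndexWriter(directory, analyzer, true);
        Document doc = new Document();
        doc.Add(new Field("body", "Searching with stemming",
                          Field.Store.YES, Field.Index.TOKENIZED));
        writer.AddDocument(doc);
        writer.Close();

        // Query time: the same analyzer stems the query terms the same way.
        QueryParser parser = new QueryParser("body", analyzer);
        Query query = parser.Parse("searched");

        IndexSearcher searcher = new IndexSearcher(directory);
        Hits hits = searcher.Search(query);
        Console.WriteLine("Matches: " + hits.Length());
        searcher.Close();
    }
}
```

If the index were built with a non-stemming analyzer instead, the stemmed query term would have no matching term in the index, which is why the analyzer has to be chosen before indexing.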
