Match Regular expression from a dictionary in C?

Dictionary myCollection = new Dictionary(); myCollection. Add("(.*)orange(.*)", "Oranges are a fruit."); myCollection. Add("(.*)apple(.*)", "Apples have pips.

"); myCollection. Add("(.*)dog(.*)", "Dogs are mammals. "); // ... string input = "tell me about apples and oranges"; var results = from result in myCollection where Regex.

Match(input, result. Key, RegexOptions. Singleline).

Success select result; foreach (var result in results) { Console. WriteLine(result. Value); } // OUTPUT: // // Oranges are a fruit.

// Apples have pips.

I am going to start with this solution, so far its running pretty fast with a dictionary of about 500 items. If it gets worse, I will look into other alternatives. Thanks!

– Chris Kooken Sep 11 '09 at 13:57.

I'm not sure if you actually need regular expressions for this - you could use a trie. Representing dictionaries is a common application for a trie. (I'm assuming you mean a dictionary as in a list of words, and not the "associative array" meaning).

If it IS a regex and you want to find it in the list, you don't need a Dictionary, those are 2 part containers. You could just use a List or StringCollection, and ask for IndexOf(mytString), -1 meaning it's not in there.

If your regexps aren't trivial single-strings, and you care for efficiency, you'd want to represent them in a single NFA (nondeterministic finite-state automaton, with values in final states. If it's possible for an input to match more than one regexp, then final states would need a set of values. At this point, you're prepared to consider optimizing the automaton.

If it can be practically determinized (this give you a DFA that can be exponentially larger than the NFA), then by all means do that. Once you have a DFA, you can efficiently (and uniquely up to isomorphism) minimize it (but since you have values in your final states, an obvious modification of the usual algorithm is needed). There are also techniques for minimizing NFA directly.

For example, if two states have the same suffix sets ({(rest of string,value)}) they're equivalent and can be combined. Equivalence in an acyclic NFA can be done via hash-consing starting from the final states.

Remember that if you are planning on using a regex more than once you can create a regex object as compiled and re-use it to reduce overhead. Regex RegexObject = new Regex(Pattern, RegexOptions. Compiled); Using this model you would be best storing a regex object rather than the pattern string.

I cant really gove you an answer,but what I can give you is a way to a solution, that is you have to find the anglde that you relate to or peaks your interest. A good paper is one that people get drawn into because it reaches them ln some way.As for me WW11 to me, I think of the holocaust and the effect it had on the survivors, their families and those who stood by and did nothing until it was too late.

Related Questions