Getting to grips with regex in Perl?

Well, the parts you have aren't quite correct. Instead of [0-30-9] I think you mean [0-3][0-9], and similarly for the other numbers. However, it usually suffices to be a little looser and just use \d, which (for plain ASCII text) is equivalent to [0-9].

You string the parts together one after the other:

    /\d\d (MONTH) \d\d\d\d at \d\d:\d\d/

Which can be written more succinctly as:

    /\d\d (MONTH) \d{4} at \d\d:\d\d/

Or, if you really need it to be stricter, as in your formulation:

    /[0-3]\d (MONTH) [0-2]\d{3} at \d\d:\d\d/

I've left the month bit for last, since it is the more complicated bit. Again, you can be loose or strict. Loosely:

    /[0-3]\d [A-Za-z]+ [0-2]\d{3} at \d\d:\d\d/

For a strict match we can use an alternation: each alternative is separated by a '|' and the list of choices is enclosed in parentheses (although beware, parentheses also have another meaning, since they capture what they match; don't worry, it won't interfere in this case):

    /[0-3]\d (January|February|March|April|May|June|July|August|September|October|November|December) [0-2]\d{3} at \d\d:\d\d/

Finally, if the day is not 0-padded (meaning the 1st is just '1' rather than '01') then you need to make the leading digit optional:

    /[0-3]?\d (January|February|March|April|May|June|July|August|September|October|November|December) [0-2]\d{3} at \d\d:\d\d/

Crib sheet

    []   are used to create a character class, a set of matching characters
    \d   is a built-in character class equivalent to [0-9]
    ()   are used to create a group, useful for delimiting an alternation (amongst other things)
    |    is used to create alternation, a list of alternative character sequences that should be matched
    {n}  is a quantifier, saying exactly 'n' of the preceding character, character class or group should be matched
    +    is a quantifier, saying 1 or more of the preceding character, character class or group should be matched
    ?    is a quantifier, saying 0 or 1 of the preceding character, character class or group should be matched
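To see the strict pattern in action, here is a minimal sketch that builds it from a list of month names and tries it on a few sample strings. The helper name looks_like_date, the sample strings, and the ^ and $ anchors are assumptions added for illustration; they are not part of the answer above.

```perl
use strict;
use warnings;

# Month names joined into an alternation: January|February|...|December
my @months = qw(January February March April May June July
                August September October November December);
my $month_alt = join '|', @months;

# Strict pattern from the answer. The ^ and $ anchors are an addition here,
# so that the whole string has to match. Under the x flag, whitespace in the
# pattern is ignored, so each literal space is written as "\ ".
my $date_re = qr/
    ^
    [0-3]?\d            # day, leading digit optional
    \ (?:$month_alt)    # full month name
    \ [0-2]\d{3}        # four-digit year, first digit 0-2
    \ at\ \d\d:\d\d     # literal " at " followed by HH:MM
    $
/x;

# Hypothetical helper: returns 1 if the string has the expected shape, else 0.
sub looks_like_date {
    my ($text) = @_;
    return $text =~ $date_re ? 1 : 0;
}

for my $sample ('21 March 2011 at 14:05',
                '1 May 2009 at 09:30',
                '21 Marz 2011 at 14:05') {
    printf "%-26s %s\n", $sample, looks_like_date($sample) ? 'match' : 'no match';
}
```

Note that even this strict version only checks the shape of the text: a day such as 39 or a time such as 29:59 would still be accepted.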

[0-30-9] doesn't do what you think it does. :) [0-3][0-9] is what you're after. Similar steps for each of the other inputs...

    [0-3]?\d (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d\d\d\d at [012]\d:[0-5]\d

The ? is to say the leading digit might not be there. The \d means 'digit', and is sometimes more legible than [0-9]. (foo|bar|baz) is called 'alternation'.

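The time is a problem :) The pattern above is good and simple, but it would match a time like 29:59. Hehe. You could do this better with alternation: (\d|1\d|2[0-3]) for the hour, which is less legible but more correct. (If the hour may be zero-padded, as in 09:30, write the first alternative as 0?\d.)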
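As a rough illustration of the 29:59 problem, the sketch below checks a time on its own with an hour alternation. It assumes hours may be zero-padded (hence 0?\d for the first alternative) and that minutes run from 00 to 59; the name valid_time is made up for this example.

```perl
use strict;
use warnings;

# Hour: 0-9 (optionally zero-padded), 10-19, or 20-23. Minute: 00-59.
# (?: ... ) groups the alternatives without capturing them.
my $time_re = qr/^(?:0?\d|1\d|2[0-3]):[0-5]\d$/;

# Hypothetical helper: returns 1 for a plausible 24-hour time, else 0.
sub valid_time {
    my ($text) = @_;
    return $text =~ $time_re ? 1 : 0;
}

for my $sample ('23:59', '09:30', '9:30', '29:59', '24:00', '12:60') {
    printf "%-6s %s\n", $sample, valid_time($sample) ? 'ok' : 'rejected';
}
```

The looser [012]\d:[0-5]\d would accept 29:59 and 24:00; the alternation rejects both.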

And my advice for a Perl neophyte working with regexps is to start small and build them iteratively. It takes work. :)
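One way to work iteratively, sketched here under my own assumptions, is to keep each part in its own qr// object, try the parts separately, and only then combine them. The variable names and the helper parse_stamp are hypothetical, and the abbreviated month names are just one of the variants discussed above.

```perl
use strict;
use warnings;

# Step 1: small named pieces, each simple enough to check by eye.
my $day   = qr/[0-3]?\d/;
my $month = qr/(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)/;
my $year  = qr/\d{4}/;
my $time  = qr/(?:[01]\d|2[0-3]):[0-5]\d/;

# Step 2: try a piece on its own before combining it with the others.
print "time piece ok\n" if '14:05' =~ /^$time$/;

# Step 3: combine the pieces; the parentheses capture each field.
my $stamp = qr/^($day) ($month) ($year) at ($time)$/;

# Hypothetical helper: returns (day, month, year, time) on success,
# or an empty list (undef in scalar context) if the text does not match.
sub parse_stamp {
    my ($text) = @_;
    if (my @parts = $text =~ $stamp) {
        return @parts;
    }
    return;
}

my @fields = parse_stamp('7 Mar 2011 at 23:59');
print join(' | ', @fields), "\n" if @fields;
```

When a combined pattern fails to match, testing the pieces one at a time usually shows which part is at fault.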

CPAN has some common regexes in the Regexp::Common::* namespace. For your case check out search.cpan.org/perldoc?Regexp::Common::... . Perhaps I should add, since you are so new to Perl, that CPAN is Perl's collection of user-contributed modules for common tasks.

Many things that people want to do have already been done and collected for you. To install a module you can run sudo cpan modulename (assuming you are on Linux; I'm sure you can find instructions for CPAN on Mac and Windows, but I don't know them).
