Regex - Match a Pattern Before a Character?

You need a positive lookahead assertion: ([A-Z]{1,3})(?==).

Thanks for that! – bplus Jun 30 '09 at 19:29.

What you want is called a zero-width lookahead assertion. You do: (Match this and capture)(?=before this). In your case, this would be: ([A-Z^]{1,3})(?==).
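A minimal Python sketch of this kind of lookahead, using the character class [A-Z] and a made-up input string (neither the sample nor the variable names come from the original question). The lookahead only checks that "=" follows; it is not part of the match:

```python
import re

# ([A-Z]{1,3}) captures one to three capitals; (?==) only asserts that "=" comes next.
pattern = re.compile(r"([A-Z]{1,3})(?==)")

sample = "ADM=42"  # hypothetical input, not from the original question
match = pattern.search(sample)

matched_text = match.group(0)  # whole match, without the "="
captured = match.group(1)      # same text, taken from the capture group
print(matched_text, captured)
```

If no "=" follows the letters, search() returns None, so real code should check the result before calling group().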

The following will group everything before the "=" and everything after it: ([^=]*)=([^=]*). It reads something like this: match any number of characters that are not "=", followed by a "=", then any number of characters that are not "=".
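As a rough illustration in Python (the input string is an assumption made up for this sketch), the two groups come back as the left and right sides of the "=":

```python
import re

# Each [^=]* matches any run of characters that are not "=".
pattern = re.compile(r"([^=]*)=([^=]*)")

sample = "D=M+1"  # hypothetical single-line input
left, right = pattern.match(sample).groups()
print(left, right)
```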

I tried your regex at nregex.com/nregex/default.aspx. It didn't seem to work; could be something up with the regex engine that site uses? Anyway, I've marked an answer now, so not to worry. Thanks though. – bplus Jun 30 '09 at 19:30

The problem with this regular expression might be that, if the input is multiline, the second wildcard will match the part after the current equals sign, the newline, and then the characters before the next equals sign. You'd want to add the delimiter character inside the second pair of square brackets. – Conspicuous Compiler Jun 30 '09 at 20:59.
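The multiline problem described in the comment above can be reproduced with a small Python sketch. The two-line input is made up, and adding "\n" to both character classes (rather than only the second) is a choice made here for symmetry:

```python
import re

text = "A=1\nB=2"  # two hypothetical key=value lines

# Without excluding "\n", the second group runs across the line break.
loose = re.compile(r"([^=]*)=([^=]*)")
loose_right = loose.match(text).group(2)

# Adding "\n" to the classes keeps each match on one line.
strict = re.compile(r"([^=\n]*)=([^=\n]*)")
pairs = strict.findall(text)
print(repr(loose_right), pairs)
```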

You can also put the equals sign in non-capturing parentheses, (?: ): ([ADM]{1,3})(?:=). It's been a bit since I did this chapter of the book, but I think that, since you need both parts of the expression anyway, I did a split on the "=", resulting in myArray[0] == M, myArray[1] == A.

The non-capturing parens won't do anything useful. The equals sign will still be "captured" as part of the overall match, which is what the OP was trying to avoid. – Alan Moore Sep 3 '09 at 19:05.
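The difference is easy to see in a short Python sketch (the sample string is hypothetical): a non-capturing group still consumes the "=", a lookahead does not, and a plain split avoids the regex altogether.

```python
import re

sample = "AM=D"  # hypothetical input

# (?:=) does not capture, but the "=" is still consumed by the overall match.
non_capturing = re.search(r"([ADM]{1,3})(?:=)", sample).group(0)

# (?==) only peeks at the "=", so the overall match stops before it.
lookahead = re.search(r"([ADM]{1,3})(?==)", sample).group(0)

# Splitting on "=" gives both sides without any regex.
parts = sample.split("=")
print(non_capturing, lookahead, parts)
```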

You could use a look-ahead assertion: (?!999)\d{3}. This example matches three digits other than 999. But if you happen not to have a regular expression implementation with this feature (see Comparison of Regular Expression Flavors), you probably have to build a regular expression with the basic features on your own. A compatible regular expression with basic syntax only would be: [0-8]\d\d|\d[0-8]\d|\d\d[0-8]. This also matches any three-digit sequence other than 999.
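A quick Python check, offered as a sketch rather than a proof, that the lookahead pattern and the basic-syntax alternation agree on every three-digit string:

```python
import re

lookahead = re.compile(r"(?!999)\d{3}")
basic = re.compile(r"[0-8]\d\d|\d[0-8]\d|\d\d[0-8]")

# Compare both patterns on every three-digit string from "000" to "999".
candidates = [f"{n:03d}" for n in range(1000)]
disagreements = [
    s for s in candidates
    if bool(lookahead.fullmatch(s)) != bool(basic.fullmatch(s))
]
rejected = [s for s in candidates if not lookahead.fullmatch(s)]
print(disagreements, rejected)
```

The alternation works because a three-digit string differs from 999 exactly when at least one of its digits is in the range 0 to 8.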

– Bryan Oakley Mar 4 '09 at 19:45

That's true. But most regex flavors support this feature (see ;). – Gumbo Mar 4 '09 at 19:49

Turns out that the Windows findstr function only supports pure DFA-style regex anyway, so I need to just do it all differently. You still get the answer, though. – notnot Mar 4 '09 at 21:38.

Match against the pattern and use the host language to invert the boolean result of the match. This will be much more legible and maintainable.
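In Python, for instance, inverting the match result could look like the sketch below. The helper name lacks_pattern is made up for illustration:

```python
import re

def lacks_pattern(pattern: str, text: str) -> bool:
    """Return True when `pattern` does not occur anywhere in `text`."""
    return re.search(pattern, text) is None

print(lacks_pattern(r"999", "123"), lacks_pattern(r"999", "x999y"))
```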

Then I just end up with (~A or B) instead of (A and ~B). It doesn't solve my problem. – notnot Mar 4 '09 at 21:06

Pseudo-code: String toTest; if (toTest.Matches(A) AND !toTest.Matches(B)) { ... } – Ben S Mar 4 '09 at 21:54

I should have been more clear: the pieces are not fully independent. If A matches part of the string, then we care if ~B matches the rest of it (but not necessarily the whole thing). This was for the Windows command-line findstr function, which I found is restricted to true regexes, so moot point. – notnot Mar 4 '09 at 22:07.

This seems like a fairly basic question from a Formal Languages or Theoretical Computer Science class. I'm assuming that it is not homework, based on your reputation and previous answers, so I'm answering it. The complement of a regular language is also a regular language, but to construct it you have to build the DFA for the regular language and swap its accepting and non-accepting states, so that every valid end state becomes an error and vice versa.

See this for an example. What the page doesn't say is that it converted /(ac|bd)/ into /(a[^c]?|b[^d]?|[^ab])/. The conversion from a DFA back to a regular expression is not trivial. It is easier if you can use the regular expression unchanged and change the semantics in code, as suggested before.
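To make the DFA idea concrete, here is a small hand-built sketch in Python for the language of /(ac|bd)/ taken as whole-string matches. The state names and the "dead" state are assumptions of this sketch; the complement uses the standard construction of keeping the transitions of a complete DFA and swapping accepting and non-accepting states.

```python
# Complete DFA for the language {"ac", "bd"}; any move not listed goes to "dead".
TRANSITIONS = {
    ("start", "a"): "seen_a",
    ("start", "b"): "seen_b",
    ("seen_a", "c"): "done",
    ("seen_b", "d"): "done",
}
STATES = {"start", "seen_a", "seen_b", "done", "dead"}
ACCEPTING = {"done"}

def run(word: str) -> str:
    """Return the state reached after reading `word`."""
    state = "start"
    for symbol in word:
        state = TRANSITIONS.get((state, symbol), "dead")
    return state

def accepts(word: str) -> bool:
    return run(word) in ACCEPTING

def complement_accepts(word: str) -> bool:
    # Same transitions, with accepting and non-accepting states swapped.
    return run(word) in (STATES - ACCEPTING)

print(accepts("ac"), complement_accepts("ac"), complement_accepts("ab"))
```

Turning the complemented DFA back into a regular expression is the step that is not trivial; this sketch stops before it.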

If I were dealing with actual regexes then this would all be moot. Regex now seems to refer to the nebulous CSG-ish (?) space of pattern matching that most languages support. Since I need to match (A and ~B), there's no way to remove the negation and still do it all in one step. – notnot Mar 4 '09 at 21:48

Lookahead, as described above, would have done it if findstr did anything beyond true DFA regexes. The whole thing is sort of odd and I don't know why I have to do this command-line (batch now) style. It's just another example of my hands being tied. – notnot Mar 4 '09 at 21:53

@notnot: You are using findstr from Windows? Then you just need /v. Like: findstr A inputfile | findstr /v B > outputfile.txt. The first findstr matches all lines with A, the second matches all lines that don't have B. – Juliano Mar 4 '09 at 22:04

Thanks! That's actually exactly what I needed. I didn't ask the question that way, though, so I'm still giving the answer to Gumbo for the more generalized answer. – notnot Mar 4 '09 at 17:16.
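For readers not on Windows, the two-stage findstr pipeline can be mimicked in Python roughly as follows. This is only an analogue: the function name and sample lines are invented, and Python's regex syntax is not the same as findstr's.

```python
import re

def filter_lines(lines, wanted, unwanted):
    """Keep lines that contain `wanted` and do not contain `unwanted`."""
    # First stage, like: findstr A
    has_wanted = [line for line in lines if re.search(wanted, line)]
    # Second stage, like: findstr /v B
    return [line for line in has_wanted if not re.search(unwanted, line)]

sample_lines = ["A only", "A and B", "B only", "neither"]
kept = filter_lines(sample_lines, r"A", r"B")
print(kept)
```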

With the pattern re, str.split(/re/g) will return everything except the pattern. Test here.
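The snippet above is JavaScript. A comparable Python sketch uses re.split; the separator pattern and input below are made up for illustration:

```python
import re

text = "one, two;three"  # hypothetical input

# re.split drops the text matched by the pattern and returns what lies between matches.
pieces = re.split(r"[,;]\s*", text)
print(pieces)
```

Note that in Python, capturing groups inside the split pattern are included in the result, so the pattern here deliberately has none.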

Won't work, defines a character class, and matches a single character, not a subpattern. – Richard Mar 4 '09 at 18:43

You might try using ( ) then – Nerdling Mar 4 '09 at 18:45

Wouldn't work at all. [^XYZ] simply won't match the characters X, Y, or Z. Meaning not only will it not match "XYZ", but also "ZXY", which is not a pattern matched by the regex XYZ. Meaning that this "solution" fails the basic requirements. – Devin Jeanpierre Mar 4 '09 at 18:45

He's right, my bad. I think that's just highlighted a bug in some regex I have! – Andrew Bullock Mar 4 '09 at 21:38

always providing – notnot Mar 4 '09 at 21:39.
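The objection in these comments can be checked with a short Python sketch. The second pattern, a negative lookahead repeated at every position, is one common way to express "does not contain the sequence XYZ"; it is an illustration added here, not something proposed in the thread:

```python
import re

# [^XYZ] matches exactly one character that is not X, Y, or Z.
char_class = re.compile(r"[^XYZ]+")
# (?!XYZ) rejects only the exact sequence "XYZ" at each position.
not_sequence = re.compile(r"(?:(?!XYZ).)+")

class_on_zxy = char_class.fullmatch("ZXY")        # rejected by the character class
lookahead_on_zxy = not_sequence.fullmatch("ZXY")  # accepted: "ZXY" is not "XYZ"
lookahead_on_xyz = not_sequence.fullmatch("XYZ")  # rejected
print(class_on_zxy, bool(lookahead_on_zxy), lookahead_on_xyz)
```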

A[^B] literally matches A and ~B, so the following match: AC, AD, AF; and these don't: AB, Q. I think it works by making a character class of everything that's not B ([AC-Za-z0-9]). I believe that includes all of the asciibet.

[^B] merely says "any character other than B". The original question was related to an expression. You can't just put an arbitrary expression inside []. – Bryan Oakley Mar 4 '09 at 19:00

"I'm faced with a situation where I have to match an (A and ~B) pattern." Technically, A[^B] is that pattern... – Ape-inago Mar 6 '09 at 20:20.
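A small Python comparison of A[^B] with a negative lookahead, using the sample strings from the answer plus one extra case ("A" on its own) added for this sketch. The character class has to consume a character, while the lookahead does not:

```python
import re

samples = ["AC", "AD", "AF", "AB", "Q", "A"]  # "A" alone is an extra case added here

# A[^B] needs a second character that is not B.
class_hits = [s for s in samples if re.fullmatch(r"A[^B]", s)]

# A(?!B).? accepts A followed by nothing, or by any single character other than B.
lookahead_hits = [s for s in samples if re.fullmatch(r"A(?!B).?", s)]
print(class_hits, lookahead_hits)
```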
