Regular expression to match one or more of char a or just one of char b?

Begin string, anything not "=" (to avoid the double "=="), one or more blank spaces OR one "=", end of string:

^([^=]*\s+|=)$

Should work :-).

Simple explanation, works the way I wanted. Thanks @guidhouse – ram Apr 13 at 18:26.

You can use alternation: (\s+|=)$

This expression means: match one or more whitespace characters or one equals sign, at the end of the string. The $ is an anchor which matches the end of the string (as you mentioned you're looking for characters at the end of the string). (As tchrist correctly pointed out in the comments, $ matches the end of the line instead of the end of the string when in multiline mode. If this is true in your case, and you are indeed looking for the end of the string instead of the end of the line, you can use \Z instead, which matches the end of the string regardless of multiline mode.)

If you want to ensure that there is only one = at the end, you can use a lookaround (in this case, a negative lookbehind, specifically). A lookaround is a zero-width assertion which tells the regex engine that the assertion must pass for the pattern to match, but it does not consume any characters: (\s+|(?<!=)=)$

In this case, (?<!=)= means that the = will only match if the previous character is not also a =.
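To make the difference between the two patterns concrete, here is a minimal sketch in Python (chosen only for illustration; the question does not name a language). It assumes the lookbehind is written (?<!=), and the helper name ends_ok and the sample strings are made up for this example.

```python
import re

# One or more whitespace characters OR a single "=", anchored at the end.
loose = re.compile(r"(\s+|=)$")
# Same idea, but the "=" must not be preceded by another "=".
strict = re.compile(r"(\s+|(?<!=)=)$")


def ends_ok(pattern, text):
    """Return True if the pattern finds a match at the end of text."""
    return pattern.search(text) is not None


for sample in ["key=", "key==", "key   ", "key"]:
    print(repr(sample), ends_ok(loose, sample), ends_ok(strict, sample))
```

Only the "key==" sample should differ between the two patterns: the plain alternation accepts it, the lookbehind version rejects it.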

This doesn't check for just one '='. It matches for string ending with two == also – ram Apr 13 at 13:47

No, you are mistaken. (1) \s does not lose its special meaning in a character class. (2) The \ character is not a slash, but a backslash. (3) If you are using Java, \s doesn't work on Unicode. Perl, in contrast, supports Unicode. (4) $ matches both the end of the string and one prior to a newline at the end of a string, or anywhere in multiline mode. – tchrist Apr 13 at 13:47

@ram see my edit. – Daniel Vandersluis Apr 13 at 13:47

I think tchrist is right. I can't confirm the claim in Perl or Java. – musiKk Apr 13 at 13:48

@tchrist, @musiKk you're both right. I updated my answer to remove that misinformation. – Daniel Vandersluis Apr 13 at 13:49

How bizarre! You want the doublequotes to eat your backslash? – tchrist Apr 13 at 13:52.

The metacharacter \b is an anchor like the caret and the dollar sign. It matches at a position that is called a "word boundary". This match is zero-length.

There are three different positions that qualify as word boundaries: before the first character in the string, if the first character is a word character; after the last character in the string, if the last character is a word character; and between two characters in the string, where one is a word character and the other is not a word character.
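The three kinds of position can be listed directly by asking the engine where the zero-length \b matches. This is a small sketch in Python; the helper name boundary_positions and the sample text are assumptions made for illustration.

```python
import re


def boundary_positions(text):
    """Return every index at which the zero-length word-boundary token matches."""
    return [m.start() for m in re.finditer(r"\b", text)]


# Index 0 is before the first character, 7 is after the last one,
# and 2 and 4 lie between a word character and a non-word character.
print(boundary_positions("hi, you"))  # [0, 2, 4, 7]
```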

Simply put: \b allows you to perform a "whole words only" search using a regular expression in the form of \bword\b. A "word character" is a character that can be used to form words. All characters that are not "word characters" are "non-word characters".
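A "whole words only" search can be sketched as follows in Python. The function name find_whole_word and the sample sentence are hypothetical, and the sketch assumes the search term itself starts and ends with a word character, since otherwise \b would not sit where you expect.

```python
import re


def find_whole_word(word, text):
    """Return the start offsets where word occurs as a whole word."""
    # re.escape keeps any regex metacharacters in the search term literal.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    return [m.start() for m in pattern.finditer(text)]


text = "cat concatenate cat's bobcat"
# The "cat" inside "concatenate" and "bobcat" is skipped; the one in
# "cat's" counts because the apostrophe is a non-word character.
print(find_whole_word("cat", text))  # [0, 16]
```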

In all flavors, the characters a-zA-Z0-9_ are word characters. These are also matched by the short-hand character class \w. Flavors showing "ascii" for word boundaries in the flavor comparison recognize only these as word characters.

Flavors showing "YES" also recognize letters and digits from other languages or all of Unicode as word characters. Notice that Java supports Unicode for \b but not for \w. Python offers flags to control which characters are word characters (affecting both \b and \w).
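As an illustration of the Python flags mentioned above, the sketch below compares the default behaviour for str patterns in Python 3 (Unicode-aware) with re.ASCII. The sample text is made up, and the accented letters are written as escapes so the example does not depend on the file encoding.

```python
import re

text = "na\u00efve caf\u00e9"  # "naive cafe" with the accented letters

# Default for str patterns in Python 3: accented letters are word characters.
unicode_words = re.findall(r"\b\w+\b", text)

# With re.ASCII only a-zA-Z0-9_ count, so the accented letters act as
# non-word characters and split the words at new "boundaries".
ascii_words = re.findall(r"\b\w+\b", text, flags=re.ASCII)

print(unicode_words)
print(ascii_words)
```

The same flag changes both \w and \b, which is why the ASCII run reports fragments such as "na" and "ve".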

In Perl and the other regex flavors discussed in this tutorial, there is only one metacharacter that matches both before a word and after a word. This is because any position between characters can never be both at the start and at the end of a word. Using only one operator makes things easier for you.

Since digits are considered to be word characters, \b4\b can be used to match a 4 that is not part of a larger number. This regex will not match 44 sheets of a4. So saying "\b matches before and after an alphanumeric sequence" is more exact than saying "before and after a word".
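A quick check of the \b4\b example, sketched in Python; the second sample string is an assumption added to show a positive match.

```python
import re

standalone_four = re.compile(r"\b4\b")

# Neither the "4"s in "44" nor the one in "a4" has a boundary on both sides.
no_match = standalone_four.search("44 sheets of a4")

# A "4" standing alone matches; the one in "a4" still does not.
matches = standalone_four.findall("4 sheets of a4, size 4")

print(no_match)  # None
print(matches)   # ['4', '4']
```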

\B is the negated version of \b. \B matches at every position where \b does not. Effectively, \B matches at any position between two word characters as well as at any position between two non-word characters.
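The complementary behaviour of \B can be sketched the same way in Python; both sample strings are made up for illustration.

```python
import re

# \Bcat\B only matches "cat" buried inside a longer word.
inside = [m.start() for m in re.finditer(r"\Bcat\B", "concatenate cat")]

# \B on its own also matches between two non-word characters,
# here between the comma and the space.
between_nonword = [m.start() for m in re.finditer(r"\B", "a, b")]

print(inside)           # [3]
print(between_nonword)  # [2]
```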

Let's see what happens when we apply the regex \bis\b to the string This island is beautiful. The engine starts with the first token \b at the first character T. Since this token is zero-length, the position before the character is inspected.

\b matches here, because the T is a word character and the character before it is the void before the start of the string. The engine continues with the next token: the literal i. The engine does not advance to the next character in the string, because the previous regex token was zero-width.

The i does not match T, so the engine retries the first token at the next character position. \b cannot match at the position between the T and the h. It cannot match between the h and the i either, and neither between the i and the s.

The next character in the string is a space. \b matches here because the space is not a word character, and the preceding character is. Again, the engine continues with the i which does not match with the space.

Advancing a character and restarting with the first regex token, \b matches between the space and the second i in the string. Continuing, the regex engine finds that i matches i and s matches s. Now, the engine tries to match the second \b at the position before the l.

This fails because this position is between two word characters. The engine reverts to the start of the regex and advances one character to the s in island. Again, the \b fails to match and continues to do so until the second space is reached.

It matches there, but matching the i fails. But \b matches at the position before the third i in the string. The engine continues, and finds that i matches i and s matches s.

The last token in the regex, \b, also matches at the position before the third space in the string because the space is not a word character, and the character before it is. The engine has successfully matched the word is in our string, skipping the two earlier occurrences of the characters i and s. If we had used the regular expression is, it would have matched the is in This.
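The outcome of this walk-through can be confirmed with a short Python sketch. It only reports where the match ends up, not the engine's intermediate steps, and the variable names are chosen for this example.

```python
import re

text = "This island is beautiful"

whole_word = re.search(r"\bis\b", text)
plain = re.search(r"is", text)

# With word boundaries the match is the standalone word "is".
print(whole_word.start(), whole_word.end())  # 12 14

# Without them the first hit is the "is" inside "This".
print(plain.start(), plain.end())            # 2 4
```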

Word boundaries, as described above, are supported by most regular expression flavors. Notable exceptions are the POSIX and XML Schema flavors, which don't support word boundaries at all. Tcl uses a different syntax.

In Tcl, \b matches a backspace character, just like \x08 in most regex flavors (including Tcl's). \B matches a single backslash character in Tcl, just like \\ in all other regex flavors (and Tcl too). Tcl uses the letter "y" instead of the letter "b" to match word boundaries.

\y matches at any word boundary position, while \Y matches at any position that is not a word boundary. These Tcl regex tokens match exactly the same as \b and \B in Perl-style regex flavors. They don't discriminate between the start and the end of a word.

Tcl has two more word boundary tokens that do discriminate between the start and end of a word. \m matches only at the start of a word. That is, it matches at any position that has a non-word character to the left of it, and a word character to the right of it.

It also matches at the start of the string if the first character in the string is a word character. \M matches only at the end of a word. It matches at any position that has a word character to the left of it, and a non-word character to the right of it.

It also matches at the end of the string if the last character in the string is a word character. The only regex engine that supports Tcl-style word boundaries (besides Tcl itself) is the JGsoft engine. In PowerGREP and EditPad Pro, \b and \B are Perl-style word boundaries, and \y, \Y, \m and \M are Tcl-style word boundaries.

In most situations, the lack of \m and \M tokens is not a problem. \yword\y finds "whole words only" occurrences of "word" just like \mword\M would. \Mword\m could never match anywhere, since \M never matches at a position followed by a word character, and \m never at a position preceded by one.

If your regular expression needs to match characters before or after \y, you can easily specify in the regex whether these characters should be word characters or non-word characters. If you want to match any word, \y\w+\y will give the same result as \m.+\M. Using \w instead of the dot automatically restricts the first \y to the start of a word, and the second \y to the end of a word.

Note that \y.+\y would not work. This regex matches each word, and also each sequence of non-word characters between the words in your subject string.
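Python has no \y, so the sketch below uses \b as a stand-in, since the text describes the two as matching the same positions. It also assumes a lazy quantifier (.+?) for the dot version so that each run is reported separately; with a greedy .+ the whole sample would come back as one match. The sample text is made up.

```python
import re

text = "one, two"

# \w between the boundaries forces the first boundary to be a word start
# and the second to be a word end.
words_only = re.findall(r"\b\w+\b", text)

# With the dot, any run between two boundaries qualifies, including the
# non-word run ", " between the words.
everything = re.findall(r"\b.+?\b", text)

print(words_only)  # ['one', 'two']
print(everything)  # ['one', ', ', 'two']
```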

That said, if your flavor supports \m and \M, the regex engine could apply \m\w+\M slightly faster than \y\w+\y, depending on its internal optimizations. If your regex flavor supports lookahead and lookbehind, you can use (?<!\w)(?=\w) to match at the start of a word, like Tcl's \m, and (?<=\w)(?!\w) to match at the end of a word, like Tcl's \M.
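Start-of-word and end-of-word assertions can be emulated with lookaround in flavors that lack \m and \M. The following Python sketch shows one way to do it; the constant names WORD_START and WORD_END and the sample text are assumptions for illustration.

```python
import re

# No word character to the left, a word character to the right.
WORD_START = r"(?<!\w)(?=\w)"
# A word character to the left, no word character to the right.
WORD_END = r"(?<=\w)(?!\w)"

text = "one, two"

starts = [m.start() for m in re.finditer(WORD_START, text)]
ends = [m.start() for m in re.finditer(WORD_END, text)]
words = re.findall(WORD_START + r"\w+" + WORD_END, text)

print(starts)  # [0, 5]
print(ends)    # [3, 8]
print(words)   # ['one', 'two']
```

Unlike \b, each of these assertions matches on only one side of a word, which is the distinction \m and \M make in Tcl.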


