Filter non-alphanumeric “repeating” characters?

Based on @sln's answer: $str = preg_replace('~([^0-9a-zA-Z])\1+|(?:=\*)+|(?:->)+~', '', $str);

The pattern could be something like this: s/([\W_]|=\*|->)\1+//g or, if you want to replace each run with just a single instance: s/([\W_]|=\*|->)\1+/$1/g

Edit: any special sequence should probably come first in the alternation. In case you need to make something like == special, it then won't be grabbed by [\W_] first. So something like s/(==>|=\*|->|[\W_])\1+/$1/g, where the special cases come first.
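
To make the ordering point concrete, here is a minimal sketch of the same alternation using Python's re module, which shares the backreference and character-class syntax used above. The names REPEATS, collapse_repeats and strip_repeats are made up for illustration and are not from any answer in this thread.

```python
import re

# Special multi-character sequences are listed first so that the
# catch-all class [\W_] cannot claim their first character.
REPEATS = re.compile(r'(==>|=\*|->|[\W_])\1+')


def collapse_repeats(text: str) -> str:
    """Replace each run of a repeated symbol or sequence with one copy."""
    return REPEATS.sub(r'\1', text)


def strip_repeats(text: str) -> str:
    """Remove each run of a repeated symbol or sequence entirely."""
    return REPEATS.sub('', text)


if __name__ == "__main__":
    print(collapse_repeats("a->->->b!!!c"))
    print(strip_repeats("a->->->b!!!c"))
```

A single occurrence is left alone because \1+ needs at least one more copy after the captured one. Note that [\W_] still includes whitespace, so runs of spaces are collapsed as well.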

This is basically what I came up with. If you want to replace with the same character then preg_replace('/(\W)\1+/', '$1', $str); – Jonathan Kuhn Mar 11 at 0:41

Matches _, which is non-alphanumeric. – Alix Axel Mar 11 at 0:41

So use [\W_] instead of just \W. – Jonathan Kuhn Mar 11 at 0:46

sln's solution is pretty good, but the \W "non-word" class includes whitespace. I don't think you want to be removing sequences of tabs or spaces! Using a negated class (something like '[^A-Za-z0-9\s]') would work better.

This will filter out all symbols: $q = ereg_replace("[^A-Za-z0-9 ]", "", $q);

replace(/([^A-Za-z0-9\s]+)\1+/, "") will remove repeated runs of non-alphanumeric, non-whitespace strings. However, this is bad practice because you'll also be removing repeated non-ASCII European and other international characters in the Unicode range. The only place where you really won't ever care about internationalization is in processing source code, but even then the pattern does not account for text quoted in strings, and you may also accidentally de-comment a block.
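
A small sketch of the internationalization problem, again in Python's re module. It assumes Python 3 str patterns, where \w is Unicode-aware by default; the function names are hypothetical. The second pattern is one possible alternative, not something proposed in this thread, and it treats the underscore as a word character.

```python
import re

# ASCII-only negated class: every non-ASCII letter counts as a "symbol".
ASCII_ONLY = re.compile(r'([^A-Za-z0-9\s]+)\1+')

# Unicode-aware variant: in Python 3, \w on str patterns also matches
# accented letters, so only real punctuation and symbols are left.
UNICODE_AWARE = re.compile(r'([^\w\s]+)\1+')


def strip_ascii_only(text: str) -> str:
    """Remove repeated runs using the ASCII-only character class."""
    return ASCII_ONLY.sub('', text)


def strip_unicode_aware(text: str) -> str:
    """Remove repeated runs while keeping non-ASCII letters intact."""
    return UNICODE_AWARE.sub('', text)


if __name__ == "__main__":
    # "ol" followed by three U+00E9 (e with acute) and three "!"
    sample = "ol\u00e9\u00e9\u00e9!!!"
    print(strip_ascii_only(sample))     # the accented letters are lost too
    print(strip_unicode_aware(sample))  # only the "!!!" run is removed
```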

You may want to be more restrictive in what you try to remove by giving a list of characters to replace instead of the catch-all. Edit: I have done similar things before when trying to process early-version ShoutCAST radio names. At that time, stations tried to call attention to themselves by having obnoxious names like: >.

I used similar code to get rid of repeated symbols, but then learnt (the hard way) to be careful about what I eventually remove.

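This works for me: $sourceStr = preg_replace('/(.)\1{3,}/i', '', $sourceStr); It removes any character that repeats four or more times in a row (the captured character plus at least three copies of it), letters and digits included. Use {2,} to catch runs of three.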
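
The quantifier is easy to get off by one, so here is a hedged sketch that makes the run length an explicit parameter. It is written with Python's re module; remove_runs and min_run are illustrative names, and the /i flag from the pattern above is left out.

```python
import re


def remove_runs(text: str, min_run: int = 4) -> str:
    """Delete every run of min_run or more identical characters."""
    if min_run < 2:
        raise ValueError("min_run must be at least 2")
    # (.) captures one character; \1{n,} then needs n more copies of it,
    # so a run of min_run characters requires n = min_run - 1.
    pattern = re.compile(r'(.)\1{%d,}' % (min_run - 1))
    return pattern.sub('', text)


if __name__ == "__main__":
    print(remove_runs("a!!!b"))      # three "!" are below the default threshold
    print(remove_runs("a!!!!b"))     # four "!" are removed
    print(remove_runs("a!!!b", 3))   # lowering the threshold removes three
```

Because the dot matches letters as well, a word such as "zzzzap" loses its leading run too, which is one reason to prefer an explicit list of symbols.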

