Why does this regular expression not match the HTML using Python? [closed]

You should make the regex case-insensitive, because the color in the HTML is #A7A7A7 and you're matching #a7a7a7. You can test patterns on sites such as regexpal.com.
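As a minimal sketch of that fix, the snippet below compares a case-sensitive pattern with the same pattern compiled with `re.IGNORECASE`. The sample HTML and the pattern are hypothetical stand-ins, since the original question's markup and regex are not shown here.

```python
import re

# Hypothetical sample; the question's real HTML is not reproduced here.
html = '<td style="background-color: #A7A7A7">text</td>'

# Lowercase-only pattern: misses the uppercase hex digits in the sample.
strict = re.compile(r'background-color:\s*#a7a7a7')

# Same pattern with re.IGNORECASE: matches regardless of letter case.
relaxed = re.compile(r'background-color:\s*#a7a7a7', re.IGNORECASE)

strict_match = strict.search(html)
relaxed_match = relaxed.search(html)

print(strict_match)            # no match for the case-sensitive pattern
print(relaxed_match.group(0))  # the matched text, with its original casing
```

The flag leaves the pattern itself untouched, which is usually less error-prone than editing every character class by hand.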

Do not use regular expressions to parse HTML. Please! Use an HTML parser!
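For illustration only, here is a rough sketch of the parser route using `html.parser` from the Python 3 standard library. The `StyleCollector` class and the sample markup are made up for this example; a third-party parser such as BeautifulSoup would serve the same purpose.

```python
from html.parser import HTMLParser


class StyleCollector(HTMLParser):
    """Hypothetical helper: collect the style attribute of every <td> tag."""

    def __init__(self):
        super().__init__()
        self.styles = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this;
        # attrs arrives as a list of (name, value) pairs.
        if tag == "td":
            style = dict(attrs).get("style")
            if style is not None:
                self.styles.append(style)


# Made-up sample with mixed-case tag and attribute names.
sample = (
    '<table><tr>'
    '<TD STYLE="background-color: #A7A7A7">x</TD>'
    '<td>y</td>'
    '</tr></table>'
)

collector = StyleCollector()
collector.feed(sample)
collector.close()
print(collector.styles)
```

Because the parser normalizes the case of tag and attribute names, the lookup does not depend on how the markup happens to be written.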

Why not use regex, you ask?

Regular expression to match closing HTML tags
Can you provide some examples of why it is hard to parse XML and HTML with a regex?
Using regular expressions to parse HTML: why not?
Can regular expressions be used to match nested patterns?
If you're not supposed to use Regular Expressions to parse HTML, then how are HTML parsers written?
And the ever-famous RegEx match open tags except XHTML self-contained tags.

I would like to point out that BeautifulSoup is basically a bunch of regular expressions. This answer is still correct, but it adds an interesting perspective. – Henry May 17 at 5:22

That's an extreme simplification. Many parsers for many languages can make use of regular expressions, but that's very different from a parser that consists of a big pile of regex. It is impossible to implement an HTML (or XML, for that matter) parser strictly using regular expressions, because HTML and XML are context-free, not regular, languages. Have you seen this question? – Matt Ball May 17 at 13:16

Fair enough, 'basically a bunch' was dramatic. There is of course a great deal more to BeautifulSoup, and your answer to the question on this page shows in depth how regular expressions on their own are not suitable. – Henry May 17 at 14:25

At the very least, you have a case-sensitivity problem in the color. Plus, you might want to meditate on BoltClock's comment.

As @BoltClock mentions, using regex like this is not recommended. If not now, then sometime down the line you will regret it. There are lots of corner cases that will make the regex complex and, at times, plain useless.

Anyway, at a cursory glance: for background-color you have used a-z0-9, which only matches lowercase letters, but the sample has uppercase. You may want to allow uppercase as well: a-zA-Z0-9. For the other colors, why don't you use the same?
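A small sketch of the character-class point, assuming a six-digit color value. The patterns and the sample string are illustrative, not the asker's actual regex.

```python
import re

# Illustrative sample; the asker's real input is not shown here.
sample = 'background-color: #A7A7A7'

# [a-z0-9] excludes uppercase letters, so an uppercase color is missed.
lower_only = re.compile(r'#([a-z0-9]{6})')

# Adding A-Z accepts both cases.
both_cases = re.compile(r'#([a-zA-Z0-9]{6})')

# A tighter alternative: accept only hexadecimal digits.
hex_only = re.compile(r'#([0-9a-fA-F]{6})')

lower_result = lower_only.search(sample)
both_result = both_cases.search(sample)
hex_result = hex_only.search(sample)

print(lower_result)          # the lowercase-only class finds nothing
print(both_result.group(1))  # captured color digits
print(hex_result.group(1))   # same capture, stricter class
```

The hex-only class is a design choice rather than a requirement: it rejects strings such as #ZZZZZZ that the broader class would accept.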

Why a (.+?)?

In addition to what many other people are saying, you may want to use the re.UNICODE flag, since it looks like you have some Japanese characters in there.
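A short sketch of what the flag affects, assuming Python 3 and `str` patterns, where Unicode matching is already the default and `re.UNICODE` is accepted but redundant (in Python 2 it had to be requested explicitly). The sample text is invented for this example.

```python
import re

# Invented sample mixing Japanese text with a color value.
text = '色は #A7A7A7 です'

# Python 3, str pattern: \w matches Unicode word characters by default.
default_words = re.findall(r'\w+', text)

# Passing re.UNICODE explicitly gives the same result for str patterns.
unicode_words = re.findall(r'\w+', text, re.UNICODE)

# re.ASCII restricts \w to [a-zA-Z0-9_], so the Japanese words are skipped.
ascii_words = re.findall(r'\w+', text, re.ASCII)

print(default_words)
print(unicode_words)
print(ascii_words)
```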

