Localization of numbers within a complex scene image?

Your problem is not a simple one, but it seems very interesting! Although I don't have a solution for you, I will share my thoughts in the hope that you can make something out of them.

Let's take 2 of your photos as examples:

Photo-A:

It shows a single person with a relatively "big" green label with numbers on his shirt.

Photo-B:

It shows a lot of people with smaller red labels on their shirts. (The labels' height in pixels is about 1/5 of the label's height in Photo-A.)

Considering the above photos, I will try to write down some random thoughts which may help...

(a) Define your scale: There is no point in applying a search algorithm to find labels from 2x2 pixels up to the full image resolution. You must define the minimum/maximum limits for the width & height of a label.

Those limits may depend on many different factors:

(1) One factor is the real size of the labels (determined by the distance of the people from the camera), which can be defined as a percentage of the image width & height.
(2) Another factor is the actual reading accuracy of the OCR you are going to use. If the height of the numbers' image is smaller than Y1 pixels or bigger than Y2 pixels, the OCR will not be able to read it (it sounds strange but it's true: big images may seem very clear to the human eye, but an OCR may have problems reading them).

(b) Find the area(s) of interest: In your case, this is equivalent to "Find the approximate position of labels".

We can define an athlete label roughly as "An (almost) rectangular area, which may be a bit inclined relative to the photo borders, and contains: a central area of black + color C1 (e.g. red or green) + a white (= neutral) area on top and/or bottom of it".

A possible algorithm to find the approximate position of a label is:

(1) Traverse the whole image left-to-right, top-to-bottom and examine a square area of MinHeight/2 x MinHeight/2.
(2) Create the histogram of the square area (or posterize it, e.g. to 8 levels) and check whether it contains only black + another color C1 in a proportion of e.g. black: 40% +/- 10%, color: 60% +/- 10%.
(3) If (2) is true, try to expand the area to the right and bottom while the percentages stay within the specified limits.
(4) If the square is fully expanded, check whether the expanded area size is inside the min/max limits of width/height you specified in (a). If not, go to step (1).
(5) Process the expanded area to read the numbers - see (c) below.
(6) Go to step (1).

(c) Process the area(s) of interest: Try the following steps:

(1) Convert each image area to grayscale by applying a color filter that burns color C1 to white.
(2) Equalize the grayscale image to make the black letters stand out.
(3) If an inclination has been detected, perform a reverse rotation on the image area to make the letters as horizontal as possible.
(4) Feed the area to an OCR trained only for numbers.

Good luck with your project!
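To make steps (1) to (4) of the search in (b) a bit more concrete, here is a rough sketch in plain Python. It is only an illustration under my own assumptions, not a tested solution: the image is a list of rows of (r, g, b) tuples, the names classify, looks_like_label and find_labels are made up for this example, and so are the thresholds (what counts as "black", the tolerance around C1, the 10% cap on other colors), the scan step of one window side, and the choice to grow the area right and down in turns.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Box = Tuple[int, int, int, int]  # x, y, width, height


def classify(pixel: Pixel, c1: Pixel, tol: int = 60) -> str:
    """Crude posterization into 'black', 'color' (near C1) or 'other'."""
    if max(pixel) < 64:  # assumed darkness threshold
        return "black"
    if all(abs(a - b) <= tol for a, b in zip(pixel, c1)):
        return "color"
    return "other"


def looks_like_label(img, x, y, w, h, c1) -> bool:
    """Step (2): black about 40%, C1 about 60% (both +/- 10%), little else."""
    black = color = 0
    for row in img[y:y + h]:
        for pixel in row[x:x + w]:
            kind = classify(pixel, c1)
            black += kind == "black"
            color += kind == "color"
    n = w * h
    other = n - black - color
    return (0.3 <= black / n <= 0.5
            and 0.5 <= color / n <= 0.7
            and other / n <= 0.1)  # assumed cap on other colors


def find_labels(img, c1, min_size, max_size) -> List[Box]:
    """min_size and max_size are the (width, height) limits from (a)."""
    height, width = len(img), len(img[0])
    (min_w, min_h), (max_w, max_h) = min_size, max_size
    side = max(1, min_h // 2)  # step (1): MinHeight/2 square window
    found: List[Box] = []
    for y in range(0, height - side + 1, side):
        for x in range(0, width - side + 1, side):
            if any(fx <= x < fx + fw and fy <= y < fy + fh
                   for fx, fy, fw, fh in found):
                continue  # window starts inside an already detected label
            if not looks_like_label(img, x, y, side, side, c1):
                continue
            w = h = side
            grew = True
            while grew:  # step (3): grow right and down in turns
                grew = False
                if (x + w < width and w < max_w
                        and looks_like_label(img, x, y, w + 1, h, c1)):
                    w, grew = w + 1, True
                if (y + h < height and h < max_h
                        and looks_like_label(img, x, y, w, h + 1, c1)):
                    h, grew = h + 1, True
            if min_w <= w <= max_w and min_h <= h <= max_h:  # step (4)
                found.append((x, y, w, h))
    return found
```

Because the percentages are checked with a tolerance, the returned boxes are only approximate: a box can start a few pixels inside the label or extend slightly past its border. Each box would then go through the processing steps in (c).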

Thanks for the detailed info. I will definitely try this, but in the meantime I got some ideas. Can this be done using some kind of template matching, especially one which is scale invariant? In that case I can collect samples of the digits and run them through the template matching algorithm. Something similar to eigenfaces. – Arnolin Jul 23 at 7:50

@Arnolin Yes, an eigenimages-like algorithm also seems like a good idea! You can recognize each digit as a separate pattern, accumulate the digits into numbers according to their position and accept e.g. only the numbers with more than 3-4 digits. Good luck :-) – Fivos Vilanakis Jul 23 at 11:51
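The "accumulate the digits into numbers according to their position" part could look roughly like the sketch below. It assumes some digit recognizer (not shown) already returns one (digit, x, y, width, height) tuple per match; the function name group_digits and the thresholds (max_gap, the allowed vertical offset, min_digits=3) are my own illustrative choices, not something tested on real photos.

```python
from typing import List, Tuple

Detection = Tuple[str, int, int, int, int]  # digit, x, y, width, height


def group_digits(detections: List[Detection],
                 min_digits: int = 3,
                 max_gap: float = 0.6) -> List[str]:
    """Chain digits left to right when they sit on roughly the same line."""
    chains: List[List[Detection]] = []
    for det in sorted(detections, key=lambda d: d[1]):  # sort by x
        _, x, y, w, h = det
        for chain in chains:
            _, lx, ly, lw, lh = chain[-1]
            gap = x - (lx + lw)  # horizontal distance to previous digit
            if abs(gap) <= max_gap * lh and abs(y - ly) <= 0.5 * lh:
                chain.append(det)
                break
        else:
            chains.append([det])  # no neighbor found: start a new number
    # Keep only chains long enough to be a plausible label number.
    return ["".join(d[0] for d in chain)
            for chain in chains if len(chain) >= min_digits]
```

Short chains (stray single digits, or two-digit matches from background clutter) are dropped, which is the filtering idea from the comment above.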

You could try to contact the author of this software: Yaroslav is an active member of StackOverflow.

Great! I'll contact him, he can probably give me some pointers. BTW, I've seen your answers to some image processing questions here; do you have any ideas on this? – Arnolin Jul 24 at 4:12

@Arnolin I do have some ideas, but not being able to test them (due to the complex nature of the problem) I prefer to stay quiet. Image processing is one of those topics where theory is simply not enough because of real-world conditions. I think Yaro is one of your best bets around. – belisarius Jul 24 at 4:28
