Dividing a plane of points into two equal halves?

I have assumed the points are distinct, otherwise there might not even be such a line.

Up vote 18 down vote favorite 7 share g+ share fb share tw.

Given a 2 dimensional plane in which there are n points. I need to generate the equation of a line that divides the plane such that there are n/2 points on one side and n/2 points on the other. (by the way this not home work, I am just trying to solve the problem) algorithm math geometry computational-geometry link|improve this question edited Jun 24 '10 at 22:35BlueRaja - Danny Pflughoeft17.9k33775 asked Jun 23 '10 at 23:36mousey1,1731420 97% accept rate.

– ChrisW Jun 23 '10 at 23:38 @chrisW It doesn't matter – mousey Jun 23 '10 at 23:39 ^^^^ is a very good point. It can be solved either way, though varying angle is harder. – zebediah49 Jun 23 '10 at 23:40 Are you working with integer only, or floating points are okey?

– Nicolas Viennot Jun 23 '10 at 23:46 @Pafy float is better – mousey Jun 24 '10 at 0:23.

I have assumed the points are distinct, otherwise there might not even be such a line. If points are distinct, then such a line always exists and is possible to find using a deterministic O(nlogn) time algorithm. Say the points are P1, P2, ..., P2n.

Assume they are not all on the same line. If they were, then we can easily form the splitting line. First translate the points so that all the co-ordinates (x and y) are positive.

Now suppose we magically had a point Q on the y-axis such that no line formed by those points (i.e. Any infinite line Pi-Pj) passes through Q. Now since Q does not lie within the convex hull of the points, we can easily see that we can order the points by a rotating line passing through Q.

For some angle of rotation, half the points will lie on one side and the other half will lie on the other of this rotating line, or, in other words, if we consider the points being sorted by the slope of the line Pi-Q, we could pick a slope between the (median)th and (median+1)th points. This selection can be done in O(n) time by any linear time selection algorithm without any need for actually sorting the points. Now to pick the point Q.

Say Q was (0,b). Suppose Q was collinear with P1 (x1,y1) and P2 (x2,y2). Then we have that (y1-b)/x1 = (y2-b)/x2 (note we translated the points so that xi > 0).

Solving for be gives be = (x1y2 - y1x2)/(x1-x2) (Note, if x1 = x2, then P1 and P2 cannot be collinear with a point on the Y axis). Consider |b|. |b| = |x1y2 - y1x2| / |x1 -x2| Now let the xmax be the x-coordinate of the rightmost point and ymax the co-ordinate of the topmost.

Also let D be the smallest non-zero x-coordinate difference between two points (this exists, as not all xis are same, as not all points are collinear). Then we have that |b| b_0 = xmax*ymax/D D can be found in O(nlogn) time. The magnitude of b_0 can get quite large and we might have to deal with precision issues.

Of course, a better option is to pick Q randomly! With probability 1, you will find the point you need, thus making the expected running time O(n). If we could find a way to pick Q in O(n) time (by finding some other criterion), then we can make this algorithm run in O(n) time.

I'd been racking my brain all night trying to figure this one out, but you beat me again, M. As soon as you said "suppose we magically had a point Q on the y-axis..." it came to me; it seems so easy now... but, too late! +1, amazing work – BlueRaja - Danny Pflughoeft Jun 24 '10 at 5:30 @BlueRaja: Thanks!

O(n) seems so close, though... – Aryabhatta Jun 24 '10 at 5:37 @Moron: in 2xmax*ymax/D, the 2 isn't necessary: |x1y2 - y1x2| 0. – Nicolas Viennot Jun 24 '10 at 13:07 @Pafy. You are right.

I have edited the answer to reflect that. Thanks! – Aryabhatta Jun 24 '10 at 13:18 1 @andand: I talked about sorting only to visualize what is going on.

We don't actually need to sort, as we need only the 'middle' element. – Aryabhatta Jun 24 '10 at 16:07.

Create an arbitrary line in that plane. Project each point onto that line a.k. A for each point, get the closest point on that line to that point.

Order those points along the line in either direction, and choose a point on that line such that there is an equal number of points on the line in either direction. Get the line perpendicular to the first line which passes through that point. This line will have half the original points on either side.

There are some cases to avoid when doing this. Most importantly, if all the point are themselves on a single line, don't choose a perpendicular line which passes through it. In fact, choose that line itself so you don't have to worry about projecting the points.

In terms of the actual mathematics behind this, vector projections will be very useful.

Same problem as Chris' solution, multiple points could have same projection. – BlueRaja - Danny Pflughoeft Jun 23 '10 at 23:47 That's only an issue if those points are the two bracketing the middle. In that case, choose a different line in part 1.

– tlayton Jun 23 '10 at 23:50 So you are suggesting "choose a random line, see if you can fit it to split the set in two, otherwise choose another random line? " – BlueRaja - Danny Pflughoeft Jun 23 '10 at 23:54 The final resulting line will be perpendicular to the line you choose. So yes, but the new line will give you a different result.

But if you encounter this problem, it means that that particular set of points can't be split like this with a line in that particular direction. – tlayton Jun 23 '10 at 23:59 1 There is a deterministic way (though O(nlogn) currently) to find without any guessing (see my answer). Guessing is more practical though, as the chances of getting a bad guess are negligible.

– Aryabhatta Jun 23 '107 at 2:39.

I'd guess that a good way is to sort/sequence/order the points (e.g. From left to right), and then choose a line which passes through (or between) the middle points in the sequence.

Also, if exactly half are on one side and half on the other, the line can't pass through any of them (maybe one, if n is odd). – BlueRaja - Danny Pflughoeft Jun 23 '10 at 23:43.

There are obvious cases where no solution is possible. E.g. When you have three heaps of points.

One point at location A, Two points at location B, and five points at location C. If you expect some decent distribution, you can probably get a good result with tlayton's algorithm. To select the initial line slant, you could determine the extent of the whole point set, and choose the angle of the largest diagonal.

I don't find it obvious - in fact, I don't think it's true. I can't imagine a scenario where you couldn't divide C in half such that A and three points from C are in one half, B and two points from C are in the other. – BlueRaja - Danny Pflughoeft Jun 23 '10 at 23:56 You cannot divide C in half.

Either all five points are on the line, or they are not. – relet Jun 23 '10 at 23:58 Oh I see what you are saying - I was under the impression all n points were distinct. – BlueRaja - Danny Pflughoeft Jun 24 '10 at 0:00 Heh, indeed.

That needs to be clarified. I was thinking of n points as distinct objects in a list, which may refer to the same coordinates. :) – relet Jun 24 '10 at 0:00.

The median equally divides a set of numbers in the manner similar to what you're trying to accomplish, and it can be computed in O(n) time using a selection algorithm (the writeup in Cormen et al is better, so you may want to look there instead). So, find the median of your x values Mx (or your y values My if you prefer) and set x = Mx (or y = My) and that line will be axially aligned and split your points equally. If the nature of your problem requires that no more than one point lies on the line (if you have an odd number of points in your set, at least one of them will be on the line) and you discover that's what's happened (or you just want to guard against the possibility), rotate all of your points by some random angle,?

, and compute the median of the rotated points. You then rotate the median line you computed by -? And it will evenly divide points.

The likelihood of randomly choosing? Such that the problem manifests itself again is very small with a finite number of points, but if it does, try again with a different?.

This is a modification of Dividing a plane of points into two equal halves which allows for the case with overlapping points (in which case, it will say whether or not the answer exists). If number of points is odd, return "impossible". Pick a random line (two random points) Project all points onto the line (`O(N)` operation) Perform any median-finding algorithm (`O(N)` or faster-if-desired operation) (returns 2 medians if no overlapping points) Return the line perpendicular to original random line that splits the medians In extremely rare case of overlapping points, repeat.

This is O(N) unlike other proposed solutions. Assuming there are no overlapping points, the probability of this not working is LITERALLY 0%, though it is still possible. (It is actually a bit higher than 0%, due to machine precision, so don't try this with an adversary.

) Try the above algorithm a few times unless you detect overlapping points. If you detect a high number of overlapping points, you may be in for a rough ride, but there is a terribly inefficient brute-force solution: For every "critical slope range", perform the above algorithm by choosing a line with a slope within the range. If all critical slope ranges fail, the solution is impossible.

A critical angle is defined as the angle which could possibly change the result (imagine the solution to a previous answer, rotate the entire set of points until a single point goes through the failed dividing line). There are only finitely many of these, and I think they equal to the number of points, so you're probably looking at something in the range O(N^2)-O(N^2 log(N)) if you have overlapping points, for a brute-force approach.

I don't know how useful this is I have seen a similar problem... If you already have the directional vector (aka the coefficients of the dimensions of your plane). You can then find two points inside that plane, and by simply using the midpoint formula you can find the midpoint of that plane. Then using the coefficients of that plane and the midpoint you can find a plane that is from equal distance from both points, using the general equation of a plane.

A line then would constitute in expressing one variable in terms of the other so you would find a line with equal distance between both planes. There are different methods of doing this such as projection using the distance equation from a plane but I believe that would complicate your math a lot.

To add to M's answer: a method to generate a Q (that's not so far away) in O(n log n). To begin with, let Q be any point on the y-axis ie. Q = (0,b) - some good choices might be (0,0) or (0, (ymax-ymin)/2).

Now check if there are two points (x1, y1), (x2, y2) collinear with Q. The line between any point and Q is y = mx + b; since be is constant, this means two points are collinear with Q if their slopes m are equal. So determine the slopes mi for all points and check if there are any duplicates: (amoritized O(n) using a hash-table) If all the m's are distinct, we're done; we found Q, and M's algorithm above generates the line in O(n) steps.

If two points are collinear with Q, we'll move Q up just a tiny amount? , Qnew = (0, be +? ), and show that Qnew will not be collinear with two other points.

The criterion for? , derived below, is:? 0 and xi > 0, all m's are reduced, and all are reduced by a maximum of?

/xmin. Thus, if? /xmin , ie.? *xmin is true, then two mi which were previously unequal will be guaranteed to remain unequal.

All that's left is to show that if m1,old = m2,old, then m1,new =/= m2,new. Since both mi were reduced by an amount? /xi, this is equivalent to showing x1 =/= x2.

If they were equal, then: y1 = m1,oldx1 + be = m2,oldx2 + be = y2 Contradicting our assumption that all points are distinct. Thus, m1, new =/= m2, new, and no two points are collinear with Q.

We can assume x1 =/= x2 as then the line will be parallel to the y-axis. – Aryabhatta Jun 24 '10 at 21:34 @M: Yep; that's essentially the same thing (it wouldn't necessarily be true if the points weren't distinct) – BlueRaja - Danny Pflughoeft Jun 24 '10 at 22:36.

I picked up the idea from Moron and andand and continued to form a deterministic O(n) algorithm. I also assumed that the points are distinct and n is even (thought the algorithm can be changed so that uneven n with one point on the dividing line are also supported). The algorithm tries to divide the points with a vertical line between them.

This only fails if the points in the middle have the same x value. In that case the algorithm determines how many points with the same x value have to be on the left and lower site and and accordingly rotates the line. I'll try to explain with an example.

Let's asume we have 16 points on a plane. First we need to get the point with the 8th greatest x-value and the point with the 9th greatest x-value. With a selection algorithm this is possible in O(n), as pointed out in another answer.

If the x-value of that points is different, we are done. We create a vertical line between that two points and that splits the points equal. Problematically now is if the x-values are equal.

So we have 3 sets of points. That on the left side (x xa). The idea now is to count the points on the left side and calculate how many points from the middle needs to go there, so that half of the points are on that side.

We can ignore the right side here because if we have half of the points on the left side, the over half must be on the right side. So let's asume we have we have 3 points (c=3) on the left side, 6 in the middle and 7 on the right side (the algorithm doesn't know the count from the middle or right side, because it doesn't need it, but we could also determine it in O(n)). We need 8-3=5 points from the middle to go on the left side.

The points we already got in the first step are useless now, because they are only determined by the x-value and can be any of the points in the middle. We want the 5 points from the middle with the lowest y-value on the left side and the point with the highest y-value on the right side. Again using the selection algorithm, we get the point with the 5th greatest y-value and the point with the 6th greatest y-value.

Both points will have the x-value equal to xa, else we wouldn't get to this step, because there would be a vertical line. Now we can create the point Q in the middle of that two points. Thats one point from the resulting line.

Another point is needed, so that no points from the left or right side are divided. To get that point we need the point from the left side, that has the lowest angle (bh) between the the vertical line at xa and the line determined by that point and Q. We also need that point from the right side (with angle ag).

The new point R is between the point with the lower angle and a point on the vertical line (if the lower angle is on the left side a point above Q and if the lower angle is on the right side a point below Q). The line determined by Q and R divides the points in the middle so that there are a even number of points on both sides. It doesn't divide any points on the left or right side, because if it would that point would have a lower angle and would have been choosen to calculate R.

From the view of a mathematican that should work well in O(n). For computer programs it is fairly easy to find a case where precision becomes a problem. An example with 4 points would be A(0, 100000000), B(0, 100000001), C(0, 0), D(0.

0000001, 0). In this example Q would be (0, 100000000.5) and R (0.00000005, 0). Which gives B and C on the left side and A and D on the right side.

But it is possible that A and B are both on the dividing line, because of rounding errors. Or maybe only one of them. So it belongs to the input values if this algorithm suits to the requirements.

Get that two points Pa(xa, ya) and Pb(xb, yb) which are the medians based on the x values > O(n) if xa! = xb you can stop here because a y-axis parallel line between that two points is the result > O(1) get all points where the x value equals xa > O(n) count points with x value less than xa as c > O(n) get the lowest point Pc based on the y values from the points from 3. > O(n) get the greatest point Pd based on the y values from the points from 3.

> O(n) get the (n/2-c)th greatest point Pe based on the y values from the points from 3. > O(n) also get the next greatest point Pf based on the y values from the points from 3. > O(n) create a new point Q (xa, (ye+yf)/2) between Pe and Pf > O(1) for all points Pi calculate the angle ai between Pc, Q and Pi and the angle bi between Pd, Q and Pi > O(n) get the point Pg with the lowest angle ag (with ag>0° and ag O(n) get the point Ph with the lowest angle bh (with bh>0° and bh O(n) if there aren't any Pg or Ph (all points have same x value) create a new point R (xa+1, 0) anywhere but with a different x value than xa else if ag is lower than bh create a new point R ((xc+xg)/2, (yc+yg)/2) between Pc and Pg else create a new point R ((xd+xh)/2, (yd+yh)/2) between Pd and Ph > O(1) the line determined by Q and R divides the points > O(1).

It is a bit hard to read. Also, few questions, what happens if a_g = b_h? Also, is it possible that the line QR divides points to the right of points in step 3?

(i. E x-cordinate > x_a) in which case you might not get an n/2-n/2 split? – Aryabhatta Jun 24 '10 at 17:51 @Rudi-moore: Rotating the line around Q might not work.

I believe (though not completely sure) we can give you a configuration of points such that no line through Q will make an even split. You seem to be completely ignoring the points to the right of the vertical line. Also, the case when a_h = b_h and c=0 is undiscussed.

– Aryabhatta Jun 24 '10 at 21:13 @M Sorry i'm not a native speaker, mabye that makes it a bit harder to understand. I edited the post and hope to make it clearer with an example. For your question: if a_g = b_h you can choose P_g or P_h.

It doesn't matter. Both will generate the same line. It is not possible that QR divides points on the right side, if it would another R would have been choosen.

I'm not completly ignoring points on the right side. Only in the first steps. C=0 is only problematically if more than n/2 points are in the middle and in that case n/2 points from the middle go to left side.

– rudi-moore Jun 25 '10 at 8:14 @Rudi-moore: Good job! This looks right to me. It was simpler than I imagined it would be.

+1. – Aryabhatta Jun 25 '10 at 12:58 @mousey, @BlueRaja: I suggest you take a look at this answer. I believe this works!

I think this derserves to be the accepted answer. – Aryabhatta Jun 25 '10 at 12:59.

I cant really gove you an answer,but what I can give you is a way to a solution, that is you have to find the anglde that you relate to or peaks your interest. A good paper is one that people get drawn into because it reaches them ln some way.As for me WW11 to me, I think of the holocaust and the effect it had on the survivors, their families and those who stood by and did nothing until it was too late.

Related Questions