A Tricky Bit Implementing Skyum’s Algorithm
My dissertation is on gerrymandering. While I’m very sure it’s not the best way to do it, many people use geometric measures of district shapes as a way to detect gerrymandering.
One of the oldest and most well-used is the Reock measure, first developed in 1960. It’s a bit tougher than other kinds of gerrymandering measures to construct. For instance, the most popular measure is the Polsby-Popper measure, as used in this Washinton Post piece, is actually identical to a shape measure calle the Isoperimetric Quotient. The IPQ/Polsby-Popper measure relates the area of a target shape to the area of the circle enclosed by the target shape’s perimeter. It’s understandable (and indeed, simple to show empirically and theoretically) that the IPQ measures how efficiently a shape “encloses” space, in terms of area enclosed per perimeter.
However, the Reock measure is both more difficult to construct and more difficult to understand. Reock is the ratio of the area of a Minimum Enclosing Circle to the area of the target shape. Now, for arbitrary points in 2 dimensions, this problem is solved with a two relatively good algorithms. A randomized approach of Welzl is O(N), which means the time it takes to compute the MBC is proportional to the number of points in the polygon being enclosed. Another algorithm due to Skyum (1990) is O(n log(n)), but is typically much faster because the factor that makes Welzl’s algorithm proportional to N is rather large, and is occasionally larger than log(n).
But, it’s not perfect, and currently gives slightly incorrect results in some cases.
For instance, take the Columbus, OH dataset used by many of my colleagues who work on PySAL. I can construct the MBC for most of the shapes quickly, but some shapes are only approximately contained by their MBC. That is, my implementation, in a very small number of cases, returns only approximately the MBC.
First, though, the algorithm. Skyum’s algorithm proceeds by picking neighboring triples points on the boundary of the convex hull of a shape that sweep out the largest angle. If multiple triples form the same angle, sort by the radius length of those three points’ circumcircle. In math jargon, this means the set of tuples (radius, angle) are lexicographically ordered. If the largest angle in this ordering is ever at or below 90 degrees, the algorithm has found the MBC. Otherwise, remove the center of the neighbor triple with the largest angle and construct new angles & circumcircles.
To illustrate my implementation’s mixed success, consider the third and fourth shapes in the list. (python is zero index, so the image will have them labelled as shapes 2 and 3, respectively). The shape outline is in black. The constructed candidate for the MBC in each iteration is cyan. The candidate convex hull is in dashed magenta. Points that have been considered are red and get more transparent as they age. The most recent point considered and rejected is magenta.
First, let’s look at shape four, where the algorithm correctly identifies the MBC:
You see, as the circle continues to get constructed, the points in the candidate set always decrease. The circle is not always containing the polygon, but this doesn’t matter, because the algorithm is using a somewhat counterintuitive objective ro preserve optimality over each iteration. Regardless, at the end, it’s pretty apparent the minimum bounding circle has been found, and nothing looks like it’s outside of the cyan circle.
Now for the third shape:
If you look closely, you’ll see that, in the last iteration, a point on the bottom right of the shape is left outside of the circle. It’s barely outside, but it is, in fact, outside when you examine the image closer (and when you check mathematically whether it falls within the circle).
This is important, since the size of the candidate MBC (cyan) does not have consistent, guaranteed behavior from iteration to iteration. The objective maximizes the angle of triples first, and then organizes by a measure proportional to area. Thus, it is occasionally the case that the candidate MBC changes size drastically and (most importantly) in a non-monotonic way between iterations. Thus, it’s probably possible to construct a pathological example where the true MBC is very different from the one my implementation finds.
When I verify the implementation, though, each step proceeds as according to the rules laid out in Skyum (1990). This means I’ve got some arcane implementation error somewhere in the code, and that’s no fun to find.
But, importantly, this MBC solver works pretty quickly, has relatively decent accuracy, and was fun to implement.
imported from: yetanothergeographer