Reflections on a hard-won win for PySAL

As a way to learn Python (and I mean really come to know it), I took on the project of converting PySAL to be compatible with Python 3.

I started work on this two years ago. A tentative candidate for release is available in my fork.

Now, I’m glad I did this. I still worry about the day when Apple decides to ship the next OSX (Half Moon Bay? Big Sur? Salton Sea?) as Python 3 only, as many popular Linux distributions are aiming to do.

But this work was not always so easily motivated. There were countless discussions online arguing that the conversion process wasn’t worth it, that it didn’t really matter, that scientific computing wasn’t ever going to take it up. Some even stated that widespread adoption in the scientific community would take quite a while. None of this was reassuring, no matter how quickly the Python 3 Wall of Superpowers (formerly the Wall of Shame) was changing to green.

I took a lot of solace in Nick Coghlan’s Q & A, and I stumbled through the six documentation. I used Arch Linux, and their move from Python 2 to Python 3 as the default python executable (prompting this PEP) was super encouraging.

But, as I finally started wrapping my head around how metaclasses worked and how our library was organized, I realized: this wasn’t as hard as it appeared initially. In fact, I was writing Python 3 code almost by default.

By far, the hardest part of the conversion effort was the fact that every division operation was suspect. In typical, non-scientific applications, this is probably not too hard to fix. But lacking even the most basic indication of whether a float or an int was being passed to / meant that figuring out when and where things could fail was tedious. Indeed, small bugs were uncovered, where Python 2’s default of floor division for integer operands could yield wrong answers in statistical calculations given certain input data. No warnings, no errors, but incorrect results: a nightmare for any library maintainer.
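To make that failure mode concrete, here is a minimal sketch. It is not PySAL code, and `py2_div`, `mean_py2_style`, and `mean_py3_style` are hypothetical names made up for illustration. Since the snippet is meant to run on Python 3, it emulates what Python 2's / did by flooring only when both operands are ints.

```python
def py2_div(a, b):
    """Emulate Python 2's `/`: floor for two ints, true division otherwise."""
    if isinstance(a, int) and isinstance(b, int):
        return a // b
    return a / b


def mean_py2_style(values):
    """A mean as it would behave under Python 2's default division."""
    return py2_div(sum(values), len(values))


def mean_py3_style(values):
    """A mean using true division, which is what `/` always does in Python 3."""
    return sum(values) / len(values)


counts = [1, 2, 2]                 # integer input, e.g. counts per region
print(mean_py2_style(counts))      # 1  (silently floored)
print(mean_py3_style(counts))      # 1.666...

floats = [1.0, 2.0, 2.0]           # same data as floats
print(mean_py2_style(floats))      # 1.666...  (the bug disappears)
```

The same function gives a different answer depending only on whether the caller happened to pass ints or floats, which is why every / had to be checked by hand. In Python 2 code, putting `from __future__ import division` at the top of a module opts that module into the Python 3 behavior.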

This was my first real tussle with static vs. dynamic typing, and I have to say, I think it pushed me a little bit closer to actually desiring static typing. A big part of the reason I want PySAL to start being written in Python 3 is the benefit of type annotations (Guido’s perspective), which give a gradually-typed flavor to Python.
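As a rough illustration of what annotations buy you, here is a small sketch. The `rate` function is hypothetical, not something from PySAL. Python itself does not enforce the annotations at runtime; they are metadata that a reader, or an external checker such as mypy, can use.

```python
def rate(events: int, population: int) -> float:
    """Events per capita; the signature says ints go in and a float comes out."""
    return events / population


print(rate(3, 4))              # 0.75
print(rate.__annotations__)    # the declared types, available for inspection
```

Nothing stops a caller from passing a string here, so this is documentation plus tooling support rather than a guarantee. Still, it states up front the one thing the division hunt kept making me reconstruct by hand: what type is supposed to arrive.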

At this point, I’m firmly in the gradual typing camp, if not totally won over by Julia’s speedups due to type inference. Although, I should admit my formative experiences with Haskell’s type inference system may be resurfacing here. These questions are also driving factors behind the speed of Cython and the slowness of Python: dynamically typed code generally runs much slower than statically typed code, and 99% of the time (in scientific computing) we don’t need dynamic typing.

This was also the first time that I encountered such beautiful things as read-only properties in new-style classes, but not because I wanted to use them. Instead, some base-class properties that we overrode in subclasses would error out when assignment occurred.

The tricky thing is, these errors only arose when assignment happened, so the same classes worked correctly in some cases and failed in others. In addition, it was difficult to identify which properties needed to be overridden at what point in the hierarchy, because the underlying inheritance structure lived less in code and more in the filesystem, replicated by good ol’ CTRL-c.
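A stripped-down sketch of that situation looks something like the following. `Base`, `Child`, `n`, and `resize` are hypothetical stand-ins, not the actual PySAL classes.

```python
class Base:
    """Hypothetical base class exposing `n` as a read-only property."""

    def __init__(self, data):
        self._data = list(data)

    @property
    def n(self):
        # Only a getter is defined, so `n` cannot be assigned to.
        return len(self._data)


class Child(Base):
    """Hypothetical subclass that treats `n` like a plain attribute."""

    def resize(self, n):
        self.n = n  # raises AttributeError: Base.n has no setter


child = Child([1, 2, 3])
print(child.n)  # 3; reading works, so nothing looks wrong yet

try:
    child.resize(10)
except AttributeError as err:
    print("assignment failed:", err)
```

Everything works until some code path finally calls `resize`, which is why the errors showed up only for certain inputs. Every class in Python 3 is new-style, so the property is always enforced. In Python 2, a class that did not inherit from `object` would let the assignment through silently.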

Overall, the experience of porting a rather large project between versions has taught me:

  1. “Explicit is better than implicit” seems a bit at cross purposes with duck typing, so be explicit about when a duck is a duck if you can.
  2. Namespaces, inheritance, and composition are amazing ideas, and Python handles them better than the file system.
  3. Don’t be concerned about “learning” idiomatic Python 3 if you know Python 2.7.
  4. If given the option to write something in a new feature of a language instead of an old feature, always pick the new feature. By the time you wrap your head around it, it’ll be considered “default” and the old feature will be deprecated.

And, since I’m sitting here, staring at our, trying to figure out why it won’t correctly build from source, nor install correctly using the build_2to3 builder, packaging in Python SUCKS and apparently has since 2008.

imported from: yetanothergeographer