This wasn’t a constraint I wanted. But just getting pure pip-installable packages like PySAL was difficult enough. The admins didn’t want to install one of the big scientific Python distributions and would only let something with minimal impact get used.
So, in composing the workshop, I focused on Shapely. In it, I had the students build an analogue to the GeoPandas dataframe by constructing dataframes with series containing a Shapely object. This was remarkably easy, but helped the attendees understand how to do spatial operations on
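Something along these lines gives the flavor of that exercise. It is only a sketch with made-up toy squares, assuming pandas and Shapely are installed; the column names and the probe point are mine for illustration, not the workshop's material.

```python
import pandas as pd
from shapely.geometry import Point, Polygon

# Hypothetical toy data: three unit squares stored as plain Python objects
# in an ordinary (object-dtype) pandas column.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "geometry": [
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(1, 0), (2, 0), (2, 1), (1, 1)]),
        Polygon([(3, 3), (4, 3), (4, 4), (3, 4)]),
    ],
})

# Geometric attributes are just element-wise maps over the column.
df["area"] = df["geometry"].apply(lambda g: g.area)

# A spatial predicate becomes a boolean mask, like any other filter.
probe = Point(0.5, 0.5)
hits = df[df["geometry"].apply(lambda g: g.contains(probe))]
```

Here `hits` keeps only the row whose square contains the probe point. Nothing in it needs an IO library; the geometries are built by hand.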
Overall, though, I was struck by how easy it was to replicate the core class features of GeoPandas, notably without the use of a heavy IO library like Fiona. Every few months, the PySAL team discusses whether or not to migrate to Fiona on the backend, instead of our existing pure Python solutions. I’ve been working on fixing up these solutions to get ready for a release of PySAL in Python 3, so moving to something like Fiona looked like it’d save a ton of work. But adding a dependency on OGR, with its costs in speed and install difficulty, has never been a popular option.
So, just now, I pushed up a pseudo-fork of GeoPandas that incorporates the main functionality of GeoPandas while trying to strip any reference or call to OGR. The idea here is that this fits in any space PySAL fits in, and works as an alternative frontend to PySAL’s current heavy reliance on raw numpy matrices and vectors.
But, having just recently re-read the ESR essay How To Become A Hacker, I wondered how productive this conversion/strip job would actually make people. In the essay, Eric Raymond argues that no problem should ever have to be solved twice.
And, in this case, I think I’ve certainly re-solved the problem of reading spatial data into a GeoDataframe. But, notably, I’ve done it using a backend with much less generality than the other solutions!
So, why did I decide to do this? I can certainly handle an OGR install. Anyone who wants to use PySAL on a GeoPandas-like dataframe should themselves be able to string together simple calls to numpy.array to cast their dataframes down to vectors that PySAL can use. In fact, I’m still working on solving a core issue of PySAL interoperability: construction of a spatial weights object inline with the GeoDataframe.
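The casting step really is only a couple of calls. A rough sketch, assuming pandas and numpy, with invented column names and values; the n-by-1 shape for the response is an assumption based on how PySAL's regression routines usually document their inputs.

```python
import numpy as np
import pandas as pd

# Hypothetical attribute table; the columns are made up for illustration.
df = pd.DataFrame({
    "income": [10.0, 12.5, 9.0],
    "pop": [100, 250, 80],
    "density": [1.5, 3.0, 0.7],
})

# Response as an (n, 1) column vector.
y = np.array(df["income"]).reshape(-1, 1)

# Predictors as an (n, k) matrix, one column per variable.
X = np.array(df[["pop", "density"]], dtype=float)
```

The arrays carry no index or geometry, so keeping their row order aligned with the dataframe is left to the caller.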
I guess I’m happy with the fact that this was born out of a realization I had during that workshop. And, this was a great exercise in (yet more) io hacking with PySAL. Where some might not be able to handle an OGR install, and thus not be able to grab the packages that depend on it, the geodf contrib module could work. But, I think, in the future, I’ll be focused more on linking GeoDataframe instances to PySAL analytics methods (using Patsy & a fast shapely->PySAL weights constructor) rather than simply replicating an io framework that Fiona makes more general and easier to use.
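To make the weights idea concrete, here is a deliberately naive sketch of the first half of such a constructor: turning Shapely geometries into a neighbors dictionary. The function name is hypothetical, the pairwise loop is brute force rather than fast, and it assumes only Shapely. As far as I know, a dictionary of this shape is what PySAL's W class accepts as its neighbors argument, but that hand-off is left out here.

```python
from itertools import combinations
from shapely.geometry import Polygon

def contiguity_neighbors(geometries):
    """Map each position to the positions of geometries that touch it.

    Brute-force O(n^2) comparison; `touches` also counts corner-only
    contact, so this is queen-style contiguity.
    """
    neighbors = {i: [] for i in range(len(geometries))}
    for i, j in combinations(range(len(geometries)), 2):
        if geometries[i].touches(geometries[j]):
            neighbors[i].append(j)
            neighbors[j].append(i)
    return neighbors

# Toy data: two squares sharing an edge, plus one isolated square.
squares = [
    Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
    Polygon([(1, 0), (2, 0), (2, 1), (1, 1)]),
    Polygon([(3, 3), (4, 3), (4, 4), (3, 4)]),
]
neighbors = contiguity_neighbors(squares)
```

A fast version would need a spatial index to avoid comparing every pair, which is where the real work is.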
In full, swapping PySAL.core.IOHandlers in as the main engine driving IO has convinced me that an internal GeoJSON representation, fully abstracted from whether the underlying data is geojson, shapefile, coverage, PostGIS, or whatever, as is provided by Fiona, is still probably the right way to do geospatial IO.
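The appeal of that abstraction shows up even in a toy. The sketch below uses only the standard library, and both reader functions are hypothetical stand-ins rather than anything from Fiona or PySAL. Two different sources are normalized to the same GeoJSON-like Feature mapping, and the downstream code never learns which source a record came from.

```python
import json

def feature_from_geojson(text):
    """Parse a GeoJSON Feature string into a plain dict."""
    return json.loads(text)

def feature_from_row(row):
    """Build the same kind of dict from a hypothetical (x, y, attrs) record."""
    x, y, attrs = row
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [x, y]},
        "properties": dict(attrs),
    }

def x_coordinates(features):
    """Downstream code touches only the common mapping, not the source format."""
    return [f["geometry"]["coordinates"][0] for f in features]

raw = ('{"type": "Feature", '
       '"geometry": {"type": "Point", "coordinates": [1.0, 2.0]}, '
       '"properties": {"name": "a"}}')

features = [
    feature_from_geojson(raw),
    feature_from_row((3.0, 4.0, {"name": "b"})),
]
```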
imported from: yetanothergeographer