3 Interfacing with GIS

In order to disrupt existing practices and really develop something new in Julia, we had to make some hard decisions along the way. One of these decisions relates to how we are willing to interface our framework with existing GIS standards and workflows.

On the one hand, we could have followed the path taken by other communities such as Python and R, and focused our energy on interfacing with well-tested GIS libraries written in C/C++ (e.g., GDAL, GEOS). This is precisely what the JuliaGeo organization has been doing over the years, and it is an important effort to bring in people from other languages who are used to the OGC standards.

On the other hand, we have young geoscientists and first-time programmers who have never studied GIS before and who struggle to learn the technology as it exists today. The widespread emphasis on machine representation and software engineering has created a gap between the developers and the users of GIS software. This is exactly the kind of gap that the Julia programming language helps to close.

We decided to limit our interface with existing GIS technology to input and output (IO) of files. This gives users of the framework the chance to

  1. Import geospatial data stored as simple features
  2. Perform geospatial data science with a rich set of tools
  3. Export results to widely used software (e.g., QGIS, ArcGIS)

This design creates an ecosystem where users can become contributors and maintainers of the framework without any knowledge of a second programming language.

3.1 GeoIO.jl

The GeoIO.jl module can load and save geospatial data on disk in a variety of formats, including the most popular formats in GIS (e.g., .shp, .geojson, .kml, .parquet), thanks to backend packages maintained by several Julia organizations. It is designed for users who just want to get their data ready for geospatial data science.

To load a file from disk, we use GeoIO.load:

using GeoIO

geotable = GeoIO.load("file.shp")

The function automatically selects the backend based on the file extension, converts the simple features into a geospatial domain, and returns a GeoTable.

To save the GeoTable to disk, possibly in a different format, we use GeoIO.save:

GeoIO.save("file.geojson", geotable)

The module fixes inconsistencies between formats whenever possible. For example, the GeoJSON format writes Date columns as String because the JSON format has no date type. The Shapefile format has its own limitations (e.g., attribute names are restricted to 10 characters).
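The Date example can be illustrated with the standard library alone. The snippet below is a minimal sketch of the general idea, not the code GeoIO.jl actually runs: the variable names are hypothetical, and it assumes dates are written as ISO 8601 strings.

```julia
using Dates

# A hypothetical attribute column holding Date values
dates = [Date(2024, 1, 15), Date(2024, 2, 29)]

# JSON has no date type, so a writer has to fall back to strings
written = string.(dates)

# A reader that knows the column holds dates can parse the strings back
restored = Date.(written)

restored == dates
```

Without the parsing step, the column would come back as plain strings, which is the kind of inconsistency described above.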

Over time, we expect to improve the ecosystem as a whole by highlighting various issues with available standards and backend implementations.

3.2 File formats

Most GIS file formats do not preserve topological information. This means that neighborhood information is lost as soon as geometries are saved to disk. To illustrate this issue, we consider a geotable over a CartesianGrid:

geotable = georef((A=rand(10, 10), B=rand(10, 10)))
100×3 GeoTable over 10×10 CartesianGrid{2,Float64}
A           B           geometry
Continuous  Continuous  Quadrangle
[NoUnits]   [NoUnits]
0.164241    0.562948    Quadrangle((0.0, 0.0), ..., (0.0, 1.0))
0.59019     0.509818    Quadrangle((1.0, 0.0), ..., (1.0, 1.0))
0.720319    0.0188277   Quadrangle((2.0, 0.0), ..., (2.0, 1.0))
0.0416395   0.511655    Quadrangle((3.0, 0.0), ..., (3.0, 1.0))
0.59616     0.392176    Quadrangle((4.0, 0.0), ..., (4.0, 1.0))
0.75092     0.730826    Quadrangle((5.0, 0.0), ..., (5.0, 1.0))
0.00279647  0.636113    Quadrangle((6.0, 0.0), ..., (6.0, 1.0))
0.72922     0.384358    Quadrangle((7.0, 0.0), ..., (7.0, 1.0))
0.534746    0.261772    Quadrangle((8.0, 0.0), ..., (8.0, 1.0))
0.357367    0.712609    Quadrangle((9.0, 0.0), ..., (9.0, 1.0))
⋮           ⋮           ⋮

If we save the geotable to a .geojson file on disk, and then load it back, we observe that the CartesianGrid gets replaced by a GeometrySet:

using GeoIO

fname = tempname() * ".geojson"

GeoIO.save(fname, geotable)

GeoIO.load(fname)
100×3 GeoTable over 100 GeometrySet{2,Float32}
A           B           geometry
Continuous  Continuous  PolyArea
[NoUnits]   [NoUnits]
0.164241    0.562948    PolyArea((0.0, 0.0), ..., (0.0, 1.0))
0.59019     0.509818    PolyArea((1.0, 0.0), ..., (1.0, 1.0))
0.720319    0.0188277   PolyArea((2.0, 0.0), ..., (2.0, 1.0))
0.0416395   0.511655    PolyArea((3.0, 0.0), ..., (3.0, 1.0))
0.59616     0.392176    PolyArea((4.0, 0.0), ..., (4.0, 1.0))
0.75092     0.730826    PolyArea((5.0, 0.0), ..., (5.0, 1.0))
0.00279647  0.636113    PolyArea((6.0, 0.0), ..., (6.0, 1.0))
0.72922     0.384358    PolyArea((7.0, 0.0), ..., (7.0, 1.0))
0.534746    0.261772    PolyArea((8.0, 0.0), ..., (8.0, 1.0))
0.357367    0.712609    PolyArea((9.0, 0.0), ..., (9.0, 1.0))
⋮           ⋮           ⋮

Other file formats such as .ply and .msh are widely used in computer graphics to save geospatial data over meshes, and preserve topological information:

beethoven = GeoIO.load("data/beethoven.ply")

viz(beethoven.geometry)

3.3 Rationale

Now that we have set expectations for our interface with GIS, let’s address an important question that many readers might have coming from other communities:

Do we gain anything by not adhering to programming interfaces?

The answer is an emphatic YES! It means that we have total freedom to innovate and improve the representation of various geometries and geospatial domains with Julia’s amazing type system. To give a simple example, let’s take a look at the Triangle geometry:

t = Triangle((0, 0), (1, 0), (1, 1))
Triangle{2,Float64}
├─ Point(0.0, 0.0)
├─ Point(1.0, 0.0)
└─ Point(1.0, 1.0)

If we treated this geometry as a generic polygon represented by a vector of vertices in memory, as is done in GeoInterface.jl, for example, we wouldn’t be able to dispatch optimized code that is only valid for a triangle:

@code_llvm isconvex(t)
;  @ /home/runner/.julia/packages/Meshes/5T0qz/src/predicates/isconvex.jl:57 within `isconvex`
define i8 @julia_isconvex_3872([1 x [3 x [1 x [1 x [2 x double]]]]]* nocapture readonly %0) #0 {
top:
  ret i8 1
}

Note

In Julia, the macro @code_llvm shows the LLVM intermediate representation that the compiler generates for a call. In this case, the body of the function is the single line ret i8 1, which is the instruction to return the constant integer 1.

Notice how the isconvex function is compiled away to the constant 1 (i.e., true) when called on the triangle. The code for a generic polygon is much more complicated and requires runtime checks that are expensive, especially in 3D.
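The same idea can be sketched without any package. The types MyTriangle and MyPolygon and the function myisconvex below are hypothetical and only illustrate the dispatch mechanism. They are not the definitions used in Meshes.jl, and the polygon check assumes a simple (non-self-intersecting) 2D polygon.

```julia
# Hypothetical types for illustration only
struct MyTriangle
  vertices::NTuple{3,NTuple{2,Float64}}
end

struct MyPolygon
  vertices::Vector{NTuple{2,Float64}}
end

# A triangle is always convex, so the method returns a constant
myisconvex(::MyTriangle) = true

# A generic polygon needs a runtime check: for a simple polygon,
# all consecutive turns must have the same orientation
function myisconvex(p::MyPolygon)
  v = p.vertices
  n = length(v)
  signs = map(1:n) do i
    a, b, c = v[i], v[mod1(i + 1, n)], v[mod1(i + 2, n)]
    sign((b[1] - a[1]) * (c[2] - b[2]) - (b[2] - a[2]) * (c[1] - b[1]))
  end
  all(≥(0), signs) || all(≤(0), signs)
end

t = MyTriangle(((0.0, 0.0), (1.0, 0.0), (1.0, 1.0)))
p = MyPolygon([(0.0, 0.0), (2.0, 0.0), (1.0, 0.5), (1.0, 2.0)])

myisconvex(t) # resolved by dispatch on the type
myisconvex(p) # computed from the vertices at runtime
```

Because the triangle has its own type, the compiler can select the constant method ahead of time. A representation that stores every polygon as a vector of vertices would always take the second path.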

Having cleared that up, we will now proceed to the last foundational chapter of the book, which covers the advanced geometric processing features of the framework.