Geospatial data

Overview

Given a table or array containing data, we can georeference these objects onto a geospatial domain with the georef function. The result of the georef function is a GeoTable. If you would like to learn this concept in more depth, check out the recording of our JuliaEO 2024 workshop:

GeoTables.georefFunction
georef(table, domain)

Georeference table on domain.

table must implement the Tables.jl interface (e.g., DataFrame, CSV.File, XLSX.Worksheet).

domain must implement the Meshes.jl interface (e.g., CartesianGrid, SimpleMesh, GeometrySet).

Examples

julia> georef((a=rand(100), b=rand(100)), CartesianGrid(10, 10))
georef(table, geoms)

Georeference table on vector of geometries geoms.

geoms must implement the Meshes.jl interface (e.g., Point, Quadrangle, Hexahedron).

Examples

julia> georef((a=rand(10), b=rand(10)), rand(Point2, 10))
georef(table, coords)

Georeference table on PointSet(coords).

Examples

julia> georef((a=rand(10), b=rand(10)), rand(2, 10))
georef(table, names)

Georeference table using coordinate column names.

Examples

georef((a=rand(10), x=rand(10), y=rand(10)), (:x, :y))
georef((a=rand(10), x=rand(10), y=rand(10)), [:x, :y])
georef((a=rand(10), x=rand(10), y=rand(10)), ("x", "y"))
georef((a=rand(10), x=rand(10), y=rand(10)), ["x", "y"])
georef(tuple)

Georeference a named tuple on CartesianGrid(dims), with dims being the dimensions of the tuple arrays.

Examples

julia> georef((a=rand(10, 10), b=rand(10, 10))) # 2D grid
julia> georef((a=rand(10, 10, 10), b=rand(10, 10, 10))) # 3D grid

The functions values and domain can be used to retrieve the table of attributes and the underlying geospatial domain:

Base.valuesFunction
values(geotable, [rank])

Return the values of geotable for a given rank as a table.

The rank is a non-negative integer that specifies the parametric dimension of the geometries of interest:

  • 0 - points
  • 1 - segments
  • 2 - triangles, quadrangles, ...
  • 3 - tetrahedrons, hexahedrons, ...

If the rank is not specified, it is assumed to be the rank of the elements of the domain.

GeoTables.domainFunction
domain(geotable)

Return underlying domain of the geotable.

The GeoIO.jl package can be used to load/save geospatial data from/to various file formats:

GeoIO.loadFunction
load(fname, layer=0, fix=true, kwargs...)

Load geospatial table from file fname and convert the geometry column to Meshes.jl geometries.

Optionally, specify the layer of geometries to read within the file and keyword arguments kwargs accepted by Shapefile.Table, GeoJSON.read GeoParquet.read and ArchGDAL.read.

The option fix can be used to fix orientation and degeneracy issues with polygons.

To see supported formats, use the formats function.

GeoIO.saveFunction
save(fname, geotable; kwargs...)

Save geospatial table to file fname using the appropriate format based on the file extension. Optionally, specify keyword arguments accepted by Shapefile.write and GeoJSON.write. For example, use force = true to force writing on existing .shp file.

To see supported formats, use the formats function.

GeoIO.formatsFunction
formats([io]; sortby=:format)

Displays in io (defaults to stdout if io is not given) a table with all formats supported by GeoIO.jl and the packages used to load and save each of them.

Optionally, sort the table by the :extension, :load or :save columns using the sortby argument.

The GeoArtifacts.jl package provides utility functions to automatically download geospatial data from repositories on the internet.

Examples

using GeoStats
import CairoMakie as Mke

# helper function for plotting two
# variables named T and P side by side
function plot(data)
  fig = Mke.Figure(size = (800, 400))
  viz(fig[1,1], data.geometry, color = data.T)
  viz(fig[1,2], data.geometry, color = data.P)
  fig
end
plot (generic function with 1 method)

Tables

Consider a table (e.g. DataFrame) with 25 samples of temperature T and pressure P:

using DataFrames

table = DataFrame(T=rand(25), P=rand(25))
25×2 DataFrame
RowTP
Float64Float64
10.2297540.78328
20.4018360.100198
30.121270.512855
40.4842410.860052
50.6406470.604212
60.182010.752646
70.9943010.372761
80.8946950.853588
90.360150.673628
100.1940670.750172
110.1404530.486765
120.3419750.696221
130.1966030.288413
140.2048690.769921
150.09005570.434643
160.898420.523237
170.710190.939833
180.7167180.339279
190.4649870.571637
200.1212350.0850448
210.8473330.250275
220.8527850.281409
230.910290.824594
240.385320.790992
250.8861920.330026

We can georeference this table based on a given set of coordinates:

georef(table, PointSet(rand(2,25))) |> plot

or alternatively, georeference it on a 5x5 regular grid (5x5 = 25 samples):

georef(table, CartesianGrid(5, 5)) |> plot

In the first case, the PointSet domain type can be omitted, and the framework will understand that the matrix passed as the second argument contains the coordinates of a point set:

georef(table, rand(2,25))
25×3 GeoTable over 25 PointSet{2,Float64}
T P geometry
Continuous Continuous Point2
[NoUnits] [NoUnits]
0.229754 0.78328 (0.396905, 0.0705349)
0.401836 0.100198 (0.831714, 0.564756)
0.12127 0.512855 (0.0897437, 0.780434)
0.484241 0.860052 (0.79656, 0.487434)
0.640647 0.604212 (0.395423, 0.536595)
0.18201 0.752646 (0.989296, 0.755125)
0.994301 0.372761 (0.19655, 0.695815)
0.894695 0.853588 (0.853091, 0.699933)
0.36015 0.673628 (0.149736, 0.039659)
0.194067 0.750172 (0.472848, 0.599857)

Another common pattern in geospatial data is when the coordinates of the samples are already part of the table as columns. In this case, we can specify the column names as symbols:

table = DataFrame(T=rand(25), P=rand(25), X=rand(25), Y=rand(25), Z=rand(25))

georef(table, (:X, :Y, :Z)) |> plot

Arrays

Consider arrays (e.g. images) with data for various geospatial variables. We can georeference these arrays using a named tuple, and the framework will understand that the shape of the arrays should be preserved in a CartesianGrid:

T, P = rand(5,5), rand(5,5)

georef((T=T, P=P)) |> plot

Alternatively, we can interpret the entries of the named tuple as columns in a table:

georef((T=vec(T), P=vec(P)), rand(2,25)) |> plot

Files

We can easily load geospatial data from disk without any specific knowledge of file formats:

using GeoIO

zone = GeoIO.load("data/zone.shp")
path = GeoIO.load("data/path.shp")

viz(zone.geometry)
viz!(path.geometry, color = :gray90)
Mke.current_figure()