Geospatial data

Overview

Given a table or array containing data, we can georeference these objects onto a geospatial domain with the georef function. For a list of available domains, please see Domains.

GeoStatsBase.georefFunction
georef(table, domain)

Georeference table on geospatial domain.

georef(table, coords)

Georeference table on a PointSet(coords).

georef(table, coordnames)

Georeference table using columns coordnames.

georef(tuple, domain)

Georeference named tuple on geospatial domain.

georef(tuple, coords)

Georefrence named tuple on PointSet(coords).

georef(tuple; origin=(0.,0.,...), spacing=(1.,1.,...))

Georeference named tuple on CartesianGrid(size(tuple[1]), origin, spacing).

In the opposite direction, the functions values and domain can be used to retrieve the table of attributes and the underlying geospatial domain.

Base.valuesFunction
values(data, [rank])

Return the values of data for a given rank as a table.

The rank is a non-negative integer that specifies the parametric dimension of the geometries of interest:

  • 0 - points
  • 1 - segments
  • 2 - triangles, quadrangles, ...
  • 3 - tetrahedrons, hexahedrons, ...

If the rank is not specified, it is assumed to be the rank of the elements of the domain.

Meshes.domainFunction
domain(data)

Return underlying domain of the data.

Examples

Tables

Consider a table (e.g. DataFrame) with 25 samples of temperature and precipitation:

using DataFrames

table = DataFrame(T=rand(25), P=rand(25))

25 rows × 2 columns

TP
Float64Float64
10.704170.747171
20.4215270.864585
30.6886920.21695
40.3938070.199692
50.1786370.969165
60.6140380.550512
70.6196980.994584
80.3137150.228739
90.258120.252486
100.01402860.899314
110.5036650.695648
120.6379510.625345
130.4564980.977333
140.534850.0600936
150.9664740.055822
160.7226870.185636
170.21050.365988
180.03321880.759673
190.7260960.686092
200.08725690.882196
210.5151310.39207
220.4325420.739041
230.9473750.799289
240.3246980.495992
250.8435340.142729

We can georeference this table based on a given set of coordinates:

𝒟 = georef(table, PointSet(rand(2,25)))

plot(𝒟)

or alternatively, georeference it on a 5x5 regular grid (5x5 = 25 samples):

𝒟 = georef(table, CartesianGrid(5,5))

plot(𝒟)

In the first case, the PointSet domain type can be omitted, and GeoStats.jl will understand that the matrix passed as the second argument contains the coordinates of a point set:

𝒟 = georef(table, rand(2,25))
25 MeshData{2,Float64}
  variables (rank 0)
    └─P (Float64)
    └─T (Float64)
  domain: 25 PointSet{2,Float64}

Another common pattern in spatial data sets is when the coordinates of the samples are already part of the table as columns. In this case, we can specify the column names as symbols:

table = DataFrame(T=rand(25), P=rand(25), X=rand(25), Y=rand(25), Z=rand(25))

𝒟 = georef(table, (:X,:Y,:Z))

plot(𝒟)

Arrays

Consider arrays (e.g. images) with data for various spatial variables. We can georeference these arrays using a named tuple, and GeoStats.jl will understand that the shape of the arrays should be preserved in a Cartesian grid:

T, P = rand(5,5), rand(5,5)

𝒟 = georef((T=T, P=P))

plot(𝒟)

We can also specify the origin and spacing of the grid using keyword arguments:

𝒟₁ = georef((T=T, P=P), origin=(0.,0.), spacing=(1.,1.))
𝒟₂ = georef((T=T, P=P), origin=(10.,10.), spacing=(2.,2.))

plot(𝒟₁)
plot!(𝒟₂)

Alternatively, we can interpret the entries of the named tuple as columns in a table:

𝒟 = georef((T=T, P=T), rand(2,25))

plot(𝒟)

Shapefiles

The GeoTables.jl package can be used to load geospatial data from various file formats. It also provides utility functions to automatically download maps given the name of any region in the world.

We can load a shapefile as a geospatial table that is compatible with the framework:

using GeoTables

zone = GeoTables.load("data/zone.shp")
4 GeoTable{2,Float64}
  variables (rank 2)
    └─ACRES (Float64)
    └─Hectares (Float64)
    └─MACROZONA (String)
    └─PERIMETER (Float64)
    └─area_m2 (Float64)
  domain: 4 GeometrySet{2,Float64}
path = GeoTables.load("data/path.shp")
6 GeoTable{2,Float64}
  variables (rank 2)
    └─ZONA (String)
  domain: 6 GeometrySet{2,Float64}

Unlike in previous examples where each row of the table was associated with simple geometries (e.g. Point or Quadrangle), here we have more complicated geometries to consider:

zone.geometry
4-element Vector{Multi{2, Float64, PolyArea{2, Float64, Chain{2, Float64, Vector{Point2}}}}}:
 21 Multi-PolyArea{2,Float64}
 1 Multi-PolyArea{2,Float64}
 1 Multi-PolyArea{2,Float64}
 2 Multi-PolyArea{2,Float64}

We can visualize these geometries as the domain of the geospatial data as usual:

plot(domain(zone), fill=true, color=:gray)
plot!(domain(path), fill=true, color=:gray90)

and most importantly, nothing special needs to be done. This geospatial table can be used with any geostatistical workflow.

Custom data

GeoStats.jl is integrated with the Meshes.jl project. In summary, any type that is a subtype of Meshes.Data and that implements Meshes.domain and Meshes.values is compatible with the framework and can be used in geostatistical workflows.

Please ask in our community channels if you need help implementing custom geospatial data types.