Geospatial data
Overview
Given a table or array containing data, we can georeference these objects onto a geospatial domain with the georef
function. The result of the georef
function is a GeoTable. If you would like to learn this concept in more depth, check out the recording of our JuliaEO 2024 workshop:
GeoTables.georef
— Functiongeoref(table, domain)
Georeference table
on domain
.
table
must implement the Tables.jl interface (e.g., DataFrame
, CSV.File
, XLSX.Worksheet
).
domain
must implement the Meshes.jl interface (e.g., CartesianGrid
, SimpleMesh
, GeometrySet
).
Examples
julia> georef((a=rand(100), b=rand(100)), CartesianGrid(10, 10))
georef(table, geoms)
Georeference table
on vector of geometries geoms
.
geoms
must implement the Meshes.jl interface (e.g., Point
, Quadrangle
, Hexahedron
).
Examples
julia> georef((a=rand(10), b=rand(10)), rand(Point2, 10))
georef(table, coords)
Georeference table
on PointSet(coords)
.
Examples
julia> georef((a=rand(10), b=rand(10)), rand(2, 10))
georef(table, names)
Georeference table
using coordinate column names
.
Examples
georef((a=rand(10), x=rand(10), y=rand(10)), (:x, :y))
georef((a=rand(10), x=rand(10), y=rand(10)), [:x, :y])
georef((a=rand(10), x=rand(10), y=rand(10)), ("x", "y"))
georef((a=rand(10), x=rand(10), y=rand(10)), ["x", "y"])
georef(tuple)
Georeference a named tuple
on CartesianGrid(dims)
, with dims
being the dimensions of the tuple
arrays.
Examples
julia> georef((a=rand(10, 10), b=rand(10, 10))) # 2D grid
julia> georef((a=rand(10, 10, 10), b=rand(10, 10, 10))) # 3D grid
The functions values
and domain
can be used to retrieve the table of attributes and the underlying geospatial domain:
Base.values
— Functionvalues(geotable, [rank])
Return the values of geotable
for a given rank
as a table.
The rank is a non-negative integer that specifies the parametric dimension of the geometries of interest:
- 0 - points
- 1 - segments
- 2 - triangles, quadrangles, ...
- 3 - tetrahedrons, hexahedrons, ...
If the rank is not specified, it is assumed to be the rank of the elements of the domain.
GeoTables.domain
— Functiondomain(geotable)
Return underlying domain of the geotable
.
The GeoIO.jl package can be used to load/save geospatial data from/to various file formats:
GeoIO.load
— Functionload(fname, layer=0, fix=true, kwargs...)
Load geospatial table from file fname
and convert the geometry
column to Meshes.jl geometries.
Optionally, specify the layer
of geometries to read within the file and keyword arguments kwargs
accepted by Shapefile.Table
, GeoJSON.read
GeoParquet.read
and ArchGDAL.read
.
The option fix
can be used to fix orientation and degeneracy issues with polygons.
To see supported formats, use the formats
function.
GeoIO.save
— Functionsave(fname, geotable; kwargs...)
Save geospatial table to file fname
using the appropriate format based on the file extension. Optionally, specify keyword arguments accepted by Shapefile.write
and GeoJSON.write
. For example, use force = true
to force writing on existing .shp
file.
To see supported formats, use the formats
function.
GeoIO.formats
— Functionformats([io]; sortby=:format)
Displays in io
(defaults to stdout
if io
is not given) a table with all formats supported by GeoIO.jl and the packages used to load and save each of them.
Optionally, sort the table by the :extension
, :load
or :save
columns using the sortby
argument.
The GeoArtifacts.jl package provides utility functions to automatically download geospatial data from repositories on the internet.
Examples
using GeoStats
import CairoMakie as Mke
# helper function for plotting two
# variables named T and P side by side
function plot(data)
fig = Mke.Figure(size = (800, 400))
viz(fig[1,1], data.geometry, color = data.T)
viz(fig[1,2], data.geometry, color = data.P)
fig
end
plot (generic function with 1 method)
Tables
Consider a table (e.g. DataFrame) with 25 samples of temperature T
and pressure P
:
using DataFrames
table = DataFrame(T=rand(25), P=rand(25))
Row | T | P |
---|---|---|
Float64 | Float64 | |
1 | 0.229754 | 0.78328 |
2 | 0.401836 | 0.100198 |
3 | 0.12127 | 0.512855 |
4 | 0.484241 | 0.860052 |
5 | 0.640647 | 0.604212 |
6 | 0.18201 | 0.752646 |
7 | 0.994301 | 0.372761 |
8 | 0.894695 | 0.853588 |
9 | 0.36015 | 0.673628 |
10 | 0.194067 | 0.750172 |
11 | 0.140453 | 0.486765 |
12 | 0.341975 | 0.696221 |
13 | 0.196603 | 0.288413 |
14 | 0.204869 | 0.769921 |
15 | 0.0900557 | 0.434643 |
16 | 0.89842 | 0.523237 |
17 | 0.71019 | 0.939833 |
18 | 0.716718 | 0.339279 |
19 | 0.464987 | 0.571637 |
20 | 0.121235 | 0.0850448 |
21 | 0.847333 | 0.250275 |
22 | 0.852785 | 0.281409 |
23 | 0.91029 | 0.824594 |
24 | 0.38532 | 0.790992 |
25 | 0.886192 | 0.330026 |
We can georeference this table based on a given set of coordinates:
georef(table, PointSet(rand(2,25))) |> plot
or alternatively, georeference it on a 5x5 regular grid (5x5 = 25 samples):
georef(table, CartesianGrid(5, 5)) |> plot
In the first case, the PointSet
domain type can be omitted, and the framework will understand that the matrix passed as the second argument contains the coordinates of a point set:
georef(table, rand(2,25))
T | P | geometry |
---|---|---|
Continuous | Continuous | Point2 |
[NoUnits] | [NoUnits] | |
0.229754 | 0.78328 | (0.396905, 0.0705349) |
0.401836 | 0.100198 | (0.831714, 0.564756) |
0.12127 | 0.512855 | (0.0897437, 0.780434) |
0.484241 | 0.860052 | (0.79656, 0.487434) |
0.640647 | 0.604212 | (0.395423, 0.536595) |
0.18201 | 0.752646 | (0.989296, 0.755125) |
0.994301 | 0.372761 | (0.19655, 0.695815) |
0.894695 | 0.853588 | (0.853091, 0.699933) |
0.36015 | 0.673628 | (0.149736, 0.039659) |
0.194067 | 0.750172 | (0.472848, 0.599857) |
⋮ | ⋮ | ⋮ |
Another common pattern in geospatial data is when the coordinates of the samples are already part of the table as columns. In this case, we can specify the column names as symbols:
table = DataFrame(T=rand(25), P=rand(25), X=rand(25), Y=rand(25), Z=rand(25))
georef(table, (:X, :Y, :Z)) |> plot
Arrays
Consider arrays (e.g. images) with data for various geospatial variables. We can georeference these arrays using a named tuple, and the framework will understand that the shape of the arrays should be preserved in a CartesianGrid
:
T, P = rand(5,5), rand(5,5)
georef((T=T, P=P)) |> plot
Alternatively, we can interpret the entries of the named tuple as columns in a table:
georef((T=vec(T), P=vec(P)), rand(2,25)) |> plot
Files
We can easily load geospatial data from disk without any specific knowledge of file formats:
using GeoIO
zone = GeoIO.load("data/zone.shp")
path = GeoIO.load("data/path.shp")
viz(zone.geometry)
viz!(path.geometry, color = :gray90)
Mke.current_figure()