Geospatial data
Overview
Given a table or array containing data, we can georeference these objects onto a geospatial domain with the georef
function. The result of the georef
function is a GeoTable. If you would like to learn this concept in more depth, check out the recording of our JuliaEO 2024 workshop:
GeoTables.georef
— Functiongeoref(table, domain)
Georeference table
on domain
from Meshes.jl
Examples
julia> georef((a=rand(100), b=rand(100)), CartesianGrid(10, 10))
georef(table, geoms)
Georeference table
on vector of geometries geoms
from Meshes.jl
Examples
julia> georef((a=rand(10), b=rand(10)), rand(Point, 10))
georef(table, coords; [crs], [lenunit])
Georeference table
using coordinates coords
of points.
Optionally, specify the coordinate reference system crs
, which is set by default based on heuristics, and the lenunit
(default to meters for unitless values) that can only be used in CRS types that allow this flexibility. Any CRS
or EPSG
/ESRI
code from CoordRefSystems.jl is supported.
Examples
julia> georef((a=[1, 2, 3], b=[4, 5, 6], [(0, 0), (1, 1), (2, 2)])
julia> georef((a=[1, 2, 3], b=[4, 5, 6], [(0, 0), (1, 1), (2, 2)], crs=LatLon)
julia> georef((a=[1, 2, 3], b=[4, 5, 6], [(0, 0), (1, 1), (2, 2)], crs=EPSG{4326})
julia> georef((a=[1, 2, 3], b=[4, 5, 6], [(0, 0), (1, 1), (2, 2)], lenunit=u"cm")
georef(table, names; [crs], [lenunit])
Georeference table
using coordinates of points stored in column names
.
Optionally, specify the coordinate reference system crs
, which is set by default based on heuristics, and the lenunit
(default to meters for unitless values) that can only be used in CRS types that allow this flexibility. Any CRS
or EPSG
/ESRI
code from CoordRefSystems.jl is supported.
Examples
georef((a=rand(10), x=rand(10), y=rand(10)), ("x", "y"))
georef((a=rand(10), x=rand(10), y=rand(10)), ("x", "y"), crs=LatLon)
georef((a=rand(10), x=rand(10), y=rand(10)), ("x", "y"), crs=EPSG{4326})
georef((a=rand(10), x=rand(10), y=rand(10)), ("x", "y"), lenunit=u"cm")
georef(tuple)
Georeference a named tuple
on CartesianGrid(dims)
, with dims
obtained from the arrays stored in the tuple.
Examples
julia> georef((a=rand(10, 10), b=rand(10, 10))) # 2D grid
julia> georef((a=rand(10, 10, 10), b=rand(10, 10, 10))) # 3D grid
The functions values
and domain
can be used to retrieve the table of attributes and the underlying geospatial domain:
Base.values
— Functionvalues(geotable, [rank])
Return the values of geotable
for a given rank
as a table.
The rank is a non-negative integer that specifies the parametric dimension of the geometries of interest:
- 0 - points
- 1 - segments
- 2 - triangles, quadrangles, ...
- 3 - tetrahedrons, hexahedrons, ...
If the rank is not specified, it is assumed to be the rank of the elements of the domain.
GeoTables.domain
— Functiondomain(geotable)
Return underlying domain of the geotable
.
Examples
using GeoStats
import CairoMakie as Mke
# helper function for plotting two
# variables named T and P side by side
function plot(data)
fig = Mke.Figure(size = (800, 400))
viz(fig[1,1], data.geometry, color = data.T)
viz(fig[1,2], data.geometry, color = data.P)
fig
end
plot (generic function with 1 method)
Tables
Consider a table (e.g. DataFrame) with 25 samples of temperature T
and pressure P
:
using DataFrames
table = DataFrame(T=rand(25), P=rand(25))
Row | T | P |
---|---|---|
Float64 | Float64 | |
1 | 0.791168 | 0.752989 |
2 | 0.702574 | 0.0936641 |
3 | 0.47882 | 0.456286 |
4 | 0.209769 | 0.729406 |
5 | 0.123141 | 0.322685 |
6 | 0.789843 | 0.930773 |
7 | 0.508285 | 0.26811 |
8 | 0.0644423 | 0.171855 |
9 | 0.850016 | 0.424472 |
10 | 0.110945 | 0.667494 |
11 | 0.367835 | 0.565091 |
12 | 0.232536 | 0.317064 |
13 | 0.765879 | 0.0153944 |
14 | 0.151182 | 0.708185 |
15 | 0.494448 | 0.843985 |
16 | 0.158814 | 0.263698 |
17 | 0.245266 | 0.247371 |
18 | 0.805644 | 0.747046 |
19 | 0.26638 | 0.653114 |
20 | 0.551835 | 0.429277 |
21 | 0.875816 | 0.396554 |
22 | 0.460369 | 0.79361 |
23 | 0.184034 | 0.721191 |
24 | 0.474176 | 0.555584 |
25 | 0.856812 | 0.659673 |
We can georeference this table based on a given set of points:
georef(table, rand(Point, 25)) |> plot
or alternatively, georeference it on a 5x5 regular grid (5x5 = 25 samples):
georef(table, CartesianGrid(5, 5)) |> plot
Another common pattern in geospatial data is when the coordinates of the samples are already part of the table as columns. In this case, we can specify the column names:
table = DataFrame(T=rand(25), P=rand(25), X=rand(25), Y=rand(25), Z=rand(25))
georef(table, ("X", "Y", "Z")) |> plot
Arrays
Consider arrays (e.g. images) with data for various geospatial variables. We can georeference these arrays using a named tuple, and the framework will understand that the shape of the arrays should be preserved in a CartesianGrid
:
T, P = rand(5, 5), rand(5, 5)
georef((T=T, P=P)) |> plot
Alternatively, we can interpret the entries of the named tuple as columns in a table:
georef((T=vec(T), P=vec(P)), rand(Point, 25)) |> plot
Files
The GeoIO.jl package can be used to load/save geospatial data from/to various file formats:
GeoIO.load
— Functionload(fname, repair=true, layer=0, lenunit=nothing, kwargs...)
Load geospatial table from file fname
stored in any format.
Various repair
s are performed on the stored geometries by default, including fixes of orientation in rings of polygons, removal of zero-area triangles, etc.
Some of the repairs can be expensive on large data sets. In that case, we recommend setting repair=false
. Custom repairs can be performed with the Repair
transform from Meshes.jl.
Optionally, specify the layer
to read within the file, and the length unit lenunit
of the coordinates when the format does not include units in its specification. Other kwargs
are forwarded to the backend packages.
Please use the formats
function to list all supported file formats.
Options
OFF
defaultcolor
: default color of the geometries if the file does not have this data (default toRGBA(0.666, 0.666, 0.666, 0.666)
);
CSV
coords
: names of the columns with point coordinates (required option);- Other options are passed to
CSV.File
, see the CSV.jl documentation for more details;
VTK formats (.vtu
, .vtp
, .vtr
, .vts
, .vti
)
- mask: name of the boolean column that encodes the indices of a grid view (default to
:MASK
). If the column does not exist in the file, the full grid is returned;
Common Data Model formats (NetCDF, GRIB)
x
: name of the column with x coordinates (default to"x"
,"X"
,"lon"
, or"longitude"
);y
: name of the column with y coordinates (default to"y"
,"Y"
,"lat"
, or"latitude"
);z
: name of the column with z coordinates (default to"z"
,"Z"
,"depth"
, or"height"
);t
: name of the column with time measurements (default to"t"
,"time"
, or"TIME"
);
GeoJSON
numbertype
: number type of geometry coordinates (default toFloat64
)- Other options are passed to
GeoJSON.read
, see the GeoJSON.jl documentation for more details;
GSLIB
- Other options are passed to
GslibIO.load
, see the GslibIO.jl documentation for more details;
Shapefile
- Other options are passed to
Shapefile.read
, see the Shapefile.jl documentation for more details;
GeoParquet
- Other options are passed to
GeoParquet.read
, see the GeoParquet.jl documentation for more details;
GeoTIFF, GeoPackage, KML
- Other options are passed to
ArchGDAL.read
, see the ArchGDAL.jl documentation for more details;
Examples
# load coordinates of geojson file as Float64 (default)
GeoIO.load("file.geojson")
# load coordinates of geojson file as Float32
GeoIO.load("file.geojson", numbertype=Float32)
GeoIO.save
— Functionsave(fname, geotable; kwargs...)
Save geotable
to file fname
of given format based on the file extension.
Other kwargs
are forwarded to the backend packages.
Please use the formats
function to list all supported file formats.
Options
OFF
color
: name of the column with geometry colors, ifnothing
the geometries will be saved without colors (default tonothing
);
MSH
vcolumn
: name of the column in vertex table with node data, ifnothing
the geometries will be saved without node data (default tonothing
);ecolumn
: name of the column in element table with element data, ifnothing
the geometries will be saved without element data (default tonothing
);
STL
ascii
: defines whether the file will be saved in ASCII format, otherwise Binary format will be used (default tofalse
);
CSV
coords
: names of the columns where the point coordinates will be saved (default to"x"
,"y"
,"z"
);floatformat
: C-style format string for float values (default to no formatting);- Other options are passed to
CSV.write
, see the CSV.jl documentation for more details;
NetCDF
x
: name of the column where the coordinate x will be saved (default to CRS coordinate name);y
: name of the column where the coordinate y will be saved (default to CRS coordinate name);z
: name of the column where the coordinate z will be saved (default to CRS coordinate name);t
: name of the column where the time measurements will be saved (default to"t"
);
GeoTIFF
options
: list with options that will be passed to GDAL;
GeoPackage
layername
: name of the layer where the data will be saved (default to"data"
);options
: dictionary with options that will be passed to GDAL;
GSLIB
- Other options are passed to
GslibIO.save
, see the GslibIO.jl documentation for more details;
Shapefile
- Other options are passed to
Shapefile.write
, see the Shapefile.jl documentation for more details;
GeoJSON
- Other options are passed to
GeoJSON.write
, see the GeoJSON.jl documentation for more details;
GeoParquet
- Other options are passed to
GeoParquet.write
, see the GeoParquet.jl documentation for more details;
Examples
# overwrite an existing shapefile
GeoIO.save("file.shp", force = true)
GeoIO.formats
— Functionformats([io]; sortby=:format)
Displays in io
(defaults to stdout
if io
is not given) a table with all formats supported by GeoIO.jl and the packages used to load and save each of them.
Optionally, sort the table by the :extension
, :load
or :save
columns using the sortby
argument.
using GeoIO
zone = GeoIO.load("data/zone.shp")
PERIMETER | ACRES | MACROZONA | Hectares | area_m2 | geometry |
---|---|---|---|---|---|
Continuous | Continuous | Categorical | Continuous | Continuous | MultiPolygon |
[NoUnits] | [NoUnits] | [NoUnits] | [NoUnits] | [NoUnits] | 🖈 GeodeticLatLon{SAD69} |
5.8508e6 | 3.23145e7 | Estuario | 1.30772e7 | 1.30772e11 | Multi(21×PolyArea) |
9.53947e6 | 2.50594e8 | Fronteiras Antigas | 1.01412e8 | 1.01412e12 | Multi(1×PolyArea) |
1.01743e7 | 2.75528e8 | Fronteiras Intermediarias | 1.11502e8 | 1.11502e12 | Multi(1×PolyArea) |
7.09612e6 | 1.61293e8 | Fronteiras Novas | 6.5273e7 | 6.5273e11 | Multi(2×PolyArea) |
Artifacts
The GeoArtifacts.jl package provides utility functions to automatically download geospatial data from repositories on the internet.
using GeoArtifacts
# download artifacts from naturalearthdata.com
earth = NaturalEarth.naturalearth1("water")
borders = NaturalEarth.borders()
airports = NaturalEarth.airports()
ports = NaturalEarth.ports()
# initialize viewer with a coarse "raster"
earth |> Upscale(10, 5) |> viewer
# add other elements to the visualization
viz!(borders.geometry, color = "cyan")
viz!(airports.geometry, color = "black", pointsize=4, pointmarker='✈')
viz!(ports.geometry, color = "blue", pointsize=4, pointmarker='⚓')
# display current figure
Mke.current_figure()