Geospatial data
Overview
Given a table or array containing data, we can georeference these objects onto a geospatial domain with the georef
function. The result of the georef
function is a GeoTable. If you would like to learn this concept in more depth, check out the recording of our JuliaEO 2024 workshop:
GeoTables.georef
— Functiongeoref(table, domain)
Georeference table
on domain
from Meshes.jl
Examples
julia> georef((a=rand(100), b=rand(100)), CartesianGrid(10, 10))
georef(table, geoms)
Georeference table
on vector of geometries geoms
from Meshes.jl
Examples
julia> georef((a=rand(10), b=rand(10)), rand(Point, 10))
georef(table, coords; [crs], [lenunit])
Georeference table
using coordinates coords
of points.
Optionally, specify the coordinate reference system crs
, which is set by default based on heuristics, and the lenunit
(default to meters for unitless values) that can only be used in CRS types that allow this flexibility. Any CRS
or EPSG
/ESRI
code from CoordRefSystems.jl is supported.
Examples
julia> georef((a=[1, 2, 3], b=[4, 5, 6], [(0, 0), (1, 1), (2, 2)])
julia> georef((a=[1, 2, 3], b=[4, 5, 6], [(0, 0), (1, 1), (2, 2)], crs=LatLon)
julia> georef((a=[1, 2, 3], b=[4, 5, 6], [(0, 0), (1, 1), (2, 2)], crs=EPSG{4326})
julia> georef((a=[1, 2, 3], b=[4, 5, 6], [(0, 0), (1, 1), (2, 2)], lenunit=u"cm")
georef(table, names; [crs], [lenunit])
Georeference table
using coordinates of points stored in column names
.
Optionally, specify the coordinate reference system crs
, which is set by default based on heuristics, and the lenunit
(default to meters for unitless values) that can only be used in CRS types that allow this flexibility. Any CRS
or EPSG
/ESRI
code from CoordRefSystems.jl is supported.
Examples
georef((a=rand(10), x=rand(10), y=rand(10)), ("x", "y"))
georef((a=rand(10), x=rand(10), y=rand(10)), ("x", "y"), crs=LatLon)
georef((a=rand(10), x=rand(10), y=rand(10)), ("x", "y"), crs=EPSG{4326})
georef((a=rand(10), x=rand(10), y=rand(10)), ("x", "y"), lenunit=u"cm")
georef(tuple)
Georeference a named tuple
on CartesianGrid(dims)
, with dims
obtained from the arrays stored in the tuple.
Examples
julia> georef((a=rand(10, 10), b=rand(10, 10))) # 2D grid
julia> georef((a=rand(10, 10, 10), b=rand(10, 10, 10))) # 3D grid
The functions values
and domain
can be used to retrieve the table of attributes and the underlying geospatial domain:
Base.values
— Functionvalues(geotable, [rank])
Return the values of geotable
for a given rank
as a table.
The rank is a non-negative integer that specifies the parametric dimension of the geometries of interest:
- 0 - points
- 1 - segments
- 2 - triangles, quadrangles, ...
- 3 - tetrahedrons, hexahedrons, ...
If the rank is not specified, it is assumed to be the rank of the elements of the domain.
GeoTables.domain
— Functiondomain(geotable)
Return underlying domain of the geotable
.
Examples
using GeoStats
import CairoMakie as Mke
# helper function for plotting two
# variables named T and P side by side
function plot(data)
fig = Mke.Figure(size = (800, 400))
viz(fig[1,1], data.geometry, color = data.T)
viz(fig[1,2], data.geometry, color = data.P)
fig
end
plot (generic function with 1 method)
Tables
Consider a table (e.g. DataFrame) with 25 samples of temperature T
and pressure P
:
using DataFrames
table = DataFrame(T=rand(25), P=rand(25))
Row | T | P |
---|---|---|
Float64 | Float64 | |
1 | 0.888158 | 0.259644 |
2 | 0.17174 | 0.41279 |
3 | 0.381217 | 0.617203 |
4 | 0.529668 | 0.469265 |
5 | 0.833948 | 0.114354 |
6 | 0.669915 | 0.571324 |
7 | 0.0243105 | 0.216162 |
8 | 0.0065159 | 0.156883 |
9 | 0.954275 | 0.915785 |
10 | 0.596287 | 0.106082 |
11 | 0.138569 | 0.229454 |
12 | 0.728487 | 0.225839 |
13 | 0.27266 | 0.231184 |
14 | 0.505449 | 0.226092 |
15 | 0.135178 | 0.759987 |
16 | 0.710001 | 0.174766 |
17 | 0.0650785 | 0.966856 |
18 | 0.919294 | 0.742708 |
19 | 0.912138 | 0.473233 |
20 | 0.789781 | 0.341687 |
21 | 0.956157 | 0.602556 |
22 | 0.2342 | 0.0175214 |
23 | 0.945265 | 0.679348 |
24 | 0.27799 | 0.424153 |
25 | 0.731118 | 0.0524933 |
We can georeference this table based on a given set of points:
georef(table, rand(Point, 25)) |> plot

or alternatively, georeference it on a 5x5 regular grid (5x5 = 25 samples):
georef(table, CartesianGrid(5, 5)) |> plot

Another common pattern in geospatial data is when the coordinates of the samples are already part of the table as columns. In this case, we can specify the column names:
table = DataFrame(T=rand(25), P=rand(25), X=rand(25), Y=rand(25), Z=rand(25))
georef(table, ("X", "Y", "Z")) |> plot

Arrays
Consider arrays (e.g. images) with data for various geospatial variables. We can georeference these arrays using a named tuple, and the framework will understand that the shape of the arrays should be preserved in a CartesianGrid
:
T, P = rand(5, 5), rand(5, 5)
georef((T=T, P=P)) |> plot

Alternatively, we can interpret the entries of the named tuple as columns in a table:
georef((T=vec(T), P=vec(P)), rand(Point, 25)) |> plot

Files
The GeoIO.jl package can be used to load/save geospatial data from/to various file formats:
GeoIO.load
— FunctionGeoIO.load(fname; repair=true, layer=1, lenunit=nothing, numtype=Float64, kwargs...)
Load geospatial table from file fname
stored in any format.
Please use the formats
function to list all supported file formats. GIS formats with missing geometries are considered bad practice. In this case, we provide the auxiliary function loadvalues
with similar options.
Various repair
s are performed on the stored geometries by default, including fixes of orientation in rings of polygons, removal of zero-area triangles, etc. Some of the repairs can be expensive on large data sets. In this case, we recommend setting repair=false
. Custom repairs can be performed with the Repair
transform from Meshes.jl.
It is also possible to specify the layer
to read within the file, the length unit lenunit
of the coordinates when the format does not include units in its specification, and the number type numtype
of the coordinate values.
Other kwargs
options are forwarded to the backend packages and are documented below.
Options
CSV
coords
: names of the columns with point coordinates (required option);- Other options are passed to
CSV.File
, see the CSV.jl documentation for more details;
GSLIB
- Other options are passed to
GslibIO.load
, see the GslibIO.jl documentation for more details;
GeoJSON
- Other options are passed to
GeoJSON.read
, see the GeoJSON.jl documentation for more details;
GeoParquet
- Other options are passed to
GeoParquet.read
, see the GeoParquet.jl documentation for more details;
Shapefile
- Other options are passed to
Shapefile.read
, see the Shapefile.jl documentation for more details;
Formats handled by GDAL (GeoPackage, KML)
- Other options are passed to
ArchGDAL.read
, see the ArchGDAL.jl documentation for more details;
OFF
defaultcolor
: default color of the geometries if the file does not have this data (default toRGBA(0.666, 0.666, 0.666, 0.666)
);
VTK formats (.vtu
, .vtp
, .vtr
, .vts
, .vti
)
- mask: name of the boolean column that encodes the indices of a grid view (default to
:mask
). If the column does not exist in the file, the full grid is returned;
Common Data Model formats (NetCDF, GRIB)
x
: name of the column with x coordinates (default to"x"
,"X"
,"lon"
, or"longitude"
);y
: name of the column with y coordinates (default to"y"
,"Y"
,"lat"
, or"latitude"
);z
: name of the column with z coordinates (default to"z"
,"Z"
,"depth"
, or"height"
);t
: name of the column with time measurements (default to"t"
,"time"
, or"TIME"
);
Examples
# load GeoJSON file
GeoIO.load("file.geojson")
# load GeoPackage file
GeoIO.load("file.gpkg")
# load GeoTIFF file
GeoIO.load("file.tiff")
# load NetCDF file
GeoIO.load("file.nc")
GeoIO.loadvalues
— FunctionGeoIO.loadvalues(fname; rows=:all, layer=1, numtype=Float64, kwargs...)
Load values
of geospatial table from file fname
stored in any GIS format, skipping the steps to build the domain
(i.e., geometry column).
The function is particularly useful when geometries are missing. In this case, the option rows=:invalid
can be used to retrieve the values of the rows that were dropped by load
. All other kwargs
options documented therein for GIS formats are supported, including the layer
and the numtype
options.
Examples
# load all values of shapefile, ignoring geometries
GeoIO.loadvalues("file.shp")
# load values of shapefile where geometries are missing
GeoIO.loadvalues("file.shp"; rows=:invalid)
GeoIO.save
— FunctionGeoIO.save(fname, geotable; warn=true, kwargs...)
Save geotable
to file fname
of given format based on the file extension.
Please use the formats
function to list all supported file formats.
The function displays a warning whenever the format is obsolete (e.g. Shapefile). The warning can be disabled with the warn
keyword argument.
Other kwargs
options are forwarded to the backend packages and are documented below.
Options
CSV
coords
: names of the columns where the point coordinates will be saved (default to"x"
,"y"
,"z"
);floatformat
: C-style format string for float values (default to no formatting);- Other options are passed to
CSV.write
, see the CSV.jl documentation for more details;
GSLIB
- Other options are passed to
GslibIO.save
, see the GslibIO.jl documentation for more details;
GeoJSON
- Other options are passed to
GeoJSON.write
, see the GeoJSON.jl documentation for more details;
GeoPackage
layername
: name of the layer where the data will be saved (default to"data"
);options
: dictionary with options that will be passed to GDAL;
GeoParquet
- Other options are passed to
GeoParquet.write
, see the GeoParquet.jl documentation for more details;
Shapefile
- Other options are passed to
Shapefile.write
, see the Shapefile.jl documentation for more details;
MSH
vcolumn
: name of the column in vertex table with node data, ifnothing
the geometries will be saved without node data (default tonothing
);ecolumn
: name of the column in element table with element data, ifnothing
the geometries will be saved without element data (default tonothing
);
OFF
color
: name of the column with geometry colors, ifnothing
the geometries will be saved without colors (default tonothing
);
STL
ascii
: defines whether the file will be saved in ASCII format, otherwise Binary format will be used (default tofalse
);
Common Data Model formats (NetCDF, GRIB)
x
: name of the column where the coordinate x will be saved (default to CRS coordinate name);y
: name of the column where the coordinate y will be saved (default to CRS coordinate name);z
: name of the column where the coordinate z will be saved (default to CRS coordinate name);t
: name of the column where the time measurements will be saved (default to"t"
);
Examples
# set layer name in GeoPackage
GeoIO.save("file.gpkg", layername = "mylayer")
GeoIO.formats
— Functionformats([io]; sortby=:extension)
Displays in io
(defaults to stdout
if io
is not given) a table with all formats supported by GeoIO.jl and the packages used to load and save each of them.
Optionally, sort the table by the :extension
, :load
or :save
columns using the sortby
argument.
using GeoIO
zone = GeoIO.load("data/zone.shp")
PERIMETER | ACRES | MACROZONA | Hectares | area_m2 | geometry |
---|---|---|---|---|---|
Continuous | Continuous | Categorical | Continuous | Continuous | MultiPolygon |
[NoUnits] | [NoUnits] | [NoUnits] | [NoUnits] | [NoUnits] | 🖈 GeodeticLatLon{SAD69} |
5.8508e6 | 3.23145e7 | Estuario | 1.30772e7 | 1.30772e11 | Multi(21×PolyArea) |
9.53947e6 | 2.50594e8 | Fronteiras Antigas | 1.01412e8 | 1.01412e12 | Multi(1×PolyArea) |
1.01743e7 | 2.75528e8 | Fronteiras Intermediarias | 1.11502e8 | 1.11502e12 | Multi(1×PolyArea) |
7.09612e6 | 1.61293e8 | Fronteiras Novas | 6.5273e7 | 6.5273e11 | Multi(2×PolyArea) |
Artifacts
The GeoArtifacts.jl package provides utility functions to automatically download geospatial data from repositories on the internet.
using GeoArtifacts
# download artifacts from naturalearthdata.com
earth = NaturalEarth.naturalearth1("water")
borders = NaturalEarth.borders()
airports = NaturalEarth.airports()
ports = NaturalEarth.ports()
# initialize viewer with a coarse "raster"
earth |> Upscale(10, 5) |> viewer
# add other elements to the visualization
viz!(borders.geometry, color = "cyan")
viz!(airports.geometry, color = "black", pointsize=4, pointmarker='✈')
viz!(ports.geometry, color = "blue", pointsize=4, pointmarker='⚓')
# display current figure
Mke.current_figure()
