Geospatial data

Overview

Given a table or array containing data, we can georeference these objects onto a geospatial domain with the georef function, which returns a GeoTable. To learn this concept in more depth, check out the recording of our JuliaEO 2024 workshop.

GeoTables.georef (Function)
georef(table, domain)

Georeference table on domain from Meshes.jl.

Examples

julia> georef((a=rand(100), b=rand(100)), CartesianGrid(10, 10))

georef(table, geoms)

Georeference table on vector of geometries geoms from Meshes.jl.

Examples

julia> georef((a=rand(10), b=rand(10)), rand(Point, 10))

georef(table, coords; [crs], [lenunit])

Georeference table using coordinates coords of points.

Optionally, specify the coordinate reference system crs, which is chosen by heuristics when omitted, and the length unit lenunit (meters by default for unitless values), which can only be used with CRS types that allow this flexibility. Any CRS or EPSG/ESRI code from CoordRefSystems.jl is supported.

Examples

julia> georef((a=[1, 2, 3], b=[4, 5, 6]), [(0, 0), (1, 1), (2, 2)])
julia> georef((a=[1, 2, 3], b=[4, 5, 6]), [(0, 0), (1, 1), (2, 2)], crs=LatLon)
julia> georef((a=[1, 2, 3], b=[4, 5, 6]), [(0, 0), (1, 1), (2, 2)], crs=EPSG{4326})
julia> georef((a=[1, 2, 3], b=[4, 5, 6]), [(0, 0), (1, 1), (2, 2)], lenunit=u"cm")

georef(table, names; [crs], [lenunit])

Georeference table using coordinates of points stored in column names.

Optionally, specify the coordinate reference system crs, which is chosen by heuristics when omitted, and the length unit lenunit (meters by default for unitless values), which can only be used with CRS types that allow this flexibility. Any CRS or EPSG/ESRI code from CoordRefSystems.jl is supported.

Examples

julia> georef((a=rand(10), x=rand(10), y=rand(10)), ("x", "y"))
julia> georef((a=rand(10), x=rand(10), y=rand(10)), ("x", "y"), crs=LatLon)
julia> georef((a=rand(10), x=rand(10), y=rand(10)), ("x", "y"), crs=EPSG{4326})
julia> georef((a=rand(10), x=rand(10), y=rand(10)), ("x", "y"), lenunit=u"cm")

georef(tuple)

Georeference a named tuple on CartesianGrid(dims), with dims obtained from the arrays stored in the tuple.

Examples

julia> georef((a=rand(10, 10), b=rand(10, 10))) # 2D grid
julia> georef((a=rand(10, 10, 10), b=rand(10, 10, 10))) # 3D grid

The functions values and domain can be used to retrieve the table of attributes and the underlying geospatial domain:

Base.values (Function)
values(geotable, [rank])

Return the values of geotable for a given rank as a table.

The rank is a non-negative integer that specifies the parametric dimension of the geometries of interest:

  • 0 - points
  • 1 - segments
  • 2 - triangles, quadrangles, ...
  • 3 - tetrahedrons, hexahedrons, ...

If the rank is not specified, it is assumed to be the rank of the elements of the domain.

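As a minimal sketch of how values and domain split a GeoTable back into its two parts, the snippet below georeferences two random columns on a 10x10 grid and inspects the result. It assumes that values hands back the named tuple that was passed to georef, so that the column can be read as tab.a; the variable names gtb, tab and dom are only illustrative.

```julia
using GeoStats

# georeference two attributes on a 10x10 grid (100 elements)
gtb = georef((a=rand(100), b=rand(100)), CartesianGrid(10, 10))

tab = values(gtb)  # table of attributes (columns a and b)
dom = domain(gtb)  # underlying geospatial domain (the grid)

# the domain has one geometry per row of the table
println(nelements(dom))
# assumption: the table is the original named tuple, so tab.a is a vector
println(length(tab.a))
```

Both numbers should agree, since each row of the table is attached to exactly one element of the domain.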

Examples

using GeoStats
import CairoMakie as Mke

# helper function for plotting two
# variables named T and P side by side
function plot(data)
  fig = Mke.Figure(size = (800, 400))
  viz(fig[1,1], data.geometry, color = data.T)
  viz(fig[1,2], data.geometry, color = data.P)
  fig
end
plot (generic function with 1 method)

Tables

Consider a table (e.g. DataFrame) with 25 samples of temperature T and pressure P:

using DataFrames

table = DataFrame(T=rand(25), P=rand(25))
25×2 DataFrame
 Row │ T           P
     │ Float64     Float64
─────┼───────────────────────
   1 │ 0.668413    0.358432
   2 │ 0.239587    0.928157
   3 │ 0.729371    0.848227
   4 │ 0.648393    0.793203
   5 │ 0.223917    0.754546
   6 │ 0.544071    0.610672
   7 │ 0.203008    0.692429
   8 │ 0.66138     0.326016
   9 │ 0.713007    0.858275
  10 │ 0.00184954  0.153215
  11 │ 0.340464    0.432245
  12 │ 0.850487    0.462478
  13 │ 0.251408    0.631657
  14 │ 0.754749    0.536147
  15 │ 0.412829    0.162076
  16 │ 0.614318    0.787116
  17 │ 0.728711    0.464817
  18 │ 0.654947    0.333201
  19 │ 0.862778    0.701644
  20 │ 0.145169    0.415566
  21 │ 0.746839    0.898542
  22 │ 0.726126    0.599648
  23 │ 0.397858    0.328505
  24 │ 0.0813344   0.0770999
  25 │ 0.729895    0.617106

We can georeference this table based on a given set of points:

georef(table, rand(Point, 25)) |> plot

or alternatively, georeference it on a 5x5 regular grid (5x5 = 25 samples):

georef(table, CartesianGrid(5, 5)) |> plot

Another common pattern in geospatial data is when the coordinates of the samples are already part of the table as columns. In this case, we can specify the column names:

table = DataFrame(T=rand(25), P=rand(25), X=rand(25), Y=rand(25), Z=rand(25))

georef(table, ("X", "Y", "Z")) |> plot
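To make the effect of the column names explicit, here is a small sketch with a plain named tuple instead of a DataFrame. It assumes 2D coordinates in hypothetical columns X and Y, and checks that the coordinates end up as Point geometries while the attribute column T is left untouched.

```julia
using GeoStats

# three samples with an attribute T and coordinates in columns X and Y
tab = (T=[1.0, 2.0, 3.0], X=[0.0, 1.0, 2.0], Y=[0.0, 1.0, 2.0])

gtb = georef(tab, ("X", "Y"))

# the coordinate columns are turned into Point geometries
println(gtb.geometry[1] isa Point)
# the attribute column is still available by name
println(gtb.T)
```

The crs and lenunit keywords described in the docstring above can be added to the same call when the default heuristics are not appropriate.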

Arrays

Consider arrays (e.g. images) with data for various geospatial variables. We can georeference these arrays using a named tuple, and the framework will understand that the shape of the arrays should be preserved in a CartesianGrid:

T, P = rand(5, 5), rand(5, 5)

georef((T=T, P=P)) |> plot
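The following sketch illustrates what "preserving the shape" means in practice. It assumes that size applied to the grid returns its dimensions and that each array is stored as a flattened column in the usual column-major order; both assumptions should be checked against the installed version.

```julia
using GeoStats

T, P = rand(5, 5), rand(5, 5)

gtb = georef((T=T, P=P))

# assumption: size of the grid matches the shape of the input arrays
println(size(domain(gtb)))
# assumption: each column is the flattened (column-major) array
println(gtb.T == vec(T))

# recover the original array shape from the column
Tarr = reshape(gtb.T, size(domain(gtb)))
```

Reshaping the column with the grid size gives back an array with the same layout as the input, which is convenient when the data has to be handed to image-processing code.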

Alternatively, we can interpret the entries of the named tuple as columns in a table:

georef((T=vec(T), P=vec(P)), rand(Point, 25)) |> plot

Files

The GeoIO.jl package can be used to load/save geospatial data from/to various file formats:

GeoIO.load (Function)
load(fname; repair=true, layer=0, lenunit=nothing, kwargs...)

Load geospatial table from file fname stored in any format.

Various repairs are performed on the stored geometries by default, including fixes of orientation in rings of polygons, removal of zero-area triangles, etc.

Some of the repairs can be expensive on large data sets. In that case, we recommend setting repair=false. Custom repairs can be performed with the Repair transform from Meshes.jl.

Optionally, specify the layer to read within the file, and the length unit lenunit of the coordinates when the format does not include units in its specification. Other kwargs are forwarded to the backend packages.

Please use the formats function to list all supported file formats.

Options

OFF

  • defaultcolor: default color of the geometries if the file does not have this data (default to RGBA(0.666, 0.666, 0.666, 0.666));

CSV

  • coords: names of the columns with point coordinates (required option);
  • Other options are passed to CSV.File, see the CSV.jl documentation for more details;

VTK formats (.vtu, .vtp, .vtr, .vts, .vti)

  • mask: name of the boolean column that encodes the indices of a grid view (default to :MASK). If the column does not exist in the file, the full grid is returned;

Common Data Model formats (NetCDF, GRIB)

  • x: name of the column with x coordinates (default to "x", "X", "lon", or "longitude");
  • y: name of the column with y coordinates (default to "y", "Y", "lat", or "latitude");
  • z: name of the column with z coordinates (default to "z", "Z", "depth", or "height");
  • t: name of the column with time measurements (default to "t", "time", or "TIME");

GeoJSON

  • numbertype: number type of geometry coordinates (default to Float64);
  • Other options are passed to GeoJSON.read, see the GeoJSON.jl documentation for more details;

GSLIB

  • Other options are passed to GslibIO.load, see the GslibIO.jl documentation for more details;

Shapefile

  • Other options are passed to Shapefile.read, see the Shapefile.jl documentation for more details;

GeoParquet

  • Other options are passed to GeoParquet.read, see the GeoParquet.jl documentation for more details;

GeoTIFF, GeoPackage, KML

  • Other options are passed to ArchGDAL.read, see the ArchGDAL.jl documentation for more details;

Examples

# load coordinates of geojson file as Float64 (default)
GeoIO.load("file.geojson")
# load coordinates of geojson file as Float32
GeoIO.load("file.geojson", numbertype=Float32)

GeoIO.save (Function)
save(fname, geotable; kwargs...)

Save geotable to file fname of given format based on the file extension.

Other kwargs are forwarded to the backend packages.

Please use the formats function to list all supported file formats.

Options

OFF

  • color: name of the column with geometry colors, if nothing the geometries will be saved without colors (default to nothing);

MSH

  • vcolumn: name of the column in vertex table with node data, if nothing the geometries will be saved without node data (default to nothing);
  • ecolumn: name of the column in element table with element data, if nothing the geometries will be saved without element data (default to nothing);

STL

  • ascii: defines whether the file will be saved in ASCII format, otherwise Binary format will be used (default to false);

CSV

  • coords: names of the columns where the point coordinates will be saved (default to "x", "y", "z");
  • floatformat: C-style format string for float values (default to no formatting);
  • Other options are passed to CSV.write, see the CSV.jl documentation for more details;

NetCDF

  • x: name of the column where the coordinate x will be saved (default to CRS coordinate name);
  • y: name of the column where the coordinate y will be saved (default to CRS coordinate name);
  • z: name of the column where the coordinate z will be saved (default to CRS coordinate name);
  • t: name of the column where the time measurements will be saved (default to "t");

GeoTIFF

  • options: list with options that will be passed to GDAL;

GeoPackage

  • layername: name of the layer where the data will be saved (default to "data");
  • options: dictionary with options that will be passed to GDAL;

GSLIB

  • Other options are passed to GslibIO.save, see the GslibIO.jl documentation for more details;

Shapefile

  • Other options are passed to Shapefile.write, see the Shapefile.jl documentation for more details;

GeoJSON

  • Other options are passed to GeoJSON.write, see the GeoJSON.jl documentation for more details;

GeoParquet

  • Other options are passed to GeoParquet.write, see the GeoParquet.jl documentation for more details;

Examples

# overwrite an existing shapefile
GeoIO.save("file.shp", geotable, force = true)

GeoIO.formats (Function)
formats([io]; sortby=:format)

Displays in io (defaults to stdout if io is not given) a table with all formats supported by GeoIO.jl and the packages used to load and save each of them.

Optionally, sort the table by the :extension, :load or :save columns using the sortby argument.

using GeoIO

zone = GeoIO.load("data/zone.shp")
4×6 GeoTable over 4 GeometrySet
 PERIMETER  │ ACRES      │ MACROZONA                 │ Hectares   │ area_m2    │ geometry
 Continuous │ Continuous │ Categorical               │ Continuous │ Continuous │ MultiPolygon
 [NoUnits]  │ [NoUnits]  │ [NoUnits]                 │ [NoUnits]  │ [NoUnits]  │ 🖈 GeodeticLatLon{SAD69}
────────────┼────────────┼───────────────────────────┼────────────┼────────────┼──────────────────────────
 5.8508e6   │ 3.23145e7  │ Estuario                  │ 1.30772e7  │ 1.30772e11 │ Multi(21×PolyArea)
 9.53947e6  │ 2.50594e8  │ Fronteiras Antigas        │ 1.01412e8  │ 1.01412e12 │ Multi(1×PolyArea)
 1.01743e7  │ 2.75528e8  │ Fronteiras Intermediarias │ 1.11502e8  │ 1.11502e12 │ Multi(1×PolyArea)
 7.09612e6  │ 1.61293e8  │ Fronteiras Novas          │ 6.5273e7   │ 6.5273e11  │ Multi(2×PolyArea)
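
The load and save functions documented above can be combined into a round trip. The sketch below writes a small geotable of points to a temporary CSV file and reads it back, using the coords option listed for the CSV format in both directions. The file name is a hypothetical temporary path, and passing the column names as a vector of strings is an assumption about the accepted types.

```julia
using GeoStats
import GeoIO

# three samples located at 2D point coordinates
gtb = georef((T=[1.0, 2.0, 3.0],), [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)])

# hypothetical temporary file; the ".csv" extension selects the format
fname = tempname() * ".csv"

# save the point coordinates in columns "x" and "y"
GeoIO.save(fname, gtb, coords=["x", "y"])

# coords is a required option when loading CSV files
back = GeoIO.load(fname, coords=["x", "y"])

# the attribute column should survive the round trip
println(back.T == gtb.T)
```

CSV is used here only because it needs no extra metadata; for other formats the relevant options are the ones listed per format in the load and save docstrings.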

Artifacts

The GeoArtifacts.jl package provides utility functions to automatically download geospatial data from repositories on the internet.

using GeoArtifacts

# download artifacts from naturalearthdata.com
earth    = NaturalEarth.naturalearth1("water")
borders  = NaturalEarth.borders()
airports = NaturalEarth.airports()
ports    = NaturalEarth.ports()

# initialize viewer with a coarse "raster"
earth |> Upscale(10, 5) |> viewer

# add other elements to the visualization
viz!(borders.geometry, color = "cyan")
viz!(airports.geometry, color = "black", pointsize=4, pointmarker='✈')
viz!(ports.geometry, color = "blue", pointsize=4, pointmarker='⚓')

# display current figure
Mke.current_figure()