11  Simple interpolation

A very common task in geospatial data science is geospatial interpolation, i.e., predicting variables on geometries that lie between two or more geometries that have measurements. In this chapter, we will exploit geospatial correlation to make good predictions of continuous variables over an entire domain based on sparse measurements, which are usually stored on a GeometrySet.

The basic idea behind most geostatistical interpolation methods is weighted combination of values from neighboring geometries. Given a geometry \(u\), we want to estimate the value of the variable at this geometry \(z(u)\) using weighted combinations of measurements from neighboring geometries \(u_i,\ i=1,2,\ldots,n\):

\[ z(u) = \lambda_1 z(u_1) + \lambda_2 z(u_2) + \cdots + \lambda_n z(u_n) \]

The methods differ in the way they compute the weights \(\lambda_i,\ i=1,2,\ldots,n\), and we will cover two basic methods from classical literature: IDW and Kriging.

11.1 IDW

In Inverse Distance Weighting (IDW), the weights are computed in terms of distances \(d(u, u_i)\) to the neighboring geometries:

\[ \lambda_i = \frac{1}{{d(u, u_i)}^\beta} \]

This basic idea was proposed by Shepard (1968), who also studied the effect of the exponent \(\beta\) in the interpolation results. Here, we will visualize the results using synthetic data:

data = georef((z=[1.0, 0.0, 1.0],), [(25, 25), (50, 75), (75, 50)])
3×2 GeoTable over 3 PointSet{2,Float64}
z geometry
Continuous Point2
[NoUnits]
1.0 (25.0, 25.0)
0.0 (50.0, 75.0)
1.0 (75.0, 50.0)
viewer(data, pointsize = 10)
┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/ND0gA/src/scenes.jl:220

First, we need to the define the domain of interpolation, i.e., the geometries where we want to estimate the variable z. In this case, we will perform interpolation on a 2D CartesianGrid:

grid = CartesianGrid(100, 100)
100×100 CartesianGrid{2,Float64}
  minimum: Point(0.0, 0.0)
  maximum: Point(100.0, 100.0)
  spacing: (1.0, 1.0)

With the measurements of the variable z in the geotable, and the domain of interpolation, we can use the Interpolate transform with the IDW model:

interp = data |> Interpolate(grid, IDW())
10000×2 GeoTable over 100×100 CartesianGrid{2,Float64}
z geometry
Continuous Quadrangle
[NoUnits]
0.781732 Quadrangle((0.0, 0.0), ..., (0.0, 1.0))
0.783631 Quadrangle((1.0, 0.0), ..., (1.0, 1.0))
0.785528 Quadrangle((2.0, 0.0), ..., (2.0, 1.0))
0.787419 Quadrangle((3.0, 0.0), ..., (3.0, 1.0))
0.789298 Quadrangle((4.0, 0.0), ..., (4.0, 1.0))
0.79116 Quadrangle((5.0, 0.0), ..., (5.0, 1.0))
0.792998 Quadrangle((6.0, 0.0), ..., (6.0, 1.0))
0.794805 Quadrangle((7.0, 0.0), ..., (7.0, 1.0))
0.796573 Quadrangle((8.0, 0.0), ..., (8.0, 1.0))
0.798295 Quadrangle((9.0, 0.0), ..., (9.0, 1.0))
interp |> viewer
┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/ND0gA/src/scenes.jl:220

To visualize the effect of the exponent, let’s extract the interpolation results along the line segment between two of the measurements, and visualize it:

seg = Segment((25, 25), (50, 75))
Segment{2,Float64}
├─ Point(25.0, 25.0)
└─ Point(50.0, 75.0)
z = interp[seg, "z"]
51-element Vector{Float64}:
 0.9878058074478854
 0.9876143292632583
 0.972495221818037
 0.9564199524248658
 0.9361611041027588
 0.9216832603980624
 0.9021987666253538
 0.8888032891965897
 0.870354691640647
 0.8577880405850806
 0.8403212677723804
 0.828402085571967
 0.8118014293422096
 ⋮
 0.409930020727715
 0.3827414007527533
 0.3610487511864091
 0.32990370885751297
 0.3057282754699494
 0.26975994006108955
 0.24282226179926458
 0.20095741787691726
 0.17114300512175212
 0.1219443018616857
 0.09029188985211117
 0.03175705173700492
Mke.lines(z)
┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/ND0gA/src/scenes.jl:220

We observe that the exponent \(\beta=1\) leads to a gradual transition from the value \(z=1\) to the value \(z=0\). Let’s repeat the process with increasing values of the exponent:

fig = Mke.Figure()
Mke.Axis(fig[1,1])
for β in [1,2,3,4,5]
  interp = data |> Interpolate(grid, IDW(β))
  Mke.lines!(interp[seg, "z"], label = "β=$β")
end
Mke.axislegend(position = :lb)
Mke.current_figure()
┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/ND0gA/src/scenes.jl:220

The larger is the exponent, the more abrupt is the transition of values between the two locations. In addition, the IDW solution will converge to the nearest neighbor solution as \(\beta \to \infty\):

data |> Interpolate(grid, IDW(100)) |> viewer
┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/ND0gA/src/scenes.jl:220

Custom distances from Distances.jl may be used in place of the Euclidean distance to meet specific application requirements (e.g. Haversine distance on the sphere):

data |> Interpolate(grid, IDW(1, Chebyshev())) |> viewer
┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/ND0gA/src/scenes.jl:220

11.2 Kriging

In Kriging (Matheron 1971), the weights are computed using geospatial correlation. More specifically, they are the solution to a linear system of equations produced with a theoretical variogram model \(\gamma\):

\[ \begin{bmatrix} \mathbf{G} & \mathbf{1} \\ \mathbf{1}^\top & 0 \end{bmatrix} \begin{bmatrix} \mathbf{\lambda} \\ \nu \end{bmatrix} = \begin{bmatrix} \mathbf{g} \\ 1 \end{bmatrix} \]

where \(\mathbf{G}_{ij} = \gamma(u_i, u_j)\) and \(\mathbf{g}_i = \gamma(u, u_i)\) and \(\nu\) is the Lagrange multiplier associated with the constraint \(\mathbf{1}^\top \mathbf{\lambda} = 1\). The system of equations above is known as Ordinary Kriging, but many other variants are supported by the framework.

Note

The book by Olea (1999) is a good resource to learn the different systems of of equations associated with Kriging interpolation. Names such as Simple Kriging Ordinary Kriging, Universal Kriging are quite popular.

Unlike IDW, the Kriging solution is a function of pairwise evaluations of distances between geometries with measurements, represented in the matrix \(\mathbf{G}\). The pairwise evaluations account for possible redundancy in the measurements, which leads to improvements in the estimates:

γ = GaussianVariogram(range=30.0)

data |> Interpolate(grid, Kriging(γ)) |> viewer
┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/ND0gA/src/scenes.jl:220

In the previous chapter, we learned how the range of the variogram determines the average size of the “blobs” in the image. Let’s illustrate this concept again for increasing values of this parameter:

fig = Mke.Figure()
Mke.Axis(fig[1,1])
for r in [10,20,30,40,50]
  γ = GaussianVariogram(range=r)
  interp = data |> Interpolate(grid, Kriging(γ))
  Mke.lines!(interp[seg, "z"], label = "range=$r")
end
Mke.axislegend(position = :lb)
Mke.current_figure()
┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/ND0gA/src/scenes.jl:220

The larger is the range, the less abrupt is the transition of values between the two locations. Similar visualizations can be produced by varying the sill, the nugget and the model of the variogram.

11.3 Example

In order to solidify the concepts learned so far, let’s look into an example. We will cover all the steps that a geospatial data scientist has to perform to extract geospatial correlation from samples and to use this information in geospatial interpolation.

Let’s consider an image of the Walker Lake by Mariethoz and Caers (2014) as groundtruth. To avoid visualization of large images with CairoMakie.jl, we will consider a subdomain within a Box:

using GeoArtifacts

img = GeoArtifacts.image("WalkerLakeTruth")

img = img[Box((0, 0), (200, 200)), :]
40401×2 GeoTable over 40401 view(::CartesianGrid{2,Float64}, [1, 2, 3, 4, ..., 60198, 60199, 60200, 60201])
Z geometry
Continuous Quadrangle
[NoUnits]
0.127143 Quadrangle((0.0, 0.0), ..., (0.0, 1.0))
0.126681 Quadrangle((1.0, 0.0), ..., (1.0, 1.0))
0.126143 Quadrangle((2.0, 0.0), ..., (2.0, 1.0))
0.125604 Quadrangle((3.0, 0.0), ..., (3.0, 1.0))
0.144938 Quadrangle((4.0, 0.0), ..., (4.0, 1.0))
0.169887 Quadrangle((5.0, 0.0), ..., (5.0, 1.0))
0.219131 Quadrangle((6.0, 0.0), ..., (6.0, 1.0))
0.302491 Quadrangle((7.0, 0.0), ..., (7.0, 1.0))
0.351556 Quadrangle((8.0, 0.0), ..., (8.0, 1.0))
0.365325 Quadrangle((9.0, 0.0), ..., (9.0, 1.0))
img |> viewer
┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/ND0gA/src/scenes.jl:220

Let’s assume that we only have access to 10000 samples from the image:

using Random

samples = img |> Sample(10000, replace=false, rng=MersenneTwister(123))

samples |> viewer
┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/ND0gA/src/scenes.jl:220

Our goal is to interpolate the variable Z over the original domain. Let’s start by estimating the EmpiricalVariogram from the samples. Because the distribution of values in the Walker Lake is skewed, the default :matheron estimator of the variogram shows a high nugget effect:

g = EmpiricalVariogram(samples, "Z", maxlag = 100.0)
EmpiricalVariogram
├─ abscissa: [3.4245, 7.81981, 12.6851, ..., 87.5005, 92.4895, 97.4891]
├─ ordinate: [0.0269364, 0.0400784, 0.0518122, ..., 0.0907346, 0.0893383, 0.0880097]
├─ distance: Euclidean(0.0)
├─ estimator: MatheronEstimator()
└─ npairs: 24150788
Mke.plot(g)
┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/ND0gA/src/scenes.jl:220

A better alternative in this case is to use the robust :cressie estimator:

g = EmpiricalVariogram(samples, "Z", maxlag = 100.0, estimator = :cressie)
EmpiricalVariogram
├─ abscissa: [3.4245, 7.81981, 12.6851, ..., 87.5005, 92.4895, 97.4891]
├─ ordinate: [0.0132249, 0.0269318, 0.0404237, ..., 0.0970837, 0.0961198, 0.095076]
├─ distance: Euclidean(0.0)
├─ estimator: CressieEstimator()
└─ npairs: 24150788
Mke.plot(g)
┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/ND0gA/src/scenes.jl:220

After estimating the empirical variogram, the next step consists of fitting a theoretical model. The behavior near the origin resembles a SphericalVariogram:

γ = GeoStatsFunctions.fit(SphericalVariogram, g)
SphericalVariogram
├─ sill: 0.09816267067372791
├─ nugget: 2.1498590971806487e-11
├─ range: 42.63154340447526
└─ distance: Euclidean
Mke.plot(γ, maxlag = 100.0)
┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/ND0gA/src/scenes.jl:220

Now that we extracted the geospatial correlation from the samples, we can use this information in Kriging interpolation. Instead of fitting all the samples at once like it is done in the Interpolate transform, we will fit the Kriging model with a maximum number of neighbors with the InterpolateNeighbors transform:

interp = samples |> InterpolateNeighbors(img.geometry, Kriging(γ))
40401×2 GeoTable over 40401 view(::CartesianGrid{2,Float64}, [1, 2, 3, 4, ..., 60198, 60199, 60200, 60201])
Z geometry
Continuous Quadrangle
[NoUnits]
0.127143 Quadrangle((0.0, 0.0), ..., (0.0, 1.0))
0.125893 Quadrangle((1.0, 0.0), ..., (1.0, 1.0))
0.126143 Quadrangle((2.0, 0.0), ..., (2.0, 1.0))
0.148675 Quadrangle((3.0, 0.0), ..., (3.0, 1.0))
0.177866 Quadrangle((4.0, 0.0), ..., (4.0, 1.0))
0.21315 Quadrangle((5.0, 0.0), ..., (5.0, 1.0))
0.248825 Quadrangle((6.0, 0.0), ..., (6.0, 1.0))
0.280572 Quadrangle((7.0, 0.0), ..., (7.0, 1.0))
0.314659 Quadrangle((8.0, 0.0), ..., (8.0, 1.0))
0.326309 Quadrangle((9.0, 0.0), ..., (9.0, 1.0))
interp |> viewer
┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/ND0gA/src/scenes.jl:220

Tip for all users

The InterpolateNeighbors is recommended in 3D applications with hundreds of thousands of measurements and very large grids.

Tip for all users

The InterpolateMissing transform can be used to interpolate missing values in a geotable using the same algorithm of InterpolateNeighbors. Likewise, the InterpolateNaN can be used to interpolate NaN values.

11.4 Congratulations!

Congratulations on finishing Part IV of the book. The interpolation models introduced here are simple, yet very useful. Before we start our journey with real-world applications of the framework, let’s review what we learned:

  • Geospatial interpolation can be achieved with the Interpolate and InterpolateNeighbors transforms, and geostatistical models such as IDW and Kriging.
  • Models such as Kriging exploit geospatial correlation to improve interpolation results. We can extract this information from samples using a two-step procedure:
    1. Estimate the EmpiricalVariogram from the available samples
    2. Perform fit of theoretical model with result from previous step
  • Interpolate and InterpolateNeighbors are examples of geostatistical transforms. They can be easily inserted in more advanced pipelines as discussed in Part II.

In the next chapters, we will use the framework that we learned with real data to illustrate how advanced geospatial data science can be done with just a few lines of code. Once a solution is written in terms of the high-level tools covered in previous chapters, it is trivial to improve computational performance in pure Julia.

Note

Feature and performance requests are very welcome. We invite all users of the framework to submit issues and contribute with our open source software stack.