8  Split-apply-combine

In Part II of the book, we introduced transform pipelines to process the values and the domain of geotables using high-level abstractions that preserve geospatial information. In this chapter, we start to introduce other tools to query geotables after they have been pre-processed by pipelines.

8.1 Motivation

In geospatial data science, geoscientific questions are often posed in terms of both the values and the domain of a geotable. For example:

  1. Where are the areas with high probability of landslide?
  2. What is the average rainfall per watershed over the last year?
  3. How much lithium will be mined from each geological unit?
  4. What is the variation of log-permeability per depositional facies?

The word “where” is often present in these questions to indicate that answers must be georeferenced. If the variable of interest is already present in the geotable, then we can effectively answer “where” questions using the viewer and the Filter transform from previous chapters:

geotable |> Filter(row -> row.probability > 0.9) |> viewer

If the variable of interest is not present in the geotable, or if the “where” word is not present in the original question, then there will be some reference to “geospatial units” on which geostatistics must be computed. These questions can be answered with a geospatial version of the split-apply-combine strategy (Wickham 2011) from data science. Our framework provides the @groupby, @transform and @combine macros to split-apply-combine geotables.

8.2 Bonnie data set

We will use the Bonnie data set to illustrate our geospatial split-apply-combine:

The Bonnie Project Example is under copyright of Transmin Metallurgical Consultants, 2019. It is issued under the Creative Commons Attribution-ShareAlike 4.0 International Public License.

using GeoIO

gt = GeoIO.load("data/bonnie.csv", coords = ("EAST", "NORTH", "RL"))
19618×9 GeoTable over 19618 PointSet
Auppm Agppm Cuppm Asppm Sper CODE OX ISBD geometry
Continuous Continuous Continuous Continuous Continuous Categorical Categorical Continuous Point
[NoUnits] [NoUnits] [NoUnits] [NoUnits] [NoUnits] [NoUnits] [NoUnits] [NoUnits] 🖈 Cartesian{NoDatum}
0.407 2.335 1311.0 27.0 0.01 C1 TR1 2.42 (x: 243940.0 m, y: 1.00223e6 m, z: 1855.0 m)
0.407 2.335 1311.0 27.0 0.01 C1 TR1 1.98 (x: 243945.0 m, y: 1.00223e6 m, z: 1855.0 m)
0.407 2.335 1311.0 27.0 0.01 C1 TR1 2.068 (x: 243950.0 m, y: 1.00226e6 m, z: 1855.0 m)
0.385 2.1 572.0 0.0 0.0 C1 TR1 2.222 (x: 243965.0 m, y: 1.00226e6 m, z: 1855.0 m)
0.385 2.1 572.0 0.0 0.0 C1 TR1 2.156 (x: 243970.0 m, y: 1.00226e6 m, z: 1855.0 m)
0.385 2.1 572.0 0.0 0.0 C1 OX1 2.08 (x: 243960.0 m, y: 1.00227e6 m, z: 1855.0 m)
0.385 2.1 572.0 0.0 0.0 C1 OX1 2.06 (x: 243965.0 m, y: 1.00227e6 m, z: 1855.0 m)
0.385 2.1 572.0 0.0 0.0 C1 OX1 1.9 (x: 243970.0 m, y: 1.00227e6 m, z: 1855.0 m)
0.42152 1.422 852.36 22.85 2.4078 C1 TR1 2.024 (x: 243930.0 m, y: 1.00222e6 m, z: 1860.0 m)
0.41747 1.429 832.36 21.87 1.9388 C1 TR1 2.112 (x: 243935.0 m, y: 1.00222e6 m, z: 1860.0 m)

It represents a 3D mineral deposit with grades in parts per million (ppm), sulfur contaminant in percent, and other categorical variables with geological and lithological units:

names(gt)
9-element Vector{String}:
 "Auppm"
 "Agppm"
 "Cuppm"
 "Asppm"
 "Sper"
 "CODE"
 "OX"
 "ISBD"
 "geometry"

The “EAST”, “NORTH” and “RL” coordinates used to georef the CSV file represent the centroids of the mining blocks. The sides of these blocks are provided as metadata (e.g., 5x5x5):

gt |> viewer
┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/We6MY/src/scenes.jl:227

Let’s clean the geotable using what we learned in previous chapters. We will reject the column with sulfur, will rename the variables for greater readability, and will drop missing values. We will also add units to some of the variables using bracket notation:

clean = Reject("Sper") 
        Rename("Agppm" => "Ag [ppm]",
               "Auppm" => "Au [ppm]",
               "Asppm" => "As [ppm]",
               "Cuppm" => "Cu [ppm]",
               "ISBD" => "ρ [Mg/m^3]",
               "CODE" => "geo",
               "OX" => "litho") 
        DropMissing() 
        Unitify()

gt = gt |> clean
19548×8 GeoTable over 19548 view(::PointSet, [1, 2, 3, 4, ..., 19615, 19616, 19617, 19618])
Au Ag Cu As geo litho ρ geometry
Continuous Continuous Continuous Continuous Categorical Categorical Continuous Point
[ppm] [ppm] [ppm] [ppm] [NoUnits] [NoUnits] [Mg m^-3] 🖈 Cartesian{NoDatum}
0.407 ppm 2.335 ppm 1311.0 ppm 27.0 ppm C1 TR1 2.42 Mg m^-3 (x: 243940.0 m, y: 1.00223e6 m, z: 1855.0 m)
0.407 ppm 2.335 ppm 1311.0 ppm 27.0 ppm C1 TR1 1.98 Mg m^-3 (x: 243945.0 m, y: 1.00223e6 m, z: 1855.0 m)
0.407 ppm 2.335 ppm 1311.0 ppm 27.0 ppm C1 TR1 2.068 Mg m^-3 (x: 243950.0 m, y: 1.00226e6 m, z: 1855.0 m)
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 TR1 2.222 Mg m^-3 (x: 243965.0 m, y: 1.00226e6 m, z: 1855.0 m)
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 TR1 2.156 Mg m^-3 (x: 243970.0 m, y: 1.00226e6 m, z: 1855.0 m)
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 OX1 2.08 Mg m^-3 (x: 243960.0 m, y: 1.00227e6 m, z: 1855.0 m)
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 OX1 2.06 Mg m^-3 (x: 243965.0 m, y: 1.00227e6 m, z: 1855.0 m)
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 OX1 1.9 Mg m^-3 (x: 243970.0 m, y: 1.00227e6 m, z: 1855.0 m)
0.42152 ppm 1.422 ppm 852.36 ppm 22.85 ppm C1 TR1 2.024 Mg m^-3 (x: 243930.0 m, y: 1.00222e6 m, z: 1860.0 m)
0.41747 ppm 1.429 ppm 832.36 ppm 21.87 ppm C1 TR1 2.112 Mg m^-3 (x: 243935.0 m, y: 1.00222e6 m, z: 1860.0 m)
Note

The Unitify transform recognizes unit bracket notation in column names. It adds the units specified within brackets to the values of the columns, and removes the brackets from the column names.

That is a lot better! Let’s assume that we want to answer the following business question:

What is the total mass of gold that will be mined from each geological unit?

8.3 Splitting geotables

We can split geotables into lazy geospatial partitions based on values stored in a column. In this case, we want to split the mineral deposit in geological units stored in the geo column:

groups = @groupby(gt, "geo")
2 Partition
├─ 17153×8 SubGeoTable over 17153 view(::PointSet, [1, 2, 3, 4, ..., 17203, 17204, 17205, 17206])
└─ 2395×8 SubGeoTable over 2395 view(::PointSet, [17207, 17208, 17209, 17210, ..., 19615, 19616, 19617, 19618])
metadata: rows, names

There are two geological units in this deposit, represented as SubGeoTable. We can access these units by indexing into the geospatial partition:

groups[1]
17153×8 SubGeoTable over 17153 view(::PointSet, [1, 2, 3, 4, ..., 17203, 17204, 17205, 17206])
Au Ag Cu As geo litho ρ geometry
Continuous Continuous Continuous Continuous Categorical Categorical Continuous Point
[ppm] [ppm] [ppm] [ppm] [NoUnits] [NoUnits] [Mg m^-3] 🖈 Cartesian{NoDatum}
0.407 ppm 2.335 ppm 1311.0 ppm 27.0 ppm C1 TR1 2.42 Mg m^-3 (x: 243940.0 m, y: 1.00223e6 m, z: 1855.0 m)
0.407 ppm 2.335 ppm 1311.0 ppm 27.0 ppm C1 TR1 1.98 Mg m^-3 (x: 243945.0 m, y: 1.00223e6 m, z: 1855.0 m)
0.407 ppm 2.335 ppm 1311.0 ppm 27.0 ppm C1 TR1 2.068 Mg m^-3 (x: 243950.0 m, y: 1.00226e6 m, z: 1855.0 m)
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 TR1 2.222 Mg m^-3 (x: 243965.0 m, y: 1.00226e6 m, z: 1855.0 m)
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 TR1 2.156 Mg m^-3 (x: 243970.0 m, y: 1.00226e6 m, z: 1855.0 m)
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 OX1 2.08 Mg m^-3 (x: 243960.0 m, y: 1.00227e6 m, z: 1855.0 m)
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 OX1 2.06 Mg m^-3 (x: 243965.0 m, y: 1.00227e6 m, z: 1855.0 m)
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 OX1 1.9 Mg m^-3 (x: 243970.0 m, y: 1.00227e6 m, z: 1855.0 m)
0.42152 ppm 1.422 ppm 852.36 ppm 22.85 ppm C1 TR1 2.024 Mg m^-3 (x: 243930.0 m, y: 1.00222e6 m, z: 1860.0 m)
0.41747 ppm 1.429 ppm 832.36 ppm 21.87 ppm C1 TR1 2.112 Mg m^-3 (x: 243935.0 m, y: 1.00222e6 m, z: 1860.0 m)
viz(groups[1].geometry, color = "teal")
viz!(groups[2].geometry, color = "slategray3")
Mke.current_figure()
┌ Warning: Found `resolution` in the theme when creating a `Scene`. The `resolution` keyword for `Scene`s and `Figure`s has been deprecated. Use `Figure(; size = ...` or `Scene(; size = ...)` instead, which better reflects that this is a unitless size and not a pixel resolution. The key could also come from `set_theme!` calls or related theming functions.
└ @ Makie ~/.julia/packages/Makie/We6MY/src/scenes.jl:227

8.4 Applying expressions

The mass of gold in a mining block is a function of the gold grade (Au), the rock density (ρ), and the volume of the block:

\[ m = Au \times \rho \times V \]

Let’s use an auxiliary function to convert the Points in the geometry column into Boxes of sides 5x5x5. The function takes a centroid point as input and produces a box centered at the point with corners that are 2.5x2.5x2.5 units away in both directions:

box(point) = Box(point - Vec(2.5, 2.5, 2.5), point + Vec(2.5, 2.5, 2.5))
box (generic function with 1 method)
box.(gt.geometry)
19548-element Vector{Box{𝔼{3}, Cartesian3D{NoDatum, Quantity{Float64, 𝐋, Unitful.FreeUnits{(m,), 𝐋, nothing}}}}}:
 Box(min: (x: 2.43938e5 m, y: 1.00223e6 m, z: 1852.5 m), max: (x: 2.43942e5 m, y: 1.00223e6 m, z: 1857.5 m))
 Box(min: (x: 2.43942e5 m, y: 1.00223e6 m, z: 1852.5 m), max: (x: 2.43948e5 m, y: 1.00223e6 m, z: 1857.5 m))
 Box(min: (x: 2.43948e5 m, y: 1.00226e6 m, z: 1852.5 m), max: (x: 2.43952e5 m, y: 1.00226e6 m, z: 1857.5 m))
 Box(min: (x: 2.43962e5 m, y: 1.00226e6 m, z: 1852.5 m), max: (x: 2.43968e5 m, y: 1.00226e6 m, z: 1857.5 m))
 Box(min: (x: 2.43968e5 m, y: 1.00226e6 m, z: 1852.5 m), max: (x: 2.43972e5 m, y: 1.00226e6 m, z: 1857.5 m))
 Box(min: (x: 2.43958e5 m, y: 1.00227e6 m, z: 1852.5 m), max: (x: 2.43962e5 m, y: 1.00227e6 m, z: 1857.5 m))
 Box(min: (x: 2.43962e5 m, y: 1.00227e6 m, z: 1852.5 m), max: (x: 2.43968e5 m, y: 1.00227e6 m, z: 1857.5 m))
 Box(min: (x: 2.43968e5 m, y: 1.00227e6 m, z: 1852.5 m), max: (x: 2.43972e5 m, y: 1.00227e6 m, z: 1857.5 m))
 Box(min: (x: 2.43928e5 m, y: 1.00222e6 m, z: 1857.5 m), max: (x: 2.43932e5 m, y: 1.00222e6 m, z: 1862.5 m))
 Box(min: (x: 2.43932e5 m, y: 1.00222e6 m, z: 1857.5 m), max: (x: 2.43938e5 m, y: 1.00222e6 m, z: 1862.5 m))
 Box(min: (x: 2.43938e5 m, y: 1.00222e6 m, z: 1857.5 m), max: (x: 2.43942e5 m, y: 1.00222e6 m, z: 1862.5 m))
 Box(min: (x: 2.43942e5 m, y: 1.00222e6 m, z: 1857.5 m), max: (x: 2.43948e5 m, y: 1.00222e6 m, z: 1862.5 m))
 Box(min: (x: 2.43912e5 m, y: 1.00223e6 m, z: 1857.5 m), max: (x: 2.43918e5 m, y: 1.00223e6 m, z: 1862.5 m))
 ⋮
 Box(min: (x: 2.43782e5 m, y: 1.0022e6 m, z: 2027.5 m), max: (x: 2.43788e5 m, y: 1.0022e6 m, z: 2032.5 m))
 Box(min: (x: 2.43788e5 m, y: 1.0022e6 m, z: 2027.5 m), max: (x: 2.43792e5 m, y: 1.0022e6 m, z: 2032.5 m))
 Box(min: (x: 2.43792e5 m, y: 1.0022e6 m, z: 2027.5 m), max: (x: 2.43798e5 m, y: 1.0022e6 m, z: 2032.5 m))
 Box(min: (x: 2.43752e5 m, y: 1.00221e6 m, z: 2027.5 m), max: (x: 2.43758e5 m, y: 1.00221e6 m, z: 2032.5 m))
 Box(min: (x: 2.43758e5 m, y: 1.00221e6 m, z: 2027.5 m), max: (x: 2.43762e5 m, y: 1.00221e6 m, z: 2032.5 m))
 Box(min: (x: 2.43762e5 m, y: 1.00221e6 m, z: 2027.5 m), max: (x: 2.43768e5 m, y: 1.00221e6 m, z: 2032.5 m))
 Box(min: (x: 2.43768e5 m, y: 1.00221e6 m, z: 2027.5 m), max: (x: 2.43772e5 m, y: 1.00221e6 m, z: 2032.5 m))
 Box(min: (x: 2.43772e5 m, y: 1.00221e6 m, z: 2027.5 m), max: (x: 2.43778e5 m, y: 1.00221e6 m, z: 2032.5 m))
 Box(min: (x: 2.43778e5 m, y: 1.00221e6 m, z: 2027.5 m), max: (x: 2.43782e5 m, y: 1.00221e6 m, z: 2032.5 m))
 Box(min: (x: 2.43782e5 m, y: 1.00221e6 m, z: 2027.5 m), max: (x: 2.43788e5 m, y: 1.00221e6 m, z: 2032.5 m))
 Box(min: (x: 2.43788e5 m, y: 1.00221e6 m, z: 2027.5 m), max: (x: 2.43792e5 m, y: 1.00221e6 m, z: 2032.5 m))
 Box(min: (x: 2.43792e5 m, y: 1.00221e6 m, z: 2027.5 m), max: (x: 2.43798e5 m, y: 1.00221e6 m, z: 2032.5 m))

The @transform macro modifies or creates new columns in the geotable based on expressions with existing column names. In this case, we want to replace the geometry column by calling the auxiliary function above:

gt = @transform(gt, :geometry = box(:geometry))
19548×8 GeoTable over 19548 GeometrySet
Au Ag Cu As geo litho ρ geometry
Continuous Continuous Continuous Continuous Categorical Categorical Continuous Box
[ppm] [ppm] [ppm] [ppm] [NoUnits] [NoUnits] [Mg m^-3] 🖈 Cartesian{NoDatum}
0.407 ppm 2.335 ppm 1311.0 ppm 27.0 ppm C1 TR1 2.42 Mg m^-3 Box(min: (x: 2.43938e5 m, y: 1.00223e6 m, z: 1852.5 m), max: (x: 2.43942e5 m, y: 1.00223e6 m, z: 1857.5 m))
0.407 ppm 2.335 ppm 1311.0 ppm 27.0 ppm C1 TR1 1.98 Mg m^-3 Box(min: (x: 2.43942e5 m, y: 1.00223e6 m, z: 1852.5 m), max: (x: 2.43948e5 m, y: 1.00223e6 m, z: 1857.5 m))
0.407 ppm 2.335 ppm 1311.0 ppm 27.0 ppm C1 TR1 2.068 Mg m^-3 Box(min: (x: 2.43948e5 m, y: 1.00226e6 m, z: 1852.5 m), max: (x: 2.43952e5 m, y: 1.00226e6 m, z: 1857.5 m))
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 TR1 2.222 Mg m^-3 Box(min: (x: 2.43962e5 m, y: 1.00226e6 m, z: 1852.5 m), max: (x: 2.43968e5 m, y: 1.00226e6 m, z: 1857.5 m))
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 TR1 2.156 Mg m^-3 Box(min: (x: 2.43968e5 m, y: 1.00226e6 m, z: 1852.5 m), max: (x: 2.43972e5 m, y: 1.00226e6 m, z: 1857.5 m))
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 OX1 2.08 Mg m^-3 Box(min: (x: 2.43958e5 m, y: 1.00227e6 m, z: 1852.5 m), max: (x: 2.43962e5 m, y: 1.00227e6 m, z: 1857.5 m))
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 OX1 2.06 Mg m^-3 Box(min: (x: 2.43962e5 m, y: 1.00227e6 m, z: 1852.5 m), max: (x: 2.43968e5 m, y: 1.00227e6 m, z: 1857.5 m))
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 OX1 1.9 Mg m^-3 Box(min: (x: 2.43968e5 m, y: 1.00227e6 m, z: 1852.5 m), max: (x: 2.43972e5 m, y: 1.00227e6 m, z: 1857.5 m))
0.42152 ppm 1.422 ppm 852.36 ppm 22.85 ppm C1 TR1 2.024 Mg m^-3 Box(min: (x: 2.43928e5 m, y: 1.00222e6 m, z: 1857.5 m), max: (x: 2.43932e5 m, y: 1.00222e6 m, z: 1862.5 m))
0.41747 ppm 1.429 ppm 832.36 ppm 21.87 ppm C1 TR1 2.112 Mg m^-3 Box(min: (x: 2.43932e5 m, y: 1.00222e6 m, z: 1857.5 m), max: (x: 2.43938e5 m, y: 1.00222e6 m, z: 1862.5 m))

The macro will broadcast the expression to all rows of the geotable.

Note

We used the :geometry symbol to refer to the geometry column instead of the usual "geometry" string. The @transform macro understands that strings can also appear as valid values in the right-hand-side of the expression, which are not columns in the geotable. To mark a string as a column name, we need to use curly braces:

@transform(gt, {"geometry"} = box({"geometry"}))
Tip for all users

The use of symbols to represent column names is preferred in macros.

The mass of gold on each mining block can be computed now that the geometries have volume:

@transform(gt, :m = :Au * :ρ * volume(:geometry))
19548×9 GeoTable over 19548 GeometrySet
Au Ag Cu As geo litho ρ m geometry
Continuous Continuous Continuous Continuous Categorical Categorical Continuous Continuous Box
[ppm] [ppm] [ppm] [ppm] [NoUnits] [NoUnits] [Mg m^-3] [Mg ppm] 🖈 Cartesian{NoDatum}
0.407 ppm 2.335 ppm 1311.0 ppm 27.0 ppm C1 TR1 2.42 Mg m^-3 123.117 Mg ppm Box(min: (x: 2.43938e5 m, y: 1.00223e6 m, z: 1852.5 m), max: (x: 2.43942e5 m, y: 1.00223e6 m, z: 1857.5 m))
0.407 ppm 2.335 ppm 1311.0 ppm 27.0 ppm C1 TR1 1.98 Mg m^-3 100.732 Mg ppm Box(min: (x: 2.43942e5 m, y: 1.00223e6 m, z: 1852.5 m), max: (x: 2.43948e5 m, y: 1.00223e6 m, z: 1857.5 m))
0.407 ppm 2.335 ppm 1311.0 ppm 27.0 ppm C1 TR1 2.068 Mg m^-3 105.209 Mg ppm Box(min: (x: 2.43948e5 m, y: 1.00226e6 m, z: 1852.5 m), max: (x: 2.43952e5 m, y: 1.00226e6 m, z: 1857.5 m))
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 TR1 2.222 Mg m^-3 106.934 Mg ppm Box(min: (x: 2.43962e5 m, y: 1.00226e6 m, z: 1852.5 m), max: (x: 2.43968e5 m, y: 1.00226e6 m, z: 1857.5 m))
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 TR1 2.156 Mg m^-3 103.758 Mg ppm Box(min: (x: 2.43968e5 m, y: 1.00226e6 m, z: 1852.5 m), max: (x: 2.43972e5 m, y: 1.00226e6 m, z: 1857.5 m))
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 OX1 2.08 Mg m^-3 100.1 Mg ppm Box(min: (x: 2.43958e5 m, y: 1.00227e6 m, z: 1852.5 m), max: (x: 2.43962e5 m, y: 1.00227e6 m, z: 1857.5 m))
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 OX1 2.06 Mg m^-3 99.1375 Mg ppm Box(min: (x: 2.43962e5 m, y: 1.00227e6 m, z: 1852.5 m), max: (x: 2.43968e5 m, y: 1.00227e6 m, z: 1857.5 m))
0.385 ppm 2.1 ppm 572.0 ppm 0.0 ppm C1 OX1 1.9 Mg m^-3 91.4375 Mg ppm Box(min: (x: 2.43968e5 m, y: 1.00227e6 m, z: 1852.5 m), max: (x: 2.43972e5 m, y: 1.00227e6 m, z: 1857.5 m))
0.42152 ppm 1.422 ppm 852.36 ppm 22.85 ppm C1 TR1 2.024 Mg m^-3 106.645 Mg ppm Box(min: (x: 2.43928e5 m, y: 1.00222e6 m, z: 1857.5 m), max: (x: 2.43932e5 m, y: 1.00222e6 m, z: 1862.5 m))
0.41747 ppm 1.429 ppm 832.36 ppm 21.87 ppm C1 TR1 2.112 Mg m^-3 110.212 Mg ppm Box(min: (x: 2.43932e5 m, y: 1.00222e6 m, z: 1857.5 m), max: (x: 2.43938e5 m, y: 1.00222e6 m, z: 1862.5 m))
Note

The @transform macro can be used with both geotables and geospatial partitions.

8.5 Combining results

We can use the @combine macro to reduce columns of geotables in a geospatial partition obtained with the @groupby macro. The macro is similar to the @transform macro, but expects valid reduction functions such as sum, mean, std. Reduction functions take a vector of values as input and produce a single scalar as output:

groups = @groupby(gt, :geo)

@combine(groups, :μ = mean(:Au), :σ = std(:Au))
2×4 GeoTable over 2 GeometrySet
geo μ σ geometry
Categorical Continuous Continuous Multi
[NoUnits] [ppm] [ppm] 🖈 Cartesian{NoDatum}
C1 0.581937 ppm 0.136264 ppm Multi(17153×Box)
C2 0.484622 ppm 0.095799 ppm Multi(2395×Box)

Note that the macro reduces the geometry column and produces a new complex Multi geometry with all the Boxes that are inside each geological unit. This is a very advanced feature of the framework that cannot be represented with the simple features standard. It is also possible to use a custom reduction function for the geometry column:

groups = @groupby(gt, :geo)

@combine(groups, :μ = mean(:Au), :σ = std(:Au), :geometry = first(:geometry))
2×4 GeoTable over 2 GeometrySet
geo μ σ geometry
Categorical Continuous Continuous Box
[NoUnits] [ppm] [ppm] 🖈 Cartesian{NoDatum}
C1 0.581937 ppm 0.136264 ppm Box(min: (x: 2.43938e5 m, y: 1.00223e6 m, z: 1852.5 m), max: (x: 2.43942e5 m, y: 1.00223e6 m, z: 1857.5 m))
C2 0.484622 ppm 0.095799 ppm Box(min: (x: 2.43792e5 m, y: 1.00214e6 m, z: 1872.5 m), max: (x: 2.43798e5 m, y: 1.00214e6 m, z: 1877.5 m))

8.6 Answering questions

Let’s recall our original business question:

What is the total mass of gold that will be mined from each geological unit?

We can now answer this question with three lines of code:

groups = @groupby(gt, :geo)

mass = @transform(groups, :m = :Au * :ρ * volume(:geometry))

answer = @combine(mass, :m = sum(:m))
2×3 GeoTable over 2 GeometrySet
geo m geometry
Categorical Continuous Multi
[NoUnits] [Mg ppm] 🖈 Cartesian{NoDatum}
C1 2.53302e6 Mg ppm Multi(17153×Box)
C2 3.29241e5 Mg ppm Multi(2395×Box)
Tip for all users

We can simplify the code further using the @chain macro. It forwards the resulting geotable or geospatial partition to the next macro:

@chain gt begin
  @groupby(:geo)
  @transform(:m = :Au * :ρ * volume(:geometry))
  @combine(:m = sum(:m))
end
2×3 GeoTable over 2 GeometrySet
geo m geometry
Categorical Continuous Multi
[NoUnits] [Mg ppm] 🖈 Cartesian{NoDatum}
C1 2.53302e6 Mg ppm Multi(17153×Box)
C2 3.29241e5 Mg ppm Multi(2395×Box)