Preface

Who this book is for

Anyone interested in geospatial data science will benefit from reading this book. If you are a student with basic-to-intermediate programming experience, you will learn a valuable set of tools for your career. If you are an experienced data scientist, you may still be surprised by the generality of the framework presented here.

This is not a book on geostatistics. Although some chapters and examples will cover concepts from geostatistical theory, that is only to illustrate what is possible after you master geospatial data science with the Julia programming language.

Why Julia?

An effective implementation of the framework presented in this book requires a language that can:

  • Generate high-performance code
  • Specialize on multiple arguments
  • Evaluate code interactively
  • Exploit parallel hardware

This list of requirements eliminates Python, R and other mainstream languages used for data science.

How to read this book

If this is your first encounter with Julia or with programming in general, consider reading the open source book Think Julia: How to Think Like a Computer Scientist by Lauwens and Downey (2018). It introduces the language to first-time programmers and explains basic concepts that you will need to know to master the material here.

If you are an experienced programmer who just wants to quickly learn the syntax of the language, consider checking the Learn Julia in Y minutes website. If you are seeking more detailed information, consider reading the official documentation.

Assuming that you learned the basics of the language, you can proceed and read this book. It is organized in five parts as follows:

flowchart LR
  preread["Julia basics ✅"] --> partI
  subgraph partI["Part I: Foundations"]
    chapter1["1. What is geospatial data?"]
    chapter2["2. Scientific visualization"]
    chapter3["3. Interfacing with GIS"]
    chapter4["4. Geometric processing"]
    chapter1 --> chapter2
    chapter2 --> chapter3
    chapter3 --> chapter4
  end
  subgraph partII["Part II: Transforms"]
    chapter5["5. What are transforms?"]
    chapter6["6. Map projections"]
    chapter7["7. Building pipelines"]
    chapter5 --> chapter6
    chapter6 --> chapter7
  end
  subgraph partIII["Part III: Queries"]
    chapter8["8. Split-apply-combine"]
    chapter9["9. Geospatial joins"]
    chapter8 --> chapter9
  end
  subgraph partIV["Part IV: Interpolation"]
    chapter10["10. Geospatial correlation"]
    chapter11["11. Simple interpolation"]
    chapter10 --> chapter11
  end
  subgraph partV["Part V: Applications"]
    chapter12["12. Mineral deposits"]
    chapter13["13. Agricultural fields"]
    chapter14["14. Petroleum reservoirs"]
    chapter12 --> chapter13
    chapter13 --> chapter14
  end
  partI --> partII
  partI --> partIII
  partI --> partIV
  partII --> partV
  partIII --> partV
  partIV --> partV

The chapters were written to be read in sequence, but there is some flexibility on how to read the parts. I recommend reading Part I first to understand the framework and vision. After that, you will have the necessary background to follow the code examples in Part II, Part III and Part IV. Finally, you can explore the applications in Part V to solidify the concepts.

Software installation

If you want to reproduce the examples in the book, copy and paste the code below in the Julia REPL:

using Pkg

# assert Julia version
@assert VERSION  v"1.10" "requires Julia v1.10 or later"

# create fresh environment
pkg"activate @GDSJL"

# install framework
pkg"add GeoStats@0.56.0"

# install IO module
pkg"add GeoIO@1.12.12"

# install artifacts
pkg"add GeoArtifacts@0.3.1"

# install viz modules
pkg"add CairoMakie@0.11.10"
pkg"add PairPlots@2.7.1"

# install other modules
pkg"add DataFrames@1.6.1"
pkg"add CSV@0.10.14"

If you need to reproduce the exact same environment with fixed versions of indirect dependencies, please download the Project.toml and Manifest.toml files that are stored on GitHub.

Some examples require data files that are also stored on GitHub at this link.

Click on any file of interest and press the download button.

Acknowledgements

I would like to acknowledge all the contributors of the GeoStats.jl framework. You are the reason this book exists! JuliaHearts The implementation of the framework is only possible thanks to the amazing programming language advances by Bezanson et al. (2017).

A special thanks to Elias Carvalho for his outstanding contributions to the software stack, to Prof. Douglas Mazzinghy (UFMG), Prof. Leandro Martínez (UNICAMP) and Prof. Fernando Moraes (UENF) for organizing the first training courses at universities, to Prof. Francisco Heron (UFC) for his contributions on high-performance computing, to Dr. João Pinelo (AIRCentre) for organizing the JuliaEO workshop, and to colleagues in industry who support this work through research and development projects, including Patrice Mazzoni, Keila Gonçalves, Fabio Duarte, Mariana Menezes, Givago Azevedo, Fernando Villanova, Luis Gomide.

Thanks to all the reviewers of the first draft, including Maciel Zortea, Max de Bayser, Bogumił Kamiński, Ronan Arraes, Erick Chacón Montalván, Kyle Beggs.