Spatial Machine Learning

An introduction to applying machine learning techniques to spatial data with R

Justin Morgan Williams
14 min read · Dec 13, 2022
Photo by DeepMind on Unsplash


I became interested in data analysis and data science by way of a Geographic Information Systems (GIS) course during my Environmental Policy and Sustainability Management Master’s at The New School. We used the industry-standard proprietary software ArcGIS, but after graduation I lost access to this costly program. That is when I became obsessed with replicating the GIS workflow in an open source environment (see some of my other blog posts, namely GIS project with Python and GeoPandas).

Naturally, having a love for both GIS and data science, I began to tackle a few projects that applied machine learning concepts to spatial data. Initially, however, I wasn’t aware of the challenges unique to spatial data. According to Jiang¹, the following aspects make applying machine learning to spatial data challenging:

  • Spatial autocorrelation: observations that are close together in space tend to have similar values, so samples are not independent of one another
  • Spatial heterogeneity: the data do not follow an identical distribution across the study area
  • Limited ground truth: many explanatory variables are available, but only a small number of labeled (ground truth) observations
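To make the first of these challenges concrete, here is a minimal sketch of Moran's I, a common statistic for measuring spatial autocorrelation, written in base R. The 4 x 4 grid, the two example value patterns, and the helper name `morans_i` are all hypothetical and chosen only for illustration. Neighbors are assumed to be cells that share an edge (rook adjacency) with equal weights. Real projects would more likely rely on a dedicated spatial package than on a hand-rolled function like this one.

```r
# Moran's I for values x and a spatial weights matrix w (base R only).
# Hypothetical helper for illustration, not taken from any package.
morans_i <- function(x, w) {
  n <- length(x)
  z <- x - mean(x)               # deviations from the mean
  num <- sum(w * outer(z, z))    # sum over i, j of w_ij * z_i * z_j
  (n / sum(w)) * num / sum(z^2)
}

# Hypothetical 4 x 4 grid of cell centres
coords <- expand.grid(col = 1:4, row = 1:4)

# Rook neighbours: cells exactly one step apart (Manhattan distance 1)
d <- as.matrix(dist(coords, method = "manhattan"))
w <- (d == 1) * 1

# Pattern 1: smooth west-to-east gradient (neighbours look alike)
gradient <- coords$col

# Pattern 2: checkerboard (every neighbour has the opposite value)
checker <- (coords$col + coords$row) %% 2

morans_i(gradient, w)  # about 0.67: positive autocorrelation
morans_i(checker, w)   # -1: perfect negative autocorrelation
```

Values near +1 indicate that nearby cells tend to be similar, values near -1 that they tend to be dissimilar, and values near 0 suggest no clear spatial pattern. Positive autocorrelation is the problematic case for standard machine learning workflows, because random train/test splits can then place near-duplicate neighbors on both sides of the split.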


