Don't know where your data is from? Bayesian modeling for unknown coordinates
Key takeaways
- Mineral exploration is an especially strong motivating case for spatial probability models: drill samples are sparse but strongly spatially correlated.
- The running example is a dataset of point-referenced uranium and vanadium concentration measurements from Walker Lake.
- The Gaussian process model is modified to handle data whose locations are not known precisely and are observed only with substantial measurement noise.
An especially strong motivating case for spatial probability models comes from the mining industry. During exploration for mineral resources, prospectors take geologic samples by drilling holes and examining the extracted material for the presence or concentration of valuable ores. These data typically show strong spatial correlation, but constructing a fully detailed geophysical model is at times infeasible because we can observe very little of the underground conditions, even though remote sensing techniques like ground-penetrating radar and gravimetry have dramatically improved our ability to characterize Earth’s subsurface. To address this challenge, we would like to construct a probability model that uses nearby data to predict a variable of interest at a new location.
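As a rough illustration of what "using nearby data to predict at a new location" can mean, here is a minimal NumPy sketch of zero-mean Gaussian process prediction with a squared-exponential kernel. The coordinates, values, and hyperparameters are made up for illustration and are not the Walker Lake data; the function names `se_kernel_matrix` and `gp_predict` are hypothetical helpers, not part of any library.

```python
import numpy as np


def se_kernel_matrix(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between point sets a (n, D) and b (m, D)."""
    diff = a[:, None, :] - b[None, :, :]
    return variance * np.exp(-0.5 * np.sum(diff**2, axis=-1) / lengthscale**2)


def gp_predict(coords, values, new_coords,
               lengthscale=1.0, variance=1.0, noise_sd=0.1):
    """Predictive mean and variance of the latent field at new_coords.

    Assumes a zero-mean GP prior and i.i.d. Gaussian observation noise.
    """
    K = se_kernel_matrix(coords, coords, lengthscale, variance)
    K += noise_sd**2 * np.eye(len(coords))
    K_star = se_kernel_matrix(new_coords, coords, lengthscale, variance)

    chol = np.linalg.cholesky(K)
    # alpha = K^{-1} values, computed via the Cholesky factor
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, values))
    mean = K_star @ alpha

    v = np.linalg.solve(chol, K_star.T)
    var = variance - np.sum(v**2, axis=0)
    return mean, var


# Made-up sample locations and measurements (illustrative only)
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
values = np.array([1.0, 2.0, 1.5, 2.5])

# One point surrounded by data, one point far from all data
new_coords = np.array([[0.5, 0.5], [10.0, 10.0]])
mean, var = gp_predict(coords, values, new_coords)
print(mean, var)
```

In this sketch the prediction near the samples borrows strength from them and has reduced variance, while the prediction far from all samples falls back to the prior mean and variance.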
To illustrate the problem, we will use a dataset of point-referenced uranium and vanadium concentration measurements from Walker Lake. The data originate from Isaaks and Srivastava’s An Introduction to Applied Geostatistics and are distributed with the R package gstat.
Up to this point, you may have seen lots of examples of how Gaussian process models are used in robotics, spatial statistics, neuroscience, etc. Now, we work through a more exotic example which modifies the Gaussian process model to accommodate the case in which the actual location of our data points is not known precisely and is only observed with substantial measurement noise. Spatial location error changes the covariance and the prediction problem itself, a point emphasized in geostatistical work on location error by Cressie and Kornak and in Gaussian process regression work on noisy spatial inputs by Cervone and Pillai.
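One way to see why location error changes the covariance is a small numerical experiment. The sketch below is not the model developed in this post; it is a toy check under stated assumptions: a squared-exponential kernel with unit variance, and true locations equal to the recorded ones plus independent, isotropic Gaussian jitter with standard deviation `loc_sd`. Under those assumptions the kernel averaged over the jitter has a closed form for two distinct observations, which the code compares against a Monte Carlo estimate. All names and numbers here are hypothetical.

```python
import numpy as np


def se_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between paired points a and b (..., D)."""
    sq_dist = np.sum((a - b) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / lengthscale**2)


def expected_se_kernel(s1, s2, lengthscale=1.0, loc_sd=0.5):
    """Closed-form E[k(s1 + e1, s2 + e2)] for two distinct observations,
    with independent jitter e1, e2 ~ N(0, loc_sd^2 I)."""
    s1, s2 = np.asarray(s1, float), np.asarray(s2, float)
    dim = s1.shape[-1]
    # The difference of two jitters has variance 2 * loc_sd^2 per coordinate
    inflated = lengthscale**2 + 2.0 * loc_sd**2
    scale = (lengthscale**2 / inflated) ** (dim / 2)
    sq_dist = np.sum((s1 - s2) ** 2, axis=-1)
    return scale * np.exp(-0.5 * sq_dist / inflated)


rng = np.random.default_rng(0)
s1 = np.array([0.0, 0.0])   # recorded location of observation 1
s2 = np.array([1.0, 0.5])   # recorded location of observation 2
loc_sd = 0.5                # assumed location-noise standard deviation
n_draws = 200_000

# Simulate plausible true locations around the recorded ones
x1 = s1 + rng.normal(scale=loc_sd, size=(n_draws, 2))
x2 = s2 + rng.normal(scale=loc_sd, size=(n_draws, 2))

naive = se_kernel(s1, s2)                  # ignores location error
mc_estimate = se_kernel(x1, x2).mean()     # averages over location error
closed = expected_se_kernel(s1, s2, loc_sd=loc_sd)
print(naive, mc_estimate, closed)
```

In this toy setting the averaged covariance between the two points is lower than the value obtained by plugging the recorded coordinates straight into the kernel, and it decays over an inflated length scale, so treating noisy coordinates as exact misstates how strongly nearby observations are related.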