Thoughts on Spatial Data Mining Chapter (Shekhar et al.)

This chapter reviews several spatial data mining (SDM) techniques, presents example datasets, and shows how equations can be adapted to deal specifically with spatial information. At the very beginning, the authors state that to address the uniqueness of spatial data, researchers would have to “create new algorithms or adapt existing ones.” Immediately, I wondered how these algorithms would be adapted: would the inputs be standardized to meet the preconditions of non-spatial statistics, or would the equations themselves be adapted by adding new variables to account for differences in spatial data? The authors address these questions later in their explanation of the different parts of the logistic Spatial Autoregressive Model (SAR).
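
To make the "adapt the equation" option concrete, here is a minimal sketch of the linear SAR form, y = ρWy + Xβ + ε, in which the only change from classical regression is the added spatial term ρWy. The toy sites, the adjacency matrix, the coefficient values, and the helper names `row_normalize` and `sar_predict` are my own assumptions for illustration, not taken from the chapter, and the sketch leaves out both the noise term and the logistic transformation.

```python
import numpy as np

def row_normalize(adjacency):
    """Scale each row of a 0/1 adjacency matrix so that it sums to 1."""
    adjacency = np.asarray(adjacency, dtype=float)
    row_sums = adjacency.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # isolated sites keep an all-zero row
    return adjacency / row_sums

def sar_predict(X, beta, W, rho):
    """Solve y = rho * W @ y + X @ beta for y (noise term omitted)."""
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - rho * W, X @ beta)

# Hypothetical toy data: four sites on a line, one explanatory attribute.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
W = row_normalize(A)
X = np.array([[1.0], [2.0], [3.0], [4.0]])
beta = np.array([0.5])

y_classical = sar_predict(X, beta, W, rho=0.0)  # reduces to ordinary X @ beta
y_spatial = sar_predict(X, beta, W, rho=0.4)    # each site also depends on its neighbors
```

With rho set to 0 the spatial term vanishes and the model falls back to the non-spatial regression, which is one way to read the authors' point that existing methods can be adapted rather than replaced.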

When discussing location prediction, the authors state that “Crime analysis, cellular networks, and natural disasters such as fires, floods, droughts, vegetation diseases, and earthquakes are all examples of problems which require location prediction” (Shekhar et al. 5/23). Given the heterogeneity of these data inputs, I wondered how any level of standardization is achieved in SDA, and how interoperability is maintained when the same operations are performed on such different data types.

What I gathered from this chapter is that there is considerable nuance and specificity within each SDM technique. Given the diversity of applications for each technique, from species growth analysis to land use change to urban transportation data, the choice of which attribute to include in the model greatly influences the precision of any observed correlation. (See the location prediction example, which uses vegetation durability over vegetation species.)

There is a clear link between SDM and data visualization, as illustrated by the following statement about visualizing outliers: “there is a research need for effective presentations to facilitate the visualization of spatial relationships while highlighting spatial outliers.” Clearly, accurate spatial models and the effective presentation of their results to the intended audience overlap.
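
Before spatial outliers can be highlighted in a visualization, they have to be identified. The sketch below shows one simple neighborhood test: compare each site's value with the mean of its neighbors, standardize those differences, and flag sites whose score exceeds a threshold. The data, the neighbor lists, the threshold of 2.0, and the function name `spatial_outlier_scores` are assumptions of mine for illustration, not the chapter's own example.

```python
import numpy as np

def spatial_outlier_scores(values, neighbors):
    """Standardized difference between each site and its neighborhood mean."""
    values = np.asarray(values, dtype=float)
    # Difference between a site's value and the average value of its neighbors.
    diffs = np.array([values[i] - values[nbrs].mean()
                      for i, nbrs in enumerate(neighbors)])
    # Absolute z-score of each difference across all sites.
    return np.abs((diffs - diffs.mean()) / diffs.std())

# Hypothetical readings at seven sites along a line; site 3 is unusually high.
values = [10, 11, 10, 30, 11, 10, 9]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5]]

scores = spatial_outlier_scores(values, neighbors)
flagged = np.flatnonzero(scores > 2.0)  # indices of candidate spatial outliers
```

The flagged indices could then drive the visual emphasis, for example a different marker color on a map, which is where the modeling and presentation concerns meet.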

