Dungan et al. (2002) explicitly define various terms related to scale. They use a statistical approach to demonstrate that changing the size of sampling or analysis units can affect whether a phenomenon is detected.
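A minimal Python sketch can make the effect of unit size concrete. It is not the authors' method: the simulated grid of independent random values, the block sizes, and the function names `make_grid`, `aggregate`, and `variance_by_unit_size` are all assumptions chosen for illustration. It averages the same fine grid into progressively larger square units and reports the variance among units at each size.

```python
import random
import statistics


def make_grid(n, seed=0):
    """Return an n x n grid of independent standard-normal values (an assumed toy field)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]


def aggregate(grid, block):
    """Average the grid into non-overlapping block x block units; return the unit means."""
    n = len(grid)
    if n % block != 0:
        raise ValueError("block size must divide the grid size evenly")
    means = []
    for r in range(0, n, block):
        for c in range(0, n, block):
            cells = [grid[i][j]
                     for i in range(r, r + block)
                     for j in range(c, c + block)]
            means.append(sum(cells) / len(cells))
    return means


def variance_by_unit_size(grid, blocks):
    """Map each block size to the population variance among its unit means."""
    return {b: statistics.pvariance(aggregate(grid, b)) for b in blocks}


grid = make_grid(64, seed=42)
for block, var in variance_by_unit_size(grid, [1, 2, 4, 8]).items():
    print(f"unit size {block}x{block}: variance among units = {var:.4f}")
```

Under these assumptions the variance among units shrinks as the units grow, so a pattern that stands out at one unit size may be muted at another.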
The authors’ emphasis that the issue of scale is one of choosing the correct unit size is important. As geographers, we may take this point for granted: we know from procedures like georeferencing that an acceptable Root Mean Square Error (RMSE) depends on the map’s scale, so the goal is not simply the lowest RMSE. It can be difficult to convey to people from other walks of life that the best answer is not always the most precise one, but that it depends. From a statistical standpoint, the same is often true when training classifiers for remote sensing. A finely tuned sample spectral signature can result in overfitting, that is, a ‘pure’ sample that does not represent the heterogeneity of units of the same kind. An overfitted classifier may look statistically accurate, yet produce results that are nonsensical in reality.
The authors also discuss the modifiable areal unit problem (MAUP) and consider geostatistical methods. A full count, or census, of the extent is assumed to be the control method that captures all significant patterns. If this is the case, I wonder whether increased computational speed, more data, and the general realm of data mining (or technological advances from geospatial cyberinfrastructure, such as parallel computing) can avoid the MAUP. Such an exploratory method need only consider a large extent to find all the patterns within it (arbitrarily large extents are easy to choose). Does having all the data, and being able to process it all, negate the need to consider appropriate sample unit sizes?
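One way to probe this question is a small simulation in which every cell of the extent is observed, so nothing is sampled, and only the zoning changes. The Python sketch below is an illustration under stated assumptions, not an analysis from Dungan et al.: the two synthetic variables share a weak west-to-east gradient plus independent noise, and the names `make_fields`, `pearson`, `strip_means`, and `flatten` are hypothetical. It compares the correlation between the two variables at the cell level, after aggregating into vertical strips, and after aggregating into horizontal strips.

```python
import math
import random


def make_fields(n, slope=0.1, seed=0):
    """Two n x n grids sharing a west-to-east trend plus independent noise (assumed toy data)."""
    rng = random.Random(seed)
    x = [[slope * j + rng.gauss(0.0, 1.0) for j in range(n)] for _ in range(n)]
    y = [[slope * j + rng.gauss(0.0, 1.0) for j in range(n)] for _ in range(n)]
    return x, y


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def strip_means(grid, by_column):
    """Average the full grid into n strips: vertical if by_column is True, else horizontal."""
    n = len(grid)
    if by_column:
        return [sum(grid[i][j] for i in range(n)) / n for j in range(n)]
    return [sum(row) / n for row in grid]


def flatten(grid):
    """Return all cell values as one list (the full census at cell level)."""
    return [value for row in grid for value in row]


x, y = make_fields(32, seed=1)
cell_r = pearson(flatten(x), flatten(y))
col_r = pearson(strip_means(x, True), strip_means(y, True))
row_r = pearson(strip_means(x, False), strip_means(y, False))
print(f"cells: r = {cell_r:.2f}")
print(f"vertical strips: r = {col_r:.2f}")
print(f"horizontal strips: r = {row_r:.2f}")
```

Under these assumptions the vertical strips, which preserve the gradient, yield a much stronger correlation than the individual cells, while the horizontal strips average the gradient away. The data are complete in all three cases, which suggests that a full census by itself may not remove the need to choose units deliberately.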