Thoughts on “Spatial data mining and geographic knowledge discovery—An introduction”

In this paper, Guo and Mennis survey common spatial data-mining tasks and how they have developed. Toward the end, they point out that “we often claim to ‘let the data speak for themselves,’” but “data cannot tell stories unless we formulate appropriate questions to ask and use appropriate methods to solicit the answers from the data” (Guo & Mennis 407). They go on to claim that “data mining is data-driven but also, more importantly, human-centered… the abundance of spatial data provides exciting opportunities for new research directions but also demands caution in using these data” (Guo & Mennis 407). These contentions get to the heart of why GIS is a science and not merely a tool. If GIS were only a tool, the questions asked and how they are answered would not matter nearly as much, since a tool is concerned primarily with input and output rather than with the process and context involved.

These quotes also brought to mind the incredible potential of openly available data from social media sites, as well as their limitations. While such large quantities of data have been, and continue to be, quite useful in GIS research, it will always be important to understand which conclusions can legitimately be drawn from studies conducted with such data. One example comes from my father, who is a geography professor. He had a student who wanted to use data from a social media platform (I think it was Instagram or some similar image-based site) to map where a certain plant was distributed in a national park. She planned to do this by taking the locations of all pictures mentioning the plant to get a sense of where it predominantly grows in the park. When she had conducted her analysis, she discovered something incredible: the plants grow in long, straight, narrow lines within the park. However, my father immediately saw the flaw in her conclusion: the geotagged pictures were taken along paths, and the paths were what she was seeing! The data showed where visitors take photographs, not where the plant grows.

Therefore, as spatial data become larger and more accessible, we must be increasingly cautious about how we use them and what conclusions we draw from them, just as Guo and Mennis point out.
