Making Use of Heterogeneous, Qualitative Data on the Neogeoweb

Source: xkcd

Elwood reviews areas of GIScience that are crying out for advancement in the wake of new Web 2.0 technologies and the flood of big data that comes with them. More and more user-generated content is created every day, and a great deal of it carries explicit or implicit geospatial information. Two key stumbling blocks to harnessing these new sources of data are the heterogeneous, unstandardized nature of Web 2.0 content and the qualitative nature of the spatial knowledge within it.

Heterogeneity of geospatial data on the web is an acute issue: because content is user-generated, the range of standards and conventions used to communicate information is huge. When you factor in the equally diverse, and growing, number of Web 2.0 platforms and services, things are even more daunting. If there’s a certain class of content you’re interested in analyzing, how do you make a structure out of what is mostly unstructured text? How do you standardize when the list of standards is longer than your arm? As the XKCD comic suggests, creating a new standard is typically an exercise in futility.

The qualitative nature of spatial information and meaning is another stumbling block in the use of data from the Geoweb. As an example of this, suppose you want to geocode a (location-disabled! Ha!) tweet of mine that says “Home, finally :)”. Even if you have detailed biographical and address information about me, you will still encounter ambiguities. By “Home”, do I mean my house? My neighbourhood? My hometown? My residence while I’m at school? My parents’ region of origin where my extended family lives? How do you even begin to automate the processing of such context-dependent information? And is there any existing spatial data model that can effectively represent the concept of home?
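To make the ambiguity concrete, here is a minimal sketch in Python of what a geocoder would have to carry around for that one tweet. It is only an illustration, not anything Elwood proposes: the `Candidate` class, the `interpret` function, the personal gazetteer and especially the weights are all hypothetical and made up for this example. The point is simply that "Home" resolves to a ranked set of possible referents at different spatial scales, not to a single coordinate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    label: str   # what "home" might refer to
    scale: str   # rough spatial granularity of that referent
    weight: int  # hypothetical plausibility score, invented for illustration


# A made-up personal gazetteer: one vague term, several possible places.
PERSONAL_GAZETTEER = {
    "home": [
        Candidate("house", "address", 4),
        Candidate("neighbourhood", "district", 2),
        Candidate("hometown", "city", 2),
        Candidate("residence at school", "address", 1),
        Candidate("family's region of origin", "region", 1),
    ]
}


def interpret(term, gazetteer):
    """Return every candidate referent for a term, most plausible first,
    paired with its share of the total weight."""
    candidates = gazetteer.get(term.lower(), [])
    total = sum(c.weight for c in candidates)
    ranked = sorted(candidates, key=lambda c: c.weight, reverse=True)
    return [(c, c.weight / total) for c in ranked]


for candidate, share in interpret("Home", PERSONAL_GAZETTEER):
    print(f"{candidate.label} ({candidate.scale}): {share:.0%}")
```

Even this toy version dodges the hard part: the weights would have to shift with context (time of year, who I am talking to, what I tweeted an hour earlier), and nothing in a conventional point, line or polygon model says how to store that.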

It is often said that while Geoweb 2.0 technologies are accessible, engaging and intuitive for far more people than traditional GIS has ever been, the analysis tools for this exciting new medium lag behind those of GIS. This is in no small part due to the difficulty of organizing the web of heterogeneous, qualitative data that these new tools produce. Though the challenge looms large, Elwood is optimistic that GIScience can rise to the occasion if we take a multidisciplinary approach.

