Archive for the ‘506’ Category

Remote sensing uncertainty in GIS

Friday, April 5th, 2013

G. G. Wilkinson’s article is dated, which matters in a field that is evolving rapidly. Nonetheless, in my view, the author’s argument is still valid today. He discusses uncertainty and data structures in remote sensing and GIS. Sophisticated technologies and remote sensing don’t automatically solve the problem of delimiting boundaries. Even with technological development, classification is still a complex task. It is like trying to create boundaries where the world may actually be more like a continuous landscape. We try to define distinct classes of land cover or topographic zones, for example, but in reality is there a frontier between different types of land? This partially explains why uncertainty is attached to any remote sensing technique. Taking the limits of remote sensing techniques into account, the author evaluates different procedures and uses of data structures. He suggests that part of the further development lies in identifying the techniques and technologies that best represent the phenomenon the remote sensing data are intended to capture. Still, the problems of error and uncertainty are unlikely to be solved easily, even with technical developments in data structures or with visualization techniques such as 3D environments and virtual reality.


Certainty of Uncertainty!

Thursday, April 4th, 2013

Helen Couclelis wrote an article called “The Certainty of Uncertainty”, and I think that David J. Unwin is making a similar point. The problem of uncertainty is not merely technical. Uncertainty doesn’t only come from data and information; it is also about geographical knowledge that is sometimes inevitably uncertain. There are things that we simply can’t know. The literature focuses on finding technical solutions, but the author explains that “at the heart of all the contributions is a concern for exactly how we can usefully represent our geographic knowledge in the primitive world of the digital computer”.

As mentioned in our previous discussion about ontology, we conceptualize the world as field based or object based, which corresponds to raster or vector in GIS. The author shows that both representations come with specific uncertainties. Furthermore, we discussed how delimiting boundaries is often a difficult task in which uncertainty is inevitable. The conclusion brings us back to the first discussion in class about GIS as a tool or as a science, and about the determinism of the technology. The author suggests that rethinking the way we use the technology and the way we structure problems and databases is essential to achieving sensitivity in GIS. It is about adapting the technology to represent knowledge in a way that takes our conceptualization of the world into consideration, rather than merely relying on GIS technology to calculate the world for us.

Couclelis, H. (2003). The Certainty of Uncertainty: GIS and the Limits of Geographic Knowledge. Transactions in GIS, 7(2), 165-175.



Thursday, April 4th, 2013

Uncertainty lies at the core of GISci, and MacEachren et al. acknowledge that the GISci community has given more attention to formalizing approaches to uncertainty than other communities, such as the information visualization community (p. 144). The authors go through several examples of how uncertainty can be visualized, from changes in hue to symbols with different transparencies that depict where uncertain data may exist. What piqued my interest was the interactive visualization techniques that let users control how uncertainty is depicted. Instead of permanently adding a layer of complexity that can obstruct the data and confuse readers about what it is trying to depict, the user is in full control of how much or how little information about uncertainty is available to them. To me this seems like a better solution than simply finding a single “ideal” way to represent uncertainty visually in a static manner, especially since every individual will have their own preferences about what “best” means (context matters!). What I don’t quite agree with is the authors’ assertion that humans are not adept at using statistical information to make decisions and instead rely on heuristics (based on a study from 1974). Since the quantitative revolution, hasn’t statistics been brought to the forefront of geography, to the point that we may now rely on statistics too much? That being said, visualizing uncertainty can take many forms, including charts, changes in opacity, and 3D graphics, and the way in which uncertainty should be viewed will ultimately be context specific, to meet the goals of the researcher.
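As a rough illustration of the user-controlled depiction described above, here is a minimal Python sketch that maps an uncertainty score to symbol opacity. It is not taken from MacEachren et al.; the function name, the 0 to 1 uncertainty scale, the `emphasis` setting and the `min_alpha` floor are all assumptions made for the example.

```python
def uncertainty_to_alpha(uncertainty, emphasis=1.0, min_alpha=0.2):
    """Map an uncertainty score in [0, 1] to a symbol opacity.

    emphasis is the (hypothetical) user-controlled setting:
    0.0 hides uncertainty entirely (every symbol fully opaque),
    1.0 fades the most uncertain symbols down to min_alpha.
    """
    if not 0.0 <= uncertainty <= 1.0:
        raise ValueError("uncertainty must be between 0 and 1")
    if not 0.0 <= emphasis <= 1.0:
        raise ValueError("emphasis must be between 0 and 1")
    # How much opacity to remove from a fully opaque symbol.
    fade = emphasis * uncertainty * (1.0 - min_alpha)
    return 1.0 - fade


if __name__ == "__main__":
    # The same uncertain feature drawn at three user settings.
    for emphasis in (0.0, 0.5, 1.0):
        print(emphasis, round(uncertainty_to_alpha(0.8, emphasis), 2))
```

The point of the sketch is only that the reader, not the map maker, chooses `emphasis`; a real interface would wire it to a slider and pass the resulting opacity to whatever plotting library is in use.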


Integrating RS and GIS

Thursday, April 4th, 2013

Brivio et al. provide a case study in which the integration of GIS and RS compensates for limitations that exist in each technology. The study is a good example of how these two closely related fields can be combined to produce a more realistic representation of various phenomena. While this case study specifically used additional GIS data as a supplementary component to improve the RS classification of flooded areas, RS data can similarly be used as a tool to produce GIS data (e.g. a land cover classification dataset derived from remote sensing data). However, while there are many advantages to integrating the two, several issues come to mind. RS data is pixel based, while spatial data can be vector or raster based. Having to convert one to the other in order to do analysis would compound issues of accuracy and uncertainty. We know RS already has its own issues related to scale, noise and technological limitations, but these issues can quickly get amplified, and I can imagine that recognizing these sources of uncertainty will be difficult once the datasets are thoroughly entangled with one another. Also, what kind of data model is required for this integration? Spatial data is generally represented in 2D, while RS hyperspectral cubes have several dimensions. A researcher who is interested in integrating such technologies has to be well versed in the inherent issues that each type of data presents in order to provide a comprehensive analysis, which is definitely no small feat.


Are We Certain that Uncertainty is the Problem?

Thursday, April 4th, 2013

Unwin’s 1995 paper on uncertainty in GIS was a solid overview of some of the issues with data representation that might fly under the radar or be assumed without further comment in day-to-day analysis. He discussed vector (or object) and raster (or field) data representations, and the underlying error inherent in the formats themselves, rather than the data, per se.

While the paper itself is clear and fairly thorough, I can’t help but question whether error and uncertainty are worth fretting over. Of course there is error, and there will always be error in a digital representation of a real-world phenomenon. Those people, such as scientists and policy makers, who rely on GIS outputs, are not oblivious to these representation flaws. For instance, raster data is constrained by resolution. It is foolhardy to assume that the land cover in every inch of a 30-meter grid cell is exactly uniform. It is also wrong to suggest that some highly mobile data (like a flu outbreak) would remain stationary over the course of the interval between sensing/mapping. There are ways around this, such as spatial and temporal interpolation algorithms and other spatial statistics, and I feel like estimates are often sufficient. If they aren’t, then perhaps the problem isn’t with the GIS, but rather in the data collection. Better data collection techniques, perhaps involving more remote sensing (physical geography) or closer fieldwork (social geography) would go far in lessening error and uncertainty.

With all of that said, I am not about to suggest that GIS is perfect. There is always room for growth and improvement. But, after all, the ultimate purpose of visualizing data is to understand and gain a mental picture of what is happening in the real world. An error-free or completely “certain” data representation is not only impossible within human limitations, but it is also not particularly necessary.

- JMonterey

Thursday, April 4th, 2013

No matter how good technology becomes, we will always face challenges in data uncertainty and error; the question is whether we can develop appropriate techniques to mitigate the effects of this noise and come away with the correct signal. As MacEachren et al. (2005) point out in their article titled “Visualizing Geospatial Information Uncertainty”, we base decisions on this information, so the uncertainty inherent in the data must be taken into account.

There are multiple dimensions of uncertainty, as the authors point out, ranging from the credibility of a source to the precision of a physical variable, and these compound, affecting how correct the end result will be. They operate across many scales, including the direct attribute of the information, the specific context or location of the information (which may not be what you want to apply the information to), and time. It all seems very complicated when examined through this framework, but it is important to take these dimensions into account in order to have confidence in your product.


Personally, I have experienced a lot of uncertainty while trying to create a global map of administrative subdivisions. Every country collects data at different resolutions and at different times, yet these countries are supposed to be contiguous, as we well know. The borders do not always align, but who is right? Furthermore, this issue is compounded when you consider the global land mass as a whole. We want to have an accurate total area of land surface, but if you trust each country to represent its land correctly and then end up with an incorrect total, who is wrong? Where do you remove land? Where do you add it? These are some of the challenges I have faced with uncertainty, and I was not qualified to make the right decision.


What I didn’t do at the time was try to quantify and visualize the uncertainty, which, as the authors say, is crucial to making sure the data is usable and that you are confident it is correct for answering the questions you are trying to answer.


Pointy McPolygon

What’s the hard part now?

Thursday, April 4th, 2013

Remote sensing and GIS technology has changed significantly since Wilkinson (1996) wrote his review of how the two fields overlap. Hyperspectral imagery is now commonplace, and the software is well equipped to deal with it. We still struggle with handling error and uncertainty, but there are prescribed ways of dealing with each issue. Corrections for atmospheric conditions, topography, viewing angle, sensor effects, and georeferencing are now routinely applied to eliminate some of the error introduced during data collection. Things like fuzzy logic help to deal with uncertainty, although it remains an issue. As data collection techniques further improve, our ability to deal with this uncertainty will become less and less important.

Most of the current issues still lie in data models. The complementary nature of GIS and remote sensing is evident; however, these two technologies speak different languages in situations where we expect them to communicate and reinforce their complementary relationship. This becomes even more difficult when we try to represent more complex relationships that are no longer 2-dimensional and that have hierarchical classifications. Personally, I find that the two commercial software packages, one for each technology, interact quite well when performing simple tasks, like making a supervised classification and turning that into a GIS layer. However, as the data becomes more complex, and the classifications with it, the ability of the packages to communicate with each other gets progressively worse.
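To make the fuzzy logic idea a little more concrete, here is a minimal Python sketch of fuzzy class membership for a single pixel. It is only an illustration under stated assumptions: the use of an NDVI value as input, the three class names and the triangular breakpoints are all hypothetical and are not taken from Wilkinson’s review.

```python
def triangular(x, left, peak, right):
    """Triangular membership function: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical (left, peak, right) breakpoints on an NDVI-like index.
CLASSES = {
    "bare soil": (-0.1, 0.1, 0.35),
    "grassland": (0.2, 0.45, 0.7),
    "forest": (0.5, 0.8, 1.01),
}


def fuzzy_memberships(ndvi):
    """Return a degree of membership in every class instead of one hard label."""
    return {name: triangular(ndvi, *abc) for name, abc in CLASSES.items()}


if __name__ == "__main__":
    # A pixel in the overlap zone belongs partly to two classes.
    for name, degree in fuzzy_memberships(0.6).items():
        print(f"{name}: {degree:.2f}")
```

A pixel in an overlap zone keeps a partial membership in both neighbouring classes, so the uncertainty about where the boundary lies is carried along with the data rather than being discarded by a hard classification.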


Pointy McPolygon

Problems of classification

Thursday, April 4th, 2013

Since the paper by Wilkinson in 1996, many satellites have been put into orbit and several million GB of satellite imagery have been collected. More importantly, with the coming of the digital camera there has been an explosion in the number of digital images captured. People were quick to spot the opportunity to leverage the data in these images; hence a lot of research has been conducted in the image processing domain (mainly in biometrics and security). That being said, some of the most successful approaches in other domains have not performed as well when applied to satellite images, and the challenges outlined in the paper still hold true today.

As I understand it, this is mainly because of the great diversity in satellite images. The resolution is only one part of the equation. The main problem lies in the diversity of the things being imaged. This makes it very difficult to come up with training samples that are a good fit. Thus, traditional machine learning techniques based on supervised learning have a hard time. Moreover, the problem is compounded by the fact that when we classify satellite images, we are generally interested in extracting not one but several classes simultaneously, with great accuracy. The algorithms do perform well when classification is performed one image at a time, but significant human involvement is needed to select good training samples for each image. To the best of my knowledge, no technique exists that can classify satellite images completely automatically.

-Dipto Sarkar

Visualizing Uncertainty: mis-addressed?

Wednesday, April 3rd, 2013

“Visualizing Geospatial Information Uncertainty…” by MacEachren et al. presents a good overall view of geospatial information uncertainty and how to visualize it. That said, many parts seemed to convey that all uncertainty must be defined in order to make correct decisions. In my realm of study, although it would be nice to eliminate uncertainty or place it in a category, just recognizing that there is uncertainty is often definition enough to make informed decisions based on the observed trends. Furthermore, the authors seem to separate the different aspects of the environment, or the factors that lead to uncertainty, from how it may be visualized. The use of many definitions and descriptions obscures which factors really produce uncertainty and the resulting issues with visualization. The way visualized uncertainty is presented contrasts greatly with the ambiguity of the authors’ definitions of uncertainty and its representation. The studies and the ways uncertainty can be visualized are a great help in decision making and in recognizing further uncertainties.

One thing that would have helped in addressing uncertainty and its visualization would have been to integrate ideas and knowledge from the emerging field of ecological stoichiometry, which looks at uncertainty, the flow of nutrients and energy, and the balance within ecosystems to answer and depict uncertainty. I believe that ecological stoichiometry would address many of the challenges in the identification, representation and translation of uncertainty within GIS and help to clarify many problems. This stoichiometric approach fits within the multi-disciplinary approach to uncertainty visualization described in the article. However, since the article is limited to more generally understood approaches rather than more complex ones such as stoichiometry, might some of the proposed challenges in recognizing and visualizing uncertainty not actually exist? I would argue yes, but then again more challenges may arise in the depiction, understanding and translation of uncertainty.


Error prone GIS

Monday, April 1st, 2013

In any data-related field, great effort is put into ensuring the quality and integrity of the data being used. It has long been recognized that the quality of results can only be as good as the data itself; moreover, the quality of the data is no better than the worst apple in the lot. Hence, in any data-intensive field, great effort is put into data pre-processing to understand and improve the quality of the data. GIS is no exception when it comes to being cautious about the data.

The various kinds of data handled in GIS make the problem of errors more profound. Not only does GIS work with vector and raster data, it also needs to handle data in the form of tables. Moreover, the way the data is procured and converted is also a concern. Often, data is obtained from external sources in the form of tables of incidents that have some field(s) containing the location of the event. Usually this data was not collected with the specific purpose of being analysed for spatial patterns; hence, the location accuracy of the events varies greatly. Thus, when these files are converted into shapefiles, they inherit the inaccuracy built into the dataset.

One of the things to remember, however, is that the aim of GIS is to abstract reality into a form that can be understood and analysed efficiently. Thus it is important not to lay too much emphasis on how accurately the data fits the real world. The emphasis should instead be on finding the level of abstraction that is ideal for the application scenario and then understanding the errors that can be accepted at that level of abstraction.

-Dipto Sarkar

To Digitize or Not to Digitize?

Thursday, March 21st, 2013

No process in GIS is perfect. There are always limitations, many of which can be ignored, while others must at least be acknowledged. How the results of an analysis will be applied can have a drastic impact on whether errors are ignored, acknowledged, or painstakingly resolved. Consider geocoding addresses at the national level to analyze socioeconomic trends: the power of numbers will outweigh the error generated during processing, so it is enough to acknowledge the limitation. On the other hand, the Enhanced 911 system requires that all addresses be geocoded as precisely as possible for times of emergency.

As a way of increasing the accuracy of geocoding processes, would it be sufficient to input a number of intermediary points to accommodate the uneven distribution of addresses within a given address segment? This would essentially act as a middle ground between leaving the results entirely up to geocoding and digitizing all addresses manually. After all, why do the corners of blocks have to act as the only reference points? It’s possible that there is an inherent topology that would be lost if this were implemented, but I cannot speak to that.

While reading Goldberg et al., one geocoding nightmare kept running through my head. It surprised me that it was not touched on directly. How has Japan addressed the situation? As an OECD country, it likely possesses sufficient GIS infrastructure. If I’m not mistaken, though, house addresses are not based on location so much as time. Within prefectures, addresses are assigned temporally, whereby the oldest structure has a lower value than a newer structure. Even if they are immediately adjacent, two structures can have significantly different addresses. Just a thought.


Statutory warning: Geocoding may be prone to errors

Thursday, March 21st, 2013

The last few years have seen tremendous growth in the usage of spatial data. Innumerable applications have contributed to the gathering of spatial information from the public. Applications people use every day, like Facebook and Flickr, have also introduced features with which one can report one’s location. However, people are not generally interested in geographic lat-long coordinates. Names of places make more sense in day-to-day life. Hence, these applications report not the spatial coordinates but the named location (at different scales) where the person is. The tremendous amount of location information generated has not gone unnoticed, and several studies have been conducted to leverage this information. But one issue that is frequently overlooked in studies that use these locations is the accuracy of the geocoding service that was used to get the named locations. Not only is displacement a problem, but the scale at which the location was geocoded will also have an effect on the study. The comparison of the accuracy of the available geocoding services by Roongpiboonsopit et al. serves as a warning to anyone using geocoded results.
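For anyone who wants to check the displacement of a geocoded point against a trusted reference position, a minimal Python sketch could look like the following. This is not the procedure used by Roongpiboonsopit and Karimi; it simply uses the standard haversine great-circle formula on a spherical Earth, and the function name and the sample coordinates are made up for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/long points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Hypothetical reference position and geocoder output for one address.
    reference = (45.5048, -73.5772)
    geocoded = (45.5051, -73.5760)
    print(f"displacement: {haversine_m(*reference, *geocoded):.0f} m")
```

Running the same check over a sample of addresses with known positions gives a simple picture of how far a given service tends to displace points before its results are used in a study.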

-Dipto Sarkar


Is There a Problem with “Authoritative” Geocoding?

Thursday, March 21st, 2013

Roongpiboonsopit and Karimi provide a very interesting study on the quality of five relatively well-known geocoding services. Google Maps is something I use very often, but I had never really thought critically about the errors that may exist and their consequences. A study such as this allows us to understand the underlying parameters that go into these geocoding services and how they may differ from provider to provider. One aspect that was really interesting to me was the difference in positional accuracy across land uses. Obviously, there tends to be an “urban bias” of sorts when geocoding addresses. As a result, one is more likely to get an incorrect result when searching for an address in rural or suburban areas. While this makes sense due to spatial issues, I thought that this could theoretically be extended to other criteria. As LBS becomes more popular and geocoding increases in importance, will certain companies offer “better” geocoding services to businesses that are willing to pay for them? For example, Starbucks could make a deal with Google to ensure that all of its locations are correctly and precisely geocoded. Taking it to the extreme, Google could even make a deal to deliberately sabotage the addresses of other coffee shops. While I think this specific case may be unlikely, it does raise issues about having completely authoritative geocoding services. As we increasingly rely on these geocoding services, the companies offering them have a large influence on the people who use them.

This leads into the idea of possibly relying on participatory services, such as OpenStreetMap. OSM has made leaps and bounds in terms of the quantity and quality of its spatial data over the past few years. I am curious to see how it would match up with the five services in this paper. OSM relies on the ability of users to edit data if they feel it is incorrect. Therefore, the service is, in theory, continually being updated, depending on the number of users editing a certain area. As a result, errors may be less likely to be consistently missed than with a more authoritative geocoding service. It would also be interesting to see which types of buildings are geocoded more or less accurately. As we continue to enter this age of open and crowdsourced spatial data, I believe it has the potential to provide us with even better services.



Unloading on Geocoding

Thursday, March 21st, 2013

Geocoding, like many of the concepts that we study in GIScience, is very dependent on the purpose of the process. The act of geocoding is often equated with address matching, which is sometimes correct; however, it can also mean georeferencing any geographic object, not just postal codes. This implies that our perception of geocoding will affect the ways that we go about doing it.


There are many ways to geocode, as described by Goldberg, Wilson, and Knoblock, and no single one of them is universally correct. Each method uses different algorithms to try to match some identifier to a geographic reference. For example, it might find the length and endpoints of a street and then use a linear interpolation to find the location of a given postal code. The geographical context also bears a great importance in determining which algorithm to use. For example, the method described above may work better in a city with short, rectangular blocks, however it may be less applicable in rural China. These are some of the things that one has to consider when choosing a method of geocoding.
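The linear interpolation idea mentioned above can be sketched in a few lines of Python. This is a simplified illustration, not the method of any particular geocoder discussed by Goldberg et al.: it assumes a straight street segment, an address number range for that segment, and evenly spaced addresses, and the function name and parameters are hypothetical.

```python
def interpolate_address(number, low, high, start, end):
    """Estimate the position of an address number along a street segment.

    low/high   -- address numbers at the two ends of the segment
    start/end  -- (x, y) coordinates of those two ends
    Assumes addresses are spread evenly along a straight segment.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    if not low <= number <= high:
        raise ValueError("address number is outside the segment's range")
    t = (number - low) / (high - low)  # fraction of the way along the segment
    x = start[0] + t * (end[0] - start[0])
    y = start[1] + t * (end[1] - start[1])
    return (x, y)


if __name__ == "__main__":
    # Number 150 on a block numbered 100 to 200, running 100 m due east.
    print(interpolate_address(150, 100, 200, (0.0, 0.0), (100.0, 0.0)))
```

The even-spacing assumption is exactly where the geographic context comes in: it is reasonable for short, regular city blocks and much less so for long rural segments where houses are clustered unevenly.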


The future of geocoding is perhaps less certain than that of many other areas of GIScience, because as technology and georeferencing become more ingrained in our society, the algorithms used to match these objects with a geographic location will become less important. Things like GPS are becoming more and more commonplace in many appliances; however, this brings up questions of privacy. Ultimately, the future of geocoding will be a balancing act between public acceptance of technology and the development of more powerful and purpose-driven algorithms.


Pointy McPolygon


Initiate the Geocoding Sequence

Thursday, March 21st, 2013

Geocoding is the fascinating process of associating an address or place name with geographic coordinates. Traditionally, geocoding was solely the realm of specialists, requiring a specific set of skills and equipment. However, with the advent of modern technology, including Web 2.0 applications, geocoding is now easier than ever for the everyday user. Despite the multitude of geocoding services, such as Google and MapQuest, each service uses different algorithms, databases, etc. to code its locations. Therefore, users might not be aware of which services offer the best quality results and which may offer inaccurate results. The quality of geocoding results may in turn affect subsequent decisions, modeling, analysis, etc.

Overall, one of the biggest problems facing geocoding is the accuracy of the results. In particular, one problem mentioned by the authors was the poor accuracy of addresses located in rural and agricultural areas. On the other hand, most urban locations tended to be geocoded similarly across platforms. In addition, it was interesting to note that several platforms consistently offered more accurate results: Yahoo!, MapPoint, and Google. It would be fascinating to investigate what types of geocoding algorithms, databases, etc. these services use, and whether they are similar or relatively different.

Another fascinating trend to consider is the future direction of geocoding. One possibility could be the standardization of geocoding databases, algorithms, etc. On the other hand, this may in turn lead to redundancies among geocoding services, so it might not be a realistic outcome. Overall, the future of geocoding as a tool is heavily dependent on how useful and accurate the results can be.

-Victor Manuel


Thursday, March 21st, 2013

The article of Roongpiboonsopit and Karimi highlights the fact that ubiquitous mapping and new practices facilitated by the technology allow everybody to geocode data without really knowing what is happening ‘behind’ the geocoder tools (, MapQuest, Google, MapPoint, and Yahoo!).

Coding is defined in the article as “applying a rule for converting a piece of information into another”. But who controls the rules? Users have little control over the process since they don’t interact with the geocoding algorithms and the reference databases.

The authors’ analysis shows the importance of questioning the tool used to produce data, because errors and uncertainties in the data produced have an impact on further analysis and decision making. These points relate to the neogeography literature and the critiques of GIS discussed in class, and more specifically to the debates over the ethics and practices of ‘open participation’ as either a democratization or an exploitation of user-generated surplus data. In VGI, is geographic information being produced by ‘citizen sensors’ or by ‘cognizant individuals’, as mentioned by Andrew Turner (in Wilson and Graham, 2013)? The two different terms underline the question of how aware citizens are when they produce data and geocode information. This leads to the question of accuracy and how much accuracy we need. It is probably not always important to achieve perfect accuracy. However, I think that it is crucial to be aware of the lack of accuracy and to make the uncertainties explicit rather than leaving them ‘behind’ the tools.

Furthermore, the uneven results generated by the geocoding processes depending on location can also be linked to the debates discussed earlier in class about the digital divide in GIScience. Data are more accurate in urban and suburban areas than in rural areas due to the quality of the reference databases in urban areas. Again, I think that there is a need to make these differences more explicit to users and/or producers in order to bridge the gap between expert and amateur production of knowledge.

Wilson, M. W. and M. Graham (2013). Neogeography and volunteered geographic information: a conversation with Michael Goodchild and Andrew Turner. Environment and Planning A, 45(1), 10-18.



Thursday, March 21st, 2013

Goldberg et al. provide an overview of geocoding and touch on several aspects that relate to GISci. For instance, the authors elaborate on how the evolution of the reference dataset has allowed for more complex and resourceful queries to geocode postal codes, landmarks, buildings and so on. They attribute this shift primarily to the reference dataset. Not only have datasets become more complex, but increasing technological power has also paved the way for intricate matching algorithms and interpolation processes. Perhaps the next shift within the geocoding realm will be the increasing integration of context-aware technologies. Now that people are progressively contributing geospatial content, issues of cost and time are significantly reduced. The authors suggest that once GPS technology is accurate (and affordable) enough to be built into all mobile phones, postal codes may become obsolete. But instead of rendering postal addresses obsolete, are we at the moment where LBS and AR can add to the accuracy of reference files, especially if volunteered geographic information is on the rise? I’m curious to see how the new ways in which data is being created (in large amounts, by many people) intersect with geocoding today. Now that we’ve seen an evolution in the data models and algorithmic complexity that contribute to geocoding, I can imagine that volunteered geographic information from a large number of people could become a new method of geocoding, while ultimately bringing with it a new set of complexities in data accuracy and reliability.


Putting Geography on the Map

Wednesday, March 20th, 2013

Roongpiboonsopit and Karimi’s 2010 comparison study of five free online geocoders is an example of an important process in weeding out poor geocoders and creating one that works accurately and reliably. Considering geocoding is at the heart of many GIS applications, especially those involving the Geoweb, spatial accuracy is key. The authors used empirical analysis to determine the “best” geocoders based on accuracy and a number of other metrics. They concluded that Google, MapPoint (Microsoft) and Yahoo! are all quite effective, while and MapQuest are less so.

In thinking about geocoding and the development aspects of the geocoding process, I realized that geocoding development is much like development in other topics we’ve covered, such as LBS and AR. As all of these progress and become more accurate and lifelike, they are approaching a level of artificial intelligence that is simultaneously creepy and cool. For instance, if a geocoder like Google’s uses all of the information it already has on us, there will not be any need for formal geographic indicators like street address, coordinates, or even official place names. If the reference database were to include our own vernacular and preferences along with the official names and spatial reference algorithms, then simply typing or saying “home” would pinpoint our location on a map quickly and easily. This isn’t even about spatial accuracy anymore, but rather “mental” accuracy. Perhaps I’m getting ahead of myself, but I see the possibilities with geocoding not just in terms of developing a usable application for plotting points on a map, but also in terms of expanding how we think of space (ontologies) and how we conceptualize our environment (cognition). Integrating new tools into the pre-existing algorithms has and will continue to change how we live our lives.

- JMonterey

Geocoding Errors

Tuesday, March 19th, 2013

Goldberg et al.’s article “From Text to Geographic Coordinates: The Current State of Geocoding” offers an in-depth view of recent geocoding, its process, and the errors involved. That said, I wish it had said more about the geocoding errors that result from database managers “correcting” system errors in coding. In essence, when an error is “corrected” by a database manager, future geocoding tries to reconcile the change, and this often leads to more error as the system tries to place other data into the “corrected” group or to arrange the data so that it makes sense next to the “correction”.

I have experienced this problem firsthand when trying to geocode new points within an existing geocoding database. A previous manager had “corrected” various geocoding errors by manually entering data points as a new category, which conflicted with several existing categories in the database. As a result, when I entered my points they were all coded as the new category and located in the wrong areas, because the manual fixes superseded the software’s own placement logic whenever a point did not match a known location. If I had not gone back to cross-reference the points, I would never have found the geocoding discrepancies and “corrected” the points (although this may itself cause future errors that are not yet apparent).

In the article, the mention of E911 helping with address accuracy points to an important step in reducing error, but I believe it is irrelevant, since GPS-equipped devices are becoming standard on most utility entry points at every address. For example, Hydro-Quebec is installing digital meters with GPS, pinging capability and wireless communications. These devices are then geocoded to a database, and could therefore provide accurate, accessible location referencing that is self-correcting, thus reducing error for every address. As such, is the proliferation of intelligent technology reducing the error or adding to the complexity of geocoding?

Geocoding and public health

Monday, March 18th, 2013

I think geocoding is one of the most central issues in GIS (science and systems), and yet it is probably one of the least well understood issues for lay people and non-experts who use spatial data (along with map projections). As the authors mentioned, there has been research in public health and epidemiology on the accuracy of street addresses. In fact, a PhD student in my lab led a research project on this very issue here in Montreal (Zinszer et al. 2010, cited below). The team found that address errors were present in about 10% of public health records, the same ones that were used to perform spatial analysis to look for space-time clustering of campylobacteriosis in Montreal. Geocoding has all kinds of repercussions in public health research; while errors are an issue, anyone who performs epidemiological research with administrative databases is prepared to have some amount of error. However, when the error becomes differential with respect to some factor of interest, this can result in a huge problem (bias). For example, as mentioned in the discussion, the accuracy of geocoding was differential between urban, suburban, and rural areas. There is a lot of spatial epi research done with the urban-rural health divide in mind, and differential accuracy in geocoded addresses like this could pose a huge problem. I think papers like this one by Roongpiboonsopit and Karimi are very useful for people outside of GIS because they help us understand the scope of the issue. I also think Roongpiboonsopit is a super awesome name.

Zinszer K, Jauvin C, Verma A, Bedard L, Allard R, Schwartzman K, de Montigny L, Charland K, Buckeridge DL (2010). Residential address errors in public health surveillance data: A description and analysis of the impact on geocoding. Spatial and Spatio-temporal Epidemiology. 1(2-3): 163-168.
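
One simple way to check whether geocoding error is differential, in the sense described in the post above, is to compute the error rate separately for each stratum (for example urban, suburban, rural) and compare. The sketch below assumes each record has already been labeled with an area type and a flag saying whether its geocode was judged erroneous. The function name `error_rate_by_group` and the record layout are hypothetical, and this is not the method used by Zinszer et al. or by Roongpiboonsopit and Karimi.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def error_rate_by_group(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return the share of erroneous geocodes within each group.

    Each record is (group_label, is_error), e.g. ("rural", True).
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, is_error in records:
        totals[group] += 1
        if is_error:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}
```

If the rates are roughly equal across groups, the error mostly adds noise; if they differ substantially, comparisons between those groups (such as urban versus rural) may be biased and deserve a closer look.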