Archive for October, 2019

Reflections on Government Data and the Invisible Hands

Sunday, October 6th, 2019

The core proposal of Robinson et al’s work is to promote operational change in how government shares its public data. They point out that the reason U.S. government agencies tend to have out-of-date websites and unusable data is regulation, combined with agencies spending too much effort improving their own websites. Thus, they propose handing the interaction side of public data over to third-party innovators, who have far superior technology and experience in creating better user interfaces, innovative reusable data, and channels for collecting users’ feedback.

Under the current trend of U.S. regulation and law on sharing public data, it may well be true that the distribution of public data is better operated by third-party innovators, both for wider distribution and for surplus value creation. I would argue, however, that their work is missing some perspectives on the U.S.’s current public data.

The first is standardization: it is more urgent for a public data standard to come out of the government, to ensure data quality and usability, than it is to improve distribution. The top complaint about public data is that even data from the same realm (e.g., economic data) can end up looking very different depending on which agency published it. This creates more severe issues for the usability and accountability of the data than distribution does. So, in order for government agencies to become good public data “publishers” in Robinson et al’s proposal, all agencies must first agree on a universally understandable and usable data standard, rather than each agency using its own standard or leaving this most basic part of data handling to the private sector.

The second issue with their proposal is the credibility of the data. If all public data reaches the public through third-party innovators, then to increase their own competitiveness they will modify the original data to match what the public wants, instead of delivering the original, unmodified data. This creates a credibility issue, since there is far less legislation and regulation governing what third-party distributors can and cannot do to originally published government data. And such modification is inevitable for third-party distributors, since at minimum they need to reshape the original public data to fit their own databases.

In the end, I do think commercializing public data distribution can promote effective use and reuse of public data. At the same time, it will create the problems found in any business: privacy issues, a “rat race,” and deliberate steering of exposure toward products serving particular interests. It will have its pros and cons; but until government agencies solve their data standardization issues, and regulations are built to supervise third-party distribution of public data, whether Robinson et al’s proposal will bring more pros than cons remains questionable.

Reflecting on The Cost(s) of Geospatial Open Data (Johnson et al, 2017)

Saturday, October 5th, 2019

This paper examines the rise of geospatial open data, particularly at the federal level. It looks at very concrete, monetary costs, such as resource and staff time costs; it also looks at the less concrete, and maybe less obvious, indirect costs of open data, such as when expectations are not met and the potential for greater corporate influence in government.

 

In an economics class that I am currently taking, we discussed the seven methodological sins of economic research, and I believe some of these points can transcend disciplines. For instance, one of the sins is reliance on a single metric, such as a price or index. It is important to note that when the authors of this paper discuss costs, they do not include only monetary costs in their analysis. I believe the addition of indirect costs is an important component of their argument, and that these indirect costs present even more pressing issues than the direct costs do. It is important to acknowledge the far-reaching and even harder-to-solve problems they raise: the effects and influences of citizen engagement, uneven access to information across regions, the influence of the private sector on government open data services, and the risks of public–private collusion through software and service licensing.

 

A critique I have of the paper is that the title is a bit misleading in its simplicity. The title implies that the paper addresses geospatial open data costs across contexts, whereas the paper addresses costs only at the government level, and not at any other level (for instance, it might have looked at OSM or Zooniverse, if crowdsourcing/VGI falls under the same category as open data). The abstract, however, makes it very clear that the paper addresses only issues arising from government-provided open data.

Thoughts on “The Cost(s) of Geospatial Open Data”

Friday, October 4th, 2019

This article framed the direct and indirect costs of geospatial open data provision, with its main focus on four types of indirect costs. I found the article very thought-provoking, because we often think of the benefits provided by open data while neglecting the pitfalls it brings.

One point that particularly interested me was the data literacy issue. The article points out that a number of barriers exist for users, so that even though the data is open, there is no guarantee of its use. Similarly, Janssen et al.’s (2012) article argues that these barriers pose the risk that open data is publicized data in name only, but remains private in practice. There are two points I want to make here.

First, while I understand the advocacy for better data quality and standardized data formats, what I want to hear more about is why it matters for both researchers and the public to be able to use the data. One could argue that not many people would actually care, and that researchers are the group those data are meant for. Is public engagement in using and interpreting open data intrinsically good, or does it provide greater returns for the public? I think this could be better clarified here.

Second, I am curious whether VGI or crowdsourced data belongs in the category of open data. Do the costs discussed in the article still apply to VGI and crowdsourced data? Clearly some direct costs, such as the cost of data collection, could be avoided, but it seems to me that other issues, such as privacy and data quality, could be intensified. I think this is a question worth discussing.