Archive for November, 2019

Thoughts on Parker et al., (2007) “Class Places and Place Classes: Geodemographics and the Spatialization of Class”

Sunday, November 10th, 2019

The article “Class Places and Place Classes: Geodemographics and the Spatialization of Class” by Parker et al. (2007) introduces the concept of geodemographics and reports on a small research study that focused on geodemographic classification, the relationship between ‘class places’ and ‘place classes’, and their co-construction.

I found this article to be quite interesting, particularly the parts that touched on merging this type of data with web technologies. This would allow a larger portion of the population to interact with the data with greater ease.

After reading this article I was left with a few questions, one of them being what the effects of this type of data are. Taking the case of this article and its research study, generalizing about residents of urban neighborhoods in order to create a classification seems problematic. As the information gets used, people’s perceptions of places come to be based on census data. I feel this highlights socioeconomic differences in urban settings and further divides populations based on those differences.

Thoughts on complexity

Sunday, November 10th, 2019

Steven’s article gives an overview of complexity theory. The author argues that there is no single complexity theory because there are different kinds of complexity with different or even conflicting assumptions and conclusions. The author discusses three types of complexity: algorithmic complexity, deterministic complexity, and aggregate complexity.

I am not sure I fully understood the concept of complexity, even though the title of this article is “simplifying complexity”. Tons of questions remain for me after reading this article. Before talking about my questions, there are certain points that interest me. First, the author states that complexity theory and general systems theory both embrace anti-reductionism and the interconnectedness of the system, whereas one of the differences is that complexity research uses techniques such as artificial intelligence to examine quantitative characteristics, while general systems theory only concerns qualities. I’ve never thought about it this way before, as I believe that AI is a quantitative method that can make inferences about qualitative attributes. In this sense, the qualitative and quantitative parts do not differentiate the two, because general systems theory also has the ability to make qualitative inferences. Second, the author mentions deterministic complexity, which means that a few key variables related through a set of known equations can describe the behavior of a complex system. I wonder whether deterministic complexity is also a kind of reductionism, because it tries to describe a complex system with equations and variables, which goes against the anti-reductionist notion of complexity. Third, the author mentions that a complex system is not beholden to the environment – it actively shapes, reacts and anticipates. This reminds me of a machine learning algorithm that actively adapts to the data it has seen. It seems that this is a way of approaching complexity.

Main questions I have are:
1. If there are different kinds of complexity that sometimes conflict with each other, what actually is complexity?
2. Is every generalization we make a form of reductionism in some way? If so, isn’t all research, even complexity research, anti-complexity?
3. What can complexity theory offer us? Does it complicate the analysis, or does it offer us a more sophisticated way of approaching a problem?

Poking holes in Parker et al.’s “Class Places and Place Classes”

Sunday, November 10th, 2019

Based on Parker et al.’s 2007 paper “CLASS PLACES AND PLACE CLASSES Geodemographics and the spatialization of class”, geodemographics is clearly a well-constructed field that does an excellent job of addressing the nature of spatial clustering of different demographics. However, the paper itself has a few flaws that leave me confused as to why it was written. Two points especially gave me pause as I read this paper.

Within the second section, the authors write “Now, this is all very interesting, but what does it have to do with the analytic concerns of this journal and, in particular, the current focus in this issue on the social science of urban informatics? Our argument is that this ‘spatial turn’ in the sociology of class – the clustering of people with a similar habitus into what we might think of as ‘class places’ – is connected in a number of important ways with the ongoing informatization of place (Burrows & Ellison 2004), particularly as manifest in the urban informatics technology of geodemographics (Harris et al. 2005; Burrows & Gane 2006).” The tone of the writing is the first hint of an issue with this section. By striking a conversational tone in a paper that purports to prove the analytical worth of geodemographics, I believe the authors are taking away from their final argument. They also imply here that the spatial clustering of class places is a new phenomenon, something that is not addressed anywhere else in the paper.

Parker et al.’s paper presents itself as a research paper, but I believe it might be better classified as an overview/write-up of the field. Their research methods were also questionable, as the highly qualitative nature of their work and their extremely small sample size meant that their research was not particularly robust. While there is nothing wrong with a paper that takes this approach per se, the authors have stated that their overall goal is to show the quantitative value of geodemographic techniques, something that they do not accomplish here.

Thoughts on “Turcotte – Modeling geocomplexity: ‘A new kind of science’”

Saturday, November 9th, 2019

This article by Turcotte emphasized the importance of fractals in the understanding of geological processes as opposed to statistical equations, which cannot always explain geological patterns.

Although this reading provided insight into how various situations are modeled and how statistical modelling plays an important role in understanding the geophysics of our planet, geocomplexity as a whole still remains a rather abstract concept to me. The article provided some illustrations that greatly helped my comprehension, but more would be necessary to better comprehend some concepts. Illustrating complexity may be complex in itself, but

Will we find new statistical formulas to model problems we couldn’t model in the past? How we understand and conceptualize Earth plays a vital role in how GIScientists are able to push for further knowledge. Recent technological advances in quantum computing, artificial intelligence and increasing supercomputing capabilities open the door for further innovation in the field. For example, geological instability could be better understood. In those scenarios, could weather or earthquakes become more predictable? Further advances in related fields such as geophysics and geology will also greatly contribute to GIScience.

The concept of chaos theory is also very intriguing to me; it is a theory I’d never heard of before. A quote from Lorenz greatly helped me understand the concept: “When the present determines the future, but the approximate present does not approximately determine the future”, meaning that small changes in the initial state can have a large effect on the final state of a particular event.
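To make the Lorenz quote concrete for myself, here is a minimal sketch of sensitivity to initial conditions. It uses the logistic map, a standard textbook chaotic system, and not anything taken from Turcotte's article; the function name, the starting value 0.2 and the perturbation of 1e-10 are all illustrative assumptions.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), returning all states."""
    states = [x0]
    for _ in range(steps):
        states.append(r * states[-1] * (1.0 - states[-1]))
    return states


# Two "presents" that differ by one part in ten billion (assumed values).
exact = logistic_trajectory(0.2, 100)
approx = logistic_trajectory(0.2 + 1e-10, 100)

# Gap between the two futures at every step.
gaps = [abs(a - b) for a, b in zip(exact, approx)]

print("gap after 5 steps:", gaps[5])
print("largest gap within 100 steps:", max(gaps))
```

The rule is fully deterministic, yet the two runs stay close only for the first few steps. After that the tiny initial difference grows until the trajectories are unrelated, which is how I read “the approximate present does not approximately determine the future”.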

Reflection on “The Impact of Social Factors….On Carbon Dioxide Emissions” (Baiocchi et al., 2010)

Saturday, November 9th, 2019

In their piece, Baiocchi et al. analyze geodemographic data to better understand the direct and indirect CO2 emissions associated with different lifestyles in the UK. They open the piece by listing criticisms in the field of environmental input-output models, namely that there is too much literature dependent on top-down classification, too much emphasis on consumer responsibility, too much literature with entirely descriptive analyses, and that the term ‘lifestyle’ is defined by expenditures, which ignores human activity patterns. Using geodemographic data as a basis for their study mitigates the potential harm from these criticisms.

One thing I noticed about this paper was how it used geodemographic data as a way to create a bottom-up procedure for their research. Historically, the fields of geography and cartography have been very top-down in nature, with little, if any input from “non-experts”. One of the ways GIS has been so revolutionary and popular is that it is redefining how and what people can contribute, and today there is ample opportunity for “non-experts” to be involved. As geodemographic data was around long before GIS existed, I did not initially realize how it could contribute to more bottom-up approaches. Now, I know, among other reasons, that there is open data almost everywhere, making it much easier to access and understand, and that GIS technology in general is much easier to access and understand than ever before.

I’ll end my reflection with a few general questions about geodemographics. Specifically, what is the difference between demographics and geodemographics? Doesn’t all demographic data have some sort of location/geographical component? 


The Impact of Social Factors and Consumer Behavior on CO2 Emissions in the UK

Saturday, November 9th, 2019

This is an interesting case study that uses geodemographic data to analyze the impact of socioeconomic factors on carbon dioxide emissions. Regardless of how carbon dioxide emissions are affected by different socioeconomic determinants, I am curious about the original geodemographic data used for the analysis. The study uses geodemographic data from the ACORN database and conducts research based on lifestyle classification, and my question is what exactly ‘lifestyle’ means and what rules the ACORN classification is based on. Also, the socioeconomic variables used in the regression analysis come from wider categories such as housing, families, education, work and finance. Are these the typical variables that geodemographic theory usually deals with, and are they what demographic researchers focus on? Moreover, the geodemographic data are coded at the postal code level, which can be understood as the scale the data are built on. Is there any possibility that the regression results on what impacts CO2 emissions would change if the scale changed? Do policy districts or postal code allocation rules act as noise in the analysis?
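The scale question can at least be explored with a toy example. The sketch below is entirely synthetic and is not based on ACORN or on the data in the paper: it assumes two household-level variables (think of them as income and emissions) that share a modest neighbourhood effect, then compares their correlation at the household level with the correlation of area averages. The sizes, noise levels and the `pearson` helper are my own assumptions.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)
N_AREAS, HOUSEHOLDS_PER_AREA = 200, 50  # hypothetical sizes

household_x, household_y = [], []  # two variables measured per household
area_x, area_y = [], []            # the same variables averaged per area

for _ in range(N_AREAS):
    shared = rng.gauss(0, 1)  # neighbourhood effect shared by both variables
    xs = [shared + rng.gauss(0, 2) for _ in range(HOUSEHOLDS_PER_AREA)]
    ys = [shared + rng.gauss(0, 2) for _ in range(HOUSEHOLDS_PER_AREA)]
    household_x += xs
    household_y += ys
    area_x.append(sum(xs) / HOUSEHOLDS_PER_AREA)
    area_y.append(sum(ys) / HOUSEHOLDS_PER_AREA)

r_household = pearson(household_x, household_y)
r_area = pearson(area_x, area_y)
print("household-level correlation:", round(r_household, 2))
print("area-level correlation:", round(r_area, 2))
```

In this made-up setting the relationship looks weak between households and much stronger between area averages, because averaging removes household-level noise. That suggests the answer could well be yes, although whether the effect is large in the actual CO2 study is something only the real data can show.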

Another thing I want to point out is that we learned about human mobility in last week’s seminar. Could movement theory be applied to studying how geodemographics change over time, or does it not matter in developing geodemographic theory?

Thoughts on “Parker et al. – Class Places and Place Classes: Geodemographics and the spatialization of class”

Friday, November 8th, 2019

As with a wide variety of other research fields within the confines of GIScience, it will be interesting to see how geodemographics may change with technological advances in machine-learning. An example could be with the delineation of boundaries between clusters, which could be fractured or combined based on reasoning that could be quite difficult to understand for humans. These geodemographic generalizations of space could also be continuously computerized in a not so distant future, which could lead to an ever changing assessment of neighborhoods on a very short temporal scale. Micro-level analysis could also allow for a better representation of a neighborhood based on recent population inflow or outflow data, data that becomes increasingly accessible in the era of the Internet of Things (IoT).

The thresholds used to assess whether a neighborhood is more closely related to x rather than to y need to be defined quantitatively, which forces a certain cutoff and brings in a little subjectivity. An example could be demonstrated with the occurrence of a natural disaster in a hypothetical neighborhood, which could lead to a sufficient devaluation of houses to warrant changing how the neighborhood is characterized. In that case, a population possibly once seen as energetic and lively (or, as defined by Parker et al., as a live/tame zone) could be completely changed to a dead/wild zone from one day to the next. Although these would be reassessed at some point in time by corporations or the government, technological advancements grant the ability to reassess neighborhoods much more rapidly.

As someone not well versed in the conceptualization of geodemographics, I find it apparent that a balance needs to be struck between the number of classes needed and the level of representativity desired; after all, every household could be considered unique enough to warrant its own neighborhood. Future advances in the field might incorporate a three-dimensional analysis of neighborhoods in densely populated urban centers, as residential skyscrapers present vertical spatial clustering.

Simplify complexity: A review of complexity theory

Friday, November 8th, 2019

It is a really good paper that reviews the principal research fields in complexity theory with a clear structure and simplified explanations, uncovering the nature of complexity. Generally, complexity theory describes objects with nonlinear relationships between changing entities, examining qualitative characteristics and how interactions change over time. The author breaks complexity theory into three major parts: algorithmic complexity, deterministic complexity and aggregate complexity. However, I am still a little confused about why it should be divided into those three parts. Would it be possible to think about and explain complexity theory well in terms of time complexity and spatial complexity?

Should most research questions take complexity theory into account, given that most objects in the natural environment and human society have the general characteristics that complexity theory deals with? It is also really interesting to read about self-organized systems that reach a balance between randomness and stasis, like peatland ecosystems. How the complexity theory that helps explore self-organization in physical geoscience could be applied in socioeconomic studies is really appealing. There is still some unclear space in the study of complexity theory. How could newly developing techniques like GeoAI and spatial data mining, which extract more hidden knowledge, help complexity research step further? These are all interesting and exciting questions to be answered in the future.

Thoughts on “Simplifying Complexity” (Manson, 2001)

Thursday, November 7th, 2019

As someone with very little knowledge on complexity theory before reading this article, I think Manson’s piece offers a solid introduction to the concept. I can see how complexity theory directly relates to geographic and GIScience problems. It all comes back to Tobler’s First Law of Geography, as geography creates complexity not replicated in other disciplines. 

“The past is not the present” and “complicated does not equal complex” are two concepts we have discussed at length in class. Regarding the first statement, complexity theory looks at entities as being in a constant state of flux, and could thus reduce the problems associated with the “past = present” assumption; for instance, a common issue here is assuming that the locations of past actions will be the same as the present ones. Regarding the second statement, this article was written in 2001, before big, complex data was around like it is today, especially concerning its variety, veracity, value, volume, and velocity. Big data is complex, but not complicated. There are methods and technologies to more easily analyze this data; however, technology and complexity theory must keep up for researchers to continue to adequately analyze it.

Other forms of movement?

Monday, November 4th, 2019

After reading Miller’s overview of the field of movement theory, I’m left wondering why certain objects are not included in this field. We live in a dynamic universe, in which almost everything is constantly in motion. While only a certain set of actors operate on a human/animal scale and have similar patterns, I wonder if the proposed field of movement theory might benefit from a broader perspective.

Other types of movement might include biological movement on a small scale, such as viruses and bacteria inside the body. On a large scale, geomorphologists examine changes in landscape and the evolution of vegetation patterns over time. Avalanches represent that same type of movement sped up, and glaciers represent it at its slowest. All these types of movement interact, and while they may operate on different spatiotemporal scales they heavily influence each other.

On the grandest of scales, this phenomenon can be abstracted even further. Planets, star systems, and galaxies all are in constant movement and interact with each other heavily. At a subatomic scale we see a similar dynamism as particles bounce off each other at unimaginable speeds. While we are separated from both these types of movement by logarithmic scales of space and time, they represent the same constant flow we surround ourselves with.

How do researchers decide what types of movement are worthy of being included within the study of movement? Where exactly are the edges of this field? All of these systems interact with each other enormously. While I see the value in having a limited definition of movement that allows for comparison between different biological models of movement, I feel that a grand theory may be difficult to create due to the enormous complexity of the system that surrounds us.

Uncertainty about uncertainty

Monday, November 4th, 2019

Fischer’s “Approaches to spatial data quality” is a fascinating paper, as it manages to become something of an academic onomatopoeia. The paper explores various definitions of spatial data quality, and attempts to tease out the distinctions between different forms of spatial data uncertainty. I would argue that by doing so, it manages to aptly demonstrate why uncertainty exists. Each definition and subsection within this chapter leaves room for interpretation, and possesses blurry edges. The mission to define why something cannot be clearly defined is a tall order, and it makes sense that it would result in such a confusing set of arbitrary categorizations. While this is of course an exercise in semantics, it is a necessary one, if only for the purposes of proving a point.

the unified theory of movement is here

Sunday, November 3rd, 2019

This is the only blog post I’ve actually wanted to write.

All things are dynamic. In our last class, Corey showed that even though we were equipped to portray a river as a dynamic feature, we did so statically. I bet we did this because of the numerous ways we are told to think of our world as static. Relationships are inherently dynamic, but we have static statuses to represent sometimes extensive periods. We take repeated single-point observations to measure some natural phenomena, then interpolate to fill in the blanks. But what are these blanks? Evidence of dynamism. All phenomena are actually dynamic, falling down some temporal gradient, not dissimilar to Miller’s space-time cube concept.

Miller brings up scale issues in movement. Traditional movement scientists like kinesiologists, physiotherapists, or occupational therapists think of movement on completely different scales than do movement ecologists. In fact they have a different semantic representation of movement as well, often related to the individual irrespective of the environment. Geographers and human mobility researchers have their own ideas about drivers and detractors of movement that run contrary to ecologists’ conceptualizations. So, how do we move toward an integrated science of movement? The best option is to start thinking about movement as fractal patterns. There’s a primatologist at Kyoto studying just that in penguins (which are not primates ….) to get an understanding of interactions of complexity, scale, movement, and penguin deep-diving behaviour. Think about this: this researcher is interested in how movement is invariant across scale and can explain behaviour as a complex phenomenon. There’s already a unified theory of movement: it’s called fractal analysis of movement.

I am optimistic about the potential of merging scale-invariant disciplines: if physicists could accept Newton’s law of universal gravitational attraction, even when it couldn’t explain solar systems with more than 2 planets, why can we not accept that movement unifies us even if it cannot predict each time-step for each species taking whatever method of transport? It’s a narrow-minded perspective to say that we can’t have a unified movement theory because some people take bicycles, while others prefer the Metro. Algorithms cooked up by Silicon Valley are already capable of differentiating this movement; doesn’t that mean these are already unified in the neural network’s internal representations of movement? Train a neural network to detect the directionality of some moving object. Assuming you did the requisite pre-processing, chances are that algorithm will tell you the direction of any moving object. That’s unified movement theory. Not convinced? Take the first neural network and perform transfer learning for another object. The transferred network will outperform a network that didn’t ‘see’ the first object’s movement/directionality. This is unified movement theory. There’s a team of researchers studying locomotion in ants who strapped sticks onto the legs of ants. They found their ants on stilts would walk past their intended destinations. Doesn’t this indicate that regardless of the interaction between ant and environment (the ecology), movement could be characterized using common conceptualizations, be they step-length, velocity, or the ant’s internal step count?

This paper came about as a discussion Miller had with various mobility/movement researchers; what’s clear is that people don’t have answers. It’s not as simple as ecologists neglecting scale or geographers neglecting behaviour: our silo-ed approach to science is undermining our ability to comprehend unifying phenomena. And I bet movement is that unifying theory. Can you think of anything that’s truly static?

Scaling Behavior of Human Mobility

Sunday, November 3rd, 2019

This conference paper discusses the spatiotemporal scaling behavior of human mobility through an experimental study using five datasets from different areas and generations. The authors do get results that are consistent with the literature: human mobility shows characteristic power law distributions, not all datasets are equal, etc. However, the basic principles of human mobility are not well explained. The analysis (case studies), carried out on large amounts of data generated by new measurement techniques, examines the impact of the spatial and temporal sampling period on aggregate metrics. The analysis and discussion of results do a great job of investigating how scaling behavior varies and how massive datasets can be interpolated, with the general conclusion that spatial and temporal resolution matter a lot when describing human mobility. That issue is not only associated with human mobility analysis; it matters in plenty of fields in GIScience.
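For readers who, like me, wanted a more concrete sense of what a “power law distribution” of mobility means, here is a minimal sketch. It is not the authors' method or data: it draws synthetic “trip lengths” from a continuous power law and recovers the exponent with the standard maximum-likelihood estimator. The exponent 2.5, the minimum value 1.0 and the function names are all assumptions chosen for illustration.

```python
import math
import random


def sample_power_law(n, alpha, x_min, rng):
    """Draw n values from a continuous power law p(x) ~ x**(-alpha), x >= x_min."""
    # Inverse-transform sampling; 1 - u lies in (0, 1], so the power is defined.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def estimate_alpha(values, x_min):
    """Maximum-likelihood estimate of the exponent for values >= x_min."""
    tail = [v for v in values if v >= x_min]
    return 1.0 + len(tail) / sum(math.log(v / x_min) for v in tail)


rng = random.Random(42)
true_alpha = 2.5  # assumed exponent, chosen only for illustration
trip_lengths = sample_power_law(20_000, true_alpha, x_min=1.0, rng=rng)
alpha_hat = estimate_alpha(trip_lengths, x_min=1.0)
print("estimated exponent:", round(alpha_hat, 2))
```

Most of the simulated trips are short, but a few are very long, which is the heavy-tailed pattern the literature describes. Resampling real trajectories at coarser spatial or temporal resolution would change which trips are observed at all, so it seems plausible that the fitted exponent could shift with scale.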

What is particularly distinctive and influential about human mobility? Is any discussion of spatial data quality or spatial data uncertainty necessary before or after analyzing movement datasets? Is there any debate about the definition of human mobility and the related metrics? I expected more on the fundamental issues of human movement analysis, which are still vague to me, instead of case studies showing basic rules of human mobility and their relationship with scaling issues.

SRS for Uncertainty — some brief thoughts

Sunday, November 3rd, 2019

Quale — a new word I will almost certainly never use.

It does however represent a concept we all have to wrangle with. Forget statistical models, literally no representation is complete. I tell you to imagine a red house – you imagine it. But was it red? Or maroon, burgundy, pink, or orangish? This is not just a matter of precision: what we are communicating depends on what we both think of as ‘red’, or ‘maroon’, or ‘burgundy’, or whatever else. We might also have ideas about what sorts of red are ‘house’ appropriate. An upper-level ontology might suggest to us a red-ness that is universal. But no houses in my neighbourhood are bright Lego red. Why not?

Some of what Schade writes reminds me of Intro Stats: error being present in every single observation. This sort of error can be thought of as explained and unexplained variance. Variance is present in all data; the unexplained variety is what may have arisen due not only to apparatus error, but also to what we describe as uncertainty in data.

Schade’s temperature example is handy: the thermometer doesn’t read 14 degrees – it reads 13.5-14.5 with 68% probability. The stories we tell aren’t about what we say, but what we mean. This sort of anti-reductionism is also at the root of complexity theory, which acknowledges that characterizing systems as static and linear components disregards the emergence that can explain why complex things are greater than the sum of their parts. Applied machine learning research also appreciates this anti-reductionism: the link to AI Schade makes, I THINK, is about how applied machine learning researchers aren’t really interested in determining the underlying relationships of phenomena – only the observed patterns of associations. Methods that neglect the former and embrace the latter perspective explicitly consider their data to be incomplete and uncertain to some degree. To be honest, this connection seems forced in the paper, but I’m happy to help force it along. 🙂
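As a quick sanity check on the thermometer example, here is a small sketch of where a figure like 68% could come from. It assumes the measurement error is normally distributed with a standard deviation of 0.5 degrees; that is my assumption for illustration and not necessarily how Schade models it.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


reading = 14.0  # displayed value, from the thermometer example
sd = 0.5        # assumed standard deviation of the measurement error

# Probability that the true value lies within one standard deviation.
p_within = normal_cdf(reading + sd, reading, sd) - normal_cdf(reading - sd, reading, sd)
print(f"P(13.5 <= true temperature <= 14.5) = {p_within:.3f}")
```

Under that assumption the interval 13.5-14.5 is simply one standard deviation either side of the reading, which covers roughly 68% of a normal distribution.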

Scaling Behavior of Human Mobility Distributions (Paul et al., 2016)

Sunday, November 3rd, 2019

This paper characterizes human mobility patterns at different spatiotemporal resolutions using high-resolution data and finds that some aggregate distributions have scaling behaviors. This paper reaffirms that scale is a central tenet of GIScience.

First, the authors mention that varying resolution impacts datasets through the underlying behaviors of the individuals and the data collection context. Indeed, movement is often driven by the characteristics of the surrounding environment and the nature of the space that the object is moving through. However, besides the spatial scale and granularity of movement, I would argue that this research should also take into account the temporal scale, such as the sequential structure of trajectories. Also, it is important to note that trajectory data is uncertain, and this can negatively impact the accuracy of algorithms used to obtain the movement patterns of objects. One of the sources of trajectory uncertainty is the error inherent to GPS measurements. Those datasets were collected using different GPS devices, which may make it difficult to assess and compare the internal quality of different datasets.

Because this paper focuses on the influence of scale, I am looking forward to learning about more of the methodologies applied in movement research, such as modeling and visualizing movement. Further, the integration of mobility data may lead to ethical challenges because environmental and other contextual data can reveal personal information beyond only location and time. Note that attaching the devices may affect animals’ behavior and perhaps the survival of the animal.

Thoughts on “Towards an Integrated Science of Movement”

Sunday, November 3rd, 2019

This paper covers a number of different aspects of integrating movement with GIS and spatial visualizations. Two in particular interested me. One is the challenge of big but thin data. The authors point out that while there are more Big Data than ever before on individuals’ locations to use as movement data, they are thinly attributed. This reminds me of a paper I had to read in GEOG 307, “Mobile Phone Data Highlights the Role of Mass Gatherings in the Spreading of Cholera Outbreaks.” The authors used mobile phone information to track the movements of individuals in Senegal in the wake of a cholera outbreak, to see where people leaving the affected area were going to and gathering. This is a perfect example of having access to tons of movement data that are thinly attributed: there were no names or demographic information attached to the call locations (for confidentiality reasons if nothing else), and the infection status of these individuals was also unknown. Even so, the authors were able to draw powerful conclusions about where people were moving and therefore where the epidemic could potentially spread. This raises the question: are thinly attributed data an issue when so many of them are available? I would say it depends on the question to be answered. In the cholera study, while it would have been nice to have more data for the phone calls, this was not necessary to conduct effective and meaningful analyses. However, this may not be the case in all such movement studies. There will likely be studies where discriminating between data points, for example, would be necessary, and in cases like these more attribute data than are currently available would be required, even with the massive volume of data accessible.

Thoughts on “Approaches to Uncertainty in Spatial Data”

Sunday, November 3rd, 2019

This chapter deals with a number of aspects of uncertainty in spatial data. The one that caught my eye in particular is definition: how well defined or not well defined a geographical object is. Well defined objects tend to be human made (like census tracts), while poorly defined objects tend to be natural (like a patch of woodland). This raised a question for me that this chapter does not address: how should or can someone deal with objects that may be overlapping/related if one is well defined and the other isn’t? For example, would performing an intersect on two such objects be appropriate, considering the gap in the quality of definition? Does a large difference in definition make two objects incomparable? Maybe not, and that is why the paper does not address this particular issue. However, I would say there could be issues of data incompatibility between a well defined and a poorly defined object. For example, if there is a well-defined census tract overlaid on a poorly defined patch of woods, how well could the intersect between the two be defined? This perhaps feeds into other issues of uncertainty mentioned in the chapter, like vagueness and error. But fundamentally, I would say that such a notable difference in definition would make these objects incompatible. Perhaps one data type could be converted; for example, the wood patch could be converted and given “hard” borders under the assumption that these are clearly defined, even if they aren’t. Even so, this overlooks the central properties of the object and may not bridge the gap between the level of definition in each object.

Spatial Data Uncertainty (Devillers & Jeansoulin, 2006)

Sunday, November 3rd, 2019

This chapter provides basic concepts on quality, definitions, sources of the problem with quality, and the distinction between “internal quality” and “external quality”.

The question about crowdsourced geospatial data quality is the first one to come up. When it comes to crowdsourced geographic data, it is very common to hear suggestions that the data is not good enough and that contributors cannot collect data of good quality because, unlike trained researchers, they don’t have enough experience and expertise in geospatial data. Therefore, we should pay particular attention to the issues stemming from the quality of crowdsourced geospatial data. Also, note that any crowdsourced data is biased in one or more ways. Contributors can differ in the aspects and quality of their judgment and decision making. Their decisions and preferences could significantly influence their data. I am curious about how to identify and estimate biases in crowdsourced data.

Furthermore, the authors mention that users can evaluate external quality based on internal quality. However, nowadays, geographical resources (both data and applications) are mostly accessible via web services, and data producers do not always report the internal quality of their data. In this situation, how can users evaluate the external quality of resources? Lastly, internal and external quality measures are applied to data that is factual in nature, so how can we assess the quality of information about opinions or vague concepts? (QZ)

Spatial data quality: Concepts

Sunday, November 3rd, 2019

This chapter in the book “Fundamentals of Spatial Data Quality” introduces basic concepts in spatial data quality by pointing out that spatial data quality issues often deal with the divergences between reality and its representation. Errors can arise at several stages of the data production process, such as data manipulation and data creation involving human input. Moreover, spatial data quality is summarized as being assessed from internal and external aspects. The chapter explains well what data quality is and what errors can occur, and it is very easy to understand.

It is interesting that the introduction starts with a quote, “All models are wrong, but some are useful”. However, does this mean that all spatial data, or all created data, could be interpreted as the product of a model or filter? The authors argue that a representation of reality may not be fully detailed and accurate but can still be partially useful. But how to determine whether data with such uncertainty or errors should be accepted is a much more urgent problem. Also, since the topic is “spatial data uncertainty” and the chapter discusses spatial data quality issues, does “uncertainty” simply mean the different sources of error assessed in spatial data quality?

The chapter defines internal quality as the level of similarity between the data produced and perfect data, while external quality is the level of concordance between the data product and user needs. My thought is: if users participate in the data production process (which concerns internal quality), will external quality be efficiently and effectively improved? Can we just replace “as requested by the manager” with “what the user wanted” in Figure 2.4, so that there are no external quality worries?

Thoughts on “Scaling Behavior of Human Mobility Distributions”

Sunday, November 3rd, 2019

This paper presents an empirical study of how temporal and spatial scale affect the distribution of mobility data. The main finding is not surprising: a different spatial and temporal scale of analysis leads to a different distribution of data. Once again we see the importance of scale in the analysis of spatial datasets.

What interests me are findings 3 and 5. Finding 3 states that the ordering between metrics over datasets is generally preserved under resampling, which implies that comparisons across datasets can be made regardless of the spatial and temporal resolution. This reminds me of the reading on spatial data quality. Although it is important to be critical about the effects of scale, it is also important to keep the intended “use” in mind. In the case of comparing human mobility across different datasets, the scale does not seem to matter anymore.

Finding 5 concludes that the sensitivity to resampling can itself be a metric. I think this is a good point, but I had some difficulty grasping what the authors want to express in the subsequent argument that “difference in sensitivity indicates that information about population mobility is encoded in the scaling behavior”. I think they could have explained this better. To my understanding, the difference in sensitivity to resampling is nothing more than the difference in the heterogeneity of the datasets.
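To make this reading concrete, here is a toy sketch. It is not the paper's method: the path-length metric, the "keep every k-th fix" resampling, the `sensitivity` definition, and the two example traces are all assumptions chosen only to show how a regular and an irregular trace can respond differently to the same coarsening.

```python
import math

def path_length(points):
    """Total length of a polyline given as (x, y) tuples."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def resample(points, step):
    """Keep every `step`-th fix, always retaining the final one."""
    kept = points[::step]
    if kept[-1] != points[-1]:
        kept.append(points[-1])
    return kept

def sensitivity(points, step):
    """Fraction of path length lost when sampling `step` times coarser."""
    full = path_length(points)
    return 1 - path_length(resample(points, step)) / full

# Two hypothetical traces with 9 fixes each.
commuter = [(i, 0) for i in range(9)]      # straight line
wanderer = [(i, i % 2) for i in range(9)]  # zigzag

print(sensitivity(commuter, 4))
print(sensitivity(wanderer, 4))
```

The straight trace loses nothing under coarser sampling, while the zigzag trace loses a noticeable share of its length. In this toy setting, sensitivity to resampling tracks how irregular the movement is, which is the sense in which it could be read as a measure of heterogeneity.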

Another point I want to make is that although the analysis is performed on mobility datasets, it seems to me that most of the conclusions can be generalized to all kinds of datasets. I’m not sure what is special about mobility data in their analysis.