Just read an interesting editorial by Kenneth Field (aka The Marauding Carto-nerd). He examines his experiences from recently attending various conferences in Germany, the UK and the US. As subjective as these experiences may be, I think he makes some astute observations and valuable conclusions on the matter of Cartographic Tribalism, with a focus on the neo- vs. traditional and proprietary vs. open source cartographers. A bit lengthy, but definitely worth reading: Cartographic tribalism.
Instead of letting me work on my backlog of half-finished drafts, Big Data issues keep popping up. A while ago, I posted a longer piece on Big Citizen Data, and remarked that a lot of seemingly 20th century issues of data quality and sampling bias are being steadfastly ignored nowadays. Jonas Lerman has published an excellent essay in the Stanford Law Review on the matter of exclusion through digital invisibility. To cite the abstract:
“Legal debates over the “big data” revolution currently focus on the risks of inclusion: the privacy and civil liberties consequences of being swept up in big data’s net. This Essay takes a different approach, focusing on the risks of exclusion: the threats big data poses to those whom it overlooks. Billions of people worldwide remain on big data’s periphery. Their information is not regularly collected or analyzed, because they do not routinely engage in activities that big data is designed to capture. Consequently, their preferences and needs risk being routinely ignored when governments and private industry use big data and advanced analytics to shape public policy and the marketplace. Because big data poses a unique threat to equality, not just privacy, this Essay argues that a new “data antisubordination” doctrine may be needed.” (source: Stanford Law Review, 03.09.2013).
The article is well worth reading, even if the second part is unfamiliar territory for those not well-versed in US law (e.g. me).
It made me rethink (though not change) my attitude towards some of the popular means of getting citizen (customer) information: If no precautions and countermeasures are taken, the socially and financially disadvantaged may actually want to share as much of their data on shopping, leisure activities and other preferences as possible, in order to avoid being completely marginalized…
New article focusing on the system design and architecture behind our approach to filtering volunteered social media information:
Spinsanti, L., & Ostermann, F. (2013). Automated Geographic Context Analysis for Volunteered Information. Applied Geography 43 (September): 36–44. doi:10.1016/j.apgeog.2013.05.005.
If you can’t access the article, don’t hesitate to drop me a line for a pre-print.
#hochwasser 2013 in Germany
I’d like to summarize my perception of the use of social media during the European floods of 2013, with a special emphasis on Germany (NB most of the links are for German sources; for an excellent blog post focused on Dresden, go here). Since I was travelling during the event, I could only gather my information recently, i.e. after the actual event. Therefore, the information is certainly incomplete, and I’d be happy to receive additional information and corrections from the gentle readers…
For those outside of Germany, here’s a brief overview of what happened:
- The floods were mainly caused by a cold and wet spring resulting in saturated soils, coupled with an abnormal meteorological situation and heavy rains for several days at the end of May and the beginning of June.
- The floods affected most countries of central Europe; however, I will focus on Germany here.
- In Germany, several Länder were affected, with the worst damage occurring in the South and East.
- The two weeks saw a massive mobilisation of around 75,000 firefighters, plus 19,000 soldiers.
- Several cities reported record high water lines, several dams burst, and large areas were flooded. There were 14 deaths.
- The situation is now mostly under control, with only some areas still flooded. Compare the official information here.
Examples of social media use (Facebook pages, maps) include:
- A Google map for the city of Magdeburg curated by four collaborators, with over one million hits and a corresponding Facebook page.
- Another Google map for the city of Dresden, curated by eight collaborators and with almost four million hits.
- A third Google map for Halle, a bit smaller in scope with two contributors and half a million hits.
- Additionally, there are many pages on Facebook, usually focusing on a geographic area or place.
- On Twitter, the most used hashtag seems to be #hochwasser, but many others were also used. On a dedicated channel, requests and offers for help for Dresden could be posted (see also a corresponding website).
As I mentioned, I wasn’t able to collect any data – if someone has data and would like to attempt an analysis, I’d be happy to help out.
For Germany, the use of social media during a disaster was a new experience – fortunately, large-scale disasters do not occur that often, and the last one (the floods of 2002) happened before the advent of social media. In consequence, the use of social media found an echo in more traditional broadcast media (e.g. Handelsblatt, Neue Osnabrücker Zeitung, and Spiegel Online).
Highlights and lowlights
In other words, what worked and what didn’t?
Positive experiences include:
- Many volunteers can be mobilised within little time.
- More information (and more channels) was available to everyone with an internet connection.
- Self-organizing help (who does what) works overall, with volunteers gathering and providing information, helping in the deployment of sandbags, and aiding the volunteers through infrastructure and consumables.
Some negative experiences were:
- No weighting or ranking available, making it difficult to estimate the importance and urgency of information and requests. Subjective criteria like proximity and local knowledge can help but may be misleading.
- A blurring between private and official channels.
- A lack of feedback and checks led to occasional proliferation of wrong information.
- Too many helpers and a lack of coordination can have a negative impact (coordination, gawkers, …).
But apparently, a lack of coordination can also affect public authorities (article on Cicero).
Algorithms to the rescue?
It’s obvious that the problems described above are not specifically German or flood-related. They are problems that haunt any undertaking of a large crowd. In my humble opinion, there are two main avenues to overcome the problems and thereby increase the utility of social media: Improved filtering and ranking, and improved platforms.
I have been an advocate for algorithmic filtering and ranking of social media messages for some time now (see my research publications and this blog). Various studies show that even in critical situations like disasters, algorithmic approaches can provide two important advantages: First, they can filter out noise and redundant messages. And second, they can organize and enrich the remaining information to facilitate human curation. Examples of algorithmic approaches include Swiftriver and GeoCONAVI, with ongoing research for example at the QCRI. The Ushahidi platform and the Stand-By-Task-Force are examples of successful human (crowd sourced) filtering and curation.
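To make the first advantage a bit more tangible, here is a minimal Python sketch of what filtering out noise and redundant messages could look like. It is only an illustration and not how Swiftriver or GeoCONAVI work: the keyword list `FLOOD_TERMS` and the function names are made up for this example, and a real system would need far more than keyword matching and exact-duplicate detection.

```python
import re

# Hypothetical keyword list (assumption); a real system would need a curated vocabulary.
FLOOD_TERMS = {"hochwasser", "flut", "deich", "sandsack", "pegel"}


def normalize(text):
    """Lowercase the text, drop URLs and @mentions, and strip a leading 'RT' marker."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    tokens = re.findall(r"\w+", text)
    if tokens and tokens[0] == "rt":
        tokens = tokens[1:]
    return tokens


def filter_messages(messages):
    """Drop off-topic messages, merge repeats, and rank by how often a message was repeated."""
    counts = {}
    first_seen = {}
    for msg in messages:
        tokens = normalize(msg)
        if not FLOOD_TERMS.intersection(tokens):
            continue  # treated as noise: no flood-related term at all
        key = " ".join(tokens)
        counts[key] = counts.get(key, 0) + 1
        first_seen.setdefault(key, msg)
    # Repetition is only a crude proxy for attention, not for correctness.
    return sorted(
        ((first_seen[key], n) for key, n in counts.items()),
        key=lambda item: -item[1],
    )


if __name__ == "__main__":
    sample = [
        "Hochwasser in Dresden: Pegel steigt http://t.co/x",
        "RT @user Hochwasser in Dresden: Pegel steigt",
        "Lunch was great today",
        "Sandsack Helfer gesucht in Halle",
    ]
    for text, repeats in filter_messages(sample):
        print(repeats, text)
```

Ranking by repetition is deliberately naive here; it mainly shows where a better weighting (source, proximity, recency) could be plugged in.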
I have also been a long-time skeptic of the utility of information streams, which are one of the dominating characteristics of Web 2.0 (from the proverbial Twitter streams to Facebook’s Timeline to the increasing number of “live tickers” on news sites that replace journalistic and editorial care with unfiltered and raw data). These relentless streams of information don’t stop for important news, and marginal (but nevertheless important) events risk being overlooked. He who shouts the loudest and the longest wins (the battle for attention). In order to organize the flood of information, a more interactive interface is necessary, such as … a map! Putting the textual information from Facebook posts, Tweets and other sources on a combined map and making the information searchable by place, time and content would be a significant improvement. While I wish to express my sincere congratulations and respect to the map makers linked above, it is also obvious that for larger events and more up-to-date information, more resources are needed. Either computing power and algorithms, or volunteers and professionals. Or even better, both.
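What "searchable by place, time and content" could mean in practice is sketched below in Python. This is a toy example under several assumptions: the `Message` class and the `search` function are hypothetical names, every message is assumed to already carry coordinates and a timestamp, and place is reduced to a simple bounding box.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Message:
    text: str
    lat: float
    lon: float
    time: datetime


def search(messages, bbox=None, start=None, end=None, keyword=None):
    """Return the messages that match every criterion that was given.

    bbox is (min_lat, min_lon, max_lat, max_lon); criteria left as None are ignored.
    """
    hits = []
    for m in messages:
        if bbox is not None:
            min_lat, min_lon, max_lat, max_lon = bbox
            if not (min_lat <= m.lat <= max_lat and min_lon <= m.lon <= max_lon):
                continue
        if start is not None and m.time < start:
            continue
        if end is not None and m.time > end:
            continue
        if keyword is not None and keyword.lower() not in m.text.lower():
            continue
        hits.append(m)
    return hits


if __name__ == "__main__":
    # Made-up messages with approximate coordinates for Dresden and Magdeburg.
    msgs = [
        Message("Sandsack Helfer gesucht", 51.05, 13.74, datetime(2013, 6, 4, 10, 0)),
        Message("Pegel steigt weiter", 52.13, 11.63, datetime(2013, 6, 8, 9, 0)),
    ]
    around_dresden = (50.9, 13.5, 51.2, 14.0)
    for m in search(msgs, bbox=around_dresden, keyword="sandsack"):
        print(m.time, m.text)
```

A linear scan like this is fine for a few thousand messages; for a real event one would want a spatial index and a proper text search behind the map.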
Can we do it?
It seems that the current state of affairs in Germany resembles the situation of the Californian wildfires of 2007. I’m not trying to be condescending here – this is not surprising because there are fewer natural disasters in Germany, and the infrastructure for dealing with those is generally good (and it seems there is still room for improvement in the US, too).
However, simply tapping into the gigantic information stream is not the solution per se (as Patrick Meier argues as well), but a first step. There are many examples that show it’s possible, and our GeoCONAVI system used off-the-shelf hardware to monitor four European countries for social media on forest fires. In my opinion, the big problems are not computational, but ethical, legal and organizational. Legal implications include issues of privacy (although if only public messages are being used, this is less of a problem), and liability – what if wrong information leads to property damage, or even worse to the loss of human life? Organizational and political obstacles, at least in Germany, arise from the many agencies involved in civil protection: on the Federal level (strictly for defence issues), the Länder level (strictly for natural disasters and such, and each Land has its own agency), plus the various organizations such as (volunteer) fire departments, Technisches Hilfswerk, etc. Since disasters don’t stop at geographical or organizational borders, this could be a real problem, although it seems that during the 2013 flood the public authorities coordinated their work rather closely and well (with the exception mentioned above). The EU also has a new Emergency Response Centre based on the capabilities and knowledge from the JRC.
I’d like to recommend two excellent critical papers on user-generated geographic content and the geosocial web. The first one is by Muki Haklay and raises important issues on the democratizing effects of the Web 2.0 and neogeography, while the second one by Crampton et al. takes up the issue and suggests possible solutions to improve the study and analysis of geosocial media.
In his study, Haklay argues that neogeographic theory and practice assume an instrumentalist view of technology, i.e. that technology is value-free and that there is a clear separation between the means and the ends. Obviously, Haklay does not agree with this view and argues that there is less empowerment and democratization to be found than commonly assumed. In order to realize the full potential of neogeographic tools and practices, anyone implementing them needs to take into account economic and political aspects. There is a substantial body of work supporting Haklay, including the research by Mark Graham, which I recommended in my last post. Patrick Meier on iRevolution has an in-depth commentary on Haklay’s paper and provides a somewhat more optimistic interpretation. My own point of view runs along similar lines as Haklay’s, in that the contemporary digital divides are a continuation of old power divides that participatory GIS sought to overcome in the 90s. And while I have no ill will towards companies that add value to user-generated content, I am highly skeptical of such “involuntary crowdsourcing”, in which the crowd freely provides the raw material but in the end has to pay for access to derived products. There is some similarity to the argument for Open Government Data – why should the tax payers (and tax paying companies) pay again for the use of the data, when they already paid for the creation of it?
Crampton et al. critically investigate the hype around the “Big Data” geoweb. They remind the reader of (a) the limitations inherent in “big-data”-based analysis and (b) the shortcomings of the simple spatial ontology of the geotag. Concerning (a), the data used often has limited explanatory value or informational richness, something our research has shown as well. Further, geocoded social media are still a non-representative sample, no matter how many of them one has collected. Concerning (b), Crampton et al. point out a number of problems with the geotag, e.g. that it is difficult to ascertain whether it refers to the origin of the content or the topic of the content, that its lineage and accuracy are hard to assess, and that it oversimplifies geography by limiting place geometry to points or lat/lon pairs. As a consequence of their analysis, the authors suggest that studies of the geoweb should try to take into account:
- social media that is not explicitly geographic
- spatialities beyond the “here and now”
- methodologies that are not focused on proximity only
- non-human social media
- geographic data from non-user generated sources.
I have to admit that I am a little bit proud to say that our research has addressed three of those suggestions: We haven’t limited our sample to geo-coded social media; instead, we have re-geo-coded even those with existing coordinates to ensure that we capture the places the social media was about. We have also gone beyond the “here and now” by clustering the data spatio-temporally. Finally, a core concept of our approach is the enrichment of the social media data with explicitly geographic data from non-user generated (i.e. authoritative) sources (a paper describing the details has just been accepted but not published yet; an overview can be found here).
Crampton et al. conclude their paper with the important reminder that caution is needed regarding the surveillance potential of such research, with intelligence agencies around the world focusing more and more on open source intelligence (OSINT). Indeed it seems that even in Really Big Data, our spatial behaviour is unique enough to allow identification.
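For readers wondering what spatio-temporal clustering means at its very simplest, here is a small Python sketch. To be clear, this is not the method from our research: it just bins messages into grid cells and time windows and keeps the bins with enough members. The function names, the cell size of 0.1 degrees and the one-hour window are arbitrary assumptions for illustration.

```python
import math
from collections import defaultdict


def bin_key(lat, lon, timestamp, cell_deg=0.1, window_s=3600):
    """Map a point to a (grid row, grid column, time window) bin."""
    return (
        math.floor(lat / cell_deg),
        math.floor(lon / cell_deg),
        int(timestamp // window_s),
    )


def cluster(points, cell_deg=0.1, window_s=3600, min_size=2):
    """Group (lat, lon, unix_timestamp) tuples that share a grid cell and a time window."""
    bins = defaultdict(list)
    for lat, lon, ts in points:
        bins[bin_key(lat, lon, ts, cell_deg, window_s)].append((lat, lon, ts))
    # Only bins with at least min_size members count as a cluster.
    return [members for members in bins.values() if len(members) >= min_size]


if __name__ == "__main__":
    # Made-up points: two close in space and time, one elsewhere, one much later.
    points = [
        (51.05, 13.74, 100),
        (51.07, 13.76, 900),
        (52.13, 11.63, 200),
        (51.05, 13.74, 7300),
    ]
    for members in cluster(points):
        print(len(members), members)
```

Fixed bins have an obvious weakness: two messages just across a cell or window boundary end up in different bins, which is one reason why density-based approaches are usually preferable for real data.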
or so the saying goes. At least part of it. Anyway, it’s been very quiet here for almost three months now. The main reason is that most of my spare energy at the moment goes into searching for new work – my current project (and with it funding) will end in a couple of months, so I’m spending less of my time writing and more of it scouting. And flying a UAV, actually, because we’re following up on last year’s successful “Big Blue Balloon” experience. I’ll be posting about Ed (our “Environmental Drone”) soonish.
In the meantime, let me recommend some great posts by other bloggers.
First, there’s always something worthwhile on iRevolution – I am always awed by the frequency with which Patrick can publish high-quality blog posts. Yesterday’s post caught my eye in particular, because it shows Patrick isn’t only a keen thinker and great communicator, but could also do well as an entrepreneur – check out his ideas for a smartphone application for disaster-struck communities here.
Then, I really love reading Brian Timoney’s MapBrief blog. It’s not only enlightening, it’s also fun – as long as you’re not the target of Brian’s sharp wit. Recently, he has run a series on why map portals don’t work. Most of the reasons should be pretty obvious, but equally obvious is the failure of most portals to do things differently. Read up on it here.
On Zero Geography, Mark Graham shares with us the latest results from his research, and there have been several great posts recently on the usage of Twitter in several African cities. Visit it here to learn some surprising things about contemporary digital divides.
I hope you enjoy reading them as much as I did, but I also promise you won’t have to wait for “strange aeons” before some original material is posted here.
We have one free access and one open access paper on our work newly published – check them out, especially the first one if you want to get an overview of what work sparked this blog:
Craglia, M., Ostermann, F., & Spinsanti, L. (2012). Digital Earth from vision to practice: making sense of citizen-generated content. International Journal of Digital Earth, 5(5), 398–416. doi:10.1080/17538947.2012.712273
Schade, S., Ostermann, F., Spinsanti, L., & Kuhn, W. (2012). Semantic Observation Integration. Future Internet, 4(3), 807–829