The power of visualizing big data time series derived from remote sensing products processed on Hadoop cannot be overstated. Visualization can give scientists, policy makers, journalists, and the public immediate insight into how the environment is changing over time, leading to quicker understanding and action. Yet putting big data time series on maps effectively remains difficult. With the advent of Hadoop, CartoDB, and HTML5 APIs, our ability to build fast, interactive maps has greatly improved, but the scale and complexity of near-real-time data continue to pose new problems that keep data visualization interesting and challenging. Here, we present our work to develop fast mapping solutions for 500-meter-resolution deforestation data produced every 16 days by the FORMA algorithm for the Global Forest Watch 2.0 web portal.
Deforestation data contain rich temporal information that is often lost when the data are visualized using static map tiles. To surface this temporal information in the Global Forest Watch portal, we developed new methods for big data storage, query, transfer, and map-based visualization. The Forest Monitoring for Action (FORMA) project provides free and open forest-clearing alert data derived from MODIS satellite imagery every 16 days, beginning in December 2005. FORMA is written in the Clojure programming language and builds on the Cascading and Cascalog APIs to process big spatial data on top of Hadoop using MapReduce. We will focus on the high-level FORMA algorithm and data workflow, with particular emphasis on the visualization mechanisms for these data.
At a high level, deforestation events are converted from raster products to JSON data objects. Each JSON data object compactly stores an index of the dates and pixel locations of deforestation on a quadtree map tile. On the client, these three-dimensional JSON objects (two spatial dimensions plus time) are unpacked and used to render HTML5 canvas elements that are displayed on the map. User-interface controls then let users interact with the history of deforestation on the map.
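As an illustration of this packing and unpacking step, the following minimal sketch groups deforestation events by map tile and records, for each pixel, the 16-day periods in which clearing was detected. It is not the schema used by Global Forest Watch or Torque. The Web Mercator tiling, the 256-pixel tile size, the field names `x`, `y`, and `t`, and the function names are all assumptions made for the example, and input coordinates are assumed to lie within the valid Web Mercator range.

```python
import json
import math
from collections import defaultdict

TILE_SIZE = 256  # pixels per tile edge; a common web-map convention, assumed here


def lonlat_to_tile_pixel(lon, lat, zoom):
    """Map a WGS84 coordinate to (tile_x, tile_y, pixel_x, pixel_y) in Web Mercator."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = (lon + 180.0) / 360.0 * n
    y = (1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n
    tx, ty = int(x), int(y)
    # The fractional part of the tile coordinate gives the pixel inside the tile.
    return tx, ty, int((x - tx) * TILE_SIZE), int((y - ty) * TILE_SIZE)


def pack_events(events, zoom):
    """Group (lon, lat, period) events into one JSON-ready object per tile.

    Each tile is a list of {"x", "y", "t"} entries, where "t" is the sorted
    list of period indices in which that pixel was flagged (hypothetical layout).
    """
    tiles = defaultdict(lambda: defaultdict(set))
    for lon, lat, period in events:
        tx, ty, px, py = lonlat_to_tile_pixel(lon, lat, zoom)
        tiles[f"{zoom}/{tx}/{ty}"][(px, py)].add(period)
    return {
        key: [
            {"x": px, "y": py, "t": sorted(periods)}
            for (px, py), periods in sorted(pixels.items())
        ]
        for key, pixels in tiles.items()
    }


def pixels_up_to(tile, period):
    """Unpack a tile: pixels with at least one event at or before `period`."""
    return [(p["x"], p["y"]) for p in tile if p["t"] and p["t"][0] <= period]


# Made-up sample events: (longitude, latitude, 16-day period index).
events = [(0.0, 0.0, 3), (0.0, 0.0, 5), (10.0, -5.0, 4)]
packed = pack_events(events, zoom=1)
payload = json.dumps(packed)  # what a server might send for these tiles
```

A client would decode `payload`, call something like `pixels_up_to` for the period selected by a time slider, and draw the returned pixels on a canvas. Storing the periods per pixel, rather than one raster per period, means a single object per tile can drive the whole animation.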
The methods developed for the Global Forest Watch website have been further generalized in an open-source library called Torque (http://github.com/CartoDB/Torque). The generalized SQL statement that compresses temporal-geospatial data into tile-based JSON objects, together with the HTML5 canvas rendering functions, will be extended in the future to visualize the motion of multiple agents and ordered, non-temporal data. In this presentation we will describe in depth the analysis of deforestation data, the efficiency of the temporal JSON data schema, and finally the challenges and rewards of visualizing temporal data on the web.