Predictive analysis for cache generation in quadtree based geo visualizations

Predictive analysis for cache generation in quadtree based geo visualizations
Carla Iriberri
Bachelor in Telematics Engineering
October 2015

1. Introduction and goals
2. Short history of web mapping
3. The source data
4. The caching algorithm
5. Results and conclusions

1. Introduction and goals

The amount of time people spend online looking at web maps has soared by more than 50% in the past two years.
Source: ComScore

The goal
• To predict usage patterns in web maps so that data can be precached to improve the usability of such maps

The environment
• U.S. revenue in the geospatial industry is estimated at $100 billion by 2017
• Legal frameworks regulate government solutions (INSPIRE Directive in Europe)

2. Short history of web mapping

MapQuest, 1996
The first web map with routing and geocoding capabilities.

Google Maps, 2005

The concept of a tile
• Tiles are PNG images of 256x256 pixels
• Tiles are drawn according to the Web Mercator projection
• Tiles are described by a ZXY numbering scheme, where Z is the zoom level and X and Y identify the tile
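As an illustration of the ZXY scheme, the sketch below converts a longitude and latitude into a tile address using the widely documented Web Mercator tiling formula. The function name `lonlat_to_tile` and the constant `MAX_LAT` are hypothetical names chosen for this example; they are not taken from the talk.

```python
import math

# Web Mercator is only defined up to roughly +/-85.05 degrees of latitude.
MAX_LAT = 85.05112878


def lonlat_to_tile(lon, lat, z):
    """Return the (z, x, y) tile that contains the given point at zoom z."""
    n = 2 ** z  # number of tiles along each axis at this zoom level
    lat = max(-MAX_LAT, min(MAX_LAT, lat))
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern or southern edge stay in range.
    return z, min(max(x, 0), n - 1), min(max(y, 0), n - 1)
```

For example, `lonlat_to_tile(0.0, 0.0, 1)` falls in the south-east quadrant of the world at zoom level 1.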

The quadtree
Tiles are organised in a quadtree scheme: a tree structure in which each internal node has exactly four children. At zoom level z, a map is formed by 2^(2z) tiles.
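The quadtree relationship between tiles can be sketched with simple integer arithmetic on ZXY addresses. The helper names below (`children`, `parent`, `tiles_at_zoom`) are illustrative only and are not taken from the talk.

```python
def children(z, x, y):
    """Return the four tiles at zoom z + 1 that cover tile (z, x, y)."""
    return [(z + 1, 2 * x + dx, 2 * y + dy) for dy in (0, 1) for dx in (0, 1)]


def parent(z, x, y):
    """Return the tile at zoom z - 1 that contains tile (z, x, y)."""
    if z == 0:
        raise ValueError("the root tile has no parent")
    return (z - 1, x // 2, y // 2)


def tiles_at_zoom(z):
    """Number of tiles needed to cover the whole map at zoom z: 2^(2z)."""
    return 2 ** (2 * z)
```

Each level multiplies the tile count by four, so zoom level 3 already needs 64 tiles and zoom level 10 needs more than a million.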

The layers of a map
• Basemap layer
• Data layer(s)
• Interactivity layer

3. The source data

• CDN logs for user requests
• PostgreSQL and Redis dumps for maps and layers metadata
• 150GB plain text logs
• 10GB Postgres database
• 800MB Redis data

The data curation process
Log data:
• Collected from compressed files by using regular expressions in Python and transformed to JSON
Map metadata:
• Sources transformed to JSON files

    from collections import namedtuple

    Event = namedtuple("Event", [
        'x', 'y', 'z', 'time', 'user', 'ip_address',
        'named_map_template', 'layergroup',
        'layergroup_timestamp', 'type'
    ])
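The curation step could look roughly like the sketch below. The actual CDN log layout is not shown in the talk, so the line format, the regular expression `TILE_RE`, the reduced `TileEvent` tuple and the helper names are all assumptions made for illustration.

```python
import gzip
import json
import re
from collections import namedtuple

# Reduced, hypothetical version of the event record (a subset of the fields).
TileEvent = namedtuple("TileEvent", ["x", "y", "z", "time", "ip_address", "layergroup"])

# Assumed log layout: <time> <ip> "GET .../layergroup/<id>/<z>/<x>/<y>.png ..."
TILE_RE = re.compile(
    r'^(?P<time>\S+) (?P<ip_address>\S+) '
    r'"GET \S*/layergroup/(?P<layergroup>[^/]+)/(?P<z>\d+)/(?P<x>\d+)/(?P<y>\d+)\.png'
)


def parse_line(line):
    """Return a TileEvent for a tile request line, or None for anything else."""
    match = TILE_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("x", "y", "z"):
        fields[key] = int(fields[key])
    return TileEvent(**fields)


def to_json(event):
    """Serialise one event as a JSON object."""
    return json.dumps(event._asdict())


def iter_events(path):
    """Yield tile events from a gzip-compressed log file, skipping other lines."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            event = parse_line(line)
            if event is not None:
                yield event
```

Lines that are not tile requests simply return `None`, which keeps the non-tile traffic out of the later analysis.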

Data insights
• Total requests: 1.7M, 80% of which are tile requests
Requests by zoom level:
[Chart: ratio of requests (0 to 0.3) by zoom level (0 to 28)]
Key measurement: from a specific zoom level, how much further does the user keep zooming in?
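One possible way to compute both figures is sketched below. The talk does not say how sessions were delimited or how the measurement was defined, so the input shapes (a list of zoom levels per request, and one ordered list of zoom levels per user session) and both function names are assumptions.

```python
from collections import Counter, defaultdict


def ratio_by_zoom(zoom_levels):
    """Share of requests that fall on each zoom level."""
    counts = Counter(zoom_levels)
    total = sum(counts.values())
    return {z: count / total for z, count in sorted(counts.items())}


def average_zoom_in(sessions):
    """For each zoom level, the average number of extra levels a session
    goes on to reach after it first requests a tile at that level."""
    extra = defaultdict(list)
    for zooms in sessions:
        seen = set()
        for i, z in enumerate(zooms):
            if z in seen:
                continue  # only the first visit to each level counts
            seen.add(z)
            extra[z].append(max(zooms[i:]) - z)
    return {z: sum(values) / len(values) for z, values in sorted(extra.items())}
```

A table of this kind, indexed by zoom level, is one plausible source for the per-level depth that a precaching traversal needs as its starting budget.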

The data insights (II)

4. The caching algorithm

Timing breakdown
T_TILE = T_DATA + T_RENDERING
T_DATA
• Increases with the complexity of the SQL needed to generate the information
T_RENDERING
• Increases with the amount of data to be drawn or the complexity of its drawing properties

SQL excerpt (cut off on the slide):
    SELECT p.*, d.escrutado,
    partido_ganador, d.p_vot
    d.nom_municipio, d.numer
    FROM poligonos_municipio
    WHERE p.cod_municipio =
    AND p.cod_provincia = d.
    AND d.escrutado >= 3

• Caching resources at map generation, not at map request

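The caching algorithm: pseudo-code

    def caching_traversal(quads, max_depth):
        # Stop when the depth budget is used up or the time limit is hit.
        if max_depth <= 0 or timed_out():
            return
        # Visit the quads in order of information density.
        quads = sorted(quads, key=lambda q: q.information_density)
        for q in quads:
            cache_tile(q)
            # The remaining depth shrinks faster under sparse quads.
            md = (max_depth - 1) * q.information_density
            caching_traversal(q.quads, md)

    # Start with the average zoom-in depth observed at zoom level z.
    caching_traversal(quads, AverageZoom[z])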
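A self-contained variant of the traversal is sketched below so that it can be run as-is. Several details are assumptions rather than the author's implementation: the `Quad` dataclass, a density value between 0 and 1, an in-memory set standing in for `cache_tile`, a monotonic-clock deadline standing in for `timed_out()`, and visiting the densest quads first so that they are cached before time runs out.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Quad:
    z: int
    x: int
    y: int
    information_density: float  # assumed to lie in [0, 1]
    quads: list = field(default_factory=list)  # child quads


def caching_traversal(quads, max_depth, cache, deadline):
    """Cache tiles depth-first, spending more depth on dense quads."""
    if max_depth <= 0 or time.monotonic() > deadline:
        return
    # Assumption: densest quads first (the pseudo-code sorts ascending).
    for q in sorted(quads, key=lambda q: q.information_density, reverse=True):
        cache.add((q.z, q.x, q.y))  # stand-in for rendering and storing a tile
        remaining = (max_depth - 1) * q.information_density
        caching_traversal(q.quads, remaining, cache, deadline)


# Usage: a tiny tree with one dense and one empty branch.
root = Quad(0, 0, 0, 1.0, [
    Quad(1, 0, 0, 1.0, [Quad(2, 0, 0, 1.0)]),
    Quad(1, 1, 0, 0.0, [Quad(2, 2, 0, 1.0)]),
])
cached = set()
caching_traversal([root], 3, cached, deadline=time.monotonic() + 5.0)
```

In this run the traversal follows the dense branch down to zoom level 2 but stops below the empty quad, because multiplying the remaining depth by a density of 0 exhausts the budget.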

The caching algorithm

5. Results and conclusions

The results: a visual
[Figure: two side-by-side comparisons of REQUESTED and CACHED tiles]

The results
• Average T_CACHED ≈ 55ms
• Min(T_TILE) ≈ 170ms, Max(T_TILE) ≈ 45s
• The caching technique covers the most common requests
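To put these numbers in context, the sketch below shows one way coverage and speed-up could be quantified. The talk does not state which metric was used, so `cache_hit_ratio` and `speedup` are hypothetical helpers; only the timing figures come from the slide.

```python
def cache_hit_ratio(requested, cached):
    """Fraction of requested tiles (with repeats) that were precached."""
    requested = list(requested)
    if not requested:
        return 0.0
    cached = set(cached)
    hits = sum(1 for tile in requested if tile in cached)
    return hits / len(requested)


def speedup(t_tile_ms, t_cached_ms):
    """How many times faster a cached tile is than a freshly generated one."""
    return t_tile_ms / t_cached_ms


# Timing figures from the slide, in milliseconds.
best_case = speedup(170, 55)       # fastest uncached tile vs. average cached
worst_case = speedup(45_000, 55)   # slowest uncached tile vs. average cached
```

With the reported figures, a cached tile is roughly 3 times faster than the quickest uncached tile and roughly 800 times faster than the slowest one.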

Carla Iriberri
Supervisor: Víctor Elvira
Bachelor in Telematics Engineering, October 2015
Thanks.

