Elasticsearch Meetup Vienna - webLyzard Live Demo

The Elastic blog [1] recently featured webLyzard’s Visual Exploration of Sustainability Communication with Elasticsearch, a project [2] to track global information flows. Customized for the United Nations Environment Programme, the resulting platform identifies opinion leaders and analyzes the public debate surrounding the UN’s Sustainable Development Goals (SDGs). Its custom-built dashboard [3] synchronizes multiple views in real time and uses aggregations to convey context information through a portfolio of visual tools.

Two of the webLyzard [4] co-founders will present a live demo of the platform and similar applications in other domains. They will discuss some of the underlying aggregations and their experience of recently migrating to Elasticsearch 6.5. The concluding outlook will show how predictive capabilities might help to anticipate mobility bottlenecks, support digital newsrooms, or maximize the impact of published content across social media channels.

[1] https://www.elastic.co/blog/weblyzards-visual-exploration-of-sustainability-communication-with-elasticsearch
[2] https://www.weblyzard.com/unep-live
[3] https://unep.ecoresearch.net
[4] https://www.weblyzard.com

webLyzard technology

December 14, 2018

  1. UNEP Live Web Intelligence | uneplive.unep.org

    Developed by the Department of New Media Technology of MODUL University Vienna and powered by webLyzard technology.
    www.weblyzard.com/unep-live ▪ unep.ecoresearch.net
  4. Elasticsearch | Integration

    History
    - Started with PostgreSQL/Tsearch2
    - Switched to Apache Lucene in 2009
    - Adopted Elasticsearch 1.0 in early 2014
    - Migrated to Elasticsearch 6 earlier this year (currently 6.5.1)

    Current Cluster
    - 5 physical machines (Xeon E5, 40 cores, 256 GB RAM)
    - 3 x 2 TB M.2 NVMe Samsung 960 Pro in striped LVM
    - Multiple Elasticsearch nodes per machine: 4 data nodes, separate master nodes, one coordinating-only node per machine
    - Docker containers, overlay network, discovery using DNS
  5. Elasticsearch | Integration

    Indexing
    - Custom component based on Vert.x
    - Data read from PostgreSQL
    - Indexer applies transformations/de-normalizations
    - Enriches with additional metadata, e.g., translations
    - One index per source and language (e.g., German news media) per month (e.g., de.1.media.2018-12)
    - Balance between index/shard size and number of indexes affected by queries and aggregations
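
The per-month naming scheme on the Indexing slide means that a query over a date range only needs to touch the indexes for the months it covers. The following is a minimal sketch of that idea, not the webLyzard indexer itself. The function names are hypothetical, and the meaning of the second name segment (the `1` in `de.1.media.2018-12`) is not explained on the slide, so it is passed through as an opaque value.

```python
from datetime import date


def index_name(language, segment, source, month):
    """Build a name such as 'de.1.media.2018-12'.

    `segment` is the unexplained second part of the slide's example name;
    it is treated as an opaque value here (assumption).
    """
    return f"{language}.{segment}.{source}.{month:%Y-%m}"


def monthly_indexes(language, segment, source, start, end):
    """List the index names for every month from `start` to `end`, inclusive."""
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(index_name(language, segment, source, date(year, month, 1)))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names


if __name__ == "__main__":
    # A search from mid-November 2018 to early January 2019 touches three indexes.
    for name in monthly_indexes("de", 1, "media", date(2018, 11, 15), date(2019, 1, 3)):
        print(name)
```

Smaller time slices keep individual shards small but increase the number of indexes a long-range query has to visit, which is the trade-off the last bullet of the slide refers to.
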
  6. Elasticsearch | Integration

    Migration 1.7 > 6.x
    - No in-place upgrade path from Elasticsearch 1.x to 6.x
    - Skipped Elasticsearch 2.x because of field names containing dots in existing mapping
    - Complete re-index needed, adapted document mapping

    Performance Improvements
    - Single request performance improved by about 50%
    - Concurrent requests with 100 simulated users improved by almost 85%
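
The slide does not say how the document mapping was adapted for the re-index. Since Elasticsearch 5.0, a dot in a field name is interpreted as a path into an object, so one possible preprocessing step is to rename dotted fields before indexing. The sketch below assumes that approach purely for illustration; the function name and the underscore replacement are hypothetical choices, not the authors' method.

```python
def rename_dotted_fields(doc, replacement="_"):
    """Return a copy of `doc` with dots in field names replaced, recursively.

    Raises ValueError if two fields would end up with the same name.
    """
    if isinstance(doc, list):
        return [rename_dotted_fields(item, replacement) for item in doc]
    if not isinstance(doc, dict):
        return doc  # scalar values are returned unchanged
    renamed = {}
    for key, value in doc.items():
        new_key = key.replace(".", replacement)
        if new_key in renamed:
            raise ValueError(f"field name collision after renaming: {new_key!r}")
        renamed[new_key] = rename_dotted_fields(value, replacement)
    return renamed


if __name__ == "__main__":
    old_doc = {"meta.lang": "de", "entities": [{"geo.name": "Vienna"}]}
    print(rename_dotted_fields(old_doc))
```
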
  7. Elasticsearch | Integration

    Example 1: Geographic Map
    - Hashgrid aggregation for extracted target locations
    - User-selectable precision
    - Aggregate average document sentiment

    Example 2: Word Tree
    - Inner hits on search query
    - Filtered for sentences matching the query
    - Maintaining document-level sorting
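
The two examples on this slide can be approximated with standard Elasticsearch query DSL. The request bodies below are a sketch under several assumptions: "hashgrid" is taken to mean the `geohash_grid` aggregation, sentences are assumed to be stored as nested documents, and all field names (`text`, `target_location`, `sentiment`, `sentences`, `published`) are hypothetical. The functions only build dictionaries, which could then be sent as JSON to a search endpoint.

```python
import json


def geo_sentiment_request(query_text, precision=4):
    """Request body for a map view: geohash buckets with average sentiment."""
    if not 1 <= precision <= 12:
        raise ValueError("geohash precision must be between 1 and 12")
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {"match": {"text": query_text}},
        "aggs": {
            "locations": {
                "geohash_grid": {"field": "target_location", "precision": precision},
                "aggs": {"avg_sentiment": {"avg": {"field": "sentiment"}}},
            }
        },
    }


def word_tree_request(query_text, sentences_per_doc=3):
    """Request body for a word tree: matching sentences returned as inner hits."""
    return {
        "query": {
            "nested": {
                "path": "sentences",
                "query": {"match": {"sentences.text": query_text}},
                # inner_hits returns only the nested sentences that matched
                "inner_hits": {"size": sentences_per_doc},
            }
        },
        # sorting is applied to the parent documents, not to the sentences
        "sort": [{"published": {"order": "desc"}}],
    }


if __name__ == "__main__":
    print(json.dumps(geo_sentiment_request("climate action", precision=5), indent=2))
    print(json.dumps(word_tree_request("climate action"), indent=2))
```

A higher `precision` value yields smaller grid cells, which is one way a user-selectable precision could map onto zoom levels in the geographic view.
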
  9. InVID | www.weblyzard.com/invid
    In Video Veritas – Video Verification
    Media Partners: APA, AFP, Deutsche Welle
    EU Horizon 2020, 3.65 million EUR

    SONAR | www.weblyzard.com/sonar
    Semantic Repository for News Analytics
    Media Partner: ProSiebenSat.1 PULS 4
    Google News Initiative, 225,000 EUR

    ReTV | www.weblyzard.com/retv
    Enhancing and Repurposing TV Content
    Media Partners: Zattoo, rbb|24, Sound & Vision
    EU Horizon 2020, 3.5 million EUR
  10. EPOCH | www.weblyzard.com/epoch
    Extracting and Predicting Events from Online Communication and Hybrid Datasets
    Partners: Ketchum Publico, KPMG
    FFG ICT of the Future; 500,000 EUR

    EPOCH will assess the real-world impact of events reported in news and social media channels. It will use knowledge extracted from the public debate across these channels to predict future trends, offering a visual dashboard to explore and analyze these trends. Organizations will be able to identify and thus better prepare for anticipated changes, adapting their decision-making and resource allocation strategies accordingly. New methods developed within EPOCH will be applied to the domains of purchase price forecasting and public relations.
  11. @weblyzard

    We Are Definitely Hiring
    Our team spans 2+ countries and counting. See if you or someone you know is a fit. [email protected]