
Elasticsearch Meetup Vienna - webLyzard Live Demo


The Elastic blog [1] recently featured webLyzard’s Visual Exploration of Sustainability Communication with Elasticsearch, a project [2] to track global information flows. Customized for the United Nations Environment Programme, the resulting platform identifies opinion leaders and analyzes the public debate surrounding the UN’s Sustainable Development Goals (SDGs). Its custom-built dashboard [3] synchronizes multiple views in real time and uses aggregations to convey context information through a portfolio of visual tools.

Two of the webLyzard [4] co-founders will present a live demo of the platform and similar applications in other domains. They will discuss some of the underlying aggregations and their experience of recently migrating to Elasticsearch 6.5. The concluding outlook will show how predictive capabilities might help to anticipate mobility bottlenecks, support digital newsrooms, or maximize the impact of published content across social media channels.

[1] https://www.elastic.co/blog/weblyzards-visual-exploration-of-sustainability-communication-with-elasticsearch
[2] https://www.weblyzard.com/unep-live
[3] https://unep.ecoresearch.net
[4] https://www.weblyzard.com

webLyzard technology

December 14, 2018


Transcript

  1. 1
    www.webLyzard.com
    webLyzard Web Intelligence and Visual Analytics
    Elasticsearch Meetup
    Arno Scharl, Alexander Hubmann-Haidvogel


  2. 2
    UNEP Live Web
    Intelligence
    uneplive.unep.org
    Developed by the Department of New Media Technology of
    MODUL University Vienna and powered by webLyzard technology
    www.weblyzard.com/unep-live ▪ unep.ecoresearch.net


  3. 3


  4. 4


  5. 5 Elasticsearch | Integration
    History
    - Started with PostgreSQL/Tsearch2
    - Switched to Apache Lucene in 2009
    - Adopted Elasticsearch 1.0 early 2014
    - Migrated to Elasticsearch 6 earlier this year (currently 6.5.1)
    Current Cluster
    - 5 physical machines (XEON E5, 40 cores, 256GB RAM)
    - 3 x 2TB M.2 NVMe Samsung 960 Pro in striped LVM
    - Multiple Elasticsearch nodes per machine:
      4 data nodes, separate master nodes, one coordinating-only node per machine
    - Docker containers, overlay network, discovery using DNS


  6. 6 Elasticsearch | Integration
    Indexing
    – Custom component based on Vert.x
    – Data read from PostgreSQL
    – Indexer applies transformations/de-normalizations
    – Enriches documents with additional metadata, e.g. translations
    – One index per source and language (e.g. German news media)
    per month (e.g. de.1.media.2018-12)
    – Balance between index/shard size and number of indexes
    affected by queries and aggregations
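
The naming scheme on this slide can be sketched in Python as follows. This is a minimal illustration, not the authors' indexer (which is a custom Vert.x component). The slide gives only one example name, `de.1.media.2018-12`, so the meaning of the `1` segment is not stated in the source; the hypothetical `segment` parameter below simply passes it through, and both function names are made up for this sketch.

```python
from datetime import date


def index_name(language: str, source: str, month: date, segment: str = "1") -> str:
    """Build a monthly index name such as 'de.1.media.2018-12'.

    `segment` is an assumption: the slide shows a '1' in this position
    without explaining it, so it is treated as an opaque value here.
    """
    return f"{language}.{segment}.{source}.{month:%Y-%m}"


def index_names_for_range(language: str, source: str, start: date, end: date,
                          segment: str = "1") -> list:
    """List the monthly indexes that a query over [start, end] has to touch."""
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(index_name(language, source, date(year, month, 1), segment))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names
```

A search over a date range then only needs to address the indexes returned by `index_names_for_range`, which is one way the number of indexes interacts with query cost.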


  7. 7 Elasticsearch | Integration
    Migration 1.7 > 6.x
    – No in-place upgrade path from Elasticsearch 1.x to 6.x
    – Skipped Elasticsearch 2.x because of field names containing
    dots in existing mapping
    – Complete re-index needed, adapted document mapping
    Performance Improvements
    – Single request performance improved by about 50%
    – Concurrent requests with 100 simulated users improved
    by almost 85%


  8. 8 Elasticsearch | Integration
    Example 1: Geographic Map
    – Geohash grid aggregation for extracted target locations
    – User-selectable precision
    – Aggregate average document sentiment
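
A request body for Example 1 might look like the sketch below, which combines Elasticsearch's `geohash_grid` bucket aggregation with an `avg` sub-aggregation. The field names `target_location` and `sentiment` and the function name are assumptions for illustration; the slides do not show the actual mapping or query.

```python
def geo_sentiment_aggregation(precision: int,
                              location_field: str = "target_location",
                              sentiment_field: str = "sentiment") -> dict:
    """Build a search body that buckets documents by geohash cell and
    averages a numeric sentiment field per cell.

    Field names are hypothetical; `precision` is the user-selectable
    geohash length (Elasticsearch accepts 1 to 12).
    """
    if not 1 <= precision <= 12:
        raise ValueError("geohash precision must be between 1 and 12")
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "aggs": {
            "locations": {
                "geohash_grid": {"field": location_field, "precision": precision},
                "aggs": {
                    "avg_sentiment": {"avg": {"field": sentiment_field}}
                },
            }
        },
    }
```

A lower precision yields fewer, larger cells, which suits a zoomed-out map; a higher precision produces finer cells at the cost of more buckets.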
    Example 2: Word Tree
    – Inner hits on search query
    – Filtered for sentences matching the query
    – Maintaining document-level sorting
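
One possible shape for the Example 2 query is sketched below. It assumes that sentences are stored as nested objects under a hypothetical `sentences` field with a `text` sub-field, and that documents carry a sortable `date` field; the slides do not say how sentences are modelled, so treat this as one option rather than the authors' query. The `nested` query with `inner_hits` returns only the matching sentences per document, while the top-level `sort` keeps the ordering at document level.

```python
def word_tree_query(term: str, sort_field: str = "date", max_sentences: int = 3) -> dict:
    """Build a search body that returns documents in document-level order
    together with the sentences that match `term`.

    The 'sentences' path, 'sentences.text' field and default sort field
    are assumptions made for this sketch.
    """
    return {
        "query": {
            "nested": {
                "path": "sentences",
                "query": {"match": {"sentences.text": term}},
                # inner_hits limits the returned sentences to those matching the query
                "inner_hits": {"size": max_sentences},
            }
        },
        # sorting applies to the parent documents, not to the inner hits
        "sort": [{sort_field: {"order": "desc"}}],
    }
```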


  9. 9


  10. 10 InVID | www.weblyzard.com/invid
    In Video Veritas – Video Verification
    Media Partners: APA, AFP, Deutsche Welle
    EU Horizon 2020, 3.65 million EUR
    SONAR | www.weblyzard.com/sonar
    Semantic Repository for News Analytics
    Media Partner: ProSiebenSat.1 PULS 4
    Google News Initiative, 225,000 EUR
    ReTV | www.weblyzard.com/retv
    Enhancing and Repurposing TV Content
    Media Partners: Zattoo, rbb|24, Sound & Vision
    EU Horizon 2020, 3.5 million EUR


  11. 11 EPOCH | www.weblyzard.com/epoch
    Extracting and Predicting Events from Online
    Communication and Hybrid Datasets
    Partners: Ketchum Publico, KPMG
    FFG ICT of the Future; 500,000 EUR
    EPOCH will assess the real-world impact of events reported
    in news and social media channels. It will use extracted
    knowledge from the public debate across these channels to
    predict future trends, offering a visual dashboard to explore
    and analyze these trends.
    Organizations will be able to identify and thus better
    prepare for anticipated changes, adapting their decision-
    making and resource allocation strategies accordingly. New
    methods developed within EPOCH will be applied to the
    domains of purchase price forecasting and public relations.


  12. 12
    @weblyzard
    We Are Definitely Hiring
    Our team spans 2+ countries and counting.
    See if you or someone you know is a fit.
    [email protected]
