
Towards Visual Overviews for Open Government Data

Alvaro Graves
September 01, 2014

Slides from our talk on visual overviews at DataWiz2014 in Santiago, Chile.


Transcript

  1. Towards Visual Overviews for Open Government Data
    Alvaro Graves ([email protected], @alvarograves), INRIA Chile
    Javier Bustos-Jimenez ([email protected]), NIC Chile Research Labs
  2. Data is everywhere!
    • Dozens of countries provide data about health, budget, and more
    • In most cases, it is free to use and, hopefully, in a machine-readable format
    • But how can we enable people without strong technical knowledge to use these datasets?
  3. More information is needed
    • Titles and descriptions are not enough
    • Knowing what type of data is available is critical
    • The devil is in the details
    • What about finding things that I wasn't looking for? (serendipity)
    • A visual overview of the dataset can help potential consumers decide whether it is useful or not
  4. Can visual representations help users?
    • Visual representations allow users to consume large amounts of data easily
    • They make it possible to identify trends that are hard for machines to detect
    • They make it possible to detect outliers
  5. data, metadata, annotations
    <persons>
      <person>
        <!-- this is a comment -->
        <name>John</name>
        <lastname>Doe</lastname>
        <language iso="EN">English</language>
      </person>
    </persons>
  6. data, metadata, annotations
    Listado de productos Bioequivalentes, actualizado al 1 de Agosto de 2014
    No,Principio Activo,Producto,Registro,Uso / Tratamiento
    1,ANASTROZOL,Anastrazol comprimidos recubiertos 1 mg,F-16801,Cancer de mama
    2,ANASTROZOL,Madelen comprimidos recubiertos 1 mg,F-16807,Cancer de mama
  7. data, metadata, annotations
    {
      persons: [
        {
          name: "John",
          lastname: "Doe",
          language: { value: "English", iso: "EN" }
        }
      ]
    }
  8. Prototype
    • "Free sample" of the data: allow the user to check what the data looks like
    • A visualization per column (in CSV)
    • Word clouds for the most common terms per column
    • Histogram to see the distribution of values
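The prototype's per-column idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`column_overview`, `term_counts`, `histogram`), the rule "a column is numeric if every non-empty cell parses as a float", and the equal-width binning are all hypothetical choices. The sketch returns the numbers a word cloud or histogram would be drawn from, plus a few sample rows as the "free sample".

```python
import csv
import io
import re
from collections import Counter


def is_number(text):
    """Return True if the cell parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def term_counts(values, top=5):
    """Most common lowercase word tokens in a column (word-cloud input)."""
    words = []
    for value in values:
        words.extend(re.findall(r"\w+", value.lower()))
    return Counter(words).most_common(top)


def histogram(values, bins=4):
    """Count values in equal-width bins between the column's min and max."""
    numbers = [float(v) for v in values]
    low, high = min(numbers), max(numbers)
    width = (high - low) / bins or 1.0  # avoid a zero width when all values are equal
    counts = [0] * bins
    for n in numbers:
        index = min(int((n - low) / width), bins - 1)  # max value goes in the last bin
        counts[index] += 1
    return counts


def column_overview(csv_text, sample_rows=3):
    """Build a 'free sample' plus one summary per CSV column (assumed design)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    overview = {"sample": rows[:sample_rows], "columns": {}}
    for name in reader.fieldnames or []:
        values = [row[name] for row in rows if row[name]]  # skip empty cells
        if values and all(is_number(v) for v in values):
            overview["columns"][name] = ("histogram", histogram(values))
        else:
            overview["columns"][name] = ("terms", term_counts(values))
    return overview
```

Calling `column_overview` on the text of a CSV file yields, for each column, either bin counts or the most frequent terms; a front end could render these as a histogram or a word cloud respectively.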
  9. Many open challenges
    • Not all datasets are available as CSV
    • 1-viz-per-column is not enough (e.g., latitude & longitude)
    • Type of data should change the visualization strategy (e.g., gender & age vs. address)
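One way to picture a type-aware strategy is a small rule table, sketched below. Everything here is an assumption made for illustration: the name `choose_strategies`, the column-name hints for coordinates, the `max_categories` threshold, and the strategy labels (`map`, `histogram`, `bar_chart`, `word_cloud`) do not come from the talk. The sketch pairs latitude and longitude into a single view, then treats numeric columns (such as age), low-cardinality columns (such as gender), and free text (such as address) differently.

```python
# Hypothetical column-name hints for detecting a coordinate pair.
LAT_NAMES = {"lat", "latitude"}
LON_NAMES = {"lon", "lng", "long", "longitude"}


def to_floats(values):
    """Return the values as floats, or None if any value is not numeric."""
    try:
        return [float(v) for v in values]
    except ValueError:
        return None


def choose_strategies(columns, max_categories=10):
    """Map columns to visualization strategies.

    columns: dict of column name -> list of string values.
    Returns a list of (strategy, column_names) tuples.
    """
    strategies = []
    names = {name.lower(): name for name in columns}
    lat = next((names[n] for n in names if n in LAT_NAMES), None)
    lon = next((names[n] for n in names if n in LON_NAMES), None)
    paired = set()
    if lat and lon:
        lats, lons = to_floats(columns[lat]), to_floats(columns[lon])
        # Only pair the columns if both hold plausible coordinates.
        if (lats is not None and lons is not None
                and all(-90 <= v <= 90 for v in lats)
                and all(-180 <= v <= 180 for v in lons)):
            strategies.append(("map", (lat, lon)))
            paired = {lat, lon}
    for name, values in columns.items():
        if name in paired:
            continue
        if to_floats(values) is not None:
            strategies.append(("histogram", (name,)))   # e.g., age
        elif len(set(values)) <= max_categories:
            strategies.append(("bar_chart", (name,)))   # e.g., gender
        else:
            strategies.append(("word_cloud", (name,)))  # e.g., address
    return strategies
```

Name-based detection is brittle (a column called `x` could still hold latitudes), so a real system would likely need content-based checks or user hints as well.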
  10. Conclusion
    • We showed how we can extract different types of data, metadata and annotations in order to create visual overviews
    • Smarter mechanisms are needed to create visual overviews (VOs) that are easy to share and embed
    • Comparative visual overviews over multiple datasets?
    • We need to run different tests to validate different approaches