
The Government Data Portal for Germany

Konrad Reiche
September 12, 2013

Fraunhofer FOKUS collaborated with the Federal Ministry of the Interior to build the official open government data platform for Germany: govdata.de. This talk introduces open data use cases in Germany and describes how we approached and eventually architected the platform.

Transcript

  1. Fraunhofer FOKUS
    1 [email protected] | Berlin
    The Government Data Portal for Germany
    GovData.de
    Konrad Johannes Reiche | September 12, 2013 | Nancy, France

  2. Fraunhofer FOKUS
    Open Government Data
    Government Data
    • Motives
    – Transparency
    – Innovation
    – Participation
    – Efficiency
    • Core Elements
    – Machine-readable data
    – Licenses
    – Accessibility
    [Figure: Open Government Data Venn Diagram, by justgrimes]

  3. Fraunhofer FOKUS
    Open Data Example #1
    mundraub.de
    • Many fruit bushes and fruit trees go unharvested
    • Wild fruit, private growers, organizations
    • Data about these plants is collected and published on http://www.mundraub.de
    • The data comes from people and administrations who submit their knowledge for public use
    by hybrid.moment

  4. Fraunhofer FOKUS
    Open Data Example #2
    Glass Recycling Containers in Berlin
    • Glass recycling containers in Berlin
    – City
    – Private organizations
    • The Berlin Cleansing Department (BSR) has to clean around them
    • Problem: BSR has no information about the location of many containers
    • Solution: local administrations do have data about the containers' locations and help the BSR by making this data publicly available
    by Andreas Möller
    by pixelroiber

  5. Fraunhofer FOKUS
    Metadata…
    …and Harvesting
    • Data is stored and managed in a distributed manner
    • Why? Centralized data is hardly feasible and beyond administrative
    • Heterogeneous data, distributed competence, conflicts of interest
    • Metadata is used to describe the data
    • Distributed data storage with a central metadata portal
    • Harvesting: copying metadata to the portal so that the data becomes accessible there, too
    [Diagram: Portal, Datasets, Documents]
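
The harvesting idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the GovData.de implementation: the `CentralPortal` class, the record fields, and the example portal names and URLs are all hypothetical. What it tries to show is that only metadata is copied, while each record keeps a URL that points back to the data on the provider's own server.

```python
from dataclasses import dataclass


@dataclass
class MetadataRecord:
    """Describes a dataset; the data itself stays with the provider."""
    name: str
    title: str
    data_url: str       # where the provider hosts the actual data
    source_portal: str  # which portal the record was harvested from


class CentralPortal:
    """Hypothetical central catalog that stores metadata only."""

    def __init__(self):
        self.catalog = {}

    def harvest(self, portal_name, records):
        """Copy metadata records from one source portal into the catalog."""
        for raw in records:
            record = MetadataRecord(
                name=raw["name"],
                title=raw["title"],
                data_url=raw["url"],
                source_portal=portal_name,
            )
            # Re-harvesting the same name overwrites the older copy.
            self.catalog[record.name] = record
        return len(records)

    def search(self, term):
        """Return records whose title contains the term (case-insensitive)."""
        term = term.lower()
        return [r for r in self.catalog.values() if term in r.title.lower()]


# Hypothetical source portals and records, for illustration only.
portal = CentralPortal()
portal.harvest("city-a", [
    {"name": "glass-containers", "title": "Glass Recycling Containers",
     "url": "https://data.city-a.example/glass.csv"},
])
portal.harvest("state-b", [
    {"name": "fruit-trees", "title": "Public Fruit Trees",
     "url": "https://data.state-b.example/trees.json"},
])
hits = portal.search("glass")
```

A search on the central portal returns the metadata record, and a client then follows `data_url` to fetch the data from the original provider.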

  6. Fraunhofer FOKUS
    Starting Point
    Data Portals of the Länder
    • Germany is a federal state
    – Consists of 16 states (Länder)
    – Administrative power is divided
    • Different data providers and portals
    – Bavaria
    – Berlin
    – Bremen
    – Federal Statistical Office (Destatis)
    – Hamburg
    – Rostock
    – Environmental Information Portal (PortalU)
    – Geo Data Infrastructure Germany (GDI-DE)
    – Rhineland-Palatinate
    – and more…
    DeStatis (David Liuzzo)

  7. Fraunhofer FOKUS
    Government Data Portal for Germany
    GovData.de
    • Launched February 19, 2013
    • http://www.govdata.de
    • Prototyped at Fraunhofer FOKUS
    • Different types of data
    – Datasets
    – Documents
    – Applications
    • Focus on free licenses
    – German Data License (dl-de,…)
    – Creative Commons (cc-by,…)
    – ...

  8. Fraunhofer FOKUS
    Quantification
    in Numbers
    • February 2013
    – Datasets: 1,123
    – Documents: 12
    – Applications: 25
    – Daily visitors: 2,000
    • March 2013
    – Daily visitors: 500
    • August 2013
    – Datasets: 3,797
    – Documents: 230
    – Applications: 15
    – Daily visitors: 300
    [Chart: Open Data Licenses on GovData.de]

  9. Fraunhofer FOKUS
    Building GovData.de
    Strategy
    • Repository software: CKAN (Comprehensive Knowledge Archive Network)
    – Data catalogue for storing and distributing data
    – Developed by the Open Knowledge Foundation (OKFN)
    – Prevalent format: JSON
    – API offers a REST interface
    • Metadata Schema (OGD-Metadata)
    – Structure used to standardize and unify the metadata supplied by data providers
    – https://github.com/fraunhoferfokus/ogd-metadata
    – JSON Schema, keep it simple (few fields), e.g. document data origins
    – Why the hassle? Different data providers: very heterogeneous data
    – Make data accessible: unification needed
    – The schema is not a mere tool but a means of communication
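
As a rough sketch of what the REST interface looks like from a client: CKAN's action API exposes calls such as `package_search` under `/api/3/action/` and answers with a JSON envelope containing `success` and `result`. The base URL `https://ckan.example.org` and the sample response below are made up for illustration; whether and where GovData.de exposes this endpoint is not stated in the talk.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url, query, rows=10):
    """Build a URL for CKAN's package_search action."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def parse_search_response(body):
    """Extract (count, dataset names) from a package_search JSON response."""
    payload = json.loads(body)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    result = payload["result"]
    return result["count"], [pkg["name"] for pkg in result["results"]]


# Made-up response in the shape CKAN returns; no network access is needed.
sample_body = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [{"name": "waste-management-statistics-2013"}],
    },
})

url = build_search_url("https://ckan.example.org/", "waste management")
count, names = parse_search_response(sample_body)
```

Against a live catalog, the body would come from an HTTP GET on `url` (for example via `urllib.request.urlopen`) instead of the canned `sample_body`.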

  10. Fraunhofer FOKUS
    Metadata Schema − Example
    Field       Subfield         Value
    Name                         waste-management-statistics-2013
    Title                        Waste Management: Disposal and Treatment Facility
    Author                       Statistical Office
    Maintainer                   Juliane Sanger
    Tags                         Hessen, Berlin, Visualization, Classification
    …           …                …
    Extras      Terms of Use     ID: cc-by
                                 URL: http://creativecommons.org/licenses/by/3.0
                Spatial          Coordinates: [[15.02, 47.16], [15.02, 47.16]]
                Original Portal  http://www.regionalstatistik.de
                …                ….
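
To make the example concrete, the record could be written as a JSON-like structure and checked for required fields before it enters the catalog. The key names (`terms_of_use`, `license_id`, `original_portal`, and so on) and the choice of required fields are assumptions for this sketch; the authoritative definitions live in the OGD-Metadata repository. The check is hand-written rather than a JSON Schema validator, purely to stay dependency-free.

```python
REQUIRED_FIELDS = ("name", "title", "author")          # assumed, not from the schema
REQUIRED_EXTRAS = ("terms_of_use", "original_portal")  # assumed, not from the schema

record = {
    "name": "waste-management-statistics-2013",
    "title": "Waste Management: Disposal and Treatment Facility",
    "author": "Statistical Office",
    "maintainer": "Juliane Sanger",
    "tags": ["Hessen", "Berlin", "Visualization", "Classification"],
    "extras": {
        "terms_of_use": {
            "license_id": "cc-by",
            "license_url": "http://creativecommons.org/licenses/by/3.0",
        },
        "spatial": {"coordinates": [[15.02, 47.16], [15.02, 47.16]]},
        "original_portal": "http://www.regionalstatistik.de",
    },
}


def find_problems(rec):
    """Return a list of human-readable problems; empty means the record passes."""
    problems = [f"missing field: {key}" for key in REQUIRED_FIELDS if not rec.get(key)]
    extras = rec.get("extras", {})
    problems += [f"missing extra: {key}" for key in REQUIRED_EXTRAS if key not in extras]
    if "terms_of_use" in extras and not extras["terms_of_use"].get("license_id"):
        problems.append("terms_of_use has no license_id")
    return problems


problems_ok = find_problems(record)
problems_bad = find_problems({"name": "no-title", "extras": {}})
```

Returning a list of problems instead of raising on the first one lets a harvester report everything a data provider needs to fix in a single pass.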

  11. Fraunhofer FOKUS
    Architecture
    GovData.de
    [Architecture diagram labels]
    – Portal (Liferay)
    – Information Pool (Web Portal + CMS)
    – User Interface for the Data Catalog
    – Indexer + Thesaurus
    – CKAN
    – CSW/CKAN Harvester
    – REST Interface
    – Browser
    – Apps
    – Web Sites of Public Authorities
    – Subject Catalogs (Geo Data, etc.)
    – Open Data Catalogs (Berlin, Bavaria, Bremen, etc.)
    – REST Interface

  12. Fraunhofer FOKUS
    What’s next?
    Outlook
    • Open data has to be understood as a process
    • Active communication with current and prospective data providers to bring more data, and especially more interesting data, to GovData.de
    • Quality of metadata plays a crucial role
    – Influences discoverability and searchability
    – Needs to be improved constantly
    • GovData.de and its metadata schema should not be an isolated application
    – Schema compatibility with Government Data Austria (data.gv.at)
    – DCAT: RDF vocabulary to facilitate interoperability between data catalogs
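
One way to picture DCAT interoperability is a mapping from catalog fields to DCAT and Dublin Core terms. The sketch below renders a single dataset as Turtle-style statements. The input keys and the dataset URI are hypothetical, and the property selection (`dct:identifier`, `dct:title`, `dcat:keyword`, `dcat:landingPage`) is a small illustrative subset, not a mapping taken from GovData.de.

```python
PREFIXES = (
    "@prefix dcat: <http://www.w3.org/ns/dcat#> .\n"
    "@prefix dct: <http://purl.org/dc/terms/> ."
)


def literal(text):
    """Quote a string literal, escaping backslashes and double quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_dcat_turtle(record, dataset_uri):
    """Render a catalog record as Turtle-style DCAT statements."""
    statements = [
        "a dcat:Dataset",
        "dct:identifier " + literal(record["name"]),
        "dct:title " + literal(record["title"]),
    ]
    statements += ["dcat:keyword " + literal(tag) for tag in record.get("tags", [])]
    if record.get("landing_page"):
        statements.append("dcat:landingPage <" + record["landing_page"] + ">")

    body = " ;\n    ".join(statements)
    return PREFIXES + "\n\n<" + dataset_uri + ">\n    " + body + " ."


# Hypothetical record and URI, for illustration only.
turtle = to_dcat_turtle(
    {
        "name": "waste-management-statistics-2013",
        "title": "Waste Management: Disposal and Treatment Facility",
        "tags": ["Hessen", "Berlin"],
        "landing_page": "http://www.regionalstatistik.de",
    },
    "https://catalog.example.org/dataset/waste-management-statistics-2013",
)
```

Because both catalogs would describe datasets with the same vocabulary, a consumer can read records from either one without knowing the internal schema of the source.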

  13. Fraunhofer FOKUS
    Metadata Harvesting
    Techniques
    • CKAN to CKAN
    • JSON to CKAN
    • ISO 19115 to CKAN
    • CKAN API
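
Each of these techniques amounts to converting a source record into the shape the catalog stores. A minimal sketch under stated assumptions: the provider JSON field names (`id`, `label`, `keywords`) are invented, the target shape is reduced to three keys, and the ISO 19115 sample is a heavily trimmed record in the ISO 19139 XML encoding with its `gmd` and `gco` namespaces. Real harvesters handle many more fields, error cases, and updates.

```python
import json
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}


def from_ckan(text):
    """CKAN to CKAN: keep the fields, flatten CKAN's list of tag objects."""
    src = json.loads(text)
    tags = [tag["name"] for tag in src.get("tags", [])]
    return {"name": src["name"], "title": src["title"], "tags": tags}


def from_json(text):
    """JSON to CKAN: map a hypothetical provider format onto the target keys."""
    src = json.loads(text)
    return {"name": src["id"], "title": src["label"], "tags": src.get("keywords", [])}


def from_iso19115(xml_text):
    """ISO 19115 to CKAN: read identifier and title from ISO 19139 XML."""
    root = ET.fromstring(xml_text)
    identifier = root.findtext("gmd:fileIdentifier/gco:CharacterString", namespaces=NS)
    title = root.findtext(
        "gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/"
        "gmd:CI_Citation/gmd:title/gco:CharacterString",
        namespaces=NS,
    )
    return {"name": identifier, "title": title, "tags": []}


CONVERTERS = {"ckan": from_ckan, "json": from_json, "iso19115": from_iso19115}


def convert(kind, payload):
    """Pick the converter that matches the source format."""
    return CONVERTERS[kind](payload)


# Trimmed, made-up ISO 19139 record for illustration.
ISO_SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:fileIdentifier>
    <gco:CharacterString>geo-sample-1</gco:CharacterString>
  </gmd:fileIdentifier>
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:title>
            <gco:CharacterString>Sample Geo Dataset</gco:CharacterString>
          </gmd:title>
        </gmd:CI_Citation>
      </gmd:citation>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>"""

iso_record = convert("iso19115", ISO_SAMPLE)
```

Keeping one converter per source format means a new provider format only needs a new entry in `CONVERTERS`, while everything downstream keeps working on the same record shape.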
