Slide 1

Slide 1 text

Defining dataset specifications to communicate data quality
Peter Desmet, Stijn Van Hoey, Dimitri Brosens

Slide 2

Slide 2 text

Darwin Core offers a lot of (necessary) freedom

Slide 3

Slide 3 text

But how do you express more rigorous requirements?

Slide 4

Slide 4 text

We need documentation

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

Does my dataset comply?

Slide 8

Slide 8 text

We need machine-readable documentation

Slide 9

Slide 9 text

YAML: human- and machine-readable
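
To make the idea concrete, here is a minimal sketch of how a machine-readable specification could be checked against a dataset. It is not the data-validator's actual syntax or API: the rule names (`allowed`, `min`, `max`), the example values, and the functions `validate_record` and `validate_dataset` are all hypothetical. The specification is written as the Python dictionary a YAML parser would produce, so the sketch runs with the standard library only.

```python
# Hypothetical specification. In YAML it might be written as:
#   country:
#     allowed: [BE, NL]
#   individualCount:
#     min: 1
#     max: 100
SPEC = {
    "country": {"allowed": ["BE", "NL"]},
    "individualCount": {"min": 1, "max": 100},
}


def validate_record(record, spec):
    """Return a list of error messages for one record (a dict of strings)."""
    errors = []
    for term, rules in spec.items():
        value = record.get(term, "")
        # Controlled vocabulary check.
        if "allowed" in rules and value not in rules["allowed"]:
            errors.append(f"{term}: {value!r} not in allowed values {rules['allowed']}")
        # Numeric range check; values arrive as text, as in a CSV file.
        if "min" in rules or "max" in rules:
            try:
                number = float(value)
            except (TypeError, ValueError):
                errors.append(f"{term}: {value!r} is not a number")
                continue
            if "min" in rules and number < rules["min"]:
                errors.append(f"{term}: {value!r} is below minimum {rules['min']}")
            if "max" in rules and number > rules["max"]:
                errors.append(f"{term}: {value!r} is above maximum {rules['max']}")
    return errors


def validate_dataset(records, spec):
    """Return a report mapping row number (1-based) to errors, for failing rows only."""
    report = {}
    for row_number, record in enumerate(records, start=1):
        errors = validate_record(record, spec)
        if errors:
            report[row_number] = errors
    return report


if __name__ == "__main__":
    records = [
        {"country": "BE", "individualCount": "3"},
        {"country": "FR", "individualCount": "0"},
    ]
    for row_number, errors in validate_dataset(records, SPEC).items():
        for error in errors:
            print(f"row {row_number}: {error}")
```

Keeping the rules in a plain data structure, rather than in code, is what lets the same file serve as documentation for people and as input for a validator.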

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

Demo

Slide 12

Slide 12 text

Dataset

Slide 13

Slide 13 text

Run data-validator

Slide 14

Slide 14 text

Report

Slide 15

Slide 15 text

Improved dataset

Slide 16

Slide 16 text

Rerun data-validator

Slide 17

Slide 17 text

Specifications for datasets

Slide 18

Slide 18 text

Specifications for data publishers

Slide 19

Slide 19 text

Specifications for data users

Slide 20

Slide 20 text

Specifications for communities

Slide 21

Slide 21 text

Integration in data publication workflows

Slide 22

Slide 22 text

No content

Slide 23

Slide 23 text

Proof of concept: github.com/inbo/data-validator
Examples used in this presentation: bit.ly/2h352c8

Slide 24

Slide 24 text

Thanks! @peterdesmet @stijnvanhoey @dimibro bit.ly/2h0cDLU
Desmet P, Van Hoey S & Brosens D (2016) Defining dataset specifications to communicate data quality. http://bit.ly/2h0cDLU