Slide 1

Slide 1 text

Whip
Communicate and test what to expect from data
Stijn Van Hoey & Peter Desmet

Slide 2

Slide 2 text

Expectations Data Users

Slide 3

Slide 3 text

Expectations
Data
Users
Fit for my research?
Fit for specific user community?

Slide 4

Slide 4 text

We are a data publisher

Slide 5

Slide 5 text

We care

Slide 6

Slide 6 text

What to expect Data Publisher

Slide 7

Slide 7 text

What to expect
Data Publisher
Data quality
Standardization
Community recommendations
Dataset characteristics

Slide 8

Slide 8 text

Expectations / What to expect Data Publisher Users Expectations What to expect

Slide 9

Slide 9 text

How to communicate expectations? Data Publisher Users Expectations What to expect

Slide 10

Slide 10 text

How to test expectations? Data Publisher Users Expectations What to expect

Slide 11

Slide 11 text

Whip

Slide 12

Slide 12 text

Whip syntax

Slide 13

Slide 13 text

Whip syntax

Slide 14

Slide 14 text

Whip syntax Field

Slide 15

Slide 15 text

Whip syntax Field Specification

Slide 16

Slide 16 text

Whip syntax Comment Field Specification

Slide 17

Slide 17 text

Whip syntax Comment Field Specification

Slide 18

Slide 18 text

Whip specifications
allowed
minlength / maxlength
stringformat
regex
min / max
numberformat
mindate / maxdate
dateformat
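The specification names above can be illustrated with a small sketch. This is not pywhip and does not claim to reproduce its behaviour. It is a hand-written checker, assuming that a whip document maps each field name to a dictionary of specifications. The field names (`country`, `individualCount`, `eventDate`), the helpers `check_value` and `check_row`, and the reading given to each specification are assumptions made for illustration only.

```python
import re
from datetime import datetime

# Hypothetical specifications: field name -> {specification: value}.
specifications = {
    "country": {"allowed": ["BE", "NL"], "minlength": 2, "maxlength": 2},
    "individualCount": {"min": 1, "max": 100},
    "eventDate": {"regex": r"\d{4}-\d{2}-\d{2}", "dateformat": "%Y-%m-%d"},
}


def check_value(value, spec):
    """Return the names of the specifications that the string `value` violates."""
    failed = []
    if "allowed" in spec and value not in spec["allowed"]:
        failed.append("allowed")
    if "minlength" in spec and len(value) < spec["minlength"]:
        failed.append("minlength")
    if "maxlength" in spec and len(value) > spec["maxlength"]:
        failed.append("maxlength")
    if "regex" in spec and re.fullmatch(spec["regex"], value) is None:
        failed.append("regex")
    if "min" in spec or "max" in spec:
        # Assumed: a value that is not a number fails both bounds.
        try:
            number = float(value)
        except ValueError:
            number = None
        if "min" in spec and (number is None or number < spec["min"]):
            failed.append("min")
        if "max" in spec and (number is None or number > spec["max"]):
            failed.append("max")
    if "dateformat" in spec:
        # Assumed: date formats are written as strptime patterns.
        try:
            datetime.strptime(value, spec["dateformat"])
        except ValueError:
            failed.append("dateformat")
    return failed


def check_row(row, specifications):
    """Map each field with at least one violation to its failed specifications."""
    report = {}
    for field, spec in specifications.items():
        failed = check_value(row[field], spec)
        if failed:
            report[field] = failed
    return report


good_row = {"country": "BE", "individualCount": "5", "eventDate": "2018-08-29"}
bad_row = {"country": "Belgium", "individualCount": "0", "eventDate": "29/08/2018"}
good_report = check_row(good_row, specifications)
bad_report = check_row(bad_row, specifications)
```

Keeping the specifications in a plain mapping means the same document can be read by people and passed to a checker, which is the point the slides make about whip.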

Slide 19

Slide 19 text

Whip scope specifications
empty
delimitedvalues
if
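A rough sketch of how two of these scope specifications might change the way other specifications are applied. It is not pywhip code. It assumes that `empty` decides whether an empty value is accepted, and that `delimitedvalues` carries a `delimiter` plus the specifications to apply to each separated part. The helper `check_field`, the `sex` values, and the restriction to the `allowed` specification are hypothetical. The `if` specification is not covered here.

```python
def check_field(value, spec):
    """Return True when `value` passes under the assumed scope semantics."""
    # Assumed: an empty value passes only when the spec sets `empty` to True,
    # and the remaining specifications are skipped for it.
    if value == "":
        return spec.get("empty", False)
    # Assumed: `delimitedvalues` splits the value on its delimiter and
    # applies the nested specifications to each part separately.
    if "delimitedvalues" in spec:
        inner = dict(spec["delimitedvalues"])
        delimiter = inner.pop("delimiter")
        return all(check_field(part, inner) for part in value.split(delimiter))
    # Only `allowed` is handled in this sketch.
    if "allowed" in spec:
        return value in spec["allowed"]
    return True


# Hypothetical specifications for a `sex` field.
optional_sex = {"allowed": ["male", "female"], "empty": True}
required_sex = {"allowed": ["male", "female"]}
multiple_sex = {
    "delimitedvalues": {"delimiter": " | ", "allowed": ["male", "female"]}
}

empty_ok = check_field("", optional_sex)
empty_rejected = check_field("", required_sex)
delimited_ok = check_field("male | female", multiple_sex)
delimited_rejected = check_field("male | unknown", multiple_sex)
```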

Slide 20

Slide 20 text

Using whip to document

Slide 21

Slide 21 text

Pywhip: a whip implementation

Slide 22

Slide 22 text

Pywhip

import yaml
from pywhip import whip_csv

# load specifications
with open("my_specifications.yml") as spec_file:
    specifications = yaml.safe_load(spec_file)

# test specifications
test = whip_csv("my_data.csv", specifications)

# get report ("html" or "json")
test.get_report("html")

Slide 23

Slide 23 text

Pywhip

Slide 24

Slide 24 text

Pywhip

Slide 25

Slide 25 text

Pywhip

Slide 26

Slide 26 text

Conclusion
Human and machine-readable syntax to express specifications for data
Not specific to Darwin Core (but we plan to use it for that)
Can be adopted by users (expectations) and publishers (what to expect)
Can be included with dataset as testable metadata
Pywhip: first implementation for testing whip specifications

Slide 27

Slide 27 text

github.com/inbo/whip
github.com/inbo/pywhip
bit.ly/pywhip_binder
Thank you!
Data
Specifications