Whip: Communicate and test what to expect from data

Peter Desmet
August 28, 2018

Talk at the TDWG 2018 annual conference in Dunedin, New Zealand - August 28, 2018.

Abstract: https://doi.org/10.3897/biss.2.25317

Transcript

  1. Whip
    Communicate and test what to expect from data
    Stijn Van Hoey & Peter Desmet

  2. Expectations
    Data
    Users

  3. Expectations
    Data
    Users
    Fit for my research?
    Fit for specific user community?

  4. We are a data publisher

  5. We care

  6. What to expect
    Data Publisher

  7. What to expect
    Data Publisher
    Data quality
    Standardization
    Community recommendations
    Dataset characteristics

  8. Expectations / What to expect
    Data Publisher
    Users
    Expectations What to expect

  9. How to communicate expectations?
    Data Publisher
    Users
    Expectations What to expect

  10. How to test expectations?
    Data Publisher
    Users
    Expectations What to expect

  11. Whip

  12. Whip syntax

  13. Whip syntax

  14. Whip syntax
    Field

  15. Whip syntax
    Field
    Specification

  16. Whip syntax
    Comment
    Field
    Specification

  17. Whip syntax
    Comment
    Field
    Specification
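
A rough illustration of the three syntax elements named above (comment, field, specification). It writes a small specification in a YAML-style layout and shows the nested mapping that text corresponds to in Python. The field names, values, and the `describe` helper are hypothetical examples, not taken from the talk.

```python
# Hypothetical whip specification, written as YAML-style text.
# Each top-level key is a field (a column in the data);
# each nested key is a specification for that field.
WHIP_TEXT = """
# Comment: country codes we expect in this dataset
country:
  allowed: [BE, NL]
# Comment: counts between 1 and 100
individualCount:
  min: 1
  max: 100
"""

# The same content as the nested mapping a YAML parser would return.
specifications = {
    "country": {"allowed": ["BE", "NL"]},
    "individualCount": {"min": 1, "max": 100},
}


def describe(specs):
    """Return one readable line per field/specification pair."""
    lines = []
    for field, field_specs in specs.items():
        for name, value in field_specs.items():
            lines.append(f"{field}: {name} = {value!r}")
    return lines


for line in describe(specifications):
    print(line)
```

Because the specification is plain nested data, the same text can serve as documentation for people and as input for a program.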

  18. Whip specifications
    allowed
    minlength / maxlength
    stringformat
    regex
    min / max
    numberformat
    mindate / maxdate
    dateformat
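
To make the list above concrete, here is a minimal sketch of how a subset of these specifications could be tested against a single string value. It is not the pywhip implementation: `check_value` is a hypothetical helper, `stringformat` and `numberformat` are left out, `regex` is assumed to match the whole value, and `dateformat` is assumed to be a single `strptime` pattern.

```python
import re
from datetime import date, datetime


def check_value(value, specs):
    """Return the names of the specifications that the string `value` fails."""
    failed = []

    if "allowed" in specs and value not in specs["allowed"]:
        failed.append("allowed")
    if "minlength" in specs and len(value) < specs["minlength"]:
        failed.append("minlength")
    if "maxlength" in specs and len(value) > specs["maxlength"]:
        failed.append("maxlength")
    # Assumption: the whole value has to match the pattern.
    if "regex" in specs and re.fullmatch(specs["regex"], value) is None:
        failed.append("regex")

    # min / max: compare numerically; a non-numeric value fails both.
    bounds = [name for name in ("min", "max") if name in specs]
    if bounds:
        try:
            number = float(value)
        except ValueError:
            failed.extend(bounds)
        else:
            if "min" in specs and number < specs["min"]:
                failed.append("min")
            if "max" in specs and number > specs["max"]:
                failed.append("max")

    # dateformat / mindate / maxdate: parse with one strptime pattern.
    if "dateformat" in specs:
        try:
            parsed = datetime.strptime(value, specs["dateformat"]).date()
        except ValueError:
            failed.append("dateformat")
        else:
            if "mindate" in specs and parsed < specs["mindate"]:
                failed.append("mindate")
            if "maxdate" in specs and parsed > specs["maxdate"]:
                failed.append("maxdate")

    return failed


print(check_value("FR", {"allowed": ["BE", "NL"], "maxlength": 2}))
print(check_value("150", {"min": 1, "max": 100}))
print(check_value("2018-08-28",
                  {"dateformat": "%Y-%m-%d", "maxdate": date(2014, 10, 20)}))
```

Returning the names of the failed specifications, rather than a single boolean, keeps enough detail to build a per-field report.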

  19. Whip scope specifications
    empty
    delimitedvalues
    if
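
The three scope specifications change when and how the other specifications are applied. The sketch below shows one possible reading of them; it is not pywhip's behavior. `passes` and `check_field` are hypothetical helpers, only `allowed` is supported as the inner check, the layout of the `if` and `delimitedvalues` entries is simplified, and empty values are assumed to fail unless `empty` is true.

```python
def passes(value, specs):
    """Tiny stand-in for a value check: supports only `allowed`."""
    return "allowed" not in specs or value in specs["allowed"]


def check_field(row, field, specs):
    """Check row[field] against specs, honoring the three scope keys."""
    value = row.get(field, "")

    # if: apply the remaining specifications only when the conditions
    # on other fields of the same row hold.
    for other_field, condition in specs.get("if", {}).items():
        if not passes(row.get(other_field, ""), condition):
            return True  # condition not met, nothing to test

    # empty: an empty value passes when empty is true, fails otherwise.
    if value == "":
        return bool(specs.get("empty", False))

    # delimitedvalues: split the value and test each part separately.
    if "delimitedvalues" in specs:
        inner = specs["delimitedvalues"]
        parts = value.split(inner["delimiter"])
        return all(passes(part, inner) for part in parts)

    return passes(value, specs)


sex_specs = {"delimitedvalues": {"delimiter": " | ",
                                 "allowed": ["male", "female"]}}
stage_specs = {"if": {"basisOfRecord": {"allowed": ["HumanObservation"]}},
               "allowed": ["adult", "juvenile"]}

print(check_field({"sex": "male | female"}, "sex", sex_specs))
print(check_field({"basisOfRecord": "HumanObservation", "lifeStage": "egg"},
                  "lifeStage", stage_specs))
```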

  20. Using whip to document

  21. Pywhip: a whip implementation

  22. Pywhip
    import yaml
    from pywhip import whip_csv
    # load specifications
    with open("my_specifications.yml") as spec_file:
        specifications = yaml.safe_load(spec_file)
    # test specifications
    test = whip_csv("my_data.csv", specifications)
    # get report as "html" (or "json")
    test.get_report("html")

  23. Pywhip

  24. Pywhip

  25. Pywhip

  26. Conclusion
    Human- and machine-readable syntax to express specifications for data
    Not specific to Darwin Core (but we plan to use it for that)
    Can be adopted by users (expectations) and publishers (what to expect)
    Can be included with dataset as testable metadata
    Pywhip: first implementation for testing whip specifications

  27. github.com/inbo/whip
    github.com/inbo/pywhip
    bit.ly/pywhip_binder
    Thank you!
    Data Specifications
