Slide 1
Slide 1 text
Collecting quantitative metadata by counting all specimens in a herbarium Peter Desmet
Slide 2
Slide 2 text
Quantitative metadata are cool! A very colourful presentation by @peterdesmet #tdwg
Slide 3
Slide 3 text
Index Herbariorum 350,000,000 herbarium specimens worldwide
Slide 4
Slide 4 text
25,000,000 digitized and published (= 7%) GBIF Data Portal (Andrea Hahn)
Slide 5
Slide 5 text
What do we know about the other 93% ?
Slide 6
Slide 6 text
Descriptive metadata
Slide 7
Slide 7 text
Metadata registries bit.ly/IH-herbaria biocol.org
Slide 8
Slide 8 text
Collection name + code Address Staff Subcollections
Slide 9
Slide 9 text
Estimated size Based on what? Actually counted?
Slide 10
Slide 10 text
Geographic scope Pretty well described How distributed?
Slide 11
Slide 11 text
Taxonomic scope Vascular plants + Bryophytes? Families? Genera?
Slide 12
Slide 12 text
Can we get some real numbers?
Slide 13
Slide 13 text
Vascular plants specimens are organized in Folders
Slide 14
Slide 14 text
No content
Slide 15
Slide 15 text
No content
Slide 16
Slide 16 text
No content
Slide 17
Slide 17 text
No content
Slide 18
Slide 18 text
What if we counted the folders?
Slide 19
Slide 19 text
And the # of specimens per folder?
Slide 20
Slide 20 text
? $ How much would it cost?
Slide 21
Slide 21 text
? days How long would it take?
Slide 22
Slide 22 text
What we did at the Marie-Victorin Herbarium (MT)
Slide 23
Slide 23 text
Move an estimated 900,000 specimens
Slide 24
Slide 24 text
More space Reassign 350 -> 640 cases
Slide 25
Slide 25 text
New classification Flowering plants: APG III (2009) Ferns: Smith et al. (2006)
Slide 26
Slide 26 text
Counting Digitizing Data cleaning Publishing
Slide 27
Slide 27 text
Counting
Slide 28
Slide 28 text
No content
Slide 29
Slide 29 text
Average age > 60
Slide 30
Slide 30 text
No content
Slide 31
Slide 31 text
1 summer
Slide 32
Slide 32 text
826 work hours 110 work days, 22 work weeks
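A quick check of the conversion from hours to days and weeks. This is a sketch that assumes the 7.5 h work day and 5-day work week stated later in the deck (slide 71).

```python
# Convert the counting effort from hours into work days and work weeks.
# Assumption: 7.5 h per day and 5 days per week, as on slide 71.
HOURS_PER_DAY = 7.5
DAYS_PER_WEEK = 5

counting_hours = 826
work_days = counting_hours / HOURS_PER_DAY
work_weeks = work_days / DAYS_PER_WEEK

print(f"{work_days:.0f} work days, {work_weeks:.0f} work weeks")
```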
Slide 33
Slide 33 text
Digitizing
Slide 34
Slide 34 text
4 volunteers
Slide 35
Slide 35 text
Paper -> Excel
Slide 36
Slide 36 text
Data cleaning
Slide 37
Slide 37 text
2 volunteers 1 professor 1 informatician
Slide 38
Slide 38 text
Correcting errors Typos, missed genera, dubious counts
Slide 39
Slide 39 text
New classification Assigning families, correcting genera
Slide 40
Slide 40 text
Format data
Slide 41
Slide 41 text
Publishing
Slide 42
Slide 42 text
1 informatician (me)
Slide 43
Slide 43 text
Google Fusion Tables bit.ly/mt-inventory-gk
Slide 44
Slide 44 text
Darwin Core Archive via IPT bit.ly/mt-inventory
Slide 45
Slide 45 text
Metadata = EML Descriptive metadata
Slide 46
Slide 46 text
Occurrence dataset basisOfRecord = PreservedSpecimen
Slide 47
Slide 47 text
1 record 1 folder 1 genus 1 location in 1 tray
Slide 48
Slide 48 text
# specimens individualCount
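A minimal sketch of what one folder-level record could look like as a row in an occurrence table. `basisOfRecord`, `family`, `genus`, `continent`, `country` and `individualCount` are Darwin Core terms; the `tray` column name and every value are invented for illustration and are not taken from the published MT dataset.

```python
import csv
import io

# One record per folder. "tray" is a hypothetical column name and all
# values are made up; only the Darwin Core term names are real.
records = [
    {
        "basisOfRecord": "PreservedSpecimen",
        "family": "Rosaceae",
        "genus": "Rubus",
        "continent": "North America",
        "country": "Canada",
        "tray": "X001-01",
        "individualCount": 12,  # number of specimens in the folder
    },
]

# Write the records as CSV text, the kind of flat table a
# Darwin Core Archive data file contains.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
writer.writeheader()
writer.writerows(records)
print(buffer.getvalue())
```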
Slide 49
Slide 49 text
What do we know now?
Slide 50
Slide 50 text
22,298 folders
Slide 51
Slide 51 text
628,664 specimens
Slide 52
Slide 52 text
2/3 of previous estimate
Slide 53
Slide 53 text
21.5% digitized
Slide 54
Slide 54 text
380 families
Slide 55
Slide 55 text
82% of known families
Slide 56
Slide 56 text
5,298 genera
Slide 57
Slide 57 text
6 continents
Slide 58
Slide 58 text
Combinations Rubus specimens from Canada? Yes: 2921, in trays A236-07 – A238-04
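A combination query like this reduces to a filter plus a sum over folder records. The sketch below uses a handful of invented records (not the real inventory) and a hypothetical `summarize` helper.

```python
# Invented folder records for illustration; field names other than
# genus, country and individualCount are hypothetical.
folders = [
    {"genus": "Rubus", "country": "Canada", "tray": "T01-01", "individualCount": 40},
    {"genus": "Rubus", "country": "Canada", "tray": "T01-02", "individualCount": 25},
    {"genus": "Rubus", "country": "Mexico", "tray": "T01-03", "individualCount": 7},
    {"genus": "Carex", "country": "Canada", "tray": "T02-01", "individualCount": 30},
]

def summarize(folders, genus, country):
    """Return the specimen total and sorted tray list for one genus and country."""
    matches = [f for f in folders if f["genus"] == genus and f["country"] == country]
    total = sum(f["individualCount"] for f in matches)
    trays = sorted(f["tray"] for f in matches)
    return total, trays

total, trays = summarize(folders, "Rubus", "Canada")
print(f"{total} specimens, in trays {trays[0]} to {trays[-1]}")
```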
Slide 59
Slide 59 text
Useful for us In-house management & planning Digitization priorities
Slide 60
Slide 60 text
Useful for others? Loans Demand driven digitization?
Slide 61
Slide 61 text
Granularity Genus, continent -> Useful for climate change & invasive species studies?
Slide 62
Slide 62 text
Global picture Really 350 mil. specimens? How distributed over genus & continent?
Slide 63
Slide 63 text
Cost / Time ?
Slide 64
Slide 64 text
158 work days Publishing 1% Data cleaning 21% Digitizing 8% Counting 70%
Slide 65
Slide 65 text
5,740 $ total salary cost Publishing 7% Digitizing 0% Counting 37% Data cleaning 56%
Slide 66
Slide 66 text
110 specimens = 1$ 100 times cheaper than full digitization
Slide 67
Slide 67 text
3,200,000 $ All 350 mil. specimens
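The figures on the last two slides follow from the specimen count and the salary cost. A sketch of the arithmetic, assuming the worldwide cost is a straight linear extrapolation from the MT numbers:

```python
# Figures from the slides.
specimens_counted = 628_664
salary_cost = 5_740               # $ total salary cost
worldwide_specimens = 350_000_000

# Specimens inventoried per dollar of salary.
specimens_per_dollar = specimens_counted / salary_cost

# Linear extrapolation to all herbarium specimens worldwide (an assumption).
worldwide_cost = worldwide_specimens / specimens_per_dollar

print(f"{specimens_per_dollar:.0f} specimens per $")
print(f"about {worldwide_cost:,.0f} $ worldwide")
```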
Slide 68
Slide 68 text
Staff 138 h, 5,740 $ Volunteers 1049 h, 0 $ 88% by volunteers
Slide 69
Slide 69 text
16,230 $ 10$ wage for “volunteers” + staff salary
Slide 70
Slide 70 text
9,000,000 $ All 350 mil. specimens
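The paid-volunteer scenario can be reproduced the same way. A sketch, again assuming a linear extrapolation from the MT figures:

```python
# Figures from the slides.
volunteer_hours = 1_049
volunteer_wage = 10               # $ per hour, the hypothetical wage on slide 69
staff_salary = 5_740              # $
specimens_counted = 628_664
worldwide_specimens = 350_000_000

# Total cost if volunteers had been paid.
total_cost = volunteer_hours * volunteer_wage + staff_salary

# Scale by the ratio of worldwide specimens to specimens counted (an assumption).
worldwide_cost = total_cost * worldwide_specimens / specimens_counted

print(f"{total_cost:,} $ at MT")
print(f"about {worldwide_cost:,.0f} $ worldwide")
```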
Slide 71
Slide 71 text
340 years 1 person at 7.5h/day, 5 days/week, no holidays
Slide 72
Slide 72 text
26 days One person per herbarium 3,400 herbaria - Index Herbariorum
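Both time estimates come from scaling the total hours worked at MT. A sketch, assuming a linear extrapolation and a year of 52 five-day weeks with no holidays:

```python
# Figures from the slides.
staff_hours = 138
volunteer_hours = 1_049
specimens_counted = 628_664
worldwide_specimens = 350_000_000
herbaria = 3_400

HOURS_PER_DAY = 7.5
DAYS_PER_YEAR = 5 * 52            # 5 days/week, no holidays

# Hours needed worldwide at the MT rate (linear extrapolation, an assumption).
total_hours = staff_hours + volunteer_hours
worldwide_hours = total_hours * worldwide_specimens / specimens_counted
worldwide_days = worldwide_hours / HOURS_PER_DAY

years_one_person = worldwide_days / DAYS_PER_YEAR
days_per_herbarium = worldwide_days / herbaria

print(f"{years_one_person:.0f} years for one person")
print(f"{days_per_herbarium:.0f} days with one person per herbarium")
```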
Slide 73
Slide 73 text
?! Tricky to extrapolate! What about non-mounted specimens? How useful is this data? Is there a metadata repository?
Slide 74
Slide 74 text
First step Towards some real numbers
Slide 75
Slide 75 text
Thanks! bit.ly/mt-inventory Peter Desmet