human-computer collaboration
mauricio giraldo arteaga
@mgiraldo
@nypl_labs
IPAM Culture Analytics and User Experience Design, April 2016
Slide 3
Slide 3 text
hello
Slide 4
Slide 4 text
not a real library scientist
Slide 5
Slide 5 text
flickr.com/photos/wallyg/6133216510
Slide 6
Slide 6 text
No content
Slide 7
Slide 7 text
No content
Slide 8
Slide 8 text
Eric Shows
Slide 9
Slide 9 text
NYPL Labs
Slide 10
Slide 10 text
access
digitization metadata
public
traditional digital library program
Slide 11
Slide 11 text
access
digitization metadata
public
engagement
r+d
Slide 12
Slide 12 text
No content
Slide 13
Slide 13 text
No content
Slide 14
Slide 14 text
what happens after digitization?
Slide 15
Slide 15 text
human-computer collaboration
Slide 16
Slide 16 text
¿ ?
Slide 17
Slide 17 text
embrace imperfection
corollary of “perfect is the enemy of good”
Slide 18
Slide 18 text
« A designer’s definition of ‘perfect’ is different for computational designers. »
because it is not achievable
John Maeda
Slide 19
Slide 19 text
human-computer collaboration
Slide 20
Slide 20 text
computers are good at some things…
Slide 21
Slide 21 text
Randall Munroe - xkcd.com/1140
Slide 22
Slide 22 text
David Hagen - drhagen.com/blog/the-missing-11th-of-the-month
Slide 23
Slide 23 text
people overestimate OCR quality
Slide 24
Slide 24 text
OCR result
Slide 25
Slide 25 text
okay… so maybe computers are not that good
Slide 26
Slide 26 text
people are good at other things
Slide 27
Slide 27 text
human-computer collaboration
i avoid the term “crowdsourcing”
Slide 28
Slide 28 text
two examples
Slide 29
Slide 29 text
No content
Slide 30
Slide 30 text
No content
Slide 31
Slide 31 text
No content
Slide 32
Slide 32 text
No content
Slide 33
Slide 33 text
footprint
material
use type
street names
address
floors
name
class
geo location
year
skylights
backyards
Slide 34
Slide 34 text
like Google Maps for the 19th century
but Google Maps cannot answer questions about the 19th century
Slide 35
Slide 35 text
No content
Slide 36
Slide 36 text
No content
Slide 37
Slide 37 text
No content
Slide 38
Slide 38 text
*this is a simulation. actual process is intensive. consult your mathematician before trying
Slide 39
Slide 39 text
No content
Slide 40
Slide 40 text
No content
Slide 41
Slide 41 text
and now you start tracing those buildings by hand
(╯°□°)╯︵ ┻━┻
Slide 42
Slide 42 text
No content
Slide 43
Slide 43 text
1852-1854
Slide 44
Slide 44 text
1852-1854
Slide 45
Slide 45 text
can we automate this?
Slide 46
Slide 46 text
computers are good at some things…
Slide 47
Slide 47 text
No content
Slide 48
Slide 48 text
No content
Slide 49
Slide 49 text
No content
Slide 50
Slide 50 text
No content
Slide 51
Slide 51 text
No content
Slide 52
Slide 52 text
No content
Slide 53
Slide 53 text
No content
Slide 54
Slide 54 text
yay footprints!
60,000+ of those!
Slide 55
Slide 55 text
like OCR for maps!™
(not really trademarked)
Slide 56
Slide 56 text
but OCR is pretty bad
ಠ_ಠ
Slide 57
Slide 57 text
No content
Slide 58
Slide 58 text
people are good at other things!
Slide 59
Slide 59 text
No content
Slide 60
Slide 60 text
No content
Slide 61
Slide 61 text
people don’t choose to complete these
Slide 62
Slide 62 text
we have over 60,000 footprints to check!
will people want to do this?
Slide 63
Slide 63 text
what is the minimum contribution we need?
we want the lowest friction possible so people will want to contribute
Slide 64
Slide 64 text
this was 2013, touch-screen mobile had taken off
Slide 65
Slide 65 text
No content
Slide 66
Slide 66 text
what about malicious users?
or even well-meaning ones who make mistakes
Slide 67
Slide 67 text
No content
Slide 68
Slide 68 text
75% or more agreement between 3 or more people
arbitrary numbers that have worked for us
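a minimal sketch of how a rule like this could be checked, assuming each footprint collects a simple list of answers. the function name, the vote labels, and the data shape are illustrative assumptions, not the actual Building Inspector code; only the two thresholds come from the slide.

```python
from collections import Counter

# Thresholds taken from the slide; described there as arbitrary numbers.
MIN_VOTES = 3          # "3 or more people"
MIN_AGREEMENT = 0.75   # "75% or more agreement"


def consensus(votes, min_votes=MIN_VOTES, min_agreement=MIN_AGREEMENT):
    """Return the agreed answer, or None if there is no consensus yet.

    `votes` is a hypothetical list of answers for one footprint,
    for example ["yes", "yes", "fix", "yes"].
    """
    if len(votes) < min_votes:
        return None  # not enough people have looked at this footprint
    answer, count = Counter(votes).most_common(1)[0]
    if count / len(votes) >= min_agreement:
        return answer
    return None  # enough votes, but they disagree too much


if __name__ == "__main__":
    print(consensus(["yes", "yes"]))                # too few votes
    print(consensus(["yes", "yes", "no"]))          # 2 of 3 is below 75%
    print(consensus(["yes", "yes", "yes", "no"]))   # 3 of 4 is exactly 75%
```

with these thresholds a single dissenting vote among three blocks consensus, so one mistaken or malicious answer delays a footprint rather than deciding it.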
Slide 69
Slide 69 text
No content
Slide 70
Slide 70 text
YES is on the right side because most people are right-handed and the algorithm is right most of the time
Slide 71
Slide 71 text
Building Inspector
buildinginspector.nypl.org
Slide 72
Slide 72 text
will people participate?
remember that little tweet button?
Slide 73
Slide 73 text
No content
Slide 74
Slide 74 text
No content
Slide 75
Slide 75 text
No content
Slide 76
Slide 76 text
footprint
material
use type
street names
address
floors
name
class
geo location
year
skylights
backyards
Slide 77
Slide 77 text
No content
Slide 78
Slide 78 text
No content
Slide 79
Slide 79 text
check
YES FIX
address color fix
*footprints marked as “NO” go to polygon heaven
Slide 80
Slide 80 text
address
had to use full keyboard on mobile because fractions
Slide 81
Slide 81 text
classify
Slide 82
Slide 82 text
fix
Slide 83
Slide 83 text
place names
Slide 84
Slide 84 text
No content
Slide 85
Slide 85 text
we add new maps as old ones are completed
the bottleneck now became geo-rectifying those maps ¯\_(ツ)_/¯
Slide 86
Slide 86 text
this is actually version 2
Slide 87
Slide 87 text
(the magic of git)
Slide 88
Slide 88 text
No content
Slide 89
Slide 89 text
No content
Slide 90
Slide 90 text
No content
Slide 91
Slide 91 text
good tutorials are hard
Slide 92
Slide 92 text
No content
Slide 93
Slide 93 text
Super Mario Bros. (Nintendo, 1985)
Slide 94
Slide 94 text
we have too many edge cases
or: how i learned to stop worrying and embrace imperfection
Slide 95
Slide 95 text
No content
Slide 96
Slide 96 text
¯\_(ツ)_/¯
people skip them anyway
Slide 97
Slide 97 text
No content
Slide 98
Slide 98 text
coming soon
Slide 99
Slide 99 text
No content
Slide 100
Slide 100 text
No content
Slide 101
Slide 101 text
No content
Slide 102
Slide 102 text
No content
Slide 103
Slide 103 text
NYPL Community Oral History Project
oralhistory.nypl.org
Slide 104
Slide 104 text
No content
Slide 105
Slide 105 text
No content
Slide 106
Slide 106 text
make these stories more accessible
Slide 107
Slide 107 text
No content
Slide 108
Slide 108 text
mark
transcribe
Slide 109
Slide 109 text
by brian foo @beefoo
Slide 110
Slide 110 text
allows for basic text search
but it’s not a proper transcript
Slide 111
Slide 111 text
No content
Slide 112
Slide 112 text
No content
Slide 113
Slide 113 text
we felt we needed something different
Slide 114
Slide 114 text
No content
Slide 115
Slide 115 text
computers are good at some things…
Slide 116
Slide 116 text
like OCR for audio!™
(not sure if they trademarked that)
Slide 117
Slide 117 text
we get transcription “snippets”
from 1 to about 6 seconds long in varying levels of quality
Slide 118
Slide 118 text
No content
Slide 119
Slide 119 text
people are good at other things…
Slide 120
Slide 120 text
No content
Slide 121
Slide 121 text
No content
Slide 122
Slide 122 text
by brian foo @beefoo
Slide 123
Slide 123 text
we conducted a few usability studies
Slide 124
Slide 124 text
by brian foo @beefoo
Slide 125
Slide 125 text
No content
Slide 126
Slide 126 text
No content
Slide 127
Slide 127 text
it’s hard to reach consensus
ಠ_ಠ
Slide 128
Slide 128 text
embrace imperfection
Slide 129
Slide 129 text
transcribe.oralhistory.nypl.org
Slide 130
Slide 130 text
transcribe.oralhistory.nypl.org
Slide 131
Slide 131 text
made with customizability in mind
Slide 132
Slide 132 text
storyscribe.themoth.org
Slide 133
Slide 133 text
this is one week after launch
Slide 134
Slide 134 text
it is still being improved
Slide 135
Slide 135 text
two of several projects we’ve worked on so far
Slide 136
Slide 136 text
of human-computer collaboration
Slide 137
Slide 137 text
it’s a collaborative process
Willa Armstrong, Shawn Averkamp, Paul Beaudoin, Brian Foo,
Josh Hadro, Elizabeth Hummer, Ara Kim, Shana Kimball, Tom Listanti,
Matthew Miller, Eric Shows, Bert Spaan, and more at NYPL…
Slide 138
Slide 138 text
one more thing…
Slide 139
Slide 139 text
No content
Slide 140
Slide 140 text
No content
Slide 141
Slide 141 text
No content
Slide 142
Slide 142 text
No content
Slide 143
Slide 143 text
lala.cursivebuildings.com
Slide 144
Slide 144 text
how to decode the 3D data?
in the browser
Slide 145
Slide 145 text
No content
Slide 146
Slide 146 text
No content
Slide 147
Slide 147 text
No content
Slide 148
Slide 148 text
No content
Slide 149
Slide 149 text
No content
Slide 150
Slide 150 text
stereo.nypl.org
Slide 151
Slide 151 text
No content
Slide 152
Slide 152 text
No content
Slide 153
Slide 153 text
No content
Slide 154
Slide 154 text
No content
Slide 155
Slide 155 text
No content
Slide 156
Slide 156 text
Boston Public Library, U.S. Geological Survey
Slide 157
Slide 157 text
thank you!
mauricio giraldo arteaga
@mgiraldo
@nypl_labs
IPAM Culture Analytics and User Experience Design, April 2016