
Data Lies and Data Performances

Marc Alexander
March 05, 2015

Presented at the DAH PhD Workshop on Visualization, 2015


Transcript

  1. DATA LIES AND DATA PERFORMANCES

    Data Visualization for the Arts and Humanities. Digital Arts and Humanities, Queen’s University Belfast. Marc Alexander [email protected] @marcgalexander
  2. WHY VISUALIZE?

    ‣ Contextualisation
    ‣ Explorability
    ‣ Efficiency
    ‣ Graspability
  3. HUMAN SCALE

    A situation with ‘direct perception and action in familiar frames that are easily apprehended by human beings: An object falls, someone lifts an object, two people converse, one person goes somewhere. They typically have very few participants, direct intentionality, and immediate bodily effect and are immediately apprehended as coherent’ (p.312). In short, the human scale ‘is the level at which it is natural for us to have the impression that we have direct, reliable, and comprehensive understanding’ (p.323).

    Fauconnier, Gilles and Mark Turner. 2002. The Way We Think: Conceptual Blending and the Mind’s Hidden Complexities. New York: Basic Books.
  4. 6. WHY: CROSS-REFERENCE MEANINGS

    Marc Alexander, @marcgalexander, University of Glasgow, UK
    Data Deluge, Brett Ryder, The Economist, Feb. 2010
  5. “A new conjunction of scientist, curator, humanist, and artist is

    what the digital humanities must strive to achieve. It is the only way of ensuring that we do not lose our souls in a world of data.” Prescott, Andrew. 2012. An Electric Current of the Imagination: What the Digital Humanities Are and What They Might Become. Journal of Digital Humanities 1(2).
  6. OUTLINE

    ‣ Performance and production
    ‣ Dimensions and their reduction
    ‣ The particular problem of text
    ‣ Confessions
    ‣ Envoi
  7. 1. PERFORMANCE AND PRODUCTION
  8. PRODUCTION

    ‣ Choose your text
    ‣ Decide what you need to highlight in the text
    ‣ Create a budget, discover your resources, and work to your limitations
    ‣ Workshop your ideas
    ‣ Decide how you display the text
    ‣ Respect detail
    ‣ Produce your performance
    ‣ Evaluate it
  9. CHORUS

    ‣ Remember it is possible to produce a wrong (unfaithful) adaptation of a text
    ‣ If a text is worth producing, it must always be reduced in performance
    ‣ Don't just have a hammer
    ‣ Don't lose the audience
    ‣ Never mislead the audience
    ‣ It is always our fault if we are not understood.
  10. EDWARD TUFTE’S PRINCIPLES

    ‣ Graphical excellence consists of complex ideas communicated with clarity, precision, and efficiency.
    ‣ Graphical excellence is what gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space.
    ‣ Graphical excellence requires telling the truth about data.
    ‣ Embrace multivariate data, and focus on what makes it coherent.
    ‣ Encourage users to compare and contrast different pieces of data.
    ‣ Show causality, mechanism, explanations, systematic structure.
    ‣ Induce the viewer to think about the substance of your data rather than about your methodology, design, technology, etc.

    The Visual Display of Quantitative Information, p. 51
  11. Minard, Charles Joseph (1869) Carte figurative des pertes successives en


    hommes de l'Armée Française dans la campagne de Russie 1812-1813. Paris: [s.n.].
  12. REDUCING DIMENSIONS

    ‣ You will often need to reduce the dimensions in your data to the number you can display on screen (usually three or four)
    ‣ E.g. by ignoring some of them
    ‣ Or statistically (by performing MDS/PCA)
    ‣ One visual dimension each for: Position (x), Position (y), Position (z), Shape, Size, and Colour
    ‣ Can also change across time – dynamically and of viewing
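
The statistical route named on this slide can be sketched as follows. This is a minimal, dependency-free illustration of PCA (power iteration on the covariance matrix, with one deflation step), not the method used in the talk; the function names and the five four-dimensional sample rows are hypothetical.

```python
import math
import random


def center(rows):
    """Subtract each column's mean so every dimension is centred on zero."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(rows):
    """Sample covariance matrix of already-centred rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def mat_vec(matrix, vector):
    """Multiply a square matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def top_component(matrix, iterations=500, seed=0):
    """Dominant eigenvector of a symmetric matrix, found by power iteration."""
    rng = random.Random(seed)
    v = [rng.random() for _ in matrix]
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def pca_2d(rows):
    """Project rows onto their first two principal components."""
    centred = center(rows)
    cov = covariance(centred)
    first = top_component(cov)
    eigenvalue = sum(a * b for a, b in zip(first, mat_vec(cov, first)))
    # Deflation: remove the first component so the next-largest one dominates.
    size = len(cov)
    deflated = [[cov[i][j] - eigenvalue * first[i] * first[j]
                 for j in range(size)] for i in range(size)]
    second = top_component(deflated)
    return [(sum(a * b for a, b in zip(row, first)),
             sum(a * b for a, b in zip(row, second))) for row in centred]


# Hypothetical data: five items, each measured on four dimensions.
samples = [
    [2.0, 1.0, 0.5, 3.0],
    [1.0, 2.0, 1.5, 1.0],
    [3.0, 0.5, 0.2, 4.0],
    [0.5, 3.0, 2.5, 0.5],
    [2.5, 1.5, 1.0, 2.0],
]
points = pca_2d(samples)  # five (x, y) pairs, ready for a scatter plot
```

The two output coordinates can then be mapped to Position (x) and Position (y), leaving Shape, Size, and Colour free for other dimensions. In practice a tested library routine would be the safer choice; the hand-rolled version is only meant to show what the reduction does.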
  13. PHYSICAL DIMENSIONS: A WARNING
  14. DIMENSIONS AND ‘DATA LIES’

    “There are considerable ambiguities in how people perceive a two-dimensional surface and convert that perception into a one-dimensional number. Changes in physical area… do not reliably produce proportional changes in perceived areas. These designs cause so many problems that they should be avoided.”

    Tufte, Edward R (2001) The Visual Display of Quantitative Information, 2nd edition. Cheshire, CT: Graphics Press.
    See also Macdonald-Ross, Michael (1977) How Numbers Are Shown: A Review of Research on the Presentation of Quantitative Data in Texts. Educational Technology Research and Development, 25(4) 359-409.
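
The geometric half of this warning can be made concrete with the "lie factor" that Tufte defines in the same book: the size of the effect shown in the graphic divided by the size of the effect in the data. The sketch below is illustrative only; the function names and the doubling example are hypothetical.

```python
import math


def circle_area(radius):
    """Area of a circle drawn with the given radius."""
    return math.pi * radius ** 2


def radius_for_area(area):
    """Radius of the circle whose area equals `area`."""
    return math.sqrt(area / math.pi)


def lie_factor(data_from, data_to, shown_from, shown_to):
    """Relative change shown in the graphic divided by relative change in the data."""
    data_effect = (data_to - data_from) / data_from
    shown_effect = (shown_to - shown_from) / shown_from
    return shown_effect / data_effect


# Hypothetical data: a quantity that doubles from 1 to 2.
# Naive encoding: the radius doubles, so the drawn area quadruples.
naive = lie_factor(1, 2, circle_area(1.0), circle_area(2.0))

# Area-proportional encoding: choose radii so that the drawn area doubles.
honest = lie_factor(1, 2,
                    circle_area(radius_for_area(1.0)),
                    circle_area(radius_for_area(2.0)))
```

Scaling the radius gives a lie factor of about 3, while scaling the area gives about 1. Even the second encoding only fixes the geometry: the quotation's point is that perceived area still does not track physical area reliably.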
  15. DIMENSIONS AND ‘DATA LIES’
  16. DIMENSIONS AND ‘DATA LIES’
  17. DIMENSIONS AND ‘DATA LIES’
  18. QUESTIONS TO ASK

    ‣ How many data dimensions can I show?
    ‣ Am I aiming for a cross-data comparison, a demonstration of relationships, an illustration of parts compared to the whole, a geographical or time-based view, etc.?
    ‣ Which visual dimensions do I want to use (Position (x), Position (y), Position (z), Shape, Size, Time, and Colour)?
    ‣ Which visual dimensions can I use?
  19. DISTANCE AS A CONDITION

    Moretti, Franco. 2003. Graphs, Maps, Trees: Abstract Models for Literary History 1. New Left Review 24: 67-93.
    Moretti, Franco. 2005. Graphs, Maps, Trees. London: Verso.
    Moretti, Franco. 2013. Distant Reading. London: Verso.
  20. […] ambition is now directly proportional to the distance from

    the text: the more ambitious the project, the greater must the distance be. […] if you want to look beyond the canon (and of course, world literature will do so: it would be absurd if it didn’t!) close reading will not do it. It’s not designed to do it, it’s designed to do the opposite. At bottom, it’s a theological exercise—very solemn treatment of very few texts taken very seriously—whereas what we really need is a little pact with the devil: we know how to read texts, now let’s learn how not to read them.
  21. Distant reading: where distance, let me repeat it, is a

    condition of knowledge: it allows you to focus on units that are much smaller or much larger than the text: devices, themes, tropes—or genres and systems. And if, between the very small and the very large, the text itself disappears, well, it is one of those cases when one can justifiably say, less is more. If we want to understand the system in its entirety, we must accept losing something. We always pay a price for theoretical knowledge: reality is infinitely rich; concepts are abstract, are poor. But it’s precisely this ‘poverty’ that makes it possible to handle them, and therefore to know. This is why less is actually more. Moretti, Franco. 2000. Conjectures on World Literature. New Left Review 1. http://www.newleftreview.org/A2094
  22. 3. THE PARTICULAR PROBLEM OF TEXT
  23. TEXT AND VISUALS

    • As the volume of text increases, so does interference (“noise”)
    • Text is the biggest challenge of big data in any field
    • Data in the sciences is often highly ordered; in the humanities it uses text, and so is rarely ordered well
    • Real language in use is chaotic, contradictory, complex, and counterintuitive (as are people)
  24. “Words, words. They’re all we have to go on.” Stoppard,

    Tom. 1967. Rosencrantz and Guildenstern are Dead.
  25. Historical Thesaurus entries for “strike”

    01.03.01.05.02|03.05 n   Health and disease :: Disorders of cattle/horse/sheep :: disorders of cattle/sheep :: other disorders   strike (1933–)
    01.03.03.04.14|02 vt   Make healthy :: Practise physiotherapy :: rub/stroke with hands   strike (1400 + 1611 + 1886 dial.)
    01.02.04.04.03|07 vt   Come by death :: Kill by specific method :: by poisoning   strike (1592–1621)
    01.06.10.08|01 vi   Plant :: Be a root :: grow (as root)   strike (1682–)
    01.05.17.05.02|04 n   Animals :: Suborder Ophidia (snakes) :: act of darting at prey   strike (1879–)
    01.10.03.03.02.03|12.06 vt   Burn/consume by fire :: kindle/set alight :: produce (fire/spark) by striking   strike (c1450– also fig.)
    01.10.09.03.03|01 vi   Dye :: sink in   strike (c1790–)
    01.13.06.01|04 vt   Time :: Clock :: strike   strike (1417–)
    03.11.04.03|05 vi   Carry on an occupation/work :: Participate in labour relations :: strike   strike (1768–)
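
The category codes on this slide have a regular shape that is easy to take apart in code. The sketch below is a hedged illustration: reading the part before `|` as the main category and the part after it as a subcategory is an assumption drawn from the slide's layout, and `CategoryCode` and `parse_code` are hypothetical names.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CategoryCode:
    levels: tuple      # main category path, e.g. (1, 3, 1, 5, 2)
    sublevels: tuple   # subcategory path after "|", e.g. (3, 5)
    pos: str           # part-of-speech tag as printed: n, vt, vi, ...


def parse_code(text):
    """Split a code such as '01.03.01.05.02|03.05 n' into its parts."""
    number, pos = text.split()
    main, _, sub = number.partition("|")
    return CategoryCode(
        levels=tuple(int(part) for part in main.split(".")),
        sublevels=tuple(int(part) for part in sub.split(".")) if sub else (),
        pos=pos,
    )


# The nine senses of "strike" listed on the slide.
strike_codes = [
    "01.03.01.05.02|03.05 n", "01.03.03.04.14|02 vt", "01.02.04.04.03|07 vt",
    "01.06.10.08|01 vi", "01.05.17.05.02|04 n", "01.10.03.03.02.03|12.06 vt",
    "01.10.09.03.03|01 vi", "01.13.06.01|04 vt", "03.11.04.03|05 vi",
]
parsed = [parse_code(code) for code in strike_codes]

# How the senses spread over the top level of the hierarchy.
by_section = Counter(code.levels[0] for code in parsed)
```

Once the codes are tuples, one word form's senses can be grouped or compared by how much of the category path they share, which is one way to put the ambiguity of a word like "strike" into numbers.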
  26. 62% of English word forms refer to more than one meaning

    Of the 793,742 entries in the Historical Thesaurus of English there are 370,011 non-Old-English word forms, of which:
    67 have more than 100 possible meanings
    464 have more than 50 possible meanings
    2,580 have more than 20 possible meanings
    7,554 have more than 10 possible meanings
    111,127 have more than 1 possible meaning
    258,883 have just 1 possible meaning
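
The counts on this slide can be turned into shares of the 370,011 word forms. This is a minimal sketch with illustrative variable names; it treats the "more than N" counts as cumulative, which is an assumption supported by the last two counts adding up to within one of the total. On these counts about 30% of word forms have more than one meaning, so the 62% headline presumably rests on a base the slide does not state.

```python
TOTAL_FORMS = 370_011  # non-Old-English word forms, from the slide

# Cumulative counts from the slide: forms with MORE than this many meanings.
more_than = {100: 67, 50: 464, 20: 2_580, 10: 7_554, 1: 111_127}
exactly_one = 258_883

# Percentage of all word forms above each threshold.
shares = {limit: round(100 * count / TOTAL_FORMS, 2)
          for limit, count in more_than.items()}

polysemous_share = shares[1]
single_share = round(100 * exactly_one / TOTAL_FORMS, 2)

# Word forms not covered by "more than 1" plus "just 1".
unaccounted = TOTAL_FORMS - (more_than[1] + exactly_one)
```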
  27. I: The External World

    01. The world
      01.01 The earth
      01.02 Life
      01.03 Health and Disease
      01.04 People
      01.05 Animals
      01.06 Plants
      01.07 Food and drink
      01.08 Textiles and clothing
      01.09 Physical sensation
      01.10 Matter
        01.10.01. Alchemy
        01.10.02. Chemistry
        01.10.03. Properties of materials
        01.10.04. Constitution of matter
        01.10.05. Liquid
        01.10.06. Gas
        01.10.07. Physics
        01.10.08. Light
        01.10.09. Colour
          01.04.09.07. Named colours
  28. EEBO-TCP Corpus (40,000 Early Modern English texts; almost all the

    books and pamphlets published in English before 1700) Hansard Corpus (2.3 billion words; approximately every word uttered in Parliament over the past two hundred years)
  29. ACCESS

    • Web demo site: http://is.gd/semtag – Quick access
    • A more convenient GUI tool for processing multiple texts – Contact me if you want a copy!
    • (Soon) Access via Web server client – Contact me if you are interested
    • (Soon) Access via WMatrix – http://ucrel.lancs.ac.uk/wmatrix
  30. Project team

    Dr Marc Alexander, University of Glasgow
    Jean Anderson, University of Glasgow
    Professor Dawn Archer, University of Central Lancashire
    Dr Alistair Baron, Lancaster University
    Professor Jonathan Hope, University of Strathclyde
    Professor Lesley Jeffries, University of Huddersfield
    Professor Christian Kay, University of Glasgow
    Dr Paul Rayson, Lancaster University
    Dr Brian Walker, University of Huddersfield
    Brian Aitken, University of Glasgow
    Dr Fraser Dallachy, University of Glasgow
    Dr Jane Demmen, University of Huddersfield
    Bethan McCarthy, University of Central Lancashire
    Dr Scott Piao, Lancaster University
    Stephen Wattam, Lancaster University
    Professor Mark Davies, Brigham Young University
    Professor Anthony Johnson, Åbo Akademi University
    Ilkka Juuso, University of Oulu
    Professor Tapio Seppänen, University of Oulu

    Other partners: Oxford University Press, the University of Wisconsin-Madison and the Folger Shakespeare Library.
  31. 4. CONFESSIONS / ADMISSIONS (THERAPY)
  32. “a magnificent achievement of quite extraordinary value. It is perhaps

    the single most significant tool ever devised for investigating semantic, social, and intellectual history” Randolph Quirk
  33. The Historical Thesaurus

    ‣ The first historical thesaurus for any language
    ‣ c. 797,120 meanings
    ‣ c. 236,400 categories
    ‣ c. 1,000,000 index entries
    ‣ Words dating from c. 700 AD to the present day

    Irené Wotherspoon
  37. [Image: a page of the printed Historical Thesaurus covering linguistic units (word, phrase, the use and formation of new words and phrases, lexicography), with dated synonym lists running from Old English forms such as clipung and wordhord to modern terms such as lexical item, neologism and thesaurus. The category numbers, headings and dates were garbled in extraction.]
  38. [Image: enlarged detail of a Historical Thesaurus page, centred on the entries for words collectively (vocabulary): wordhord, wordloca, vocabulary, wordage, word-hoard, wordlore, word-stock, lexicon, lexis, vocab. The category numbers, headings and dates were garbled in extraction.]
  39. FIGURES

    • 793,742 words
    • 225,131 categories (= meanings)
    • Approximately 3.5 words for each concept, on average
    • Largest categories:
      • 01.05.06.08.02 av (264 synonyms) “Immediately”
      • 02.01.09.03 aj (248 synonyms) “Dull, stupid”
      • 02.06.01.06 (224 synonyms) “Excellent”
      • 01.02.03 (213 synonyms) “Die”
      • 02.01.09.06.01 (203 synonyms) “Stupid person, dolt, blockhead”
  40. [Chart: counts plotted against date, horizontal axis 1050 to 1950, vertical axis 80,000 to 400,000, with periods labelled Old English, Middle English, Early Modern English and Later Modern English. Plotted values: 15,343; 15,343; 15,405; 18,257; 21,841; 30,857; 37,408; 67,229; 75,396; 85,249; 106,314; 152,212; 184,602; 199,224; 205,892; 220,539; 248,448; 278,415; 334,064; 363,039.]
  42. Words for “vocabulary”, with dates

    wordhord     OE
    wordloca     OE
    vocabulary   1782–
    wordage      1829–
    wordhoard    1869–
    wordlore     1904
    word-stock   1911–
    lexicon      1933–
    lexis        1960–
    vocab        1971–
  44. Modern English: 469,470 words; those cited after 1860AD or marked as ‘current’

    [Figure labels: Food and Drink; Health and Disease; The Body; Biology; People; Clothing; Death; Cleanliness; Textiles; (Other); Plants; Action; Space; Physics; Chemistry; Movement; Time; The Earth; Colour; Properties of Materials; The Supernatural; Physical Sensibility; Relative Properties; Number; Wholeness; Quantity; Mental Capacity; Travel; Leisure; Work; Authority; Communication; Armed Hostility; Faith; Society; Dwelling; Morality; Education; Emotion; Language; Possession; Faculty of Will; Philosophy; Refusal and Denial; Aesthetics; Existence, Creation, Causation; Constitution of Matter; Animals]
  45. The Age of Johnson: 247,933 words; those first cited before 1784AD, and last cited after 1709AD
  46. The Age of Chaucer: 73,432 words; those first cited before 1400AD, and last cited after 1340AD
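
The selection rule on these two slides (first cited before the end of a period, last cited after its start) is an interval-overlap test. The sketch below applies it to the dated words from slide 42. It is illustrative only: the `Word` type and `in_period` are hypothetical names, and reading an open range such as "1782–" as "still in current use" is an assumption.

```python
from typing import NamedTuple, Optional


class Word(NamedTuple):
    form: str
    first_cited: int
    last_cited: Optional[int]  # None marks a word assumed to be still current


def in_period(word, start, end):
    """True if the word was first cited before `end` and last cited after `start`."""
    still_current = word.last_cited is None
    return word.first_cited < end and (still_current or word.last_cited > start)


# Dated synonyms from slide 42. The two Old English words are omitted
# because the slide gives them no year.
words = [
    Word("vocabulary", 1782, None),
    Word("wordage", 1829, None),
    Word("wordhoard", 1869, None),
    Word("wordlore", 1904, 1904),
    Word("word-stock", 1911, None),
    Word("lexicon", 1933, None),
    Word("lexis", 1960, None),
    Word("vocab", 1971, None),
]

age_of_johnson = [w.form for w in words if in_period(w, 1709, 1784)]
age_of_chaucer = [w.form for w in words if in_period(w, 1340, 1400)]
```

Of these eight words only "vocabulary" falls within the Age of Johnson, and none within the Age of Chaucer, which shows how sharply the same semantic field can change between period snapshots.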
  48. [Three timeline charts, each with a horizontal axis from 1100 to 2000 in 25-year steps: 01.01.05.04 Fountain; 01.01.10.13 Astrology; 01.03.06.01 Inodorousness]
  49. Figure: three panels on a time axis from 1100 to 2000 (ticks every 25 years): 01.04.01 Alchemy; 01.04.02.01 Chemistry; 03.04.06.07 Rule over the sea
  50. Mapping Metaphor with the Historical Thesaurus. www.glasgow.ac.uk/metaphor; http://blogs.arts.gla.ac.uk/metaphor/; Twitter: @MappingMetaphor. Image from Thomas Wright: An original theory or new hypothesis of the universe. London, 1750. Courtesy of University of Glasgow Library, Special Collections.
  51. Figure (repeated): overview of categories, Food and Drink through Animals. Modern English: 469,470 words; those cited after 1860AD or marked as ‘current’
  52. Figure (repeated): overview of categories, Food and Drink through Animals. Modern English: 469,470 words; those cited after 1860AD or marked as ‘current’
  53. Unfortunately they can only do so by treating all literature as if it were the same. The algorithmic analysis of novels and of newspaper articles is necessarily at the limit of reductivism. The process of turning literature into data removes distinction itself. It removes taste. It removes all the refinement from criticism. It removes the history of the reception of works. To the Lighthouse is just another novel in its pile of novels. Marche, Stephen. 2012. Literature is not data: Against digital humanities. Los Angeles Review of Books, 28 October 2012. http://lareviewofbooks.org/article.php?id=1040&fulltext=1
  54. The notion seems to be that a professor somewhere will feed all of Virginia Woolf’s books into a machine and forget they’re good books. (For the record: Virginia Woolf is in no danger on this count.) [...] As in the sciences, some of this research will come to nothing; a percentage of it will change how we account for particular works of literature; some may change how we understand the sweep of literary history. But there are no monsters, or fascists, under any of these beds. None of these questions is going to endanger the ways that literature spurs all of us to think. [...] The machine-driven projects of distant reading will humbly supplement — usually by just a little, and perhaps one day by a great deal — what we know about literature, just as historical data and biographical data have done all along. Selisker, Scott. 2012. The Digital Inhumanities? Los Angeles Review of Books, 5 November 2012. https://lareviewofbooks.org/essay/in-defense-of-data-responses-to-stephen-marches-literature-is-not-data
  55. “A new conjunction of scientist, curator, humanist, and artist is what the digital humanities must strive to achieve. It is the only way of ensuring that we do not lose our souls in a world of data.” Prescott, Andrew. 2012. An Electric Current of the Imagination: What the Digital Humanities Are and What They Might Become. Journal of Digital Humanities 1(2).
  56. Thank you! [email protected] Historical Thesaurus of English: www.glasgow.ac.uk/thesaurus; SAMUELS Alpha Test Site: http://is.gd/semtag; Mapping Metaphor: www.glasgow.ac.uk/metaphor
  57. SOFTWARE (WITHOUT PROGRAMMING) ‣ IBM ManyEyes ‣ Google Docs (Gadgets) ‣ JMP ‣ Gephi ‣ Voyant ‣ Excel ‣ Tableau
  58. SOFTWARE (WITH PROGRAMMING) ‣ R (and ggplot2) ‣ Python and NLTK ‣ Java and OpenNLP ‣ Perl ‣ Protovis ‣ D3
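The period slices on the Age of Johnson, Age of Chaucer and Modern English slides are defined purely by citation dates, so the selection rule can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the `Entry` record, its field names, the `in_period` and `is_modern` helpers, and the sample words and dates are all hypothetical and invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical record for one word; field names are assumptions."""
    word: str
    first_cited: int       # year of first citation
    last_cited: int        # year of last citation
    current: bool = False  # marked as 'current'


def in_period(entry: Entry, before: int, after: int) -> bool:
    """True if the word was first cited before `before` and last cited after `after`."""
    return entry.first_cited < before and entry.last_cited > after


def is_modern(entry: Entry) -> bool:
    """True if the word was cited after 1860 or is marked as 'current'."""
    return entry.last_cited > 1860 or entry.current


# Invented sample data, for illustration only.
entries = [
    Entry("word_a", 1000, 1350),
    Entry("word_b", 1390, 1750),
    Entry("word_c", 1720, 1900),
    Entry("word_d", 1900, 1950, current=True),
    Entry("word_e", 1200, 1300),
]

# Thresholds taken from the slides.
chaucer = [e.word for e in entries if in_period(e, before=1400, after=1340)]
johnson = [e.word for e in entries if in_period(e, before=1784, after=1709)]
modern = [e.word for e in entries if is_modern(e)]

print(chaucer, johnson, modern)
```

A word can fall into more than one slice (here `word_b` is in both the Chaucer and Johnson selections), which is why the word counts on the three slides are not expected to sum to a single total.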