How Can We See Half a Million Words at Once

Marc Alexander
September 24, 2014

Presented at the University of Stirling English Research Seminars, 2014

Transcript

  1. How Can We See Half A Million Words At Once?

    Words and text using the Historical Thesaurus. Marc Alexander, University of Glasgow. [email protected] | @marcgalexander. University of Stirling, 24 September 2014
  2. Christian Kay, Jane Roberts, Michael Samuels and Irené Wotherspoon (eds).

    2009. Historical Thesaurus of the Oxford English Dictionary. Oxford: Oxford University Press.

    Category headings shown: 03.01 Society/the community; 03.01.01 Kinship/relationship; 03.01.02 Study of society; 03.01.03 Society in relation to customs/values/beliefs; 03.01.04 Social communication/relations; 03.07.00.13 Conformity; 03.07.00.14 Non-conformity; 03.07.00.15 Apostasy; 03.07.00.16 Sectarianism; 03.07.00.17 Catholicity.

    [Image: sample page from the thesaurus, section 03 Society. The headwords did not survive text extraction; legible sense headings include grandmother, step-grandmother, great-grandmother, half-brother (by same father, by same mother), bastard brother, stepbrother, twin-brother, younger brother and foster-brother.]
  3. “a magnificent achievement of quite extraordinary value. It is perhaps

    the single most significant tool ever devised for investigating semantic, social, and intellectual history” Randolph Quirk

    [Background image: the category headings and sample page from slide 2.]
  4. [Image: pages from the thesaurus covering words for "word" and "phrase", the coining of new words and phrases, and lexicography. Category headings and dates were set in a font that did not survive text extraction. Legible headwords include clipung, dæl, word, vocable, diction, hapax legomenon, nonce-word, wordhord, vocabulary, lexicon, lexis, cwide, locution, catch-phrase, cliché, neologism, loan-word, calque, lexicography, dictionary, lemma, glossary, thesaurus and concordance. The page ends with a key to part-of-speech abbreviations: n., adj., adv., v., vi., v. pass., vt., v. refl., v. impers., phr., int., conj., prep.]
  5. [Image: detail of the pages shown on slide 4. Legible headwords include pillow-word, nonce-word, blessed word, key-word, four-letter word, non-word, synonym, wordhord, wordloca, vocabulary, wordage, word-hoard, wordlore, word-stock, lexicon, lexis and vocab.]
  6. Level 1: I The External World, II The Mental World, III The Social World

    Level 2: 26 major categories
    Level 3: 354 categories
    236,400 categories and subcategories in all
  7. I: The External World

    01. The world
      01.01. The earth
      01.02. Life
      01.03. Physical sensibility
      01.04. Matter
        01.04.01. Alchemy
        01.04.02. Chemistry
        01.04.03. Properties of materials
        01.04.04. Constitution of matter
        01.04.05. Liquid
        01.04.06. Gas
        01.04.07. Physics
        01.04.08. Light
        01.04.09. Colour
          01.04.09.07. Named colours
        01.04.10. Condition of matter
      01.05. Existence in time and space
      01.06. Relative properties
      01.07. The supernatural
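The dotted codes on this slide encode the hierarchy directly: each extra pair of digits is one level further down. The following is a minimal sketch of how such codes could be split and walked. The function names are hypothetical, and the label table holds only four entries copied from the slide, not the full thesaurus.

```python
from typing import List, Optional

def parse_code(code: str) -> List[str]:
    """Split a dotted category code such as '01.04.09.07.' into its levels."""
    return [part for part in code.strip().split(".") if part]

def depth(code: str) -> int:
    """Number of levels in the code (01.04.09.07 has four)."""
    return len(parse_code(code))

def parent(code: str) -> Optional[str]:
    """Code one level up, or None for a top-level category."""
    parts = parse_code(code)
    if len(parts) <= 1:
        return None
    return ".".join(parts[:-1])

def ancestors(code: str) -> List[str]:
    """All enclosing categories, outermost first."""
    parts = parse_code(code)
    return [".".join(parts[:i]) for i in range(1, len(parts))]

# Labels copied from the slide; everything else about this table is illustrative.
LABELS = {
    "01": "The world",
    "01.04": "Matter",
    "01.04.09": "Colour",
    "01.04.09.07": "Named colours",
}

def breadcrumb(code: str) -> str:
    """Render a code as a chain of labels, falling back to the raw code."""
    chain = ancestors(code) + [".".join(parse_code(code))]
    return " > ".join(LABELS.get(c, c) for c in chain)

print(breadcrumb("01.04.09.07."))
```

Because the parent of a category is just its code with the last component removed, no separate parent table is needed to move up the tree.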
  8. [Image: detail of the pages shown on slide 4. Legible headwords include pillow-word, nonce-word, blessed word, key-word, four-letter word, non-word, synonym, wordhord, wordloca, vocabulary, wordage, word-hoard, wordlore, word-stock, lexicon, lexis and vocab.]
  9. wordhord OE · wordloca OE · vocabulary 1782– · wordage 1829– · wordhoard 1869–

    wordlore 1904 · word-stock 1911– · lexicon 1933– · lexis 1960– · vocab 1971–
  10. wordhord OE · wordloca OE · vocabulary 1782– · wordage 1829– · wordhoard 1869–

    wordlore 1904 · word-stock 1911– · lexicon 1933– · lexis 1960– · vocab 1971–
  11. Modern English: 469,470 words; those cited after 1860AD or marked as ‘current’

    Category labels: Food and Drink; Health and Disease; The Body; Biology; People; Clothing; Death; Cleanliness; Textiles (Other); Plants; Action; Space; Physics; Chemistry; Movement; Time; The Earth; Colour; Properties of Materials; The Supernatural; Physical Sensibility; Relative Properties; Number; Wholeness; Quantity; Mental Capacity; Travel; Leisure; Work; Authority; Communication; Armed Hostility; Faith; Society; Dwelling; Morality; Education; Emotion; Language; Possession; Faculty of Will; Philosophy; Refusal and Denial; Aesthetics; Existence, Creation, Causation; Constitution of Matter; Animals
  12. English in the Age of Johnson 247,933 words; those first

    cited before 1784AD, and last cited after 1709AD
  13. English in the Age of Shakespeare 207,930 words; those first

    cited before 1616AD, and last cited after 1564AD
  14. English in the Age of Chaucer 73,432 words; those first

    cited before 1400AD, and last cited after 1340AD
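The period slices on slides 11 to 14 all use one overlap rule: a word belongs to a period if it was first cited before the period ends and last cited after the period starts. A minimal sketch of that rule follows. The `Word` record and function names are hypothetical; the four sample words and dates come from slide 9, and treating an open-ended date such as "1782–" as still current is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Word:
    form: str
    first: int            # year of first citation
    last: Optional[int]   # year of last citation; None = open-ended (assumed current)

# (start, end) years taken from the slide captions.
PERIODS = {
    "Chaucer": (1340, 1400),
    "Shakespeare": (1564, 1616),
    "Johnson": (1709, 1784),
}

def in_period(word: Word, start: int, end: int) -> bool:
    """First cited before `end` and last cited after `start`."""
    last_ok = word.last is None or word.last > start
    return word.first < end and last_ok

def in_modern_english(word: Word) -> bool:
    """Cited after 1860 or marked as current (open-ended here)."""
    return word.last is None or word.last > 1860

# Sample records using dates shown on slide 9.
WORDS = [
    Word("vocabulary", 1782, None),
    Word("wordage", 1829, None),
    Word("wordlore", 1904, 1904),
    Word("lexis", 1960, None),
]

def slice_for(name: str) -> List[str]:
    start, end = PERIODS[name]
    return [w.form for w in WORDS if in_period(w, start, end)]

print(slice_for("Johnson"))
print([w.form for w in WORDS if in_modern_english(w)])
```

Of these four words only "vocabulary" (first cited 1782) falls inside the Age of Johnson window, while all four count as Modern English.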
  15. Category labels: Existence in time and space; Life; Relative Properties; The Earth; Matter; Physical Sensibility; The Supernatural; Authority; Faith; Society; Work; Armed Hostility; Travel; Morality; Leisure; Inhabiting; Communication; Education; Mental Capacity; Emotion; Possession; Language; Will; Aesthetics and Philosophy
  16. Category labels: Life; Existence in time and space; The Earth; The Supernatural; Physical Sensibility; Relative Properties; Mental Capacity; Travel; Leisure; Work; Authority; Communication; Armed Hostility; Faith; Society; Inhabiting; Morality; Education; Emotion; Language; Possession; Will; Aesthetics and Philosophy; Matter
  17. Category Growth, 1700-1800

    Positive growth in green; negative growth in red.
    Labelled categories: 01.04.07.03.06 Electromagnetic energy; 03.10.13.20 Management of money
  18. Category Growth, 1700-1800

    Positive growth in green; negative growth in red.
    Labelled categories: 01.04.07.03.06 Electromagnetic energy; 02.08.02 Language families; 03.10.13.20 Management of money; 01.04.02.01 Chemistry; 01.01.08.04 Geology; 03.08.11.08 Journal/newspaper
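The slides do not say how category growth is measured. One plausible reading, sketched here as an assumption rather than as the talk's method, is to count the words attested in each category at the start and end of the century and report the relative change. The category codes are from the slide, but the entries and their dates are invented toy values, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# (category code, first citation year, last citation year or None if open-ended)
Entry = Tuple[str, int, Optional[int]]

# Toy data: codes are from the slide, the dates are invented for illustration.
ENTRIES: List[Entry] = [
    ("01.04.07.03.06", 1650, None),
    ("01.04.07.03.06", 1745, None),
    ("01.04.07.03.06", 1780, None),
    ("03.10.13.20", 1600, 1750),
    ("03.10.13.20", 1620, None),
]

def active(first: int, last: Optional[int], year: int) -> bool:
    """True if a word with this citation span is attested in `year`."""
    return first <= year and (last is None or last >= year)

def size_at(entries: List[Entry], year: int) -> Counter:
    """Number of attested words per category in a given year."""
    return Counter(code for code, first, last in entries if active(first, last, year))

def growth(entries: List[Entry], start: int, end: int) -> Dict[str, float]:
    """Relative change in category size between two years.

    Categories that are empty at `start` are skipped to avoid dividing by zero.
    """
    before, after = size_at(entries, start), size_at(entries, end)
    return {code: (after[code] - n) / n for code, n in before.items() if n > 0}

for code, change in growth(ENTRIES, 1700, 1800).items():
    colour = "green" if change > 0 else "red"
    print(f"{code}: {change:+.0%} ({colour})")
```

With the toy data the first category triples (positive, green) and the second halves (negative, red), mirroring the colour coding described on the slide.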
  19. LL Salience, Thomas Jefferson Corpus

    01.06.06.03/01.06.06.05 Greatness in quantity/Increase
    01.05.05.21 inter alia Courtesy/Conduct/Forms of address
    03.04.02.01 Command/order
    03.11 et seq Leisure
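Assuming "LL" here stands for the log-likelihood statistic widely used in corpus linguistics to compare frequencies across two corpora (the slide does not spell it out), the salience of a semantic category could be scored as sketched below. The function names are hypothetical and the counts in the usage lines are invented.

```python
import math

def log_likelihood(a: int, b: int, c: int, d: int) -> float:
    """Log-likelihood for one item.

    a: frequency in the study corpus, b: frequency in the reference corpus,
    c: size of the study corpus, d: size of the reference corpus.
    """
    expected_study = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    total = 0.0
    # A zero observed frequency contributes nothing to the sum.
    if a > 0:
        total += a * math.log(a / expected_study)
    if b > 0:
        total += b * math.log(b / expected_ref)
    return 2 * total

def overused(a: int, b: int, c: int, d: int) -> bool:
    """True when the item is relatively more frequent in the study corpus."""
    return a / c > b / d

# Invented counts: a category tagged 30 times in 1,000 study-corpus tokens
# and 10 times in 1,000 reference-corpus tokens.
print(round(log_likelihood(30, 10, 1000, 1000), 2), overused(30, 10, 1000, 1000))
```

The statistic itself is never negative, so the direction of the difference has to be read separately from the relative frequencies, which is what `overused` does.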
  20. Dr Marc Alexander, University of Glasgow

    Jean Anderson, University of Glasgow
    Professor Dawn Archer, University of Central Lancashire
    Dr Alistair Baron, Lancaster University
    Professor Jonathan Hope, University of Strathclyde
    Professor Lesley Jeffries, University of Huddersfield
    Professor Christian Kay, University of Glasgow
    Dr Paul Rayson, Lancaster University
    Dr Brian Walker, University of Huddersfield
    Brian Aitken, University of Glasgow
    Dr Fraser Dallachy, University of Glasgow
    Dr Scott Piao, Lancaster University
    Professor Mark Davies, Brigham Young University
    Professor Anthony Johnson, Åbo Akademi University
    Ilkka Juuso, University of Oulu
    Professor Tapio Seppänen, University of Oulu
    Also Oxford University Press and, through a linked project, the University of Wisconsin-Madison and the Folger Shakespeare Library.
  21. Words, words. They’re all we have to go on. Tom

    Stoppard (1967), Rosencrantz and Guildenstern are Dead.
  22. 62% of English word forms refer to more than one meaning

    Of the 793,742 entries in HT there are 370,011 unique non-Old-English word forms, of which:
    ‣ 67 have more than 100 possible meanings
    ‣ 464 have more than 50 possible meanings
    ‣ 2,580 have more than 20 possible meanings
    ‣ 7,554 have more than 10 possible meanings
    ‣ 111,127 have more than 1 possible meaning
    ‣ 258,883 have just 1 possible meaning
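The counts on this slide can be turned into shares of the 370,011 distinct word forms with a few lines of arithmetic. This is only a sketch over the figures as transcribed; the variable names are hypothetical. On these counts about 30% of distinct forms have more than one meaning, so the 62% headline presumably rests on a basis that the slide does not state.

```python
# Figures as given on the slide.
TOTAL_FORMS = 370_011
MORE_THAN = {100: 67, 50: 464, 20: 2_580, 10: 7_554, 1: 111_127}
JUST_ONE = 258_883

def share(count: int, total: int = TOTAL_FORMS) -> float:
    """Percentage of all distinct word forms."""
    return 100 * count / total

# The "more than N" buckets are cumulative: each one contains the buckets above it.
for threshold, count in MORE_THAN.items():
    print(f"more than {threshold:>3} meanings: {count:>7,} forms ({share(count):.2f}%)")
print(f"exactly 1 meaning: {JUST_ONE:,} forms ({share(JUST_ONE):.2f}%)")

# The two exhaustive buckets fall one short of the stated total.
print(TOTAL_FORMS - (MORE_THAN[1] + JUST_ONE))
```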
  23. Disambiguation

    01.02.02.03.05.02.07|04 n Health and disease Medicinal potion/draught :: medicated wine wine 1652–
    01.02.05.13.09.02.02|06.14 n Plants Tree/plant producing edible berries :: grape-vine wine 1340/70–1632
    01.02.09.02.02.02 n Food and drink Wine wine OE–
    01.02.09.02.02.02.12 n Food and drink Non-grape and home-made wines wine 1398–
    01.02.09.02.14.01|08 vt Make wine store wine/stock cellar wine c1645
    01.02.09.02.19|05 vt Provide/serve (with) drink supply with specific drink wine 1862–
    01.02.09.02.20.01|18 n Food and drink Drinking vessel :: glass wine 1848–
    01.02.09.02.21|10.09 vi Drink drink intoxicating liquor :: drink wine wine (and dine) 1829–
    01.04.09.07.03|02.02 n Colour Red/redness :: shades of red :: deep red/crimson wine 1895–
    01.04.09.07.03|07 aj Pertaining to colour Red :: deep red/crimson wine 1950–
    01.05.05.12.01.03.02|02.03 n Action/operation Patronage :: patron :: as lord/protector wine OE
    01.05.05.15.01.05|17.06.03 n Action/operation Care/protection :: protector :: one who looks after a lord wine OE
    02.02.22.12|01 n Love One who loves/a lover :: male lover wine OE
    02.02.22.15|14 n Love Friendliness :: friend wine OE–1481
    02.02.22.15|14.19 n Love Friendliness :: friend :: friendly/gracious lord wine OE
    03.04.09.02.01.02.02|09 n Being subject to authority Attendant :: confidential servant/companion wine OE
    03.07.05.15.07.02 n Artefacts Wine wine OE–
    03.11.02.05.01|11 n Social event Party :: drinking-party wine 1857–
  24. Disambiguation: Semantic Context Distance

    [The 18 senses of "wine" listed on slide 23 are shown again.]
  25. Disambiguation: Semantic Context Distance

    [The 18 senses of "wine" listed on slide 23 are shown again, followed by four codes:]
    01.02.09.02.02.02
    01.04.09.07.03|02.02
    02.02.22.15|14
    03.11.02.05.01|11
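The slide names "semantic context distance" without defining it. One simple way to realise the idea, offered here as an assumption and not as the project's actual measure, is to treat each category code as a path in the hierarchy and count the steps between two paths through their deepest shared ancestor. The candidate codes are the four listed on this slide; the two context codes reuse food-and-drink codes from the sense listing as stand-ins for the categories of neighbouring words. All function names are hypothetical.

```python
from typing import Iterable, List

def path(code: str) -> List[str]:
    """Turn '01.04.09.07.03|02.02' into a list of hierarchy steps.

    Subcategory steps (after the pipe) are prefixed so that they can never
    be confused with main-category steps at the same depth.
    """
    main, _, sub = code.partition("|")
    steps = main.split(".")
    if sub:
        steps += ["|" + part for part in sub.split(".")]
    return steps

def tree_distance(a: str, b: str) -> int:
    """Number of edges between two categories via their deepest shared ancestor."""
    pa, pb = path(a), path(b)
    shared = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        shared += 1
    return len(pa) + len(pb) - 2 * shared

def closest_sense(candidates: Iterable[str], context: Iterable[str]) -> str:
    """Candidate code with the smallest total distance to the context codes."""
    context = list(context)
    return min(candidates, key=lambda c: sum(tree_distance(c, ctx) for ctx in context))

CANDIDATES = [
    "01.02.09.02.02.02",     # Food and drink: Wine
    "01.04.09.07.03|02.02",  # Colour: deep red/crimson
    "02.02.22.15|14",        # Love: friend
    "03.11.02.05.01|11",     # Social event: drinking-party
]
# Stand-in context: drinking vessel (glass) and non-grape wines.
CONTEXT = ["01.02.09.02.20.01|18", "01.02.09.02.02.02.12"]

print(closest_sense(CANDIDATES, CONTEXT))
```

With a food-and-drink context the beverage sense wins, because it shares the longest code prefix with both context codes.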
  26. Disambiguation Time Filtering 01.02.02.03.05.02.07|04 n Health and disease Medicinal potion/draught

    :: medicated wine wine 1652– 01.02.05.13.09.02.02|06.14 n Plants Tree/plant producing edible berries :: grape-vine wine 1340/70–1632 01.02.09.02.02.02 n Food and drink Wine wine OE– 01.02.09.02.02.02.12 n Food and drink Non-grape and home-made wines wine 1398– 01.02.09.02.14.01|08 vt Make wine store wine/stock cellar wine c1645 01.02.09.02.19|05 vt Provide/serve (with) drink supply with specific drink wine 1862– 01.02.09.02.20.01|18 n Food and drink Drinking vessel :: glass wine 1848– 01.02.09.02.21|10.09 vi Drink drink intoxicating liquor :: drink wine wine (and dine) 1829– 01.04.09.07.03|02.02 n Colour Red/redness :: shades of red :: deep red /crimson wine 1895– 01.04.09.07.03|07 aj Pertaining to colour Red :: deep red/crimson wine 1950– 01.05.05.12.01.03.02|02.03 n Action/operation Patronage :: patron :: as lord/protector wine OE 01.05.05.15.01.05|17.06.03 n Action/operation Care/protection :: protector :: one who looks after a lord wine OE 02.02.22.12|01 n Love One who loves/a lover :: male lover wine OE 02.02.22.15|14 n Love Friendliness :: friend wine OE–1481 02.02.22.15|14.19 n Love Friendliness :: friend :: friendly/gracious lord wine OE 03.04.09.02.01.02.02|09 n Being subject to authority Attendant :: confidential servant/companion wine OE 03.07.05.15.07.02 n Artefacts Wine wine OE– 03.11.02.05.01|11 n Social event Party :: drinking-party wine 1857–
  27. Disambiguation: Time Filtering (Present Day)

    [Table: the candidate Historical Thesaurus senses of ‘wine’ listed on slide 25]
  28. Disambiguation: Time Filtering (1400)

    [Table: the candidate Historical Thesaurus senses of ‘wine’ listed on slide 25]
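The time filtering shown on the slides above (present day and 1400) can be sketched roughly as follows. This is an illustration only, not the project's implementation: the `Sense` type, the `senses_in_year` function and the numeric date normalisation (OE treated as 700 to 1150, ‘1340/70’ as 1340) are assumptions made for the example, and only six of the ‘wine’ senses from the table are included.

```python
from typing import List, NamedTuple, Optional


class Sense(NamedTuple):
    code: str             # HT category code as printed on the slide
    heading: str          # last element of the category heading
    first: int            # first attested year (hand-normalised, see below)
    last: Optional[int]   # last attested year, or None if still current


# Assumed normalisation for illustration: "OE" spans 700 to 1150.
OE_START, OE_END = 700, 1150

WINE_SENSES = [
    Sense("01.02.02.03.05.02.07|04", "medicated wine", 1652, None),
    Sense("01.02.05.13.09.02.02|06.14", "grape-vine", 1340, 1632),
    Sense("01.02.09.02.02.02", "Wine", OE_START, None),
    Sense("01.04.09.07.03|02.02", "deep red/crimson", 1895, None),
    Sense("02.02.22.12|01", "male lover", OE_START, OE_END),
    Sense("02.02.22.15|14", "friend", OE_START, 1481),
]


def senses_in_year(senses: List[Sense], year: int) -> List[Sense]:
    """Keep only the senses whose attested date range covers `year`."""
    return [
        s for s in senses
        if s.first <= year and (s.last is None or year <= s.last)
    ]


in_1400 = [s.heading for s in senses_in_year(WINE_SENSES, 1400)]
in_2014 = [s.heading for s in senses_in_year(WINE_SENSES, 2014)]
```

Under these assumptions a text dated 1400 keeps the grape-vine, wine and friend senses, while a present-day text keeps medicated wine, wine and the colour sense.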
  29. Disambiguation: Polyseme Density

    [Table: the candidate Historical Thesaurus senses of ‘wine’ listed on slide 25]
  30. Disambiguation: Polyseme Density

    [Table: the candidate Historical Thesaurus senses of ‘wine’ listed on slide 25]
  31. Disambiguation: Human Scale Distance

    01.02.02.03.05.02.07|04 n Health and disease Medicinal potion/draught :: medicated wine wine 1652–
    01.02.05.13.09.02.02|06.14 n Plants Tree/plant producing edible berries :: grape-vine wine 1340/70–1632
    01.02.09.02.02.02 n Food and drink Wine wine OE–
    01.02.09.02.02.02.12 n Food and drink Non-grape and home-made wines wine 1398–
    01.02.09.02.14.01|08 vt Make wine store wine/stock cellar wine c1645
    01.02.09.02.19|05 vt Provide/serve (with) drink supply with specific drink wine 1862–
    01.02.09.02.20.01|18 n Food and drink Drinking vessel :: glass wine 1848–
    01.02.09.02.21|10.09 vi Drink drink intoxicating liquor :: drink wine wine (and dine) 1829–
    01.04.09.07.03|02.02 n Colour Red/redness :: shades of red :: deep red/crimson wine 1895–
    01.04.09.07.03|07 aj Pertaining to colour Red :: deep red/crimson wine 1950–
    01.05.05.12.01.03.02|02.03 n Action/operation Patronage :: patron :: as lord/protector wine OE
    01.05.05.15.01.05|17.06.03 n Action/operation Care/protection :: protector :: one who looks after a lord wine OE
    02.02.22.12|01 n Love One who loves/a lover :: male lover wine OE
    02.02.22.15|14 n Love Friendliness :: friend wine OE–1481
    02.02.22.15|14.19 n Love Friendliness :: friend :: friendly/gracious lord wine OE
    03.04.09.02.01.02.02|09 n Being subject to authority Attendant :: confidential servant/companion wine OE
    03.07.05.15.07.02 n Artefacts Wine wine OE–
    03.11.02.05.01|11 n Social event Party :: drinking-party wine 1857–
    hsd 2.5 1 0 1 0.5 0.5 0.5 1 1 0.5 2 1.5 0.5 0.5 1 2.5 0 0.5
  32. Disambiguation: Other Methods

    • Highly polysemous words: set (345 meanings), run (302), strike (256), fall (206), cast (187), round (179), turn (174), point (169), slip (165), pass (160), shoot (159), take (158)...
      – e.g. take (199 results) – likely to be 02.07.13 (‘move a thing from an initial place into one’s possession’)
    • Template rules
      – e.g. sneeze is usually 01.02.02.01.04.18.14|02 vi ‘have respiratory spasm’, but if in a structure sneeze NP PP (e.g. ‘sneeze the napkin off the table’) then tag as 01.02.02.01.04.18.14|03 vt ‘eject/cast by sneezing’
    • Topic identification
    • Collocation/machine learning (OED)
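The sneeze template rule above could be expressed along these lines. This is a minimal sketch, not the tagger's actual rule format: `tag_sneeze` is a hypothetical function, and the chunk labels (`"NP"`, `"PP"`) are assumed to come from some upstream chunker that the slides do not specify.

```python
from typing import Sequence

# HT codes quoted on the slide.
SNEEZE_VI = "01.02.02.01.04.18.14|02"  # vi 'have respiratory spasm'
SNEEZE_VT = "01.02.02.01.04.18.14|03"  # vt 'eject/cast by sneezing'


def tag_sneeze(following_chunks: Sequence[str]) -> str:
    """Pick the HT code for 'sneeze' from the chunk labels that follow it.

    `following_chunks` holds the labels of the phrases immediately after
    the verb, e.g. ["NP", "PP"] for 'sneeze the napkin off the table'.
    """
    if list(following_chunks[:2]) == ["NP", "PP"]:
        return SNEEZE_VT   # template 'sneeze NP PP' matched
    return SNEEZE_VI       # default intransitive sense


caused_motion = tag_sneeze(["NP", "PP"])   # 'sneeze the napkin off the table'
plain = tag_sneeze([])                     # 'she sneezed'
```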
  33. EEBO-TCP Corpus (40,000 Early Modern English texts; almost all the books and pamphlets published in English before 1700)

    Hansard Corpus (2.3 billion words; approximately every word uttered in Parliament over the past two hundred years)
  34. Hansard and Parliament from Above: A bird’s-eye view of parliamentary concerns over the past two centuries (University of Glasgow)

    Is There a Baron in the Commons? The representation of trade unions and their members in Hansard across the past hundred years (University of Huddersfield)
    Delineating Aggression Across Genres (1473-1700): The speech acts of aggression to explore the nuances of shifting meanings in EEBO-TCP (University of Central Lancashire)
  35. Semantic Annotation System

    [Diagram] Input: raw text. Output: annotated text.
    Components and resources shown: VARD; CLAWS; HT sense tagger; USAS; NLP lexicon resources; HT sense disambiguator; spelling training model.
    HT-related resources: Historical Thesaurus; Higher-level HT categories; Linked HT categories; Highly polysemous words; Z-category words; Polyseme density list.
  36. Current Methods

    • Pre-process text using VARD, CLAWS and USAS.
    • Currently: context-distance method for disambiguation
    • Filter by POS.
    • For each candidate category, extract all possible parent categories and collect their headings, including the current heading.
    • Words in the headings form a feature set HW_i = {h_1, h_2, …, h_m}.
    • Collect up to five content words from each side of the key word/MWE.
    • Together with the target word/MWE w_t, they form a context feature set CW = {w_t, w_1, w_2, …, w_n}.
    • Measure the Jaccard distance between CW and each HW_i, and select the candidate categories (up to three) that are closest to the context.
    • If the previous steps fail,
      – Check core HT categories of the key word from a compiled list
      – If not found, check for default HT categories from the polyseme density list.
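The context-distance steps above can be sketched as follows. This is an illustrative reading of the slide rather than the tagger's code: the function names, the stop-word list and the hand-built heading sets are assumptions, and treating a distance of 1.0 (no shared words) as a failure that triggers the fallback is likewise an assumption.

```python
from typing import Callable, Dict, List, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """1 - |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def context_window(tokens: List[str], index: int,
                   is_content: Callable[[str], bool],
                   width: int = 5) -> Set[str]:
    """CW: the target word plus up to `width` content words on each side."""
    left = [t for t in tokens[:index] if is_content(t)][-width:]
    right = [t for t in tokens[index + 1:] if is_content(t)][:width]
    return {tokens[index], *left, *right}


def rank_candidates(cw: Set[str], heading_words: Dict[str, Set[str]],
                    top_n: int = 3) -> List[str]:
    """Return up to `top_n` category codes, closest heading set first.

    Candidates sharing no word with the context (distance 1.0) are dropped,
    so an empty result signals that a fallback list should be consulted.
    """
    scored = sorted((jaccard_distance(cw, hw), code)
                    for code, hw in heading_words.items())
    return [code for dist, code in scored if dist < 1.0][:top_n]


# Hypothetical example; heading sets are hand-built from the 'wine' table.
STOP = {"she", "a", "of", "with", "the"}
tokens = "she poured a glass of red wine with the food".split()
cw = context_window(tokens, tokens.index("wine"), lambda t: t not in STOP)
HW = {
    "01.02.09.02.02.02": {"food", "drink", "wine"},
    "01.04.09.07.03|02.02": {"colour", "red", "redness", "shades", "deep", "crimson"},
    "02.02.22.15|14": {"love", "friendliness", "friend"},
}
ranked = rank_candidates(cw, HW)
```

In this toy example the food-and-drink sense ranks first and the colour sense second, while the friend sense is dropped because it shares no word with the context.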
  37. Access

    • Web demo site: http://phlox.lancs.ac.uk/ucrel/semtagger/english
      – Constrained access for quick trial.
    • A more convenient GUI tool for processing multiple texts
      – Contact us if you want a copy!
    • Access via Web server client
      – Contact us if you are interested.
    • (Soon) Access via WMatrix
      – http://ucrel.lancs.ac.uk/wmatrix
  38. Evaluation

    • Ten texts were selected from different genres and time periods (1820 to 2014) and manually tagged by our RAs
    • Each text contains about 1,000 words.
    • Evaluated for both HT senses and thematic senses
    • Evaluation criterion: if the top three candidate tags suggested by the system contain the correct tag(s), the word is considered to have been correctly annotated
      – In our evaluation, 81.18% and 12.74% of the correct tags were the first and second candidate tags respectively.
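The top-three criterion above can be written down as a short scoring function. This is a sketch under assumptions: `top_n_precision` and the toy tags are hypothetical, and the slides do not say how the project's own scoring script was written.

```python
from typing import List, Sequence, Set


def top_n_precision(suggestions: Sequence[Sequence[str]],
                    gold: Sequence[Set[str]], n: int = 3) -> float:
    """Share of items whose first `n` suggested tags include a correct tag.

    suggestions[i] is the system's ranked candidate list for item i;
    gold[i] is the set of manually assigned correct tags for item i.
    """
    if not suggestions:
        return 0.0
    hits = sum(1 for cands, correct in zip(suggestions, gold)
               if set(cands[:n]) & set(correct))
    return hits / len(suggestions)


# Toy data: the second item's correct tag is only ranked fourth, so it misses.
system: List[List[str]] = [["a", "b", "c"], ["x", "y", "z", "a"], ["b"]]
manual: List[Set[str]] = [{"c"}, {"a"}, {"b"}]
score = top_n_precision(system, manual)
```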
  39. Document | Text type | Year of publication | HT full cat. precision (%) | HT main cat. precision (%) | Thematic cat. precision (%)

    Biography | Written biography (Life of Theodore Roosevelt) | 1904 | 70.50 | 72.93 | 73.41
    Convers | Spoken conversation (SCOTS Corpus) | 2014 | 69.31 | 71.72 | 73.77
    Email | Email messages (Enron Corpus) | 2001 | 67.60 | 69.79 | 70.89
    Fiction1 | Dickens, Bleak House | 1852 | 70.27 | 73.82 | 75.32
    Fiction2 | Wodehouse, Something Fresh | 1915 | 73.31 | 75.10 | 75.60
    Hans1820 | Hansard Speech (William Scott) | 1820 | 76.73 | 79.45 | 80.33
    Hans2001 | Hansard Speech (Tony Blair) | 2001 | 74.61 | 77.32 | 77.42
    History | Gibbon’s Decline and Fall (II.ii) | 1845 | 73.83 | 77.73 | 77.98
    Journalism | Longform Journalism (The Radioactive Boy Scout) | 1998 | 66.67 | 70.45 | 71.74
    NewsCol | Newspaper opinion column (Drinking Is Like A Minibreak) | 2010 | 70.69 | 73.43 | 74.91
    Total | Total | Total | 71.56 | 74.44 | 75.36
  40. The Fabric of the Cosmos

    Relevant:
    01.05.07 Space (LL 13655.8)
    01.05.07.01 Distance (LL 6344.8)
    01.04.07.05.04.08 Photon (LL 4912.5)
    01.05.06.07 Computation of time (LL 3603.5)
    Analogical:
    01.02.09.15 Spinning textiles (LL 3193.5)
    03.11.03.01.08.02 Stringed instruments (LL 2277.7)
    03.11.03.02.09.14 Pattern/design (LL 1949.8)
    01.02.09.14.01.03 Woven fabric (LL 1922.2)
  41. Since we speak of the ‘fabric’ of spacetime, the suggestion goes, maybe spacetime is stitched out of strings much as a shirt is stitched out of thread. That is, much as joining numerous threads together in an appropriate pattern produces a shirt’s fabric, maybe joining numerous strings together in an appropriate pattern produces what we commonly call spacetime’s fabric. Matter, like you and me, would then amount to additional agglomerations of vibrating strings.

    Greene 2004: 486-7
  42. Figure 5: FC Analogical Textual Clusters

    [Chart: frequency (axis 0 to 20) by filename (FC001.txt to FC591.txt) for the words fabric, feel, figure, new, region, sense, string]
  43. Gauss’s two-dimensional map of imaginary numbers charts the numbers that we shall feed into the zeta function. The north-south axis keeps track of how many steps we take in the imaginary direction, whilst the east west axis charts the real numbers. We can lay this map out flat on a table. What we want to do is to create a physical landscape situated in the space above this map. The shadow of the zeta function will then turn into a physical object whose peaks and valleys we can explore.

    Sautoy 2003: 85
  44. Figure 7: MP Analogical Textual Clusters

    [Chart: frequency (axis 0 to 30) by filename (MP003.txt to MP324.txt) for the words far, level, line, point, way]
  45. Thank you! [email protected] Historical Thesaurus of English: www.glasgow.ac.uk/thesaurus SAMUELS Information

    Site: www.glasgow.ac.uk/samuels SAMUELS Alpha Test Site: http://is.gd/semtag
  46. Modern English: 469,470 words; those cited after 1860AD or marked as ‘current’

    [Category map] Food and Drink; Health and Disease; The Body; Biology; People; Clothing; Death; Cleanliness; Textiles (Other); Plants; Action; Space; Physics; Chemistry; Movement; Time; The Earth; Colour; Properties of Materials; The Supernatural; Physical Sensibility; Relative Properties; Number; Wholeness; Quantity; Mental Capacity; Travel; Leisure; Work; Authority; Communication; Armed Hostility; Faith; Society; Dwelling; Morality; Education; Emotion; Language; Possession; Faculty of Will; Philosophy; Refusal and Denial; Aesthetics; Existence, Creation, Causation; Constitution of Matter; Animals
  47. Metaphorical ‘Mappings’

    [Category map] Food and Drink; Health and Disease; The Body; Biology; People; Clothing; Death; Cleanliness; Textiles (Other); Plants; Action; Space; Physics; Chemistry; Movement; Time; The Earth; Colour; Properties of Materials; The Supernatural; Physical Sensibility; Relative Properties; Number; Wholeness; Quantity; Mental Capacity; Travel; Leisure; Work; Authority; Communication; Armed Hostility; Faith; Society; Dwelling; Morality; Education; Emotion; Language; Possession; Faculty of Will; Philosophy; Refusal and Denial; Aesthetics; Existence, Creation, Causation; Constitution of Matter; Animals
  48. I: The External World

    01. The world
    01.01. The earth
    01.02. Life
    01.03. Physical sensibility
    01.03.01. Sleeping and waking
    01.03.02. Sexual relations
    01.03.03. Use of drugs, poison
    01.03.04. Touch
    01.03.05. Taste/flavour
    01.03.06. Smell/odour
    01.03.07. Sight
    01.03.08. Hearing/noise
    01.04. Matter
    01.05. Existence in time and space
    01.06. Relative properties
    01.07. The supernatural
  49. 02.02. Emotion

    02.02.01. Seat of the emotions
    02.02.02. Emotional perception
    02.02.03. Quality of affecting emotions
    02.02.04. Effect produced on emotions
    02.02.05. Emotional attitude
    02.02.06. State of feeling/mood
    02.02.07. Manifestation of emotion
    02.02.08. Capacity for emotion
    02.02.09. Sentimentality
    02.02.10. Absence of emotion
    02.02.11. Types of emotion
    02.02.12. Intense/deep emotion
    02.02.13. Sincere/earnest emotion
    02.02.14. Zeal/earnest enthusiasm
    02.02.15. Strong feeling/passion
    02.02.16. Violent emotion
    02.02.17. Excitement
    02.02.18. Composure/calmness
    02.02.19. Pleasure/enjoyment
    02.02.20. Mental pain/suffering
    02.02.21. Anger
    02.02.22. Love
    02.02.23. Hatred/enmity
    02.02.24. Indifference
    02.02.25. Pity/compassion
    02.02.26. Jealousy/envy
    02.02.27. Gratitude
    02.02.28. Pride
    02.02.29. Humility
    02.02.30. Fear
    02.02.31. Courage
    [Chart: percentage axis 0% to 60%; categories Sound, Sight, Taste, Smell, Touch]
  50. Taste

    Sentimentality 9.5%
    Intense/deep emotion 9.5%
    Emotional perception 8.3%
    Emotional attitude 7.4%
    Violent emotion 5.0%
    Emotion 4.1%
    Anger 3.9%
    Zeal/earnest enthusiasm 3.1%
  51. Touch

    Emotional perception 16.7%
    Emotional attitude 7.4%
    Capacity for emotion 4.7%
    Emotion 4.7%
    Quality of affecting emotions 2.7%
    Mental capacity 1.9%
    Effect produced on emotions 1.5%
    Excitement 1.4%
  52. Mapping Metaphor: N02 Wealth

    A13 Flow/flowing: affluent, flow, confluent
    H27 Attention, judgement: solid, juicy, plenty, enrich
    B28 Bodily shape/physique: fat, plum, pursy, full, opulent, fatten
    B06 Health and disease: well, strong, solid
    A07 Wild/uncultivated land
    B73 Food
  53. Mapping Metaphor: N03 Poverty

    E03 Destruction: ruin, waste
    D38 Matter, bad condition of: waste, decay
    T05 Moral evil: naught, ruin, fall, mean
    B28 Bodily shape/physique: pinched, starved, withered, poorness, feeble
    E45 Position, relative: bare, stark, skinned
    H31 Contempt: beggar, pinch, cheapo, bankrupt, lowness, ruin, poorly
    I15 Humility: lowness, embarrassed, broken, poorly
    E23 Harm/injury/detriment: mischief (‘They bee nowe in grete myschief and necessite.’)
    I06 Mental pain/suffering: stony
  54. 03.01.03.02 Civilization

    [Chart: vertical axis 5 to 25; horizontal axis 1350 to 1950; series: n: Lack of civilization; aj: Uncivilized; aj: Pertaining to civilization; av: In uncivilized manner; n: Civilization; vt: Render uncivilized; vt: Make civilized]
  55. Uncivilized | Wild

    wild a1300–
    wildern a1300
    fremd c1374
    Chaucer, Troylus & Crysede (c1374): Al this world is blynd In this matere, bothe fremed and tame.
    bestial c1400–
    Mandeville’s Voyages (c1400): Thei weren but bestyalle folk, and diden no thing but kepten Bestes.
  56. Uncivilized | Wild

    savage c1420/30–
    Dryden, The Conquest of Granada (1672): I am as free as Nature first made man, 'Ere the base Laws of Servitude began, When wild in woods the noble Savage ran.
    warrigal 1855–(1890)
    Australian Old Bush Songs (1855): I'm a warragle fellow that long hath dwelt In the wild interior, nor hath felt, Nor heard, nor seen the pleasures of town.
  57. Uncivilized | Rough/Crude

    rude 1483–
    raw 1577–
    Harrison, England, in Holinshed, Chronicles (1587): Men, being as then but raw and void of ciiuilitie.
    ruvid 1632
    Lithgow, The totall discourse of the rare adventures and painefull peregrinations of long nineteen yeares travayles (1632): The ruvid Cittizens, being Turkes, Moores, Iewes, … and Nostranes.
  58. Uncivilized | Barbar

    barbaric 1490-1533; a1837
    The sense-development in ancient times was (with the Greeks) ‘foreign, non-Hellenic,’ later ‘outlandish, rude, brutal’; (with the Romans) ‘not Latin nor Greek,’ then ‘pertaining to those outside the Roman empire’; hence ‘uncivilized, uncultured,’ and later ‘non-Christian,’ whence ‘Saracen, heathen’; and generally ‘savage, rude, savagely cruel, inhuman’. [J. A. H. Murray, Etymology for barbarous, A New English Dictionary on Historical Principles, Fa.3, 1887]
  59. Uncivilized | Barbar

    barbaric 1490-1533; a1837
    Aikin, General Biography (1799): At length, he came forth in all the splendor of his imperial dignity to give them an amicable welcome, and the Spanish historians employ the loftiest terms in describing the barbaric grandeur of his appearance.
    barbar 1535-a1726
    barbarous 1538-
    barbarious 1570-1762
    barbarian 1591-
    semi-barbarous 1798-
    semi-barbaric 1864
  60. Uncivilized | Civilness

    incivil 1586
    uncivilized 1607–
    incivilized 1647
    Cowley, Welcome (The Mistress) (1647): Either by savages possest, Or wild and uninhabited? What joy couldst take, or what repose, In countries so unciviliz'd as those?
  61. Uncivilized | Civilness

    inhumane a1680
    Butler, Remains (a1680): There's nothing so absurd, or vain, Or barbarous, or inhumane, But if it lay the least Pretence To Piety and Godliness… Does sacred instantly commence.
    irreclaimed 1814
    pre-civilized 1953–
  62. Uncivilized | The Other

    Scythical 1559-1602
    Herring, Anatomyes of the true physition and counterfeit mounte-banke (1602): Such Schythicall… torturing and massacring of Men.
    negerous 1609
  63. Uncivilized | The Other

    mountainous 1613-1851
    Mainwaring and Oldmixton, in Ellis, Swift vs. Mainwaring (1711): England… bounded on the North by a poor mountainous People call'd Scots.
    tramontane 1739-1832
  64. Uncivilized | The Other

    jungle 1908–
    jungli 1920–
    Chambers’ Journal (Jan 1927): Already he ceases to be jungli*. Note: Wild and boorish, a clodhopper or uneducated peasant.
  65. Sydney Smith, Letter to Francis Jeffrey (Mar 1814):

    When shall I see Scotland again? Never shall I forget the happy days I passed there amidst odious smells, barbarous sounds, bad suppers, excellent hearts, and most enlightened and cultivated understandings.