The length of words reflects their conceptual complexity.

Molly Lewis and Michael C. Frank
The role of communicative pressures in shaping the lexicon

“The linguistic sign is arbitrary” – Saussure (1916)
[Figure: the word "horse" in many languages, including kalë, ձի, at, zaldi, конь, konj, кон, cavall, kabayo, kůň, hest, paard, ĉevalo, hobune, hevonen, cheval, cabalo, Pferd, άλογο, ઘોડો, chwal, doki, סוס, घोड़ा, nees, ló, hestur, anyịnya, kuda, capall, cavallo, jaran, សេះ, equo, zirgs, arklys, коњ, hoiho, घोडा, адуу, koń, cavalo, cal, лошадь, kôň, faras, caballo, farasi, häst, ม้า, кінь, ngựa, ceffyl]
However, there are limits to arbitrariness (Köhler, 1929; Maurer et al., 2006; Ramachandran & Hubbard, 2001; Farmer, Christiansen, & Monaghan, 2006; Zipf, 1936; Piantadosi, Tily, & Gibson, 2011).

Complexity Bias
A bias to map longer words (in terms of phonemes, morphemes, or syllables) to more complex referents.
Example of a long novel word: "tupabugorn"

Complexity Bias
Theories of communication predict a tradeoff between length and predictability:
– Horn implicatures (Horn, 1984): "I turned on the car." (typical) vs. "I got the car to turn on." (atypical)
– Uniform Information Density (Aylett & Turk, 2004; A. Frank & Jaeger, 2008)

Outline
I. Do participants have a productive complexity bias?
– Novel real objects (Study 1)
– Artificial objects (Study 2)
II. What is complexity? (Study 3)
III. Is there a complexity bias in the lexicon?
– English (Study 4)
– Cross-linguistically (Study 5)

Study 1a: Explicit complexity judgment

N = 60
[Figure: novel objects arranged from least complex to most complex]

Study 1b: Mapping task
Map a novel word to a novel object, given 2 alternatives.

Study 1b: Design
Referent complexity x word length (within subject)
Linguistic stimuli:
– short words (e.g., "bugorn," "ratum," "lopus")
– long words (e.g., "tupabugorn," "gaburatum," "fepolopus")
Referent stimuli:
– Divided objects into quintiles, based on explicit complexity norms
– Tested every pairing of quintiles (15 conditions): 1/1, 1/2, 1/3, 1/4, 1/5, 2/2, 2/3, etc.
Procedure: 8 trials/participant
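The quintile split and the 15 pairings in the design can be sketched as follows. This is a minimal illustration, not the authors' code: the object names, the ratings, and the helper `assign_quintiles` are hypothetical, and ties in the ratings are broken by sort order.

```python
from itertools import combinations_with_replacement


def assign_quintiles(ratings):
    """Map each item to a quintile (1 = least complex, 5 = most complex).

    `ratings` is a dict of item -> complexity rating. Items are ranked by
    rating and split into five equal-sized bins (hypothetical helper).
    """
    ranked = sorted(ratings, key=ratings.get)
    n = len(ranked)
    return {item: (i * 5) // n + 1 for i, item in enumerate(ranked)}


# Hypothetical complexity ratings for ten objects (not data from the study)
ratings = {f"obj{i:02d}": i / 10 for i in range(10)}
quintiles = assign_quintiles(ratings)

# Every unordered pairing of quintiles, including same-quintile pairs
conditions = [
    f"{a}/{b}" for a, b in combinations_with_replacement(range(1, 6), 2)
]

print(len(conditions))  # number of pairings
print(conditions)
```

Pairing five quintiles with repetition gives 5 same-quintile pairs plus 10 mixed pairs, which matches the 15 conditions listed on the slide.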

Example 1/5 trial
[Figure: example objects from quintiles 1 to 5]

Study 1b: Results
N = 1500
[Figure: effect size (Cohen's d) plotted against complexity rating ratio for the 15 quintile pairings (1/1 to 5/5); r = −0.7. Plot annotations: "Target biased to have long label", "Target less complex".]

Is there a productive complexity bias?
Evidence for a productive complexity bias in an online mapping task.
But: complexity was manipulated only correlationally, which makes causation difficult to interpret.
Study 2: Direct manipulation of complexity

Study 2: Stimuli
[Figure: artificial objects from quintiles 1 to 5, with an example 1/5 trial]

Study 2: Results
N = 750
[Figure: effect size (Cohen's d) plotted against complexity rating ratio for the 15 quintile pairings (1/1 to 5/5); r = −0.87. Plot annotations: "Target biased to have long label", "Target less complex".]

What is complexity?
Evidence for a productive complexity bias in an online mapping task
– Manipulating complexity both correlationally and directly
Complexity was quantified in terms of visual complexity.
But: What is the underlying complexity construct?

What is complexity?
In visual cognition, processing time is used as an index of information load (Alvarez & Cavanagh, 2004)
– more information requires more processing time
– not a perfect measure, but we expect a monotonic relationship
– search rate task
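One way to check for a monotonic (not necessarily linear) relationship between complexity and processing time is a rank correlation. The sketch below uses Spearman's rho; this is an illustrative choice, not necessarily the analysis used in the talk, which reports Pearson's r. The complexity values and study times are hypothetical, and the shortcut formula assumes no tied values.

```python
import math


def ranks(values):
    """Return 1-based ranks of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(xs, ys):
    """Spearman's rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1)).

    This shortcut formula is only exact when neither variable has ties.
    """
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical data: complexity norms and study times in milliseconds
complexity = [0.1, 0.3, 0.5, 0.7, 0.9]
study_time_ms = [1100, 1250, 1300, 1600, 2100]
log_rt = [math.log(t) for t in study_time_ms]

rho = spearman_rho(complexity, log_rt)
print(rho)
```

Because the log transform is itself monotonic, rho is the same whether raw or log study times are used; the log only matters for a linear measure such as Pearson's r.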

Study 3: Implicit complexity judgment
Recognition memory task: measure study time per object (30 objects) (60 objects)

Novel object complexity norms
[Figure: log RT (ms) plotted against object complexity norms; r = 0.52, N_RT = 494, N_C = 60]

Study 3a: Novel Real Objects
[Figure, two panels, each plotting effect size (Cohen's d) for the 15 quintile pairings (1/1 to 5/5). Complexity Norms panel: x-axis is complexity rating ratio, r = −0.7. RT Norms panel: x-axis is RT ratio, r = −0.71.]

Study 3b: Artificial Objects
[Figure, two panels, each plotting effect size (Cohen's d) for the 15 quintile pairings (1/1 to 5/5). Complexity Norms panel: x-axis is complexity rating ratio, r = −0.87. RT Norms panel: x-axis is RT ratio, r = −0.8.]

Is this bias in natural language?
Exp. 1-2: suggest a productive complexity bias with novel words.
Exp. 3: the complexity bias is related to processing time.
Next: Is this bias present in natural languages?
Study 4: Explicit complexity norms for English words

Study 4: English complexity norms
Rate words for complexity (example word: "alphabet")

Complexity norms
– Normed 499 English words
– 30 words/participant
– N = 250 participants
[Figure: histogram of word lengths; x-axis is word length (characters), y-axis is frequency]

Study 4: Results
N = 250
Characters: r = 0.69 (plotted), r_CL·F = .60
Phonemes: r_CL = .69, r_CL·F = .61
Syllables: r_CL = .67, r_CL·F = .58
[Figure: complexity rating (1 to 7) plotted against word length in characters; r = 0.69]
Reliable controlling for concreteness, familiarity, and imageability
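The slide does not define the coefficient with the extra F subscript. Assuming it is a first-order partial correlation between complexity (C) and length (L) with a third variable F held constant, it can be computed from the three pairwise correlations as below. The function name `partial_corr` and the input values are hypothetical, chosen only to illustrate the formula.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical pairwise correlations (not values from the study)
r_cl = 0.6  # complexity vs. length
r_cf = 0.5  # complexity vs. the control variable
r_lf = 0.5  # length vs. the control variable

r_cl_given_f = partial_corr(r_cl, r_cf, r_lf)
print(round(r_cl_given_f, 4))
```

When the control variable is uncorrelated with both complexity and length, the partial correlation equals the raw correlation; the more variance the control shares with both, the further the two coefficients diverge.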

Study 5: Cross-linguistic
Evidence that complexity is related to length in English (controlling for other semantic variables).
But: does this extend to other languages?
Examined the relationship between complexity norms and word lengths for the normed words in 80 languages
– Translations from Google Translate
– Native speakers hand-checked 12 languages
– Accuracy: 92%
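The per-language analysis can be sketched as a Pearson correlation between the English complexity norm of each concept and the length of its translation. This is a minimal illustration under stated assumptions: the norms, the translation table, and the helper `length_complexity_r` are hypothetical, and length is counted in characters with `len()`, which counts Unicode code points.

```python
import math


def pearson(xs, ys):
    """Pearson's r between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def length_complexity_r(words_by_concept, norms):
    """Correlate complexity norms with word length for one language."""
    concepts = sorted(norms)
    ratings = [norms[c] for c in concepts]
    lengths = [len(words_by_concept[c]) for c in concepts]
    return pearson(ratings, lengths)


# Hypothetical complexity norms on a 1-7 scale (not values from the study)
norms = {"dog": 2.0, "tree": 2.5, "alphabet": 4.5, "government": 6.0}

# Hypothetical translation table keyed by language, then by English concept
translations = {
    "english": {"dog": "dog", "tree": "tree",
                "alphabet": "alphabet", "government": "government"},
    "spanish": {"dog": "perro", "tree": "árbol",
                "alphabet": "alfabeto", "government": "gobierno"},
}

by_language = {
    lang: length_complexity_r(words, norms)
    for lang, words in translations.items()
}
for lang, r in by_language.items():
    print(lang, round(r, 2))
```

Character counts are only a rough proxy across writing systems, since scripts differ in how much one character encodes.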

Correlation between complexity norm and word length
[Figure: Pearson's r by language (axis from 0.0 to 0.6); # open-class words = 453. Languages shown: English, Afrikaans, Maltese, Danish, Norwegian, Macedonian, Yiddish, Dutch, Russian, Serbian, Croatian, Portuguese, Esperanto, Galician, Basque, Bosnian, Welsh, Armenian, Italian, Swedish, Georgian, Belarusian, Icelandic, Estonian, Bulgarian, German, Hungarian, Latvian, Ukrainian, Spanish, Thai, French, Nepali, Polish, Chinese, Czech, Hmong, Slovenian, Slovak, Mongolian, Hindi, Zulu, Vietnamese, Finnish, Swahili, Irish, Lao, Hausa, Filipino, Lithuanian, Haitian Creole, Romanian, Khmer, Punjabi, Catalan, Gujarati, Indonesian, Greek, Hebrew, Azerbaijani, Malay, Cebuano, Javanese, Albanian, Kannada, Turkish, Yoruba, Maori, Somali, Korean, Telugu, Urdu, Tamil, Bengali, Arabic, Latin, Japanese, Igbo, Persian, Marathi]

Cross-Linguistic Complexity Bias
[Figure: map showing the geographical distribution of the complexity bias]

Complexity bias by language family
[Figure: Pearson's r by language family (axis from 0.0 to 0.4). Families shown: Basque, Kartvelian, Indo-European, Uralic, Sino-Tibetan, Tai-Kadai, Hmong-Mien, Austro-Asiatic, Creoles and Pidgins, Afro-Asiatic, Altaic, Austronesian, Niger-Congo, Korean, Dravidian, Japanese]

Conclusion
Evidence for:
– a complexity bias in the lexicon
– productive
– related to a basic cognitive process
Suggests:
– complexity as a constraint on arbitrariness in language
– cognitive biases are reflected in the structure of the lexicon
– communicative biases may shape the lexicon

Thank you
