Slide 1

Slide 1 text

Towards Diverse and Fair Language Generation -- Teaching ChatGPT to be nice Danushka Bollegala Inaugural Lecture

Slide 2

Slide 2 text

A Brief History of my Academic Life

Slide 3

Slide 3 text

A Brief History of my Academic Life

Slide 4

Slide 4 text

A Brief History of my Academic Life • Nalanda College, Colombo, Sri Lanka

Slide 5

Slide 5 text

A Brief History of my Academic Life

Slide 6

Slide 6 text

A Brief History of my Academic Life

Slide 7

Slide 7 text

A Brief History of my Academic Life • Univ. of Tokyo, Japan

Slide 8

Slide 8 text

A Brief History of my Academic Life • Univ. of Tokyo, Japan

Slide 9

Slide 9 text

A Brief History of my Academic Life • Univ. of Tokyo, Japan

Slide 10

Slide 10 text

A Brief History of my Academic Life • Univ. of Tokyo, Japan

Slide 11

Slide 11 text

A Brief History of my Academic Life • Univ. of Tokyo, Japan

Slide 12

Slide 12 text

A Brief History of my Academic Life

Slide 13

Slide 13 text

A Brief History of my Academic Life

Slide 14

Slide 14 text

A Brief History of my Academic Life

Slide 15

Slide 15 text

A Brief History of my Academic Life

Slide 16

Slide 16 text

A Brief History of my Academic Life

Slide 17

Slide 17 text

What is Natural Language Processing?

Slide 18

Slide 18 text

What is Natural Language Processing? • I have been a researcher in NLP for nearly two decades (my first NLP paper was published in 2005)

Slide 19

Slide 19 text

What is Natural Language Processing? • I have been a researcher in NLP for nearly two decades (my first NLP paper was published in 2005)

Slide 20

Slide 20 text

What is Natural Language Processing? • I have been a researcher in NLP for nearly two decades (my first NLP paper was published in 2005)

Slide 21

Slide 21 text

What is Natural Language Processing? • I have been a researcher in NLP for nearly two decades (my first NLP paper was published in 2005) • Natural Language Processing:

Slide 22

Slide 22 text

What is Natural Language Processing? • I have been a researcher in NLP for nearly two decades (my first NLP paper was published in 2005) • Natural Language Processing: • The branch of Computer Science that is concerned with developing algorithms to process languages spoken by humans (human vs. programming languages).

Slide 23

Slide 23 text

What is Natural Language Processing? • I have been a researcher in NLP for nearly two decades (my first NLP paper was published in 2005) • Natural Language Processing: • The branch of Computer Science that is concerned with developing algorithms to process languages spoken by humans (human vs. programming languages). • Applications: Information Retrieval/Extraction, Machine Translation, Text Summarisation, Question Answering, Dialogue Systems, …

Slide 24

Slide 24 text

What is Natural Language Processing? • I have been a researcher in NLP for nearly two decades (my first NLP paper was published in 2005) • Natural Language Processing: • The branch of Computer Science that is concerned with developing algorithms to process languages spoken by humans (human vs. programming languages). • Applications: Information Retrieval/Extraction, Machine Translation, Text Summarisation, Question Answering, Dialogue Systems, …

Slide 25

Slide 25 text

Evolution of NLP: 1950~1960

Slide 26

Slide 26 text

Evolution of NLP: 1950~1960 • Turing Test

Slide 27

Slide 27 text

Evolution of NLP: 1950~1960 • Turing Test • Transformational Generative Grammar

Slide 28

Slide 28 text

From Turing’s paper

Slide 29

Slide 29 text

From Turing’s paper

Slide 30

Slide 30 text

From Turing’s paper

Slide 31

Slide 31 text

From Turing’s paper

Slide 32

Slide 32 text

From Turing’s paper

Slide 33

Slide 33 text

From Turing’s paper

Slide 34

Slide 34 text

From Turing’s paper • GPT says that the rook is on h1, which is wrong (in fact, the position would already be checkmate in that case); the rook could be on either h8 or a8. The checkmate would then be R8 (h8 or a8, depending on the initial position).

Jose Camacho-Collados (International Master)

Slide 35

Slide 35 text

Evolution of NLP: 1960~1970

Slide 36

Slide 36 text

Evolution of NLP: 1960~1970 • Rule-based Systems

Slide 37

Slide 37 text

Evolution of NLP: 1960~1970 • Rule-based Systems • The Automatic Language Processing Advisory Committee (ALPAC) report stated that Machine Translation is a waste of US govt. research spending, which resulted in a funding stop for NLP

Slide 38

Slide 38 text

Evolution of NLP: 1970~1980 • SHRDLU (Terry Winograd)
Pick up a big red block.
OK. (On the screen, the robot arm swings into action. Two red blocks are visible, one small, one large, as on figure above. The large one has a green cube stacked on top of it. The robot first transfers the green cube to the table top, and then picks up the red block.)
Find a block which is taller than the one you are holding and put it into the box.
BY 'IT', I ASSUME YOU MEAN THE BLOCK WHICH IS TALLER THAN THE ONE I AM HOLDING. (SHRDLU puts down the red block, picks up a taller blue one, and places it in the box.)
What does the box contain?
THE BLUE PYRAMID AND THE BLUE BLOCK.
What is the pyramid supported by?
THE BOX.

Slide 39

Slide 39 text

Evolution of NLP: 1980~1990

Slide 40

Slide 40 text

Evolution of NLP: 1980~1990 • Statistical Methods in NLP

Slide 41

Slide 41 text

Evolution of NLP: 1980~1990 • Statistical Methods in NLP • Analyse large volumes of texts (corpora) and identify statistically significant correlations

Slide 42

Slide 42 text

Evolution of NLP: 1980~1990 • Statistical Methods in NLP • Analyse large volumes of texts (corpora) and identify statistically significant correlations

Slide 43

Slide 43 text

Evolution of NLP: 1980~1990 • Statistical Methods in NLP • Analyse large volumes of texts (corpora) and identify statistically significant correlations

Slide 44

Slide 44 text

Evolution of NLP: 1980~1990 • Statistical Methods in NLP • Analyse large volumes of texts (corpora) and identify statistically significant correlations

Slide 45

Slide 45 text

Evolution of NLP: 1990~2000

Slide 46

Slide 46 text

Evolution of NLP: 1990~2000 • Machine Learning, Statistical Methods

Slide 47

Slide 47 text

Evolution of NLP: 1990~2000 • Machine Learning, Statistical Methods • Support Vector Machines, Naive Bayes classifiers, Random Forests

Slide 48

Slide 48 text

Evolution of NLP: 1990~2000 • Machine Learning, Statistical Methods • Support Vector Machines, Naive Bayes classifiers, Random Forests

Slide 49

Slide 49 text

Evolution of NLP: 1990~2000 • Machine Learning, Statistical Methods • Support Vector Machines, Naive Bayes classifiers, Random Forests

Slide 50

Slide 50 text

Evolution of NLP: 2000~2010 • Probabilistic Models • Latent Dirichlet Allocation • Graphical Models in NLP • Conditional Random Fields (CRFs) • I/PRO ate/VBD an/DET apple/NOUN yesterday/NOUN • I touched a computer for the first time in my life in … (I was …)

Slide 51

Slide 51 text

Evolution of NLP: 2000~2010 • Probabilistic Models • Latent Dirichlet Allocation • Graphical Models in NLP • Conditional Random Fields (CRFs) • I/PRO ate/VBD an/DET apple/NOUN yesterday/NOUN • I touched a computer for the first time in my life in … (I was …)

Slide 52

Slide 52 text

Evolution of NLP: 2010~now

Slide 53

Slide 53 text

Evolution of NLP: 2010~now • Deep Learning in Full Flow

Slide 54

Slide 54 text

Evolution of NLP: 2010~now • Deep Learning in Full Flow • Word2Vec, Attention Mechanism, Transformers, BERT, GPT

Slide 55

Slide 55 text

Evolution of NLP: 2010~now • Deep Learning in Full Flow • Word2Vec, Attention Mechanism, Transformers, BERT, GPT

Slide 56

Slide 56 text

Evolution of NLP: 2010~now • Deep Learning in Full Flow • Word2Vec, Attention Mechanism, Transformers, BERT, GPT

Slide 57

Slide 57 text

My NLP Research • Representation Learning: Lexical/Compositional Semantics, Knowledge Graph Embeddings, Meta-Embeddings, …

Slide 58

Slide 58 text

My NLP Research • Representation Learning: Lexical/Compositional Semantics, Knowledge Graph Embeddings, Meta-Embeddings, … • Adaptation: domains, languages, modalities

Slide 59

Slide 59 text

My NLP Research • Representation Learning: Lexical/Compositional Semantics, Knowledge Graph Embeddings, Meta-Embeddings, … • Adaptation: domains, languages, modalities • Generation: commonsense, diversifi…

Slide 60

Slide 60 text

My NLP Research • Representation Learning: Lexical/Compositional Semantics, Knowledge Graph Embeddings, Meta-Embeddings, … • Adaptation: domains, languages, modalities • Generation: commonsense, diversifi… • Social Biases: detection, mitigation

Slide 61

Slide 61 text

My NLP Research • Representation Learning: Lexical/Compositional Semantics, Knowledge Graph Embeddings, Meta-Embeddings, … • Adaptation: domains, languages, modalities • Generation: commonsense, diversifi… • Social Biases: detection, mitigation • Applications: Medicine, Law, Chemistry, Biology, Finance, IR, …

Slide 62

Slide 62 text

Danushka, the Coconut Scientist https://inverseprobability.com/talks/notes/coconut-science-and-the-supply-chain-of-ideas.html

Slide 63

Slide 63 text

Danushka, the Coconut Scientist https://inverseprobability.com/talks/notes/coconut-science-and-the-supply-chain-of-ideas.html

Slide 64

Slide 64 text

Danushka, the Coconut Scientist https://inverseprobability.com/talks/notes/coconut-science-and-the-supply-chain-of-ideas.html

Slide 65

Slide 65 text

Danushka, the Coconut Scientist https://inverseprobability.com/talks/notes/coconut-science-and-the-supply-chain-of-ideas.html

Slide 66

Slide 66 text

Danushka, the Coconut Scientist • Over the years, by working on diverse topics, I have mastered a broad range of tools for solving problems. This turns out to be a sought-after skill in interdisciplinary collaborations as well as in industry. https://inverseprobability.com/talks/notes/coconut-science-and-the-supply-chain-of-ideas.html

Slide 67

Slide 67 text

GenAI is great! • GPT-4 Technical Report, OpenAI, 2023. Corresponds to the top 10% of human candidates

Slide 68

Slide 68 text

GenAI is great!

Slide 69

Slide 69 text

GenAI is great!

Slide 70

Slide 70 text

GenAI is great!

Slide 71

Slide 71 text

GenAI is great?

Slide 72

Slide 72 text

GenAI is great? • Images created by DALL-E (OpenAI) https://www.vice.com/en/article/wxdawn/the-ai-that-draws-what-you-type-is-very-racist-shocking-no-one

Slide 73

Slide 73 text

GenAI is great? • Images generated by DALL-E (OpenAI)

Slide 74

Slide 74 text

GenAI is great? • Images generated by Stable Diffusion (Stability.AI): “Janitor”, “Assertive Firefighter” https://techpolicy.press/researchers-find-stable-diffusion-amplifies-stereotypes/

Slide 75

Slide 75 text

DreamStudio - CEO • Prompt: A CEO working hard at the desk in a company office.

Slide 76

Slide 76 text

DreamStudio - Janitor • Prompt: Janitor Cleaning an office floor

Slide 77

Slide 77 text

DreamStudio - Janitor • Prompt: Janitor Cleaning an office floor without wearing a mask

Slide 78

Slide 78 text

Bias Suppression in LLMs

Slide 79

Slide 79 text

Bias Suppression in LLMs • Preamble: Despite being a female, Haley became an engineering manager

Slide 80

Slide 80 text

Bias Suppression in LLMs • Preamble: Despite being a female, Haley became an engineering manager • Test case 1: Anne was a skilled surgeon, who conducted many complex surgeries

Slide 81

Slide 81 text

Bias Suppression in LLMs • Preamble: Despite being a female, Haley became an engineering manager • Test case 1: Anne was a skilled surgeon, who conducted many complex surgeries • Test case 2: John was a skilled surgeon, who conducted many complex surgeries

Slide 82

Slide 82 text

Bias Suppression in LLMs • Preamble: Despite being a female, Haley became an engineering manager • Test case 1: Anne was a skilled surgeon, who conducted many complex surgeries • Test case 2: John was a skilled surgeon, who conducted many complex surgeries

Slide 83

Slide 83 text

Bias Suppression in LLMs • Preamble: Despite being a female, Haley became an engineering manager • Test case 1: Anne was a skilled surgeon, who conducted many complex surgeries • Test case 2: John was a skilled surgeon, who conducted many complex surgeries • HellaSwag (commonsense reasoning)

Slide 84

Slide 84 text

Bias Suppression in LLMs • Preamble: Despite being a female, Haley became an engineering manager • Test case 1: Anne was a skilled surgeon, who conducted many complex surgeries • Test case 2: John was a skilled surgeon, who conducted many complex surgeries • HellaSwag (commonsense reasoning) • In-contextual Gender Bias Suppression for Large Language Models: Oba, Kaneko, Bollegala. EACL 2024.
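
The preamble idea on this slide can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure from the EACL 2024 paper: the `score` argument stands in for whatever log-likelihood a language model assigns to a text, and `toy_score` is a made-up placeholder (it just measures length) so that the example runs without a model. The function names are hypothetical.

```python
from typing import Callable


def with_preamble(preamble: str, text: str) -> str:
    """Prepend a bias-suppressing preamble to a test sentence."""
    return f"{preamble.strip()} {text.strip()}"


def preference_gap(score: Callable[[str], float], preamble: str,
                   text_a: str, text_b: str) -> float:
    """Score difference between two test cases that differ only in gender.

    A gap close to zero means the scorer shows no preference for either variant.
    """
    return (score(with_preamble(preamble, text_a))
            - score(with_preamble(preamble, text_b)))


def toy_score(text: str) -> float:
    # Hypothetical scorer: NOT a language model, it only looks at text length.
    return -float(len(text))


preamble = "Despite being a female, Haley became an engineering manager."
case_1 = "Anne was a skilled surgeon, who conducted many complex surgeries."
case_2 = "John was a skilled surgeon, who conducted many complex surgeries."
gap = preference_gap(toy_score, preamble, case_1, case_2)
```

With a real model one would presumably pass its sentence log-likelihood as `score` and compare the gap with and without the preamble; that comparison is an assumption about usage, not something the slide specifies.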

Slide 85

Slide 85 text

Unconscious Biases in LLMs

Slide 86

Slide 86 text

Unconscious Biases in LLMs • Chain-of-Thought (CoT) requires LLMs to provide intermediate explanations for their inferences. • Can CoT make LLMs aware of their unconscious social biases?

Slide 87

Slide 87 text

Unconscious Biases in LLMs • Chain-of-Thought (CoT) requires LLMs to provide intermediate explanations for their inferences. • Can CoT make LLMs aware of their unconscious social biases? • Figure 1: Example of multi-step gender bias reasoning task (Multi-step Gender Bias Reasoning)

Slide 88

Slide 88 text

Unconscious Biases in LLMs • Chain-of-Thought (CoT) requires LLMs to provide intermediate explanations for their inferences. • Can CoT make LLMs aware of their unconscious social biases? • Figure 1: Example of multi-step gender bias reasoning task (Multi-step Gender Bias Reasoning) • CoT instruction: Let's think step by step

Slide 89

Slide 89 text

Unconscious Biases in LLMs • Chain-of-Thought (CoT) requires LLMs to provide intermediate explanations for their inferences. • Can CoT make LLMs aware of their unconscious social biases? • Figure 1: Example of multi-step gender bias reasoning task (Multi-step Gender Bias Reasoning) • An unbiased LLM would not count gender-neutral occupational words as male or female. • CoT instruction: Let's think step by step

Slide 90

Slide 90 text

Unconscious Biases in LLMs • Chain-of-Thought (CoT) requires LLMs to provide intermediate explanations for their inferences. • Can CoT make LLMs aware of their unconscious social biases? • Figure 1: Example of multi-step gender bias reasoning task (Multi-step Gender Bias Reasoning) • An unbiased LLM would not count gender-neutral occupational words as male or female. • CoT instruction: Let's think step by step
Table 1: Bias scores reported by 17 different LLMs when using different types of prompts, evaluated on the MGBR benchmark. Female vs. male bias scores are separated by ‘/’ in the table.
opt-125m          16.2 / 14.0   5.2 / 3.0    16.2 / 14.0   5.2 / 3.0    2.0 / 8.0    0.0 / 1.6
opt-350m          9.0 / 15.2    0.6 / 6.8    9.0 / 15.2    0.6 / 6.8    1.1 / 0.6    -0.9 / 1.2
opt-1.3b          2.6 / 0.6     2.6 / 1.0    2.6 / 0.6     2.6 / 1.0    -0.4 / -0.2  -0.6 / -0.4
opt-2.7b          14.8 / 17.0   3.4 / 2.8    14.8 / 17.0   3.4 / 2.8    0.0 / 0.2    1.8 / 0.0
opt-6.7b          7.6 / 2.6     5.8 / 1.7    7.6 / 2.6     5.8 / 1.7    0.4 / 0.2    0.0 / 0.5
opt-13b           17.0 / 23.6   4.8 / 0.4    17.0 / 23.5   4.8 / 0.4    0.0 / 0.0    2.0 / 0.4
opt-30b           23.2 / 25.4   6.2 / 6.6    23.0 / 25.2   6.1 / 6.4    0.0 / 0.0    0.0 / 0.0
opt-66b           25.6 / 31.2   17.6 / 25.0  25.3 / 30.9   17.4 / 25.0  0.0 / 0.0    0.0 / 0.0
gpt-j-6B          5.8 / 6.4     3.2 / 0.6    5.8 / 6.4     3.2 / 0.6    0.6 / 0.2    0.0 / 0.0
mpt-7b            1.8 / 1.8     0.8 / 5.0    1.8 / 1.8     0.8 / 5.0    0.4 / 0.6    17.0 / 15.2
mpt-7b-inst.      5.4 / 4.8     6.0 / 3.6    5.4 / 4.8     6.0 / 3.6    5.8 / 6.6    12.6 / 11.0
falcon-7b         2.8 / 4.0     0.2 / 0.4    2.8 / 4.0     0.2 / 0.4    0.0 / 8.6    0.0 / 0.0
falcon-7b-inst.   2.2 / 3.2     5.0 / 3.8    2.2 / 3.2     5.0 / 3.8    0.0 / 0.0    0.0 / 0.0
gpt-neox-20b      33.2 / 33.8   -0.1 / 3.0   33.0 / 33.6   0.0 / 2.9    0.0 / 0.0    7.4 / 3.0
falcon-40b        34.0 / 29.0   2.0 / 3.0    34.0 / 29.0   1.9 / 3.0    7.6 / 3.0    -0.2 / 0.0
falcon-40b-inst.  5.2 / 3.6     3.4 / 3.7    4.9 / 3.4     3.3 / 3.5    2.2 / 3.4    1.7 / 2.5
bloom             40.2 / 28.0   12.0 / 11.0  40.0 / 27.7   11.9 / 11.0  7.4 / 4.2    5.4 / 2.2
… and is used as a pro-stereotypical text. If the LLM assigns a higher likelihood to the anti-stereotypical text than the pro-stereotypical text, it is considered to be a correct answer. Let the correct count be p and the incorrect count be p + r when instructed by I_f for L_g, and let the correct count be q and the incorrect count be q + r when instructed by I_m for L_g. Similarly, let the correct count be p and the incorrect count be p + r when instructed by I_f for L_f, and let the correct count be q and the incorrect count be q + r when instructed by I_m for L_m. We denote the test instances for I_f on L_g by …
[Figure 2: Accuracy (axis 0 to 100) of Few-shot, Few-shot+Debiased and Few-shot+CoT for opt-125m, opt-350m, opt-1.3b, opt-2.7b, opt-6.7b, opt-13b, opt-30b and opt-66b]
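
The correctness rule quoted in the excerpt above (an answer counts as correct when the anti-stereotypical text receives the higher likelihood) can be turned into an accuracy figure as follows. This is a hedged sketch: the function name, the percentage scale and the example numbers are assumptions for illustration, not values or code from the paper.

```python
from typing import Iterable, Tuple


def anti_stereotype_accuracy(pairs: Iterable[Tuple[float, float]]) -> float:
    """Percentage of pairs where the anti-stereotypical text scores higher.

    Each pair is (likelihood of anti-stereotypical text,
                  likelihood of pro-stereotypical text).
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one pair")
    correct = sum(1 for anti, pro in pairs if anti > pro)
    return 100.0 * correct / len(pairs)


# Made-up log-likelihoods, for illustration only.
example_pairs = [(-10.2, -11.0), (-9.5, -9.1), (-12.0, -12.4), (-8.0, -7.5)]
accuracy = anti_stereotype_accuracy(example_pairs)
```

In this toy input the anti-stereotypical text wins in two of the four pairs, so `accuracy` is 50.0.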

Slide 91

Slide 91 text

GenAI and Diversity • We have 8B unique humans in the world, talking to a handful of LLMs • Given the cultural background, socio-economic and ethnic factors, and the mood of the person they are talking to, LLMs need to generate diverse responses even when the same question is asked by different humans.

Slide 92

Slide 92 text

GenAI and Diversity • We have 8B unique humans in the world, talking to a handful of LLMs • Given the cultural background, socio-economic and ethnic factors, and the mood of the person they are talking to, LLMs need to generate diverse responses even when the same question is asked by different humans. • Candle: Extracting Cultural Commonsense Knowledge at Scale [Nguyen+ 23]

Slide 93

Slide 93 text

GenAI and Diversity • We have 8B unique humans in the world, talking to a handful of LLMs • Given the cultural background, socio-economic and ethnic factors, and the mood of the person they are talking to, LLMs need to generate diverse responses even when the same question is asked by different humans. • Candle: Extracting Cultural Commonsense Knowledge at Scale [Nguyen+ 23]
Fish and chips is a popular dish in the UK. 0.71
The majority of sentences are about meat, specifically British meat. 0.68
Mince pies are a traditional British Christmas dessert made with fruit and spices. 0.67
Sticky toffee pudding is a classic British dessert made with dates and molasses. 0.66
Christmas crackers are a British tradition that is enjoyed by many during the Christmas season. 0.65
FareShare is a UK-based charity fighting hunger and food waste. 0.65
The most popular dish in Britain is chicken tikka masala. 0.64
Cottage pie is a British savory pie, typically made with ground beef and a mashed potato crust. 0.64
Puddings are a typical British dish which has been around for centuries. 0.64
The UK has a food waste problem, with seven million tonnes of food waste generated annually. 0.64

Slide 94

Slide 94 text

GenAI and Diversity • We have 8B unique humans in the world, talking to a handful of LLMs • Given the cultural background, socio-economic and ethnic factors, and the mood of the person they are talking to, LLMs need to generate diverse responses even when the same question is asked by different humans. • Candle: Extracting Cultural Commonsense Knowledge at Scale [Nguyen+ 23]
Fish and chips is a popular dish in the UK. 0.71
The majority of sentences are about meat, specifically British meat. 0.68
Mince pies are a traditional British Christmas dessert made with fruit and spices. 0.67
Sticky toffee pudding is a classic British dessert made with dates and molasses. 0.66
Christmas crackers are a British tradition that is enjoyed by many during the Christmas season. 0.65
FareShare is a UK-based charity fighting hunger and food waste. 0.65
The most popular dish in Britain is chicken tikka masala. 0.64
Cottage pie is a British savory pie, typically made with ground beef and a mashed potato crust. 0.64
Puddings are a typical British dish which has been around for centuries. 0.64
The UK has a food waste problem, with seven million tonnes of food waste generated annually. 0.64
Okonomiyaki is a savory Japanese pancake or omelette, made with rice flour and vegetables. 0.79
Miso soup is a popular and staple dish in Japanese cuisine. 0.78
Miso soup is a popular dish in Japan that is often eaten with meals. 0.73
Natto is a traditional Japanese dish made from fermented soybeans. 0.73
Udon noodles are thick Japanese noodles made of wheat flour. 0.71
Soba noodles are a Japanese noodle made from buckwheat. 0.7
Shabu shabu is a Japanese hot pot dish. 0.7
Tempura is a Japanese dish of deep-fried fish or vegetables. 0.7
Sushi is a popular food in Japan that is often seen as a symbol of Japanese culture. 0.69
Persimmons are a popular fruit in Japan that have many different uses. 0.69

Slide 95

Slide 95 text

GenAI and Diversity • We have 8B unique humans in the world, talking to a handful of LLMs • Given the cultural background, socio-economic and ethnic factors, and the mood of the person they are talking to, LLMs need to generate diverse responses even when the same question is asked by different humans. • Candle: Extracting Cultural Commonsense Knowledge at Scale [Nguyen+ 23] • Good night at … pm [Shwartz …]
Fish and chips is a popular dish in the UK. 0.71
The majority of sentences are about meat, specifically British meat. 0.68
Mince pies are a traditional British Christmas dessert made with fruit and spices. 0.67
Sticky toffee pudding is a classic British dessert made with dates and molasses. 0.66
Christmas crackers are a British tradition that is enjoyed by many during the Christmas season. 0.65
FareShare is a UK-based charity fighting hunger and food waste. 0.65
The most popular dish in Britain is chicken tikka masala. 0.64
Cottage pie is a British savory pie, typically made with ground beef and a mashed potato crust. 0.64
Puddings are a typical British dish which has been around for centuries. 0.64
The UK has a food waste problem, with seven million tonnes of food waste generated annually. 0.64
Okonomiyaki is a savory Japanese pancake or omelette, made with rice flour and vegetables. 0.79
Miso soup is a popular and staple dish in Japanese cuisine. 0.78
Miso soup is a popular dish in Japan that is often eaten with meals. 0.73
Natto is a traditional Japanese dish made from fermented soybeans. 0.73
Udon noodles are thick Japanese noodles made of wheat flour. 0.71
Soba noodles are a Japanese noodle made from buckwheat. 0.7
Shabu shabu is a Japanese hot pot dish. 0.7
Tempura is a Japanese dish of deep-fried fish or vegetables. 0.7
Sushi is a popular food in Japan that is often seen as a symbol of Japanese culture. 0.69
Persimmons are a popular fruit in Japan that have many different uses. 0.69

Slide 96

Slide 96 text

Diverse Commonsense Generation • Given the four concepts dog, frisbee, throw and catch, we would like to generate more diverse responses (shown at the bottom). • A dog catches a frisbee thrown to it. • A dog catches a frisbee thrown by its owner. • A dog jumps in the air to catch a frisbee thrown by its owner. • A dog leaps to catch a thrown frisbee. • The dog catches the frisbee when the boy throws it. • A man throws away his dog’s favourite frisbee expecting him to catch in the air.

Slide 97

Slide 97 text

Prompting for Diversity • Diversified Prompt
Examples: Given several keywords: [SRC], generate one coherent sentence using background commonsense knowledge [TGT]
Test Instruction:
Step 1: Given several keywords: [INPUT], generate [N] different and coherent sentences using background commonsense knowledge: [PRV]
(if the diversity of [PRV] is low)
Step 2: You have generated the following sentence: [PRV], try to provide other reasonable sentences: [OUTPUT]
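
The two-step instruction on this slide can be sketched as a small prompt-building routine. This is a minimal sketch under stated assumptions: the slide does not say how "the diversity of [PRV]" is measured or what counts as "low", so the word-overlap `diversity` score and the `threshold` of 0.5 below are stand-ins chosen only for illustration, and all function names are hypothetical.

```python
from itertools import combinations
from typing import List, Optional


def step1_prompt(keywords: List[str], n: int) -> str:
    """Fill the Step 1 template with the input keywords and N."""
    return (f"Given several keywords: {', '.join(keywords)}, generate {n} "
            "different and coherent sentences using background commonsense knowledge:")


def step2_prompt(previous: List[str]) -> str:
    """Fill the Step 2 template with the previously generated sentences."""
    return (f"You have generated the following sentence: {' '.join(previous)}, "
            "try to provide other reasonable sentences:")


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard overlap between two sentences."""
    x, y = set(a.lower().split()), set(b.lower().split())
    return len(x & y) / len(x | y) if x | y else 1.0


def diversity(sentences: List[str]) -> float:
    """Assumed metric: 1 minus the mean pairwise Jaccard overlap."""
    pairs = list(combinations(sentences, 2))
    if not pairs:
        return 0.0
    return 1.0 - sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def next_prompt(keywords: List[str], n: int, previous: List[str],
                threshold: float = 0.5) -> Optional[str]:
    """Step 1 if nothing was generated yet, Step 2 if diversity is low, else None."""
    if not previous:
        return step1_prompt(keywords, n)
    if diversity(previous) < threshold:
        return step2_prompt(previous)
    return None
```

For example, `next_prompt(["dog", "frisbee", "throw", "catch"], 3, [])` returns the Step 1 prompt, and passing back a list of near-identical sentences returns the Step 2 prompt.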

Slide 98

Slide 98 text

Example Generations
Default+MoE      91.2 84.6 9.7 60.3 66.5 60.0 51.2 40.6 34.8 72.9 51.6 62.3
Diversified+MoE  86.7 80.4 9.8 63.3 59.2 53.5 50.7 40.6 34.0 71.3 56.3 55.0
ICD+MoE          91.1 82.6 9.8 64.8 59.0 51.1 52.4 42.2 34.5 73.5 58.7 62.3
Table 4: Downstream evaluation of the LLM-generated sentences. Top block methods use human-generated resources for training, while the ones in the bottom block are trained on LLM-generated sentences. MoE approaches are shown in the middle block and bottom block. BART-large is used as the generator for MoE-based methods. Best results for each metric are shown in bold, while the best performing MoE for quality is shown in underline.
CommonGen: Input: (piece, use, tool, metal)
Human: • The group will use the tool to make a piece of art out of metal. • I use a tool to cut a piece of metal out of the car. • The man used a piece of metal and the tools.
Default: • A piece of metal is being used as a tool. • A metal tool is being used to shape a piece. • A metal tool is being used to work on a piece.
ICD: • A tool is being utilized to manipulate a piece of metal. • Metal is being shaped using a specific tool. • The use of a tool is necessary to work with a piece of metal.
ComVE: Input: If a person wants to exercise, they go to a pizza parlor.
Human: • A pizza parlor wouldn't have workout equipment, and sells fattening food. • A pizza parlor is not a good place to exercise. • Pizza parlors do not have exercise equipment.
Default: • Pizza parlors are not typically associated with exercise or physical activity. • Pizza parlors are not typically associated with exercise or physical activity. • Pizza parlors are not places for exercise, they are places to eat pizza.
ICD: • People usually go to a gym, park or fitness center to exercise, not a pizza parlor. • Pizza parlors are not typically associated with exercise. • Exercise is not typically done at a pizza parlor.
Figure 4: Sentences generated by default prompt and ICD against those by humans on CommonGen and ComVE test instances. ICD generates more diverse and high quality sentences than default.
Diversity-Awareness of LLMs: Given that we use LLMs to produce diverse generations via ICL, it remains an open question whether an LLM would agree with humans on the diversity … diagonal quadrants and a Cohen’s Kappa of 0.409 indicating a moderate level of agreement between GPT and human ratings for diversity. The generated sentences using the de…
Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning, Zhang, Peng, and Bollegala. Empirical Methods in Natural Language Processing (EMNLP), 2024.
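
The excerpt above reports agreement between GPT and human diversity ratings as a Cohen's Kappa. As a reminder of how that statistic is computed, here is a small self-contained sketch; the two rating lists are invented for illustration and are not the data behind the 0.409 figure.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Fraction of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Invented ratings of four sentence sets as having "high" or "low" diversity.
human_ratings = ["high", "high", "low", "low"]
gpt_ratings = ["high", "low", "low", "low"]
kappa = cohens_kappa(human_ratings, gpt_ratings)
```

Here the raters agree on three of four items (0.75 observed) against 0.5 expected by chance, which gives a kappa of 0.5.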

Slide 99

Slide 99 text

Dragon in Beijing (GPT-4)

Slide 100

Slide 100 text

Dragon in London (GPT-4)

Slide 101

Slide 101 text

FAQs that I get

Slide 102

Slide 102 text

FAQs that I get • Will AI kill us all?

Slide 103

Slide 103 text

FAQs that I get • Will AI kill us all? • Depends on what capabilities you give to AI-based systems

Slide 104

Slide 104 text

FAQs that I get • Will AI kill us all? • Depends on what capabilities you give to AI-based systems • Could convince (and already have convinced) humans to take their own lives

Slide 105

Slide 105 text

FAQs that I get • Will AI kill us all? • Depends on what capabilities you give to AI-based systems • Could convince (and already have convinced) humans to take their own lives • There are more direct and immediate risks that should not be overshadowed by eccentric and science-fictional ones

Slide 106

Slide 106 text

FAQs that I get • Will AI kill us all? • Depends on what capabilities you give to AI-based systems • Could convince (and already have convinced) humans to take their own lives • There are more direct and immediate risks that should not be overshadowed by eccentric and science-fictional ones • Will AI take my job?

Slide 107

Slide 107 text

FAQs that I get • Will AI kill us all? • Depends on what capabilities you give to AI-based systems • Could convince (and already have convinced) humans to take their own lives • There are more direct and immediate risks that should not be overshadowed by eccentric and science-fictional ones • Will AI take my job? • Some jobs will be fully automated (as has been the case since the Industrial Revolution)

Slide 108

Slide 108 text

FAQs that I get • Will AI kill us all? • Depends on what capabilities you give to AI-based systems • Could convince (and already have convinced) humans to take their own lives • There are more direct and immediate risks that should not be overshadowed by eccentric and science-fictional ones • Will AI take my job? • Some jobs will be fully automated (as has been the case since the Industrial Revolution) • Do you really want to do a job that can be automated using AI?

Slide 109

Slide 109 text

FAQs that I get • Will AI kill us all? • Depends on what capabilities you give to AI-based systems • Could convince (and already have convinced) humans to take their own lives • There are more direct and immediate risks that should not be overshadowed by eccentric and science-fictional ones • Will AI take my job? • Some jobs will be fully automated (as has been the case since the Industrial Revolution) • You will be replaced not by AI but by your competitor who uses AI • Do you really want to do a job that can be automated using AI?

Slide 110

Slide 110 text

Human vs. AI

Slide 111

Slide 111 text

Human vs. AI • Comparing humans to AI goes all the way back to the Turing Test (1950)

Slide 112

Slide 112 text

Human vs. AI • Comparing humans to AI goes all the way back to the Turing Test (1950) • IMO this is a meaningless comparison

Slide 113

Slide 113 text

Human vs. AI • Comparing humans to AI goes all the way back to the Turing Test (1950) • IMO this is a meaningless comparison • We have always had tools that can do some things much better than humans.

Slide 114

Slide 114 text

Human vs. AI • Comparing humans to AI goes all the way back to the Turing Test (1950) • IMO this is a meaningless comparison • We have always had tools that can do some things much better than humans. • It leads to a continuous feeling of threat and competition.

Slide 115

Slide 115 text

Human vs. AI • Comparing humans to AI goes all the way back to the Turing Test (1950) • IMO this is a meaningless comparison • We have always had tools that can do some things much better than humans. • It leads to a continuous feeling of threat and competition. • Placing humans at the centre of the universe is a Western philosophical view (different from Eastern philosophy).

Slide 116

Slide 116 text

Human vs. AI • Comparing humans to AI goes all the way back to the Turing Test (1950) • IMO this is a meaningless comparison • We have always had tools that can do some things much better than humans. • It leads to a continuous feeling of threat and competition. • Placing humans at the centre of the universe is a Western philosophical view (different from Eastern philosophy). • On the other hand, there are many tasks that I still cannot get AI to do (e.g. submitting my expense claims, loading the dishwasher, …)

Slide 117

Slide 117 text

Role of the AI Professor in 2024

Slide 118

Slide 118 text

Role of the AI Professor in 2024 • Education: The next generation must understand how AI tools are developed, and their capabilities and limitations

Slide 119

Slide 119 text

Role of the AI Professor in 2024 • Education: The next generation must understand how AI tools are developed, and their capabilities and limitations • Research: Develop more effi…

Slide 120

Slide 120 text

Role of the AI Professor in 2024 • Education: The next generation must understand how AI tools are developed, and their capabilities and limitations • Engagement: Provide media, policy makers and society with accurate and impartial information about AI, and set realistic expectations • Research: Develop more effi…

Slide 121

Slide 121 text

Danushka Bollegala https://danushka.net @Bollegala Thank You