Slide 40
LLMS VS. TASK-SPECIFIC MODELS
CoNLL 2003 NER

Model           F-Score   Speed (words/s)
GPT-3.5 (1)     78.6      < 100
GPT-4 (1)       83.5      < 100
spaCy           91.6      4,000
Flair           93.1      1,000
SOTA 2023 (2)   94.6      1,000
SOTA 2003 (3)   88.8      > 20,000

1. Ashok and Lipton (2023), 2. Wang et al. (2021), 3. Florian et al. (2003)

Annotations on the slide: "SOTA on few-shot prompting", "RoBERTa-base"
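To make the speed column more concrete, here is a minimal sketch that converts the words/s figures from the table into processing time for a corpus. The corpus size of 1,000,000 words and the helper names `seconds_for` and `describe` are assumptions made for illustration, not part of the slide. Because the slide gives only bounds for the GPT rows ("< 100") and for SOTA 2003 ("> 20,000"), the resulting times for those rows are bounds as well.

```python
# Speeds (words/s) copied from the slide's table. The GPT rows carry only an
# upper bound ("< 100") and SOTA 2003 only a lower bound ("> 20,000"), so the
# derived times for those rows are bounds, not exact values.
SPEEDS = {
    "GPT-3.5": ("<", 100),
    "GPT-4": ("<", 100),
    "spaCy": ("=", 4_000),
    "Flair": ("=", 1_000),
    "SOTA 2023": ("=", 1_000),
    "SOTA 2003": (">", 20_000),
}


def seconds_for(words: int, speed: float) -> float:
    """Time in seconds to process `words` at `speed` words per second."""
    return words / speed


def describe(name: str, words: int) -> str:
    """Format the processing time for one model, respecting the bounds."""
    relation, speed = SPEEDS[name]
    seconds = seconds_for(words, speed)
    # An upper bound on speed gives a lower bound on time, and vice versa.
    prefix = {"<": "more than ", ">": "less than ", "=": ""}[relation]
    return f"{name}: {prefix}{seconds:,.0f} s"


if __name__ == "__main__":
    corpus_words = 1_000_000  # hypothetical corpus size, not from the slide
    for name in SPEEDS:
        print(describe(name, corpus_words))
```

Under these assumptions, the gap between the rows is the ratio of their speeds: a model at 4,000 words/s finishes in one fortieth of the time of a model capped at 100 words/s.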
Text Classification
[Figure: accuracy (y-axis, 65 to 100) on % of examples (x-axis: 1%, 5%, 10%, 20%, 50%, 100%) for SST2, AG News, and Banking77, with a GPT-3 baseline.]
Explosion (2023), to be released