Slide 1

Slide 1 text

ARC AGI Kaggle with Llama 3 – First Steps
PyData London 2024-07 lightning talk
@IanOzsvald – ianozsvald.com

Slide 2

Slide 2 text

LLMs are great at memorisation – can they reason?
F. Chollet argues that they’re bad at reasoning
$1M prize if an LLM (or other approach) can solve these challenges
Abstract shapes, “initial → target” in JSON
Open-weights models only (runs in an off-line env)
Abstraction & Reasoning Challenge
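For concreteness, each ARC task is a small JSON file holding "train" and "test" lists of input/output grid pairs, with cells as ints 0–9. A minimal sketch of reading one – the path and task ID here are illustrative:

```python
import json

# Each ARC task file holds "train" and "test" lists of {"input", "output"}
# grid pairs; cells are ints 0-9 (colours). Path and task ID are illustrative.
with open("data/training/007bbfb7.json") as f:
    task = json.load(f)

for pair in task["train"]:
    initial, target = pair["input"], pair["output"]
    print(f"{len(initial)}x{len(initial[0])} -> {len(target)}x{len(target[0])}")
```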

Slide 3

Slide 3 text

What rules do you need?

Slide 4

Slide 4 text

Llama.cpp with quantised Llama 3 8B (and 70B)
Python llama.cpp bindings
Ask for 200 candidate solutions
Try grid, list and grid+list representations
Grid only – poor. List – better. Grid+list – slightly better
First solution
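A rough sketch of the kind of loop this implies, using the llama-cpp-python bindings – the model filename, prompt wording and sampling settings are assumptions for illustration, not the exact ones from the talk:

```python
from llama_cpp import Llama

# Model path, prompt wording and sampling settings are illustrative.
llm = Llama(model_path="Meta-Llama-3-8B-Instruct.Q4_K_M.gguf",
            n_ctx=8192, n_gpu_layers=-1)  # offload all layers to the GPU

def grid_to_text(grid):
    # "grid" representation: one row of space-separated digits per line
    return "\n".join(" ".join(str(c) for c in row) for row in grid)

def ask_for_solutions(task, n=200):
    # Sample n candidate programs; most will be wrong, a few may run correctly
    examples = "\n\n".join(
        f"Input:\n{grid_to_text(p['input'])}\nOutput:\n{grid_to_text(p['output'])}"
        for p in task["train"])
    prompt = (examples + "\n\nWrite a Python function solve(grid) that maps "
              "each input grid to its output grid.")
    return [llm(prompt, max_tokens=512, temperature=0.8)["choices"][0]["text"]
            for _ in range(n)]
```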

Slide 5

Slide 5 text

Llama (normally) writes code
Failure modes: bad syntax, no code at all, raw_input, injection back into the training data (changing ints to strings)
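Those failure modes suggest filtering candidates before running them – e.g. reject anything that does not parse, or that calls obviously unusable names. A minimal sketch; the deny-list and helper names are illustrative:

```python
import ast

BANNED = {"raw_input", "input", "open", "exec", "eval"}  # illustrative deny-list

def looks_runnable(src: str) -> bool:
    # Reject candidates with bad syntax or obviously unusable calls
    try:
        tree = ast.parse(src)
    except SyntaxError:
        return False
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return not (names & BANNED)

def run_candidate(src: str, grid):
    # Execute a candidate's solve() on a grid, returning None on any error
    namespace: dict = {}
    try:
        exec(src, namespace)  # acceptable risk in an offline Kaggle environment
        return namespace["solve"](grid)
    except Exception:
        return None
```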

Slide 6

Slide 6 text

Llama 3 8B IQ2 (heavy quantisation) – some generated solutions run correctly on the 3x3 “train” problems
Very fast, runs on a 3090 (24GB VRAM)
Do you use Llama 3? Alpaca? RoPE?
Do you have text correctness metrics?
Summary
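On the correctness-metric question, one simple option is exact match against every "train" pair, keeping only candidates that pass all of them. A sketch, assuming the looks_runnable/run_candidate helpers above and a candidates list from the generation step:

```python
def passes_train(src: str, task) -> bool:
    # True if the candidate reproduces every train output exactly
    return all(run_candidate(src, pair["input"]) == pair["output"]
               for pair in task["train"])

# Keep only candidates that reproduce all train pairs, then try them on test
good = [src for src in candidates if looks_runnable(src) and passes_train(src, task)]
```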