Having worked for 6 months on Kaggle's LLM-based ARC AGI program-writing challenge using Llama3, I'll reflect on the lessons learned: building an automatic program generator, evaluating it, finding strong representations for the challenge, chain-of-thought and program-of-thought prompting styles, and some multi-stage critical-thinking approaches. You'll get ideas for tuning your own prompts, plus shortcuts that help you evaluate your own LLM usage with greater assurance in the face of non-deterministic outcomes.
Given at: PyData Global 2024
Get updates via: https://notanumber.email/ and through https://www.linkedin.com/in/ianozsvald