Incorporating LLMs into practical NLP workflows

In this talk, I'll show how large language models such as GPT-3 complement rather than replace existing machine learning workflows. Initial annotations are gathered from the OpenAI API via zero- or few-shot learning, and then corrected by a human decision maker using an annotation tool. The resulting annotations can then be used to train and evaluate models as normal. This process results in higher accuracy than can be achieved from the OpenAI API alone, with the added benefit that you'll own and control the model for runtime.
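The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the talk's actual implementation: the prompt format, the `Span` type, and every function name are assumptions made for this example, and the LLM response is a hard-coded string rather than a real API call. It shows three steps: building a zero- or few-shot prompt, defensively parsing the raw completion into structured suggestions, and scoring those suggestions against human-corrected annotations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical container for one suggested entity."""
    text: str
    label: str


def build_prompt(text, labels, examples=()):
    """Assemble a zero- or few-shot prompt (the format is an assumption)."""
    lines = [
        "Extract entities from the text. Answer with one line per label,",
        "formatted as 'LABEL: entity1, entity2'. Labels: " + ", ".join(labels),
        "",
    ]
    for example_text, example_answer in examples:
        lines += ["Text: " + example_text, "Answer:", example_answer, ""]
    lines += ["Text: " + text, "Answer:"]
    return "\n".join(lines)


def parse_response(response, text, labels):
    """Parse a raw completion, skipping unknown labels and any entity
    that does not literally occur in the input text."""
    spans = set()
    for line in response.splitlines():
        label, sep, rest = line.partition(":")
        label = label.strip().upper()
        if not sep or label not in labels:
            continue
        for entity in rest.split(","):
            entity = entity.strip()
            if entity and entity in text:
                spans.add(Span(entity, label))
    return spans


def precision_recall_f1(predicted, gold):
    """Score suggestions against human-corrected annotations."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


text = "Microsoft acquires software development platform GitHub for $7.5 billion"
labels = ["ORG", "MONEY"]
prompt = build_prompt(text, labels)

# Stand-in for a completion returned by an LLM API (made up for this example).
llm_response = "ORG: Microsoft, GitHub Inc\nMONEY: $7.5 billion\nPERSON: nobody"
suggested = parse_response(llm_response, text, labels)

# Stand-in for the annotations after a human has reviewed and corrected them.
gold = {Span("Microsoft", "ORG"), Span("GitHub", "ORG"), Span("$7.5 billion", "MONEY")}
precision, recall, f1 = precision_recall_f1(suggested, gold)
```

In this made-up case the parser drops "GitHub Inc" (not present in the text) and the unrequested PERSON line, so the human only has to add the missing "GitHub" entity. The corrected set would then serve as training and evaluation data for a task-specific model.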

Video: https://youtu.be/Bd2ciwinFUE

Ines Montani

April 17, 2023

Transcript

  1. practical workflows • supervised learning • tell computers exactly what to do • needs enough good data
  2. practical workflows • supervised learning • tell computers exactly what to do • needs enough good data • ML + business logic
  3. working with LLMs • iterative (prompting, parsing) • evaluation is extremely important • improve, not replace task-specific models
  4. working with LLMs • iterative (prompting, parsing) • evaluation is extremely important • improve, not replace task-specific models • scriptable workflows
  5. working with LLMs • iterative (prompting, parsing) • evaluation is extremely important • improve, not replace task-specific models • scriptable workflows • human in the loop
  6. working with LLMs • iterative (prompting, parsing) • evaluation is extremely important • improve, not replace task-specific models • scriptable workflows • human in the loop • business logic
  7. TEXT CLASSIFIER, ENTITY RECOGNIZER, ENTITY LINKER, ATTRIBUTE LOOKUP: “Microsoft acquires software development platform GitHub for $7.5 billion”
  8. TEXT CLASSIFIER, ENTITY RECOGNIZER, ENTITY LINKER, ATTRIBUTE LOOKUP, CURRENCY NORMALIZER: “Microsoft acquires software development platform GitHub for $7.5 billion”
  9. TEXT CLASSIFIER, ENTITY RECOGNIZER, ENTITY LINKER, ATTRIBUTE LOOKUP, CURRENCY NORMALIZER: “Microsoft acquires software development platform GitHub for $7.5 billion” * *
  10. summary • LLMs are a great tool for creating better data faster and iteratively • you’ll always need task-specific data • → github.com/explosion/prodigy-openai-recipes
  11. summary • LLMs are a great tool for creating better data faster and iteratively • you’ll always need task-specific data • many new applications in the future • → github.com/explosion/prodigy-openai-recipes
  12. future work • data structures for result parsing • workflows for robust evaluation • interactive prompt testing
  13. future work • data structures for result parsing • workflows for robust evaluation • interactive prompt testing • support for open-source models
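
Slides 7 to 9 list a currency normalizer next to the learned components, which is a natural place for plain business logic rather than a model. The following is a minimal sketch of what such a rule-based step might look like, not the component shown in the talk: the function name, the regular expression, and the multiplier table are assumptions for illustration, and it only handles US dollar amounts written like the one in the example sentence.

```python
import re
from decimal import Decimal

# Hypothetical scale words and their multipliers (an assumption, not exhaustive).
MULTIPLIERS = {
    "thousand": Decimal(10) ** 3,
    "million": Decimal(10) ** 6,
    "billion": Decimal(10) ** 9,
    "trillion": Decimal(10) ** 12,
}

# Matches "$7.5 billion" or "$12"; amounts with thousands separators
# such as "$1,000" are deliberately not handled in this sketch.
MONEY_RE = re.compile(
    r"^\$\s*(?P<number>\d+(?:\.\d+)?)\s*(?P<scale>[a-z]+)?$",
    re.IGNORECASE,
)


def normalize_usd(entity_text):
    """Turn a MONEY entity string into a structured amount in whole dollars.

    Returns None when the text does not match the expected pattern, so the
    caller can route it to a human instead of guessing.
    """
    match = MONEY_RE.match(entity_text.strip())
    if match is None:
        return None
    amount = Decimal(match.group("number"))
    scale = match.group("scale")
    if scale is not None:
        multiplier = MULTIPLIERS.get(scale.lower())
        if multiplier is None:
            return None
        amount *= multiplier
    # int() truncates any fractional dollars; acceptable for this sketch.
    return {"currency": "USD", "amount": int(amount)}


normalized = normalize_usd("$7.5 billion")
```

Because the rule is deterministic, it can be unit-tested directly and applied to whatever entity spans the upstream recognizer produces, regardless of whether those spans came from an LLM or from a task-specific model.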