Herding LLMs Towards Structured NLP

spacy-llm is spaCy's recent integration with large language models and is maintained by the team behind spaCy. In this talk I elaborate on the motivation behind this library, which problems it tries to solve, and lessons learned while working on it.

Recent LLMs exhibit impressive capabilities and enable extremely fast prototyping of NLP applications. spacy-llm addresses some of the challenges in making LLMs production-ready:
- LLMs turn text into … more text. NLP applications however often aim to extract structured information from text and use it further downstream.
- spacy-llm parses LLM responses and maps the parsed responses onto existing data structures for documents, spans and tokens.
- Closed and open models have markedly different drawbacks. Closed models aren't free, add network latency, are inflexible black boxes, aren't suitable for all (commercial) use cases - check the TOS! - and may leak user data. You may also get rate-limited.
- Open models require considerable computing power to run locally, are more complicated to set up, and are still less capable than (some) closed models.
- Your mileage with closed and open models may therefore vary depending on where you are in your dev cycle. spacy-llm allows smoothly transitioning between open and closed LLMs without touching any code.

spacy-llm plugs open and proprietary LLMs into spaCy, leveraging its modular and customizable framework for working with text. This allows for a cheaper, faster and more robust NLP workflow - driven by cutting-edge LLMs.
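As an illustration of that code-free swap: changing the model is a config-level edit in spacy-llm. A minimal sketch (model registry names follow spacy-llm's conventions; the rest of the pipeline config is omitted):

```ini
# Hosted, closed model:
[components.llm.model]
@llm_models = "spacy.GPT-3-5.v3"
name = "gpt-3.5-turbo"

# Local, open model - swap in by replacing the section above:
# [components.llm.model]
# @llm_models = "spacy.Mistral.v1"
# name = "Mistral-7B-v0.1"
```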

Raphael Mitsch

December 12, 2023
Transcript

  1. 2023-12-12 | Raphael Mitsch (Explosion) | Global AI Conference 2023
    Herding LLMs Towards
    Structured NLP
    With spacy-llm

  2. On Structured NLP (SNLP)
    ● Goal of extracting a defined set of attributes from texts
    ○ Entities (locations, persons, …), lemmas, categories, …
    ○ “Classic” NLP: predictive models
    ○ SOTA models: often BERT-level transformer models
    ● Real-life applications chain together several of these tasks
    ○ E.g. entity recognition, entity linking, …
    ● Downstream applications often depend on tangible, unambiguous information
    ○ On doc level: e.g. doc category; on span level: e.g. entity; on token level: e.g. lemma, POS tags, …
    ● Libraries: spaCy, Stanza, Gensim, Hugging Face, …

  3. spaCy
    ● Free, open-source library
    ● Designed for production use
    ● Focus on dev productivity
    ● Free course: https://course.spacy.io

  4. spaCy
    ● A modular pipeline approach for linguistic analysis
    ● Transforms unstructured text into structured data objects
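A minimal sketch of that unstructured-text-to-structured-objects step (assumes spaCy is installed; a blank English pipeline, so only the tokenizer runs):

```python
import spacy

# Even a blank pipeline turns a raw string into a structured Doc object
# whose tokens expose attributes like text, offsets and whitespace.
nlp = spacy.blank("en")
doc = nlp("Patients received epinephrine.")
tokens = [token.text for token in doc]
print(tokens)
```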

  5. On Large Language Models
    ● Generative as opposed to predictive
    ● Pros
    ○ Great for prototyping, zero-shot, low dev effort, versatility, …
    ○ Can yield great results, maybe surpassing small predictive models for some tasks with proper prompting
    ● Cons
    ○ Latency, costs/hardware requirements, free-form text, hallucinations
    ● Libraries: Hugging Face, llama.cpp, LangChain, …
    ● Providers: OpenAI, Anthropic, Cohere, Google, Amazon, …

  6. SNLP & LLMs: Not a Dichotomy
    ● Doing SNLP with LLMs is possible, but usually gets less attention than “AI magic”
    ● LLMs generate relatively unconstrained responses
    ○ Turn text into…text
    ○ Can be narrowed down via e.g. pre-training, fine-tuning, prompts, guardrails
    ○ Parsing necessary for SNLP
    ● Modular vs. monolithic approach (not necessarily, but often)
    ● SNLP more likely to be suitable for industrial/real-life applications
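To make the parsing point concrete, a pure-Python sketch (field handling is illustrative, not spacy-llm's actual parser) that maps a key-value style LLM response onto a structured record:

```python
def parse_response(response: str) -> dict:
    """Map a 'Key: value' style LLM response onto a dict of snake_case fields."""
    record = {}
    for line in response.splitlines():
        if ":" not in line:
            continue  # skip free-form chatter around the structured part
        key, _, value = line.partition(":")
        record[key.strip().lower().replace(" ", "_")] = value.strip()
    return record

raw = "Patient group: Phenylephrine Group\nTreatment dose: 1 ug/kg"
parsed = parse_response(raw)
print(parsed)
```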

  7. “It's never great to realize you're the guy on the left. But, here I am.”
    https://explosion.ai/blog/against-llm-maximalism

  8. Use Case: Clinical Trial Results
    ● Task: extract information from human-written notes on clinical trial results (e.g. from a paper)
    ● How does workflow compare between predictive (small) and generative (LLM)?
    ● Specialized domain, available predictive models trained on general corpora not accurate enough

  9. Use Case: Clinical Trial Results
    Patients: Eleven of 15 participants were women, median age was 9.2 years (range, 1.7-14.9 yr), and median weight was 26.8 kg (range,
    8.5-55.2 kg). Baseline mean pulmonary artery pressure was 49 ± 19 mm Hg, and mean indexed pulmonary vascular resistance was 10 ±
    5.4 Wood units. Etiology of pulmonary hypertension varied, and all were on systemic pulmonary hypertension medications.
    Interventions: Patients 1-5 received phenylephrine 1 μg/kg; patients 6-10 received arginine vasopressin 0.03 U/kg; and patients 11-15
    received epinephrine 1 μg/kg. Hemodynamics was measured continuously for up to 10 minutes following study drug administration.
    Measurements and main results: After study drug administration, the ratio of pulmonary-to-systemic vascular resistance decreased in
    three of five patients receiving phenylephrine, five of five patients receiving arginine vasopressin, and three of five patients receiving
    epinephrine. Although all three medications resulted in an increase in aortic pressure, only arginine vasopressin consistently resulted in
    a decrease in the ratio of systolic pulmonary artery-to-aortic pressure.

  10. Use Case: Clinical Trial Results - Predictive
    ● Annotate data, then train supervised model
    ● For this use case: required steps include
    ○ NER/span categorization: identify patient groups, drugs, doses, frequencies, outcomes, …
    ○ Relation extraction: find the relations between identified entities

  11. Use Case: Clinical Trial Results - Predictive
    prodi.gy

  12. Use Case: Clinical Trial Results - Predictive
    ● Config for serializability & reproducibility of NLP pipelines
    ● spaCy has built-in architectures for NER, spancat, textcat, tagger, dependency parser, …; support for custom models and components
    ● python -m spacy train my_config.cfg --output ./my_output
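For reference, my_config.cfg above is spaCy's INI-style training config; a heavily truncated sketch of its shape (real configs are typically generated with python -m spacy init config):

```ini
[paths]
train = "./train.spacy"
dev = "./dev.spacy"

[nlp]
lang = "en"
pipeline = ["ner"]

[components]

[components.ner]
factory = "ner"
```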

  13. Use Case: Clinical Trial Results - Generative
    ● Write prompt
    ● Optional: few-shot examples
    ● Assumption: model was trained on domain knowledge
    Summarize the trial results in a structured fashion like so:
    ● Patient group:
    ● Number of patients in the group:
    ● Treatment drug or substance:
    ● Treatment dose:
    ● Treatment frequency of administration:
    ● Treatment duration:
    ● Outcome:

  14. Use Case: Clinical Trial Results - Generative
    Responses:

    Patient group: Arginine Vasopressin Group
    Number of patients in the group: 5
    Treatment drug or substance: Arginine vasopressin
    Treatment dose: 0.03 U/kg
    Treatment frequency of administration: Single administration
    Treatment duration: Not specified
    Outcome: The ratio of pulmonary-to-systemic vascular resistance decreased in all five patients receiving arginine vasopressin. Increase in aortic pressure observed. …

    Patient group: Phenylephrine Group
    Number of patients in the group: 5
    Treatment drug or substance: Phenylephrine
    Treatment dose: 1 μg/kg
    Treatment frequency of administration: Single administration
    Treatment duration: Not specified
    Outcome: The ratio of pulmonary-to-systemic vascular resistance decreased in three of five patients receiving phenylephrine. Increase in aortic pressure observed.

  15. Use Case: Clinical Trial Results - Generative
    NLP is solved!

  16. Use Case: Clinical Trial Results - Generative
    (or maybe not)

  17. Issues with LLMs for (S)NLP
    ● Hallucinations / incorrect replies
    ● Variability in responses
    ● Issues with quantities
    ● No mapping to standardized data structures
    ● Latency + rate limits
    ● Limited context length
    ● Redundant/unnecessary work
    ● Costly
    ● Data leakage
    ● …
    → Challenges in adaptation for real-world use cases
    Example - variability in responses:
    Treatment frequency of administration: “Administered once”, “Single administration”, “One-time dose”, “One time”, “Single dose”, “One-time administration”, “once”...

    Example - doses and groups mixed into one field:
    Number of patients: 15
    Treatment drug or substance:
    - Group 1: Patients 1-5 received phenylephrine 1 μg/kg
    - Group 2: Patients 6-10 received arginine vasopressin 0.03 U/kg
    - Group 3: Patients 11-15 received epinephrine 1 μg/kg

  18. Issues with LLMs for (S)NLP
    ● Some issues are inherent to models/APIs:
    ○ Hallucinations, math problems, round trip times etc.
    ○ Can be mitigated with pre-training, fine-tuning, RLHF, tool use etc.
    ● Others can be mitigated with tooling around LLMs:
    ○ Break problem down into several chained tasks
    ○ Robust parsing
    ○ Quality assurance
    ○ Rich data structures for results and metadata
    ○ Swap generative with predictive models for tasks where appropriate

  19. Issues with LLMs for (S)NLP
    (Same slide as before, with the tooling-based mitigations highlighted: spacy-llm)

  20. spacy-llm: SNLP with LLMs
    ● spaCy extension - uses spaCy’s data structures, pipeline concept, config system, and other functionality
    ● Core idea: a pipeline of SNLP problems, solved with LLMs
    ○ Each problem is solved by a task, which is responsible for the prompt, prompt splitting, and parsing
    ○ Highly configurable
    ○ Maps results onto spaCy’s data structures
    ○ In a pipeline: easy to swap LLMs with predictive models and vice versa → easy prototyping
    ● Currently at 0.6.4; 1.0.0 coming soon

  21. spacy-llm: Integrations
    ● Models: integration with Hugging Face, LLM providers, LangChain
    ● Tasks:
    ○ Built-in tasks include NER, REL, sentiment analysis, summarization, translation, QA, entity linking, lemmatization, span categorization, text categorization, …
    ○ Easy to add new tasks
    ● Is a spaCy component: integrates into its config and pipeline system and supports all the usual features like parallelization and serialization
    ● Batching, response logging for easier debugging, caching

  22. spacy-llm: Workflow & Use Cases

  23. spacy-llm: Workflow and Use Cases I
    ● LLM-assisted annotation - for: evaluation data, training data, examples for few-shot learning
    ● Pipeline: LLM zero-shot predictions → manual curation (in prodi.gy)

  24. spacy-llm: Workflow and Use Cases II
    ● Preprocessing text before prompting LLM
    ○ PII: recognize and replace personally identifiable information
    ○ Remove non-informative boilerplate snippets
    ○ …
    ● Pipeline: PII NER → LLM

  25. spacy-llm: Workflow and Use Cases III
    ● Only send texts (sentences/paragraphs/documents) with certain topics or entities to the LLM
    ○ Avoid unnecessary costs
    ○ Adjust prompt according to earlier classification and/or identified entities
    ○ …
    ● Pipeline: TextCat / NER → LLM
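The gating idea sketched in plain Python (classify_topic stands in for a cheap predictive TextCat component, and the returned prompt string for the LLM call; both are hypothetical stand-ins, not spacy-llm API):

```python
from typing import Optional

RELEVANT_TOPICS = {"clinical_trial"}

def classify_topic(text: str) -> str:
    # Stand-in for a fast, cheap predictive text classifier.
    return "clinical_trial" if "patients" in text.lower() else "other"

def maybe_prompt(text: str) -> Optional[str]:
    """Only build (and pay for) an LLM prompt when the classifier says so."""
    topic = classify_topic(text)
    if topic not in RELEVANT_TOPICS:
        return None  # skip the LLM entirely, saving cost and latency
    # The earlier classification can also shape the prompt itself:
    return f"Summarize this {topic} text in a structured fashion:\n{text}"

print(maybe_prompt("Eleven of 15 patients were women."))
print(maybe_prompt("The weather was nice."))
```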

  26. spacy-llm: Workflow and Use Cases IV
    ● LLM response postprocessing
    ○ Quality assurance / fact-checking
    ○ Response normalization: improve response robustness for downstream tasks
    ○ Hook up to external knowledge bases
    ○ …
    ● Pipeline: LLM → rules → entity linking
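Response normalization can be as simple as collapsing wording variants onto canonical values; a sketch reusing the frequency wordings from the issues slide (the synonym table is illustrative):

```python
# Illustrative synonym table mapping free-text variants to one canonical label.
FREQUENCY_SYNONYMS = {
    "administered once": "once",
    "single administration": "once",
    "one-time dose": "once",
    "one time": "once",
    "single dose": "once",
    "one-time administration": "once",
}

def normalize_frequency(value: str) -> str:
    """Collapse free-text frequency wordings onto one canonical label."""
    key = value.strip().lower().rstrip(".")
    return FREQUENCY_SYNONYMS.get(key, key)

print(normalize_frequency("Single administration"))
print(normalize_frequency("twice daily"))
```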

  27. spacy-llm: How To

    Config:

    [nlp]
    lang = "en"
    pipeline = ["llm_ner"]

    [components]

    [components.llm_ner]
    factory = "llm"

    [components.llm_ner.task]
    @llm_tasks = "spacy.NER.v3"
    labels = SIZE,TYPE,TOPPING,PRODUCT

    [components.llm_ner.model]
    @llm_models = "spacy.GPT-3-5.v3"
    name = "gpt-3.5-turbo"

    Run pipeline:

    from spacy_llm.util import assemble

    nlp = assemble(config_path)
    doc = nlp(text)
    for ent in doc.ents:
        print(ent.text, ent.label_)

    Swap in an open model by changing only the config:

    [components.llm_ner.task]
    @llm_tasks = "spacy.NER.v3"
    labels = SIZE,TYPE

    [components.llm_ner.model]
    @llm_models = "spacy.Mistral.v1"
    name = "Mistral-7B-v0.1"

  28. Recap
    ● SNLP unlocks information from text and makes it available to downstream business applications in a structured form
    ● LLMs have impressive text generation/understanding abilities
    ● It’s become super easy to prototype NLP applications with LLMs
    ● When building a production-ready pipeline, you need to consider other traits such as customizability, robustness, inference cost, network latency, etc.
    ● spaCy is a production-ready NLP framework written for developers
    ● Its extension spacy-llm allows easy integration of LLMs into structured NLP pipelines
    ● LLM-assisted annotation allows fast bootstrapping of training/evaluation data

  29. Thank you!
    [email protected]
    ● https://www.linkedin.com/in/raphaelmitsch/
    ● https://github.com/explosion/spaCy
    ● https://github.com/explosion/spacy-llm
    ● https://explosion.ai/
    ● https://prodi.gy
    ● https://explosion.ai/blog/against-llm-maximalism