Unlock NER for Sensitive Data with NLP - GDSC 2023

Today we’re going to learn about Natural Language Processing (NLP), a technology that teaches machines to understand human languages.
Specifically, we will learn how to use an NLP model to teach a machine to recognize when a person shares potentially sensitive information in a consumer or enterprise system.

Noble Ackerson

May 20, 2023
Transcript

  1. NLP for Sensitive Data
    NORP, DATE, ORG, GPE, MONEY, GPE, PERSON, QUANTITY
    Noble Ackerson
    Applied AI Product Lead,
    Former Google Developers Expert for Product Strategy
    Sensitive Text Detection with Custom Natural Language Processing (NLP) Models

  2. Agenda
    1 Demo: Custom Entity Recogniser
    2 Process
    3 Real world use-cases

  3. Demo
    Custom Named Entity Recogniser
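The demo itself is not captured in this transcript. As a rough stand-in, the sketch below shows what a downstream system might do once entities have been recognized: mask the spans whose labels are treated as sensitive. The spans are hand-written here rather than produced by a model, and `SENSITIVE_LABELS` and `redact` are hypothetical names chosen for illustration; they are not taken from the talk or from spaCy.

```python
from typing import List, Tuple

# (start_char, end_char, label) spans, in the shape an NER model might return.
Span = Tuple[int, int, str]

# Hypothetical policy: which entity labels count as sensitive.
SENSITIVE_LABELS = {"PERSON", "GPE", "MONEY"}


def redact(text: str, spans: List[Span], sensitive=SENSITIVE_LABELS) -> str:
    """Replace each sensitive span with its label in square brackets."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if label not in sensitive or start < cursor:
            continue  # skip non-sensitive spans and spans overlapping a redaction
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


text = "Noble paid $20 in Accra on Friday."
# Hand-written example spans; a trained model would supply these.
spans = [(0, 5, "PERSON"), (11, 14, "MONEY"), (18, 23, "GPE"), (27, 33, "DATE")]
redacted = redact(text, spans)
print(redacted)
```

Keeping the policy (which labels are sensitive) separate from the recognizer means the same model output can serve different products with different privacy rules.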

  4. What is Natural Language Processing (NLP)?
    Input
    ❏ Natural Language
    ❏ Text
    ❏ Speech
    Output
    ❏ General text classification
    ❏ Entity extraction
    ❏ Machine translation
    ❏ Question answering
    ❏ Embeddings & Semantic Search
    ❏ Conversational machine interaction
    ❏ …
    Process
    ❏ Text representation
    ❏ Machine Learning Models
    ❏ Generative
    ❏ Semantic meaning
    ❏ Contextual understanding
    Generative AI (NLU with LLMs)

  5. Process
    1 Identifying the right NLP Use Cases
    2 Annotate data with the help of Generative AI
    3 Train a custom Named Entity Recognition (NER) model & refine
    4 Evaluate and test the custom NER model
    5 Deploy and integrate

    Today’s Tools
    ● Google Colab (shared)
    ● spaCy NLP library
    ● Google Cloud Platform

  6. Identifying the right NLP Use Cases
    Human-centered needs analysis for A.I.
    Don’t “A.I.” all the things.

  7. Identifying the right NLP Use Cases
    User patterns, Scope

  8. Production grade ML/NLP
    DATA MANAGEMENT
    DATA COLLECTION
    EXPLORATION & ANALYSIS TOOLS
    NLP CODE
    FEATURE ENGINEERING (Labeling, Annotations)
    MODEL TRAINING AT SCALE
    AUTOMATION
    LOGGING & MANAGEMENT
    MONITORING
    SERVING INFRASTRUCTURE
    NEEDS ANALYSIS
    Adapted from: “Hidden Technical Debt in Machine Learning Systems”, D. Sculley et al., Google

  9. Identifying the right NLP Use Cases
    Automation
    ❏ User doesn’t know how to do something
    ❏ User can’t do something
    ❏ Task is boring, repetitive, or dangerous
    Augmentation
    ❏ User feels responsible for task
    ❏ High stakes situation
    ❏ Complicated personal preferences
    If Machine Learning and NLP are needed, which type is best?

  10. Setting expectations
    NLP CODE
    DATA MANAGEMENT
    DATA COLLECTION
    EXPLORATION & ANALYSIS TOOLS

  11. Process
    1 Identifying the right NLP Use Cases
    2 Annotate data with the help of Generative AI
    3 Train a custom NER model & refine
    4 Evaluate and test the custom NER model
    5 Deploy and integrate

  12. Annotating your text for Entity Recognition
    Labeling
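Annotation for entity recognition usually means marking character spans in the text and attaching a label to each. The sketch below assumes the `(text, {"entities": [(start, end, label)]})` tuple shape commonly used for spaCy-style training data and checks annotations for a few frequent mistakes before training. The label set and the `check_example` helper are hypothetical and only illustrate the idea; this holds whether the spans were labeled by hand or suggested by a generative model.

```python
from typing import Dict, List, Tuple

# One annotated example: (text, {"entities": [(start, end, label), ...]}).
Example = Tuple[str, Dict[str, List[Tuple[int, int, str]]]]

# Hypothetical custom label set for this sketch.
LABELS = {"PERSON", "SSN", "EMAIL"}


def check_example(example: Example, labels=LABELS) -> List[str]:
    """Return a list of problems found in one annotated example."""
    text, annotations = example
    problems = []
    previous_end = 0
    for start, end, label in sorted(annotations["entities"]):
        if label not in labels:
            problems.append(f"unknown label {label!r}")
        if not 0 <= start < end <= len(text):
            problems.append(f"span ({start}, {end}) is out of range")
            continue
        if start < previous_end:
            problems.append(f"span ({start}, {end}) overlaps an earlier span")
        if text[start:end] != text[start:end].strip():
            problems.append(f"span ({start}, {end}) has surrounding whitespace")
        previous_end = max(previous_end, end)
    return problems


sentence = "Email jane@example.com about Jane Doe."
good = (sentence, {"entities": [(6, 22, "EMAIL"), (29, 37, "PERSON")]})
# Deliberately broken: unknown label, leading space, overlapping spans.
bad = (sentence, {"entities": [(5, 22, "EMAIL"), (20, 37, "PERSON"), (0, 5, "VERB")]})

print(check_example(good))
print(check_example(bad))
```

Checks like these are cheap to run over every annotated example and catch off-by-one offsets before they quietly degrade the trained model.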

  13. Teacher: Class, pay attention
    Transformer Models: A brief bit about how NLU, LLM, GenAI play into NER workflow
    Deep Learning
    Machine Learning
    Natural Language Processing (NLP)
    Large Language Models (LLM)
    Artificial Intelligence

  14. Process
    1 Identifying the right NLP Use Cases
    2 Annotate data with the help of Generative AI
    3 Train a custom NER model & refine
    4 Evaluate and test the custom NER model
    5 Deploy and integrate

  15. The NLP workflow for training custom Entity Recognition
    …with Large Language Models

  16. Starting with Pre-trained NER Models

  17. Process
    1 Identifying the right NLP Use Cases
    2 Annotate data with the help of Generative AI
    3 Train a custom NER model & refine
    4 Evaluate and test the custom NER model
    5 Deploy and integrate
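For the evaluation step, one common way to score an NER model is exact-match precision, recall, and F1 over entity spans: a prediction counts only if its start, end, and label all agree with the gold annotation. The sketch below is a minimal version of that metric under this assumption; the `span_prf` name and the example spans are hypothetical, and the deck does not say which metric was used in the talk.

```python
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start_char, end_char, label)


def span_prf(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 over entity spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hand-written example: the model finds two of four gold entities
# and mislabels a third (GPE predicted as ORG).
gold = {(0, 5, "PERSON"), (11, 14, "MONEY"), (18, 23, "GPE"), (27, 33, "DATE")}
predicted = {(0, 5, "PERSON"), (11, 14, "MONEY"), (18, 23, "ORG")}

precision, recall, f1 = span_prf(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

For sensitive-data detection, recall often matters more than precision, since a missed entity is a potential leak while a false positive is only an unnecessary redaction.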

  18. View Slide

  19. View Slide

  20. Wrap up
    1 Demo: Custom Entity Recogniser
    2 Process
    3 Real world use-cases

  21. Healthcare Clinical Documentation
    Custom Entities for Clinical Documentation
    NORP, DISEASE, CONDITION, GPE, DOSAGES, PERSON, MEDICATION, EHR

  22. Legal Contract Analysis
    Custom Entities for Legal Documents
    RIGHTS, CLAUSE, CONDITION, GPE, JURISDICTION, PARTIES, CONTRACT DATE

  23. Sentiment Analysis Use Case
    cloud.google.com/natural-language

  24. Translation Use Case
    https://cloud.google.com/translate

  25. All-industries
    Try Natural Language AI on Google Cloud
    cloud.google.com/natural-language

  26. Good luck and Thank you!
    Noble Ackerson
    AI Product Lead, GDE Alumnus
    medium.com/@nobleackerson
    youtube.com/c/nobleackerson
    Resources
    What is Natural Language Processing? [Google Cloud]
    Natural Language Processing on Google Cloud [Cloud Skills Boost]
    TensorFlow Models NLP Library [tensorflow.org]

  27. Identifying Use Cases for LLMs
    Risk Tolerance, Human Review, Text (or Code) Intensive, Business Value

  28. Token Classification with Custom NER models
    Inputs
    Training & Validation Datasets
    Deploy
    API and Versioning
    Prediction
    Clients (Online Systems)
    REST API call with input variables
    Trained Models
    Training
    Serving
    Preprocess & Feature Creation
    Preprocess
    Labeling & Annotation
    Train/Tune Model
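Token classification frames NER as assigning one tag to every token. A widely used scheme is BIO: `B-LABEL` marks the first token of an entity, `I-LABEL` marks its continuation, and `O` marks everything else. The sketch below converts character-span annotations into BIO tags using plain whitespace tokenization, which is an assumption made to keep the example dependency-free; real pipelines use the model's own tokenizer, and punctuation attached to a word would need extra handling. The `bio_tags` helper is hypothetical.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start_char, end_char, label)


def bio_tags(text: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Assign a BIO tag to each whitespace-delimited token."""
    tagged = []
    for match in re.finditer(r"\S+", text):
        tag = "O"
        for start, end, label in spans:
            # A token belongs to a span when it lies fully inside it.
            if match.start() >= start and match.end() <= end:
                prefix = "B" if match.start() == start else "I"
                tag = f"{prefix}-{label}"
                break
        tagged.append((match.group(), tag))
    return tagged


text = "Jane Doe lives in New York"
spans = [(0, 8, "PERSON"), (18, 26, "GPE")]  # hand-written example spans

for token, tag in bio_tags(text, spans):
    print(f"{token}\t{tag}")
```

The per-token tags are what a token classification model is trained to predict; at serving time the tags are merged back into spans before being returned to the client.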

  29. In case of demo fail
