
Review AI from LINE EC NLP

Event: iThome Hello World Dev Conference
Speaker: Vila Lin

LINE Developers Taiwan

September 23, 2024

Transcript

  1. CONTENT
    01 Brief Introduction
    02 NLP E-commerce
    03 Past To Present
    04 Advance
    05 Takeaway
  2. Vila Lin, LINE TW EC Data Lead
    • Education: NTHU MS
    • Specialty: Machine Learning, Neural Networks, Algorithm Design, Statistics
  3. NLP Evolution
    • Traditional Approach: SVM, TF-IDF, LDA
    • Neural Network: Word2Vec, CNN, LSTM
    • Pre-Trained Model: BERT, GPT, T5
    • Prompt Engineering: ChatGPT, LLaMA, Claude, Gemini
  4. NLP in E-Commerce
    • Data: Search Query, Product, Article, Brand, Store, Product Specification
    • Tasks: Segmentation & Embedding, NER, Search Intention, Search Suggestion, Search Spell Checker, Query Understanding, Product Category, Store Vendor Type Classification, Article Advertisement NLG
  5. Trie + HMM
    • HMM with BMSE states: Begin, Middle, Single, End
    • Viterbi algorithm: choose the path with maximum probability
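    A minimal sketch of the Viterbi decoding described above over BMSE segmentation states. The probability tables and the sample string are toy placeholders, not values from the talk.

```python
# Viterbi decoding over BMSE word-segmentation states (illustrative sketch).
STATES = ["B", "M", "S", "E"]  # Begin, Middle, Single, End

def viterbi(chars, start_p, trans_p, emit_p):
    """Return (probability, state path) of the most likely BMSE labeling."""
    V = [{s: start_p[s] * emit_p[s].get(chars[0], 1e-8) for s in STATES}]
    path = {s: [s] for s in STATES}
    for t in range(1, len(chars)):
        V.append({})
        new_path = {}
        for s in STATES:
            prob, prev = max(
                (V[t - 1][p] * trans_p[p].get(s, 0.0) * emit_p[s].get(chars[t], 1e-8), p)
                for p in STATES
            )
            V[t][s] = prob
            new_path[s] = path[prev] + [s]
        path = new_path
    best_prob, best_state = max((V[-1][s], s) for s in STATES)
    return best_prob, path[best_state]

# Toy tables: every character is equally (un)likely under every state.
start_p = {"B": 0.6, "M": 0.0, "S": 0.4, "E": 0.0}
trans_p = {"B": {"M": 0.4, "E": 0.6}, "M": {"M": 0.4, "E": 0.6},
           "E": {"B": 0.6, "S": 0.4}, "S": {"B": 0.6, "S": 0.4}}
emit_p = {s: {} for s in STATES}  # fall back to the 1e-8 default
print(viterbi(list("線上購物"), start_p, trans_p, emit_p))
```

    A word boundary is then placed after every character tagged S or E.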
  6. Quality
    • Word coverage
    • Appearance frequency
    • Design preprocessing
    • Collect dictionaries
    Problem (HMM)
    • Hard to capture complex/non-linear relationships
    • Computing cost increases with large datasets/complex states
  7. BiLSTM
    • Architecture of BiLSTM: a pair of LSTMs (Long Short-Term Memory) reading the sequence in both directions
    • Cross-BiLSTM-CNN: Corpora → Embedding → NLP task
    • Reference: Peng-Hsuan Li, Tsu-Jui Fu, Wei-Yun Ma. 2019. Remedying BiLSTM-CNN Deficiency in Modeling Cross-Context for NER.
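    A compact PyTorch sketch of a plain BiLSTM sequence tagger in the spirit of this slide. The vocabulary size, tag set, and dimensions are arbitrary assumptions, and this is not the Cross-BiLSTM-CNN model from the cited paper.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Embedding -> bidirectional LSTM -> per-token tag scores."""
    def __init__(self, vocab_size, tagset_size, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim // 2, batch_first=True,
                              bidirectional=True)  # forward + backward LSTM
        self.fc = nn.Linear(hidden_dim, tagset_size)

    def forward(self, token_ids):        # (batch, seq_len)
        x = self.embed(token_ids)
        out, _ = self.bilstm(x)          # (batch, seq_len, hidden_dim)
        return self.fc(out)              # (batch, seq_len, tagset_size)

model = BiLSTMTagger(vocab_size=5000, tagset_size=4)  # e.g. BMSE tags
dummy = torch.randint(1, 5000, (2, 10))
print(model(dummy).shape)  # torch.Size([2, 10, 4])
```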
  8. Problem (BiLSTM)
    • Not pre-trained with large corpora: needs more task-specific data
    • OOV words: a fixed vocabulary constrains segmentation of new/rare words
    • Parallelization limitation: the sequential nature of LSTMs makes computation hard to parallelize
  9. BERT
    • Architecture of BERT: encoder only
    • BERT Base: 12 layers (110M parameters)
    • BERT Large: 24 layers (340M parameters)
    • Attention: Self-Attention, Multi-Head Attention
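    A brief sketch of the scaled dot-product self-attention underlying BERT's multi-head attention. The dimensions are arbitrary, and masking, layer norm, and the feed-forward blocks of the full encoder are omitted.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention for a single head.

    x: (seq_len, d_model) token representations.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / (k.shape[-1] ** 0.5)  # similarity between all token pairs
    weights = F.softmax(scores, dim=-1)      # attention distribution per token
    return weights @ v                       # weighted mix of value vectors

d_model, d_head, seq_len = 768, 64, 8
x = torch.randn(seq_len, d_model)
w_q, w_k, w_v = (torch.randn(d_model, d_head) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # torch.Size([8, 64])
```

    Multi-head attention runs several such heads in parallel with different projections and concatenates their outputs.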
  10. Problem (BERT)
    • Needs a significant amount of task-specific data
    • Mitigations: data augmentation, semi-supervised learning, active learning, knowledge distillation, external knowledge (see the sketch below)
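    As an example of one mitigation from the list, a pseudo-labeling sketch for semi-supervised learning. `train_model`, `labeled_data`, and `unlabeled_texts` are hypothetical placeholders for whatever classifier and data pool you actually have.

```python
def pseudo_label(model, unlabeled_texts, threshold=0.9):
    """Keep only unlabeled examples the current model predicts with high confidence."""
    pseudo = []
    for text in unlabeled_texts:
        label, confidence = model.predict(text)  # hypothetical predict() -> (label, prob)
        if confidence >= threshold:
            pseudo.append((text, label))
    return pseudo

# Typical loop: train on labeled data, pseudo-label the pool, retrain on the union.
# model = train_model(labeled_data)                        # hypothetical helper
# labeled_data += pseudo_label(model, unlabeled_texts)
# model = train_model(labeled_data)
```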
  11. Discriminative vs. Generative AI
    • Discriminative AI: input data and tags → discriminative model learns the relationship → output data
    • Generative AI: input unstructured data → generative model learns unstructured content → new data
  12. BERT + GPT
    • Model type: BERT = encoder only; GPT = decoder only
    • Pre-training: BERT = MLM (masked language modeling); GPT = AR (autoregressive)
    • Direction: BERT = bidirectional; GPT = unidirectional
    • Fine-tuning: BERT = task-specific layer added on the pre-trained model; GPT = task-specific prompting with one-shot/few-shot adaptation (see the prompt sketch below)
    • Use case: BERT = word segmentation, classification, NER; GPT = text generation, summarization
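    To illustrate the "task-specific prompting with few-shot adaptation" row, a sketch of a few-shot classification prompt. The categories and example titles are invented for illustration only.

```python
FEW_SHOT_PROMPT = """Classify the product title into one category: Electronics, Fashion, or Home.

Title: Wireless noise-cancelling headphones
Category: Electronics

Title: Cotton crew-neck T-shirt
Category: Fashion

Title: {title}
Category:"""

def build_prompt(title: str) -> str:
    return FEW_SHOT_PROMPT.format(title=title)

print(build_prompt("Non-stick frying pan 28cm"))
```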
  13. Generate Dataset by GPT
    • Initial data: raw data plus a small amount of existing labeled data
    • Design the data format and prompting → GPT → synthetic data
    • GPT strengths: strong contextual understanding, zero-shot/few-shot learning, external knowledge
    • Manual review: filter out low-quality data
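    A hedged sketch of the synthetic-data step using the OpenAI Python client. The model name, prompt wording, and label set are assumptions, and the manual review/filtering step from the slide still applies to the output.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = """You are labeling e-commerce search queries.
Generate {n} new query examples as a JSON list of objects with keys
"query" and "intent" (one of: product, brand, store).
Here are a few existing labeled examples:
{examples}"""

def generate_synthetic(examples, n=20, model="gpt-4o-mini"):
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": PROMPT.format(n=n, examples=json.dumps(examples, ensure_ascii=False))}],
    )
    # The raw text may not be clean JSON; manual review and filtering still apply.
    return json.loads(resp.choices[0].message.content)

seed = [{"query": "iphone 15 case", "intent": "product"},
        {"query": "uniqlo", "intent": "brand"}]
# synthetic = generate_synthetic(seed)
```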
  14. LLM-Driven Fine-Tuning
    • Pipeline: initial/raw data → GPT → synthetic data → fine-tuning data → BERT
    • Iterate: refine the prompt, generate more data, optimize training parameters
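    A condensed sketch of fine-tuning BERT on the GPT-generated data with Hugging Face transformers. The checkpoint, label set, and hyperparameters are assumptions; in practice the examples would come from the synthetic-data step above.

```python
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import Dataset

# Placeholder examples standing in for the GPT-generated fine-tuning data.
data = {"text": ["iphone 15 case", "uniqlo", "costco hours"],
        "label": [0, 1, 2]}  # 0=product, 1=brand, 2=store

checkpoint = "bert-base-chinese"  # assumption; pick a checkpoint matching your corpus
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=3)

dataset = Dataset.from_dict(data).map(
    lambda ex: tokenizer(ex["text"], truncation=True, padding="max_length", max_length=32))

args = TrainingArguments(output_dir="out", num_train_epochs=3,
                         per_device_train_batch_size=8, learning_rate=2e-5)
Trainer(model=model, args=args, train_dataset=dataset).train()
```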
  15. Takeaway
    • Every NLP model is designed with particular purposes in mind. Like most mathematical problems, there is more than one solution.
    • If we grasp the core concepts behind NLP models and their common applications, we can put together a solution that fits our goals and resources.