
GoEmotion_NeuroMatchAcademy project


Partha Pratim Saha

June 07, 2025

Transcript

  1. Emotion Classification in Conversational Text Dialogues Members: Zahra Rezai,

    Rel Guzman, Mauro Granado, Renee Vieira, Partha Pratim Saha TA: Joseph AKINYEMI, Erum Afzal “The BERTies” POD: Intelligent Lilly
  2. POD: Intelligent Lilly The BERTies Project Overview & Goal •

    Research Question: Classify conversational dialogues into one of the emotion categories using the GoEmotions dataset • DL models: BERT and DistilBERT • ML models: Support Vector Machine, Logistic Regression, Multinomial Naive Bayes, and Random Forest (a minimal baseline sketch follows this slide) • New possibilities: Natural Language Understanding and Natural Language Generation, improving Human-AI collaboration & human attributes in dialogues • Future work: classify multilabel emotions from multimodal data for LLMs
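The classical baselines listed above can all be run through the same scikit-learn pipeline. Below is a minimal sketch using TF-IDF features with Logistic Regression; the toy texts, labels, and variable names are illustrative assumptions rather than project code, and the other listed models can be swapped into the same slot.

from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny illustrative corpus; the project uses the GoEmotions splits instead.
texts = ["Thanks so much, you made my day!", "I can't believe this happened to me."]
labels = ["gratitude", "surprise"]

baseline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),  # bag-of-n-grams features
    ("clf", LogisticRegression(max_iter=1000)),      # swap in SVC, MultinomialNB, or RandomForestClassifier
])
baseline.fit(texts, labels)
print(baseline.predict(["so grateful for your help"]))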
  3. POD: Intelligent Lilly The BERTies GoEmotions dataset Corpus of 58k

    carefully curated comments extracted from Reddit, with human annotations for 27 emotion categories + Neutral: admiration amusement approval caring desire excitement gratitude joy love optimism pride relief anger annoyance disappointment disapproval disgust embarrassment fear grief nervousness remorse sadness confusion curiosity realization surprise neutral. We used the filtered version of this dataset, selected based on rater agreement: training dataset: 43,410 + test dataset: 5,427 + validation dataset: 5,426 examples (a loading sketch follows this slide).
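For reference, a minimal loading sketch, assuming the Hugging Face hub copy of GoEmotions is used; there, the filtered rater-agreement-based release is published as the "simplified" configuration with exactly these split sizes.

from datasets import load_dataset

# "simplified" = the filtered GoEmotions release: 27 emotions + neutral.
ds = load_dataset("go_emotions", "simplified")
print({split: ds[split].num_rows for split in ds})
# expected: {'train': 43410, 'validation': 5426, 'test': 5427}
print(ds["train"].features["labels"].feature.names[:5])  # first few label names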
  4. POD: Intelligent Lilly The BERTies The importance of correct hyperparameters

    First setting: num_train_epochs=15, per_device_train_batch_size=16, per_device_eval_batch_size=16, warmup_steps=500, weight_decay=0.01, eval_steps=100, save_steps=100, gradient_accumulation_steps=8, learning_rate=3e-5. Second setting: num_train_epochs=3, per_device_train_batch_size=32, per_device_eval_batch_size=32, warmup_steps=50, eval_steps=100, gradient_accumulation_steps=2, learning_rate=3e-5, lr_scheduler_type="reduce_lr_on_plateau", greater_is_better=False (see the sketch after this slide).
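The slide shows two separate hyperparameter sets (the argument names repeat); which set was used with which model is not stated, so the variable names below are assumptions. A sketch of passing them to the Hugging Face Trainer via TrainingArguments; output_dir is an added placeholder, and an eval/save strategy of "steps" would also be needed for eval_steps and save_steps to take effect.

from transformers import TrainingArguments

args_first = TrainingArguments(
    output_dir="run_first",            # placeholder, not on the slide
    num_train_epochs=15,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    warmup_steps=500,
    weight_decay=0.01,
    eval_steps=100,
    save_steps=100,
    gradient_accumulation_steps=8,
    learning_rate=3e-5,
)

args_second = TrainingArguments(
    output_dir="run_second",           # placeholder, not on the slide
    num_train_epochs=3,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    warmup_steps=50,
    eval_steps=100,
    gradient_accumulation_steps=2,
    learning_rate=3e-5,
    lr_scheduler_type="reduce_lr_on_plateau",  # requires a recent transformers release
    greater_is_better=False,           # pair with metric_for_best_model when selecting checkpoints
)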
  5. POD: Intelligent Lilly The BERTies Bert-base-uncased and DistilBERT models DistilBERT

    is a small, fast, cheap, and light Transformer model trained by distilling BERT base. It has 40% fewer parameters than google-bert/bert-base-uncased and runs 60% faster, while preserving over 95% of BERT’s performance as measured on the GLUE language understanding benchmark.
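A minimal sketch of loading both checkpoints for sequence classification and comparing their sizes; the hub checkpoint names and num_labels=28 (27 emotions + neutral) are assumptions consistent with the slides above.

from transformers import AutoModelForSequenceClassification, AutoTokenizer

for name in ("google-bert/bert-base-uncased", "distilbert-base-uncased"):
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=28)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.1f}M parameters")  # DistilBERT is roughly 40% smaller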