Slide 1

Slide 1 text

Slide 2

Slide 2 text

Agenda
› Content Moderation using NLP Service
› What is SmartText
› How Users Can Adjust the Model on the NLP Platform
› Conclusion

Slide 3

Slide 3 text

Content Moderation using NLP Service

Slide 4

Slide 4 text

Content Moderation
Case Introduction
› The reviewer has to review all ad content, decide whether it needs to be rejected, and provide the reject reason
Problem Statement
› The task requires a lot of manpower
› The number of ads is increasing

Slide 5

Slide 5 text

NLP Service
Natural Language Processing
› Process and parse natural language, such as text, to understand its semantic meaning
Machine Learning Method
› Obtain a representation of the text via a deep neural network
  › A pre-trained language model can generate representation vectors carrying basic semantic meaning for each token of the text
  › Ex: BERT, ELECTRA
› Fine-tune a model based on the pre-trained model for the target downstream task
  › Ex: sentiment classification, question answering

Slide 6

Slide 6 text

NLP Service
Multi-label Classifier
› Categorize text content into multiple labels from a customized set of labels
[Diagram: ELECTRA → Dense Layer → Sigmoid → p1, p2, …, pn (n labels)]
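A minimal sketch of the sigmoid output stage of the multi-label classifier, assuming the dense layer has already produced one logit per label. The ELECTRA encoder is omitted, and the label names, logit values, and 0.5 threshold are hypothetical, not taken from the slides.

```python
import math

def sigmoid(x: float) -> float:
    """Map a logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_labels(logits, label_names, threshold=0.5):
    """Return per-label probabilities and the labels at or above the threshold."""
    probs = {name: sigmoid(z) for name, z in zip(label_names, logits)}
    chosen = [name for name, p in probs.items() if p >= threshold]
    return probs, chosen

# Hypothetical logits for one ad text, one value per label.
label_names = ["label_1", "label_2", "label_3"]
probs, chosen = predict_labels([2.0, -1.0, 0.0], label_names)
print(probs)
print(chosen)
```

Because each label gets its own sigmoid rather than sharing a softmax, a text can receive anywhere from zero to all n labels.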

Slide 7

Slide 7 text

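NLP Service
Ad Content
› 3 labels for the main reject reasons
› Each content item can have 0 to 3 labels
› Data count: 6429
  › Approved: 3000
  › Rejected: 3429
› Example ad: 【蜂毒牙膏】修復牙齒 美白去口臭去黃 ("[Bee Venom Toothpaste] Repairs teeth, whitens, removes bad breath and yellow stains")
Result
› Accuracy: 0.9
› F1 Score: 0.89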
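The slides do not say how accuracy and F1 were computed for the multi-label task. As an illustration only, the sketch below shows one common choice, exact-match accuracy and micro-averaged F1, on hypothetical toy predictions. The function names, label names, and data are assumptions, not the presenters' evaluation code.

```python
def exact_match_accuracy(y_true, y_pred):
    """Fraction of items whose predicted label set equals the true label set."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def micro_f1(y_true, y_pred):
    """F1 computed from true/false positives pooled over all items and labels."""
    tp = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        tp += len(t & p)   # labels predicted and correct
        fp += len(p - t)   # labels predicted but wrong
        fn += len(t - p)   # labels missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical toy data: each item is a set of reject-reason labels
# (an empty set means the ad is approved).
y_true = [{"a"}, set(), {"a", "b"}, {"c"}]
y_pred = [{"a"}, set(), {"a"}, {"b", "c"}]
print(exact_match_accuracy(y_true, y_pred))
print(micro_f1(y_true, y_pred))
```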

Slide 8

Slide 8 text

What is SmartText

Slide 9

Slide 9 text

SmartText - A Self-Service NLP Platform
Helps users create their own NLP models and integrate them into applications through APIs, without coding
[Diagram labels: Classifier, Multilabel Classifier, NER, Duplication Detector; LINE TODAY, LINE VOOM, LINE SHOPPING, LINE FactChecker; No Code]

Slide 10

Slide 10 text

Machine Learning Pipeline
Scoping
› Target of task
› Application
› Constraint
Data Preparation
› Collection
› Cleaning
Model Development
› Which model
› Training
› Tuning
Model Validation
› Baseline
› Score
› Expectation
Deployment
› How to use
› Service integration

Slide 11

Slide 11 text

How SmartText Works
Model Development
› Provide 7 NLP services with the best model and settings
  › Classifier, Multi-label Classifier, Duplication Detector, Key Phrase Extraction, Related Search, Tokenizer, NER
› Train on IU k8s clusters with GPU
Model Validation
› Apply metrics for different models
Deployment
› Build the prediction API with BentoML
› Package the API as a Docker image and push it to Harbor
› Deploy the API image on a VKS cluster and create a DNS entry for the API

Slide 12

Slide 12 text

How SmartText Works
Automation
› Set the following flow as an Airflow DAG
  › 1 DAG for 1 NLP service
[Flow: Model Training & Validation → Build Prediction API → Package Image → Deploy Image & Create DNS]

Slide 13

Slide 13 text

SmartText Portal
A Web System for Users to Use SmartText
› Build your own NLP model
› Upload data, train the model, try the result
Service Domain
› A space for users to use NLP services based on their application
› Each domain provides 7 NLP services

Slide 14

Slide 14 text

SmartText Portal
Upload Data
› Upload CSV files in a specific format
› Clicking the delete button deletes all the data
Build
› Start a new build or deploy a previous build
› Set a training cron job
Try Result
› Test the prediction API of the active model

Slide 18

Slide 18 text

Machine Learning Pipeline
Scoping
› Target of task
› Application
› Constraint
Data Preparation
› Collection
› Cleaning
Model Development
› Which model
› Training
› Tuning
Model Validation
› Baseline
› Score
› Expectation
Deployment
› How to use
› Service integration

Slide 19

Slide 19 text

How Users Can Adjust the Model on the NLP Platform

Slide 20

Slide 20 text

EDA
Exploratory Data Analysis
› Use visualization and basic statistics to get an overview of the data
› Purpose
  › Understand the information in the data and its structure
  › Check for outliers or unusual values
  › Find correlations within the data
Plot
› Multi-label Classifier
  › Text length histogram
  › Category distribution
  › Category count distribution
  › Category correlation
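The four multi-label EDA statistics can be sketched with the standard library alone. This is an illustrative sketch, not the SmartText implementation: the toy rows, the placeholder label names, the 10-character bin width, and the use of Pearson correlation between binary label indicators are all assumptions.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical toy rows: (text, set of labels).
rows = [
    ("best whitening toothpaste ever", {"superlative", "exaggeration"}),
    ("cures toothache in one day", {"efficacy"}),
    ("mint toothpaste 100g", set()),
    ("the most powerful cure", {"superlative", "efficacy"}),
]
labels = sorted({label for _, ls in rows for label in ls})

# Text length histogram: number of texts per 10-character bin.
length_hist = Counter((len(text) // 10) * 10 for text, _ in rows)

# Category distribution: how many texts carry each label.
category_dist = Counter(label for _, ls in rows for label in ls)

# Category count distribution: how many texts have 0, 1, 2, ... labels.
count_dist = Counter(len(ls) for _, ls in rows)

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx and vy else 0.0

# Category correlation between the 0/1 indicator columns of each label pair.
indicator = {l: [1 if l in ls else 0 for _, ls in rows] for l in labels}
category_corr = {
    (a, b): pearson(indicator[a], indicator[b])
    for a, b in combinations(labels, 2)
}

print(dict(length_hist), dict(category_dist), dict(count_dist))
print(category_corr)
```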

Slide 21

Slide 21 text

EDA
[Figures: Text Length Histogram; Category Distribution over the categories 涉及誇大 (exaggerated claims), 涉及療效 (claims of medical efficacy), 最高級 (superlatives)]

Slide 22

Slide 22 text

EDA
[Figures: Category Count Distribution; Category Correlation between the categories 涉及誇大 (exaggerated claims), 涉及療效 (claims of medical efficacy), 最高級 (superlatives)]

Slide 23

Slide 23 text

EDA
SmartText Portal - Upload Data

Slide 24

Slide 24 text

XAI
Explainable AI
› AI whose decisions or predictions can be understood by humans
› Benefits
  › Improve user experience by helping users trust that the AI is making good decisions
  › Uncover bias in the AI by examining the explanations of its decisions
SHAP (SHapley Additive exPlanations)
› A Python package used to explain the output of any machine learning model
› Based on the classic Shapley values from game theory

Slide 25

Slide 25 text

XAI
Shapley Value
› A solution concept in cooperative game theory, introduced by Lloyd Shapley in 1951
› Used to fairly distribute gains and costs among several actors working in a coalition
› Calculated as the average of a player's marginal contributions over all possible coalitions
Example
› Brown, Sally, and Cony form a team to beat monsters in LINE GAME
› Together they beat 100 monsters in 1 hour and earn 10000 coins as a reward
› Suppose that in 1 hour:
  › Brown, Sally, and Cony can beat 35, 15, and 10 monsters respectively
  › Brown and Sally can beat 70 monsters together; Brown and Cony can beat 60 monsters together; Sally and Cony can beat 40 monsters together
› How should the reward be split?

Slide 26

Slide 26 text

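XAI
Marginal contribution of each player (monsters beaten) for every joining order:
Order          Brown      Sally      Cony
B S C          35         35         30
B C S          35         40         25
S C B          60         15         25
S B C          55         15         30
C B S          50         40         10
C S B          60         30         10
Contribution   49.16667   29.16667   21.66667
Note: in order B S C, Sally adds 70 - 35 = 35 and Cony adds 100 - 70 = 30.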
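The enumeration over joining orders can be reproduced with a short script. This is a minimal sketch using the coalition values from the example; the function name is illustrative, and splitting the 10000 coins in proportion to the Shapley values is an assumption, since the slides only pose the question.

```python
from itertools import permutations

players = ["Brown", "Sally", "Cony"]

# Monsters beaten in 1 hour by each coalition, taken from the example.
value = {
    frozenset(): 0,
    frozenset({"Brown"}): 35,
    frozenset({"Sally"}): 15,
    frozenset({"Cony"}): 10,
    frozenset({"Brown", "Sally"}): 70,
    frozenset({"Brown", "Cony"}): 60,
    frozenset({"Sally", "Cony"}): 40,
    frozenset({"Brown", "Sally", "Cony"}): 100,
}

def shapley_values(players, value):
    """Average each player's marginal contribution over all joining orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            totals[p] += value[joined] - value[coalition]
            coalition = joined
    return {p: t / len(orders) for p, t in totals.items()}

phi = shapley_values(players, value)

# Assumed payout rule: coins proportional to each player's Shapley value.
total_monsters = value[frozenset(players)]
coins = {p: 10000 * v / total_monsters for p, v in phi.items()}

for p in players:
    print(p, round(phi[p], 5), round(coins[p], 2))
```

The Shapley values always sum to the value of the full team (100 monsters here), so the proportional split distributes exactly 10000 coins.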

Slide 27

Slide 27 text

XAI
SHAP on Natural Language Model (Transformers)
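To illustrate how Shapley values carry over to text, the sketch below treats each token as a player and a model score as the coalition value. It is a toy under stated assumptions: the keyword weights and scoring function are hypothetical stand-ins for a real Transformer, and the exact enumeration over all token orders is only feasible for very short inputs. It does not reproduce how the SHAP package handles Transformers.

```python
from itertools import permutations

# Hypothetical keyword weights standing in for a trained classifier.
WEIGHTS = {"best": 0.5, "cure": 0.3}

def score(tokens):
    """Toy model: sum of keyword weights plus a bonus when both keywords occur."""
    s = sum(WEIGHTS.get(t, 0.0) for t in tokens)
    if "best" in tokens and "cure" in tokens:
        s += 0.2
    return s

def token_shapley(tokens):
    """Exact Shapley value per token; absent tokens are simply dropped."""
    n = len(tokens)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        kept = set()
        prev = score([])
        for i in order:
            kept.add(i)
            cur = score([tokens[j] for j in sorted(kept)])
            phi[i] += cur - prev  # marginal contribution of token i
            prev = cur
    return [p / len(orders) for p in phi]

tokens = ["the", "best", "cure"]
for tok, val in zip(tokens, token_shapley(tokens)):
    print(tok, round(val, 3))
```

The attributions sum to the difference between the score of the full text and the score of the empty text, and the interaction bonus is shared equally between the two keywords that trigger it.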

Slide 28

Slide 28 text

XAI

Slide 29

Slide 29 text

XAI
SmartText Portal - Try Result

Slide 30

Slide 30 text

Conclusion

Slide 31

Slide 31 text

Summary
EDA
› Get an overview of the data by means of statistics and visualization before training
XAI
› Explain why the AI makes its decisions, to further convince users or reveal bias
SmartText
› A self-service NLP platform that helps users create models and use them through APIs

Slide 32

Slide 32 text

Future Work
Monitoring and Relabeling
› Monitor the usage and application of the prediction API
› Provide new data when using the prediction API
Data Versioning
› Increase the flexibility of the data for model training
› Realize the concept of data drift

Slide 33

Slide 33 text

Thank you