Slide 2

Slide 2 text

Agenda
- About SHIELD
- Team Tasks
- History of SHIELD
- Team Objective
- Why do we need abuse detection?
- Abuse cases
- How to detect abusers
- Architecture
- Workflow
- Rule-based models
- ML-based models
- History of infra
- Detected abusers
- What SHIELD will do

Slide 4

Slide 4 text

About SHIELD
Team Tasks
- Text Filtering (NLP)
- Agent and Admin
- Abuse Detection (Anomaly Detection)

Slide 5

Slide 5 text

About SHIELD
History of SHIELD
- 2019.02: Start SHIELD
- 2019.03: Release first model
- 2020: Release Story and Birthday models
- 2021: Will release new models

Slide 6

Slide 6 text

About SHIELD
Team Objective
Make Brighter Green for LINE

Slide 8

Slide 8 text

Why do we need abuse detection?
Reason
- Users want to use the service more comfortably.
- Protect users' emotions.
- Users want to see the content they choose to see.

Slide 9

Slide 9 text

Abuse Cases
Comment abuse cases
- Users who only post meaningless counting numbers
- Users who post strings of unintelligible characters
- Users who used the comment feature more than 2,000 times in 1 hour
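The third case lends itself to a simple rate rule. The following is a minimal sketch, assuming comment events arrive as (user_id, timestamp) pairs sorted by time. The function name `find_rate_abusers`, the input shape, and the sliding-window approach are illustrative assumptions, not the SHIELD implementation; only the 2,000-per-hour figure comes from the slide.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 3600   # 1 hour, from the slide
MAX_ACTIONS = 2000      # threshold from the slide


def find_rate_abusers(events, window=WINDOW_SECONDS, limit=MAX_ACTIONS):
    """Return the user IDs that exceed `limit` actions within any
    sliding window of `window` seconds.

    `events` is an iterable of (user_id, unix_timestamp) pairs sorted by
    timestamp. This input shape is a hypothetical choice for the sketch.
    """
    recent = defaultdict(deque)   # user_id -> timestamps inside the window
    abusers = set()
    for user_id, ts in events:
        q = recent[user_id]
        q.append(ts)
        # Drop timestamps that fell out of the window ending at `ts`.
        while q and q[0] <= ts - window:
            q.popleft()
        if len(q) > limit:
            abusers.add(user_id)
    return abusers
```

A rule like this is cheap to evaluate on a stream, but it only catches the volume pattern; the first two cases need content-based checks.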

Slide 11

Slide 11 text

How to detect abusers
Architecture
[Architecture diagram. Labels: Log, Data stream, Infra aggregation module, Aggregation Table, Data storage, Real time, Data processing, Rule-based abuse detection, ML-based abuse detection, ML-based anomaly detection, ML-based spam detection, Anomaly detection, Penalty, Alert, Results Table, Warning Tables, Data analytics, Anomaly Pattern Analysis, Abuse Pattern Analysis, Workflow]

Slide 12

Slide 12 text

How to detect abusers
Workflow
[Two flowcharts]
- Rule based Models: Request, Analyze, Decision Threshold, Service (Yes / No), Penalty, Tuning, Discard
- ML based Models: Research, Implement, Verify, Service (Yes / No), Penalty, Tuning, Discard

Slide 13

Slide 13 text

Alert system
- Purpose: collect and analyze anomaly cases
- Issues a warning, not a penalty
- Linked to a penalty if a certain pattern is found
- Anomaly cases are currently detected using the autoencoder only
- Isolation Forest will also be operational in 2021 2H

Slide 14

Slide 14 text

How to detect abusers
Rule based Models
- LINE Timeline Story
- LINE Timeline for Birthday Card
- LINE Timeline

Slide 15

Slide 15 text

How to detect abusers
ML based Models
- DBSCAN
- Autoencoder
- density_based
- Isolation Forest

Slide 16

Slide 16 text

Density_based
- Self-developed algorithm
- Processes each user's usage by time zone
- Divides the data into cells using the standard deviation
- Uses the number of users in each cell to classify it as normal or abnormal
- Changes an abnormal cell to normal depending on the surrounding cells
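The slide gives only the outline of this in-house algorithm, so the following is a rough sketch of the idea rather than the actual model. It assumes each user is described by two numeric usage features, that a cell is one standard deviation wide on each axis, and that an abnormal cell is restored to normal when at least two of its eight neighbors are normal. The function name `density_cells` and both thresholds are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def density_cells(points, min_users=3, min_normal_neighbors=2):
    """Label each user as normal (True) or abnormal (False).

    `points` maps user_id -> (x, y) feature pair. The thresholds and the
    neighbor rule are assumptions made for this sketch.
    """
    xs = [p[0] for p in points.values()]
    ys = [p[1] for p in points.values()]
    mx, my = mean(xs), mean(ys)
    # Cell width is one standard deviation per axis (guard against zero).
    sx, sy = pstdev(xs) or 1.0, pstdev(ys) or 1.0

    def cell_of(p):
        return (math.floor((p[0] - mx) / sx), math.floor((p[1] - my) / sy))

    cells = {uid: cell_of(p) for uid, p in points.items()}
    counts = Counter(cells.values())
    # Step 1: a cell is normal when it holds enough users.
    normal = {c: n >= min_users for c, n in counts.items()}

    # Step 2: an abnormal cell becomes normal when enough of its eight
    # surrounding cells were normal in step 1.
    final = {}
    for (cx, cy), is_normal in normal.items():
        if not is_normal:
            neighbors = sum(
                normal.get((cx + dx, cy + dy), False)
                for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                if (dx, dy) != (0, 0)
            )
            is_normal = neighbors >= min_normal_neighbors
        final[(cx, cy)] = is_normal
    return {uid: final[c] for uid, c in cells.items()}
```

Users in sparse cells far from any populated cell stay abnormal, while sparse cells on the edge of a dense region are treated as normal.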

Slide 17

Slide 17 text

Density_based

Slide 18

Slide 18 text

Density_based
- Detects ambiguous abuse that does not fall under the rules
- Detects abuse that the rule-based models cannot catch as a pattern

Slide 19

Slide 19 text

Density_based

Slide 20

Slide 20 text

DBSCAN
- A density-based clustering algorithm
- No need to set the number of clusters
- Can find clusters of any shape
- Can identify anomalies

Slide 21

Slide 21 text

DBSCAN
- Hyperparameters: MinPts, Epsilon
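To show how the two hyperparameters interact, here is a minimal pure-Python sketch of DBSCAN. A point is a core point when at least `min_pts` points (itself included) lie within `eps` of it; points reachable from no core point are labeled -1, and those are the anomaly candidates. The function name `dbscan` and the sample values are illustrative, and a production system would more likely use a library implementation.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN. Returns one label per point: 0, 1, ... for
    clusters and -1 for noise (candidate anomalies)."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def region(i):
        # Indices of all points within `eps` of point i (including i).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region(i)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)   # j is a core point: expand
    return labels
```

A larger Epsilon or a smaller MinPts merges more points into clusters and reports fewer anomalies; the reverse makes the detector stricter.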

Slide 22

Slide 22 text

DBSCAN
- Detects abusers missed by the density_based model

Slide 23

Slide 23 text

Autoencoder
- Easy to implement
- Consists of an encoder and a decoder
- Anomalies have a large reconstruction error
- Intuitive anomaly detection
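The reconstruction-error idea can be illustrated with a very small stand-in model. The sketch below is a tied-weight linear autoencoder with a one-dimensional bottleneck, trained by plain gradient descent; the slides do not describe the real network, so the architecture, the learning rate, and the function names are all assumptions.

```python
def train_linear_autoencoder(data, lr=0.05, epochs=200):
    """Fit a tied-weight linear autoencoder with a 1-D bottleneck.

    Encoder: z = w . x      Decoder: x_hat = z * w
    Trained by per-sample gradient descent on the squared
    reconstruction error.
    """
    dim = len(data[0])
    w = [1.0] + [0.0] * (dim - 1)      # simple deterministic start
    for _ in range(epochs):
        for x in data:
            z = sum(wi * xi for wi, xi in zip(w, x))
            r = [xi - z * wi for wi, xi in zip(w, x)]      # residual
            rw = sum(ri * wi for ri, wi in zip(r, w))
            # Gradient of ||r||^2 w.r.t. w is -2 * (z * r + (r . w) * x).
            w = [wi + 2 * lr * (z * ri + rw * xi)
                 for wi, ri, xi in zip(w, r, x)]
    return w


def reconstruction_error(w, x):
    """Squared distance between x and its reconstruction."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return sum((xi - z * wi) ** 2 for wi, xi in zip(w, x))
```

After training on normal behavior only, a sample whose error exceeds a chosen threshold is flagged. The model never needs a definition of abuse, only a picture of what normal looks like.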

Slide 24

Slide 24 text

Autoencoder

Slide 25

Slide 25 text

Autoencoder
- We cannot know in advance what behavior abusers will use
- We cannot monitor every action of an abuser
- Preemptive detection is needed
- Detects changing abuser patterns

Slide 26

Slide 26 text

Isolation Forest
- Isolates anomalous data quickly
- Useful for high-dimensional data sets
- Does not use density or distance
- Splits data randomly using decision trees
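The random-split idea can be sketched without building explicit trees: follow one point through repeated random axis-aligned splits and count how many are needed to isolate it. This is a simplified illustration, not the full algorithm; it adds the query point to each sample and skips the usual path-length normalization, and the function names and defaults are assumptions.

```python
import random


def path_length(point, data, rng, depth=0, max_depth=12):
    """Depth at which `point` is isolated by random axis-aligned splits."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    # Only features that still vary in `data` can be split on.
    candidates = [f for f in range(len(point))
                  if min(r[f] for r in data) != max(r[f] for r in data)]
    if not candidates:
        return depth
    feature = rng.choice(candidates)
    lo = min(row[feature] for row in data)
    hi = max(row[feature] for row in data)
    split = rng.uniform(lo, hi)
    # Keep only the side of the split that contains `point`.
    if point[feature] < split:
        side = [row for row in data if row[feature] < split]
    else:
        side = [row for row in data if row[feature] >= split]
    return path_length(point, side, rng, depth + 1, max_depth)


def average_path_length(point, data, n_trees=100, sample_size=64, seed=0):
    """Average isolation depth over `n_trees` random partitionings.
    Shorter averages mean the point is easier to isolate."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trees):
        sample = rng.sample(data, min(sample_size, len(data)))
        if point not in sample:
            sample.append(point)
        total += path_length(point, sample, rng)
    return total / n_trees
```

Because each split looks at a single randomly chosen feature, no distance or density computation is involved, which is why the method scales well to many features.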

Slide 27

Slide 27 text

Isolation Forest

Slide 28

Slide 28 text

Isolation Forest
- Preemptive detection is needed
- Detects abuse patterns according to major behaviors

Slide 29

Slide 29 text

How to detect abusers
History of infra
- 2019: LINE common DW
- 2020: Elasticsearch
- 2021: ClickHouse

Slide 30

Slide 30 text

How to detect abusers
Characteristics
- LINE Common DW: stores almost all LINE logs; convenient to use
- Elasticsearch: ELK (Elasticsearch, Logstash, Kibana); search engine
- ClickHouse: column-based DB

Slide 31

Slide 31 text

How to detect abusers
Pros. & Cons.
- LINE Common DW
  - Pros: data diversity, data persistence
  - Cons: time delay, slow query response
- Elasticsearch
  - Pros: fast storage speed, fast search
  - Cons: difficult big data aggregation
- ClickHouse
  - Pros: fastest query response, easy data aggregation
  - Cons: lack of references

Slide 32

Slide 32 text

How to detect abusers
Why do we use ClickHouse?
Purpose of use
- LINE common data warehouse: data analysis
- Elasticsearch: search
- ClickHouse: near-real-time data processing

Slide 34

Slide 34 text

Detected Abusers
Results of 2021 1H
[Two charts: ML based results and Rule based results, monthly from Jan. to Jun.; series: Timeline]

Slide 35

Slide 35 text

Detected Abusers
Results of 2021 1H
[Chart: Rule based results, monthly from Jan. to Jun.; series: Story, Birthday]

Slide 36

Slide 36 text

What SHIELD will do
Next Plan
- Text Filter
  - Improve text filter performance
  - Build a user negative-behavior score
- Manager for abuse detection and text filtering results
  - Make the system more convenient
  - Build a monitoring system for abuse
- Abuse Detection
  - Add rule-based and ML-based models
  - Build an alert system
  - Build a user negative-behavior score

Slide 37

Slide 37 text

Thank you