Agenda
- About SHIELD
- Team Tasks
- History of SHIELD
- Team Objective
- Why do we need abusing detection?
- Abusing cases
- How to detect abusers
- Architecture
- Workflow
- Rule based Models
- ML based Models
- History of infra
- Detected Abusers
- What SHIELD will do
Slide 4
Slide 4 text
About SHIELD
Team Tasks
- Text Filtering (NLP)
- Agent and Admin
- Abusing Detection (Anomaly Detection)
Slide 5
Slide 5 text
About SHIELD
History of SHIELD
- 2019.02: Start SHIELD
- 2019.03: Release first model
- 2020: Release Story and Birthday models
- 2021: Will release new models
Slide 6
Slide 6 text
About SHIELD
Team Objective
Make Brighter Green for LINE
Slide 8
Slide 8 text
Why do we need abusing detection?
Reason
- Users want to use the service more comfortably.
- Protect users' feelings.
- Users want to see the content they are looking for.
Slide 9
Slide 9 text
Abusing Cases
Comment abusing cases
- A user who only posts meaningless counting numbers
- A user who posts lists of unintelligible strings
- A user who commented more than 2,000 times in 1 hour
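The last case above is a rate pattern, so it can be expressed as a sliding-window count. The following is a minimal sketch only: the class name `HourlyRateRule` and its parameters are hypothetical, the 2,000-per-hour figure is taken from the case above, and the slides do not say that SHIELD implements its rules this way. It assumes timestamps arrive in non-decreasing order for each user.

```python
from collections import deque


class HourlyRateRule:
    """Flags a user whose action count inside a sliding window exceeds a limit."""

    def __init__(self, limit=2000, window_seconds=3600):
        self.limit = limit
        self.window_seconds = window_seconds
        self.events = {}  # user_id -> deque of timestamps in seconds

    def record(self, user_id, timestamp):
        """Record one action; return True if the user is now over the limit."""
        q = self.events.setdefault(user_id, deque())
        q.append(timestamp)
        # Drop events that have fallen out of the window.
        while q and q[0] <= timestamp - self.window_seconds:
            q.popleft()
        return len(q) > self.limit
```

Each call to `record` costs amortized constant time, because every timestamp is appended once and removed at most once.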
Slide 11
Slide 11 text
How to detect abusers
Architecture
[Figure: architecture diagram. Labeled components: Log; Data stream; Infra aggregation module; Aggregation Table; Data storage; Real time data processing; Data analytics; Rule base abusing detection; ML base abusing detection; ML base anomaly detection; ML base spam detection; Anomaly detection; Anomaly pattern analysis; Abusing pattern analysis; Penalty Results Table; Alert Warning Tables; Workflow.]
Slide 12
Slide 12 text
How to detect abusers
Workflow
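[Figure: two workflow flowcharts.]
- Rule based Models: Request, Analyze, Decision Threshold, then a Yes/No branch with the outcomes Service, Penalty, Tuning, Discard
- ML based Models: Research, Implement, Verify, then a Yes/No branch with the outcomes Service, Penalty, Tuning, Discard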
Slide 13
Slide 13 text
Alert system
- Purpose: collect and analyze anomaly cases
- Issues warnings, not penalties
- Linked to a penalty if a certain pattern is found
- Anomaly cases are currently detected using only the autoencoder
- Isolation Forest will also go into operation in 2021 2H
Slide 14
Slide 14 text
How to detect abusers
Rule based Models
- LINE Timeline Story
- LINE Timeline for Birthday Card
- LINE Timeline
Slide 15
Slide 15 text
How to detect abusers
ML based Models
- DBSCAN
- Autoencoder
- density_based
- Isolation Forest
Slide 16
Slide 16 text
Density_based
- Processes data according to each user's usage by time zone
- Divides the data into cells using the standard deviation
- Algorithm developed in-house
- A cell classified as abnormal is changed to normal depending on its surrounding cells
- Uses the number of users in each cell to classify it as normal or abnormal
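The bullets above can be read as a grid-density scheme. The sketch below is one loose interpretation, not the team's in-house algorithm, whose details the slides do not give. Several things are assumptions for illustration: the function name, the two-feature input, the one-standard-deviation cell width, and the `min_users` and `min_normal_neighbors` thresholds.

```python
from collections import Counter
from math import floor
from statistics import mean, pstdev


def grid_density_outliers(points, min_users=3, min_normal_neighbors=2):
    """points: dict of user_id -> (x, y). Returns the user ids in abnormal cells."""
    xs = [p[0] for p in points.values()]
    ys = [p[1] for p in points.values()]
    mx, my = mean(xs), mean(ys)
    # Cell width is one standard deviation per axis (1.0 if the axis is constant).
    sx = pstdev(xs) or 1.0
    sy = pstdev(ys) or 1.0

    def cell_of(p):
        return (floor((p[0] - mx) / sx), floor((p[1] - my) / sy))

    cells = {u: cell_of(p) for u, p in points.items()}
    counts = Counter(cells.values())

    # Step 1: cells with few users start as abnormal.
    normal = {c for c, n in counts.items() if n >= min_users}
    abnormal = set(counts) - normal

    # Step 2: an abnormal cell with enough normal neighbours becomes normal.
    rescued = set()
    for (i, j) in abnormal:
        neighbours = sum(
            (i + di, j + dj) in normal
            for di in (-1, 0, 1)
            for dj in (-1, 0, 1)
            if (di, dj) != (0, 0)
        )
        if neighbours >= min_normal_neighbors:
            rescued.add((i, j))
    abnormal -= rescued

    return {u for u, c in cells.items() if c in abnormal}
```

The second step is what keeps sparse cells on the edge of a dense region from being flagged, which matches the "change to normal depending on the surrounding cells" bullet.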
Slide 17
Slide 17 text
Density_based
Slide 18
Slide 18 text
Density_based
- Ambiguous abusing that does not fall under the rules
- Abusing that the rule based models cannot detect as a pattern
Slide 19
Slide 19 text
Density_based
Slide 20
Slide 20 text
DBSCAN
- No need to set the number of clusters
- Can find any shape of cluster
- A density-based clustering algorithm
- Can identify anomalies
Slide 21
Slide 21 text
DBSCAN
- Hyperparameters: MinPts, Epsilon
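To show what the two hyperparameters control, here is a compact pure-Python sketch of standard DBSCAN. It is not the team's production code; the function name and the `eps` and `min_pts` argument names are chosen for illustration. A point is a core point when at least `min_pts` points (itself included) lie within distance `eps`, and points reachable from no core point are labeled noise, which is what makes them anomaly candidates.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    labels = [None] * len(points)

    def region(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region(i)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region(j)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels
```

This version recomputes neighbourhoods by brute force, so it is quadratic in the number of points; a spatial index would be needed for realistic data volumes.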
Slide 22
Slide 22 text
DBSCAN
- Detects abusers that the density_based model misses
Slide 23
Slide 23 text
Autoencoder
- Easy to implement
- Consists of an encoder and a decoder
- Anomalies have a large reconstruction error
- Intuitive anomaly detection
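The reconstruction-error idea can be illustrated with a deliberately tiny model. The sketch below is a toy linear autoencoder with one latent value and tied encoder/decoder weights, trained by plain gradient descent. It is an assumption-laden illustration, not the network SHIELD uses: the architecture, initial weights, learning rate, and data are all made up.

```python
def encode(w, x):
    # Encoder: project the 2-D input onto a single latent value.
    return w[0] * x[0] + w[1] * x[1]


def decode(w, h):
    # Decoder: map the latent value back to 2-D (weights tied to the encoder).
    return (h * w[0], h * w[1])


def reconstruction_error(w, x):
    x_hat = decode(w, encode(w, x))
    return (x[0] - x_hat[0]) ** 2 + (x[1] - x_hat[1]) ** 2


def train(data, w=(0.5, 0.1), lr=0.05, epochs=500):
    """Fit the tied weights on normal data by stochastic gradient descent."""
    w = list(w)
    for _ in range(epochs):
        for x in data:
            h = encode(w, x)
            x_hat = decode(w, h)
            r = (x[0] - x_hat[0], x[1] - x_hat[1])  # residual
            rw = r[0] * w[0] + r[1] * w[1]
            # Gradient of the squared error with respect to the tied weights.
            grad = [-2 * (h * r[k] + rw * x[k]) for k in range(2)]
            w = [w[k] - lr * grad[k] for k in range(2)]
    return tuple(w)


# Normal behavior lies along the line y = x; the model learns that direction.
normal = [(-1.0, -1.0), (-0.5, -0.5), (0.5, 0.5), (1.0, 1.0)]
weights = train(normal)
```

After training, points that follow the learned pattern reconstruct almost perfectly, while a point such as `(1.0, -1.0)` does not, so thresholding `reconstruction_error` separates the two.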
Slide 24
Slide 24 text
Autoencoder
Slide 25
Slide 25 text
Autoencoder
- We cannot know in advance what behavior abusers will use
- We cannot monitor every action of an abuser
- Preemptive detection is needed
- Detects changing abuser patterns
Slide 26
Slide 26 text
Isolation Forest
- Isolates anomalous data quickly
- Useful for high-dimensional data sets
- Does not use density or distance measures
- Splits data randomly using decision trees
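The random-splitting idea can be sketched in pure Python as follows. This follows the standard Isolation Forest recipe (random feature, random split value, score based on average path length) and is only an illustration. The function names, tree representation, and default parameters are assumptions, and the input must contain at least two rows.

```python
import random
from math import ceil, log, log2


def c(n):
    """Average path length of an unsuccessful binary-search-tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    dim = rng.randrange(len(rows[0]))  # random feature
    lo = min(r[dim] for r in rows)
    hi = max(r[dim] for r in rows)
    if lo == hi:
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)  # random split value
    left = [r for r in rows if r[dim] < split]
    right = [r for r in rows if r[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, x, depth=0):
    if "size" in tree:
        # Leaf: add the expected remaining depth for the points left in it.
        return depth + c(tree["size"])
    child = tree["left"] if x[tree["dim"]] < tree["split"] else tree["right"]
    return path_length(child, x, depth + 1)


def anomaly_scores(rows, n_trees=100, sample_size=64, seed=0):
    """Return a score in (0, 1) per row; values near 1 suggest anomalies."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(rows))
    max_depth = ceil(log2(sample_size))
    trees = [
        build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    scores = []
    for x in rows:
        avg = sum(path_length(t, x) for t in trees) / n_trees
        scores.append(2 ** (-avg / c(sample_size)))
    return scores
```

Anomalies tend to be separated from the rest after only a few random splits, so their average path is short and their score is high; no distance or density computation is involved.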
Slide 27
Slide 27 text
Isolation Forest
Slide 28
Slide 28 text
Isolation Forest
- Preemptive detection is needed
- Detects abusing patterns based on major behaviors
Slide 29
Slide 29 text
How to detect abusers
History of infra
- 2019: LINE Common DW
- 2020: Elasticsearch
- 2021: ClickHouse
Slide 30
Slide 30 text
How to detect abusers
Characteristics
- LINE Common DW: stores almost all LINE logs; convenient to use
- Elasticsearch: part of ELK (Elasticsearch, Logstash, Kibana); search engine
- ClickHouse: column-based DB
Slide 31
Slide 31 text
How to detect abusers
Pros. & Cons.
- LINE Common DW
  - Pros: data diversity; data persistence
  - Cons: time delay; slow query response
- Elasticsearch
  - Pros: fast storage speed; fast search
  - Cons: big data aggregation is difficult
- ClickHouse
  - Pros: fastest query response; easy data aggregation
  - Cons: lack of references
Slide 32
Slide 32 text
How to detect abusers
Why do we use ClickHouse?
- Purpose of use
  - LINE Common data warehouse: data analysis
  - Elasticsearch: search
  - ClickHouse: near-real-time data processing
Slide 34
Slide 34 text
Detected Abusers
Results of 2021 1H
[Figure: two monthly charts for Timeline, Jan. to Jun.: ML based results and Rule based results. One y-axis runs from 0 to 50,000 and the other from 0 to 10,000.]
Slide 35
Slide 35 text
Detected Abusers
Results of 2021 1H
[Figure: Rule based results by month, Jan. to Jun., y-axis from 0 to 60, series: Story and Birthday.]
Slide 36
Slide 36 text
What SHIELD will do
Next Plan
Text Filter
- Improve text filter performance
- Build a user negative behavior score
Manager for abusing and text filtering results
- Make the system more convenient
- Build a monitoring system for abusing
Abusing Detection
- Add rule based and ML based models
- Build an alert system
- Build a user negative behavior score