
SHIELD : Protect from Abusers in LINE Timeline


LINE DEVDAY 2021

November 10, 2021


Transcript

  1. Agenda
     - About SHIELD
     - Team Tasks
     - History of SHIELD
     - Team Objective
     - Why do we need abusing detection?
     - Abusing cases
     - How to detect abusers
     - Architecture
     - Workflow
     - Rule based Models
     - ML based Models
     - History of infra
     - Detected Abusers
     - What SHIELD will do
  3. About SHIELD: Team Tasks
     - Text Filtering (NLP)
     - Agent and Admin
     - Abusing Detection (Anomaly Detection)
  4. About SHIELD: History of SHIELD
     - 2019.02: Start SHIELD
     - 2019.03: Release first model
     - 2020: Release story and birthday models
     - 2021: Will release new models
  6. Why do we need abusing detection? Reasons
     - Users want to use the service more comfortably.
     - Protect users' emotions.
     - Users want to see the content they want.
  7. Abusing Cases: Comment abusing cases
     - Users who only count off meaningless numbers
     - Users who list unknown strings
     - Users who used the comment feature more than 2,000 times in 1 hour
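The third case above, heavy use within one hour, lends itself to a simple rule. The sketch below is only an illustration and not SHIELD's actual rule: the function name, the `(user_id, timestamp)` event shape, and the sliding-window approach are assumptions. Only the 2,000-per-hour figure comes from the slide.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 3600  # 1 hour, as on the slide
MAX_ACTIONS = 2000     # "over 2,000 times", as on the slide


def find_rate_abusers(events, window=WINDOW_SECONDS, limit=MAX_ACTIONS):
    """Return the user ids that exceed `limit` actions within any `window` seconds.

    `events` is an iterable of (user_id, unix_timestamp) pairs sorted by
    timestamp. This event shape is hypothetical.
    """
    recent = defaultdict(deque)  # user_id -> timestamps still inside the window
    flagged = set()
    for user_id, ts in events:
        timestamps = recent[user_id]
        timestamps.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while timestamps and ts - timestamps[0] >= window:
            timestamps.popleft()
        if len(timestamps) > limit:
            flagged.add(user_id)
    return flagged
```

A deque per user keeps the check cheap: each timestamp is appended once and removed once, so the cost is linear in the number of events.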
  8. Agenda
     - About SHIELD
     - Team Members
     - History of SHIELD
     - Team Objective
     - Why do we need abusing detection?
     - Abusing cases
     - How to detect abusers
     - Architecture
     - Workflow
     - Rule based Models
     - ML based Models
     - History of infra
     - Detected Abusers
     - What SHIELD will do
  9. How to detect abusers
     [Diagram. Labels: ML base abusing detection, ML base anomaly detection, ML base spam detection, Penalty Results Table, Alert Warning Tables, Data stream, Infra aggregation module, Rule base abusing detection, Log, Anomaly detection, Aggregation Table, Data storage, Real time Data processing, Data analytics, Anomaly Pattern Analyse, Abusing Pattern Analyse, Workflow]
  10. How to detect abusers: Workflow
      [Two flowcharts, each with Yes/No branches]
      - Rule based Models: Request, Analyze, Decision Threshold, Service, Penalty, Tuning, Discard
      - ML based Models: Research, Implement, Verify, Service, Penalty, Tuning, Discard
  11. Alert system
      - Purpose: collect and analyze anomaly cases
      - Issues a warning, not a penalty
      - Linked to a penalty if a certain pattern is found
      - Anomaly cases are currently detected using only the autoencoder
      - Isolation Forest will also go into operation in 2021 2H
  12. How to detect abusers: Rule based Models
      - LINE Timeline Story
      - LINE Timeline for Birthday Card
      - LINE Timeline
  13. How to detect abusers: ML based Models
      - DBSCAN
      - Autoencoder
      - density_based
      - Isolation Forest
  14. Density_based
      - Processes each user's usage by time zone
      - Divides the data into cells using the standard deviation
      - Self-developed algorithm
      - An abnormal cell is changed to normal depending on the surrounding cells
      - Uses the number of users in a cell to classify it as normal or abnormal
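The slide gives only the outline of this self-developed algorithm, so the following is a loose sketch of the general idea and not the team's implementation. Everything specific is an assumption: the function names, cells one standard deviation wide measured from the mean, the `min_users` and `min_normal_neighbors` thresholds, and the rule that a sparse cell becomes normal when enough adjacent cells are normal.

```python
import itertools
import math
import statistics
from collections import Counter


def cell_of(point, means, stds):
    """Map a feature vector to a grid cell one standard deviation wide per axis."""
    return tuple(math.floor((v - m) / s) for v, m, s in zip(point, means, stds))


def classify_users(points, min_users=3, min_normal_neighbors=2):
    """Return a list of booleans, True where the user is treated as normal.

    `points` holds one feature vector per user (hypothetical features, for
    example usage counts in two time slots). Both thresholds are assumptions.
    """
    dims = list(zip(*points))
    means = [statistics.mean(d) for d in dims]
    stds = [statistics.pstdev(d) or 1.0 for d in dims]  # avoid division by zero
    cells = [cell_of(p, means, stds) for p in points]
    counts = Counter(cells)

    # First pass: a cell with enough users is normal.
    normal = {c for c, n in counts.items() if n >= min_users}

    # Second pass: a sparse cell surrounded by enough normal cells is rescued.
    offsets = [o for o in itertools.product((-1, 0, 1), repeat=len(means)) if any(o)]
    rescued = set()
    for c in counts:
        if c in normal:
            continue
        neighbors = (tuple(a + b for a, b in zip(c, o)) for o in offsets)
        if sum(n in normal for n in neighbors) >= min_normal_neighbors:
            rescued.add(c)
    normal |= rescued
    return [c in normal for c in cells]
```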
  15. Density_based
      - Ambiguous abuse that does not fall under the rules
      - Abuse that cannot be detected by the patterns of the rule model
  16. DBSCAN
      - No need to set the number of clusters
      - Can find clusters of any shape
      - A density based clustering algorithm
      - Can identify anomalies
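To make the DBSCAN properties above concrete, here is a compact pure-Python sketch of the standard algorithm. It is for illustration only: the `eps` and `min_pts` values used with it would need tuning, and nothing here reflects the features or parameters SHIELD actually uses. Points that end up labeled `NOISE` are the anomaly candidates.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE (-1) marks outliers."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # core point: keep expanding
                queue.extend(reachable)
        cluster += 1
    return labels
```

The number of clusters is never passed in: it falls out of `eps` and `min_pts`, which is the first property listed on the slide.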
  17. Autoencoder
      - Easy to implement
      - Consists of an encoder and a decoder
      - Anomalies have a large reconstruction error
      - Intuitive anomaly detection
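The reconstruction-error idea can be shown with a deliberately tiny model. The sketch below is an assumption-heavy toy, not the model from the talk: a linear autoencoder with two inputs, one hidden unit, and no biases, trained with plain stochastic gradient descent. The function names, learning rate, and epoch count are all hypothetical.

```python
def train_autoencoder(data, epochs=300, lr=0.02):
    """Train a linear 2 -> 1 -> 2 autoencoder (no biases) with plain SGD."""
    enc = [0.5, 0.5]  # encoder weights: input -> hidden
    dec = [0.5, 0.5]  # decoder weights: hidden -> output
    for _ in range(epochs):
        for x in data:
            h = enc[0] * x[0] + enc[1] * x[1]            # encode
            err = [dec[0] * h - x[0], dec[1] * h - x[1]]  # decode minus input
            back = err[0] * dec[0] + err[1] * dec[1]      # error signal at hidden unit
            for i in range(2):
                dec[i] -= lr * 2 * err[i] * h
                enc[i] -= lr * 2 * back * x[i]
    return enc, dec


def reconstruction_error(x, enc, dec):
    """Squared distance between x and its reconstruction."""
    h = enc[0] * x[0] + enc[1] * x[1]
    return (dec[0] * h - x[0]) ** 2 + (dec[1] * h - x[1]) ** 2
```

Trained on points that follow one dominant pattern, the model reconstructs similar points well. A point that breaks the pattern cannot pass through the one-unit bottleneck intact, so its reconstruction error is large and can be compared against a threshold.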
  18. Autoencoder
      - Cannot know in advance what behavior abusers will use
      - Cannot monitor all actions of an abuser
      - Preemptive detection is needed
      - Detects changing abuser patterns
  19. Isolation Forest
      - Isolates anomalous data quickly
      - Useful for high dimensional data sets
      - Does not use density or distance
      - Splits data randomly using decision trees
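A minimal pure-Python sketch of the isolation forest idea follows, for illustration only. It uses the standard scoring formula, where a shorter average path means a more anomalous point. The function names, tree count, and sample size are assumptions and say nothing about the configuration planned for SHIELD.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def avg_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup among n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random feature at a random value."""
    if depth >= max_depth or len(rows) <= 1:
        return ("leaf", len(rows))
    dim = rng.randrange(len(rows[0]))
    lo = min(r[dim] for r in rows)
    hi = max(r[dim] for r in rows)
    if lo == hi:
        return ("leaf", len(rows))
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[dim] < split]
    right = [r for r in rows if r[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))


def path_length(tree, row, depth=0):
    """Number of splits needed to reach row's leaf, adjusted for unsplit leaves."""
    if tree[0] == "leaf":
        return depth + avg_path_length(tree[1])
    _, dim, split, left, right = tree
    child = left if row[dim] < split else right
    return path_length(child, row, depth + 1)


def anomaly_scores(rows, n_trees=100, sample_size=256, seed=0):
    """Return one score in (0, 1] per row; higher means more anomalous.

    Requires at least two rows.
    """
    rng = random.Random(seed)
    size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(rows, size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = avg_path_length(size)
    return [2.0 ** (-sum(path_length(t, r) for t in trees) / (n_trees * norm))
            for r in rows]
```

No distance or density is computed anywhere: an outlier is simply separated from the rest after very few random splits.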
  20. How to detect abusers: Characteristics
      - LINE Common DW: stores almost all LINE logs; convenient to use
      - Elastic Search: ELK (Elastic search, Logstash, Kibana); search engine
      - Clickhouse: column based DB
  21. How to detect abusers: Pros. & Cons.
      - LINE Common DW. Pros: data diversity, data persistence. Cons: time delay, slow response with query.
      - Elastic Search. Pros: fast storage speed, fast search. Cons: difficult big data aggregation.
      - Clickhouse. Pros: fastest response with query, easy data aggregation. Cons: lack of references.
  22. How to detect abusers: Why do we use Clickhouse?
      Purpose of use:
      - LINE common data warehouse: data analysis
      - Elastic search: search
      - Clickhouse: near real time data processing
  24. Detected Abusers: Results of 2021 1H
      [Two monthly charts for Timeline, Jan. to Jun.: ML based results and Rule based results. Y-axis ranges shown: 0 to 50,000 and 0 to 10,000.]
  25. Detected Abusers: Results of 2021 1H
      [Monthly chart of Rule based results for Story and Birthday, Jan. to Jun. Y-axis range shown: 0 to 60.]
  26. What SHIELD will do: Next Plan
      - Text Filter
        - Improve text filter performance
        - Make user negative behavior score
      - Manager for abusing and result of text filtering
        - Make more convenient system
        - Make monitoring system for abusing
      - Abusing Detection
        - Add rule and ML based models
        - Make alert system
        - Make user negative behavior score