How Machine Learning helps LINE Fact Checker

BY Jim Horng @LINE TECHPULSE 2019 https://techpulse.line.me/

LINE Developers Taiwan

December 04, 2019

Transcript

  1. How Machine Learning helps LINE Fact Checker
    > Jim Horng / LINE TODAY | Data TF
  2. Agenda
    > The Need for Fact Checker
    > Overview of ML Components
    > Overview of ML System
    > Challenges and Future Work
  3. Agenda
    > The Need for Fact Checker
    > Overview of ML Components
    > Overview of ML System
    > Challenges and Future Work
  4. Near-Duplication - Use Cases
    Verified Fake Message: "footage on Captain's Instagram Stories showed them wearing wedding rings on their both hands, which proves Captain America and Captain Marvel get married in Las Vegas"
    Query | Result | Type
    Captain America and Captain Marvel get married in Las Vegas | True | Partial
    The wedding in Las Vegas is hosted by Captain America and Captain Marvel couple | True | Semantically
    Ironman and Black Widow get married in Los Angeles | False | Syntactically
  5. Near-Duplication - Flow
    > Long Text ➔ Full Match
      • Faster and more trustworthy
    > Short Text ➔ Partial Match + Fuzzy Tolerance
      • 20% of user queries are partial texts of the original messages (e.g. a sentence, or the topic of an article)
  6. Full Match - BERT Based
    > Has a Chinese pre-trained model
    > Can extract sentence vectors from upstream
    > Can capture semantics based on context
    ※ Source: https://towardsdatascience.com/bert-explained-state-of-the-art-language-model-for-nlp-f8b21a9b6270
  7. Message Classification
    "They were in Vegas for the BillBoard Music Awards but, a few hours later, footage on Captain's Instagram Stories showed them wearing wedding rings on their both hands, which proves Captain America and Captain Marvel get married in Las Vegas"
    Topic probability: traffic: 3.7%, life: 38%, art: 49%, health: 2.54%, others: 1.6%, sport: 4.3%, education: 0.7%, law: 0.16%
  8. Agenda
    > The Need for Fact Checker
    > Overview of ML Components
    > Overview of ML System
    > Challenges and Future Work
  9. ML System Under the Hood
    Diagram labels: Serving; Training; Data Ingestion; Async Execution; Model Management; Model Deployment; Index-based Vector Search (ANN); Scheduling, Orchestration
  10. Agenda
    > The Need for Fact Checker
    > Overview of ML Components
    > Overview of ML System
    > Challenges and Future Work
  11. Identify Message With Image
    > VGG16 (ConvNet Configuration D)
    > The convolutional network extracts image features and can capture objects and shapes.
    Example similarity scores shown: 65%, 46%, 82%
  12. Duplicated Reported Messages
    Flow labels: Report Message; Check By Cache / Search Engine; Store As Cache / Search Engine; If New, by Near-Duplication: Store as New; Training; House Keeping: Find Duplicates By Near-Duplication and Merge
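
The routing on slide 5 (long text goes to full match, short text goes to partial match with fuzzy tolerance) can be sketched roughly as follows. This is a minimal illustration, not the talk's implementation: the length cutoff, both thresholds, and every function name are assumptions, and `difflib` character similarity stands in for the BERT-based score that slide 6 describes.

```python
from difflib import SequenceMatcher

# All three constants are assumptions for illustration; the slides give no values.
LONG_TEXT_MIN_CHARS = 80
FULL_MATCH_MIN = 0.9
FUZZY_TOLERANCE = 0.8


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """Character-level similarity in [0, 1]; a stand-in for a BERT-based score."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def full_match(query, message):
    """Compare the whole query against the whole verified message."""
    return similarity(normalize(query), normalize(message)) >= FULL_MATCH_MIN


def partial_match(query, message):
    """Slide a query-sized window over the message and keep the best score."""
    q, m = normalize(query), normalize(message)
    if not q or len(q) > len(m):
        return False
    best = max(
        similarity(q, m[start:start + len(q)])
        for start in range(len(m) - len(q) + 1)
    )
    return best >= FUZZY_TOLERANCE


def match_type(query, message):
    """Return "full", "partial", or None for a query against one verified message."""
    if len(normalize(query)) >= LONG_TEXT_MIN_CHARS:
        return "full" if full_match(query, message) else None
    return "partial" if partial_match(query, message) else None
```

A character-level stand-in like this can handle the "Partial" row of slide 4 and reject the "Syntactically" similar row, but it would likely miss the "Semantically" matching row, which is where sentence vectors come in.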
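
The topic probabilities on slide 7 can be consumed downstream by ranking them. The sketch below only reuses the numbers printed on the slide; the function name `rank_topics` is hypothetical, and how the real classifier produces or uses these scores is not stated in the deck.

```python
# Topic probabilities (percent) exactly as listed on slide 7.
topic_probability = {
    "traffic": 3.7, "life": 38.0, "art": 49.0, "health": 2.54,
    "others": 1.6, "sport": 4.3, "education": 0.7, "law": 0.16,
}


def rank_topics(probabilities):
    """Return (topic, probability) pairs sorted from most to least likely."""
    return sorted(probabilities.items(), key=lambda item: item[1], reverse=True)


# The highest-ranked topic is the predicted class for the message.
top_topic, top_score = rank_topics(topic_probability)[0]
```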
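
Slides 6 and 9 mention sentence vectors and an index-based vector search (ANN), without naming the index. As one simple way to picture the idea, the sketch below uses random-hyperplane hashing, which is a technique swapped in here for illustration and not necessarily what the LINE system uses. The class name `HyperplaneIndex` and its parameters are hypothetical, and the short lists of floats stand in for real sentence vectors.

```python
import math
import random
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


class HyperplaneIndex:
    """Buckets vectors by which side of each random hyperplane they fall on."""

    def __init__(self, dim, n_planes=8, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]
        self.buckets = defaultdict(list)

    def _key(self, vec):
        # One boolean per hyperplane: is the dot product non-negative?
        return tuple(
            sum(p * x for p, x in zip(plane, vec)) >= 0 for plane in self.planes
        )

    def add(self, item_id, vec):
        self.buckets[self._key(vec)].append((item_id, vec))

    def search(self, query, top_k=1):
        """Rank only the vectors in the query's bucket, so results are approximate."""
        candidates = self.buckets.get(self._key(query), [])
        scored = [(item_id, cosine(query, vec)) for item_id, vec in candidates]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]
```

The trade-off is the usual one for approximate search: only one bucket is scanned, which is fast, but a true neighbor that lands in a different bucket is missed.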
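
One possible reading of the slide 12 flow is: a reported message is first checked against what is already stored, and only a message that is new by near-duplication is stored and queued for training. The sketch below follows that reading under stated assumptions: the in-memory dict stands in for the cache / search engine, the 0.9 threshold is invented, `difflib` replaces the real near-duplication component, and the class name `ReportStore` is hypothetical. The housekeeping merge step is not covered.

```python
from difflib import SequenceMatcher


class ReportStore:
    """In-memory stand-in for the cache / search engine on slide 12."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold      # assumed near-duplication cutoff
        self.messages = {}              # message id -> normalized text
        self.training_queue = []        # ids of new messages awaiting training
        self._next_id = 1

    @staticmethod
    def _normalize(text):
        return " ".join(text.lower().split())

    def _find_duplicate(self, text):
        """Return the id of the first stored near-duplicate, or None."""
        for message_id, stored in self.messages.items():
            score = SequenceMatcher(None, text, stored, autojunk=False).ratio()
            if score >= self.threshold:
                return message_id
        return None

    def report(self, text):
        """Return (message_id, is_new) for a reported message."""
        normalized = self._normalize(text)
        existing = self._find_duplicate(normalized)
        if existing is not None:
            return existing, False
        # New by near-duplication: store it and queue it for training.
        message_id = self._next_id
        self._next_id += 1
        self.messages[message_id] = normalized
        self.training_queue.append(message_id)
        return message_id, True
```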