
Building an Adaptive Log Classification System. An Industrial Report

Exactpro
November 08, 2019


Kirill Rudakov, Andrey Novikov, Anton Sitnikov, Evgenii Tsymbalov, Elena Treshcheva and Alexey Zverev

International Conference on Software Testing, Machine Learning and Complex Process Analysis (TMPA-2019)
7-9 November 2019, Tbilisi

Video: https://youtu.be/6IGRVRKL_M4

TMPA Conference website https://tmpaconf.org/
TMPA Conference on Facebook https://www.facebook.com/groups/tmpaconf/


Transcript

  1. 7-9 November, Tbilisi
    Ivane Javakhishvili Tbilisi State University
    Anton Sitnikov, Kirill Rudakov, Andrey Novikov, Evgeny Tsymbalov, Elena Trescheva, Alexey Zverev
    Exactpro
    Building an Adaptive Logs Classification System
    An industrial report


  2. Log analysis in the fintech world
    ● Any feasible way to improve software quality is in high demand
    ● Log observation for passive testing: an elegant and non-disruptive way to discover errors early
    ● Log size is the problem


  3. Reducing complexity


  4. From millions of log lines to ~10k-100k signatures
    ● A signature is a log line with values replaced by placeholders (TIMESTAMP, ID, URL, and many more)
    ● Signatures are extracted with regular expressions, which have to be added manually over time
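A minimal sketch of regex-based signature extraction as described on this slide. The rule set, the sample log lines, and the name `extract_signature` are illustrative assumptions, not the rules used in the reported system.

```python
import re

# Hypothetical placeholder rules (assumed for illustration).
# Order matters: more specific patterns run before the generic ID rule.
RULES = [
    ("TIMESTAMP", re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?")),
    ("URL", re.compile(r"https?://\S+")),
    ("ID", re.compile(r"\b\d+\b")),
]


def extract_signature(line: str) -> str:
    """Replace variable values in a log line with placeholder names."""
    for name, pattern in RULES:
        line = pattern.sub(name, line)
    return line


# Made-up sample log lines that differ only in their values.
lines = [
    "2019-11-08 10:15:02,123 ERROR Order 48213 rejected by https://gw.example.com/api/v1",
    "2019-11-08 10:15:07,456 ERROR Order 48999 rejected by https://gw.example.com/api/v2",
]

# Both lines collapse to the same signature.
signatures = {extract_signature(line) for line in lines}
```

A value that no rule covers (for example an alphanumeric ID) would survive into the signature and produce many near-identical signatures, which is why rules have to be added over time.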


  5. …and then to thousands of clusters
    A cluster is a set of close signatures, ideally corresponding to a single error type for a human engineer
    Why are signatures not enough?
    ● IDs are difficult to catch
    ● Words can be added or deleted, which leads to different signatures
    ● Fast-growing clusters are a sign of a missing signature extraction rule


  6. Clustering approaches

    K-means on log lines
    Train on several days; see what’s new on new days.
    A signature makes a new cluster if it is distant from all existing ones.
    + Can process massive logs
    + User defines radius ad hoc
    - Clusters move over time

    Agglomerative on signatures
    Fix the cluster diameter; make a new cluster if a signature is distant from existing clusters. Keep cluster history forever.
    + Clusters are stable, history is consistent
    - Slower
    - Lots of small clusters; need to introduce “user macro-clusters”
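The fixed-diameter idea from the second approach can be sketched as follows. This is a simplified greedy, incremental scheme with a complete-linkage-style check, not full agglomerative clustering, and it is not the authors' implementation. The function names, the word-set representation, and the threshold value 0.4 are assumptions made for illustration.

```python
def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """1 - |A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def assign(signature: str, clusters: list, max_diameter: float) -> int:
    """Put the signature into the first cluster whose diameter stays within
    max_diameter after adding it; otherwise open a new cluster.
    Existing members are never moved. Returns the cluster index."""
    tokens = frozenset(signature.split())
    for index, members in enumerate(clusters):
        if all(jaccard_distance(tokens, m) <= max_diameter for m in members):
            members.append(tokens)
            return index
    clusters.append([tokens])
    return len(clusters) - 1


clusters = []
a = assign("TIMESTAMP ERROR Order ID rejected by URL", clusters, 0.4)
b = assign("TIMESTAMP ERROR Order ID was rejected by URL", clusters, 0.4)
c = assign("TIMESTAMP WARN Heartbeat lost for session ID", clusters, 0.4)
```

Here the second signature differs from the first by one inserted word and lands in the same cluster, while the third opens a new one. Because members are only ever appended, cluster history stays consistent, at the cost of comparing each new signature against existing members.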


  7. Initial classification and new logs processing

    Initial classification (log archive, a week; 100k - 1M records a day)
    ● Signature extraction; vectorization in the space of 1,2,3-grams: ~10k signatures
    ● Greedy clustering into Jaccard-dense clusters: ~1k dense clusters
    ● Human classification: k-means-facilitated human overview, population of user clusters: hundreds of user clusters
    ● Two clusters mean the same? A human joins them into a user cluster and names it. A user cluster contains one or multiple dense clusters
    ● Result: initial class structure; the UI shows user clusters

    New logs processing (new logs, hour to day)
    ● Vectorization in the space of 1,2,3-grams
    ● New 1,2,3-gram? Add a dimension, re-evaluate distances
    ● New signature? Calculate the distance to old signatures
    ● Does adding it to an existing cluster break the Jaccard compactness criterion? Then it is a new cluster
    ● Old clusters do not move. A user is notified when a new cluster appears
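The n-gram vectorization and the distance step for a new signature can be sketched as below. The slide does not say whether the 1,2,3-grams are word-level or character-level; word-level is assumed here. The function names and sample signatures are hypothetical. Representing each signature as a set of n-grams means a previously unseen n-gram simply becomes a new set element, which plays the role of "adding a dimension" without rebuilding a matrix.

```python
def ngrams(signature: str, max_n: int = 3) -> frozenset:
    """Set of word 1-, 2- and 3-grams of a signature (word-level is an assumption)."""
    words = signature.split()
    return frozenset(
        tuple(words[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(words) - n + 1)
    )


def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """1 - |A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def nearest(new_signature: str, old_signatures: list) -> tuple:
    """Return (distance, signature) for the old signature closest to the new one."""
    new_grams = ngrams(new_signature)
    return min(
        (jaccard_distance(new_grams, ngrams(old)), old) for old in old_signatures
    )


old = ["Order ID rejected by URL", "Heartbeat lost for session ID"]
distance, closest = nearest("Order ID was rejected by URL", old)
```

The returned distance would then be compared against the compactness threshold to decide whether the new signature joins the closest cluster or starts a new one.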


  8. Business use cases
    ● See new error types for a day
    ● Find a cluster (exact or nearest) by raw log line
    ● Compare error portraits of two days (and test runs)
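One possible reading of "comparing error portraits" is sketched below, assuming a portrait is the per-cluster record count for a day or test run. That definition, the function name `compare_portraits`, and the sample counts are assumptions for illustration only.

```python
from collections import Counter


def compare_portraits(day_a: Counter, day_b: Counter) -> dict:
    """Change in each cluster's share of records from day_a to day_b."""
    total_a = sum(day_a.values()) or 1  # avoid division by zero on empty days
    total_b = sum(day_b.values()) or 1
    return {
        cluster: day_b[cluster] / total_b - day_a[cluster] / total_a
        for cluster in sorted(set(day_a) | set(day_b))
    }


# Made-up counts of records per user cluster.
monday = Counter({"order rejected": 80, "heartbeat lost": 20})
tuesday = Counter({"order rejected": 50, "session timeout": 50})

delta = compare_portraits(monday, tuesday)
# Clusters seen on the second day only, i.e. new error types.
new_today = sorted(set(tuesday) - set(monday))
```

Comparing shares rather than raw counts keeps the result meaningful when the two days, or two test runs, differ a lot in total log volume.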


  9. Implementation


  10. DB/backend queries
    ● What new clusters appeared today?
    ● What clusters appeared on Day X but not on Day Y?
    ● When did Cluster X appear during the day, and how often?
    ● What cluster does this log line belong to?
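The first three queries could look roughly like this. The sketch uses Python's built-in `sqlite3` with an in-memory database so that it runs standalone, whereas the talk names MySQL as the backend. The `appearances` table, its columns, and the sample rows are a hypothetical schema, not the one used in the reported system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per appearance of a cluster in the logs.
conn.execute("CREATE TABLE appearances (cluster_id INTEGER, day TEXT, hour INTEGER)")
conn.executemany(
    "INSERT INTO appearances VALUES (?, ?, ?)",
    [
        (1, "2019-11-07", 9), (1, "2019-11-08", 9), (1, "2019-11-08", 9),
        (1, "2019-11-08", 14), (2, "2019-11-07", 11), (3, "2019-11-08", 16),
    ],
)


def new_clusters(day: str) -> list:
    """Clusters whose first ever appearance is on the given day."""
    rows = conn.execute(
        "SELECT cluster_id FROM appearances GROUP BY cluster_id "
        "HAVING MIN(day) = ? ORDER BY cluster_id",
        (day,),
    )
    return [row[0] for row in rows]


def on_x_not_y(day_x: str, day_y: str) -> list:
    """Clusters that appeared on day_x but not on day_y."""
    rows = conn.execute(
        "SELECT cluster_id FROM appearances WHERE day = ? "
        "EXCEPT SELECT cluster_id FROM appearances WHERE day = ? "
        "ORDER BY cluster_id",
        (day_x, day_y),
    )
    return [row[0] for row in rows]


def hourly_counts(cluster_id: int, day: str) -> list:
    """(hour, number of appearances) pairs for one cluster on one day."""
    rows = conn.execute(
        "SELECT hour, COUNT(*) FROM appearances "
        "WHERE cluster_id = ? AND day = ? GROUP BY hour ORDER BY hour",
        (cluster_id, day),
    )
    return list(rows)
```

The fourth query, mapping a raw log line to a cluster, is not a plain database lookup: the line first has to be turned into a signature and compared against stored signatures.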


  11. REST Middleware for UI
    Diagram components:
    ● Parsing (regular, Bash-called)
    ● Clustering (after each parsing, called by parser/bash)
    ● UI server (React or Flask)
    ● Model storage (file): Vocabulary; Signatures x Vocabulary; Distances (Jaccard matrix); Signatures; Clusters & Rangers; Settings
    ● MySQL: clustering settings; macro-clusters; UI settings; signatures, appearances (each)
    Data flows:
    ● Read cluster & new signatures
    ● Save re-calculated clusters
    ● Read clusters



  13. Q&A


  14. Thanks
