Distributed Log Search Based on Time Series Access and Service Relations

Tomoyuki KOYAMA

April 17, 2022

Transcript

  1. Distributed Log Search based on Time Series Access and Service Relations
    Tomoyuki Koyama, Takayuki Kushida
    Tokyo University of Technology
    AINA-2022 / April 15, 2022

  2. Introduction
    • A log is a message that records events within software.
    • Logs support analysis of past processing steps.
    • Logs help administrators find errors in software.
    • Log Management
    • The total volume of logs grows as the number of log messages increases.
    • Millions of log messages need to be retrieved in a short time.
    Example logs:
    March 27, 2022 10:24:13 Started app
    March 27, 2022 10:25:40 Communicated node1
    March 27, 2022 10:25:40 Stored file1
    March 27, 2022 10:35:00 Stopped app
    [Figure: software running on a server writes logs; an administrator searches the logs.]

  3. Introduction
    • Distributed tracing is implemented with logs.
    • It improves the traceability of transactions across microservices.
    • Each transaction is assigned a request identifier, the ’Request ID’.
    • Microservices store log messages with the identifier in log files.
    • The administrator finds these messages by Request ID for root cause analysis.
    Example log message:
    [2021-10-22T00:27:09.383Z] "GET /paper/0416f705-df88-4d5f-82e8-095d4bd89e37/download HTTP/1.1" 200 - via_upstream - "-" 0 736954 134 133 "-" "Python/3.9 aiohttp/3.7.4.post0" "11c0553b-e1cd-9044-b4ce-49576dcbae6c" "paper-app.paper:4000" "10.42.2.65:8000" inbound|8000|| 127.0.0.6:37351 10.42.2.65:8000 10.42.2.64:44452 outbound_.4000_._.paper-app.paper.svc.cluster.local default
    [Figure: users send an HTTP request (transaction) that passes through Microservice A and Microservice B with Request ID=11c0553…; the microservices write log messages with the Request ID to log files; the administrator finds log messages with the Request ID. The steps are numbered (1) to (4).]
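
The lookup by Request ID described on this slide can be sketched in a few lines of Python. This is only an illustration: it assumes the access-log layout of the example message, where the Request ID appears as a quoted UUID-shaped field, and the names `extract_request_id` and `find_by_request_id` are hypothetical, not part of the presented system.

```python
import re
from typing import Iterable, List, Optional

# Assumption: the Request ID is the first field that consists solely of a
# UUID-shaped token enclosed in double quotes, as in the example message.
REQUEST_ID_RE = re.compile(
    r'"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"'
)


def extract_request_id(line: str) -> Optional[str]:
    """Return the Request ID found in one log line, or None if absent."""
    match = REQUEST_ID_RE.search(line)
    return match.group(1) if match else None


def find_by_request_id(lines: Iterable[str], request_id: str) -> List[str]:
    """Keep only the log lines that carry the given Request ID."""
    return [line for line in lines if extract_request_id(line) == request_id]
```

The UUID inside the request path of the example is not picked up, because it is surrounded by slashes rather than quotes.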

  4. Introduction
    • The Scatter-Gather pattern enables horizontal scalability.
    • It is a method for large-scale data processing.
    • Scatter: The root node splits a task into several sub-tasks and scatters them to leaf nodes.
    • Gather: Each leaf node returns the result of its sub-task to the root node.
    • Prerequisite
    • This work applies the Scatter-Gather pattern to log search for distributed tracing.
    [Figure: the admin queries the root node, which scatters sub-tasks to the leaf nodes and gathers their results.]
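
A minimal sketch of the Scatter-Gather pattern applied to log search, in Python. It is not the presented implementation: leaf nodes are simulated as in-memory lists of log lines, a thread pool stands in for network calls, and the names `search_leaf` and `scatter_gather` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def search_leaf(logs: List[str], request_id: str) -> List[str]:
    """Sub-task run on one leaf node: scan only that node's logs."""
    return [line for line in logs if request_id in line]


def scatter_gather(leaves: Dict[str, List[str]], request_id: str) -> List[str]:
    """Root node: scatter one sub-task per leaf node, then gather the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(leaves))) as pool:
        # Scatter: submit one search sub-task per leaf node.
        futures = [
            pool.submit(search_leaf, logs, request_id)
            for logs in leaves.values()
        ]
        # Gather: collect the partial results in submission order.
        results: List[str] = []
        for future in futures:
            results.extend(future.result())
    return results
```

In this simple form every leaf node is queried on every search, which is the "all parallel" baseline the later slides compare against.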

  5. Issue – Log Search
    ◆ A simple search method accesses all logs in parallel.
    Search response time corresponds to the volume of accessed logs.
    As the volume of logs accessed by a search increases, search response time increases.
    As the total volume of logs increases, search response time increases.
    ◆ Reducing search response time is useful for troubleshooting.
    A shorter response time reduces the total time needed to resolve problems.
    Needs: a method for reducing the volume of logs accessed on search

  6. Proposed Method
    • Proposes a fast log search method for distributed tracing.
    • Reduces the amount of log data accessed during search.
    • Focuses on time-series access patterns of log data and on service relations.
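    [Figure: system overview. Store Phase: logs from Microservices A, B and C are clustered by datetime and microservice into blocks; service relations among A, B and C are obtained through service discovery (Istio); blocks are moved to leaf nodes according to a placement rule. Search Phase: the admin sends a search query; labels shown are root node, block list and leaf nodes.]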
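
The "clustering by datetime & microservice" step of the Store Phase can be illustrated as follows. This is a simplified sketch under stated assumptions: it groups log records by service and calendar month, whereas the slides describe blocks of a fixed size; `LogRecord`, `BlockKey` and `cluster_into_blocks` are hypothetical names.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class LogRecord:
    service: str         # microservice that wrote the message
    timestamp: datetime  # time at which the message was written
    message: str


# Assumption: a block is identified by (service, "YYYY-MM").
BlockKey = Tuple[str, str]


def cluster_into_blocks(records: List[LogRecord]) -> Dict[BlockKey, List[LogRecord]]:
    """Group records into blocks keyed by microservice and month."""
    blocks: Dict[BlockKey, List[LogRecord]] = defaultdict(list)
    # Sorting first keeps the records inside each block in time order.
    for record in sorted(records, key=lambda r: r.timestamp):
        key = (record.service, record.timestamp.strftime("%Y-%m"))
        blocks[key].append(record)
    return dict(blocks)
```

Because each block then covers one service and one time range, a search can decide whether to open a block from its key alone.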

  7. Proposed Method – Reduction of search target logs
    Service relations correspond to the chronological order among logs.
    Example) Service A sends a request to Service B.
    Service B writes a log message after Service A writes a log message (= time-series access pattern).
    This reduces the number of blocks accessed in the Search Phase.
    [Figure: microservices A and B, where A sends (1) a request to B and B returns (2) a response, with log messages "sent" and "received". Blocks of Microservice A and Microservice B, clustered by datetime and microservice, lie along a datetime axis; within the search target period some blocks are accessed and the others are left unaccessed.]
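
The slide does not spell out the exact block-selection rule, so the following Python is only one possible reading of it. It assumes that once the time of Service A's log message for a request is known, only the blocks of the called Service B that can contain later messages need to be opened. `Block` and `blocks_to_access` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Block:
    service: str
    start: datetime  # timestamp of the first log message in the block
    end: datetime    # timestamp of the last log message in the block


def blocks_to_access(
    blocks: List[Block],
    callee: str,
    caller_logged_at: datetime,
    period_end: datetime,
) -> List[Block]:
    """Select callee blocks that overlap [caller_logged_at, period_end].

    Assumption: the callee writes its log message after the caller does,
    so callee blocks that end before the caller's message are skipped.
    """
    return [
        block
        for block in blocks
        if block.service == callee
        and block.end >= caller_logged_at
        and block.start <= period_end
    ]
```

Under this assumption the blocks of Service B that precede Service A's message stay unaccessed even though they fall inside the search target period.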

  8. Experimental Method
    • Measures search response time from when a search request is sent until the search response is received.
    • Creates 14 VMs on a hypervisor.
    • CPU: 1 core, RAM: 1 GB, Storage: 30 GB
    • 1 root node, 13 leaf nodes
    • Stores production logs (collected from the production microservices of a paper search website) on leaf nodes.
    • Expands the volume of logs from 1,600 to 8,065,000 messages.
    Search query:
    request_id=xxx bs=8, s_dt_begin="2021-12-13T10:21:50", s_dt_end="2023-02-13T10:21:50"
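
The measurement described above, from sending the search request to receiving the response, could be taken with a small timing wrapper like the one below. This is a sketch rather than the harness used in the experiment; `measure_response_time` is a hypothetical name, and the search itself is passed in as a callable.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def measure_response_time(search: Callable[[], T]) -> Tuple[T, float]:
    """Run one search and return (result, elapsed seconds)."""
    start = time.perf_counter()   # just before the request is sent
    result = search()             # blocks until the response is received
    elapsed = time.perf_counter() - start
    return result, elapsed
```

`time.perf_counter()` is used because it is a monotonic, high-resolution clock, so the measurement is not affected by wall-clock adjustments.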

  9. Experimental Results
    • Compares search response time between the proposed method and the all-parallel method as the search target period expands.
    • The response time of the proposed method is up to 52% shorter than that of the all-parallel method.
    [Figure: two line charts, "Proposed method" and "All parallel method". x-axis: search target period [month], 3 to 24; y-axis: response time [sec], 0 to 3; one series per block size of 4, 8, 16 and 32 MB (proposal-04 to proposal-32, all-parallel-04 to all-parallel-32); lower is better.]

  10. Discussion
    ◆ Block size
    • The proposed method uses a fixed block size.
    • The number of log messages per block is uniform.
    • The file size that can be read and written at once depends on the disk I/O performance of each leaf node.
    • Block size should therefore be calculated from the disk I/O performance.
    • One option is to use the iostat command, which reports I/O performance.

  11. Thank you for listening