Xiaokui Shu
December 13, 2013

Massive Distributed and Parallel Log Analysis For Organizational Security

Presentation for paper in Proceedings of the First International Workshop on Security and Privacy in Big Data

Transcript

  1. Massive Distributed and Parallel Log Analysis For Organizational Security

    Xiaokui Shu, John Smiy, Danfeng (Daphne) Yao, and Heshan Lin. Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24060
  2. Massive Distributed Log Analysis. Challenge and Opportunity: Security Log Analysis, Log Explosion

    Analyzing a TCP request log to find a port scan attack
  3. Challenge and Opportunity: Security Log Analysis, Log Explosion

    Analyzing logs from multiple machines to find a botnet. Figure labels: pattern of connections in the attack channel; pattern of connections in the control channel.
  4. Challenge and Opportunity: Security Log Analysis, Log Explosion

    Log sources: login log, system service log, system audit log, web server log, file sharing log, firewall log, IDS log, VPN log. Features of security logs in the modern era: large amount of logs; vast variety of logs; distributed log generation.
  5. Our Approach: Overview, Workflow, App Example

    Existing general purpose parallel computing framework (MapReduce w/ Distributed File System): data awareness; transparent data flow; no DFS in clouds by default; design for immutable files. Our easy-to-use framework for parallel log analysis: lightweight design; user specific data flow; distributed log importing; streaming log analysis.
  6. Our Approach: Overview, Workflow, App Example

    Streaming log analysis; distributed log importing
  7. Our Approach: Overview, Workflow, App Example

    Security Event Occurrence Counter (IP counter): denial-of-service attack detection; botnet detection; user pattern analysis / anomaly detection. We use a three-layer hashmap counter to implement the application.
  8. Evaluation: System Scalability

    Environment: Amazon EC2 and S3, free tier micro instances (1.0-1.2GHz, 512MB). Application: IP counter (HTTP log).
  9. Conclusion and Future Work

    Conclusion: We build a lightweight, easy-to-use distributed log analysis framework and perform an evaluation on it. Future Work: framework generation; more applications; streaming analysis improvement.
  10. Related Work

    Dean, Jeffrey, and Sanjay Ghemawat. "MapReduce: simplified data processing on large clusters." Communications of the ACM 51.1 (2008): 107-113.

    Logothetis, Dionysios, et al. "In-situ mapreduce for log processing." 2011 USENIX Annual Technical Conference (USENIX ATC’11). 2011.

    Yang, Shun-Fa, Wei-Yu Chen, and Yao-Tsung Wang. "ICAS: An inter-VM IDS Log Cloud Analysis System." Cloud Computing and Intelligence Systems (CCIS), 2011 IEEE International Conference on. IEEE, 2011.

    Francois, Jerome, et al. "BotCloud: Detecting botnets using MapReduce." Information Forensics and Security (WIFS), 2011 IEEE International Workshop on. IEEE, 2011.

    Feng, Junqiu, et al. "Elastic stream cloud (ESC): A stream-oriented cloud computing platform for Rich Internet Application." High Performance Computing and Simulation (HPCS), 2010 International Conference on. IEEE, 2010.

    Andreolini, Mauro, Michele Colajanni, and Stefania Tosi. "A software architecture for the analysis of large sets of data streams in cloud infrastructures." Computer and Information Technology (CIT), 2011 IEEE 11th International Conference on. IEEE, 2011.
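
Slide 7 mentions a three-layer hashmap counter behind the IP counter application, but the transcript does not say what the three layers are keyed on. The sketch below is one possible reading, not the authors' implementation: it assumes the layers are time bucket, event type, and source IP, and it assumes a made-up log line format of "<epoch_seconds> <event_type> <source_ip>". The function names (make_counter, parse_line, count_events, merge, frequent_sources) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def make_counter():
    # Assumed layering (not stated in the slides):
    # time bucket -> event type -> source IP -> occurrence count.
    return defaultdict(lambda: defaultdict(lambda: defaultdict(int)))


def parse_line(line):
    # Hypothetical log format: "<epoch_seconds> <event_type> <source_ip>".
    ts, event, ip = line.split()
    return int(ts), event, ip


def count_events(lines, bucket_seconds=60):
    """Count occurrences per (time bucket, event type, source IP)."""
    counter = make_counter()
    for line in lines:
        ts, event, ip = parse_line(line)
        bucket = ts - ts % bucket_seconds  # start of the time bucket
        counter[bucket][event][ip] += 1
    return counter


def merge(into, other):
    """Add the counts of `other` into `into`, e.g. partial results
    produced by different workers on different log sources."""
    for bucket, events in other.items():
        for event, ips in events.items():
            for ip, n in ips.items():
                into[bucket][event][ip] += n
    return into


def frequent_sources(counter, threshold):
    """Return sorted (bucket, event, ip, count) tuples with count >= threshold."""
    return sorted(
        (bucket, event, ip, n)
        for bucket, events in counter.items()
        for event, ips in events.items()
        for ip, n in ips.items()
        if n >= threshold
    )
```

Under these assumptions, each worker could call count_events on its own log stream, the partial counters could be combined with merge, and frequent_sources with a chosen threshold would flag sources whose request count in a bucket looks like a denial-of-service or scanning pattern. The threshold and bucket size are tuning choices that the slides do not specify.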