BoT2013: Network Analysis in the Era of Massive Data

Allen Own
September 12, 2013

Transcript

  1. Who Am I

    Allen Own. Organizations named on the slide: DEVCORE, CHROOT, HITCON, NISRA.
  2. Big Data

    Big data[1][2] is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, curation, storage,[3] search, sharing, transfer, analysis,[4] and visualization. The trend to larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data, allowing correlations to be found to "spot business trends, determine quality of research, prevent diseases, link legal citations, combat crime, and determine real-time roadway traffic conditions."[5][6][7]
  3. How Big Data fights back against APTs and Malware?

    http://www.seculert.com/blog/2013/05/how-big-data-fights-back-against-apts-and-malware.html
    http://info.umbrella.com/infographic-using-big-data-for-malware-protection.html
  4. Internet Census 2012

    http://internetcensus2012.bitbucket.org/
    While playing around with the Nmap Scripting Engine (NSE) we discovered an amazing number of open embedded devices on the Internet. Many of them are based on Linux and allow login to standard BusyBox with empty or default credentials. We used these devices to build a distributed port scanner to scan all IPv4 addresses. These scans include service probes for the most common ports, ICMP ping, reverse DNS and SYN scans. We analyzed some of the data to get an estimation of the IP address usage.
  5. Internet Census 2012

    http://internetcensus2012.bitbucket.org/download/internet_census_2012.torrent
    Decompressing all data results in 9TB of raw logfiles, but this code can also be used to recompress the data into gzip files. The gzipped dataset should be ~1.5TB.
  6. elasticsearch

    http://www.elasticsearch.org/
    Flexible and powerful open source, distributed real-time search and analytics engine for the cloud. Case studies: Fog Creek, Stack Overflow, SoundCloud, StumbleUpon, GitHub, foursquare, WordPress, salesforce
  7. logstash

    http://logstash.net
    logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use (like, for searching). Speaking of searching, logstash comes with a web interface for searching and drilling into all of your logs. It is fully free and fully open source.
  8. Kibana

    http://kibana.org
    Kibana is an open source (MIT License), browser based interface to Logstash and ElasticSearch. Once you have those in place, Kibana is a breeze to install and configure (really, I swear). And as you'll see below, none too hard to operate. Check out the screenshots for an idea of what Kibana is all about.
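
Slides 6 to 8 describe a pipeline in which logs are collected and parsed into events (the role logstash plays), stored in a search engine (elasticsearch), and then browsed in Kibana. As a rough illustration only, the Python sketch below parses scan-style log lines into structured events and serializes them as newline-delimited JSON in the general shape accepted by Elasticsearch's bulk API. The line layout, the field names, and the index name `census` are hypothetical; they do not come from the talk or from the Internet Census dataset. Nothing is sent over the network, and older Elasticsearch releases may also require a `_type` in the action line.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical line layout (an assumption, not the real dataset format):
#   "<ipv4> <port> <state> <unix_timestamp>"
LINE_RE = re.compile(
    r"^(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<port>\d+)\s+(?P<state>\w+)\s+(?P<ts>\d+)$"
)


def parse_line(line):
    """Turn one raw log line into an event dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    timestamp = datetime.fromtimestamp(int(match.group("ts")), tz=timezone.utc)
    return {
        "ip": match.group("ip"),
        "port": int(match.group("port")),
        "state": match.group("state"),
        "@timestamp": timestamp.isoformat(),
    }


def to_bulk_body(events, index_name):
    """Serialize events as an action line plus a source line per event."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(event))
    # The bulk format is newline-delimited; every line ends with "\n".
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    raw = ["192.0.2.1 80 open 1346457600", "not a log line"]
    events = [e for e in map(parse_line, raw) if e is not None]
    print(to_bulk_body(events, "census"), end="")
```

The resulting string could be sent to a cluster's `_bulk` endpoint with any HTTP client; unparseable lines are simply skipped here, whereas a real pipeline would more likely tag and keep them for inspection.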