Tales from the Ops Side - Creepy Crawler

Another tale from the Ops Side. This talk focuses on an event affecting one of our larger e-commerce clients, where a bad actor was actively attempting to steal their content. It walks through the discovery and mitigation process our team undertook to combat the event.

VM Farms Inc.

June 25, 2018

Transcript

  1. Site down! • Very high load on the main database server (MySQL). • Looking closer, noticed a lot of similar queries (pagination). • Site recovered on its own within 10 minutes.
  2. Investigation • Excessive pagination - could it be a crawler? • Customer does a lot of SEO. • Checked logs, noticed a few Yahoo crawlers (less than 200/hr). • Collected info and went back to life.
  4. Same pattern • High load on DB server. • Traced some queries to a scheduled job running on an admin server. • Further investigation shows it runs every 5 min. • git blame says it’s 1 year old!
  5. Site recovers • Outage lasted about 15 min. • Cron job was prime suspect. • Job is a cache-refresher. • Waited for next execution. Nothing. • Will disable if site goes down again.
  6. Same pattern • High DB load. • Cron was running. • Terminated and disabled the job. • Site did not recover. • Where to look next?
  7. Staring at the logs • Something caught our eye. • Proxy logs showed many requests with Referrer of customer’s blog. • Grouped these requests together and analyzed further.
  8. Staring at the logs • Also noticed: • User-Agent was Baiduspider/2.0. • Request to /en/all-categories • Query string was: manufacturer=<VARYING NUMBER>&on_sale=yes
  9. Is it a crawler? • Was Baidu wreaking havoc? • Known to misbehave. • 10,829 unique IPs in last 24 hours. • All belonged to American residential ISPs (Comcast, Verizon, Roadrunner, etc…).
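The log grouping described on the last few slides could be sketched roughly as follows. This is a minimal illustration, not the tooling the team used: it assumes the proxy writes the common "combined" access-log layout, and the function name `suspicious_ips` and the sample lines (documentation-range IPs, example.com hosts) are invented for the example.

```python
import re
from collections import Counter

# Assumption: proxy logs follow the "combined" access-log layout.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def suspicious_ips(lines, agent="Baiduspider/2.0", path="/en/all-categories"):
    """Count requests per client IP that match the suspicious agent and path."""
    hits = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # skip lines that do not parse
        if agent in m["agent"] and m["path"].startswith(path):
            hits[m["ip"]] += 1
    return hits

# Invented sample lines, for illustration only.
sample = [
    '203.0.113.7 - - [01/Jan/2000:00:00:01 +0000] '
    '"GET /en/all-categories?manufacturer=12&on_sale=yes HTTP/1.1" 200 512 '
    '"https://blog.example.com/post" "Mozilla/5.0 (compatible; Baiduspider/2.0)"',
    '203.0.113.7 - - [01/Jan/2000:00:00:02 +0000] '
    '"GET /en/all-categories?manufacturer=13&on_sale=yes HTTP/1.1" 200 512 '
    '"https://blog.example.com/post" "Mozilla/5.0 (compatible; Baiduspider/2.0)"',
    '198.51.100.9 - - [01/Jan/2000:00:00:03 +0000] '
    '"GET /en/all-categories?manufacturer=14&on_sale=yes HTTP/1.1" 200 512 '
    '"https://blog.example.com/post" "Mozilla/5.0 (compatible; Baiduspider/2.0)"',
    '192.0.2.4 - - [01/Jan/2000:00:00:04 +0000] '
    '"GET /en/home HTTP/1.1" 200 128 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp)"',
]

hits = suspicious_ips(sample)
unique_ip_count = len(hits)
```

Counting the keys of `hits` gives the number of unique source IPs; looking up the owner of each address (the ISP check on this slide) would be a separate step that is not shown here.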
  10. Suspicious • Customer mainly targets Canada with some US business. • High volume of American IPs during previous 2 outages. • Site recovered in 15 min. • Flow of American IPs stopped.
  11. Not Baidu • Convinced us this was malicious. Time to block. • Want to be surgical. • IPs were residential and varied, so they can’t be used for blocking.
  12. Surgical blocking • Requests for /en/all-categories • User-Agent was Baiduspider/2.0 • Is on_sale=yes • Had a Referrer of customer’s blog!
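As a rough sketch, the four conditions on this slide can be expressed as a single predicate in which every condition must hold. The talk does not say where the real rule was deployed, so this is only an illustration; the function name `should_block` and the blog hostname `blog.example.com` are placeholders.

```python
from urllib.parse import parse_qs, urlsplit

def should_block(request_target, user_agent, referrer, blog_host="blog.example.com"):
    """Return True only when all four 'surgical' conditions match."""
    parts = urlsplit(request_target)
    query = parse_qs(parts.query)
    return (
        parts.path == "/en/all-categories"          # targeted page
        and "Baiduspider/2.0" in user_agent          # claimed crawler identity
        and query.get("on_sale") == ["yes"]          # fixed query parameter
        and urlsplit(referrer).hostname == blog_host  # Referrer is the blog
    )

target = "/en/all-categories?manufacturer=42&on_sale=yes"
agent = "Mozilla/5.0 (compatible; Baiduspider/2.0)"

blocked_with_referrer = should_block(target, agent, "https://blog.example.com/post")
blocked_without_referrer = should_block(target, agent, "")
```

Because the rule is a conjunction, a request that changes any one field, such as sending an empty Referrer, no longer matches. That is the price of a narrow rule.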
  13. Checked logs • Our rules were not being matched anymore. • American IPs were coming through. • Referrer was now empty! • Attacker was active and adapting.
  14. Modify block • Remove Referrer condition. • Not preferable - less surgical. • Site recovers immediately.
  15. Log check • Same request pattern, same American IPs. • Still empty Referrer. • No longer Baiduspider/2.0. • User-Agent was now Chrome!
  16. User-Agent • Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.115 Safari/537.36 • This version was 6 months old!
  17. Old Chrome • All the suspicious requests had the same User-Agent. • Unusual since Chrome is good at auto-updating. • This version had known vulnerabilities.
  19. CVE-2014-9689 aka: Gyrophone • Gyroscopes found on modern smart phones are sufficiently sensitive to measure acoustic signals in the vicinity of the phone. • Likely not relevant.
  20. Adapt block • Strange that so many requests have the same User-Agent. • Possible, but unlikely. Client was hesitant. • Adapted block to include this version of Chrome. • Site recovers immediately.
  21. New pattern • Different request: /en/gear-equipment. • Same American IPs, same User-Agent. • Decided to block User-Agent outright. • Site recovers immediately.
  22. New User-Agent • Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/1413019477.612818924.1339797527.1263967477 Safari/537.36
  23. Getting very annoyed • Clearly an invalid version. • “Version” numbers would vary. • OK fine. Let’s play.
  24. Time for RegEx • (\d{6,})\.(\d{6,})\.(\d{6,})\.(\d{6,}) • This matches 4 dot-separated groups of 6 or more digits, like this: 1413019477.612818924.1339797527.1263967477
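To see the pattern at work, here is a small check that runs the slide's regular expression against the two User-Agent strings quoted earlier in the talk. The helper name `has_bogus_version` is made up for this sketch; where the expression was actually applied is not stated in the talk.

```python
import re

# The expression from the slide: four dot-separated groups of 6+ digits.
BOGUS_VERSION = re.compile(r"(\d{6,})\.(\d{6,})\.(\d{6,})\.(\d{6,})")

def has_bogus_version(user_agent):
    """True when the User-Agent contains an implausible four-part version."""
    return BOGUS_VERSION.search(user_agent) is not None

fake_ua = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/1413019477.612818924.1339797527.1263967477 Safari/537.36"
)
real_ua = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/40.0.2214.115 Safari/537.36"
)

fake_matches = has_bogus_version(fake_ua)
real_matches = has_bogus_version(real_ua)
```

A genuine Chrome version such as 40.0.2214.115 never has a six-digit component, so the expression matches the randomised strings without catching the real browser string.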
  26. Different pattern • Nothing in the logs. No traffic flow. • DB was not loaded this time. • App servers had very high load. • Every app process consuming 100% CPU. • Attacker upping their game?
  27. strace • Attached strace to a php-fpm process. • No output. Very odd. • Restarted app processes.
  28. Watch logs • Some requests come in, then flow stops again. • CPU spikes back up immediately. • Quick analysis: about 200 requests after restart. • 20 app servers x 10 processes each = 200.
  29. Isolate • Removed 1 server from the load balancer to isolate. • Configured rules to direct our IP to this server. • Reduced process count to 1. • Attached strace and loaded site in browser.
  30. Analysis • Lots of expected output. • Some calls to the DB and Redis. • Then nothing. CPU spikes.
  31. Deeper dive • Could it be Redis? No, server was healthy. • Turned on “slow request” logging in php-fpm. • Found our culprit! Awesome idea!
  32. while (isset($hash[$code][$unKey])) {
          $unKey .= Mage::getStoreConfig('special_char');
      }
      • $code is “manufacturer” • $unKey is “headhaus” • $hash[$code][$unKey] is 10560
  34. Infinite loop • 'special_char' is set to - • headhaus should become headhaus- then headhaus--, etc… • Mage::getStoreConfig values are cached in Redis, then memory. • Redis returns an empty string. • Did our attacker compromise Redis somehow?
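The failure mode can be reproduced in a few lines. The sketch below is a Python analogue of the PHP loop, not the application's code: `make_unique_key` is a hypothetical name, and the empty-suffix guard is an assumption about how one might defend against this, not a fix described in the talk.

```python
def make_unique_key(taken, key, suffix):
    """Append `suffix` to `key` until the result is not already in `taken`.

    With an empty suffix the key never changes, so the loop below would spin
    forever at 100% CPU without making a single system call. The guard turns
    that into an immediate, visible error instead.
    """
    if not suffix:
        raise ValueError("suffix is empty; the key would never change")
    while key in taken:
        key += suffix
    return key

taken = {"headhaus": 10560}
first = make_unique_key(taken, "headhaus", "-")

taken[first] = 10561  # hypothetical value, for illustration only
second = make_unique_key(taken, "headhaus", "-")

try:
    make_unique_key(taken, "headhaus", "")
    empty_suffix_rejected = False
except ValueError:
    empty_suffix_rejected = True
```

A loop like this runs entirely in user space, which is consistent with strace showing no output while the process sat at 100% CPU.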
  35. More scheduled jobs • admin server has a scheduled job at 3am to clear the cache. • Also had a job at 12am to refresh products. • Doesn’t really smell like an attack. • Went to bed. Needed to rest.
  36. Following morning • Previous day, customer had updated several components of their app. • Newer versions had performance improvements, including phpredis. • Customer was also able to replicate in staging!
  37. Trigger on demand • Running the product refresh job breaks the app. • Running cache-clearing job fixes the app. • What was the product refresh job doing?
  39. Simple logic • Clears several cached entries, including special_char. • Reads in attributes: Mage::getStoreConfig('attributes') (this was returning garbage!) • Refreshes catalogue.
  40. Compressed data • New version of phpredis supported compression. • admin server was not updated. • Its job could not parse the returned value and failed. • special_char was never re-populated. • App breaks.
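The mismatch can be illustrated with a toy example in which one client compresses values before storing them and another, older client reads the raw bytes. This is only a sketch: zlib and JSON are stand-ins chosen for the illustration (the talk does not say which codec or serializer was in play), and all function names are hypothetical.

```python
import json
import zlib

def new_client_write(value):
    """Updated client: serialize, then compress before storing."""
    return zlib.compress(json.dumps(value).encode("utf-8"))

def new_client_read(raw):
    """Updated client: decompress, then deserialize."""
    return json.loads(zlib.decompress(raw).decode("utf-8"))

def old_client_read(raw):
    """Older client: knows nothing about compression and parses raw bytes."""
    try:
        return json.loads(raw.decode("utf-8"))
    except ValueError:
        # Covers both UnicodeDecodeError and json.JSONDecodeError.
        return None

stored = new_client_write({"attributes": ["manufacturer", "on_sale"]})

seen_by_new = new_client_read(stored)
seen_by_old = old_client_read(stored)
```

Here the updated client round-trips the value while the older one gets nothing usable, which mirrors the sequence on this slide: the job on the un-updated admin server fails partway through and never restores the entries it cleared.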
  41. Hypothesis • The scrape attempt lacked sophistication (low skill). • Yet it had access to a very large number of compromised systems (>50k, high skill). • It’s likely the attackers rented a botnet in an effort to steal the customer’s content rapidly.
  42. Lessons • Heavy applications are vulnerable. • Sometimes staring at logs works. Sometimes. • strace telling you nothing tells you everything.