Manage Large Data Sets with Streams

Working with streams sounds scary and complicated, but we’ll show you how to use them to process large data imports without having to sell your house to buy RAM. A stream presents data as a linear sequence that you can read piece by piece, and many streams also let you seek to an arbitrary position. We’ll share tips and tricks we’ve learned along the way to help you dive deep into processing streams.
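As a rough illustration of linear reading and seeking, the sketch below writes four short records to an in-memory `php://memory` stream and then moves the pointer around. The record text and byte offsets are made up for this example and are not taken from the talk.

```php
<?php
// Build a small in-memory stream so the example needs no file on disk.
$stream = fopen('php://memory', 'w+');
fwrite($stream, "Record 1\nRecord 2\nRecord 3\nRecord 4\n");

// Writing left the pointer at the end; rewind() moves it back to byte 0.
rewind($stream);
$first = fgets($stream);

// Each line here is exactly 9 bytes, so byte 18 is the start of line three.
fseek($stream, 18, SEEK_SET);
$third = fgets($stream);

// ftell() reports where the pointer sits after the last read.
$position = ftell($stream);
fclose($stream);

echo trim($first), PHP_EOL;   // Record 1
echo trim($third), PHP_EOL;   // Record 3
echo $position, PHP_EOL;      // 27
```

Not every stream is seekable (a network socket usually is not), so real code should check the return value of `fseek()` before relying on it.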

Streams have been in PHP since the 4.x days, yet we continually see developers iterate over huge data sets and run out of memory. We’ll show you a better solution than “ini_set('memory_limit', '16G');”
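One common alternative to raising the memory limit is to read a file one line at a time. The sketch below is only an assumption about what that can look like: `readCsvRows()` is a hypothetical helper, and the tiny temporary CSV stands in for a much larger import file.

```php
<?php
/**
 * Yield one parsed CSV row at a time so only the current line is held in memory.
 * readCsvRows() is a hypothetical helper for this example, not a PHP built-in.
 */
function readCsvRows(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Unable to open {$path}");
    }
    try {
        // A length of 0 means "no line length limit".
        while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            if ($row === [null]) {
                continue; // fgetcsv() returns [null] for a blank line
            }
            yield $row;
        }
    } finally {
        fclose($handle);
    }
}

// Stand-in data: a real import would be far larger than three lines.
$path = tempnam(sys_get_temp_dir(), 'people');
file_put_contents($path, "id,name\n1,Ada\n2,Grace\n");

// Search the file without ever loading it whole.
$match = null;
foreach (readCsvRows($path) as $row) {
    if ($row[1] === 'Grace') {
        $match = $row;
        break;
    }
}
unlink($path);

echo implode(',', $match), PHP_EOL;
```

Because the generator yields rows lazily, peak memory depends on the longest line rather than on the size of the file, which is the opposite of what `file_get_contents()` or `file()` would do with the same input.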

Joe Ferguson

May 23, 2019

Transcript

  1. Manage Large Data Sets with Streams Joe Ferguson

  2. Who Am I? Joe Ferguson, Senior Full Stack Developer @ Preteckt; Twitter: @JoePFerguson; OSMI Board Member; The Joindin Foundation & Joindin Leadership Team

  3. Agenda: Streams: What they are and why you shouldn’t cross them; Searching a 5 million line CSV; Guzzling Streams with Guzzle

  4. https://www.php.net/manual/en/intro.stream.php: a stream is a resource object which exhibits streamable behavior

  5. Linear Data (chart: Age for Record 1 through Record 4)

  6. scheme://target

  7. You’re already using them: file(), fopen(), fwrite(), fclose(), file_get_contents(), file_put_contents()

  8. Stream Transports

  9. Stream Wrappers

  10. Stream Filters

  11. Stream Context

  12. Size Matters

  13. Size Matters

  14. “I know how to fix it”

  15. “Job’s Done Boss!”

  16. Wait a minute… 2G 2G 2G 2G 2G 2G

  17. What’s Burning?

  18. Save the memory…

  19. Now I have this ISO

  20. Streaming People (Don’t worry, they’re not real)

  21. file_get_contents()

  22. Reading from pointers

  23. What’s a pointer?

  24. Reading from pointers

  25. Memory Usage

  26. Memory Usage

  27. Double Down

  28. Memory Usage

  29. Rewinding Streams

  30. Rewinding Streams

  31. Rewinding Streams

  32. Rewinding Streams

  33. Rewinding Streams

  34. Rewinding Streams

  35. Seeking Around Streams

  36. Seeking Around Streams

  37. Guzzling Streams

  38. Guzzling Streams

  39. Guzzling Streams

  40. Testing Our Response

  41. Guzzling Streams

  42. Read 100 bytes

  43. Read 100 bytes

  44. Content Type

  45. Content Type

  46. Content Type

  47. Drilling Down

  48. Drilling Down

  49. Inspecting a Post

  50. Interesting!

  51. Interesting!

  52. When to use Streams: Reading files that may not fit in memory; Downloading files from a remote system; Fetching data from APIs

  53. Resources

  54. Contact Info: Joe Ferguson; Twitter: @JoePFerguson; Email: joe@joeferguson.me