[Jfokus] Riding the Jet Streams

Viktor Gamov
February 08, 2017


Transcript

  1. Riding Jet Streams @gAmUssA @hazelcast #jfokus #hazelcastjet http://bit.ly/streams_jfokus2017

  2. > whoami: Solutions Architect @Hazelcast; Developer Advocate @Hazelcast; @gamussa in internetz. Please, follow me on Twitter, I’m very interesting ©
  3. Agenda: Quick refresh on Java 8 Streams; Distribute and Conquer; Distributed Data; Distributed Streams; How we did all this
  4. Example: Word Count. Map<Integer, String> where keys are line numbers and values are lines. Find how many times each word occurs
  5. What needs to be done? Iterate through all the lines; split each line into words; update the running total of counts with each new word
  6. Iterate through all the lines:

         fillMapWithData("war_and_peace_eng.txt", source);
         for (String line : source.values()) {
             for (String word : PATTERN.split(line)) {
                 if (word.length() >= 5)
                     count.compute(
                         cleanWord(word).toLowerCase(),
                         (w, c) -> c == null ? 1 : c + 1
                     );
             }
         }
         System.out.println(count.get("andrew"));

  7. (Same code as slide 6.) Split the line into words
  8. (Same code as slide 6.) Update running total of counts with new word
  9. (Same code as slide 6.) Print the result
  10. java.util.stream

  11. Java 8 Streams… An abstraction that represents a sequence of elements; is not a data structure; conveys elements from a source through a pipeline of operations; operations don’t modify the source
  12. Why should I care about Stream API? You’re a Java developer
  13. What does a regular Java developer think about Scala? advanced
  14. Why should I care about Stream API? You’re a Java developer; many Java developers know Java; it’s all about data processing
  15. java.util.stream: map(), flatMap(), filter() are intermediate operations; reduce(), collect() are terminal operations; sorted(), distinct() are stateful intermediate (blocking) operations
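The word count loop from the earlier slides can be rewritten with these operations. What follows is a minimal sketch, not the code from the talk: the class name, the `\W+` split pattern and the two sample lines are assumptions made for illustration, and the talk's `cleanWord` helper is left out because its body is not shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class StreamWordCount {
    // Assumption: words are separated by runs of non-word characters.
    private static final Pattern PATTERN = Pattern.compile("\\W+");

    static Map<String, Long> count(Map<Integer, String> lines) {
        return lines.values().stream()               // source: the lines, left unmodified
                .flatMap(PATTERN::splitAsStream)     // intermediate: one line -> many words
                .map(String::toLowerCase)            // intermediate
                .filter(word -> word.length() >= 5)  // intermediate: same length rule as the loop
                .collect(Collectors.groupingBy(      // terminal: running total per word
                        Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up sample lines standing in for the book text.
        Map<Integer, String> source = new LinkedHashMap<>();
        source.put(1, "Prince Andrew rode to the front");
        source.put(2, "Pierre looked at Andrew");
        System.out.println(count(source).get("andrew"));
    }
}
```

Nothing runs until the terminal `collect()` is called; the intermediate operations only describe the pipeline.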
  16. None
  17. None
  18. None
  19. Why would one need a cluster? One does not simply fit all Big Data in one machine
  20. Problem: Data doesn’t fit on just one machine
  21. Why would one need a cluster? One does not simply put all Big Data in one machine; data is too important to keep on only one machine
  22. None
  23. None
  24. Replication or Sharding? http://book.mixu.net/distsys/single-page.html

  25. Other Requirements: Easy to use; Simple API; Embeddable; Cloud Native
  26. What’s Hazelcast IMDG? In-memory Data Grid; Apache v2 Licensed; Distributed Caches (IMap, JCache); Java Collections (IList, ISet, IQueue); Messaging (Topic, RingBuffer); Computation (ExecutorService, M-R)
  27. 1,900 stars on GitHub; 100% open source; 134 contributors

  28. None
  29. None
  30. None
  31. Green Primary, Green Backup, Green Shard
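The primary/backup shard layout can be sketched in a few lines. This is a deliberately simplified illustration, not Hazelcast's actual partitioning code: the class and method names are hypothetical, `hashCode()` stands in for the hash of the serialized key, and the round-robin owner assignment is an assumption. The only figure taken from Hazelcast is 271, its default partition count.

```java
class ShardingSketch {
    // 271 is Hazelcast's default partition count; any fixed number works for the sketch.
    static final int PARTITION_COUNT = 271;

    // Assumption: hashCode() stands in for the hash of the serialized key.
    static int partitionId(Object key) {
        return Math.floorMod(key.hashCode(), PARTITION_COUNT);
    }

    // Assumption: partitions are spread round-robin over the members.
    static int primaryMember(int partitionId, int memberCount) {
        return partitionId % memberCount;
    }

    // The backup lives on the next member, so with two or more members
    // a primary and its backup never share a machine.
    static int backupMember(int partitionId, int memberCount) {
        return (partitionId + 1) % memberCount;
    }

    public static void main(String[] args) {
        int members = 3;
        for (String key : new String[] {"andrew", "pierre", "natasha"}) {
            int p = partitionId(key);
            System.out.println(key + " -> partition " + p
                    + ", primary on member " + primaryMember(p, members)
                    + ", backup on member " + backupMember(p, members));
        }
    }
}
```

Because the partition count is fixed, adding a member only moves whole partitions between members; keys never change partition.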

  32. None
  33. What’s the problem? Use IMap.values().stream()? Or IMap.entrySet().stream()?
  34. Problem: Data doesn’t fit on just one machine
  35. None
  36. None
  37. EASY (actually, not)! Implement serializable versions of the interfaces. Introducing DistributedStream
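A lambda can only be shipped to another cluster member if its target type is serializable, which `java.util.function.Function` is not. The sketch below shows the general Java technique with plain JDK classes; `SerializableFunction` and the surrounding class are hypothetical names chosen for illustration and are not taken from the Hazelcast Jet API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

class SerializableLambdaSketch {
    // Hypothetical serializable twin of java.util.function.Function.
    @FunctionalInterface
    interface SerializableFunction<T, R> extends Function<T, R>, Serializable { }

    // A method reference targeting the serializable interface can be serialized.
    static SerializableFunction<String, Integer> lengthFn() {
        return String::length;
    }

    // Serializes a value to bytes and reads it back, as sending it over the wire would.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        SerializableFunction<String, Integer> copy = roundTrip(lengthFn());
        System.out.println(copy.apply("andrew"));
    }
}
```

Every functional interface accepted by the stream operations needs such a twin, which is why the slide calls the job not so easy.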
  38. None

  39. None
  40. Jet Streams

  41. jet.hazelcast.org

  42. What’s Hazelcast Jet? General purpose distributed data processing framework; based on a Directed Acyclic Graph to model data flow; built on top of Hazelcast IMDG; comparable to Apache Spark or Apache Flink
  43. None
  44. DAG (diagram labels: SOURCE, four vertices, SINK)
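To make the DAG idea concrete, here is a small plain-Java sketch of vertices, edges and an execution order from source to sink. It does not use the Hazelcast Jet API; the class, its methods and the vertex names `tokenize` and `accumulate` are assumptions made up for this word count illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DagSketch {
    // Adjacency list: vertex name -> names of downstream vertices.
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    DagSketch vertex(String name) {
        edges.putIfAbsent(name, new ArrayList<>());
        return this;
    }

    DagSketch edge(String from, String to) {
        vertex(from);
        vertex(to);
        edges.get(from).add(to);
        return this;
    }

    // Kahn's algorithm: repeatedly emit a vertex whose upstream vertices are all emitted.
    List<String> topologicalOrder() {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String v : edges.keySet()) inDegree.put(v, 0);
        for (List<String> targets : edges.values())
            for (String t : targets) inDegree.merge(t, 1, Integer::sum);

        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : inDegree.entrySet())
            if (e.getValue() == 0) ready.add(e.getKey());

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.remove();
            order.add(v);
            for (String t : edges.get(v))
                if (inDegree.merge(t, -1, Integer::sum) == 0) ready.add(t);
        }
        if (order.size() != edges.size())
            throw new IllegalStateException("graph has a cycle, so it is not a DAG");
        return order;
    }

    public static void main(String[] args) {
        DagSketch dag = new DagSketch()
                .edge("source", "tokenize")
                .edge("tokenize", "accumulate")
                .edge("accumulate", "sink");
        System.out.println(dag.topologicalOrder());
    }
}
```

In a real engine each vertex would run in parallel and pass items along the edges; the sketch only shows that an acyclic graph always has a valid order from source to sink.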
  45. None
  46. Benchmarks: compared to Spark, Flink, Hadoop doing word count, running on a cluster of 9 nodes, 40 cores each
  47. None
  48. Future (It’s bright!): Processing guarantees for stream processing; streaming features (windowing, triggering); higher level streaming and batching APIs; integration with additional Hazelcast structures (ICache, IQueue ..)
  49. Future (It’s bright!): Event sourcing / CQRS; off-heap memory support; RxJava; more connectors to additional sources (JMS, JDBC..)
  50. Grab it while it’s hot! Documentation: jet.hazelcast.org; source on GitHub: hazelcast/hazelcast-jet; presentation materials: http://bit.ly/streams_jfokus2017
  51. Conclusion: Java Stream API provides a very wide range of data processing tools; War and Peace is a big (a lot of data) book! Now we’re pretty sure that Andrew and Pierre are the main characters
  52. None