
Gatherers at JavaZone

José
August 29, 2025

The Stream API turned 10, and it got a nice addition in JDK 24: the Gatherer API. A Gatherer is an object that models an intermediate operation, just as a Collector models a terminal operation. It brings capabilities to the Stream API that were not possible before. The Gatherer API is a complex API, made to solve complex problems. A Gatherer can be added to a parallel stream, even if it does not support parallelization itself. This presentation shows you how this API works, what patterns it gives you, and what the good use cases for it are. You will hear about internal mutable states, integrators, stream interruption, combinations, and parallel streams. These are the building blocks you need to understand to master this complex API.

Transcript

  1. The Gatherer API: the tool that was missing in the Stream API

    José Paumard, Java Developer Advocate, Java Platform Group
  2. Tune in!

    Inside Java Newscast, JEP Café, Road To 21 series, Inside.java, Inside Java Podcast, Sip of Java, Cracking the Java coding interview
    8/29/2025 Copyright © 2025, Oracle and/or its affiliates
  3. JEP 485: Stream Gatherers

    Stream Gatherers - Deep Dive with the Expert: https://youtu.be/v_5SKpfkI2U
  4. What is a Stream?

    Short answer: an old API from the JDK 8.
    A Stream connects to a source. There are many sources: collections, files, random generators, regular expressions.
    A Stream has intermediate operations: operations that return another stream.
    A Stream has a single terminal operation: an operation that returns something else.
  5. Terminal Operations

    Several methods for terminal operations:
    - reduce()
    - findFirst(), findAny()
    - toList()
    - collect(), which takes a Collector
    Collectors are great to create your own reduction, even if there are things they cannot do.
  6. Intermediate Operations

    Plenty of methods for intermediate operations:
    - map(), filter(), flatMap(), mapMulti()
    - limit(), dropWhile()
    - distinct(), sorted()
    No way to create your own intermediate operation! This is what the Gatherer API brings you.
  7. What Does the Gatherer API Give You?

    - An interface (in fact more than one): interface Gatherer<T, A, R>
    - An intermediate method on the Stream API: Stream.gather(gatherer)
    - A factory class: Gatherers
  8. Comparing With a Collector

        Stream<T> upstream = ...;
        Gatherer<T, ?, R> gatherer = ...;
        Stream<R> downstream = upstream.gather(gatherer);

        Stream<T> upstream = ...;
        Collector<T, ?, R> collector = ...;
        R result = upstream.collect(collector);
  9. What is a Gatherer?

    Short answer: a functional interface!

        interface Gatherer<T, A, R> {
            Integrator<A, T, R> integrator();
        }

        @FunctionalInterface
        interface Integrator<A, T, R> {
            boolean integrate(
                A state, T element, Downstream<? super R> downstream);
        }
  10. What is a Gatherer?

    Short answer: a functional interface!

        Gatherer<T, ?, R> gatherer = () -> (_, _, _) -> true;
  11. What is a Gatherer?

    A Gatherer receives elements from an upstream and pushes them to a downstream.
    This downstream can be a call to an intermediate or a terminal stream operation.
  12. What is a Gatherer?

    Elements are pushed to a gatherer by calling its integrator.
    A Gatherer pushes elements to the downstream its integrator receives.

        Gatherer<T, ?, R> gatherer = Gatherer.of(
            (_, element, downstream) ->
                downstream.push(element)); // returns a boolean
  13. What is a Gatherer?

    A mapping gatherer: all the elements are mapped and pushed to the downstream.

        Gatherer<T, ?, R> gatherer = Gatherer.of(
            (_, element, downstream) -> {
                var mapped = mapper.apply(element);
                return downstream.push(mapped);
            });
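
The mapping gatherer of this slide can be made runnable along the following lines. This is a minimal sketch, assuming JDK 24 or later; the class name `MappingGathererDemo` and the helper `mapping()` are hypothetical names introduced for illustration.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

class MappingGathererDemo {

    // Hypothetical helper: wraps a mapper in a stateless gatherer.
    static <T, R> Gatherer<T, ?, R> mapping(Function<? super T, ? extends R> mapper) {
        return Gatherer.of(
            // The state is unused, the result of push() is relayed as is.
            (state, element, downstream) -> downstream.push(mapper.apply(element)));
    }

    static List<Integer> lengths() {
        return Stream.of("one", "two", "three")
                .gather(mapping(String::length))
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(lengths()); // [3, 3, 5]
    }
}
```
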
  14. What is a Gatherer?

    A filtering gatherer: not all the elements are pushed to the downstream!

        Gatherer<T, ?, R> gatherer = Gatherer.of(
            (_, element, downstream) -> {
                if (filter.test(element)) {
                    return downstream.push(element);
                } else {
                    ???
                }
            });
  15. What is a Gatherer?

    A filtering gatherer

        Gatherer<T, ?, R> gatherer = Gatherer.of(
            (_, element, downstream) -> {
                if (filter.test(element)) {
                    return downstream.push(element);
                } else {
                    return true;
                }
            });
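
A runnable version of the filtering gatherer could look like this. It is a sketch only, assuming JDK 24 or later; `FilteringGathererDemo` and `filtering()` are hypothetical names, and the integrator follows the slide.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

class FilteringGathererDemo {

    // Hypothetical helper: pushes only the elements accepted by the filter.
    static <T> Gatherer<T, ?, T> filtering(Predicate<? super T> filter) {
        return Gatherer.of((state, element, downstream) -> {
            if (filter.test(element)) {
                return downstream.push(element);
            }
            // Element dropped: returning true asks for more elements.
            return true;
        });
    }

    static List<Integer> evens() {
        Gatherer<Integer, ?, Integer> evens = filtering(n -> n % 2 == 0);
        return Stream.of(1, 2, 3, 4, 5, 6).gather(evens).toList();
    }

    public static void main(String[] args) {
        System.out.println(evens()); // [2, 4, 6]
    }
}
```
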
  16. What is a Gatherer?

    Pushing elements to a downstream: how is this boolean used by the API?

        Gatherer<T, ?, R> gatherer = Gatherer.of(
            (_, element, downstream) ->
                downstream.push(element)); // returns a boolean
  17. Pushing to a Downstream

    1) You can push 0 or more elements to a downstream
    2) A call to downstream.push() returns a boolean: true means that this downstream accepts more elements, false means it does not
    3) Your integrator should follow this rule
    Pushing to a rejecting downstream does not throw any exception.
  18. Pushing to a Downstream

    If your Integrator never decides on its own to return false, i.e. it always transmits the value returned by downstream.push(), then you can declare it as a Greedy integrator.

        Gatherer<T, ?, R> gatherer = Gatherer.of(
            Integrator.of((_, element, downstream) ->
                downstream.push(element)));
  19. Pushing to a Downstream

    If your Integrator never decides on its own to return false, i.e. it always transmits the value returned by downstream.push(), then you can declare it as a Greedy integrator.

        Gatherer<T, ?, R> gatherer = Gatherer.of(
            Integrator.ofGreedy((_, element, downstream) ->
                downstream.push(element)));
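
The difference between the two factory methods can be sketched as follows, assuming JDK 24 or later. The class `GreedyIntegratorDemo` and the helpers `takeBelow()` and `doubling()` are hypothetical: the first integrator may return false on its own, so it is not greedy, while the second only relays the result of `push()`.

```java
import java.util.List;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

class GreedyIntegratorDemo {

    // Not greedy: returns false by itself as soon as an element reaches the bound.
    static Gatherer<Integer, ?, Integer> takeBelow(int bound) {
        return Gatherer.of(Gatherer.Integrator.of(
            (state, element, downstream) ->
                element < bound && downstream.push(element)));
    }

    // Greedy: only ever transmits the value returned by downstream.push().
    static Gatherer<Integer, ?, Integer> doubling() {
        return Gatherer.of(Gatherer.Integrator.ofGreedy(
            (state, element, downstream) -> downstream.push(element * 2)));
    }

    static List<Integer> below5() {
        return Stream.of(1, 2, 3, 10, 4).gather(takeBelow(5)).toList();
    }

    static List<Integer> doubled() {
        return Stream.of(1, 2, 3, 10, 4).gather(doubling()).toList();
    }

    public static void main(String[] args) {
        System.out.println(below5());  // [1, 2, 3]
        System.out.println(doubled()); // [2, 4, 6, 20, 8]
    }
}
```
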
  20. Rejecting Downstreams

    A Downstream holds a state: rejecting.
    1) It starts with false
    2) It cannot switch back from true to false
    3) It can only switch on a call to push()
    A downstream could switch on a clock.
    A downstream is not a thread-safe object!
  21. Downstream

    Should you call isRejecting()?

        (_, element, downstream) -> {
            return downstream.push(mapper.apply(element));
        }
  22. Downstream

    Should you call isRejecting()?

        (_, element, downstream) -> {
            if (downstream.isRejecting()) {
                return false;
            }
            return downstream.push(mapper.apply(element));
        }
  23. Downstream

    Should you call isRejecting()?

        (_, element, downstream) -> {
            if (downstream.isRejecting()) {
                return false; // worth it?
            }
            return downstream.push(mapper.apply(element));
        }
  24. Downstream

    Should you call isRejecting()? This is an integrator.

        (_, element, downstream) -> {
            if (downstream.isRejecting()) {
                return false; // worth it? NOPE!
            }
            return downstream.push(mapper.apply(element));
        }
  25. Downstream

    Should you call isRejecting()? Calling isRejecting() is useless in an integrator. More on that later…
  26. A Flatmapping Gatherer

    How to push to the downstream?

        Function<T, Stream<R>> flatMapper = ...;
        Gatherer<T, ?, R> gatherer = Gatherer.of(
            (_, element, downstream) -> {
                Stream<R> elements = flatMapper.apply(element);
                elements.forEach(downstream::push);
                return true;
            });
  27. A Flatmapping Gatherer

    How to push to the downstream?

        Function<T, Stream<R>> flatMapper = ...;
        Gatherer<T, ?, R> gatherer = Gatherer.of(
            (_, element, downstream) -> {
                Stream<R> elements = flatMapper.apply(element);
                elements.takeWhile(_ -> !downstream.isRejecting())
                        .forEach(downstream::push);
                return !downstream.isRejecting();
            });
  28. A Flatmapping Gatherer

    How to push to the downstream? Two bug fixes to go! (plus one)

        Function<T, Stream<R>> flatMapper = ...;
        Gatherer<T, ?, R> gatherer = Gatherer.of(
            (_, element, downstream) -> {
                Stream<R> elements = flatMapper.apply(element);
                return elements.allMatch(downstream::push);
            });
  29. A Flatmapping Gatherer

    How to push to the downstream? A stream has to be closed!

        Function<T, Stream<R>> flatMapper = ...;
        Gatherer<T, ?, R> gatherer = Gatherer.of(
            (_, element, downstream) -> {
                Stream<R> elements = flatMapper.apply(element);
                return elements.allMatch(downstream::push);
            });
  30. A Flatmapping Gatherer

    How to push to the downstream? 1 bug fix to go! (plus one)

        (_, element, downstream) -> {
            try (Stream<R> elements = flatMapper.apply(element)) {
                return elements.allMatch(downstream::push);
            }
        }
  31. A Flatmapping Gatherer

    How to push to the downstream? A downstream is not a thread-safe object!

        (_, element, downstream) -> {
            try (Stream<R> elements = flatMapper.apply(element)) {
                return elements.allMatch(downstream::push);
            }
        }
  32. A Flatmapping Gatherer

    How to push to the downstream? The returned stream could be a parallel stream.

        (_, element, downstream) -> {
            try (Stream<R> elements = flatMapper.apply(element)) {
                return elements.allMatch(downstream::push);
            }
        }
  33. A Flatmapping Gatherer

    How to push to the downstream? One bonus fix to go!

        (_, element, downstream) -> {
            try (Stream<R> elements = flatMapper.apply(element)) {
                return elements.sequential()
                               .allMatch(downstream::push);
            }
        }
  34. A Flatmapping Gatherer

    How to push to the downstream?

        (_, element, downstream) -> {
            try (Stream<R> elements = flatMapper.apply(element)) {
                Objects.requireNonNull(elements);
                return elements.sequential()
                               .allMatch(downstream::push);
            }
        }
  35. A Flatmapping Gatherer

    How to push to the downstream?

        (_, element, downstream) -> {
            try (Stream<R> elements = flatMapper.apply(element)) {
                if (elements == null) return true;
                return elements.sequential()
                               .allMatch(downstream::push);
            }
        }
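
Putting the fixes together (closing the stream, forcing it to be sequential, tolerating a null result), a complete flatmapping gatherer might be written as below. A sketch under the assumption of JDK 24 or later; `FlatMappingGathererDemo` and `flatMapping()` are hypothetical names.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

class FlatMappingGathererDemo {

    static <T, R> Gatherer<T, ?, R> flatMapping(Function<T, Stream<R>> flatMapper) {
        return Gatherer.of((state, element, downstream) -> {
            // try-with-resources closes the mapped stream, even on early exit.
            try (Stream<R> elements = flatMapper.apply(element)) {
                if (elements == null) {
                    return true; // nothing to push, keep going
                }
                // sequential(): the downstream is not thread-safe.
                // allMatch() stops pushing at the first rejected push().
                return elements.sequential().allMatch(downstream::push);
            }
        });
    }

    static List<String> firstLetters() {
        Function<String, Stream<String>> letters = word -> Stream.of(word.split(""));
        return Stream.of("ab", "cd", "ef")
                .gather(flatMapping(letters))
                .limit(3)
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(firstLetters()); // [a, b, c]
    }
}
```
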
  36. Wrapping up Downstream

    1) Carries a state: rejecting
    2) Is not a thread-safe object
    Be careful with parallel streams!
  37. What About State?

    A Gatherer may carry an internal mutable state. What would you need this state for?

        (state, element, downstream) -> {
            return true;
        }
  38. What About State?

    A limit Gatherer

        var gatherer = Gatherer.ofSequential(
            (_, element, downstream) -> {
                if (count++ < limit) {
                    return downstream.push(element);
                } else {
                    return false;
                }
            });
  39. What About State?

    A limit Gatherer: how can you manage this count?

        var gatherer = Gatherer.ofSequential(
            (_, element, downstream) -> {
                if (count++ < limit) {
                    return downstream.push(element);
                } else {
                    return false;
                }
            });
  40. What About State?

    A Gatherer may carry an internal mutable state.
    - This state is initialized with an initializer
    - It can be mutable
    - It is carried from one call of the integrator to the next

        (state, element, downstream) -> {
            return true;
        }
  41. What About State?

    A limit Gatherer

        class Counter { long count = 0L; }
        var gatherer = Gatherer.ofSequential(
            Counter::new, // the initializer
            (state, element, downstream) -> {
                if (state.count++ < limit) {
                    return downstream.push(element);
                } else {
                    return false;
                }
            });
  42. What About State?

    A limit Gatherer

        class Counter { long count = 0L; }
        var gatherer = Gatherer.ofSequential(
            () -> new Object() { long count = 0L; }, // the initializer
            (state, element, downstream) -> {
                if (state.count++ < limit) { // non-denotable type
                    return downstream.push(element);
                } else {
                    return false;
                }
            });
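
The limit gatherer with its `Counter` state can be run as follows. This is a sketch assuming JDK 24 or later; the class `LimitGathererDemo` and the helper `limiting()` are hypothetical names, and the explicit type arguments on `ofSequential` are only there to keep type inference simple.

```java
import java.util.List;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

class LimitGathererDemo {

    // Mutable state, created by the initializer and passed to every integrator call.
    static final class Counter { long count = 0L; }

    static <T> Gatherer<T, ?, T> limiting(long limit) {
        return Gatherer.<T, Counter, T>ofSequential(
            Counter::new, // the initializer
            (state, element, downstream) -> {
                if (state.count++ < limit) {
                    return downstream.push(element);
                }
                // Returning false interrupts the stream.
                return false;
            });
    }

    static List<Integer> firstThree() {
        // The source is infinite: the stream ends because the integrator returns false.
        return Stream.iterate(1, i -> i + 1)
                .gather(LimitGathererDemo.<Integer>limiting(3))
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(firstThree()); // [1, 2, 3]
    }
}
```
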
  43. What About State?

    A distinct Gatherer

        var gatherer = Gatherer.ofSequential(
            () -> new Object() { Set<T> set = new HashSet<>(); },
            (state, element, downstream) -> {
                if (state.set.add(element)) {
                    return downstream.push(element);
                } else {
                    return true;
                }
            });
  44. What About State?

    What about a sorting (and distinct) Gatherer?

        var gatherer = Gatherer.ofSequential(
            () -> new Object() { Set<T> set = new TreeSet<>(); },
            (state, element, downstream) -> {
                state.set.add(element); // can you push element?
                return true;            // when can you push the content of set?
            });
  45. Pushing a Final State

    Pushing a final state is done with a Finisher.
    - A Finisher looks like an Integrator
    - It is called once there are no more elements to be pushed to the Integrator
    - Nothing is called after it

        Integrator<A, T, R> integrator =
            (state, element, downstream) -> {
                return true;
            }
  46. Pushing a Final State

    Pushing a final state is done with a Finisher.
    - A Finisher looks like an Integrator
    - It is called once there are no more elements to be pushed to the Integrator
    - Nothing is called after it

        finisher =
            (state, element, downstream) -> {
                return true;
            }
  47. Pushing a Final State

    Pushing a final state is done with a Finisher.
    - A Finisher looks like an Integrator
    - It is called once there are no more elements to be pushed to the Integrator
    - Nothing is called after it

        finisher =
            (state, downstream) -> {
            }
  48. Pushing a Final State

    A Finisher is a BiConsumer of the state and the downstream.

        BiConsumer<A, Downstream<? super R>> finisher =
            (state, downstream) -> {
            }
  49. Pushing a Final State

    Adding a Finisher to the sorting (and distinct) gatherer

        var gatherer = Gatherer.ofSequential(
            () -> new Object() { Set<T> set = new TreeSet<>(); },
            (state, element, downstream) -> { ... },
            (state, downstream) -> { // finisher
                state.set.stream().allMatch(downstream::push);
            }
        );
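
A complete sorting (and distinct) gatherer with its finisher could be sketched like this, assuming JDK 24 or later. `SortedDistinctGathererDemo` and `sortedDistinct()` are hypothetical names, and the state is a plain `TreeSet` rather than the anonymous class of the slide.

```java
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

class SortedDistinctGathererDemo {

    static <T extends Comparable<? super T>> Gatherer<T, ?, T> sortedDistinct() {
        return Gatherer.<T, TreeSet<T>, T>ofSequential(
            TreeSet::new,                       // initializer
            (set, element, downstream) -> {     // integrator: only accumulates
                set.add(element);
                return true;
            },
            (set, downstream) ->                // finisher: pushes the sorted content
                set.stream().allMatch(downstream::push));
    }

    static List<Integer> sorted() {
        Gatherer<Integer, ?, Integer> gatherer = sortedDistinct();
        return Stream.of(3, 1, 2, 3, 1).gather(gatherer).toList();
    }

    public static void main(String[] args) {
        System.out.println(sorted()); // [1, 2, 3]
    }
}
```
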
  50. Wrapping up Gatherers

    A Gatherer is built on 3 elements (so far):
    - An initializer for its internal mutable state
    - An integrator, that can be greedy
    - A finisher, to push the elements left in the state
  51. Downstream

    Should you call isRejecting()? This is an integrator.

        (_, element, downstream) -> {
            if (downstream.isRejecting()) {
                return false; // worth it? NOPE!
            }
            return downstream.push(mapper.apply(element));
        }
  52. Downstream

    Should you call isRejecting()? This is a finisher.

        (state, downstream) -> {
            if (!downstream.isRejecting()) {
                state.set.stream().allMatch(downstream::push);
            }
        }
  53. Parallel Gatherers

    So far we built gatherers with two factory methods:
    - Gatherer.of(…)
    - Gatherer.ofSequential(…)
    Gatherers support parallel streams (of course!): there are parallel and sequential gatherers, and both can be called in parallel streams.
    What about the internal mutable state?
  54. Parallel Gatherers

    Principle: there is one state object per thread. So, in a parallel stream:
    1) Each thread creates its own instance of state
    2) At the end of the day, you need a combiner

        (state1, state2) -> {
            // do something
            return state1;
        }
  55. Parallel Gatherers

    [Diagram: on thread-1, an integrator receives the state and an element.]
  56. Parallel Gatherers

    [Diagram: thread-1 to thread-4 each run an integrator with its own state and elements; the states are merged by the combiner.]
  57. A Parallel Distinct Gatherer

        var gatherer = Gatherer.of(
            () -> new Object() { Set<T> set = new HashSet<>(); },
            (state, element, downstream) -> {    // executed in
                state.set.add(element);          // different threads
                return true;
            },
            (state1, state2) -> {                // combiner
                state1.set.addAll(state2.set);   // called before the finisher
                return state1;
            },
            (state, downstream) -> {             // finisher
                state.set.stream().allMatch(downstream::push);
            }
        );
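
The parallel distinct gatherer can be tried on a parallel stream as follows. A minimal sketch assuming JDK 24 or later; `ParallelDistinctGathererDemo` and `distinct()` are hypothetical names, and the state is a plain `HashSet`. Because the elements come out of a `HashSet`, the example makes no claim about their order.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

class ParallelDistinctGathererDemo {

    static <T> Gatherer<T, ?, T> distinct() {
        return Gatherer.<T, Set<T>, T>of(
            HashSet::new,                              // one state per parallel task
            (set, element, downstream) -> {            // integrator
                set.add(element);
                return true;
            },
            (left, right) -> {                         // combiner, called before the finisher
                left.addAll(right);
                return left;
            },
            (set, downstream) ->                       // finisher
                set.stream().allMatch(downstream::push));
    }

    static List<Integer> distinctInParallel() {
        Gatherer<Integer, ?, Integer> gatherer = distinct();
        return Stream.of(1, 2, 2, 3, 3, 3, 1)
                .parallel()
                .gather(gatherer)
                .toList();
    }

    public static void main(String[] args) {
        // Contains 1, 2 and 3 exactly once, in no guaranteed order.
        System.out.println(distinctInParallel());
    }
}
```
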
  58. What About Sequential Gatherers?

    A Sequential Gatherer cannot be called in more than one thread at the same time.
    Sequential Gatherers do not have a combiner, so they cannot combine different states.
    But they can be used in parallel streams!
  59. Sequential Gatherers in Parallel Streams

    [Diagram: Source, upstream, sequential gatherer, downstream.]
  60. Sequential Gatherers in Parallel Streams

    The management of the threads is the responsibility of the Fork / Join framework: it is transparent from the user's point of view.
    A sequential Gatherer is executed by a single thread at a time, but it can jump from one thread to another: this is how the Fork / Join framework works.
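
As an illustration, a sequential gatherer that numbers the elements can be plugged into a parallel stream. This sketch assumes JDK 24 or later and an ordered source; `SequentialInParallelDemo`, `Index` and `indexing()` are hypothetical names. The operations around the gatherer may run in parallel, while the integrator sees the elements one at a time, in encounter order.

```java
import java.util.List;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

class SequentialInParallelDemo {

    // Mutable state: safe here because a sequential gatherer is never run concurrently.
    static final class Index { int next = 0; }

    static <T> Gatherer<T, ?, String> indexing() {
        return Gatherer.<T, Index, String>ofSequential(
            Index::new,
            (index, element, downstream) ->
                downstream.push(index.next++ + ":" + element));
    }

    static List<String> indexed() {
        Gatherer<String, ?, String> gatherer = indexing();
        return Stream.of("a", "b", "c", "d")
                .parallel()
                .gather(gatherer)          // runs sequentially
                .map(String::toUpperCase)  // may run in parallel
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(indexed()); // [0:A, 1:B, 2:C, 3:D]
    }
}
```
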
  61. Sequential Gatherers in Parallel Streams

    https://github.com/SvenWoltmann/stream-gatherers
  62. Ready-to-Use Gatherers

    The Gatherers class - windowFixed()

        var ints = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9);
        ints.stream()
            .gather(Gatherers.windowFixed(3))
            .toList();
        > [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
  63. Ready-to-Use Gatherers

    The Gatherers class - windowSliding()

        var ints = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9);
        ints.stream()
            .gather(Gatherers.windowSliding(3))
            .toList();
        > [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], ...]
  64. Ready-to-Use Gatherers

    The Gatherers class - scan()

        var ints = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9);
        ints.stream()
            .gather(Gatherers.scan(
                () -> "",
                (scanned, e) -> scanned + e))
            .toList();
        > ["1", "12", "123", "1234", "12345", ...]
  65. Ready-to-Use Gatherers

    The Gatherers class - fold()

        var ints = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9);
        ints.stream()
            .gather(Gatherers.fold(
                () -> "",
                (folded, e) -> folded + e))
            .toList();
        > ["123456789"]
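
The four snippets above can be gathered in one runnable class. This is a sketch assuming JDK 24 or later; the class `BuiltInGatherersDemo` and its method names are hypothetical, only the `Gatherers` factory methods come from the JDK.

```java
import java.util.List;
import java.util.stream.Gatherers;

class BuiltInGatherersDemo {

    static final List<Integer> INTS = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9);

    static List<List<Integer>> fixed() {
        return INTS.stream().gather(Gatherers.windowFixed(3)).toList();
    }

    static List<List<Integer>> sliding() {
        return INTS.stream().gather(Gatherers.windowSliding(3)).toList();
    }

    static List<String> scanned() {
        // Emits every intermediate accumulation.
        return INTS.stream()
                .gather(Gatherers.scan(() -> "", (acc, e) -> acc + e))
                .toList();
    }

    static List<String> folded() {
        // Emits only the final accumulation.
        return INTS.stream()
                .gather(Gatherers.fold(() -> "", (acc, e) -> acc + e))
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(fixed());   // [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
        System.out.println(sliding()); // 7 windows, from [1, 2, 3] to [7, 8, 9]
        System.out.println(scanned()); // 9 strings, from "1" to "123456789"
        System.out.println(folded());  // [123456789]
    }
}
```
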
  66. Ready-to-Use Gatherers

    The Gatherers class - mapConcurrent()
    - Maps a stream to another stream
    - Each mapping is computed in its own virtual thread
    - Takes a maxConcurrency parameter to control the maximum number of running virtual threads
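
A possible use of `mapConcurrent()` is sketched below, assuming JDK 24 or later. The class `MapConcurrentDemo` and the squaring function are made up for illustration; a real use case would rather be a blocking call, such as a remote request.

```java
import java.util.List;
import java.util.stream.Gatherer;
import java.util.stream.Gatherers;
import java.util.stream.Stream;

class MapConcurrentDemo {

    static List<Integer> squares() {
        // At most 2 mappings run at the same time, each in its own virtual thread.
        Gatherer<Integer, ?, Integer> squaring =
            Gatherers.mapConcurrent(2, n -> n * n);
        // The stream itself is sequential; the results keep the encounter order.
        return Stream.of(1, 2, 3, 4).gather(squaring).toList();
    }

    public static void main(String[] args) {
        System.out.println(squares()); // [1, 4, 9, 16]
    }
}
```
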
  67. Wrapping up Gatherers

    A Gatherer is built on:
    - An Initializer
    - An Integrator (can be greedy)
    - A Combiner
    - A Finisher
    A Gatherer can decide to work in parallel or not.
    A sequential Gatherer can work in a parallel stream.
    A sequential Stream can compute mappings in parallel.