Stream Gatherers: The Missing Link in Java Streams

Ever wished you could do more with Java Streams? While adding custom terminal operations through Collectors is straightforward, creating new intermediate operations has always been challenging. This talk introduces Stream Gatherers, the feature that elegantly solves this limitation. Drawing from extensive experience developing the open source Gatherers4J library, we'll examine how Gatherers enable developers to create custom intermediate operations that seamlessly integrate with the existing Stream API. Through live coding and practical examples, you'll discover how to write custom Gatherers, understand their internal mechanics, and learn when they're the right tool for the job. This session is perfect for developers who want to level up their Java Stream expertise and expand their stream processing capabilities beyond what collectors alone can provide.

Todd Ginsberg

September 22, 2025

Transcript

  1. Stream Gatherers: The Missing Link In Java Streams. TriJUG, 2025-09-22. Todd Ginsberg (@ToddGinsberg), Lead Engineer - Payments, Deutsche Bank
  2. Hello! Todd Ginsberg (@todd.ginsberg.com), Raleigh, NC. TriJUG organizer. 30 years of professional experience. Currently: Director, Lead Engineer - Payments, Deutsche Bank, Cary, NC. Photo Credit: Andrew Byala
  3. Why Are We Here? Enhance the Stream API to support custom intermediate operations. This will allow stream pipelines to transform data in ways that are not easily achievable with the existing built-in intermediate operations. -- JEP-485: Stream Gatherers
  5. Built-in Collectors: averagingDouble(), averagingInt(), averagingLong(), groupingBy(), joining(), minBy(), maxBy(), counting(), partitioningBy(), toList(), toUnmodifiableList(), toMap(), toUnmodifiableMap(), toSet(), toUnmodifiableSet()
  6. Collector Interface: public interface Collector<IN, STATE, OUT> { Supplier<STATE> supplier(); BiConsumer<STATE, IN> accumulator(); BinaryOperator<STATE> combiner(); }
  7. Collector Interface: public interface Collector<IN, STATE, OUT> { Supplier<STATE> supplier(); BiConsumer<STATE, IN> accumulator(); BinaryOperator<STATE> combiner(); Function<STATE, OUT> finisher(); }
  8. Collector Interface: public interface Collector<IN, STATE, OUT> { Supplier<STATE> supplier(); BiConsumer<STATE, IN> accumulator(); BinaryOperator<STATE> combiner(); Function<STATE, OUT> finisher(); Set<Characteristics> characteristics(); }
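As a minimal sketch of how the pieces on the Collector slides fit together, the example below builds a custom collector with `Collector.of`. The class name `CollectorSketch` and the helper `averagingLength` are hypothetical and do not come from the talk; no characteristics are passed, so the collector reports an empty characteristics set.

```java
import java.util.stream.Collector;
import java.util.stream.Stream;

class CollectorSketch {

    // Hypothetical helper (not from the talk): averages the lengths of the strings it sees.
    static Collector<String, long[], Double> averagingLength() {
        return Collector.of(
            () -> new long[2],                 // supplier: state is [sum, count]
            (state, s) -> {                    // accumulator: fold one element into the state
                state[0] += s.length();
                state[1]++;
            },
            (left, right) -> {                 // combiner: merge two partial states
                left[0] += right[0];
                left[1] += right[1];
                return left;
            },
            // finisher: turn the final state into the result
            state -> state[1] == 0 ? 0.0 : (double) state[0] / state[1]
        );
    }

    public static void main(String[] args) {
        double average = Stream.of("a", "bb", "ccc").collect(averagingLength());
        System.out.println(average); // prints 2.0
    }
}
```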
  9. Parallel Evaluation: [ A, B, C, D, E, F, G, H ]; [ A, B, C, D ] [ E, F, G, H ]
  11. Parallel Evaluation: [ A, B, C, D, E, F, G, H ]; [ A, B, C, D ] [ E, F, G, H ]; [ A, B ] [ C, D ] [ E, F ] [ G, H ]
  12. Parallel Evaluation: [ A, B, C, D, E, F, G, H ]; [ A, B, C, D ] [ E, F, G, H ]; [ A, B ] [ C, D ] [ E, F ] [ G, H ]; State State State State
  13. Parallel Evaluation: [ A, B, C, D, E, F, G, H ]; [ A, B, C, D ] [ E, F, G, H ]; [ 2 ] [ 2 ] [ 2 ] [ 2 ]; State State State State
  14. Parallel Evaluation: [ A, B, C, D, E, F, G, H ]; [ 4 ] [ E, F, G, H ]; [ 2 ] [ 2 ]; State State State
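The split-then-merge picture above can be sketched with a counting collector run on a parallel stream. This is an illustration under assumptions: `ParallelCountSketch` and its `counting` helper are hypothetical names (the JDK already provides `Collectors.counting()`), and the runtime decides how the source is actually split, so the 2 + 2 = 4 merges on the slides are only one possible shape.

```java
import java.util.stream.Collector;
import java.util.stream.Stream;

class ParallelCountSketch {

    // Hypothetical counting collector, written out to expose each function.
    static <T> Collector<T, long[], Long> counting() {
        return Collector.of(
            () -> new long[1],                 // every segment of the split gets its own state
            (state, element) -> state[0]++,    // count the elements seen within one segment
            (left, right) -> {                 // merge the partial counts of two segments
                left[0] += right[0];
                return left;
            },
            state -> state[0]                  // the fully merged count is the result
        );
    }

    static long countLetters() {
        return Stream.of("A", "B", "C", "D", "E", "F", "G", "H")
            .parallel()
            .collect(counting());
    }

    public static void main(String[] args) {
        System.out.println(countLetters()); // prints 8
    }
}
```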
  15. Anatomy Of A Stream: Intermediate Operations. map(), mapToInt(), mapToLong(), mapToDouble(), flatMap(), flatMapToInt(), flatMapToLong(), flatMapToDouble(), mapMulti(), mapMultiToInt(), mapMultiToLong(), mapMultiToDouble(), filter(), distinct(), sorted(), peek(), limit(), skip(), takeWhile(), dropWhile()
  17. Anatomy Of A Stream: Intermediate Operations. map(), flatMap(), mapMulti(), filter(), distinct(), sorted(), peek(), limit(), skip(), takeWhile(), dropWhile()
  18. How about fold()? Stream.of(1, 2, 3, 4, 5).fold(() -> 1, (acc, next) -> acc * next).findFirst()
  21. How about fold()? Stream.of(1, 2, 3, 4, 5).fold(() -> 1, (acc, next) -> acc * next).findFirst() // Optional(120)
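`Stream` has no `fold()` method; the slide shows a wished-for operation. A sketch of the same computation with the built-in `Gatherers.fold`, assuming JDK 24 or later (where Stream Gatherers are a final API), could look like this. The class name `FoldSketch` is hypothetical.

```java
import java.util.Optional;
import java.util.stream.Gatherers;
import java.util.stream.Stream;

class FoldSketch {

    // Multiplies all elements together; fold emits a single result when the stream ends.
    static Optional<Integer> product() {
        return Stream.of(1, 2, 3, 4, 5)
            .gather(Gatherers.fold(() -> 1, (acc, next) -> acc * next))
            .findFirst();
    }

    public static void main(String[] args) {
        System.out.println(product()); // prints Optional[120]
    }
}
```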
  22. How about Kotlin’s mapIndexed()? Stream.of("A", "B", "C").mapIndexed((idx, elem) -> idx + "::" + elem).toList() // ["0::A", "1::B", "2::C"]
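`mapIndexed()` is likewise not a `Stream` method. One way it might be written as a custom gatherer is sketched below, assuming JDK 24 or later. The helper here is hypothetical and is not the Gatherers4j implementation; it keeps a counter as state and is sequential-only.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

class MapIndexedSketch {

    // Hypothetical helper: passes each element to the mapper together with its zero-based index.
    static <T, R> Gatherer<T, ?, R> mapIndexed(BiFunction<Integer, ? super T, ? extends R> mapper) {
        return Gatherer.ofSequential(
            () -> new int[1],   // initializer: a one-slot counter holding the next index
            (counter, element, downstream) ->
                // integrator: map (index, element), push the result, then advance the index
                downstream.push(mapper.apply(counter[0]++, element))
        );
    }

    static List<String> demo() {
        return Stream.of("A", "B", "C")
            .gather(mapIndexed((idx, elem) -> idx + "::" + elem))
            .toList();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [0::A, 1::B, 2::C]
    }
}
```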
  23. Why Not? Over the years, many new intermediate operations have been suggested for the Stream API. Most of them make sense when considered in isolation, but adding all of them would make the (already large) Stream API more difficult to learn because its operations would be less discoverable. -- JEP-485: Stream Gatherers
  26. What Do We Mean By “Any”? Maintain state; End of Stream Signal; Short Circuit; Parallel Optional
  27. What Do We Mean By “Any”? Maintain state; End of Stream Signal; Finite Streams; Short Circuit; Parallel Optional
  28. What Do We Mean By “Any”? Maintain state; End of Stream Signal; Finite Streams; Short Circuit; Parallel Optional; Infinite Streams
  29. What Do We Mean By “Any”? Maintain state; End of Stream Signal; Finite Streams; Short Circuit; Parallel Optional; Infinite Streams; Any input/output ratio
  30. Can’t We Just Use Collector? Maintain state; End of Stream Signal; Finite Streams; Short Circuit; Parallel Optional; Infinite Streams; Any input/output ratio
  31. Can’t We Just Use Collector? YES: Maintain state; End of Stream Signal; Finite Streams. NO: Short Circuit; Parallel Optional; Infinite Streams; Any input/output ratio
  32. Start With Something We Know… public interface Collector<IN, STATE, OUT> { Supplier<STATE> supplier(); BiConsumer<STATE, IN> accumulator(); BinaryOperator<STATE> combiner(); Function<STATE, OUT> finisher(); Set<Characteristics> characteristics(); }
  33. Gatherer Interface: Under Construction. public interface Collector<IN, STATE, OUT> { Supplier<STATE> supplier(); BiConsumer<STATE, IN> accumulator(); BinaryOperator<STATE> combiner(); Function<STATE, OUT> finisher(); Set<Characteristics> characteristics(); }
  34. Gatherer Interface: Under Construction. public interface Gatherer<IN, STATE, OUT> { Supplier<STATE> supplier(); BiConsumer<STATE, IN> accumulator(); BinaryOperator<STATE> combiner(); Function<STATE, OUT> finisher(); Set<Characteristics> characteristics(); }
  37. Gatherer Interface: Under Construction. public interface Gatherer<IN, STATE, OUT> { Supplier<STATE> initializer(); BiConsumer<STATE, IN> accumulator(); BinaryOperator<STATE> combiner(); Function<STATE, OUT> finisher(); Set<Characteristics> characteristics(); }
  39. Gatherer Interface: Under Construction. public interface Gatherer<IN, STATE, OUT> { Supplier<STATE> initializer(); BiConsumer<STATE, IN> integrator(); BinaryOperator<STATE> combiner(); Function<STATE, OUT> finisher(); Set<Characteristics> characteristics(); }
  41. Gatherer Interface: Under Construction. public interface Gatherer<IN, STATE, OUT> { Supplier<STATE> initializer(); Integrator<STATE, IN, OUT> integrator(); BinaryOperator<STATE> combiner(); Function<STATE, OUT> finisher(); Set<Characteristics> characteristics(); }
  42. Gatherer Interface: Integrator. public interface Integrator<STATE, IN, OUT> { boolean integrate(STATE state, IN element, Downstream<OUT> downstream); }
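The `boolean` returned by `integrate` is what makes short-circuiting possible: returning `false` tells the stream that no further elements are wanted. A minimal sketch, assuming JDK 24 or later, is shown below. The helper `takeUntilInclusive` is a hypothetical name introduced for illustration and is not claimed to match any Gatherers4j operation.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

class ShortCircuitSketch {

    // Hypothetical stateless gatherer: emits elements up to and including the first match.
    static <T> Gatherer<T, ?, T> takeUntilInclusive(Predicate<? super T> stop) {
        return Gatherer.ofSequential(
            (unused, element, downstream) ->
                // Push the element; ask for more only if downstream still accepts
                // elements and the stop condition has not been met yet.
                downstream.push(element) && !stop.test(element)
        );
    }

    static List<Integer> demo() {
        // The source is infinite; the integrator returning false ends the pipeline.
        return Stream.iterate(1, n -> n + 1)
            .gather(takeUntilInclusive(n -> n >= 3))
            .toList();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [1, 2, 3]
    }
}
```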
  48. Gatherer Interface: Under Construction. public interface Gatherer<IN, STATE, OUT> { Supplier<STATE> initializer(); Integrator<STATE, IN, OUT> integrator(); BinaryOperator<STATE> combiner(); BiConsumer<STATE, OUT> finisher(); Set<Characteristics> characteristics(); }
  50. Gatherer Interface: Under Construction. public interface Gatherer<IN, STATE, OUT> { Supplier<STATE> initializer(); Integrator<STATE, IN, OUT> integrator(); BinaryOperator<STATE> combiner(); BiConsumer<STATE, Downstream<OUT>> finisher(); Set<Characteristics> characteristics(); }
  52. Gatherer Interface: Under Construction. public interface Gatherer<IN, STATE, OUT> { Supplier<STATE> initializer(); Integrator<STATE, IN, OUT> integrator(); BinaryOperator<STATE> combiner(); BiConsumer<STATE, Downstream<OUT>> finisher(); Gatherer<> andThen(Gatherer<>); }
  53. Gatherer Interface: public interface Gatherer<IN, STATE, OUT> { Supplier<STATE> initializer(); Integrator<STATE, IN, OUT> integrator(); BinaryOperator<STATE> combiner(); BiConsumer<STATE, Downstream<OUT>> finisher(); Gatherer<> andThen(Gatherer<>); }
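To show the initializer, integrator, and finisher working together, here is a sketch of a gatherer that buffers elements and only emits when the end-of-stream signal arrives. It assumes JDK 24 or later and non-null elements (`ArrayDeque` rejects nulls). The helper `takeLast` below is a hypothetical illustration, not the Gatherers4j implementation of the same name.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

class TakeLastSketch {

    // Hypothetical sequential gatherer: keeps only the last n elements of the stream.
    static <T> Gatherer<T, ?, T> takeLast(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        return Gatherer.ofSequential(
            () -> new ArrayDeque<T>(n),               // initializer: a bounded buffer as state
            (buffer, element, downstream) -> {        // integrator: remember, emit nothing yet
                if (buffer.size() == n) {
                    buffer.removeFirst();             // drop the oldest element when full
                }
                buffer.addLast(element);
                return true;                          // always willing to accept more input
            },
            (buffer, downstream) -> {                 // finisher: runs once at end of stream
                for (T element : buffer) {
                    if (!downstream.push(element)) {
                        break;                        // downstream wants no more elements
                    }
                }
            }
        );
    }

    static List<Integer> demo() {
        return Stream.of(1, 2, 3, 4, 5).gather(takeLast(2)).toList();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [4, 5]
    }
}
```

Because the output depends on knowing that the input has ended, this kind of operation only makes sense on finite streams.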
  55. Parallel Evaluation: [ A, B, C, D, E, F, G, H ]; [ A, B, C, D ] [ E, F, G, H ]; [ A, B ] [ C, D ] [ E, F ] [ G, H ]
  56. Parallel Evaluation: [ A, B, C, D, E, F, G, H ]; [ A, B, C, D ] [ E, F, G, H ]; [ A, B ] [ C, D ] [ E, F ] [ G, H ]; S S S S
  57. Parallel Evaluation: [ A, B, C, D, E, F, G, H ]; [ A, B, C, D ] [ E, F, G, H ]; [ A, B ] [ C, D ] [ E, F ] [ G, H ]; S D S D S D S D
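A gatherer opts in to parallel evaluation by supplying a combiner. The sketch below, assuming JDK 24 or later, uses the four-argument `Gatherer.of` factory so that each segment gets its own state and the partial states are merged before the finisher runs. The class and the `sum` helper are hypothetical names for illustration.

```java
import java.util.Optional;
import java.util.stream.Gatherer;
import java.util.stream.Stream;

class ParallelGathererSketch {

    // Hypothetical parallel-capable gatherer: emits the sum of all elements as one Long.
    static Gatherer<Integer, ?, Long> sum() {
        return Gatherer.of(
            () -> new long[1],                        // initializer: one state per segment
            (total, element, downstream) -> {         // integrator: accumulate, emit nothing yet
                total[0] += element;
                return true;
            },
            (left, right) -> {                        // combiner: merge two segment states
                left[0] += right[0];
                return left;
            },
            (total, downstream) -> downstream.push(total[0])  // finisher: emit the merged total
        );
    }

    static Optional<Long> demo() {
        return Stream.of(1, 2, 3, 4, 5, 6, 7, 8)
            .parallel()
            .gather(sum())
            .findFirst();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints Optional[36]
    }
}
```

A gatherer built with `Gatherer.ofSequential` instead has no combiner, which is how the "parallel is optional" point from the earlier slides shows up in code.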
  58. Gatherers4j: 60+ Implementations! crossWith, debounce, dedupeConsecutive, dedupeConsecutiveBy, distinctBy, dropEveryNth, dropLast, ensureOrdered, ensureOrderedBy, ensureSize, filterIndexed, filterInstanceOf, filterOrdered, filterOrderedBy, foldIndexed, group, groupBy, groupOrdered, groupOrderedBy, interleaveWith, mapIndexed, movingProduct, movingProductBy, movingSum, movingSumBy, orderByFrequency, peekIndexed, repeat, repeatInfinitely, reverse, rotate, runningProduct, runningProductBy, runningSum, runningSumBy, sampleFixedSize, samplePercentage, scanIndexed, shuffle, simpleMovingAverage, simpleMovingAverageBy, simpleRunningAverage, simpleRunningAverageBy, takeEveryNth, takeLast, takeUntil, throttle, uniquelyOccurring, window, withIndex, zipWith, zipWithNext, exponentialMovingAverageWithAlpha, exponentialMovingAverageWithAlphaBy, exponentialMovingAverageWithPeriod, exponentialMovingAverageWithPeriodBy, runningPopulationStandardDeviation, runningPopulationStandardDeviationBy, runningSampleStandardDeviation, runningSampleStandardDeviationBy
  59. Gatherers4j: Indexing
  60. Gatherers4j: Time
  61. Gatherers4j: Window
  62. Gatherers4j: Math and Statistics
  63. Gatherers4j: Enforce Constraints
  64. Gatherers4j: ..By
  65. In Summary… Maintain state ✅; Short Circuit ✅; End of Stream Signal ✅; Finite & Infinite Streams ✅; Parallel Optional ✅; Any input/output ratio ✅
  66. In Summary… In the future… “We will not add a new intermediate operation to the Stream class for each of the built-in gatherers defined in the Gatherers class, even though for the sake of uniformity it is tempting to do so.” https://openjdk.org/jeps/485
  67. In Summary… In the future… “...we might revise the set of built-in gatherers in future releases.” https://openjdk.org/jeps/485
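For reference, the built-in gatherers mentioned in these quotes live in `java.util.stream.Gatherers`. A short sketch of three of them follows, assuming JDK 24 or later; the class and method names wrapping the calls are hypothetical.

```java
import java.util.List;
import java.util.stream.Gatherers;
import java.util.stream.Stream;

class BuiltInGatherersSketch {

    // Groups elements into consecutive, non-overlapping lists of up to 2 elements.
    static List<List<Integer>> fixedWindows() {
        return Stream.of(1, 2, 3, 4, 5).gather(Gatherers.windowFixed(2)).toList();
    }

    // Groups elements into overlapping lists of 3, advancing one element at a time.
    static List<List<Integer>> slidingWindows() {
        return Stream.of(1, 2, 3, 4, 5).gather(Gatherers.windowSliding(3)).toList();
    }

    // Emits the running total after each element.
    static List<Integer> runningSum() {
        return Stream.of(1, 2, 3, 4, 5)
            .gather(Gatherers.scan(() -> 0, (acc, next) -> acc + next))
            .toList();
    }

    public static void main(String[] args) {
        System.out.println(fixedWindows());   // prints [[1, 2], [3, 4], [5]]
        System.out.println(slidingWindows()); // prints [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
        System.out.println(runningSum());     // prints [1, 3, 6, 10, 15]
    }
}
```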