Functional data structures in Java [RigaDevDays`2017]

Developers know and love data structures. Applications are often full of maps, trees, heaps, queues and much more. Yet we rarely bother to look under the hood to understand the tradeoffs between them.

We’ll briefly discuss what makes data structures persistent, and why making persistent data structures perform well is challenging. You’ll understand what amortized performance is, and how lazy evaluation can turn the tables on performance, making persistent data structures fast again.
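To make "persistent" concrete, here is a minimal sketch of an immutable singly linked list in plain Java. It is an illustration only: the class name `PList` and its methods are hypothetical and are not taken from the slides or from any library. Prepending allocates a single node and reuses the existing list as its tail, so every earlier version stays valid and unchanged.

```java
import java.util.NoSuchElementException;

/** Hypothetical persistent (immutable) singly linked list; not a library API. */
final class PList<T> {
    private static final PList<?> EMPTY = new PList<>(null, null, 0);

    private final T head;
    private final PList<T> tail;
    private final int length;

    private PList(T head, PList<T> tail, int length) {
        this.head = head;
        this.tail = tail;
        this.length = length;
    }

    @SuppressWarnings("unchecked")
    static <T> PList<T> empty() {
        return (PList<T>) EMPTY;
    }

    boolean isEmpty() { return length == 0; }

    int length() { return length; }

    T head() {
        if (isEmpty()) throw new NoSuchElementException("head of empty list");
        return head;
    }

    PList<T> tail() {
        if (isEmpty()) throw new NoSuchElementException("tail of empty list");
        return tail;
    }

    /** O(1): allocates one node and shares the whole existing list as its tail. */
    PList<T> prepend(T element) {
        return new PList<>(element, this, length + 1);
    }

    /** O(n): walks the list once, prepending each element to a fresh result. */
    PList<T> reverse() {
        PList<T> result = empty();
        for (PList<T> cur = this; !cur.isEmpty(); cur = cur.tail) {
            result = result.prepend(cur.head);
        }
        return result;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("[");
        for (PList<T> cur = this; !cur.isEmpty(); cur = cur.tail) {
            if (cur != this) sb.append(", ");
            sb.append(cur.head);
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        PList<Integer> base = PList.<Integer>empty().prepend(3).prepend(2);
        PList<Integer> a = base.prepend(1);
        PList<Integer> b = base.prepend(0);

        System.out.println(base);                 // [2, 3], the old version is intact
        System.out.println(a);                    // [1, 2, 3]
        System.out.println(b);                    // [0, 2, 3]
        System.out.println(a.tail() == b.tail()); // true, both versions share nodes
        System.out.println(a.reverse());          // [3, 2, 1]
    }
}
```

Two versions derived from the same base cost one extra node each, which is why prepending is cheap even though nothing is ever mutated.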

We’ll look at several purely functional data structures implemented in Java 8 and discuss why they are efficient and when you might prefer them to the data structures built into the JDK. By attending this session, you’ll feel more comfortable with functional data structures and will be more likely to succeed in using functional programming for future problems that involve data crunching.
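As a rough illustration of one such structure, the sketch below builds a persistent FIFO queue out of two immutable lists, following the front/rear idea that appears in the slides. The names `TwoListQueue` and `Node` are assumptions made for this example, and the code is not the Vavr implementation.

```java
import java.util.NoSuchElementException;

/** Hypothetical persistent FIFO queue backed by two immutable linked lists. */
final class TwoListQueue<T> {

    /** Minimal immutable cons cell; null stands for the empty list. */
    private static final class Node<E> {
        final E head;
        final Node<E> tail;

        Node(E head, Node<E> tail) {
            this.head = head;
            this.tail = tail;
        }
    }

    /** O(n) reversal that builds a new list and leaves the input untouched. */
    private static <E> Node<E> reverse(Node<E> list) {
        Node<E> result = null;
        for (Node<E> cur = list; cur != null; cur = cur.tail) {
            result = new Node<>(cur.head, result);
        }
        return result;
    }

    private final Node<T> front; // elements to dequeue next, oldest first
    private final Node<T> rear;  // recently enqueued elements, newest first

    /** Maintains the invariant: if front is empty, rear is empty too. */
    private TwoListQueue(Node<T> front, Node<T> rear) {
        if (front == null) {
            this.front = reverse(rear);
            this.rear = null;
        } else {
            this.front = front;
            this.rear = rear;
        }
    }

    static <T> TwoListQueue<T> empty() {
        return new TwoListQueue<>(null, null);
    }

    boolean isEmpty() { return front == null; }

    /** Returns a new queue with the element added at the end. */
    TwoListQueue<T> enqueue(T element) {
        return new TwoListQueue<>(front, new Node<>(element, rear));
    }

    T peek() {
        if (isEmpty()) throw new NoSuchElementException("empty");
        return front.head;
    }

    /** Returns the queue without its first element; this queue is unchanged. */
    TwoListQueue<T> tail() {
        if (isEmpty()) throw new NoSuchElementException("empty");
        return new TwoListQueue<>(front.tail, rear);
    }

    public static void main(String[] args) {
        TwoListQueue<Integer> q =
                TwoListQueue.<Integer>empty().enqueue(1).enqueue(2).enqueue(3);
        TwoListQueue<Integer> rest = q.tail();
        System.out.println(q.peek());    // 1
        System.out.println(rest.peek()); // 2
    }
}
```

The reversal in the constructor costs O(n), but when each version of the queue is used only once, every element moves from rear to front at most once, so the cost per operation is amortized O(1). If an old version is reused repeatedly, the same reversal can be paid again and again, which is the situation where lazy evaluation becomes relevant.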

Oleg Šelajev

May 16, 2017

Transcript

  1. Oleg Šelajev @shelajev ZeroTurnaround Functional Data Structures in Java

  2. @shelajev

  3. WHAT DOES FUNCTIONAL MEAN?

  4. None
  5. None
  6. None
  7. https://courses.csail.mit.edu/6.851/spring12/illus.png

  8. MUTABLE DATA STRUCTURES

      interface Collection<E> {
          …
          void clear();
          boolean addAll(Collection<? extends E> c);
          …
      }
  9. IMMUTABLE DATA STRUCTURES

      List<String> list = Collections.unmodifiableList(otherList);

      // Hurray!
      list.add("why so serious? :)"); // throws UnsupportedOperationException at runtime
  10. PERSISTENT DATA STRUCTURES

  11. (╯°□°）╯︵ ┻━┻
      no assignments
      persistence seems expensive

  12. HOW TO BUILD A LIST

      final class Cons<T> implements List<T> {
          private final T head;
          private final List<T> tail;
          private final int length;
      }
  13. HOW TO PREPEND A LIST

      @Override
      default List<T> prepend(T element) {
          return new Cons<>(element, this);
      }
  14. HOW TO PREPEND A LIST

  15. HOW TO FOLD A LIST

      @Override
      default <U> U foldLeft(U zero, BiFunction<? super U, ? super T, ? extends U> f) {
          U xs = zero;
          for (T x : this) {
              xs = f.apply(xs, x);
          }
          return xs;
      }
  16. HOW TO REVERSE A LIST

      @Override
      default List<T> reverse() {
          return (length() <= 1) ? this : foldLeft(empty(), List::prepend);
      }
  17. HOW TO APPEND TO A LIST

      @Override
      default List<T> append(T element) {
          return foldRight(Cons.of(element), (x, xs) -> xs.prepend(x));
      }
  18. ALGORITHMIC COMPLEXITY

  19. None
  20. AMORTIZED COMPLEXITY

      Amortized complexity is the total expense per operation, evaluated over a sequence of operations. The idea is to guarantee the total expense of the entire sequence, while permitting individual operations to be much more expensive than the average.
  21. AMORTIZED COMPLEXITY

      Cons cons = Cons.of(1);
      cons.prepend(2);
      cons.prepend(3);
      ...
      cons.prepend(n);
      cons.reverse();
  22. AMORTIZED COMPLEXITY

      Cons cons = Cons.of(1); // O(1)
      cons.prepend(2);        // O(1)
      cons.prepend(3);        // O(1)
      ...
      cons.prepend(n);        // O(1)
      cons.reverse();         // O(n)
  23. BANKER’S METHOD

      Cons cons = Cons.of(1); // O(1 + ⍺)
      cons.prepend(2);        // O(1 + ⍺)
      cons.prepend(3);        // O(1 + ⍺)
      ...
      cons.prepend(n);        // O(1 + ⍺)
      cons.reverse();         // O(n)

      total: O(2N + N*⍺)
  24. BANKER’S METHOD

      Cons cons = Cons.of(1); // O(1 + ⍺)
      cons.prepend(2);        // O(1 + ⍺)
      cons.prepend(3);        // O(1 + ⍺)
      ...
      cons.prepend(n);        // O(1 + ⍺)
      cons.reverse();         // O(n)

      total: O(N*(2+⍺))
  25. BANKER’S METHOD

      Cons cons = Cons.of(1); // O(1 + ⍺)
      cons.prepend(2);        // O(1 + ⍺)
      cons.prepend(3);        // O(1 + ⍺)
      ...
      cons.prepend(n);        // O(1 + ⍺)
      cons.reverse();         // O(n)

      total: O(N*(2+⍺))
      per operation: O(1+⍺)
  26. (╯°□°）╯︵ ┻━┻
      persistence allows calling expensive ops often

  27. EVALUATION MODES Call-by-name Call-by-value Call-by-need

  28. EVALUATION MODES Call-by-name Call-by-value Call-by-need

  29. HOW TO BUILD A QUEUE?

  30. HOW TO BUILD A QUEUE? WHEN YOU HAVE A LIST?

  31. HOW TO BUILD A QUEUE

      public final class Queue<T> {
          private final List<T> queue;
      }
  32. HOW TO BUILD A QUEUE

      public final class Queue<T> {
          private static final Queue<?> EMPTY = new Queue<>(List.empty(), List.empty());

          private final List<T> front;
          private final List<T> rear;
      }
  33. HOW TO BUILD A QUEUE

  34. HOW TO ENQUEUE

      @Override
      public Queue<T> enqueue(T element) {
          return new Queue<>(front, rear.prepend(element));
      }
  35. HOW TO TAIL A QUEUE

      @Override
      public Queue<T> tail() {
          return new Queue<>(front.tail(), rear);
      }
  36. HOW TO PEEK

      public T peek() {
          if (isEmpty()) {
              throw new NoSuchElementException("empty");
          } else {
              return front.head();
          }
      }
  37. HOW TO DEQUEUE

      Queue queue = Queue.of(1, 2, 3);

      // = (1, Queue(2, 3))
      Tuple2<Integer, Queue> dequeued = queue.dequeue();
  38. HOW TO DEQUEUE

      @Override
      public Tuple2<T, Q> dequeue() {
          if (isEmpty()) {
              throw new NoSuchElementException("empty");
          } else {
              return Tuple.of(head(), tail());
          }
      }
  39. WHAT IF ONLY FRONT IS EMPTY? front.isEmpty() => rear.isEmpty()

  40. WHAT IF FRONT IS EMPTY?

      private Queue(List<T> front, List<T> rear) {
          final boolean frontIsEmpty = front.isEmpty();
          this.front = frontIsEmpty ? rear.reverse() : front;
          this.rear = frontIsEmpty ? front : rear;
      }
  41. ISN’T REVERSE EXPENSIVE?

  42. HOW TO BUILD A MAP? http://www.eso-schatzsucher.de/teso/wbb/index.php/Attachment/4046-alikr-3-karte-png/

  43. HOW TO BUILD A MAP? YOU HAVE A LIST AND A QUEUE http://www.eso-schatzsucher.de/teso/wbb/index.php/Attachment/4046-alikr-3-karte-png/
  44. HOW TO BUILD A MAP

  45. HOW TO BUILD A MAP

  46. HOW TO BUILD A MAP slide by Rich Hickey

  47. UPDATING THE MAP slide by Rich Hickey

  48. NORMAL TREES

  49. None
  50. None
  51. None
  52. None
  53. None
  54. None
  55. http://vavr.io/

  56. None
  57. BENCHMARKS

      java.util.PriorityQueue<Integer>
      java.util.concurrent.PriorityBlockingQueue<Integer>
      scala.collection.mutable.PriorityQueue<Integer>
      scalaz.Heap<Integer>
      javaslang.collection.PriorityQueue<Integer>

  58. BENCHMARKS

      @State(Scope.Benchmark)
      public static class Base {

          @Param({ "10", "1000", "100000" })
          public int CONTAINER_SIZE;

          public Integer[] ELEMENTS;
          int expectedAggregate = 0;

          @Setup
          public void setup() {
              ELEMENTS = getRandomValues(CONTAINER_SIZE, 0);

              for (int element : ELEMENTS) {
                  expectedAggregate ^= element;
              }
          }
      }
  59. ENQUEUE

      @Benchmark
      public void slang_persistent() {
          PriorityQueue<Integer> q = PriorityQueue.empty();
          for (Integer element : ELEMENTS) {
              q = q.enqueue(element);
          }
          assertEquals(q.size(), CONTAINER_SIZE);
      }
  60. DEQUEUE

      @Benchmark
      public void slang_persistent(Initialized state) {
          PriorityQueue<Integer> values = state.slangPersistent;

          int aggregate = 0;
          while (!values.isEmpty()) {
              Tuple2<Integer, PriorityQueue<Integer>> dequeue = values.dequeue();
              aggregate ^= dequeue._1;
              values = dequeue._2;
          }
          assertEquals(values.size(), 0);
          assertEquals(aggregate, expectedAggregate);
      }
  61. SORT

      @Benchmark
      public void slang_persistent() {
          PriorityQueue<Integer> values = PriorityQueue.empty();
          for (Integer element : ELEMENTS) {
              values = values.enqueue(element);
          }
          assertEquals(values.size(), CONTAINER_SIZE);

          int aggregate = 0;
          while (!values.isEmpty()) {
              final Tuple2<Integer, PriorityQueue<Integer>> dequeue = values.dequeue();
              aggregate ^= dequeue._1;
              values = dequeue._2;
          }
          assertEquals(values.size(), 0);
          assertEquals(aggregate, expectedAggregate);
      }
  62. Operation  Ratio                                        10        100       1000
      Dequeue    slang_persistent/java_blocking_mutable       0.64×     0.61×     0.49×
      Dequeue    slang_persistent/java_mutable                0.19×     0.14×     0.45×
      Dequeue    slang_persistent/scalaz_persistent           152.74×   71.36×    16.30×
      Dequeue    slang_persistent/scala_mutable               0.21×     0.17×     0.45×
      Enqueue    slang_persistent/java_blocking_mutable       1.73×     0.86×     0.71×
      Enqueue    slang_persistent/java_mutable                0.32×     0.36×     0.62×
      Enqueue    slang_persistent/scalaz_persistent           2.43×     2.45×     2.60×
      Enqueue    slang_persistent/scala_mutable               0.57×     0.64×     0.92×
      Sort       slang_persistent/java_blocking_mutable       0.73×     0.32×     0.21×
      Sort       slang_persistent/java_mutable                0.18×     0.09×     0.11×
      Sort       slang_persistent/java_treeset_mutable        0.94×     1.35×     1.13×
      Sort       slang_persistent/scalaz_persistent           277.27×   48.84×    10.53×
      Sort       slang_persistent/scala_mutable               0.21×     0.14×     0.20×
  63. RESULTS (SAFE HARBOR)
      ~ 2-4x slower than the mutable blocking Java version
      ~ 3x slower than the mutable Scala version
      ~ 10x faster than the persistent Scalaz version.
      Vavr’s priority queue might be
  64. None
  65. https://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504

  66. What's new in purely functional data structures since Okasaki? http://cstheory.stackexchange.com/a/1550

  67. oleg@zeroturnaround.com @shelajev Find me and chat with me!