Functional data structures in Java [Voxxed Zurich '17]

Developers know and love data structures. Applications are often full of maps, trees, heaps, queues and much more, yet we rarely bother to look under the hood to understand the tradeoffs between them.

We'll briefly discuss what makes data structures persistent, and why making persistent data structures perform well is challenging. You'll understand what amortized performance is, and how lazy evaluation can turn the tables on performance, making persistent data structures fast again.

We'll look at several purely functional data structures implemented in Java 8 and discuss why they are efficient and when you might prefer them to the data structures built into the JDK. By attending this session, you'll feel more comfortable with functional data structures and will be more likely to succeed using functional programming for problems that involve data crunching in the future.

Oleg Šelajev

February 23, 2017

Transcript

  1. Oleg Šelajev
    @shelajev
    ZeroTurnaround
    Functional Data Structures
    in Java

  2. @shelajev

  4. WHAT DOES FUNCTIONAL MEAN?

  7. https://courses.csail.mit.edu/6.851/spring12/illus.png

  8. MUTABLE DATA STRUCTURES
    interface Collection<E> {
        …
        void clear();
        boolean addAll(Collection<? extends E> c);
        …
    }

  9. IMMUTABLE DATA STRUCTURES
    List<String> list =
            Collections.unmodifiableList(otherList);

    // Hurray!
    // (compiles, but throws UnsupportedOperationException at runtime)
    list.add("why so serious? :)");
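A minimal sketch of why the wrapper on this slide is not really immutable, using only java.util. The class and method names (UnmodifiableViewDemo, addIsRejected) are made up for illustration. The wrapper rejects writes, but it is only a view, so changes to the backing list still show through.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class UnmodifiableViewDemo {

    // Returns true if add() on the given list is rejected at runtime.
    static boolean addIsRejected(List<String> view) {
        try {
            view.add("why so serious? :)");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> otherList = new ArrayList<>();
        otherList.add("a");
        List<String> list = Collections.unmodifiableList(otherList);

        // The wrapper rejects writes.
        System.out.println(addIsRejected(list)); // true

        // The wrapper is a view: a write to the backing list is visible through it.
        otherList.add("b");
        System.out.println(list.size()); // 2
    }
}
```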

  10. PERSISTENT DATA STRUCTURES

  11. FUNCTIONAL DATA STRUCTURES

  12. (╯°□°)╯︵ ┻━┻
    NO ASSIGNMENTS
    PERSISTENCE

  13. HOW TO BUILD A LIST
    final class Cons<T> implements List<T> {
        private final T head;
        private final List<T> tail;
        private final int length;
    }

  14. HOW TO PREPEND A LIST
    @Override
    default List<T> prepend(T element) {
        return new Cons<>(element, this);
    }
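A small self-contained sketch of what prepending does to memory, assuming a simplified list. SharedList is a hypothetical stand-in, not the Javaslang class. Prepending allocates exactly one new cell and reuses the old list as its tail, so several new lists can share the same old one.

```java
// Hypothetical minimal persistent list, for illustration only.
final class SharedList<T> {
    final T head;              // null only in the empty list
    final SharedList<T> tail;  // null only in the empty list
    final int length;

    private SharedList(T head, SharedList<T> tail, int length) {
        this.head = head;
        this.tail = tail;
        this.length = length;
    }

    static <T> SharedList<T> empty() {
        return new SharedList<>(null, null, 0);
    }

    // O(1): one new cell whose tail is the existing, untouched list.
    SharedList<T> prepend(T element) {
        return new SharedList<>(element, this, length + 1);
    }

    public static void main(String[] args) {
        SharedList<Integer> a = SharedList.<Integer>empty().prepend(2).prepend(1); // (1, 2)
        SharedList<Integer> b = a.prepend(0);   // (0, 1, 2)
        SharedList<Integer> c = a.prepend(42);  // (42, 1, 2)

        // Both new lists share the very same cells of a.
        System.out.println(b.tail == a && c.tail == a); // true
        // The old version is unchanged.
        System.out.println(a.length); // 2
    }
}
```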

  15. HOW TO PREPEND A LIST

  16. HOW TO FOLD A LIST
    @Override
    default <U> U foldLeft(U zero,
            BiFunction<? super U, ? super T, ? extends U> f) {
        U xs = zero;
        for (T x : this) {
            xs = f.apply(xs, x);
        }
        return xs;
    }

  17. HOW TO REVERSE A LIST
    @Override
    default List<T> reverse() {
        return (length() <= 1) ?
                this :
                foldLeft(empty(), List::prepend);
    }

  18. HOW TO APPEND TO A LIST
    @Override
    default List<T> append(T element) {
        return foldRight(Cons.of(element),
                (x, xs) -> xs.prepend(x));
    }
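The fold, reverse and append slides can be tried together in one runnable sketch. FoldList is a hypothetical simplified list, and implementing foldRight as "reverse, then foldLeft" is an assumption of this sketch rather than something the slides state. It shows that append has to rebuild every cell, which makes it O(n), while prepend stays O(1).

```java
import java.util.ArrayList;
import java.util.function.BiFunction;

// Hypothetical minimal persistent list, for illustration only.
final class FoldList<T> {
    final T head;
    final FoldList<T> tail; // null marks the empty list

    private FoldList(T head, FoldList<T> tail) {
        this.head = head;
        this.tail = tail;
    }

    static <T> FoldList<T> empty() { return new FoldList<>(null, null); }

    boolean isEmpty() { return tail == null; }

    FoldList<T> prepend(T element) { return new FoldList<>(element, this); }

    // Walks the list front to back, threading an accumulator through f.
    <U> U foldLeft(U zero, BiFunction<? super U, ? super T, ? extends U> f) {
        U acc = zero;
        for (FoldList<T> cur = this; !cur.isEmpty(); cur = cur.tail) {
            acc = f.apply(acc, cur.head);
        }
        return acc;
    }

    // Prepending each element in turn yields the reversed list: O(n).
    FoldList<T> reverse() {
        return this.<FoldList<T>>foldLeft(empty(), (acc, x) -> acc.prepend(x));
    }

    // Assumption of this sketch: fold from the right by folding the reversed list.
    <U> U foldRight(U zero, BiFunction<? super T, ? super U, ? extends U> f) {
        return reverse().<U>foldLeft(zero, (acc, x) -> f.apply(x, acc));
    }

    // Rebuilds every cell in front of the new last element: O(n).
    FoldList<T> append(T element) {
        FoldList<T> last = FoldList.<T>empty().prepend(element);
        return this.<FoldList<T>>foldRight(last, (x, xs) -> xs.prepend(x));
    }

    java.util.List<T> toJavaList() {
        java.util.List<T> out = new ArrayList<>();
        for (FoldList<T> cur = this; !cur.isEmpty(); cur = cur.tail) {
            out.add(cur.head);
        }
        return out;
    }

    public static void main(String[] args) {
        FoldList<Integer> list = FoldList.<Integer>empty().prepend(3).prepend(2).prepend(1);
        System.out.println(list.reverse().toJavaList()); // [3, 2, 1]
        System.out.println(list.append(4).toJavaList()); // [1, 2, 3, 4]
        System.out.println(list.toJavaList());           // [1, 2, 3]
    }
}
```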

  19. ALGORITHMIC COMPLEXITY

  21. AMORTIZED COMPLEXITY
    Amortized complexity is the total expense per operation,
    evaluated over a sequence of operations.
    The idea is to guarantee the total expense of the entire sequence,
    while permitting individual operations to be much more expensive
    than the average.

  22. AMORTIZED COMPLEXITY
    Cons<Integer> cons = Cons.of(1);
    cons.prepend(2);
    cons.prepend(3);
    ...
    cons.prepend(n);
    cons.reverse();

  23. AMORTIZED COMPLEXITY
    Cons<Integer> cons = Cons.of(1); // O(1)
    cons.prepend(2); // O(1)
    cons.prepend(3); // O(1)
    ...
    cons.prepend(n); // O(1)
    cons.reverse(); // O(n)

  24. BANKER’S METHOD
    Cons<Integer> cons = Cons.of(1); // O(1 + α)
    cons.prepend(2); // O(1 + α)
    cons.prepend(3); // O(1 + α)
    ...
    cons.prepend(n); // O(1 + α)
    cons.reverse(); // O(n)
    TOTAL: O(2N + N*α)

  25. BANKER’S METHOD
    Cons<Integer> cons = Cons.of(1); // O(1 + α)
    cons.prepend(2); // O(1 + α)
    cons.prepend(3); // O(1 + α)
    ...
    cons.prepend(n); // O(1 + α)
    cons.reverse(); // O(n)
    TOTAL: O(N*(2+α))

  26. BANKER’S METHOD
    Cons<Integer> cons = Cons.of(1); // O(1 + α)
    cons.prepend(2); // O(1 + α)
    cons.prepend(3); // O(1 + α)
    ...
    cons.prepend(n); // O(1 + α)
    cons.reverse(); // O(n)
    TOTAL: O(N*(2+α))
    PER OPERATION: O(1+α)
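The accounting on these slides can be checked by counting work directly. The sketch below is a hypothetical step counter, not part of the talk: it charges one step per allocated cell, runs n prepends followed by one reverse, and reports the total. It does not model the α term from the slides; it only shows that the total is about 2n, so the average per operation stays constant even though the single reverse costs n.

```java
final class AmortizedDemo {

    // Minimal immutable cell; null stands for the empty list.
    static final class Cell {
        final int head;
        final Cell tail;

        Cell(int head, Cell tail) {
            this.head = head;
            this.tail = tail;
        }
    }

    long steps = 0; // one step per allocated cell

    Cell prepend(int x, Cell list) {
        steps++;
        return new Cell(x, list);
    }

    // Costs one prepend per element of the input list.
    Cell reverse(Cell list) {
        Cell result = null;
        for (Cell c = list; c != null; c = c.tail) {
            result = prepend(c.head, result);
        }
        return result;
    }

    // Runs n prepends followed by one reverse and returns the total step count.
    static long totalSteps(int n) {
        AmortizedDemo demo = new AmortizedDemo();
        Cell list = null;
        for (int i = 1; i <= n; i++) {
            list = demo.prepend(i, list);
        }
        demo.reverse(list);
        return demo.steps;
    }

    public static void main(String[] args) {
        int n = 1000;
        long total = totalSteps(n);
        System.out.println(total); // 2000: n steps for the prepends, n for the reverse
        // n + 1 operations in total, so the average is just under 2 steps each.
        System.out.println((double) total / (n + 1));
    }
}
```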

  27. (╯°□°)╯︵ ┻━┻
    PERSISTENCE ALLOWS CALLING
    EXPENSIVE OPS
    OFTEN

  28. MODES
    STRICT
    LAZY WITHOUT MEMOIZATION
    LAZY WITH MEMOIZATION
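The three modes can be illustrated with plain java.util.function.Supplier. The sketch below is an assumption about what the slide means, with made-up names (LazyModes, Memo, expensive): strict code evaluates immediately, a bare Supplier re-evaluates on every call, and a memoizing wrapper evaluates once and caches the result. The wrapper is not thread-safe.

```java
import java.util.function.Supplier;

final class LazyModes {

    // Lazy with memoization: compute on the first get(), reuse the result afterwards.
    static final class Memo<T> implements Supplier<T> {
        private Supplier<? extends T> thunk; // released after the first evaluation
        private T value;
        private boolean evaluated;

        Memo(Supplier<? extends T> thunk) {
            this.thunk = thunk;
        }

        @Override
        public T get() {
            if (!evaluated) {
                value = thunk.get();
                evaluated = true;
                thunk = null;
            }
            return value;
        }
    }

    static int calls = 0; // how many times the expensive computation ran

    static int expensive() {
        calls++;
        return 42;
    }

    public static void main(String[] args) {
        int strict = expensive();                       // strict: runs right here
        Supplier<Integer> lazy = LazyModes::expensive;  // lazy: nothing runs yet
        lazy.get();
        lazy.get();                                     // without memoization: runs twice
        Supplier<Integer> memo = new Memo<Integer>(LazyModes::expensive);
        memo.get();
        memo.get();                                     // with memoization: runs once
        System.out.println(strict); // 42
        System.out.println(calls);  // 4 = 1 strict + 2 lazy + 1 memoized
    }
}
```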

  29. HOW TO BUILD A QUEUE?

  30. HOW TO BUILD A QUEUE?
    WHEN YOU HAVE A LIST?

  31. HOW TO BUILD A QUEUE
    public final class Queue<T> {
        private final List<T> queue;
    }

  32. HOW TO BUILD A QUEUE
    public final class Queue<T> {
        private static final Queue<?> EMPTY =
                new Queue<>(List.empty(), List.empty());

        private final List<T> front;
        private final List<T> rear;
    }

  33. HOW TO BUILD A QUEUE

  34. HOW TO ENQUEUE
    @Override
    public Queue<T> enqueue(T element) {
        return new Queue<>(front, rear.prepend(element));
    }

  35. HOW TO TAIL A QUEUE
    @Override
    public Queue<T> tail() {
        return new Queue<>(front.tail(), rear);
    }

  36. HOW TO PEEK
    public T peek() {
        if (isEmpty()) {
            throw new NoSuchElementException("empty");
        } else {
            return front.head();
        }
    }

  37. HOW TO DEQUEUE
    Queue<Integer> queue = Queue.of(1, 2, 3);

    // = (1, Queue(2, 3))
    Tuple2<Integer, Queue<Integer>> dequeued = queue.dequeue();

  38. HOW TO DEQUEUE
    @Override
    public Tuple2<T, Queue<T>> dequeue() {
        if (isEmpty()) {
            throw new NoSuchElementException("empty");
        } else {
            return Tuple.of(head(), tail());
        }
    }

  39. WHAT IF ONLY FRONT IS EMPTY?
    front.isEmpty() => rear.isEmpty()

  40. WHAT IF FRONT IS EMPTY?
    private Queue(List<T> front, List<T> rear) {
        final boolean frontIsEmpty = front.isEmpty();
        this.front = frontIsEmpty ? rear.reverse() : front;
        this.rear = frontIsEmpty ? front : rear;
    }
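Putting the queue slides together, here is a self-contained sketch of a two-list persistent queue. TwoListQueue and its inner Node are hypothetical simplifications, not the Javaslang source. The constructor keeps the invariant from the slides: whenever the front is empty, the rear is reversed into it, so an empty front always means an empty queue.

```java
import java.util.NoSuchElementException;

final class TwoListQueue<T> {

    // Minimal immutable stack cell used for both halves; null is the empty list.
    private static final class Node<T> {
        final T head;
        final Node<T> tail;

        Node(T head, Node<T> tail) {
            this.head = head;
            this.tail = tail;
        }
    }

    private final Node<T> front; // front.head is the next element to dequeue
    private final Node<T> rear;  // rear.head is the most recently enqueued element

    private TwoListQueue(Node<T> front, Node<T> rear) {
        // Invariant: an empty front implies an empty rear.
        if (front == null) {
            this.front = reverse(rear);
            this.rear = null;
        } else {
            this.front = front;
            this.rear = rear;
        }
    }

    private static <T> Node<T> reverse(Node<T> list) {
        Node<T> result = null;
        for (Node<T> n = list; n != null; n = n.tail) {
            result = new Node<>(n.head, result);
        }
        return result;
    }

    static <T> TwoListQueue<T> empty() {
        return new TwoListQueue<>(null, null);
    }

    boolean isEmpty() {
        return front == null;
    }

    // O(1): pushes onto the rear and leaves this queue untouched.
    TwoListQueue<T> enqueue(T element) {
        return new TwoListQueue<>(front, new Node<>(element, rear));
    }

    T peek() {
        if (isEmpty()) {
            throw new NoSuchElementException("empty");
        }
        return front.head;
    }

    // Usually O(1); O(n) only when the front runs out and the rear is reversed.
    TwoListQueue<T> tail() {
        if (isEmpty()) {
            throw new NoSuchElementException("empty");
        }
        return new TwoListQueue<>(front.tail, rear);
    }

    public static void main(String[] args) {
        TwoListQueue<Integer> q = TwoListQueue.<Integer>empty().enqueue(1).enqueue(2).enqueue(3);
        System.out.println(q.peek());        // 1
        System.out.println(q.tail().peek()); // 2
        System.out.println(q.peek());        // 1, the original queue is unchanged
    }
}
```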

  41. ISN’T REVERSE EXPENSIVE?

  42. HOW TO BUILD A MAP?
    http://www.eso-schatzsucher.de/teso/wbb/index.php/Attachment/4046-alikr-3-karte-png/

  43. HOW TO BUILD A MAP?
    YOU HAVE A LIST AND A QUEUE
    http://www.eso-schatzsucher.de/teso/wbb/index.php/Attachment/4046-alikr-3-karte-png/

  44. HOW TO BUILD A MAP

  45. HOW TO BUILD A MAP

  46. HOW TO BUILD A MAP
    slide by Rich Hickey

  47. UPDATING THE MAP
    slide by Rich Hickey
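The slide image is not part of the transcript, so the following is only an assumption about the idea it illustrates: updating a persistent tree by path copying. The sketch uses a plain unbalanced binary search tree (PathCopyTree is a made-up name, not the structure from the slide or from Javaslang). An update copies only the nodes on the path from the root to the change, and every other subtree is shared between the old and new versions.

```java
final class PathCopyTree {

    static final class Node {
        final int key;
        final String value;
        final Node left;
        final Node right;

        Node(int key, String value, Node left, Node right) {
            this.key = key;
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // Returns the root of a new version; only nodes on the search path are copied.
    static Node put(Node node, int key, String value) {
        if (node == null) {
            return new Node(key, value, null, null);
        }
        if (key < node.key) {
            return new Node(node.key, node.value, put(node.left, key, value), node.right);
        }
        if (key > node.key) {
            return new Node(node.key, node.value, node.left, put(node.right, key, value));
        }
        return new Node(key, value, node.left, node.right);
    }

    // Returns the value stored under key, or null if the key is absent.
    static String get(Node node, int key) {
        while (node != null) {
            if (key < node.key) {
                node = node.left;
            } else if (key > node.key) {
                node = node.right;
            } else {
                return node.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Node v1 = put(put(put(null, 5, "five"), 2, "two"), 8, "eight");
        Node v2 = put(v1, 9, "nine");

        System.out.println(get(v1, 9));          // null, the old version is untouched
        System.out.println(get(v2, 9));          // nine
        System.out.println(v1.left == v2.left);  // true, the left subtree is shared
    }
}
```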

  48. NORMAL TREES

  55. http://www.javaslang.io/

  57. BENCHMARKS
    java.util.PriorityQueue
    java.util.concurrent.PriorityBlockingQueue
    scala.collection.mutable.PriorityQueue
    scalaz.Heap
    javaslang.collection.PriorityQueue

  58. BENCHMARKS
    @State(Scope.Benchmark)
    public static class Base {

        @Param({ "10", "1000", "100000" })
        public int CONTAINER_SIZE;

        public Integer[] ELEMENTS;
        int expectedAggregate = 0;

        @Setup
        public void setup() {
            ELEMENTS = getRandomValues(CONTAINER_SIZE, 0);

            for (int element : ELEMENTS) {
                expectedAggregate ^= element;
            }
        }
    }

  59. ENQUEUE
    @Benchmark
    public void slang_persistent() {
        PriorityQueue<Integer> q = PriorityQueue.empty();
        for (Integer element : ELEMENTS) {
            q = q.enqueue(element);
        }
        assertEquals(q.size(), CONTAINER_SIZE);
    }

  60. DEQUEUE
    @Benchmark
    public void slang_persistent(Initialized state) {
        PriorityQueue<Integer> values = state.slangPersistent;

        int aggregate = 0;
        while (!values.isEmpty()) {
            Tuple2<Integer, PriorityQueue<Integer>> dequeue =
                    values.dequeue();
            aggregate ^= dequeue._1;
            values = dequeue._2;
        }
        assertEquals(values.size(), 0);
        assertEquals(aggregate, expectedAggregate);
    }

  61. SORT
    @Benchmark
    public void slang_persistent() {
        PriorityQueue<Integer> values = PriorityQueue.empty();
        for (Integer element : ELEMENTS) {
            values = values.enqueue(element);
        }
        assertEquals(values.size(), CONTAINER_SIZE);

        int aggregate = 0;
        while (!values.isEmpty()) {
            final Tuple2<Integer, PriorityQueue<Integer>> dequeue = values.dequeue();
            aggregate ^= dequeue._1;
            values = dequeue._2;
        }
        assertEquals(values.size(), 0);
        assertEquals(aggregate, expectedAggregate);
    }

  62. Operation  Ratio                                          10      100     1000
      Dequeue    slang_persistent/java_blocking_mutable      0.64×    0.61×    0.49×
      Dequeue    slang_persistent/java_mutable               0.19×    0.14×    0.45×
      Dequeue    slang_persistent/scalaz_persistent        152.74×   71.36×   16.30×
      Dequeue    slang_persistent/scala_mutable              0.21×    0.17×    0.45×
      Enqueue    slang_persistent/java_blocking_mutable      1.73×    0.86×    0.71×
      Enqueue    slang_persistent/java_mutable               0.32×    0.36×    0.62×
      Enqueue    slang_persistent/scalaz_persistent          2.43×    2.45×    2.60×
      Enqueue    slang_persistent/scala_mutable              0.57×    0.64×    0.92×
      Sort       slang_persistent/java_blocking_mutable      0.73×    0.32×    0.21×
      Sort       slang_persistent/java_mutable               0.18×    0.09×    0.11×
      Sort       slang_persistent/java_treeset_mutable       0.94×    1.35×    1.13×
      Sort       slang_persistent/scalaz_persistent        277.27×   48.84×   10.53×
      Sort       slang_persistent/scala_mutable              0.21×    0.14×    0.20×

  63. RESULTS (SAFE HARBOR)
    SLANG PRIORITY QUEUE MIGHT BE
    ~ 2-4x slower than the mutable blocking Java version
    ~ 3x slower than the mutable Scala version
    ~ 10x faster than the persistent Scalaz version.

  65. https://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504

  66. What's new in purely functional data
    structures since Okasaki?
    http://cstheory.stackexchange.com/a/1550

  67. http://bit.ly/xrebel-vdz17

  68. [email protected]
    @shelajev
    Find me and chat with me!