Scala Collections: Why Not? - Paul Phillips

NewCircle Training

January 29, 2014

Transcript

  1. “When I'm working on a problem, I never think about beauty. I think only how to solve the problem.”

    “But when I have finished, if the solution is not beautiful, I know it is wrong.”

    – R. Buckminster Fuller

    (syntax highlighting donated by paulp)
  2. “When I'm working on a problem, I never think about beauty. I think only how to solve the problem.”

    “But when I have finished, if the solution is not beautiful, I know it is wrong.”

        trait ParSeqViewLike[
          +T,
          +Coll <: Parallel,
          +CollSeq,
          +This <: ParSeqView[T, Coll, CollSeq] with ParSeqViewLike[T, Coll, CollSeq, This, ThisSeq],
          +ThisSeq <: SeqView[T, CollSeq] with SeqViewLike[T, CollSeq, ThisSeq]
        ] extends GenSeqView[T, Coll]
             with GenSeqViewLike[T, Coll, This]
             with ParIterableView[T, Coll, CollSeq]
             with ParIterableViewLike[T, Coll, CollSeq, This, ThisSeq]
             with ParSeq[T]
             with ParSeqLike[T, This, ThisSeq]

    – R. Buckminster Fuller
  3. The Winding Stairway

    • Five years on scala
    • Rooting for scala/typesafe
    • But I quit a dream job...
    • ...because I lost faith
  4. Should you care?

    • I offer my credentials only to bear witness to my credibility
    • I suspect I have written more scala code than anyone else, ever.
    • What’s visible in compiler/library represents only a small fraction of it
  5. The early bird gets the can of worms

    • I don’t wish to make slagging a full-time gig
    • Tonight let’s focus less on what’s wrong with scala and more on how to do it better
    • Video from last night will be available if that’s your thing
  6. Is Scala too complex?

    • Yes.
    • Is anyone fooled by specious comparisons of language grammar size? Who cares?
    • Half the time when someone hits a bug they can’t tell whether it is a bug in scala or the expected behavior
    • That definitely includes me
  7. • A meme is going around that scala is too complex
    • Option A: Own it
    • Option B: Address it
    • Option C: Obscure it

    Perceived Problem
    Option C
  8. Thus is born the “use case”

        // A fictional idealized version of the genuine method
        def map[B](f: (A) => B): Map[B]

        // The laughably labeled "full" signature
        def map[B, That](f: ((A, B)) => B)
          (implicit bf: CanBuildFrom[Map[A, B], B, That]): That

    neither has any basis in reality!
  9. the true name of map

        // markers to distinguish Map's class type parameters
        scala> class K ; class V
        defined class K, V

        scala> val host = typeOf[Map[K, V]]
        host: Type = Map[K,V]

        scala> val method = host member TermName("map")
        method: Symbol = method map

        // Correct signature for map has FOUR distinct identifiers
        scala> method defStringSeenAs (host memberType method)
        res0: String = \
          def map[B, That](f: ((K, V)) => B)
            (implicit bf: CBF[Map[K,V],B,That]): That
  10. • Maybe you’re thinking “So it’s a bug. Bugs get fixed.”
    • “As soon as the situation is known, of course it will be fixed? At the very least it will be marked somehow?”
    • Nope! Your time has no value.

    44 MONTHS
  11. map versus “map”

    map
      Signature:  def map[B](f: A => B): F[B]
      Elegance:   Among the purest and most reusable abstractions known to computing science
      Advantages: Can reason abstractly about code

    “map”
      Signature:  def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That
      Elegance:   <-- Not this.
      Advantages: Can map a BitSet to a BitSet without typing “toBitSet”

    Spokespicture
    Slightly Caricatured
  12. The Bitset Gimmick

        // Fancy, we get a Bitset back!
        scala> BitSet(1, 2, 3) map (_.toString.toInt)
        res0: BitSet = BitSet(1, 2, 3)

        // Except…
        scala> BitSet(1, 2, 3) map (_.toString) map (_.toInt)
        res1: SortedSet[Int] = TreeSet(1, 2, 3)

        // Um…
        scala> (BitSet(1, 2, 3) map identity)(1)
        <console>:21: error: type mismatch;
         found   : Int(1)
         required: CanBuildFrom[BitSet,Int,?]
                      (BitSet(1, 2, 3) map identity)(1)
                                                     ^
        // What’s going on? The docs said map is A => B
        // The primary docs can’t be wrong, can they?
  13. similarly

        scala> def f[T](x: T) = (x, new Object)
        f: [T](x: T)(T, Object)

        scala> SortedSet(1 to 10: _*)
        res0: SortedSet[Int] = TreeSet(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

        scala> SortedSet(1 to 10: _*) map (x => f(x)._1)
        res1: SortedSet[Int] = TreeSet(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

        // What happened to the “Sorted” in my SortedSet?
        scala> SortedSet(1 to 10: _*) map f map (_._1)
        res2: Set[Int] = Set(5, 10, 1, 6, 9, 2, 7, 3, 8, 4)
  14. and in a similar vein

        scala> val f: Int => Int = _ % 3
        f: Int => Int = <function1>

        scala> val g: Int => Int = _ => System.nanoTime % 1000000 toInt
        g: Int => Int = <function1>

        scala> Set(3, 6, 9) map f map g
        res0: Set[Int] = Set(633000)

        scala> Set(3, 6, 9) map (f andThen g)
        res1: Set[Int] = Set(305000, 307000, 308000)
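The two REPL results above differ because the first `map` collapses the set to one element before `g` ever runs. A minimal deterministic sketch of the same effect, assuming a hypothetical counter in place of `System.nanoTime` (`FusionDemo` and `freshCounter` are illustrative names, not from the talk):

```scala
object FusionDemo {
  // f collapses 3, 6 and 9 to the single value 0.
  val f: Int => Int = _ % 3

  // Hypothetical stand-in for the slide's System.nanoTime: each call
  // returns the next integer, so the results are deterministic.
  def freshCounter(): Int => Int = {
    var n = 0
    (_: Int) => { n += 1; n }
  }

  // Two passes: the intermediate Set(0) has one element, so g runs once.
  def twoPasses: Set[Int] = Set(3, 6, 9).map(f).map(freshCounter())

  // One fused pass: g runs once per original element.
  def fused: Set[Int] = Set(3, 6, 9).map(f andThen freshCounter())
}
```

Here `FusionDemo.twoPasses` has one element and `FusionDemo.fused` has three, so rewriting `xs map f map g` as `xs map (f andThen g)` is not a safe refactoring on a `Set` unless `g` is pure.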
  15. Java Interop: the cruelest joke

    • It’s impossible to call scala’s map from java!
    • See all the grotesque details at SI-4389

    “I played with it until it got too tedious. I think the signatures work fine. What does not work is that the variances of CanBuildFrom cannot be modelled in Java, so types do not match. And it seems Java does not even let me override with a cast. So short answer: You can't call these things from Java because instead of declaration side variance you have only a broken wildcard system.”

    – Martin Odersky

    WONTFIX
  16. • Implementation details infest everything
    • And every detail is implementation-defined
    • Casts (556 explicit asInstanceOf) and suppression of variance checks abound
    • Specificity rules render contravariance useless
    • Implicit selection and type inference inextricably bound - so type inference is largely frozen because any change will break existing code
  17. • Inheritance of implementation is the hammer for every nail…
    • …yet “final” and “private”, critical for a hope of correctness under inheritance, are almost unknown
    • All manner of behavior is ill-defined, which means users have to program defensively and compiler team cannot optimize
    • Can xs filter (_ => true) return xs?
  18. Why is covariance such an object of worship?

        // WHY infer this utterly useless type?
        scala> List(1, 2) ::: List(3, 4.0)
        res0: List[AnyVal] = List(1, 2, 3.0, 4.0)

        scala> PspList(1, 2) ::: PspList(3, 4.0)
        <console>:23: error: type mismatch;
         found   : PspList[Int]
         required: PspList[Double]

    Types exist so we don’t have to live like this!
  19. Speaking of variance

        scala> trait Ord[-A]
        defined trait Ord

        scala> implicit def x1: Ord[Any] = { println("Any") ; ??? } ; implicit def x2: Ord[List[Double]] = { println("List[Double]") ; ??? }
        x1: Ord[Any]
        x2: Ord[List[Double]]

        // An implementation isn’t the only thing missing here
        scala> implicitly[Ord[List[Double]]]
        Any
        scala.NotImplementedError: an implementation is missing
  20. Abstracting over mutability

    • An inherited implementation is ALWAYS wrong somewhere!
    • Example: how do you write "drop" so it's reusable?
    • In a mutable class, drop MUST NOT share, but in an immutable class, drop MUST share!
    • Half the overrides in collections exist to stave off the incorrectness which looms above. This is nuts.
    • Not to mention “Map”, “Set”, etc. in three namespaces
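The share-versus-copy point about `drop` can be observed with the standard `List` and `ArrayBuffer`. This is a minimal sketch, not code from the talk. `DropSharing` is an illustrative name, and the reference-equality check relies on how `List.drop` happens to be implemented, not on a documented guarantee:

```scala
import scala.collection.mutable.ArrayBuffer

object DropSharing {
  // Immutable: List.drop can hand back the existing tail cells (no copy).
  val xs: List[Int] = List(1, 2, 3)
  val sharesStructure: Boolean = xs.drop(1) eq xs.tail

  // Mutable: the result of drop has to be a copy, otherwise a later write
  // to the original buffer would leak into it.
  val buf: ArrayBuffer[Int] = ArrayBuffer(1, 2, 3)
  val dropped: ArrayBuffer[Int] = buf.drop(1)
  buf(1) = 99
  val copyUnaffected: Boolean = dropped == ArrayBuffer(2, 3)
}
```

A single inherited `drop` cannot do both, which is why each side ends up needing its own override.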
  21. How many ways are there to write ‘slice’ ?

        % ack --no-filename 'def slice\(' src/library/

         1 override def slice(from: Int, until: Int): Iterator[A] =
         2 def slice(from: Int, until: Int): Iterator[A] = {
         3 def slice(from: Int, until: Int): Repr =
         4 def slice(from: Int, until: Int): Repr = {
         5 def slice(from: Int, until: Int): Repr = {
         6 def slice(start: Int): PagedSeq[T] = slice(start, UndeterminedEnd)
         7 def slice(unc_from: Int, unc_until: Int): Repr
         8 override /*IterableLike*/ def slice(from: Int, until: Int): Vector[A] =
         9 override /*TraversableLike*/ def slice(from: Int, until: Int): Repr = {
        10 override def slice(_start: Int, _end: Int): PagedSeq[T] = {
        11 override def slice(from1: Int, until1: Int): IterableSplitter[T] =
        12 override def slice(from1: Int, until1: Int): SeqSplitter[T] =
        13 override def slice(from: Int, until: Int) = {
        14 override def slice(from: Int, until: Int) = {
        15 override def slice(from: Int, until: Int): List[A] = {
        16 override def slice(from: Int, until: Int): Repr = self.slice(from, until)
        17 override def slice(from: Int, until: Int): Repr = {
        18 override def slice(from: Int, until: Int): Stream[A] = {
        19 override def slice(from: Int, until: Int): String = {
        20 override def slice(from: Int, until: Int): This =
        21 override def slice(from: Int, until: Int): This =
        22 override def slice(from: Int, until: Int): Traversable[A]
        23 override def slice(from: Int, until: Int): WrappedString = {
        24 override def slice(unc_from: Int, unc_until: Int): Repr = {
  22. Some perennial favorites

        scala> List(1, 2, 3).toSet()
        res0: Boolean = false

        scala> 123456789.round
        res1: Int = 123456792

        scala> List(1, 2, 3) contains "your mom"
        res2: Boolean = false

        scala> def sum[T](xs: Iterable[T]): Int = xs.map(_.hashCode).sum
  23. Two complementary ways to define Set[A]. Complementary - and NOT the same thing!

    sets                 Intensional       Extensional
    Specification        Membership test   Members
    Variance             Set[-A]           Set[+A]
    Defining Signature   A => Boolean      Iterable[A]
    Size                 Unknowable        Known
    Duplicates(*)        Meaningless       Disallowed
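The variance row of the table can be checked with two small traits. This is a minimal sketch. `IntensionalSet`, `ExtensionalSet` and the sample values are hypothetical names chosen for illustration, not definitions from the talk or from the standard library:

```scala
object TwoKindsOfSet {
  // Intensional: defined by a membership test, so A appears only as input.
  trait IntensionalSet[-A] {
    def contains(elem: A): Boolean
  }

  // Extensional: defined by its members, so A appears only as output.
  trait ExtensionalSet[+A] {
    def members: Iterable[A]
    def size: Int = members.size
  }

  val nonNull: IntensionalSet[Any] = new IntensionalSet[Any] {
    def contains(elem: Any): Boolean = elem != null
  }
  // Contravariance: a test on Any is also a test on String.
  val nonNullStrings: IntensionalSet[String] = nonNull

  val smallInts: ExtensionalSet[Int] = new ExtensionalSet[Int] {
    val members: Iterable[Int] = List(1, 2, 3)
  }
  // Covariance: a collection of Ints is also a collection of Any.
  val smallAnys: ExtensionalSet[Any] = smallInts
}
```

A single trait that declares both `contains(elem: A)` and an iterator of `A` uses `A` in both positions, which forces it to be invariant.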
  24. What's going on here?

        scala> class xs[A] extends Set[A]
        error: class xs has 4 unimplemented members.

        // Intensional/extensional, conflated.
        // Any possibility of variance eliminated.
        def iterator: Iterator[A]
        def contains(elem: A): Boolean

        // What are these doing in the interface?
        // Why can I define a Seq without them?
        def -(elem: A): Set[A]
        def +(elem: A): Set[A]
  25. tyranny of the interface

    • Mandating "def size: Int" for all collections is the fast track to Glacialville!
    • Countless times have I fixed xs.size != 0
    • Collections are both worlds: all performance/termination trap, no exploiting of size information!
    • A universal size method must be SAFE and CHEAP
  26. Psp Collections

    • So here is a little of what I would do differently
    • I realized since agreeing to this talk that I may have to go cold turkey to escape scala’s orbit. It’s just too frustrating to use.
    • Which means this may never go anywhere
    • But you can have whatever gets done
  27. Conceptual Integrity

        trait Collections {
          type CC[+X]        // the overarching container type (in scala: any covariant collection, e.g. List, Vector)
          type Min[+X]       // least type constructor which can be reconstituted to CC[X] (scala: GenTraversableOnce)
          type Opt[+X]       // the container type for optional results (in scala: Option)
          type CCPair[+X]    // some representation of a divided CC[A] (at simplest, (CC[A], CC[A]))
          type ~>[-V1, +V2]  // some means of composing operations (at simplest, Function1)

          type Iso[A]            = CC[A] ~> CC[A]       // e.g. filter, take, drop, reverse, etc.
          type Map[-A, +B]       = CC[A] ~> CC[B]       // e.g. map, collect
          type FlatMap[-A, +B]   = CC[A] ~> Min[B]      // e.g. flatMap
          type Grouped[A, DD[X]] = CC[A] ~> CC[DD[A]]   // e.g. sliding
          type Fold[-A, +R]      = CC[A] ~> R           // e.g. fold, but also subsumes all operations on CC[A]
          type Flatten[A]        = CC[Min[A]] ~> CC[A]  // e.g. flatten
          type Build[A]          = Min[A] ~> CC[A]      // for use in e.g. sliding, flatMap
          type Pure[A]           = A ~> CC[A]           // we may not need

          trait Relations[A] {
            type MapTo[+B]  = Map[A, B]        // an alias incorporating the known A
            type FoldTo[+R] = Fold[A, R]       // another one
            type This       = CC[A]            // the CC[A] under consideration
            type Twosome    = CCPair[A]        // a (CC[A], CC[A]) representation
            type Self       = Iso[A]           // a.k.a. CC[A] => CC[A], e.g. tail, filter, reverse
            type Select     = FoldTo[A]        // a.k.a. CC[A] => A, e.g. head, reduce, max
            type Find       = FoldTo[Opt[A]]   // a.k.a. CC[A] => Opt[A], e.g. find
            type Split      = FoldTo[Twosome]  // a.k.a. CC[A] => (CC[A], CC[A]), e.g. partition, span
          }
        }
  28. “Do not multiply entities unnecessarily”

    • mutable / immutable
    • Seq / Set / Map
    • parallel / sequential
    • view / regular

    24 Combinations!
  29. Surface Area Reduced 96%

    • A Set is a Seq without duplicates.
    • A Map is a Set paired with a function K => V.
    • A mutable collection has nothing useful in common with an immutable collection. Write your own mutable collections.
    • If we can’t get sequential collections right, we have no hope of parallel collections. Write your own parallel collections.
    • “Views” should be how it always works.
  30. predictability: size matters

        scala> def f(xs: Iterable[Int]) = xs.size
        f: (xs: Iterable[Int])Int

        // O(1)
        scala> f(Set(1))
        res0: Int = 1

        // O(n)
        scala> f(List(1))
        res1: Int = 1

        // O(NOES)
        scala> f(Stream continually 1)
        <ctrl-C>
  31. Don’t ask unanswerable questions unless you enjoy hearing lies

        scala> val xs = Foreach from BigInt(1)
        xs: Foreach[BigInt] = unfold from 1

        scala> xs.size
        <console>:22: error: value size is not a member of Foreach[BigInt]
                      xs.size
                         ^

        scala> xs.sizeInfo
        res0: SizeInfo = <inf>

        scala> (xs.m take 10000).sizeInfo
        res1: SizeInfo = 10000

        scala> (xs.m take 10000 map (_ + 1)).sizeInfo
        res2: SizeInfo = 10000
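One way to read the `sizeInfo` results above is as a small algebra over sizes that never traverses the collection. The following is a hypothetical model only. The names `Precise`, `Infinite` and `Unknown` and the three functions are assumptions made for illustration and are not the psp library's actual definitions:

```scala
object SizeInfoSketch {
  // Hypothetical model, not the psp library's actual definition.
  sealed trait SizeInfo
  final case class Precise(n: Long) extends SizeInfo
  case object Infinite extends SizeInfo
  case object Unknown extends SizeInfo

  // Size after `take n` (assumes n >= 0): cheap, and never traverses anything.
  def take(size: SizeInfo, n: Long): SizeInfo = size match {
    case Precise(m) => Precise(math.min(m, n))
    case Infinite   => Precise(n)
    case Unknown    => Unknown
  }

  // map keeps the element count unchanged.
  def map(size: SizeInfo): SizeInfo = size

  // filter can only admit what it cannot know (a fuller model would track bounds).
  def filter(size: SizeInfo): SizeInfo = size match {
    case Precise(0L) => Precise(0L)
    case _           => Unknown
  }
}
```

Under this model, taking 10000 elements of an infinite source and then mapping yields `Precise(10000)`, matching the shape of `res1` and `res2` above, while an honest "don't know" remains available for operations like `filter`.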
  32. the joy of the invariant leaf

        // It’s 2014 and our language still allows this?
        scala> List(1, 2, 3) contains "1"
        res0: Boolean = false

        scala> PspList(1, 2, 3) contains "1"
        <console>:23: error: type mismatch;
         found   : String("1")
         required: Int
                      PspList(1, 2, 3) contains "1"
                                                ^
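The standard `List.contains` accepts any supertype of the element type (its signature is `contains[A1 >: A](elem: A1)`), which is why the `String` argument above is accepted. A minimal sketch of the invariant alternative follows. `InvList` is a hypothetical wrapper written for illustration, since the talk does not show how `PspList` is defined:

```scala
object InvariantLeaf {
  // A hypothetical invariant wrapper; PspList's real definition is not
  // shown in the talk.
  final class InvList[A](elems: List[A]) {
    // elem must be an A: there is no `A1 >: A` escape hatch to widen into.
    def contains(elem: A): Boolean = elems.contains(elem)
  }

  val xs = new InvList(List(1, 2, 3))

  val hasTwo: Boolean = xs.contains(2)
  // xs.contains("1")   // rejected at compile time: found String, required Int
}
```

Because `InvList` is invariant and `contains` takes exactly `A`, the compiler has no common supertype to fall back on, so the nonsensical query is an error instead of a silent `false`.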
  33. What’s a view?

        // Us
        scala> timed(Indexed.to(1, 1000000).m map (_ + 1) \
                 map (_ + 1) map (_ + 1) drop 999999 take 1 sum)
        Elapsed: 1.427 ms
        res0: Int = 1000003

        // Them
        scala> timed((1 to 1000000).view map (_ + 1) \
                 map (_ + 1) map (_ + 1) drop 999999 take 1 sum)
        Elapsed: 201.825 ms
        res1: Int = 1000003
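The timing gap above is what one would expect if a view over an indexed source only composes functions on `map` and only moves bounds on `drop` and `take`. This is a minimal sketch of that idea, not the psp implementation. `IndexedView` and `range` are hypothetical names, and the sketch assumes ranges small enough that `to - from + 1` does not overflow:

```scala
object ViewSketch {
  // Hypothetical: a view over the indices [start, end) of an indexed source.
  final class IndexedView[A](start: Int, end: Int, at: Int => A) {
    // map only composes functions; nothing is evaluated yet.
    def map[B](f: A => B): IndexedView[B] =
      new IndexedView[B](start, end, at andThen f)

    // drop and take only move the bounds.
    def drop(n: Int): IndexedView[A] =
      new IndexedView[A](math.min(end.toLong, start.toLong + math.max(n, 0)).toInt, end, at)
    def take(n: Int): IndexedView[A] =
      new IndexedView[A](start, math.min(end.toLong, start.toLong + math.max(n, 0)).toInt, at)

    // Elements are computed only here, and only for the surviving indices.
    def force: Vector[A] = (start until end).map(at).toVector
  }

  // Inclusive range from..to, addressed by index.
  def range(from: Int, to: Int): IndexedView[Int] =
    new IndexedView[Int](0, to - from + 1, (i: Int) => from + i)
}
```

With this sketch, `ViewSketch.range(1, 1000000).map(_ + 1).map(_ + 1).map(_ + 1).drop(999999).take(1).force.sum` applies the three functions to a single element and gives 1000003, the same value as on the slide.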
  34. Save map!

        scala> "abc" map (_.toInt.toChar)
        res1: String = abc

        scala> "abc" map (_.toInt) map (_.toChar)
        res2: IndexedSeq[Char] = Vector(a, b, c)

        // psp to the rescue
        scala> "abc".m map (_.toInt) map (_.toChar)
        res3: psp.core.View[String,Char] = view of abc

        scala> res3.force
        res4: String = abc