An introduction to performance tips for Scala.
This presentation was given at ScalaMatsuri 2017 in Japan.
[petitviolet/scalamatsuri2017: sample code project for ScalaMatsuri2017 presentation](https://github.com/petitviolet/scalamatsuri2017/tree/master)
To build a `responsive` system, it is necessary and important to improve performance, even by a little. I will introduce a few of the Scala performance tips accumulated this way while implementing such an ad-network service.
beginners.
• The topics covered are not surprising, but they are basics we should all learn: obvious to those who already know them, yet worth knowing.
• They will help you when performance is not what you expected.
    Seq[Int] = (1 to NUM).map { _ => Random.nextInt(NUM) }
    import scala.util.Sorting
    Sorting.stableSort(target)

There is also `scala.util.Sorting`, which can be used to sort a sequence.
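As a rough sketch of the two approaches this slide touches on, the snippet below sorts the same random sequence with `Seq#sorted` and with `Sorting.stableSort`. The object name `SortingSketch` and the value chosen for `NUM` are assumptions made for illustration. Which approach is faster depends on the data and the Scala version, so it is worth measuring rather than assuming.

```scala
import scala.util.{Random, Sorting}

object SortingSketch {
  // NUM is an arbitrary size chosen for this sketch.
  val NUM: Int = 1000

  val target: Seq[Int] = (1 to NUM).map { _ => Random.nextInt(NUM) }

  // Seq#sorted returns a new, sorted sequence.
  val viaSorted: Seq[Int] = target.sorted

  // Sorting.stableSort accepts a Seq and returns a sorted Array.
  val viaStableSort: Array[Int] = Sorting.stableSort(target)
}
```

Both calls produce the same ordering; they differ in the result type (`Seq[Int]` versus `Array[Int]`).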
    val N: Int = ???
    val func: Int => Int = ???
    val seq: Seq[Int] = (1 to N).toSeq
    val set: Set[Int] = (1 to N).toSet
    seq map func
    set map func

• How much does the performance of `map` differ between `Seq` and `Set`?
or 10000

    val seq: Seq[Int] = (1 to N).toSeq
    val set: Set[Int] = (1 to N).toSet
    seq map { _ % 2 }
    seq map { _ % 1000 }
    // set size shrinks from N to 2
    set map { _ % 2 }
    // set size shrinks from N to 1000
    set map { _ % 1000 }

How much does the performance of `map` differ between `Seq` and `Set`?
possible
• `Set#map` calls `hashCode` and `equals` to eliminate duplicates
  • https://gist.github.com/petitviolet/dc5b44fae57277ae915bd770ba4f2435
• The same applies to `flatMap`, `collect`, etc.
• Collections - Performance Characteristics
  • http://docs.scala-lang.org/overviews/collections/performance-characteristics
• `Set#contains` is faster than `Seq#contains`
• Be careful: the work needed to keep the result a `Set` is hidden behind the call and slows it down.
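To make the hidden work visible, the following sketch counts `hashCode` calls while mapping over a `Seq` and over a `Set`. The `Key` class, the counter, and the helper names are hypothetical and exist only for this illustration. The exact number of calls depends on the Scala version and the set implementation, so the sketch only distinguishes "none" from "some".

```scala
import java.util.concurrent.atomic.AtomicInteger

object SetMapCost {
  // Counts how often Key#hashCode is invoked.
  val hashCalls = new AtomicInteger(0)

  // A hypothetical element type whose hashCode reports every call.
  final case class Key(id: Int) {
    override def hashCode(): Int = { hashCalls.incrementAndGet(); id }
  }

  private val keys: Seq[Key] = (1 to 100).map(Key(_)).toList
  private val keySet: Set[Key] = keys.toSet

  // Resets the counter, runs `body`, and returns the hashCode calls it caused.
  def hashCallsDuring(body: => Any): Int = {
    hashCalls.set(0)
    body
    hashCalls.get()
  }

  // Mapping a List only applies the function to each element.
  def seqMapCalls: Int = hashCallsDuring(keys.map(k => Key(k.id + 1)))

  // Mapping a Set also has to hash the results to build the new Set.
  def setMapCalls: Int = hashCallsDuring(keySet.map(k => Key(k.id + 1)))
}
```

If the result does not need to be a `Set`, converting to a `Seq` first (for example with `toSeq`) or iterating directly avoids rebuilding a hashed collection.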
`:: Nil` just creates a new instance, which is why it is fast

    1 :: Nil == new ::(1, Nil)

• `Seq()` and `List()` append to a `ListBuffer`

    val b = newBuilder[A] // mutable.ListBuffer
    b ++= elems
    b.result()
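The difference can be sketched as follows. The object and value names are illustrative, and the `ListBuffer` path only mirrors what the slide describes; the actual implementation of `Seq.apply` and `List.apply` varies between Scala versions.

```scala
import scala.collection.mutable.ListBuffer

object SingleElementList {
  // Prepending to Nil creates a single `::` node.
  val viaCons: List[Int] = 1 :: Nil

  // The generic factories take varargs and build the result from them.
  val viaSeq: Seq[Int] = Seq(1)
  val viaList: List[Int] = List(1)

  // Roughly the builder-based path described on the slide.
  val viaBuffer: List[Int] = {
    val b = ListBuffer.empty[Int]
    b ++= Seq(1)
    b.toList
  }
}
```

All four values are equal; only the amount of work needed to construct them differs.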
the first argument is different.

    def byName(value: => String, flag: Boolean): String = if (flag) value else ""
    def byValue(value: String, flag: Boolean): String = if (flag) value else ""
    Logger, isDebugEnabled: Boolean) {
      def debug(msg: => String) = if (isDebugEnabled) logger.debug(msg)
      def info(msg: => String) = logger.info(msg)
    }

Call-by-name can be used to implement things such as a logger.
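The effect of call-by-name can be checked by counting how often an expensive argument is actually evaluated. In this sketch, `expensiveMessage`, the `built` counter, and the in-memory `LazyLogger` are hypothetical stand-ins; a real logger would delegate to a logging backend instead.

```scala
object ByNameSketch {
  // Counts how many times the expensive message was actually built.
  var built: Int = 0

  def expensiveMessage(): String = {
    built += 1
    "state dump: " + (1 to 3).mkString(",")
  }

  def byName(value: => String, flag: Boolean): String = if (flag) value else ""
  def byValue(value: String, flag: Boolean): String = if (flag) value else ""

  // A hypothetical logger that collects lines in memory.
  final class LazyLogger(isDebugEnabled: Boolean) {
    val lines = scala.collection.mutable.ArrayBuffer.empty[String]

    // msg is only evaluated when debug logging is enabled.
    def debug(msg: => String): Unit =
      if (isDebugEnabled) { lines += msg; () }
  }
}
```

With `flag = false`, `byName(expensiveMessage(), flag = false)` never builds the message, while `byValue` builds it before the call even though the result is discarded.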
instance will be represented as its `val` field type. Therefore, an `Awesome` instance is just an `Int`.

    class Awesome(val value: Int) extends AnyVal {
      def double = value * 2
    }
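A minimal sketch of how such a value class is used. The wrapper object and the `doubled` and `boxed` helpers are assumptions for illustration. Note that the wrapper is not always eliminated: when a value class is used as a generic type argument, for example, the instance is boxed.

```scala
object ValueClassSketch {
  // A value class: exactly one `val` parameter, and it extends AnyVal.
  final class Awesome(val value: Int) extends AnyVal {
    def double: Int = value * 2
  }

  // Used directly like this, the compiler can work with the underlying Int.
  def doubled(n: Int): Int = new Awesome(n).double

  // Used as a generic type argument, the instance has to be boxed.
  def boxed(n: Int): List[Awesome] = List(new Awesome(n))
}
```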
a performance point of view?

    object Normal {
      case class User(id: Id, name: Name)
      case class Id(value: Long)
      case class Name(value: String)
    }
    Normal.User(Normal.Id(1L), Normal.Name("hoge"))

    object Value {
      // UserValue itself is not a value class
      case class UserValue(id: IdValue, name: NameValue)
      case class IdValue(value: Long) extends AnyVal
      case class NameValue(value: String) extends AnyVal
    }
    Value.UserValue(Value.IdValue(1L), Value.NameValue("hoge"))
[Chart: instantiation throughput in ops/sec, Normal class vs. Value class, comparing `Normal.User(Normal.Id(1L), Normal.Name("hoge"))` with `Value.UserValue(Value.IdValue(1L), Value.NameValue("hoge"))`]
    object Value {
      // UserValue itself is not a value class
      case class UserValue(id: IdValue, name: NameValue)
      case class IdValue(value: Long) extends AnyVal
      case class NameValue(value: String) extends AnyVal
    }
    Value.UserValue(Value.IdValue(1L), Value.NameValue("hoge"))

In addition, a value class improves the performance of implicit conversions.

    implicit class AwesomeInt(val n: Int) extends AnyVal {
      def double = n * 2
    }
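As a small usage sketch of the implicit value class, the wrapper object and the `example` method below are hypothetical names. Because `AwesomeInt` extends `AnyVal`, the compiler can usually call `double` without allocating a wrapper object for each conversion.

```scala
object ImplicitValueClassSketch {
  // Extension method provided through an implicit value class.
  implicit class AwesomeInt(val n: Int) extends AnyVal {
    def double: Int = n * 2
  }

  // The implicit conversion is applied where `double` is called on an Int.
  def example: Int = {
    val n = 21
    n.double
  }
}
```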
    = for {
      a <- Future { sleep(100); println("finish 100"); 100 }
      b <- Future { sleep(50); println("finish 50"); 50 }
    } yield a + b
    result.onComplete(println)

Output:

    finish 100
    finish 50
    Success(150)

The futures are executed one after another, in order from the top.
    val aF = Future { sleep(100); 100 }
    val bF = Future { sleep(50); 50 }
    val result: Future[Int] = for {
      a <- aF
      b <- bF
    } yield a + b

Pay attention to when `Future.apply` is called.

    // use `Future#zip`
    val result: Future[Int] = for {
      (a, b) <- Future { sleep(100); 100 } zip Future { sleep(50); 50 }
    } yield a + b
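The difference in timing can be observed by recording when each task starts and ends. This is a sketch under stated assumptions: the `EventLog` class, the `task` helper, the two-thread pool, and the `await` helper are all hypothetical and are not part of the original sample project.

```scala
import java.util.concurrent.{Executors, ThreadFactory}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object FutureTimingSketch {
  // Two daemon threads, so both futures can run at the same time and
  // the pool does not keep the JVM alive.
  private val daemonThreads: ThreadFactory = new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r)
      t.setDaemon(true)
      t
    }
  }
  implicit val ec: ExecutionContext =
    ExecutionContext.fromExecutor(Executors.newFixedThreadPool(2, daemonThreads))

  // Thread-safe record of the order in which tasks start and end.
  final class EventLog {
    private var events = Vector.empty[String]
    def record(event: String): Unit = synchronized { events = events :+ event }
    def snapshot: Vector[String] = synchronized(events)
  }

  // Sleeps for `millis` and returns it, logging start and end.
  private def task(name: String, millis: Long, log: EventLog): Int = {
    log.record(s"start $name")
    Thread.sleep(millis)
    log.record(s"end $name")
    millis.toInt
  }

  // Future.apply is called inside the for-comprehension:
  // b is created only after a has completed.
  def sequential(log: EventLog): Future[Int] =
    for {
      a <- Future(task("a", 100, log))
      b <- Future(task("b", 50, log))
    } yield a + b

  // Future.apply is called up front: both tasks are already running
  // by the time they are combined.
  def parallel(log: EventLog): Future[Int] = {
    val aF = Future(task("a", 100, log))
    val bF = Future(task("b", 50, log))
    for {
      a <- aF
      b <- bF
    } yield a + b
  }

  def await[T](f: Future[T]): T = Await.result(f, 5.seconds)
}
```

In the sequential version the log always reads start a, end a, start b, end b. In the parallel version, task b starts before task a has finished.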
Summary
• `ThreadLocalRandom` is faster than `Random`
• Avoid `Set#map` where possible
• Use `:: Nil` to create a `Seq[T]`
• Call-by-name makes an argument lazy
• Value classes are useful in various situations
• Be careful about when `Future.apply` is called
More details:
Seq#sorted
  • https://gist.github.com/petitviolet/07f8459f8a40afd54bea00343548b080
• `ThreadLocalRandom` is faster than `Random`
  • https://gist.github.com/petitviolet/89886339e028779172a293a01e18d8f0
• Avoid `Set#map` where possible
  • https://gist.github.com/petitviolet/dc5b44fae57277ae915bd770ba4f2435
• Use `:: Nil` to create a `Seq[T]`
  • https://gist.github.com/petitviolet/b67d63aad1a23350f5fa5266d077efca
• Call-by-name makes an argument lazy
  • https://gist.github.com/petitviolet/bb0f14aad27021c46efecd972690d9b3
• Value classes are useful in various situations
  • https://gist.github.com/petitviolet/026979105dbe447c93d47a778a604619
• Be careful about when `Future.apply` is called
  • https://gist.github.com/petitviolet/d5ed010ce4ce12bc250b3f259c97ff02
• Before relying on tips, there should be other things to do.
  • e.g. caching to avoid I/O, RDBMS indexes, …
• Faster is better than slower.
• I hope these tips are of some help to you.