Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Speed, Correctness, or Simplicity: Choose 3

Speed, Correctness, or Simplicity: Choose 3

This talk introduces the floating point filter implementation in Spire (spire.math.FpFilter).

Tom Switzer

January 30, 2015
Tweet

Other Decks in Programming

Transcript

  1. Speed,  Correctness,  or  Simplicity:    
    Choose  3  
    Tom  Switzer  
    @9xxit  
     
    h;ps://github.com/9xxit/fpfilter-­‐talk  

    View Slide

  2. Overview  
     
    Floa9ng  point  is  “good  enough”…  
     
    most  of  the  9me.  

    View Slide

  3. Op9ons  
    Use  Double,  live  with  the  errors.  
     
    Use  higher  precision  type,  
    live  with  performance  loss.  
     
    But,  there  is  a  3rd  op9on…  

    View Slide

  4. Floa9ng  Point  Filters  
     
    Use  floa9ng  point  when  you  can.  
     
    Use  higher  precision  when  you  can’t.  

    View Slide

  5. Err…  Not  So  Simple  
    Solve  problem  using  floa9ng  point  
    approxima9on…  
     
    Maintain  an  error  bound  on  approxima9on.  
     
    Re-­‐evaluate  with  exact  type  if  error  too  large.  

    View Slide

  6. The  Catch  

    View Slide

  7.  
     
    What  is  the  determinant  of  my  matrix?  

    View Slide

  8. Not  Good  For:  Minimizing  Errors  
    in  Floa9ng  Point  Arithme9c  

    View Slide

  9.  
     
    What  is  the  sign  of  the  determinant  of  my  
    matrix?  

    View Slide

  10. Good  For:  Making  a  Decision  

    View Slide

  11. FpFilter[A]  
    Simple  wrapper:  FpFilter[Rational]
    Standard  Opera2ons  
    +,  -­‐,  *,  /,  .sqrt,  etc  
     
    Fast  predictes  
    signum,  compare,  isWhole,  etc.  

    View Slide

  12. FpFilter[A]  
    class FpFilter[A](
    apx: Double,
    mes: Double,
    ind: Int,
    exact: => A
    ) {

    }
    floa9ng  point  approxima9on  
    error  bounds  

    View Slide

  13. FpFilter[A]  
    class FpFilter[A](
    apx: Double,
    mes: Double,
    ind: Int,
    exact: => A
    ) {

    }
    error  bounds  
    “Exact  Geometric  Computa2on  
    Using  Cascading.”  
    Burnikel,  Funke  &  Seel.  

    View Slide

  14. FpFilter[A]  
    class FpFilter[A](
    apx: Double,
    mes: Double,
    ind: Int,
    exact: => A
    ) {

    }
    error  bounds  
    thunk  for  higher  precision  
    Welcome  to  …  

    View Slide

  15. …  Macro  City  
    def abs(implicit ev: Signed[A]): FpFilter[A] = macro FpFilter.absImpl[A]
    def unary_- (implicit ev: Rng[A]) : FpFilter[A] = macro FpFilter.negateImpl[A]
    def +(rhs: FpFilter[A])(implicit ev: Semiring[A]): FpFilter[A] = macro FpFilter.plusImpl[A]
    def -(rhs: FpFilter[A])(implicit ev: Rng[A]): FpFilter[A] = macro FpFilter.minusImpl[A]
    def *(rhs: FpFilter[A])(implicit ev: Semiring[A]): FpFilter[A] = macro FpFilter.timesImpl[A]
    def /(rhs: FpFilter[A])(implicit ev: Field[A]): FpFilter[A] = macro FpFilter.divideImpl[A]
    def sqrt(implicit ev: NRoot[A]): FpFilter[A] = macro FpFilter.sqrtImpl[A]
    def <(rhs: FpFilter[A])(implicit ev0: Signed[A], ev1: Rng[A]): Boolean = macro FpFilter.ltImpl[A]
    def >(rhs: FpFilter[A])(implicit ev0: Signed[A], ev1: Rng[A]): Boolean = macro FpFilter.gtImpl[A]
    def <=(rhs: FpFilter[A])(implicit ev0: Signed[A], ev1: Rng[A]): Boolean = macro FpFilter.ltEqImpl[A]
    def >=(rhs: FpFilter[A])(implicit ev0: Signed[A], ev1: Rng[A]): Boolean = macro FpFilter.gtEqImpl[A]
    def ===(rhs: FpFilter[A])(implicit ev0: Signed[A], ev1: Rng[A]): Boolean = macro FpFilter.eqImpl[A]
    def signum(implicit ev: Signed[A]): Int = macro FpFilter.signImpl[A]

    View Slide

  16. …  Macro  City  
    •  Operator  fusion  
    – No  intermediate  alloca9ons  
    •  In-­‐line  approxima9on  +  error  bounds  
    – Fast,  Double  arithme9c  
    •  Thunk  becomes  inner  defs  
    – Compile  down  to  private  methods  

    View Slide

  17. Turn  this…  
    (x + y).signum

    View Slide

  18. …  into  this.  
    val fpf$tmp$macro$38 = x.value;
    val fpf$apx$macro$39 = fpf$tmp$macro$38;
    val fpf$mes$macro$40 = java.lang.Math.abs(fpf$tmp$macro$38);
    def fpf$exact$macro$42 = spire.algebra.Field.apply[spire.math.Algebraic](Algebraic.AlgebraicAlgebra).fromDouble(fpf
    $tmp$macro$38);
    val fpf$tmp$macro$43 = y.value;
    val fpf$apx$macro$44 = fpf$tmp$macro$43;
    val fpf$mes$macro$45 = java.lang.Math.abs(fpf$tmp$macro$43);
    def fpf$exact$macro$47 = spire.algebra.Field.apply[Algebraic](Algebraic.AlgebraicAlgebra).fromDouble(fpf$tmp$macro
    $43);
    val fpf$apx$macro$48 = fpf$apx$macro$39.+(fpf$apx$macro$44);
    val fpf$mes$macro$49 = fpf$mes$macro$40.+(fpf$mes$macro$45);
    def fpf$exact$macro$51 = Algebraic.AlgebraicAlgebra.plus(
    fpf$exact$macro$42, fpf$exact$macro$47);
    val fpf$err$macro$52 = fpf$mes$macro$49.$times(1).$times(2.220446049250313E-16);
    if (fpf$apx$macro$48 > fpf$err$macro$52 && fpf$apx$macro$48 < Double.POSITIVE_INFINITY)
    1
    else
    if (fpf$apx$macro$48 < fpf$err$macro$52.unary_$minus && fpf$apx$macro$48 > Double.NEGATIVE_INFINITY)
    -1
    else
    if (fpf$err$macro$52 == 0.0)
    0
    else
    Algebraic.AlgebraicAlgebra.signum(fpf$exact$macro$51)

    View Slide

  19. Examples  

    View Slide

  20. 2D  Orienta2on  

    View Slide

  21. p  
    q  
    r  

    View Slide

  22. p  
    q  
    r  

    View Slide

  23. p  
    q  
    r  
    RIGHT  

    View Slide

  24. p  
    r  
    q  

    View Slide

  25. p  
    r  
    q  
    LEFT  

    View Slide

  26. p  
    r  
    q  

    View Slide

  27. p  
    r  
    q  
    NO  TURN  

    View Slide

  28. trait Turn[@spec A] {
    def apply(
    px: A, py: A,
    qx: A, qy: A,
    rx: A, ry: A
    ): Int
    }

    View Slide

  29. object FastTurn extends
    Turn[Double] {
    def apply(
    px: Double, py: Double,
    qx: Double, qy: Double,
    rx: Double, ry: Double
    ): Int = signum {
    (qx - px) * (ry - py) -
    (rx - px) * (qy - py)
    }
    }

    View Slide

  30. Accuracy  of  Fast  Turn  

    View Slide

  31. object ExactTurn extends Turn[Double] {
    def apply(
    px: Double, py: Double,
    qx: Double, qy: Double,
    rx: Double, ry: Double
    ): Int = {
    val pxa = Algebraic(px)
    val pya = Algebraic(py)
    val qxa = Algebraic(qx)
    val qya = Algebraic(qy)
    val rxa = Algebraic(rx)
    val rya = Algebraic(ry)
    ((qxa - pxa) * (rya - pya) –
    (rxa - pxa) * (qya - pya)).signum
    }
    }

    View Slide

  32. 10,000x  Slower!  

    View Slide

  33. Let’s  try  again…  

    View Slide

  34. object FilteredTurn extends Turn[Double] {
    def apply(
    px: Double, py: Double,
    qx: Double, qy: Double,
    rx: Double, ry: Double
    ): Int = {
    val pxf = FpFilter.exact[Algebraic](px)
    val pyf = FpFilter.exact[Algebraic](py)
    val qxf = FpFilter.exact[Algebraic](qx)
    val qyf = FpFilter.exact[Algebraic](qy)
    val rxf = FpFilter.exact[Algebraic](rx)
    val ryf = FpFilter.exact[Algebraic](ry)
    ((qxf - pxf) * (ryf - pyf) –
    (rxf - pxf) * (qyf - pyf)).signum
    }
    }

    View Slide

  35. FilteredTurn  Speed  
    Rela9ve  to  FastTurn  

    View Slide

  36. Polynomial  Root  Finding  

    View Slide

  37. Polynomial[A]  

    View Slide

  38. Interval[A]  
    Root  

    View Slide

  39. “Quadra2c  Interval  Refinement  for  
    Real  Roots.”  John  AbboT.  
    QIR  for  short.  

    View Slide

  40. QIR  (N=8)  

    View Slide

  41. QIR  (N=8)  

    View Slide

  42. QIR  (N=8)  

    View Slide

  43. QIR  (N=8)  

    View Slide

  44. QIR  (N=8)  

    View Slide

  45. QIR  (N=8)  

    View Slide

  46. QIR  (N=8)  

    View Slide

  47. QIR  (N=8)  

    View Slide

  48. QIR  (N=16)  

    View Slide

  49. QIR  (N=16)  

    View Slide

  50. QIR  (N=16)  

    View Slide

  51. QIR  
    •  Requires  2  polynomial  evalua9ons  
    – High  precision  generally  required  
    •  Very  fast  convergence  (quadra9c)  
    – Under  some  assump9ons  
    •  Occasionally  fails  when  assump9ons  not  met  
    – Fallsback  to  bisec9on!  

    View Slide

  52. QIR  (N=8)  

    View Slide

  53. QIR  (N=8)  

    View Slide

  54. QIR  (N=8)  
    FAILED!  

    View Slide

  55. Falls  back  to  bisec9on…  

    View Slide

  56. Bisec9on  

    View Slide

  57. Bisec9on  

    View Slide

  58. Bisec9on  

    View Slide

  59. Bisec9on  

    View Slide

  60. Bisec9on  

    View Slide

  61. Bisec9on  

    View Slide

  62. Bisec9on  

    View Slide

  63. Bisec9on  
     
    Requires  only  sign  tests  
     
    Converges  slowly,  1  bit  at  a  9me!  

    View Slide

  64. trait SignTest[A] {
    def apply(
    poly: Polynomial[A],
    x: A
    ): Sign
    }

    View Slide

  65. final class FilteredSignTest[@sp A: Semiring](
    implicit A: IsAlgebraic[A]
    ) extends SignTest[A] {
    def apply(poly: Polynomial[A], x: A): Sign = {
    val x0 = FpFilter(A.toDouble(x), A.toAlgebraic(x))
    @tailrec
    def loop(acc: FpFilter[Algebraic], i: Int): Sign =
    if (i >= 0) {
    val c = poly.nth(i)
    val cftr = FpFilter(A.toDouble(c),
    A.toAlgebraic(c))
    loop(cftr + acc * x0, i - 1)
    } else {
    Sign(acc.signum)
    }
    loop(FpFilter.approx(Algebraic.Zero), poly.degree)
    }
    }

    View Slide

  66. Accuracy  Using  Double  

    View Slide

  67. Speed  Up  from  Exact  Sign  Test  
    Fast  (d=8)  
    Fast  (d=16)  
    Fast  (d=32)  
    Filtered  (d=8)  
    Filtered  (d=16)  
    Filtered  (d=16)  

    View Slide

  68. Summary  
    •  Works  like  any  other  number  type  
    – Operator  fusion  +  inlining  within  expressions  
    •  Speeds  up  predicates  
    – Sign  tests,  comparisons,  etc.  
    •  Near-­‐Double  performance  
    – 2-­‐4x  in  most  cases  
    h;p://github.com/9xxit/fpfilter-­‐talk  

    View Slide

  69. Thanks!  
    h;p://github.com/non/spire  
     
    Tom  Switzer  @9xxit  

    View Slide