Speed, Correctness, or Simplicity: Choose 3

Speed, Correctness, or Simplicity: Choose 3 Tom
Switzer @9xxit h;ps://github.com/9xxit/fpﬁlter-‐talk

Overview Floa9ng point is “good enough”…
most of the 9me.

Op9ons Use Double, live with the errors.
Use higher precision type, live with performance loss. But, there is a 3rd op9on…

Floa9ng Point Filters Use ﬂoa9ng point when you
can. Use higher precision when you can’t.

Err… Not So Simple Solve problem using ﬂoa9ng point
approxima9on… Maintain an error bound on approxima9on. Re-‐evaluate with exact type if error too large.

The Catch

What is the determinant of my matrix?

Not Good For: Minimizing Errors in Floa9ng Point Arithme9c

What is the sign of the determinant of
my matrix?

Good For: Making a Decision

FpFilter[A] Simple wrapper: FpFilter[Rational] Standard Opera2ons +, -‐,
*, /, .sqrt, etc Fast predictes signum, compare, isWhole, etc.

FpFilter[A] class FpFilter[A]( apx: Double, mes: Double, ind: Int,
exact: => A ) { … } ﬂoa9ng point approxima9on error bounds

exact: => A ) { … } error bounds “Exact Geometric Computa2on Using Cascading.” Burnikel, Funke & Seel.

exact: => A ) { … } error bounds thunk for higher precision Welcome to …

… Macro City def abs(implicit ev: Signed[A]): FpFilter[A] =
macro FpFilter.absImpl[A] def unary_- (implicit ev: Rng[A]) : FpFilter[A] = macro FpFilter.negateImpl[A] def +(rhs: FpFilter[A])(implicit ev: Semiring[A]): FpFilter[A] = macro FpFilter.plusImpl[A] def -(rhs: FpFilter[A])(implicit ev: Rng[A]): FpFilter[A] = macro FpFilter.minusImpl[A] def *(rhs: FpFilter[A])(implicit ev: Semiring[A]): FpFilter[A] = macro FpFilter.timesImpl[A] def /(rhs: FpFilter[A])(implicit ev: Field[A]): FpFilter[A] = macro FpFilter.divideImpl[A] def sqrt(implicit ev: NRoot[A]): FpFilter[A] = macro FpFilter.sqrtImpl[A] def <(rhs: FpFilter[A])(implicit ev0: Signed[A], ev1: Rng[A]): Boolean = macro FpFilter.ltImpl[A] def >(rhs: FpFilter[A])(implicit ev0: Signed[A], ev1: Rng[A]): Boolean = macro FpFilter.gtImpl[A] def <=(rhs: FpFilter[A])(implicit ev0: Signed[A], ev1: Rng[A]): Boolean = macro FpFilter.ltEqImpl[A] def >=(rhs: FpFilter[A])(implicit ev0: Signed[A], ev1: Rng[A]): Boolean = macro FpFilter.gtEqImpl[A] def ===(rhs: FpFilter[A])(implicit ev0: Signed[A], ev1: Rng[A]): Boolean = macro FpFilter.eqImpl[A] def signum(implicit ev: Signed[A]): Int = macro FpFilter.signImpl[A]

… Macro City •  Operator fusion – No intermediate
alloca9ons •  In-‐line approxima9on + error bounds – Fast, Double arithme9c •  Thunk becomes inner defs – Compile down to private methods

Turn this… (x + y).signum

… into this. val fpf$tmp$macro$38 = x.value; val fpf$apx$macro$39
= fpf$tmp$macro$38; val fpf$mes$macro$40 = java.lang.Math.abs(fpf$tmp$macro$38); def fpf$exact$macro$42 = spire.algebra.Field.apply[spire.math.Algebraic](Algebraic.AlgebraicAlgebra).fromDouble(fpf $tmp$macro$38); val fpf$tmp$macro$43 = y.value; val fpf$apx$macro$44 = fpf$tmp$macro$43; val fpf$mes$macro$45 = java.lang.Math.abs(fpf$tmp$macro$43); def fpf$exact$macro$47 = spire.algebra.Field.apply[Algebraic](Algebraic.AlgebraicAlgebra).fromDouble(fpf$tmp$macro $43); val fpf$apx$macro$48 = fpf$apx$macro$39.+(fpf$apx$macro$44); val fpf$mes$macro$49 = fpf$mes$macro$40.+(fpf$mes$macro$45); def fpf$exact$macro$51 = Algebraic.AlgebraicAlgebra.plus( fpf$exact$macro$42, fpf$exact$macro$47); val fpf$err$macro$52 = fpf$mes$macro$49.$times(1).$times(2.220446049250313E-16); if (fpf$apx$macro$48 > fpf$err$macro$52 && fpf$apx$macro$48 < Double.POSITIVE_INFINITY) 1 else if (fpf$apx$macro$48 < fpf$err$macro$52.unary_$minus && fpf$apx$macro$48 > Double.NEGATIVE_INFINITY) -1 else if (fpf$err$macro$52 == 0.0) 0 else Algebraic.AlgebraicAlgebra.signum(fpf$exact$macro$51)

Examples

2D Orienta2on

p q r RIGHT

p r q LEFT

p r q NO TURN

trait Turn[@spec A] { def apply( px: A, py: A,
qx: A, qy: A, rx: A, ry: A ): Int }

object FastTurn extends Turn[Double] { def apply( px: Double, py:
Double, qx: Double, qy: Double, rx: Double, ry: Double ): Int = signum { (qx - px) * (ry - py) - (rx - px) * (qy - py) } }

Accuracy of Fast Turn

object ExactTurn extends Turn[Double] { def apply( px: Double, py:
Double, qx: Double, qy: Double, rx: Double, ry: Double ): Int = { val pxa = Algebraic(px) val pya = Algebraic(py) val qxa = Algebraic(qx) val qya = Algebraic(qy) val rxa = Algebraic(rx) val rya = Algebraic(ry) ((qxa - pxa) * (rya - pya) – (rxa - pxa) * (qya - pya)).signum } }

10,000x Slower!

Let’s try again…

object FilteredTurn extends Turn[Double] { def apply( px: Double, py:
Double, qx: Double, qy: Double, rx: Double, ry: Double ): Int = { val pxf = FpFilter.exact[Algebraic](px) val pyf = FpFilter.exact[Algebraic](py) val qxf = FpFilter.exact[Algebraic](qx) val qyf = FpFilter.exact[Algebraic](qy) val rxf = FpFilter.exact[Algebraic](rx) val ryf = FpFilter.exact[Algebraic](ry) ((qxf - pxf) * (ryf - pyf) – (rxf - pxf) * (qyf - pyf)).signum } }

FilteredTurn Speed Rela9ve to FastTurn

Polynomial Root Finding

Polynomial[A]

Interval[A] Root

“Quadra2c Interval Reﬁnement for Real Roots.” John AbboT.
QIR for short.

QIR (N=8)

QIR (N=16)

QIR •  Requires 2 polynomial evalua9ons – High precision
generally required •  Very fast convergence (quadra9c) – Under some assump9ons •  Occasionally fails when assump9ons not met – Fallsback to bisec9on!

QIR (N=8)

QIR (N=8) FAILED!

Falls back to bisec9on…

Bisec9on

Bisec9on Requires only sign tests Converges
slowly, 1 bit at a 9me!

trait SignTest[A] { def apply( poly: Polynomial[A], x: A ):
Sign }

final class FilteredSignTest[@sp A: Semiring]( implicit A: IsAlgebraic[A] ) extends
SignTest[A] { def apply(poly: Polynomial[A], x: A): Sign = { val x0 = FpFilter(A.toDouble(x), A.toAlgebraic(x)) @tailrec def loop(acc: FpFilter[Algebraic], i: Int): Sign = if (i >= 0) { val c = poly.nth(i) val cftr = FpFilter(A.toDouble(c), A.toAlgebraic(c)) loop(cftr + acc * x0, i - 1) } else { Sign(acc.signum) } loop(FpFilter.approx(Algebraic.Zero), poly.degree) } }

Accuracy Using Double

Speed Up from Exact Sign Test Fast (d=8)
Fast (d=16) Fast (d=32) Filtered (d=8) Filtered (d=16) Filtered (d=16)

Summary •  Works like any other number type
– Operator fusion + inlining within expressions •  Speeds up predicates – Sign tests, comparisons, etc. •  Near-‐Double performance – 2-‐4x in most cases h;p://github.com/9xxit/fpﬁlter-‐talk

Thanks! h;p://github.com/non/spire Tom Switzer @9xxit

Speed, Correctness, or Simplicity: Choose 3

Speed, Correctness, or Simplicity: Choose 3

Other Decks in Programming

Featured

Transcript