Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Speed, Correctness, or Simplicity: Choose 3
Search
Tom Switzer
January 30, 2015
Programming
1
350
Speed, Correctness, or Simplicity: Choose 3
This talk introduces the floating point filter implementation in Spire (spire.math.FpFilter).
Tom Switzer
January 30, 2015
Tweet
Share
Other Decks in Programming
See All in Programming
Performance for Conversion! 分散トレーシングでボトルネックを 特定せよ
inetand
0
140
CJK and Unicode From a PHP Committer
youkidearitai
PRO
0
110
詳解!defer panic recover のしくみ / Understanding defer, panic, and recover
convto
0
240
CloudflareのChat Agent Starter Kitで簡単!AIチャットボット構築
syumai
2
480
Oracle Database Technology Night 92 Database Connection control FAN-AC
oracle4engineer
PRO
1
440
テストカバレッジ100%を10年続けて得られた学びと品質
mottyzzz
2
590
はじめてのMaterial3 Expressive
ym223
2
270
Cache Me If You Can
ryunen344
2
680
私の後悔をAWS DMSで解決した話
hiramax
4
210
今だからこそ入門する Server-Sent Events (SSE)
nearme_tech
PRO
2
150
Testing Trophyは叫ばない
toms74209200
0
860
請來的 AI Agent 同事們在寫程式時,怎麼用 pytest 去除各種幻想與盲點
keitheis
0
120
Featured
See All Featured
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
53
2.9k
What’s in a name? Adding method to the madness
productmarketing
PRO
23
3.7k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
7
840
jQuery: Nuts, Bolts and Bling
dougneiner
64
7.9k
Principles of Awesome APIs and How to Build Them.
keavy
126
17k
Rails Girls Zürich Keynote
gr2m
95
14k
How to train your dragon (web standard)
notwaldorf
96
6.2k
GraphQLの誤解/rethinking-graphql
sonatard
72
11k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
667
120k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
49
3k
Fireside Chat
paigeccino
39
3.6k
Raft: Consensus for Rubyists
vanstee
140
7.1k
Transcript
Speed, Correctness, or Simplicity: Choose 3 Tom
Switzer @9xxit h;ps://github.com/9xxit/fpfilter-‐talk
Overview Floa9ng point is “good enough”…
most of the 9me.
Op9ons Use Double, live with the errors.
Use higher precision type, live with performance loss. But, there is a 3rd op9on…
Floa9ng Point Filters Use floa9ng point when you
can. Use higher precision when you can’t.
Err… Not So Simple Solve problem using floa9ng point
approxima9on… Maintain an error bound on approxima9on. Re-‐evaluate with exact type if error too large.
The Catch
What is the determinant of my matrix?
Not Good For: Minimizing Errors in Floa9ng Point Arithme9c
What is the sign of the determinant of
my matrix?
Good For: Making a Decision
FpFilter[A] Simple wrapper: FpFilter[Rational] Standard Opera2ons +, -‐,
*, /, .sqrt, etc Fast predictes signum, compare, isWhole, etc.
FpFilter[A] class FpFilter[A]( apx: Double, mes: Double, ind: Int,
exact: => A ) { … } floa9ng point approxima9on error bounds
FpFilter[A] class FpFilter[A]( apx: Double, mes: Double, ind: Int,
exact: => A ) { … } error bounds “Exact Geometric Computa2on Using Cascading.” Burnikel, Funke & Seel.
FpFilter[A] class FpFilter[A]( apx: Double, mes: Double, ind: Int,
exact: => A ) { … } error bounds thunk for higher precision Welcome to …
… Macro City def abs(implicit ev: Signed[A]): FpFilter[A] =
macro FpFilter.absImpl[A] def unary_- (implicit ev: Rng[A]) : FpFilter[A] = macro FpFilter.negateImpl[A] def +(rhs: FpFilter[A])(implicit ev: Semiring[A]): FpFilter[A] = macro FpFilter.plusImpl[A] def -(rhs: FpFilter[A])(implicit ev: Rng[A]): FpFilter[A] = macro FpFilter.minusImpl[A] def *(rhs: FpFilter[A])(implicit ev: Semiring[A]): FpFilter[A] = macro FpFilter.timesImpl[A] def /(rhs: FpFilter[A])(implicit ev: Field[A]): FpFilter[A] = macro FpFilter.divideImpl[A] def sqrt(implicit ev: NRoot[A]): FpFilter[A] = macro FpFilter.sqrtImpl[A] def <(rhs: FpFilter[A])(implicit ev0: Signed[A], ev1: Rng[A]): Boolean = macro FpFilter.ltImpl[A] def >(rhs: FpFilter[A])(implicit ev0: Signed[A], ev1: Rng[A]): Boolean = macro FpFilter.gtImpl[A] def <=(rhs: FpFilter[A])(implicit ev0: Signed[A], ev1: Rng[A]): Boolean = macro FpFilter.ltEqImpl[A] def >=(rhs: FpFilter[A])(implicit ev0: Signed[A], ev1: Rng[A]): Boolean = macro FpFilter.gtEqImpl[A] def ===(rhs: FpFilter[A])(implicit ev0: Signed[A], ev1: Rng[A]): Boolean = macro FpFilter.eqImpl[A] def signum(implicit ev: Signed[A]): Int = macro FpFilter.signImpl[A]
… Macro City • Operator fusion – No intermediate
alloca9ons • In-‐line approxima9on + error bounds – Fast, Double arithme9c • Thunk becomes inner defs – Compile down to private methods
Turn this… (x + y).signum
… into this. val fpf$tmp$macro$38 = x.value; val fpf$apx$macro$39
= fpf$tmp$macro$38; val fpf$mes$macro$40 = java.lang.Math.abs(fpf$tmp$macro$38); def fpf$exact$macro$42 = spire.algebra.Field.apply[spire.math.Algebraic](Algebraic.AlgebraicAlgebra).fromDouble(fpf $tmp$macro$38); val fpf$tmp$macro$43 = y.value; val fpf$apx$macro$44 = fpf$tmp$macro$43; val fpf$mes$macro$45 = java.lang.Math.abs(fpf$tmp$macro$43); def fpf$exact$macro$47 = spire.algebra.Field.apply[Algebraic](Algebraic.AlgebraicAlgebra).fromDouble(fpf$tmp$macro $43); val fpf$apx$macro$48 = fpf$apx$macro$39.+(fpf$apx$macro$44); val fpf$mes$macro$49 = fpf$mes$macro$40.+(fpf$mes$macro$45); def fpf$exact$macro$51 = Algebraic.AlgebraicAlgebra.plus( fpf$exact$macro$42, fpf$exact$macro$47); val fpf$err$macro$52 = fpf$mes$macro$49.$times(1).$times(2.220446049250313E-16); if (fpf$apx$macro$48 > fpf$err$macro$52 && fpf$apx$macro$48 < Double.POSITIVE_INFINITY) 1 else if (fpf$apx$macro$48 < fpf$err$macro$52.unary_$minus && fpf$apx$macro$48 > Double.NEGATIVE_INFINITY) -1 else if (fpf$err$macro$52 == 0.0) 0 else Algebraic.AlgebraicAlgebra.signum(fpf$exact$macro$51)
Examples
2D Orienta2on
p q r
p q r
p q r RIGHT
p r q
p r q LEFT
p r q
p r q NO TURN
trait Turn[@spec A] { def apply( px: A, py: A,
qx: A, qy: A, rx: A, ry: A ): Int }
object FastTurn extends Turn[Double] { def apply( px: Double, py:
Double, qx: Double, qy: Double, rx: Double, ry: Double ): Int = signum { (qx - px) * (ry - py) - (rx - px) * (qy - py) } }
Accuracy of Fast Turn
object ExactTurn extends Turn[Double] { def apply( px: Double, py:
Double, qx: Double, qy: Double, rx: Double, ry: Double ): Int = { val pxa = Algebraic(px) val pya = Algebraic(py) val qxa = Algebraic(qx) val qya = Algebraic(qy) val rxa = Algebraic(rx) val rya = Algebraic(ry) ((qxa - pxa) * (rya - pya) – (rxa - pxa) * (qya - pya)).signum } }
10,000x Slower!
Let’s try again…
object FilteredTurn extends Turn[Double] { def apply( px: Double, py:
Double, qx: Double, qy: Double, rx: Double, ry: Double ): Int = { val pxf = FpFilter.exact[Algebraic](px) val pyf = FpFilter.exact[Algebraic](py) val qxf = FpFilter.exact[Algebraic](qx) val qyf = FpFilter.exact[Algebraic](qy) val rxf = FpFilter.exact[Algebraic](rx) val ryf = FpFilter.exact[Algebraic](ry) ((qxf - pxf) * (ryf - pyf) – (rxf - pxf) * (qyf - pyf)).signum } }
FilteredTurn Speed Rela9ve to FastTurn
Polynomial Root Finding
Polynomial[A]
Interval[A] Root
“Quadra2c Interval Refinement for Real Roots.” John AbboT.
QIR for short.
QIR (N=8)
QIR (N=8)
QIR (N=8)
QIR (N=8)
QIR (N=8)
QIR (N=8)
QIR (N=8)
QIR (N=8)
QIR (N=16)
QIR (N=16)
QIR (N=16)
QIR • Requires 2 polynomial evalua9ons – High precision
generally required • Very fast convergence (quadra9c) – Under some assump9ons • Occasionally fails when assump9ons not met – Fallsback to bisec9on!
QIR (N=8)
QIR (N=8)
QIR (N=8) FAILED!
Falls back to bisec9on…
Bisec9on
Bisec9on
Bisec9on
Bisec9on
Bisec9on
Bisec9on
Bisec9on
Bisec9on Requires only sign tests Converges
slowly, 1 bit at a 9me!
trait SignTest[A] { def apply( poly: Polynomial[A], x: A ):
Sign }
final class FilteredSignTest[@sp A: Semiring]( implicit A: IsAlgebraic[A] ) extends
SignTest[A] { def apply(poly: Polynomial[A], x: A): Sign = { val x0 = FpFilter(A.toDouble(x), A.toAlgebraic(x)) @tailrec def loop(acc: FpFilter[Algebraic], i: Int): Sign = if (i >= 0) { val c = poly.nth(i) val cftr = FpFilter(A.toDouble(c), A.toAlgebraic(c)) loop(cftr + acc * x0, i - 1) } else { Sign(acc.signum) } loop(FpFilter.approx(Algebraic.Zero), poly.degree) } }
Accuracy Using Double
Speed Up from Exact Sign Test Fast (d=8)
Fast (d=16) Fast (d=32) Filtered (d=8) Filtered (d=16) Filtered (d=16)
Summary • Works like any other number type
– Operator fusion + inlining within expressions • Speeds up predicates – Sign tests, comparisons, etc. • Near-‐Double performance – 2-‐4x in most cases h;p://github.com/9xxit/fpfilter-‐talk
Thanks! h;p://github.com/non/spire Tom Switzer @9xxit