Floating Point 101
kida
February 06, 2013
Programming
A very very basic introduction to FP.
With some inaccuracies.
Transcript
FLOATING POINT 101
FLOATING POINT 100.999998
we are engineers
we are researchers
3.141592653589793238462643383279502884197169399375105820974944592307816406286208
NUMBERS WE PLAY WITH ALL DAY LONG
well, sometimes even at night. (yawn).
So, what is a floating point?
A floating point is $\pm\, D_1.D_2 D_3 \cdots D_n \times B^{e}$
sign: $\pm$
significand: $D_1.D_2 D_3 \cdots D_n$
base: $B$
exponent: $e$
A floating point represents $\pm\,(D_1 + D_2 B^{-1} + D_3 B^{-2} + \dots + D_n B^{-(n-1)}) \times B^{e}$
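The expansion above can be evaluated directly. The following is a minimal sketch, not code from the slides: `fp_value` and its parameter names are made up for illustration, and the digits are passed most significant first.

```c
#include <stddef.h>

/* Evaluate sign * (D1 + D2*B^-1 + ... + Dn*B^-(n-1)) * B^e.
   fp_value and its parameters are illustrative names, not from the slides. */
double fp_value(int sign, const int *digits, size_t n, int base, int exponent)
{
    double significand = 0.0;
    double weight = 1.0;                 /* B^0 for D1, then B^-1, B^-2, ... */
    for (size_t i = 0; i < n; ++i) {
        significand += digits[i] * weight;
        weight /= base;
    }

    double scale = 1.0;                  /* B^e by repeated multiply or divide */
    for (int k = 0; k < exponent; ++k)
        scale *= base;
    for (int k = 0; k > exponent; --k)
        scale /= base;

    return sign * significand * scale;
}
```

Calling it with digits {3, 1, 4}, base 10 and exponent 0 gives a value very close to 3.14, while digits {1, 0, 0}, base 2 and exponent 2 give exactly 4.0.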
For example: $+3.14 \times 10^{0} = (3 + 1 \times 0.1 + 4 \times 0.01) \times 1 = 3.14$
The point can float! $+3.14 \times 10^{-1} = 0.314$
The point can float! $+3.14 \times 10^{+1} = 31.4$
What if B = 2? $+1.00 \times 2^{+2} = 4.0$
Like machines do. http://grouper.ieee.org/groups/754/
Normalization of floating point
Multiple representations:
$+0.01 \times 2^{2} = 1.0$
$+0.10 \times 2^{1} = 1.0$
$+1.00 \times 2^{0} = 1.0$
Normalized representation: $+1.00 \times 2^{0} = 1.0$
Normalized representation: $+(1.)000 \times 2^{0}$
The leading 1 is omitted, so there's room for an extra digit!
Excess-127 representation (actual exponent → stored value):
-127 → 0
-126 → +1
…
-1 → +126
0 → +127
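The omitted leading 1 and the excess-127 exponent can both be seen by pulling a `float` apart. This is a sketch that assumes `float` is an IEEE 754 single-precision value (1 sign bit, 8 exponent bits, 23 fraction bits) and only interprets normalized numbers; the struct and function names are invented for the example.

```c
#include <stdint.h>
#include <string.h>

/* Fields of a normalized single-precision value.
   Names are illustrative; the layout assumes IEEE 754 binary32. */
struct float_fields {
    unsigned sign;       /* 1 bit: 0 for +, 1 for - */
    int      exponent;   /* actual exponent: stored value minus 127 */
    uint32_t fraction;   /* 23 stored bits; the leading 1 is not stored */
};

struct float_fields decode_float(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);      /* well-defined way to read the bytes */

    struct float_fields f;
    f.sign     = bits >> 31;
    f.exponent = (int)((bits >> 23) & 0xFFu) - 127;   /* undo the excess-127 bias */
    f.fraction = bits & 0x7FFFFFu;
    return f;
}
```

For 1.0f this yields sign 0, exponent 0 and an all-zero fraction, which matches the normalized $+(1.)000 \times 2^{0}$ above.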
#include <float.h>: FLT_MIN, FLT_MAX, ...
#include <math.h>: M_PI, M_E, NAN, INFINITY, ... (M_PI and M_E are POSIX extensions, not ISO C)
Why no exact representation for 0.1?
FLOATING POINT is used to represent REAL NUMBERS
FLOATING POINT denotes a (finite) subset of RATIONAL NUMBERS
0.1 cannot be expressed as a finite sum of powers of 2: $+\,{???} \times 2^{??}$
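One way to see this on a real machine is to look at what actually gets stored for 0.1f. The sketch below assumes IEEE 754 single precision; `float_bits` and `stored_value` are hypothetical helper names.

```c
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a float (assumes a 32-bit IEEE 754 float). */
uint32_t float_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Widening float to double is exact, so this exposes the value
   that was really stored rather than the decimal literal typed. */
double stored_value(float x)
{
    return (double)x;
}
```

On such a machine `float_bits(0.1f)` is 0x3DCCCCCD: the fraction is the repeating pattern 1100 1100 ... cut off and rounded in the last place. Printing `stored_value(0.1f)` with `%.17g` shows a value slightly above 0.1, whereas 0.5f, a power of 2, is stored exactly.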
It's also a matter of precision. With two fraction bits and exponent 0, the representable values are:
$+\,00 \times 2^{0}$ → 1
$+\,01 \times 2^{0}$ → 1.25
$+\,10 \times 2^{0}$ → 1.5
$+\,11 \times 2^{0}$ → 1.75
π/2 ≈ 1.5708 falls between 1.5 and 1.75, so it has no exact representation here.
Not just a matter of precision or base... With exponent 1 the same fraction bits are spread twice as far apart:
$+\,00 \times 2^{1}$ → 2.0
$+\,01 \times 2^{1}$ → 2.5
$+\,10 \times 2^{1}$ → 3.0
Representable values so far: 1, 1.25, 1.5, 1.75, 2.0, 2.5, 3.0
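The same widening of the gaps happens in a real `float`. The sketch below measures the distance from a value to the next representable one by incrementing its bit pattern; it assumes IEEE 754 single precision and a positive, finite, normalized argument, and `gap_above` is a made-up name. The standard library function `nextafterf` from `<math.h>` does the same job more generally.

```c
#include <stdint.h>
#include <string.h>

/* Distance from x to the next larger float.
   Assumes x is positive, finite and normalized (IEEE 754 binary32). */
float gap_above(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits += 1;                           /* next bit pattern = next larger value */

    float next;
    memcpy(&next, &bits, sizeof next);
    return next - x;
}
```

The gap at 1.0f is 2^-23 (the value of `FLT_EPSILON`), the gap at 2.0f is twice that, and it keeps doubling with every increase of the exponent, just like 0.25 becoming 0.5 in the toy format above.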
"Like death and taxes, rounding errors are a fact of life." (http://wiki.octave.org/FAQ)
Operands that differ greatly:
$+\,110 \times 2^{1}$ plus $+\,100 \times 2^{-2}$
After aligning the exponents: $+\,110000 \times 2^{1}$ plus $+\,000101 \times 2^{1}$
Result: $= 110$
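The effect is easy to reproduce in single precision: once the larger operand is big enough, the smaller one falls entirely below its last digit. A minimal sketch, assuming IEEE 754 `float` arithmetic with round-to-nearest; `is_absorbed` is an invented name.

```c
/* Returns 1 when adding small to big leaves big unchanged. */
int is_absorbed(float big, float small)
{
    volatile float sum = big + small;    /* volatile forces rounding to float */
    return sum == big;
}
```

With big = 16777216.0f (2^24) the spacing between neighbouring floats is 2, so adding 1.0f is absorbed while adding 2.0f is not.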
Operands that are really close:
$+\,111 \times 2^{1} - 110 \times 2^{1} = 001 \times 2^{1}$
After normalization: $= 100 \times 2^{-2}$
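Subtracting close operands leaves only the last, least trustworthy digits. The sketch below, which assumes IEEE 754 single precision, subtracts 1.0f from the literal 1.0000001f; the function name is hypothetical and the intended difference of 1e-7 is just the decimal value a reader would expect.

```c
/* Relative error of (1.0000001f - 1.0f) against the intended 1e-7. */
double cancellation_rel_err(void)
{
    float a = 1.0000001f;      /* nearest float is 1 + 2^-23, about 1.00000012 */
    float b = 1.0f;
    float diff = a - b;        /* the subtraction itself is exact: 2^-23 */

    double intended = 1e-7;
    return ((double)diff - intended) / intended;
}
```

The subtraction introduces no error of its own, yet the result is off by roughly 19%: the tiny rounding error made when storing `a` is all that is left once the leading digits cancel.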
Fixed point representation: $+100.001010_2 = 2^{2} + 2^{-3} + 2^{-5} = 4.15625$
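In code, a fixed point number is just an integer whose unit is a fixed power of 2. This sketch stores the slide's value with six fractional bits; the helper names and the choice of an unsigned 32-bit container are assumptions made for the example.

```c
#include <stdint.h>

/* Unsigned fixed point: raw counts units of 2^-frac_bits. */
double fixed_to_double(uint32_t raw, unsigned frac_bits)
{
    return (double)raw / (double)(1u << frac_bits);
}

/* Addition is plain integer addition when both operands use the
   same number of fractional bits (overflow is not handled here). */
uint32_t fixed_add(uint32_t a, uint32_t b)
{
    return a + b;
}
```

The bit string 100.001010 with the point removed is 0x10A (266), and `fixed_to_double(0x10A, 6)` gives 266 / 64 = 4.15625. The point never moves, so the spacing between values is the same everywhere.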
WHAT'S THE POINT WITH FLOATING
FP ARITHMETIC IS FAST. Embedded in HW.
FP REPRESENTS A WIDE RANGE. Single precision goes up to ~$10^{+38}$.
HE APPROVES FP
Anyway, the errors are still there.
Okay, what about:
increasing the number of digits
using decimal representations
estimating errors
thinking before you type
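More digits, please!
double (52 fraction bits, 53 significant bits counting the implicit leading 1)
long double (platform dependent; 112 fraction bits where it is IEEE 754 quadruple precision)
arbitrary precision*
* language support needed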
Use decimal representations!
decimal (C# only)
BigDecimal (Java only)
std::decimal (C++, coming soon)*
* after IEEE-754 2008
Estimate the error of your algo: rel_err = fabs(f - fp) / fabs(f)
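As a function, the estimate might look like the sketch below. It takes the reference value `f` and the computed value `fp` as on the slide, divides by the magnitude of `f` so that negative references work, and leaves the case f = 0 to the caller.

```c
#include <math.h>

/* Relative error of the approximation fp with respect to the
   reference value f. The caller must ensure f != 0. */
double rel_err(double f, double fp)
{
    return fabs(f - fp) / fabs(f);
}
```

For instance, `rel_err(0.1, (double)0.1f)` measures how far the single-precision 0.1f is from the double-precision 0.1; it is positive but below 1e-7.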
Use float to represent time
    float time = 0.0f;
    while (true)
        time += 0.20;
This is BAD. And you should feel BAD.
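Why it is bad shows up after enough iterations: every addition rounds, and as the total grows the rounding per step gets coarser. The sketch below compares the accumulated value with one possible alternative, counting integer ticks and converting once. Both function names are invented, and the tick-based approach is a suggestion, not something the slides prescribe.

```c
/* Accumulates time by repeated float addition, as on the slide. */
float time_by_accumulation(long steps)
{
    float time = 0.0f;
    for (long i = 0; i < steps; ++i)
        time += 0.20f;               /* one rounding error per step */
    return time;
}

/* Alternative sketch: keep an exact integer tick count, convert once. */
double time_from_ticks(long ticks)
{
    return (double)ticks * 0.20;     /* a single rounding in total */
}
```

After one million steps the exact answer is 200000, and the tick-based version is within a tiny fraction of that. The accumulated float is off by far more than one whole unit on an IEEE 754 single-precision machine.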
Compare float numbers
(a == b): exact equality, which rounding errors usually break
fabs(a - b) <= FLT_EPSILON: absolute tolerance, only meaningful for values close to 1
fabs(a - b) <= max(fabs(a), fabs(b)) * pc: relative tolerance, scaled by the larger operand
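The relative comparison can be wrapped in a small helper. This is a sketch only: `nearly_equal` is a made-up name, `rel_tol` stands in for the `pc` factor on the slide, and picking its value (a small multiple of `FLT_EPSILON` is a common starting point) depends on the computation that produced the numbers.

```c
#include <math.h>

/* Relative comparison: the tolerance scales with the larger magnitude.
   rel_tol plays the role of "pc" on the slide. */
int nearly_equal(float a, float b, float rel_tol)
{
    float diff = fabsf(a - b);
    float largest = fabsf(a) > fabsf(b) ? fabsf(a) : fabsf(b);
    return diff <= largest * rel_tol;
}
```

Around one million, neighbouring floats are 0.0625 apart, so 1000000.0f and 1000000.0625f differ by a single step. An absolute test against `FLT_EPSILON` calls them different, while `nearly_equal` with `rel_tol = 4 * FLT_EPSILON` accepts them. Values near zero still need separate treatment, in line with the next slide.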
There is no silver bullet.
Use libraries (when available).
Vector addition (naive)
    float t[SIZE];
    float result = 0.0f;
    for (int i = 0; i < SIZE; ++i)
        result += t[i];
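One well-known way to improve on the naive loop, shown here as a sketch and not as the method the slides or GSL use, is Kahan (compensated) summation. It keeps a second variable holding the low-order bits lost by each addition and feeds them back into the next one. It relies on strict floating point semantics, so it should not be compiled with options such as `-ffast-math`.

```c
#include <stddef.h>

/* Kahan (compensated) summation of n floats. */
float kahan_sum(const float *t, size_t n)
{
    float sum = 0.0f;
    float c = 0.0f;               /* compensation: error carried from earlier steps */
    for (size_t i = 0; i < n; ++i) {
        float y = t[i] - c;       /* apply the correction to the next term */
        float s = sum + y;        /* low-order bits of y may be lost here */
        c = (s - sum) - y;        /* what was actually added minus what should have been */
        sum = s;
    }
    return sum;
}
```

For the array {16777216, 1, 1, 1, 1} the naive loop returns 16777216 because each 1 is absorbed, while `kahan_sum` returns the exact total 16777220.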
GNU GSL TO THE RESCUE
that's all folks!
@lorisfichera, https://kid-a.github.com
References and source code available at https://github.com/kid-a/floating-point-seminar
Credits. Font: Yanone Kaffeesatz (http://www.yanone.de/typedesign/kaffeesatz/)