Floating Point 101
kida
February 06, 2013
Programming
A very very basic introduction to FP.
With some inaccuracies.
Transcript
FLOATING POINT 101
FLOATING POINT 100.999998
we are engineers
we are researchers
WE PLAY WITH NUMBERS ALL DAY LONG (3.14159265358979 3238462643383279 5028841971693993 7510582097494459 2307816406286208)
well, sometimes even at night. (yawn).
So, what is a floating point?
A floating point number is $\pm\, D_1.D_2 D_3 \cdots D_n \times B^{e}$, where $\pm$ is the sign, $D_1.D_2 D_3 \cdots D_n$ is the significand, $B$ is the base, and $e$ is the exponent.
A floating point number represents $\pm\,(D_1 + D_2 B^{-1} + D_3 B^{-2} + \dots + D_n B^{-(n-1)}) \times B^{e}$
For example: $+\,3.14 \times 10^{0} = (3 + 1 \times 0.1 + 4 \times 0.01) \times 1 = 3.14$
The point can float! $+\,3.14 \times 10^{-1} = 0.314$
The point can float! $+\,3.14 \times 10^{+1} = 31.4$
What if $B = 2$? $+\,1.00 \times 2^{+2} = 4.0$
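The definition above can be checked mechanically. What follows is a minimal sketch, not taken from the slides: the function name `fp_value` and its parameters are made up for illustration, and it simply evaluates the sum of digits times powers of the base.

```c
#include <stddef.h>

/* Value of +/- D1.D2...Dn x B^e, computed digit by digit.
   `sign` is +1 or -1 and `digits` holds D1..Dn.
   All names here are illustrative, not from the original deck. */
double fp_value(int sign, const int *digits, size_t n, int base, int exponent)
{
    double significand = 0.0;
    double weight = 1.0;              /* B^0 for D1, then B^-1, B^-2, ... */
    for (size_t i = 0; i < n; ++i) {
        significand += digits[i] * weight;
        weight /= base;
    }

    double scale = 1.0;               /* B^e, built up without calling pow() */
    for (int k = 0; k < exponent; ++k) scale *= base;
    for (int k = 0; k > exponent; --k) scale /= base;

    return sign * significand * scale;
}
```

Calling it with digits {3, 1, 4}, base 10 and exponents 0, -1, +1 reproduces the three decimal examples (up to rounding in `double`), and digits {1, 0, 0}, base 2, exponent 2 gives 4.0.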
Like machines do. http://grouper.ieee.org/groups/754/
Normalization of floating point
Multiple representations: $+\,0.01 \times 2^{2} = 1.0$, $+\,0.10 \times 2^{1} = 1.0$, $+\,1.00 \times 2^{0} = 1.0$
Normalized representation: of these, only $+\,1.00 \times 2^{0}$ has a nonzero digit before the point.
Normalized representation: $+\,(1.)000 \times 2^{0}$. In base 2 the leading digit of a normalized number is always 1, so it is omitted, and there's room for an extra digit!
Excess-127 representation (true exponent → stored value): -127 → 0, -126 → +1, …, -1 → +126, 0 → +127
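To see the omitted leading 1 and the excess-127 exponent in an actual `float`, one can pull the bit fields apart. This is only a sketch: it assumes `float` is an IEEE-754 binary32 value (true on most current machines, but not guaranteed by C), and the struct and function names are invented for the example.

```c
#include <stdint.h>
#include <string.h>

/* Fields of a single precision number (assumes float is IEEE-754 binary32). */
struct f32_fields {
    unsigned sign;        /* 1 bit */
    unsigned biased_exp;  /* 8 bits, stored in excess-127 */
    uint32_t fraction;    /* 23 bits, the leading 1 is not stored */
};

struct f32_fields f32_decode(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* well-defined way to read the bit pattern */

    struct f32_fields f;
    f.sign       = bits >> 31;
    f.biased_exp = (bits >> 23) & 0xFFu;
    f.fraction   = bits & 0x7FFFFFu;
    return f;
}

/* True exponent of a normalized number: stored value minus the bias. */
int f32_true_exponent(float x)
{
    return (int)f32_decode(x).biased_exp - 127;
}
```

For 1.0f the stored exponent is 127 and the fraction is all zeros, which matches $+\,(1.)000 \times 2^{0}$.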
#include <float.h>: FLT_MIN, FLT_MAX, ...
#include <math.h>: M_PI, M_E, NAN, INFINITY, ...
Why no exact representation for 0.1?
FLOATING POINT is used to represent REAL NUMBERS
FLOATING POINT denotes a (finite) subset of RATIONAL NUMBERS
0.1 cannot be expressed as a finite sum of powers of 2: $+\,??? \times 2^{??}$
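One way to convince yourself is to write 1/10 in base 2 by long division and watch the digits repeat. The helper below is a hypothetical illustration (the name `binary_fraction` is not from the slides) and uses integer arithmetic only, so no rounding is involved.

```c
#include <stddef.h>

/* Writes the first `nbits` binary digits after the point of num/den
   (with 0 <= num < den) into `out` as '0'/'1' characters, NUL-terminated.
   `out` must have room for nbits + 1 characters. */
void binary_fraction(unsigned num, unsigned den, char *out, size_t nbits)
{
    for (size_t i = 0; i < nbits; ++i) {
        num *= 2;                 /* shift one binary place to the left */
        if (num >= den) {
            out[i] = '1';
            num -= den;
        } else {
            out[i] = '0';
        }
    }
    out[nbits] = '\0';
}
```

For 1/10 the digits start 0001 1001 1001 1001 and the group 1001 keeps repeating, so any finite significand has to cut the expansion somewhere. For 1/4 the digits are 0100 followed by zeros, which is why 0.25 is exact.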
It's also a matter of precision. With two fraction bits and exponent 0, the representable values are:
$+\,(1.)00 \times 2^{0} = 1$
$+\,(1.)01 \times 2^{0} = 1.25$
$+\,(1.)10 \times 2^{0} = 1.5$
$+\,(1.)11 \times 2^{0} = 1.75$
$\pi/2 \approx 1.5708$ falls between 1.5 and 1.75, so it cannot be represented.
Not just a matter of precision or basis... With exponent 1, the same bit patterns give:
$+\,(1.)00 \times 2^{1} = 2.0$
$+\,(1.)01 \times 2^{1} = 2.5$
$+\,(1.)10 \times 2^{1} = 3.0$
The spacing between neighbouring values doubles, from 0.25 to 0.5.
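The same effect shows up in a real `float`, only with 23 fraction bits instead of 2. The sketch below assumes `float` is IEEE-754 binary32 and a positive, finite argument; `next_up` and `spacing` are names invented here (the standard library offers `nextafterf` in `<math.h>` for the general case).

```c
#include <stdint.h>
#include <string.h>

/* Next representable float above x, for positive finite x.
   Assumes float is IEEE-754 binary32. */
float next_up(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits += 1;                       /* adjacent bit pattern = adjacent value */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Gap between x and the next representable float above it. */
float spacing(float x)
{
    return next_up(x) - x;
}
```

`spacing(1.0f)` equals `FLT_EPSILON` (2^-23), `spacing(2.0f)` is twice that, and the gap keeps doubling with every power of two.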
Like death and taxes, rounding errors are a fact of life. http://wiki.octave.org/FAQ
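Operands that differ greatly: $+\,110 \times 2^{1}$ and $+\,100 \times 2^{-2}$
Operands that differ greatly, aligned to the same exponent: $+\,110000 \times 2^{1}$ and $+\,000101 \times 2^{1}$
Operands that differ greatly: $+\,110000 \times 2^{1} + 000101 \times 2^{1} = 110$ (rounded back to three digits, the contribution of the smaller operand is lost)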
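In single precision this is easy to reproduce with 2^24 = 16777216, where the spacing between floats is already 2. A small sketch, with made-up function names and assuming IEEE-754 binary32 with the default round-to-nearest mode:

```c
/* Adds two floats and forces the result to be rounded to float precision. */
float add_f32(float a, float b)
{
    volatile float sum = a + b;   /* volatile: keeps the compiler from holding extra precision */
    return sum;
}

/* Returns 1 when adding `small` to `big` leaves `big` unchanged. */
int is_absorbed(float big, float small)
{
    return add_f32(big, small) == big;
}
```

`is_absorbed(16777216.0f, 1.0f)` is true: the 1 is shifted out when the exponents are aligned. Adding 2.0f instead works, because 16777218 is representable.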
Operands that are really close: $+\,111 \times 2^{1} - 110 \times 2^{1} = 001 \times 2^{1}$
Operands that are really close: $+\,111 \times 2^{1} - 110 \times 2^{1} = 100 \times 2^{-2}$ (after normalization)
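The subtraction itself is harmless; the trouble is that after normalization the few digits left are the ones already damaged by earlier rounding. A minimal sketch of that effect, assuming IEEE-754 binary32 and round-to-nearest (the function name is hypothetical):

```c
/* Computes (1 + x) - 1 in single precision. Mathematically the result is x. */
float one_plus_minus_one(float x)
{
    volatile float sum = 1.0f + x;     /* x is rounded to the grid around 1.0 here */
    volatile float diff = sum - 1.0f;  /* exact, but it exposes the earlier rounding */
    return diff;
}
```

With x = 1e-7f the result is 2^-23 (about 1.19e-7) rather than 1e-7: a relative error of roughly 19% from two operations that each look innocent.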
Fixed point representation: $+\,100.001010_2 = 2^{2} + 2^{-3} + 2^{-5} = 4.15625$
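In code, a fixed point number is usually just an integer with an agreed number of fraction bits. The sketch below is one possible layout, not something shown on the slides; `fixed_to_double` and `fixed_add` are illustrative names.

```c
#include <stdint.h>

/* A fixed point number stored as an integer with `frac_bits` binary digits
   after the point (names are illustrative). */
double fixed_to_double(int32_t raw, unsigned frac_bits)
{
    return (double)raw / (double)(1u << frac_bits);
}

/* Addition needs no alignment: both operands share the same scale. */
int32_t fixed_add(int32_t a, int32_t b)
{
    return a + b;
}
```

The bit string 100001010 is the integer 266 (0x10A); with 6 fraction bits, `fixed_to_double(0x10A, 6)` gives 266 / 64 = 4.15625, the value from the slide.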
WHAT'S THE POINT WITH FLOATING POINT?
FP ARITHMETIC IS FAST. Embedded in HW.
FP REPRESENTS A WIDE RANGE. Single precision up to ~$10^{+38}$.
HE APPROVES FP
Anyway, errors are still there.
Okay, what about: increasing the number of digits; using decimal representations; estimating errors; thinking before you type
More digits, please! double (52 fraction bits); long double (up to 112 fraction bits, depending on the platform); arbitrary precision*. (* language support needed)
Use decimal representations! decimal (C# only); BigDecimal (Java only); std::decimal (C++, coming soon)*. (* after IEEE-754 2008)
Estimate the error of your algo: rel_err = fabs(f - fp) / fabs(f), where f is the exact value and fp is its floating point approximation
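As a function, the estimate could look like the sketch below. The parameter names are chosen here for readability, and the caller is assumed to know a trustworthy reference value (for instance one computed in higher precision).

```c
#include <math.h>

/* Relative error of `approx` with respect to `exact`.
   `exact` must be nonzero; near zero, use the absolute error instead. */
double rel_err(double exact, double approx)
{
    return fabs(exact - approx) / fabs(exact);
}
```

For example, `rel_err(0.1, (double)0.1f)` measures how far the single precision 0.1 is from the double precision one; it comes out well below `FLT_EPSILON`, as expected for a correctly rounded value.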
Use float to represent time: float time = 0.0f; while (true) time += 0.20f;
This is BAD. And you should feel BAD.
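A common alternative, sketched here as an assumption rather than as the deck's own recommendation, is to count ticks in an integer and convert to seconds only when a value is needed. Both function names are made up for the comparison.

```c
/* Drifting version: adds the step `ticks` times, rounding at every addition. */
float time_accumulated(unsigned long ticks, float step)
{
    volatile float t = 0.0f;      /* volatile: round to float after each step */
    for (unsigned long i = 0; i < ticks; ++i)
        t += step;
    return t;
}

/* Steadier version: count ticks exactly, convert to seconds only when needed. */
double time_from_ticks(unsigned long ticks, double step)
{
    return (double)ticks * step;  /* a single rounding instead of one per tick */
}
```

After one million ticks of 0.20 the second version is within rounding of 200000, while the accumulated float has drifted away by far more than a second, because 0.20 no longer fits the coarse grid of floats once the running total is large.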
Compare float numbers, first attempt: (a == b)
Compare float numbers, absolute tolerance: fabs(a - b) <= FLT_EPSILON
Compare float numbers, relative tolerance: fabs(a - b) <= max(fabs(a), fabs(b)) * pc
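The two tolerance checks can be written as small helpers. This is a sketch under assumptions: the names `equal_abs` and `equal_rel` are invented, and `rel_tol` stands in for the slide's `pc`, whose intended value the slides do not give.

```c
#include <float.h>
#include <math.h>

/* Absolute tolerance, as on the slide. FLT_EPSILON is the spacing of floats
   at 1.0, so this check is only meaningful for values of about that size. */
int equal_abs(float a, float b)
{
    return fabsf(a - b) <= FLT_EPSILON;
}

/* Relative tolerance: scales the allowed gap with the larger magnitude.
   `rel_tol` plays the role of `pc` and must be chosen by the caller. */
int equal_rel(float a, float b, float rel_tol)
{
    float fa = fabsf(a), fb = fabsf(b);
    float largest = (fa > fb) ? fa : fb;
    return fabsf(a - b) <= largest * rel_tol;
}
```

Two adjacent floats near one million (1000000.0f and 1000000.0625f) fail the absolute check, since they are 0.0625 apart, but pass the relative one with `rel_tol = 1e-6f`. Neither check suits every situation, for instance values very close to zero.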
There is no silver bullet.
Use libraries (when available).
Vector addition (naive): float t[SIZE]; float result = 0.0f; for (int i = 0; i < SIZE; ++i) result += t[i];
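For comparison, here is the naive loop next to Kahan (compensated) summation. Kahan summation is not mentioned in the slides; it is shown only as one well-known way a library might reduce the error, and the sketch assumes IEEE-754 single precision arithmetic without `-ffast-math`.

```c
#include <stddef.h>

/* Naive summation: every addition can drop low-order bits of t[i]. */
float sum_naive(const float *t, size_t n)
{
    float result = 0.0f;
    for (size_t i = 0; i < n; ++i)
        result += t[i];
    return result;
}

/* Kahan (compensated) summation: carries the lost low-order bits along.
   Assumes the compiler does not reassociate floating point (no -ffast-math). */
float sum_kahan(const float *t, size_t n)
{
    float sum = 0.0f;
    float c = 0.0f;                 /* running compensation */
    for (size_t i = 0; i < n; ++i) {
        float y = t[i] - c;         /* apply the correction to the next term */
        float s = sum + y;          /* low-order bits of y may be lost here */
        c = (s - sum) - y;          /* recover what was lost (negated) */
        sum = s;
    }
    return sum;
}
```

On the array {16777216.0f, 1.0f, 1.0f} the naive loop returns 16777216, because each 1 is absorbed, while the compensated loop returns 16777218.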
GNU GSL TO THE RESCUE
that's all folks!
@lorisfichera – https://kid-a.github.com
References and source code available at https://github.com/kid-a/floating-point-seminar
Credits: Font: Yanone Kaffeesatz (http://www.yanone.de/typedesign/kaffeesatz/)