Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Floating Point 101
Search
kida
February 06, 2013
Programming
7
310
Floating Point 101
A very very basic introduction to FP.
With some inaccuracies.
kida
February 06, 2013
Tweet
Share
More Decks by kida
See All by kida
Cognitive Supervision for Laser Phonomicrosurgery
kida
0
42
Towards Cognitive Supervision in robot-assisted surgery
kida
0
190
Other Decks in Programming
See All in Programming
技術懸念に立ち向かい 法改正を穏便に乗り切った話
pop_cashew
0
1.3k
生成AIで日々のエラー調査を進めたい
yuyaabo
0
510
XSLTで作るBrainfuck処理系
makki_d
0
190
コードに語らせよう――自己ドキュメント化が内包する楽しさについて / Let the Code Speak
nrslib
6
1.4k
実践ArchUnit ~実例による検証パターンの紹介~
ogiwarat
2
240
イベントストーミングから始めるドメイン駆動設計
jgeem
4
800
Zennの運営完全に理解した #完全に理解したTalk
wadayusuke
1
180
Webからモバイルへ Vue.js × Capacitor 活用事例
naokihaba
0
500
KotlinConf 2025 現地で感じたServer-Side Kotlin
n_takehata
1
180
AIエージェントによるテストフレームワーク Arbigent
takahirom
0
370
Cline指示通りに動かない? AI小説エージェントで学ぶ指示書の書き方と自動アップデートの仕組み
kamomeashizawa
1
360
FormFlow - Build Stunning Multistep Forms
yceruto
1
150
Featured
See All Featured
It's Worth the Effort
3n
184
28k
How to train your dragon (web standard)
notwaldorf
92
6.1k
Making Projects Easy
brettharned
116
6.2k
Visualization
eitanlees
146
16k
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
46
9.6k
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
42
2.4k
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
357
30k
What’s in a name? Adding method to the madness
productmarketing
PRO
22
3.5k
Scaling GitHub
holman
459
140k
Documentation Writing (for coders)
carmenintech
71
4.9k
The Art of Programming - Codeland 2020
erikaheidi
54
13k
Unsuck your backbone
ammeep
671
58k
Transcript
FLOATING 101 POINT
FLOATING 100.999998 POINT
engineers we are
researchers we are
3.14159265358979 3238462643383279 5028841971693993 7510582097494459 2307816406286208 NUMBERS WE PLAY WITH ALL
DAY LONG
well, sometimes even at night. (yawn).
So, what is a floating point?
A floating point is ± D 1 .D 2 D
3 ···D n x Be
A floating point is sign ± D 1 .D 2
D 3 ···D n x Be
A floating point is significand ± D 1 .D 2
D 3 ···D n x Be
A floating point is base ± D 1 .D 2
D 3 ···D n x Be
A floating point is exponent ± D 1 .D 2
D 3 ···D n x Be
A floating point represents ± (D 1 + D 2
* B-1 + D 3 * B-2 + … + D n * B(n-1)) * Be
For example + 3.14 x 100 = (3 + 1*0.1
+ 4*0.01)*1 = 3.14
The point can float ! + 3.14 x 10-1 =
0.314
The point can float ! + 3.14 x 10+1 =
31.4
What if B = 2 ? + 1.00 x 2+2
= 4.0
Like machines do. http://grouper.ieee.org/groups/754/
Normalization of floating point
Multiple representations + 0.01 x 22 = 1.0 + 0.10
x 21 = 1.0 + 1.00 x 20 = 1.0
Normalized representation + 0.01 x 22 = 1.0 + 0.10
x 21 = 1.0 + 1.00 x 20 = 1.0
Normalized representation + (1.)000 x 20 1 is omitted
Normalized representation + (1.)000 x 20 there's room for an
extra digit!
Excess-127 representation -127 → 0 -126 → +1 … -1
→ +126 0 → +127
#include <float.h> FLT_MIN, FLT_MAX, ... #include <math.h> M_PI, M_E, NAN,
INFINITY, ...
Why no exact representation for 0.1?
FLOATING POINT REAL NUMBERS is used to represent
FLOATING POINT RATIONAL NUMBERS denotes a (finite) subset of
0.1 cannot be expressed as a power of 2 +
??? x 2??
+ 00 x 20 1 It's also a matter of
precision
+ 01 x 20 1 1.25 It's also a matter
of precision
+ 10 x 20 1 1.25 1.5 It's also a
matter of precision
+ 11 x 20 1 1.25 1.5 1.75 It's also
a matter of precision
+ 11 x 20 π/2 It's also a matter of
precision
+ 11 x 20 π/2 It's also a matter of
precision
+ 00 x 21 1 1.25 1.5 1.75 2.0 Not
just a matter of precision or basis...
+ 01 x 21 1 1.25 1.5 1.75 2.0 2.5
Not just a matter of precision or basis...
+ 10 x 21 1 1.25 1.5 1.75 2.0 2.5
3.0 Not just a matter of precision or basis...
Like death and taxes rounding errors are a fact of
life. http://wiki.octave.org/FAQ
+ 110 x 21 Operands that differ greatly + 100
x 2-2
+ 110000 x 21 Operands that differ greatly + 000101
x 21
+ 110000 x 21 Operands that differ greatly + 000101
x 21 = 110
None
Operands that are really close + 111 x 21 -
110 x 21 = 001 x 21
Operands that are really close + 111 x 21 -
110 x 21 = 100 x 2-2
None
Fixed point representation + 100.001010 = 22 + 2-3+ 2-5
= 4.15625
POINT WHAT'S THE WITH FLOATING
FP ARITHMETIC IS FAST Embedded in HW.
Single precision up to ~10+38. FP REPRESENTS A WIDE RANGE
HE APPROVES FP
Anyway, errors still there.
Okay, what about increasing the number of digits use decimal
representations estimating errors think before you type
More digits, please! double (52 significant bits) long double (112
significant bits) arbitrary precision * * language support needed
Use decimal representations! decimal (C# only) BigDecimal (Java only) std::decimal
(C++, coming soon)* * after IEEE-754 2008
Estimate the error of your algo rel_err = fabs(f –
fp) / f
Use float to represent time float time; while (true) time
+= 0.20;
Use float to represent time float time; while (true) time
+= 0.20; This is BAD. And you should feel BAD.
Compare float numbers (a == b)
Compare float numbers (a == b) fabs(a -b) <= FLT_EPSILON
Compare float numbers (a == b) fabs(a -b) <= FLT_EPSILON
fabs(a - b) <= max(fabs(a),fabs(b)) * pc
There is no silver bullet.
Use libraries (when available).
Vector addition (naive) float t[SIZE]; float result; for (i =
0; i < SIZE; ++i) result += t[i];
RESCUE GNU GSL TO THE
None
that's all folks! @lorisfichera – https://kid-a.github.com References and source code
available at https://github.com/kid-a/floating-point-seminar Credits Font: Yanone Kaffeesatz (http://www.yanone.de/typedesign/kaffeesatz/)