
# Paradoxes and theorems every developer should know

## Joshua Thijssen

December 08, 2015

## Transcript

1. Joshua Thijssen
jaytaph
namespace

2. Disclaimer:
I am neither a scientist nor a mathematician.

3. Second disclaimer:
I will only tell lies

4. German Tank Problem

5.-8. (Observed tank serial numbers: 15, 53, 72, 8)
9. k = number of elements
m = largest number
Estimate: m + (m / k) - 1

10. 72 + (72 / 4) - 1 = 89

11.-14.
| Month       | Intelligence | Statistics | Actual |
|-------------|--------------|------------|--------|
| June 1940   | 1,000        | 169        | 122    |
| June 1941   | 1,550        | 244        | 271    |
| August 1942 | 1,550        | 327        | 342    |
https://en.wikipedia.org/wiki/German_tank_problem

15.-19.
➡ Data leakage.
➡ User IDs, invoice IDs, etc.
➡ Used to approximate the number of
iPhones sold in 2008.
➡ Calculate approximations of datasets with
(incomplete) information.

20. ➡ Avoid leaking (semi-)sequential data.
➡ Adding randomness and offsets will NOT
solve the issue.
➡ Use UUIDs
(better: time-based short IDs, you don't need UUIDs)

21. Collecting (big) data is easy.
Analyzing big data is the hard part.

22. Confirmation Bias

23. 2 4 6
Z = {…, −2, −1, 0, 1, 2, …}

24. 21%

25. 5 8 ? ?
If a card shows an even number on one face,
then its opposite face is blue.

26. < 10%

27.-29. coke beer 35 17
If you drink beer,
then you must be 18 yrs or older.

30. (…) for social exchange

31. hint: (…) problem" in a more social context.

32. BDD

33.-35. 5 8 ? ?
If a card shows an even number on one face,
then its opposite face is blue.

36. TESTING

37. ➡ Step 1: Write code
➡ Step 2: Write tests
➡ Step 3: Profit

38.-39. public function isLeapYear($year) {
    return ($year % 4 == 0);
}
testIs1996ALeapYear();
testIs2000ALeapYear();
testIs2004ALeapYear();
testIs2008ALeapYear();
testIs2012ALeapYear();
testIs1997NotALeapYear();
testIs1998NotALeapYear();
testIs2001NotALeapYear();
testIs2013NotALeapYear();
https://www.sundoginteractive.com/blog/confirmation-bias-in-unit-testing

40. public function isLeapYear($year) {
    return ($year % 4 == 0);
}

41. ➡ Tests were written based on actual code.
➡ Tests were written to CONFIRM actual
code, not to DISPROVE actual code!
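Every test above picks a year where the `% 4` shortcut happens to give the right answer; a single disconfirming case such as a century year (1900) would expose the bug. A sketch of the full Gregorian rule, in Python for brevity:

```python
def is_leap_year(year):
    # Divisible by 4, except century years, which must also be divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

print(is_leap_year(1996))  # → True  (the confirming cases still pass)
print(is_leap_year(1900))  # → False (the case the slide's tests never tried)
```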

42. TDD

43. ➡ Step 1: Write tests
➡ Step 2: Write code
➡ Step 3: Profit, as this is less prone to confirmation
bias (as there is nothing to bias!)

45. Question:
> 50% chance
4 March
18 September
5 December
25 July
2 February
9 October

46. 23 people

47. 366 persons = 100%

48. Collisions occur more
often than you realize.

49. Hash collisions

50. 16 bits means
~300 values before a >50%
collision probability.

51. Watch out for:
➡ Too small hashes.
➡ Unique data.
➡ Your data might be less "protected" than
you might think.
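Both figures above — 23 people for a shared birthday, roughly 300 values for a 16-bit hash — fall out of the same birthday-bound product. A small sketch:

```python
def collision_probability(n, space):
    """Probability that at least two of n uniformly random values,
    drawn from `space` possibilities, collide (birthday bound)."""
    p_unique = 1.0
    for i in range(n):
        p_unique *= (space - i) / space
    return 1 - p_unique

print(collision_probability(23, 365))     # ≈ 0.507: two shared birthdays among 23 people
print(collision_probability(302, 2**16))  # just over 0.5: a 16-bit hash crosses 50% near 300 values
```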

52. Heisenberg uncertainty principle

53. Star Trek
(Heisenberg compensators)

54. (…) nor crystal meth

55. x: position
p: momentum (mass × velocity)
ħ: 1.054571800 × 10⁻³⁴ J·s
Δx · Δp ≥ ħ / 2

56. The more precisely you
know one property, the
less you know the other.

57. (…) observing!

58. Observer effect

59. Heisenbug

60. Benford's law

61. Numbers beginning with 1 are
more common than numbers
beginning with 9.

62. Default behavior for
natural numbers.

63. (image)

64.-65. find . -name \*.php -exec wc -l {} \; | sort | cut -b 1 | uniq -c
1073 1
 886 2
 636 3
 372 4
 352 5
 350 6
 307 7
 247 8
 222 9
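The per-digit counts above roughly follow Benford's prediction that leading digit d appears with probability log10(1 + 1/d). A quick check of the expected distribution:

```python
import math

def benford(d):
    """Expected relative frequency of leading digit d under Benford's law."""
    return math.log10(1 + 1 / d)

for d in range(1, 10):
    print(d, f"{benford(d):.1%}")
# digit 1 ≈ 30.1%, falling off to digit 9 ≈ 4.6%
```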

66. (image)

67. Bayesian filtering

68. What's the probability of an
event, based on conditions that
might be related to the event?

69. What is the chance that a
message is spam when it
contains certain words?

70. P(A|B): probability of event A, given event B (conditional)
P(A): probability of event A
P(B): probability of event B
P(B|A): probability of event B, given event A

71. ➡ Figure out the probability a {mail, tweet,
comment, review} is {spam, negative}, etc.

72. ➡ 10 out of 50 comments are "negative".
➡ 25 out of 50 comments use the word
"horrible".
➡ 8 comments with the word "horrible" are
marked as "negative".

73. P(negative | "horrible") = ?
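Plugging the numbers from the slides into Bayes' theorem:

```python
# From the slides: 10 of 50 comments are negative, 25 of 50 contain
# "horrible", and 8 of the 10 negative comments contain "horrible".
p_neg = 10 / 50
p_horrible = 25 / 50
p_horrible_given_neg = 8 / 10

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_neg_given_horrible = p_horrible_given_neg * p_neg / p_horrible
print(round(p_neg_given_horrible, 2))  # → 0.32
```

Sanity check: 8 of the 25 "horrible" comments are negative, and 8/25 is indeed 0.32.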

74. (image)

75. ➡ More words?
➡ Complex algorithm,
➡ but we can assume that words are
independent from each other:
➡ the Naive Bayes approach.

76. (image)

77. We must know beforehand which
comments are negative.

78. TRAINING SET

79. Negative: "Your product is horrible and does
not work properly. Also, you suck."
Positive: "I had a horrible experience with
another product. But yours really
worked well. Thank you!"

80. $trainingset = [
    'negative' => [
        'count' => 1,
        'words' => [
            'product' => 1,
            'horrible' => 1,
            'properly' => 1,
            'suck' => 1,
        ],
    ],
    'positive' => [
        'count' => 1,
        'words' => [
            'horrible' => 1,
            'experience' => 1,
            'product' => 1,
            'thank' => 1,
        ],
    ],
];

81. $trainingset = [
    'negative' => [
        'count' => 631,
        'words' => [
            'product' => 521,
            'horrible' => 52,
            'properly' => 36,
            'suck' => 272,
        ],
    ],
    'positive' => [
        'count' => 1263,
        'words' => [
            'horrible' => 62,
            'experience' => 16,
            'product' => 311,
            'great' => 363,
            'thank' => 63,
        ],
    ],
];

82. ➡ You might want to filter stop-words first.
➡ You might want to make sure negatives are
handled properly: "not great" => negative.
➡ Bonus points if you can spot sarcasm.
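A classifier over a training set of this shape can be sketched as below (Python rather than the slides' PHP; the Laplace smoothing is my addition so unseen words don't zero out a class — the slides don't specify one):

```python
import math

# Same shape as the slides' $trainingset: per-label comment counts
# and per-label word frequencies.
trainingset = {
    'negative': {'count': 631,
                 'words': {'product': 521, 'horrible': 52,
                           'properly': 36, 'suck': 272}},
    'positive': {'count': 1263,
                 'words': {'horrible': 62, 'experience': 16,
                           'product': 311, 'great': 363, 'thank': 63}},
}

def classify(words):
    total = sum(c['count'] for c in trainingset.values())
    scores = {}
    for label, data in trainingset.items():
        # Log prior: how common this label is overall.
        score = math.log(data['count'] / total)
        word_total = sum(data['words'].values())
        vocab = len(data['words'])
        for w in words:
            # Naive assumption: each word contributes independently.
            score += math.log((data['words'].get(w, 0) + 1) / (word_total + vocab))
        scores[label] = score
    return max(scores, key=scores.get)

print(classify(['horrible', 'suck']))  # → negative
print(classify(['great', 'thank']))    # → positive
```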

83. ➡ Collaborative filtering (Mahout):
➡ If a user likes products A, B and C, what is the
chance that they like product D?

84. Mess up your (training) data, and nothing can save you
(except a training set reboot).

85. ➡ Binomial probability

86.-87. ➡ 30% chance of acceptance for a CFP
➡ 5 CFPs
1 - (0.7 * 0.7 * 0.7 * 0.7 * 0.7) = 1 - 0.168 = 0.832
An 83% chance of getting selected at least once!
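The "at least once" arithmetic generalizes with the binomial distribution; a small sketch (the `exactly_k` helper is my addition, not from the slides):

```python
from math import comb

def at_least_one(p, n):
    """Probability of at least one success in n independent trials."""
    return 1 - (1 - p) ** n

def exactly_k(p, n, k):
    """Binomial PMF: probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

print(round(at_least_one(0.3, 5), 3))  # → 0.832, matching the slide
print(round(exactly_k(0.3, 5, 2), 3))  # → 0.309: exactly 2 of 5 CFPs accepted
```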

88. Ockham's Razor

89. Among competing hypotheses, the one with
the fewest assumptions should be selected.

90. "Everything should be made as
simple as possible, but no simpler."

91. YAGNI

92. Actually,
➡ The principle of plurality:
Plurality should not be posited without necessity.
➡ The principle of parsimony:
It is pointless to do with more what can be
done with less.

93. ➡ Every element you add needs: design,
development, maintenance, connectivity,
support, etc.
➡ When "adding" elements, you are not (…)

94. Food for thought:
Would Ockham accept a
Service Oriented
Architecture?

95. http://farm1.static.flickr.com/73/163450213_18478d3aa6_d.jpg

96. (image)