Paradoxes and theorems every developer should know

Joshua Thijssen

October 09, 2016

Transcript

  1. @jaytaph 1
    Joshua Thijssen
    jaytaph
    Paradoxes and theorems
    every developer should know

  2. @jaytaph
    Disclaimer:
    I'm not a (mad)
    scientist nor a
    mathematician.

  3. @jaytaph
    German Tank
    Problem

  4. @jaytaph 4
    15

  6. @jaytaph 5
    Observed serial numbers: 53, 72, 8, 15

  7. @jaytaph 6
    k = number of elements
    m = largest number

  8. @jaytaph
    m + (m / k) - 1 = 72 + (72 / 4) - 1 = 89
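The arithmetic above is the usual estimate for the German tank problem: with k observed serial numbers and largest value m, the estimated total is m + m/k - 1. A minimal Python sketch, assuming serial numbers start at 1 and are sampled without replacement; the function name `estimate_total` is only illustrative:

```python
def estimate_total(serials):
    """Estimate the population size from observed serial numbers.

    Assumes serials are numbered 1..N and sampled without replacement.
    """
    k = len(serials)   # number of elements
    m = max(serials)   # largest number
    return m + m / k - 1


print(estimate_total([53, 72, 8, 15]))  # 72 + 72/4 - 1 = 89.0
```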

  12. @jaytaph 8
    Month        Intelligence  Statistics  Actual
    June 1940    1000          169         122
    June 1941    1550          244         271
    August 1942  1550          327         342
    https://en.wikipedia.org/wiki/German_tank_problem

  16. @jaytaph 9
    ➡ Data leakage.
    ➡ User IDs, invoice IDs, etc.
    ➡ Used to approximate the number of
    iPhones sold in 2008.

  18. @jaytaph 11
    Monthly Invoice IDs
    Jan       -    2476    2303
    Feb       -   10718   14891
    Mar       -   19413   27858
    Apr       -   28833   41458
    May       -   38644   55429
    Jun       -   48633   69954
    Jul  102606   59027   84961
    Aug  109331   69715  100308
    Sep  116388   80684  116020
    Oct  123721   91935  132004
    Nov  131241  103455  148341
    Dec  139236  115276  164976
    Estimated subscriptions
    Jan       -       -       -
    Feb       -    8242   12588
    Mar       -    8695   12967
    Apr       -    9420   13600
    May       -    9811   13971
    Jun       -    9989   14525
    Jul       -   10394   15007
    Aug    6725   10688   15347
    Sep    7057   10969   15712
    Oct    7333   11251   15984
    Nov    7520   11520   16337
    Dec    7995   11821   16635

  19. @jaytaph 12
    Estimated growth / size
    Jan       -       -       -
    Feb       -       -       -
    Mar       -    105%    103%
    Apr       -    108%    105%
    May       -    104%    103%
    Jun       -    102%    104%
    Jul       -    104%    103%
    Aug       -    103%    102%
    Sep    105%    103%    102%
    Oct    104%    103%    102%
    Nov    103%    102%    102%
    Dec    106%    103%    102%
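The estimated subscriptions in these slides are the differences between consecutive monthly invoice IDs, and the growth figures are the ratios between consecutive estimates. A small Python sketch using the January to April values of the middle column; the helper names are made up for illustration:

```python
def monthly_deltas(invoice_ids):
    """New invoices per month: difference between consecutive observed IDs."""
    return [later - earlier for earlier, later in zip(invoice_ids, invoice_ids[1:])]


def growth_percentages(deltas):
    """Each delta relative to the previous one, rounded to whole percent."""
    return [round(100 * later / earlier) for earlier, later in zip(deltas, deltas[1:])]


ids = [2476, 10718, 19413, 28833]    # Jan..Apr invoice IDs from the slide
deltas = monthly_deltas(ids)
print(deltas)                        # [8242, 8695, 9420]
print(growth_percentages(deltas))    # [105, 108]
```

Anyone who receives one invoice per month can run the same calculation, which is why sequential IDs leak business data.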

  20. @jaytaph
    ➡ Avoid leaking (semi-)sequential data.
    ➡ Adding randomness and offsets will NOT
    solve the issue.
    ➡ Use UUIDs
    (better: time-based short IDs, you don't need UUIDs)
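As a rough illustration of the last two bullets, the Python sketch below generates a random UUID with the standard `uuid` module and a shorter time-based ID. The short-ID scheme (a millisecond timestamp in 45 bits plus 40 random bits, base32-encoded) and the names `encode_base32` and `short_time_id` are assumptions made for this example, not something the slides specify:

```python
import secrets
import time
import uuid

# Lowercase base32 alphabet without easily confused letters (an arbitrary choice).
ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz"


def encode_base32(value, length):
    """Encode a non-negative integer as a fixed-length base32 string."""
    chars = []
    for _ in range(length):
        value, remainder = divmod(value, 32)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def short_time_id():
    """Return a 17-character ID: 9 timestamp characters, then 8 random ones."""
    millis = int(time.time() * 1000)      # 45 bits cover roughly a thousand years
    random_part = secrets.randbits(40)    # unpredictable suffix, 8 base32 characters
    return encode_base32(millis, 9) + encode_base32(random_part, 8)


print(uuid.uuid4())       # a random 36-character UUID
print(short_time_id())    # sorts by creation time, does not reveal a record count
```

An ID like this still reveals when a record was created, just not how many records exist.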

  21. @jaytaph
    Confirmation Bias

  22. @jaytaph 15
    Hypothesis....

  23. @jaytaph 16
    Evidence!

  24. @jaytaph 17
    Hypothesis confirmed!

  25. @jaytaph 18

  26. @jaytaph
    2 4 6
    Z = {…, −2, −1, 0, 1, 2, …}

  27. @jaytaph
    21%

  28. @jaytaph 21
    5 8 ? ?
    If a card shows an even number on one face,
    then its opposite face must be blue.

  29. @jaytaph
    < 10%

  30. @jaytaph 23
    coke beer 35 17
    If you drink beer
    then you must be 18 yrs or older.
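The rule has the form "if P then Q", so the only cards that can expose a violation are those showing P (beer) or showing not-Q (an age below 18). A small Python sketch of that check; representing each card by its visible face is an assumption made for illustration:

```python
LEGAL_AGE = 18


def must_be_checked(visible_face):
    """Return True if the hidden side of this card could break the rule."""
    if isinstance(visible_face, int):      # the card shows an age
        return visible_face < LEGAL_AGE    # not-Q: the other side may say "beer"
    return visible_face == "beer"          # P: the other side may show an age below 18


cards = ["coke", "beer", 35, 17]
print([card for card in cards if must_be_checked(card)])  # ['beer', 17]
```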

  33. @jaytaph
    Cognitive Adaptation
    for social exchange

  34. @jaytaph
    Hint:
    Try to place your "technical
    problem" in a more social context.

  35. @jaytaph 26
    5 8 ? ?
    If a card shows an even number on one face,
    then its opposite face must be blue.

  38. @jaytaph
    Birthday paradox

  39. @jaytaph
    Question:
    > 50% chance
    4 March
    18 September
    5 December
    25 July
    2 February
    9 October

  40. @jaytaph
    23 people
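The figure of 23 can be checked by computing the chance that all birthdays in a group are distinct and subtracting it from 1. A minimal Python sketch, assuming 365 equally likely birthdays and ignoring leap years; the function names are illustrative:

```python
def shared_birthday_probability(people, days=365):
    """Probability that at least two of `people` share a birthday."""
    all_distinct = 1.0
    for i in range(people):
        all_distinct *= (days - i) / days
    return 1.0 - all_distinct


def smallest_group(threshold=0.5, days=365):
    """Smallest group size whose collision probability exceeds `threshold`."""
    people = 1
    while shared_birthday_probability(people, days) <= threshold:
        people += 1
    return people


print(smallest_group())                             # 23
print(round(shared_birthday_probability(23), 3))    # 0.507
```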

  41. @jaytaph
    366* persons = 100%

  42. @jaytaph
    Collisions occur more
    often than you realize

  43. @jaytaph
    Hash collisions

  44. @jaytaph
    16-bit value
    300 elements

  45. @jaytaph
    rand(1,100000)
    117 elements
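The birthday calculation carries over to any value space, such as a 16-bit hash or `rand(1,100000)`. The Python sketch below uses the common approximation 1 - exp(-n(n-1)/(2d)) for n elements drawn uniformly from d possible values. The uniformity assumption and the function name belong to this sketch, and the slides do not say which probability threshold their element counts refer to:

```python
import math


def collision_probability(elements, space):
    """Approximate chance that at least two of `elements` uniform draws collide."""
    return 1.0 - math.exp(-elements * (elements - 1) / (2 * space))


print(f"{collision_probability(300, 2 ** 16):.3f}")    # 300 elements, 16-bit value
print(f"{collision_probability(117, 100_000):.3f}")    # 117 elements, rand(1,100000)
```

Under these assumptions the two cases come out at roughly 0.50 and 0.07.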

  46. @jaytaph
    Watch out for:
    ➡ Hashes that are too small.
    ➡ Unique data.
    ➡ Your data might be less "protected" than
    you might think.

  47. @jaytaph
    Heisenberg
    uncertainty
    principle

  48. @jaytaph 37

  49. @jaytaph 38

  50. @jaytaph 39
    x = position
    p = momentum (mass × velocity)
    ħ = 0.0000000000000000000000000000000001054571800 (1.054571800E-34) J·s
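With these symbols, the uncertainty principle is usually written as the inequality below, where Δx and Δp are the uncertainties (standard deviations) in position and momentum. This is the textbook form, given here as an assumption about what the preceding image-only slides showed:

```latex
\Delta x \, \Delta p \ge \frac{\hbar}{2}
```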

  51. @jaytaph
    The more precisely you
    know one property, the
    less precisely you know the other.

  52. @jaytaph
    This is NOT about
    observing!

  53. @jaytaph
    Observer effect
    heisenbug

  54. @jaytaph
    It's about trade-offs

  55. @jaytaph
    Benford's law

  56. @jaytaph
    Numbers beginning with 1 are
    more common than numbers
    beginning with 9.

  57. @jaytaph
    Default behavior for
    naturally occurring numbers.

  58. @jaytaph 47

  60. @jaytaph
    find . -name \*.php -exec wc -l {} \; | sort | cut -b 1 | uniq -c
    1073 1
    886 2
    636 3
    372 4
    352 5
    350 6
    307 7
    247 8
    222 9
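Benford's law predicts that a leading digit d occurs with probability log10(1 + 1/d). The Python sketch below compares that prediction with the counts from the command output above; the variable and function names are illustrative:

```python
import math

# Leading-digit counts taken from the uniq -c output above.
observed = {1: 1073, 2: 886, 3: 636, 4: 372, 5: 352, 6: 350, 7: 307, 8: 247, 9: 222}


def benford_probability(digit):
    """Expected share of numbers whose first digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


total = sum(observed.values())
for digit, count in observed.items():
    expected = benford_probability(digit)
    print(f"{digit}: observed {count / total:6.1%}, Benford {expected:6.1%}")
```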

  61. @jaytaph 49

  62. @jaytaph
    Bayesian filtering

  63. @jaytaph
    What's the probability of an
    event, based on conditions that
    might be related to the event?

  64. @jaytaph
    What is the chance that a
    message is spam when it
    contains certain words?

  65. @jaytaph 53
    P(A|B)  Probability of event A, given event B (conditional)
    P(A)    Probability of event A
    P(B)    Probability of event B
    P(B|A)  Probability of event B, given event A

  66. @jaytaph 54
    ➡ Figure out the probability a {mail, tweet,
    comment, review} is {spam, negative} etc.

  67. @jaytaph
    ➡ 10 out of 50 comments are "negative".
    ➡ 25 out of 50 comments use the word
    "horrible".
    ➡ 8 comments with the word "horrible" are
    marked as "negative".
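Plugging these counts into Bayes' theorem, P(A|B) = P(B|A) · P(A) / P(B), gives the probability that a comment is negative given that it contains "horrible". A minimal Python sketch; the variable names are illustrative:

```python
total_comments = 50
negative = 10                 # comments marked "negative"
with_horrible = 25            # comments containing "horrible"
negative_with_horrible = 8    # negative comments containing "horrible"

p_negative = negative / total_comments                          # P(A)
p_horrible = with_horrible / total_comments                     # P(B)
p_horrible_given_negative = negative_with_horrible / negative   # P(B|A)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_negative_given_horrible = p_horrible_given_negative * p_negative / p_horrible
print(round(p_negative_given_horrible, 2))  # 0.32
```

The result matches the direct count: 8 of the 25 comments containing "horrible" are negative.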

  68. @jaytaph 56

  69. @jaytaph 57

  70. @jaytaph 58
    Negative:
    "Your product is horrible and does
    not work properly. Also, you suck."
    Positive:
    "I had a horrible experience with
    another product. But yours really
    worked well. Thank you!"

  71. @jaytaph 59

  72. @jaytaph
    Find me on twitter: @jaytaph
    Find me for development and training:
    www.noxlogic.nl / www.techademy.nl
    Find me on email: [email protected]
    Find me for blogs: www.adayinthelifeof.nl
