Waiting is boring

Filipe Freire
September 30, 2017

A talk about waiting in software testing, presenting problems and solutions found when testing async scenarios in services with messaging systems, frontend apps, and distributed stream processing apps.

Transcript

  1. Waiting is boring
    Filipe Freire - 30 September 2017

  2. Quick intro
    Husband, learner, tester, developer
    (new) open-source contributor
    2y work as a “coding” tester
    1y work as a developer
    Currently @

  3. Disclaimer
    This talk is about waiting in software testing.
    Expect an imperfect personal account on
    problems and solutions.

  4. What you can take from this
    Tips on testing async scenarios in:
    A) services with messaging systems
    B) frontend apps
    C) distributed stream processing apps.

  5. –Venkat Subramaniam (twitter, 1 September 2017)
    “It’s foolish to wait forever for something to
    happen. Both in life and in asynchronous
    programming, timeouts are critical.”

  6. What about software?
    (testing-wise)

  7. Why do we test?

  8. “To verify something can work”

  9. (Or “Fact checks”)
    Automated tests
    Testing effort

  10. (Image)
    source: martinfowler.com

  11. There’s more to it, but let’s focus on time

  12. Syncing a test’s actions with results: the typical first approach…

  13. Thread.sleep(ms);
    usleep(microseconds);
    time.sleep(seconds)
    System.Threading.Thread.Sleep(msecs);

  14. We let the computer wait
    a) 5 seconds
    b) 10 seconds
    c) 1 minute
    d) 1 hour

  15. If we’re lucky with our “estimate”

  16. It’s never just X seconds.

  17. How do we avoid it?
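
One general answer, sketched here in plain Java: wait on a completion signal and treat the timeout as an upper bound, rather than sleeping for a guessed duration. `BoundedWaitDemo` and `runAndAwait` are hypothetical names for illustration, and the background thread only simulates the async work; none of this comes from the slides.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: names are illustrative, not from the talk.
class BoundedWaitDemo {

    /**
     * Starts a background task that finishes after workMillis, then waits for
     * it for at most timeoutMillis. Returns true if the task finished in time.
     */
    static boolean runAndAwait(long workMillis, long timeoutMillis) {
        CountDownLatch done = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(workMillis); // stands in for the real async work
                done.countDown();         // signal completion
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive after a timeout
        worker.start();
        try {
            // Returns as soon as the latch opens; the timeout is only a ceiling.
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        boolean finished = runAndAwait(50, 5_000);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("finished=" + finished + " after ~" + elapsedMs + " ms");
    }
}
```

With a fixed `Thread.sleep(5000)` the same check would always cost five seconds; here the wait ends shortly after the work does, and a slow run still fails within the bound.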

  18. 2) Example: API & messaging

  19. The context
    Warehousing & Logistics project
    60 person project
    Tons of Microservices & Monoliths
    (different teams)
    Comms: REST and AMQP (RabbitMQ)

  20. Typical “End-to-End”
    (Diagram: Event triggered → Producer → Exchange → Queue → Consumer → Expected state?; two components marked “Live”)

  21. Different teams & test tools
    design patterns

    containers
    dependency injection
    magic
    java “scripts”
    magic
    A B
    Hammered sleeps

  22. One more thing…
    Inverted test pyramid
    And the hammered sleeps

  23. “Inverted pyramid? Not my money!”

  24. So the sleeps then…

  25. “Typical day”
    20 different async comm flows
    each w/ 20 variations in payload
    (+ “inf” minor variations)

  26. “Typical day”
    Some math:
    20 flows times 20 variations times 5 sec sleep =
    half hour (+/-)

  27. “Typical day”
    Some math:
    20 flows times 20 variations times 10 sec sleep (not uncommon) =
    1 hour (+/-)
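
The arithmetic on the two “Some math” slides can be checked with a few lines of Java. This sketch assumes one fixed sleep per test case and test cases that run one after another; `SleepBudget` and `totalSleep` are illustrative names, not from the talk.

```java
import java.time.Duration;

// Hypothetical helper for the back-of-the-envelope numbers on the slides.
class SleepBudget {

    /** Total time spent sleeping when every flow/variation pair sleeps once. */
    static Duration totalSleep(int flows, int variationsPerFlow, Duration sleepPerCase) {
        return sleepPerCase.multipliedBy((long) flows * variationsPerFlow);
    }

    public static void main(String[] args) {
        Duration fiveSecondSleeps = totalSleep(20, 20, Duration.ofSeconds(5));
        Duration tenSecondSleeps = totalSleep(20, 20, Duration.ofSeconds(10));
        System.out.println("5 s sleeps:  " + fiveSecondSleeps.getSeconds() + " s");
        System.out.println("10 s sleeps: " + tenSecondSleeps.getSeconds() + " s");
    }
}
```

20 × 20 × 5 s is 2,000 s (about 33 minutes) and 20 × 20 × 10 s is 4,000 s (about 67 minutes), which matches the rounded figures on the slides. That time is spent purely on sleeping, before any test logic runs.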

  28. Solutions
    1. Invert the damn pyramid
    2. Don’t hammer a wait

  29. Solutions
    1. Invert the damn pyramid
    2. Don’t hammer a sleep

  30. How do we avoid a hammered sleep?

  31. The “scholar”-way

  32. The “scholar”-way
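
The code shown on these two slides is not part of the transcript. A hand-written polling loop of the kind the title suggests might look like the sketch below: check a condition at a short interval and give up at a deadline. `Poll` and `waitUntil` are hypothetical names, and this is an assumption about the approach, not a reconstruction of the author's code.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper: a hand-rolled replacement for a fixed sleep.
class Poll {

    /**
     * Checks the condition every interval until it holds or the timeout has
     * elapsed. Returns true if the condition held in time, false otherwise.
     */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true; // stop as soon as the expected state is visible
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // upper bound reached, report failure
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Simulated async result: becomes visible 100 ms from now.
        long readyAt = System.currentTimeMillis() + 100;
        boolean seen = waitUntil(
                () -> System.currentTimeMillis() >= readyAt,
                Duration.ofSeconds(2),
                Duration.ofMillis(20));
        System.out.println("expected state seen: " + seen);
    }
}
```

In a real test the condition would query the system under test, for example whether the consumer has persisted the expected record. The cost of this approach is that every team ends up maintaining its own variant of the loop.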

  33. The “do it for me”-way

  34. The “do it for me”-way
    + AssertJ
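
The transcript does not say which library these slides used. Java libraries in this space (Awaitility is a common choice, named here only as an assumption) retry an assertion until it passes or a timeout expires, and pair well with AssertJ. The stand-in below shows that idea in plain Java so it runs without dependencies; `AssertEventually` and `untilAsserted` are hypothetical names and not any library's API, and the hand-written `AssertionError` takes the place of an AssertJ assertion.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper: a plain-Java stand-in, not the API of any real library.
class AssertEventually {

    /**
     * Re-runs the assertion until it stops throwing AssertionError or the
     * timeout elapses. On timeout the last AssertionError is rethrown, so the
     * test report still shows what was wrong.
     */
    static void untilAsserted(Runnable assertion, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                assertion.run();
                return; // assertion passed, stop waiting immediately
            } catch (AssertionError failure) {
                if (System.nanoTime() - deadline >= 0) {
                    throw failure;
                }
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        // Simulated consumer that updates state some time after the event.
        AtomicReference<String> orderStatus = new AtomicReference<>("PENDING");
        Thread consumer = new Thread(() -> {
            try {
                Thread.sleep(100);
                orderStatus.set("SHIPPED");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        untilAsserted(() -> {
            if (!"SHIPPED".equals(orderStatus.get())) {
                throw new AssertionError("expected SHIPPED but was " + orderStatus.get());
            }
        }, Duration.ofSeconds(2), Duration.ofMillis(20));
        System.out.println("state reached: " + orderStatus.get());
    }
}
```

Compared with a boolean polling loop, retrying the assertion itself keeps the failure message: when the wait times out, the test fails with the last mismatch instead of a bare "condition not met".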

  35. 3) Example: Frontend

  36. The context
    Back-office projects ( >1 project)
    “Latest frameworks & tools”
    No FE unit tests. No FE tests.
    “Let’s make end-to-end FE tests”

  37. Sorry, what?
    Back-office projects ( >1 project)
    “Latest frameworks & tools”
    No FE unit tests. No FE tests.
    “Let’s make end-to-end FE tests”

  38. ?
    No unit tests.
    No tests.

  39. ???
    No unit tests.
    No tests.

  40. Scream disappointed.

  41. Now let’s pretend we don’t have that problem, and focus on end-to-end tests

  42. And the other problem
    Element visibility takes time
    Interactions take time
    Animations take time
    Requests take time

  43. How do we fight this “time”?

  44. The “spit upwards” approach
    >60 tests
    Hammer big waits
    Wait for everything to be ready
    Result: 2h test suite

  45. Should we run the tests in parallel?
    Not yet.

  46. The “moderate” approach
    45 tests (clean-up)
    Small individual fixed waits
    Result: 40min test suite

  47. The “surgeon” approach
    30 tests (keep the “good” ones)
    Small individual waits
    Waits are counter-like
    Result: 20min test suite

  48. The “Duke Nukem” approach
    10 tests (less is more)
    Fluent waits
    Result: < 6min test suite

  49. Fluent waits?
    “Implicit… explicit… mayonnaise”
    We don’t need to reinvent the wheel.
    Selenium libs have had this for ages.

  50. Fluent waits!
    Note: If you’re using WebDriverWait, you’re using FluentWait
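
In Selenium's Java bindings, `WebDriverWait` is a subclass of `FluentWait`, whose builder exposes `withTimeout`, `pollingEvery`, `ignoring` and `until`. To show what such a wait does without a browser or the Selenium dependency, here is a dependency-free stand-in that borrows those method names. `MiniFluentWait` is hypothetical and is not Selenium's class; in particular it throws `IllegalStateException` on timeout, which is a choice made for this sketch.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

// Hypothetical stand-in that mimics the shape of a fluent wait; not Selenium.
class MiniFluentWait<T> {
    private final T input;
    private Duration timeout = Duration.ofSeconds(5);
    private Duration interval = Duration.ofMillis(200);
    private final List<Class<? extends RuntimeException>> ignored = new ArrayList<>();

    MiniFluentWait(T input) { this.input = input; }

    MiniFluentWait<T> withTimeout(Duration t) { timeout = t; return this; }
    MiniFluentWait<T> pollingEvery(Duration i) { interval = i; return this; }
    MiniFluentWait<T> ignoring(Class<? extends RuntimeException> type) { ignored.add(type); return this; }

    /** Polls until the check returns a value that is neither null nor FALSE. */
    <V> V until(Function<? super T, V> check) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                V value = check.apply(input);
                if (value != null && !Boolean.FALSE.equals(value)) {
                    return value;
                }
            } catch (RuntimeException e) {
                // Swallow only the exception types the caller asked to ignore.
                if (ignored.stream().noneMatch(t -> t.isInstance(e))) {
                    throw e;
                }
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException("condition not met within " + timeout);
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        // Simulated page: the "element" appears 100 ms after the test starts.
        AtomicReference<String> element = new AtomicReference<>();
        new Thread(() -> {
            try {
                Thread.sleep(100);
                element.set("submit-button");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }).start();

        String found = new MiniFluentWait<>(element)
                .withTimeout(Duration.ofSeconds(2))
                .pollingEvery(Duration.ofMillis(20))
                .until(AtomicReference::get);
        System.out.println("found: " + found);
    }
}
```

The useful parts over a fixed wait are the same three knobs: an upper bound, a polling interval, and a list of "not there yet" exceptions that should not fail the test while the page is still settling.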

  51. Expected conditions
    Check the doc: https://goo.gl/J17NVD

  52. Google it today.
    It’s worth it.

  53. 3) Example: Stream Processing (i.e. Apache Storm)

  54. The context
    (new) project
    10 person team
    Process half a million critical messages per day
    This is just the beginning (MVP)

  55. A recent problem
    Storm + Scala project
    1 small integration (e2e) test suite
    Critical sanity/regression checks
    Needs to take less than 2-3 min.
    How do we handle waits?

  56. How a typical e2e test works
    (Diagram: Event triggered → Topology → Consumer → Expected state?)

  57. Do we hammer 10 sec waits?
    Maybe 20 sec?

  58. Nope. We use what we have at hand.

  59. ScalaTest’s “eventually”

  60. How you can do it

  61. Why is time that important?

  62. If e2e tests take 3 mins,
    our current pipeline takes 10 mins

  63. If e2e tests take 6 mins (+3)
    Our CI/CD pipeline takes 13 mins
    10 runs per day
    Time wasted per day: 30 mins
    Per month: 10 hours (at roughly 20 working days)

  64. 4) Conclusions

  65. –Michael Bolton (twitter, September 2017)
    “If you’re not scrutinising test code, why do
    you trust it any more than your production
    code? Especially when no problems are
    reported?”

  66. –Michael Bolton (twitter, September 2017)
    “Lesson: don’t blindly trust your test code
    or your automated checks, lest they fail to
    reveal important problems in production
    code.”

  67. Smarter waits don’t solve…

  68. Immutable poor culture

  69. Immutable poor decisions

  70. Waiting for change is boring.
    Stop waiting.

  71. Thank you. Questions?
    filfreire filrfreire