Deterministic Solutions to Intermittent Failures

Hello, My Name Is Tim Mertens
github.com/tmertens @rockfx01

RSpec
A testing framework for Ruby

Source Code Examples
https://github.com/tmertens/intermittent_test_failures

“Do not observe the build status. You will disrupt its quantum state!”

The Myth of Flaky Tests

The Myth of Flaky Tests DECONSTRUCTED

Tests Are Software Too
• Test code does exactly what you tell it to do
• “Flaky” implies an unsolvable problem
• “Non-deterministic” behavior can be accounted for
• Any failure can be resolved once you know the root cause

Real Defects
• Ignored failures may be real defects

BACKSTORY

Continuous Integration
They call me “CI” for short
A process or system by which new code is continuously validated against an existing test suite.

Parallelized Builds
Builds which spread the work of executing tests across 2 or more workers (e.g. containers, nodes)

FAILURES.MAP(&:FIX)

Debugging Reproducible Failures

Common Reproducible Failures
• Stale Branches
• Business Dates and Times
• Mocked Time vs System Time
• Missing Preconditions
• Real Bugs

Test Group
A subset of tests from the test suite which run on a specific node in a parallelized build.

RSpec Test Group
    # No group: runs all tests
    $ rspec
    # OR specific files
    $ rspec ./spec/some_spec.rb ./spec/other_spec.rb
    # OR metadata tags
    $ rspec . --tag focus

Test Seed
A value, usually an integer, which determines the order in which tests are executed.
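As a rough illustration of why recording the seed matters, the sketch below shuffles a list of hypothetical spec file names with Ruby's `Random`. It is not RSpec's ordering implementation; it only assumes the property the slides rely on, namely that a fixed seed reproduces the same order.

```ruby
# Hypothetical spec file names, used only for illustration.
specs = %w[a_spec.rb b_spec.rb c_spec.rb d_spec.rb e_spec.rb]

# Two "runs" seeded with the same value produce the same order,
# which is what makes an order-dependent failure repeatable.
first_run  = specs.shuffle(random: Random.new(13391))
second_run = specs.shuffle(random: Random.new(13391))

puts "same order on both runs: #{first_run == second_run}"
```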

RSpec Test Seed
    $ rspec
    …
    Finished in 1 minute 17.7 seconds
    99 examples, 0 failures, 4 pending

    Randomized with seed 13391    # <- the test seed

Re-running Test Group With Seed
    $ rspec --seed 12345 --fail-fast
    # OR
    $ rspec ./spec/some_spec.rb \
        ./spec/other_spec.rb \
        --seed 12345 --fail-fast

Test Bisect
Repeatedly dividing a set of tests in half until you find the minimal set of tests which cause another test to fail.
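The halving idea can be sketched in plain Ruby. This is a simplified illustration, not how `rspec --bisect` is implemented: it assumes exactly one polluting test, and the `bisect_culprit` helper and the spec names are hypothetical. The block stands in for "run this subset before the failing test and report whether it still fails".

```ruby
# Hypothetical helper: narrows a list of candidate tests down to the single
# one that triggers the failure, assuming exactly one culprit exists and the
# block answers deterministically for any subset.
def bisect_culprit(candidates)
  remaining = candidates
  while remaining.size > 1
    half = remaining.size / 2
    first_half  = remaining.first(half)
    second_half = remaining.drop(half)
    # Keep whichever half still reproduces the failure.
    remaining = yield(first_half) ? first_half : second_half
  end
  remaining.first
end

candidates = %w[a_spec b_spec c_spec d_spec e_spec]

# Stand-in for actually running the subset: here "d_spec" is the polluter.
culprit = bisect_culprit(candidates) { |subset| subset.include?("d_spec") }

puts culprit
```

Each round halves the candidate list, so even a large test group needs only a handful of runs.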

Bisecting Test Group with Seed
    $ rspec --seed 12345 --bisect
    # OR
    $ rspec ./spec/some_spec.rb \
        ./spec/other_spec.rb \
        --seed 12345 --bisect

Test Pollution
When the side effects of one or more tests in a test group cause one or more other tests to fail.

Debugging Test Pollution Failures

Data Pollution
• Data is persisted across test examples or test suite executions
  ◦ Database Records
  ◦ Caches (e.g. Redis)

Defensive Testing
• Tests should clean up after themselves, but…
• Don’t expect pristine starting conditions

Defensive Testing
• Don’t expect tables to be empty
    # Don't:
    expect(User.count).to eq 1

    # Do:
    expect { foo.bar }.to change { User.count }.by(1)

Defensive Testing
• Don’t expect global scopes to only return test records
    # Don't:
    expect(User.active).to match_array [user1, user2]

    # Do:
    expect(User.active).to include(user1, user2)
    expect(User.active).not_to include(user3)

Class/Singleton Caching
• Reset cache mutations after tests that modify them
• Avoid mutating caches in tests

Class/Singleton Caching
    # Don't:
    described_class.add("foo") # mutates the singleton
    expect(described_class.contains?("foo")).to be true

    # Do:
    subject = described_class.new
    subject.add("foo")
    expect(subject.contains?("foo")).to be true
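To make the leak concrete, here is a minimal plain-Ruby sketch. The `Registry` class and its `shared` accessor are hypothetical stand-ins for any class that keeps process-wide state; the two "tests" are simulated as consecutive statements in the same process.

```ruby
# Hypothetical class with a process-wide shared instance.
class Registry
  def self.shared
    @shared ||= new
  end

  def initialize
    @items = []
  end

  def add(item)
    @items << item
    self
  end

  def contains?(item)
    @items.include?(item)
  end
end

# "Test 1" mutates the shared instance and never resets it.
Registry.shared.add("foo")

# "Test 2" runs later in the same process and sees the leftover state.
leaked = Registry.shared.contains?("foo")

# A fresh instance per test starts clean regardless of what ran before.
isolated = Registry.new.contains?("foo")

puts "shared instance still has foo: #{leaked}"
puts "fresh instance has foo: #{isolated}"
```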

Mutated Constants
• Don’t overwrite constants
    # Don't:
    before { SOME_CONST = "my test value" }

    # Do:
    stub_const("SOME_CONST", "my test value")

    allow(MyClass).to receive(:foo).and_return("foo")

    fake_class = class_double(MyClass, foo: "foo")
    stub_const("MyClass", fake_class)

Mutated Constants
    # Don't:
    before do
      MyClass.define_method(:foo) { "foo" }
    end

    # Do:
    instance = described_class.new
    allow(instance).to receive(:foo).and_return("foo")

Mutated (Test) Constants
    describe Foo do
      # Don't:
      BAR = "some_value"
      it { expect(Foo.bar).to eq BAR }

      # Do:
      let(:bar) { "some_value" }
      it { expect(Foo.bar).to eq bar }
    end

Real Bugs!
• Always ensure you understand the reason for the test failure and ensure your production code is not at fault

Running Tests in a Loop
    describe MyClass do
      100.times do
        describe "#some_method" do
          it "does something" do
            # ...
          end
        end
      end
    end

Running Tests in a Loop
    $ rspec ./spec/some_spec.rb --fail-fast

Non-Deterministic Failure
Failures that occur at seemingly random frequencies due to non-deterministic behavior of the code under test.

Debugging Non-Deterministic Failures

Unordered Queries
• Don’t assume queries return results in a specific order
• Unordered queries in PostgreSQL
  ◦ PostgreSQL returns results in non-deterministic order if the query is not explicitly sorted
    # Don't:
    expect(results).to eq [record_1, record_2]

    # Do:
    expect(results).to contain_exactly record_1, record_2
    expect(results).to match_array [record_1, record_2]
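Outside of RSpec matchers, the same idea can be shown with plain arrays. This is a sketch under assumptions: `Row` is a hypothetical stand-in for a database record, and `results` simulates one of the orders an unsorted query might return.

```ruby
# Hypothetical stand-in for a database record.
Row = Struct.new(:id, :name)

record_1 = Row.new(1, "first")
record_2 = Row.new(2, "second")

# One possible order from a query with no ORDER BY clause.
results = [record_2, record_1]

# An order-sensitive comparison fails for this ordering...
order_sensitive = results == [record_1, record_2]

# ...while comparing after sorting both sides ignores row order.
order_insensitive = results.sort_by(&:id) == [record_1, record_2].sort_by(&:id)

puts "order-sensitive match: #{order_sensitive}"
puts "order-insensitive match: #{order_insensitive}"
```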

Frozen Time
• Creating records in frozen time
  ◦ All records have the same created_at time
  ◦ Queries ordered by created_at will return results in non-deterministic order
• Prefer Timecop.travel over Timecop.freeze
• Only freeze time when precise time is needed
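The tie problem can be reproduced without a database. In this sketch, `Event` is a hypothetical record type and `frozen_now` plays the role of the frozen clock. Sorting by the timestamp alone leaves the order of tied rows unspecified, so a secondary key is added as a tiebreaker.

```ruby
# Hypothetical record type; every record gets the same frozen timestamp.
Event = Struct.new(:id, :created_at)

frozen_now = Time.utc(2017, 11, 10, 12, 0, 0)
events = [3, 1, 2].map { |id| Event.new(id, frozen_now) }

# All created_at values tie, so sorting on created_at alone gives no
# guarantee about the relative order of these records.
# Adding a unique secondary key makes the order deterministic.
ordered_ids = events.sort_by { |e| [e.created_at, e.id] }.map(&:id)

puts ordered_ids.inspect
```

When the production query really does need a stable order, the analogous fix is a secondary sort column in the query itself.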

Randomized Test Data
• Faker or other data generation or sampling methods return unexpected or unsupported data
  ◦ Non-alpha names (“D’Angelo”, “Doe-Smith”, “Mc Donald”)
  ◦ Invalid phone numbers, zip codes, unsupported states, etc.
• Output relevant randomized data in the test error message to make troubleshooting easier
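One way to surface the generated value is sketched below in plain Ruby. The name pool, the `valid_last_name?` rule, and the seed are all assumptions for illustration; Faker is deliberately not used so the example stays self-contained.

```ruby
# Hypothetical sample pool standing in for a data generator.
LAST_NAMES = ["Smith", "D'Angelo", "Doe-Smith", "Mc Donald"].freeze

# Hypothetical rule under test: letters, apostrophes, hyphens, and spaces.
def valid_last_name?(name)
  name.match?(/\A[A-Za-z' -]+\z/)
end

seed = 12345
name = LAST_NAMES.sample(random: Random.new(seed))

# Include the generated value and the seed in the failure message so a
# failing run can be reproduced from the CI log alone.
unless valid_last_name?(name)
  raise "expected #{name.inspect} to be a valid last name (seed: #{seed})"
end

puts "checked #{name.inspect} with seed #{seed}"
```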

Debugging “Unreproducible” Failures

Date and Time
• Tests only fail on weekends/holidays?
• Tests only fail at certain times of day?
• Use Timecop to travel to the date/time when the tests ran in CI
  ◦ Avant timecop-rspec gem: https://github.com/avantoss/timecop-rspec

UTC vs Local Date/Time
• `Date.today` uses system time zone
• `Date.current` uses application time zone

UTC vs Local Date/Time
    ENV["TZ"] = "UTC"
    Time.zone = "America/Chicago"

    early_morning_utc = Time.utc(2017, 11, 10, 2)
    Timecop.travel(early_morning_utc) do
      # This will fail:
      expect(Date.current).to eq Date.today
    end
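The mismatch can be reproduced with the standard library alone. This sketch does not use Rails or Timecop; it assumes a fixed `-06:00` offset as a stand-in for America/Chicago, which was on standard time on that date.

```ruby
require "date"

# 02:00 UTC on 2017-11-10 is still the evening of 2017-11-09 in Chicago.
early_morning_utc = Time.utc(2017, 11, 10, 2)

# Calendar date as seen by a system running in UTC.
utc_date = early_morning_utc.to_date

# Calendar date as seen in a zone six hours behind UTC.
chicago_date = early_morning_utc.getlocal("-06:00").to_date

puts "UTC date:     #{utc_date}"
puts "Chicago date: #{chicago_date}"
```

The two dates differ by a day for the first six hours after midnight UTC, which is the window in which the comparison above fails.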

SQL Date Comparisons
• Database queries comparing Dates to Times
  ◦ Never pass Time objects to SQL queries against Date columns
    MyModel.where('start_date <= ?', Time.now).to_sql
    #=> SELECT "my_models".*
    #   FROM "my_models"
    #   WHERE (start_date <= '2017-11-03 06:29:45')
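A small standard-library sketch of the difference between the two values is shown here. It does not call ActiveRecord; it only shows what a Time and a Date look like when rendered as text, on the assumption that the bound parameter ends up in the query in a similar form.

```ruby
require "date"

now = Time.utc(2017, 11, 3, 6, 29, 45)

# A Time carries hours, minutes, and seconds into the comparison.
as_time = now.strftime("%Y-%m-%d %H:%M:%S")

# Converting to a Date first drops the time-of-day component.
as_date = now.to_date.iso8601

puts "Time value: #{as_time}"
puts "Date value: #{as_date}"
```

Converting to a Date before building the query keeps the comparison a date-to-date one.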

Timeouts and Asynchronous JavaScript
• CI performance is often worse than your local machine
• Page load performance can vary widely based on application configuration and test ordering
• Increase timeouts for CI as needed
• Don’t use browser tests for performance testing

Timeouts and Asynchronous JavaScript
• Wait for pages to finish loading before interacting with them
  ◦ SitePrism load_validations: https://github.com/natritmeyer/site_prism#load-validations
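The waiting pattern can be sketched as a small polling helper. This is not SitePrism's or any browser driver's API; `wait_until` and its `timeout:` and `interval:` parameters are hypothetical names, and the blocks stand in for "has the page finished loading?".

```ruby
# Hypothetical helper: polls the block until it returns a truthy value or
# the timeout elapses. Returns true on success and false on timeout.
def wait_until(timeout: 2.0, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Simulated page that reports "loaded" on the third check.
attempts = 0
loaded = wait_until(timeout: 1.0, interval: 0.01) { (attempts += 1) >= 3 }

# A condition that never becomes true gives up after the timeout.
timed_out = wait_until(timeout: 0.05, interval: 0.01) { false }

puts "loaded after #{attempts} checks: #{loaded}"
puts "second wait succeeded: #{timed_out}"
```

Making the timeout a parameter lets a slower CI environment use a larger value than a local run without changing the test itself.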

Environmental Differences
• Compare CI configuration and setup to local
  ◦ Environment Variables
  ◦ Test setup or execution inconsistencies
• Database
  ◦ Seeds
  ◦ Migrations missing from schema or structure files

Environmental Differences
• Library versions or inconsistencies
• Operating System differences
• Use Docker for consistency

Strategies for Unreproducible Failures
• SSH into CI and try to reproduce
• Use common sense
  ◦ What are the probable causes of the failure?
• Check gem GitHub repos for related issues or changes
• Learn to use pry, byebug
• Incrementally narrow the scope of the defect

Strategies for Unreproducible Failures
• Know your test support code inside and out
• Look at failure trends over time
• Add logging

    describe "#the_end" do
      it "is just the beginning"

Takeaways
• Keep your builds green to avoid sadness
• Tests are code too
• Set realistic goals
• Celebrate success!

Get In Touch
Tim Mertens
GitHub: tmertens
Twitter: @rockfx01
https://github.com/tmertens/intermittent_test_failures
