Building a Canary Testing Framework


Iheanyi Ekechukwu Senior Software Engineer @ GitHub @kwuchu https://github.com/iheanyi

What is a canary?

Canary tests should mimic user behavior and test a specific
feature.

Canary tests (normally) should run against your production systems.

Opinion: Each product in your organization should have its own
canary running.

Is this an integration test?

Well, yes and no.

[ASCII art: a small bunny labeled "Integration Tests"]

[ASCII art: a much larger, muscular figure labeled "Canaries" standing next to the small bunny labeled "Integration Tests"]

You can do more than just CRUD operations (e.g., SSH,
shell commands)

Allow for more specificity and granularity for the execution environment

Allow for better observability into what’s going on within your
systems

So, how do we build a canary testing framework?

First, what are our necessary features?

Visualization of Test Output / Execution Status

Recording and alerting on metrics via Prometheus

Run all tests simultaneously over a set interval with a
timeout

Ability to write tests in a flexible, approachable language

Canary Framework Architecture Components

Go for the Framework

JavaScript for the Test VM

A database for storing test execution results

API & Web UI for visualizing test execution results from
our database

Why Go for the actual framework?

Concurrency is our friend

Decent support for embedding languages

Go API Clients

Why JavaScript for the scripting language?

Flexibility

Low barrier to entry

Embeddable within Go via Otto

Canary Daemon Architecture

Let’s Break This Down

The JavaScript VM

Allows us to write and run test scripts using JavaScript

Extensions to the framework (such as logging, http, etc.) are
attached to the VM

The Test Runner

Isolates and configures the VM, then runs the tests

Returns errors (if any) from the test execution

The Canary Daemon

Executes the config file and registers the tests to run

Repeatedly runs each test using the runner and saves the results to
the database

Also records and exports metrics, such as tests started, running,
and finished, via Prometheus

Serves the API and front-end for visualizing test execution results
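The daemon's core loop (run every test at once, repeat on an interval, bound each run with a timeout) can be tried with the standard library alone. The following is a minimal sketch under stated assumptions, not the framework's code: the `check` type, `runOnce`, `runRounds`, `runAll`, and `demoChecks` are hypothetical names, and the loop counts rounds instead of running forever so that it terminates.

```go
package main

import (
    "context"
    "fmt"
    "sync"
    "time"
)

// check is a hypothetical stand-in for one registered canary test.
type check struct {
    name      string
    frequency time.Duration // pause between runs
    timeout   time.Duration // upper bound for a single run
    run       func(ctx context.Context) error
}

// runOnce executes the check under its own timeout.
func runOnce(c check) error {
    ctx, cancel := context.WithTimeout(context.Background(), c.timeout)
    defer cancel()
    return c.run(ctx)
}

// runRounds runs the check a fixed number of times, sleeping for the
// check's frequency between runs, and returns how many runs failed.
// A real daemon would loop forever instead of counting rounds.
func runRounds(c check, rounds int) int {
    failures := 0
    for i := 0; i < rounds; i++ {
        if err := runOnce(c); err != nil {
            failures++
        }
        time.Sleep(c.frequency)
    }
    return failures
}

// runAll starts one goroutine per check so that all checks run
// simultaneously, then collects the failure count of each.
func runAll(checks []check, rounds int) map[string]int {
    var (
        mu      sync.Mutex
        wg      sync.WaitGroup
        results = make(map[string]int)
    )
    for _, c := range checks {
        wg.Add(1)
        go func(c check) {
            defer wg.Done()
            n := runRounds(c, rounds)
            mu.Lock()
            results[c.name] = n
            mu.Unlock()
        }(c)
    }
    wg.Wait()
    return results
}

// demoChecks returns one check that passes and one that outlives its timeout.
func demoChecks() []check {
    return []check{
        {
            name: "fast", frequency: 10 * time.Millisecond, timeout: time.Second,
            run: func(ctx context.Context) error { return nil },
        },
        {
            name: "slow", frequency: 10 * time.Millisecond, timeout: 20 * time.Millisecond,
            run: func(ctx context.Context) error {
                select {
                case <-time.After(time.Second):
                    return nil
                case <-ctx.Done():
                    return ctx.Err()
                }
            },
        },
    }
}

func main() {
    for name, failures := range runAll(demoChecks(), 2) {
        fmt.Printf("%s: %d failed runs\n", name, failures)
    }
}
```

The per-run `context.WithTimeout` is what keeps a hung check from blocking its own schedule; the check itself has to honor `ctx.Done()` for that to work.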

Okay, let’s walk through the code for a simple canary.

Loading Test Configurations

// internal/js/canary/config.go
// Load a JavaScript config script, returning all the TestConfigs that
// are defined in it. The given VM is left unchanged, but its context
// is available during parsing of the config script.
func Load(vm *otto.Otto, src io.Reader) (*Config, []*js.TestConfig, error) {
    configVM := vm.Copy() // avoid polluting the global namespace
    ctx := new(ctx)
    configVM.Set("settings", ctx.ottoFuncSettings)
    configVM.Set("file", ctx.ottoFuncFile)
    configVM.Set("register_test", ctx.ottoFuncRegisterTest)
    if err := jsctx.LoadStdLib(context.Background(), configVM, "std"); err != nil {
        return nil, nil, fmt.Errorf("can't load stdlib: %v", err)
    }
    source, err := ioutil.ReadAll(src)
    if err != nil {
        return nil, nil, fmt.Errorf("can't read config file: %v", err)
    }
    if _, err := configVM.Run(source); err != nil {
        return nil, nil, fmt.Errorf("can't apply configuration: %v", err)
    }
    if ctx.cfg == nil {
        ctx.cfg = new(Config)
    }
    return ctx.cfg, ctx.tests, nil
}

ctx := new(ctx)

// Config holds the global canary configuration.
type Config struct {
    Name string
}
type ctx struct {
    cfg   *Config
    tests []*js.TestConfig
}

// A TestConfig describes how a class of tests must be run.
type TestConfig struct {
    Name      string
    Script    *otto.Script
    Frequency time.Duration
    Timeout   time.Duration
}

configVM.Set("file", ctx.ottoFuncFile)
configVM.Set("register_test", ctx.ottoFuncRegisterTest)

// internal/js/canary/config.go
func (ctx *ctx) ottoFuncFile(call otto.FunctionCall) otto.Value {
    filename := ottoutil.String(call.Otto, call.Argument(0))
    data, err := ioutil.ReadFile(filename)
    if err != nil {
        ottoutil.Throw(call.Otto, err.Error())
    }
    v, err := otto.ToValue(string(data))
    if err != nil {
        ottoutil.Throw(call.Otto, err.Error())
    }
    return v
}

// internal/js/canary/config.go
func (ctx *ctx) ottoFuncRegisterTest(call otto.FunctionCall) otto.Value {
    cfg := new(testConfig)
    cfg.load(call.Otto, call.Argument(0))
    src := ottoutil.String(call.Otto, call.Argument(1))
    test := &js.TestConfig{
        Name:      cfg.Name,
        Frequency: cfg.Frequency,
        Timeout:   cfg.Timeout,
    }
    var err error
    test.Script, err = call.Otto.Compile("", src)
    if err != nil {
        ottoutil.Throw(call.Otto, err.Error())
    }
    ctx.tests = append(ctx.tests, test)
    return otto.UndefinedValue()
}
type testConfig struct {
    Name      string
    Frequency time.Duration
    Timeout   time.Duration
}

cfg := new(testConfig)
cfg.load(call.Otto, call.Argument(0))

// internal/js/canary/config.go
func (cfg *testConfig) load(vm *otto.Otto, config otto.Value) {
    ottoutil.LoadObject(vm, config, map[string]func(otto.Value) error{
        "name": func(v otto.Value) (err error) {
            cfg.Name, err = v.ToString()
            return
        },
        "frequency": func(v otto.Value) error {
            cfg.Frequency = ottoutil.Duration(vm, v)
            return nil
        },
        "timeout": func(v otto.Value) error {
            cfg.Timeout = ottoutil.Duration(vm, v)
            return nil
        },
    })
}

func (ctx *ctx) ottoFuncRegisterTest(call otto.FunctionCall) otto.Value {
    cfg := new(testConfig)
    cfg.load(call.Otto, call.Argument(0))
    src := ottoutil.String(call.Otto, call.Argument(1))
    test := &js.TestConfig{
        Name:      cfg.Name,
        Frequency: cfg.Frequency,
        Timeout:   cfg.Timeout,
    }
    var err error
    test.Script, err = call.Otto.Compile("", src)
    if err != nil {
        ottoutil.Throw(call.Otto, err.Error())
    }
    ctx.tests = append(ctx.tests, test)
    return otto.UndefinedValue()
}
type ctx struct {
    cfg   *Config
    tests []*js.TestConfig
}

Let’s look at an example config.js file.

settings({
    name: 'Example Canary',
});
var frequency = '10m';
var timeout = '10m';
register_test(
    {
        name: 'simple example test',
        frequency: frequency,
        timeout: timeout,
    },
    file('simple-example.js')
);
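The config passes `frequency` and `timeout` as strings such as '10m'. The slides do not show how `ottoutil.Duration` turns these into `time.Duration` values; one plausible approach, sketched here as an assumption, is Go's `time.ParseDuration`. The `testSettings` type and `parseSettings` helper are hypothetical names used only for illustration.

```go
package main

import (
    "fmt"
    "time"
)

// testSettings mirrors the three fields of the object passed to register_test.
type testSettings struct {
    Name      string
    Frequency time.Duration
    Timeout   time.Duration
}

// parseSettings is a hypothetical helper that validates the duration strings
// up front, so a typo in the config fails at load time rather than at run time.
func parseSettings(name, frequency, timeout string) (testSettings, error) {
    f, err := time.ParseDuration(frequency)
    if err != nil {
        return testSettings{}, fmt.Errorf("invalid frequency %q: %w", frequency, err)
    }
    t, err := time.ParseDuration(timeout)
    if err != nil {
        return testSettings{}, fmt.Errorf("invalid timeout %q: %w", timeout, err)
    }
    return testSettings{Name: name, Frequency: f, Timeout: t}, nil
}

func main() {
    s, err := parseSettings("simple example test", "10m", "10m")
    fmt.Println(s.Frequency, s.Timeout, err)

    _, err = parseSettings("broken", "soon", "10m")
    fmt.Println(err)
}
```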

How does config.Load even get called?

Let's create our daemon and call this function.

// cmd/canaryd/main.go
func main() {}

// cmd/canaryd/main.go
func main() {
    log.SetFormatter(&log.JSONFormatter{})
    log.SetOutput(os.Stdout)
    var (
        cfgPath = flag.String("cfg", "config.js", "path to a JS config file")
        workDir = flag.String("work.dir", ".", "directory from which to run, should match expectations about relative paths in cfg.file")
    )
    flag.Parse()
    if *workDir != "" {
        if err := os.Chdir(*workDir); err != nil {
            log.WithError(err).WithField("work.dir", *workDir).Fatal("can't change dir")
        }
    }
    var (
        vm = otto.New()
    )
    canaryCfg, testCfgs := mustLoadConfigs(vm, *cfgPath)
}

func mustLoadConfigs(vm *otto.Otto, filename string) (*canary.Config, []*js.TestConfig) {
    cfg, err := os.Open(filename)
    if err != nil {
        log.WithError(err).WithField("filename", filename).Fatal("could not open configuration file")
    }
    defer cfg.Close()
    canaryConfig, testCfgs, err := canary.Load(vm, cfg)
    if err != nil {
        log.WithError(err).Fatal("cannot load canary test configs")
    }
    return canaryConfig, testCfgs
}

The Test Runner

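// internal/js/runner/runner.go
func Run(ctx context.Context, vm *otto.Otto, jsctx *js.Context, test *js.Test, id string) (err error) {
    testVM := vm.Copy()
    // Following otto's documented halting pattern: the VM that runs the script
    // needs its own buffered Interrupt channel, otherwise the send below blocks.
    testVM.Interrupt = make(chan func(), 1)
    done := make(chan struct{})
    defer close(done) // releases the watcher goroutine once the run is over
    // The interrupt function panics inside the VM; turn that panic into an error.
    defer func() {
        if caught := recover(); caught != nil {
            cerr, ok := caught.(error)
            if !ok || cerr != ctx.Err() {
                panic(caught) // unrelated panic, re-raise it
            }
            err = cerr
        }
    }()
    go func() {
        select {
        case <-ctx.Done():
            testVM.Interrupt <- func() { panic(ctx.Err()) }
        case <-done:
        }
    }()
    _, err = testVM.Run(test.Script)
    if oe, ok := err.(*otto.Error); ok {
        err = errors.New(oe.String())
    }
    return err
}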
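The cancellation mechanics of the runner (a watcher goroutine that waits on either the context or a done channel, and an interrupt delivered as a function that panics) can be exercised without otto. This is a sketch under assumptions: `toyVM` is a hypothetical stand-in for a script VM that polls an interrupt channel between steps, and `runWithContext` and `runForMillis` are illustrative names, not part of the framework.

```go
package main

import (
    "context"
    "errors"
    "fmt"
    "time"
)

// toyVM is a hypothetical stand-in for a script VM. Between steps it checks
// its interrupt channel and, if a function is waiting there, calls it.
type toyVM struct {
    interrupt chan func()
}

// run executes the given number of steps; a negative count loops forever.
func (vm *toyVM) run(steps int) {
    for i := 0; steps < 0 || i < steps; i++ {
        select {
        case f := <-vm.interrupt:
            f() // expected to panic and unwind out of run
        default:
        }
        time.Sleep(time.Millisecond) // pretend to do some work
    }
}

// runWithContext runs the toy VM and stops it once ctx is done.
func runWithContext(ctx context.Context, steps int) (err error) {
    vm := &toyVM{interrupt: make(chan func(), 1)} // buffered so the send never blocks
    done := make(chan struct{})
    defer close(done) // releases the watcher when the run ends first

    defer func() {
        if caught := recover(); caught != nil {
            cerr, ok := caught.(error)
            if !ok || cerr != ctx.Err() {
                panic(caught) // unrelated panic, re-raise it
            }
            err = cerr
        }
    }()

    go func() {
        select {
        case <-ctx.Done():
            vm.interrupt <- func() { panic(ctx.Err()) }
        case <-done:
        }
    }()

    vm.run(steps)
    return nil
}

// runForMillis wraps runWithContext with a timeout given in milliseconds.
func runForMillis(timeoutMillis, steps int) error {
    ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMillis)*time.Millisecond)
    defer cancel()
    return runWithContext(ctx, steps)
}

func main() {
    err := runForMillis(20, -1) // endless script, short timeout
    fmt.Println(errors.Is(err, context.DeadlineExceeded))
    fmt.Println(runForMillis(1000, 3)) // short script, generous timeout
}
```

The done channel matters as much as the timeout: without it, every run that finishes early would leave a goroutine waiting on the context.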

Extending the VM with logging, HTTP, and standard library functionality

Standard Library Functions

// internal/js/context/stdlib.go
func LoadStdLib(ctx context.Context, vm *otto.Otto, pkgname string) error {
    v, err := vm.Run(`({})`)
    if err != nil {
        return err
    }
    pkg := v.Object()
    for name, subpkg := range map[string]func(*otto.Otto) (otto.Value, error){
        "time": (&timePkg{ctx: ctx}).load,
        "os":   (&osPkg{ctx: ctx}).load,
        "do":   (newDoPkg(ctx)).load,
    } {
        v, err := subpkg(vm)
        if err != nil {
            return fmt.Errorf("can't load package %q: %v", name, err)
        }
        if err := pkg.Set(name, v); err != nil {
            return fmt.Errorf("can't set package %q: %v", name, err)
        }
    }
    return vm.Set(pkgname, pkg)
}

type osPkg struct {
    ctx context.Context
}
func (svc *osPkg) load(vm *otto.Otto) (otto.Value, error) {
    return ottoutil.ToPkg(vm, map[string]func(all otto.FunctionCall) otto.Value{
        "getenv": svc.getenv,
    }), nil
}
func (svc *osPkg) getenv(all otto.FunctionCall) otto.Value {
    vm := all.Otto
    key := ottoutil.String(vm, all.Argument(0))
    v, err := otto.ToValue(os.Getenv(key))
    if err != nil {
        ottoutil.Throw(vm, "can't set string value: %v", err)
    }
    return v
}

var secret = std.os.getenv("SOME_SECRET_VARIABLE");

HTTP Support

// internal/js/context/http.go
// LoadHTTP loads an HTTP package in the VM that sends HTTP requests using the
// given client.
func LoadHTTP(vm *otto.Otto, pkgname string, client *http.Client, cfgReq func(*http.Request) *http.Request) error {
    v, err := (&httpPkg{client: client, cfgReq: cfgReq}).load(vm)
    if err != nil {
        return err
    }
    return vm.Set(pkgname, v)
}

type httpPkg struct {
    client *http.Client
    cfgReq func(*http.Request) *http.Request
}
func (hpkg *httpPkg) load(vm *otto.Otto) (otto.Value, error) {
    v, err := vm.Run(`({})`)
    if err != nil {
        return q, err
    }
    pkg := v.Object()
    for name, method := range map[string]func(all otto.FunctionCall) otto.Value{
        "do": hpkg.do,
    } {
        if err := pkg.Set(name, method); err != nil {
            return q, fmt.Errorf("can't set method %q, %v", name, err)
        }
    }
    return pkg.Value(), nil
}

func (hpkg *httpPkg) do(all otto.FunctionCall) otto.Value {
    vm := all.Otto
    var (
        method  = ottoutil.String(vm, all.Argument(0))
        url     = ottoutil.String(vm, all.Argument(1))
        headers = ottoutil.StringMapSlice(vm, all.Argument(2))
        body    = ottoutil.String(vm, all.Argument(3))
    )
    req, err := http.NewRequest(method, url, strings.NewReader(body))
    if err != nil {
        ottoutil.Throw(vm, err.Error())
    }
    for k, vals := range headers {
        for _, val := range vals {
            req.Header.Add(k, val)
        }
    }
    resp, err := hpkg.client.Do(hpkg.cfgReq(req))
    if err != nil {
        ottoutil.Throw(vm, err.Error())
    }
    defer resp.Body.Close()
    respBody, err := ioutil.ReadAll(resp.Body)
    if err != nil {
        ottoutil.Throw(vm, err.Error())
    }
    v, err := vm.Run(`({})`)
    if err != nil {
        ottoutil.Throw(vm, err.Error())
    }
    pkg := v.Object()
    for name, value := range map[string]interface{}{
        "code":    resp.StatusCode,
        "headers": map[string][]string(resp.Header),
        "body":    string(respBody),
    } {
        if err := pkg.Set(name, value); err != nil {
            ottoutil.Throw(vm, err.Error())
        }
    }
    return v
}

var response = http.do('GET', 'https://icanhazip.com', {});
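The object a script gets back from `http.do` carries three fields: `code`, `headers`, and `body`. To see that shape without a JavaScript VM, here is a standard-library sketch. The `response` type, the `do` function, and `fetchFromTestServer` are hypothetical names, and the request goes to a local `httptest` server rather than a real site.

```go
package main

import (
    "context"
    "fmt"
    "io"
    "net/http"
    "net/http/httptest"
    "strings"
    "time"
)

// response holds the three fields exposed to scripts: code, headers, body.
type response struct {
    Code    int
    Headers map[string][]string
    Body    string
}

// do sends one request bound to ctx and reads the whole response body.
func do(ctx context.Context, client *http.Client, method, url string, headers map[string][]string, body string) (*response, error) {
    req, err := http.NewRequestWithContext(ctx, method, url, strings.NewReader(body))
    if err != nil {
        return nil, err
    }
    for k, vals := range headers {
        for _, v := range vals {
            req.Header.Add(k, v)
        }
    }
    resp, err := client.Do(req)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    data, err := io.ReadAll(resp.Body)
    if err != nil {
        return nil, err
    }
    return &response{Code: resp.StatusCode, Headers: resp.Header, Body: string(data)}, nil
}

// fetchFromTestServer starts a throwaway local server that answers with a
// fixed documentation IP address and queries it through do.
func fetchFromTestServer() (*response, error) {
    srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintln(w, "203.0.113.7")
    }))
    defer srv.Close()

    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()
    return do(ctx, srv.Client(), "GET", srv.URL, nil, "")
}

func main() {
    resp, err := fetchFromTestServer()
    if err != nil {
        fmt.Println("request failed:", err)
        return
    }
    fmt.Println(resp.Code, strings.TrimSpace(resp.Body))
}
```

Binding the request to a context is what lets a test timeout also abort an in-flight HTTP call.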

Structured Logging

// internal/js/context/log.go
// LoadLog loads a log package in the VM that logs to the given logger.
func LoadLog(vm *otto.Otto, pkgname string, ll log.FieldLogger) error {
    // Set up the logging formatter to emit structured JSON.
    log.SetFormatter(&log.JSONFormatter{})
    // Output to stdout for capturing.
    log.SetOutput(os.Stdout)
    v, err := (&logger{ll: ll}).load(vm)
    if err != nil {
        return err
    }
    return vm.Set(pkgname, v)
}

type logger struct { ll log.FieldLogger }

func (ll *logger) load(vm *otto.Otto) (otto.Value, error) {
    v, err := vm.Run(`({})`)
    if err != nil {
        return q, err
    }
    pkg := v.Object()
    for name, method := range map[string]func(all otto.FunctionCall) otto.Value{
        "kv":    ll.kv,
        "info":  ll.info,
        "error": ll.error,
        "fail":  ll.fail,
    } {
        if err := pkg.Set(name, method); err != nil {
            return q, fmt.Errorf("can't set method %q, %v", name, err)
        }
    }
    return pkg.Value(), nil
}

func (ll *logger) kv(all otto.FunctionCall) otto.Value {
    vm := all.Otto
    var child *log.Entry
    switch {
    case all.Argument(0).IsObject():
        // log.kv({key: value, ...})
        obj := all.Argument(0).Object()
        keys := obj.Keys()
        f := make(log.Fields, len(keys))
        for _, key := range keys {
            v, err := obj.Get(key)
            if err != nil {
                ottoutil.Throw(vm, err.Error())
            }
            gov, err := v.Export()
            if err != nil {
                ottoutil.Throw(vm, err.Error())
            }
            f[key] = gov
        }
        child = ll.ll.WithFields(f)
    case len(all.ArgumentList)%2 == 0:
        // log.kv('key1', value1, 'key2', value2, ...)
        args := all.ArgumentList
        f := make(log.Fields, len(args)/2)
        for i := 0; i < len(args); i += 2 {
            k := ottoutil.String(vm, args[i])
            v, err := args[i+1].Export()
            if err != nil {
                ottoutil.Throw(vm, err.Error())
            }
            f[k] = v
        }
        child = ll.ll.WithFields(f)
    default:
        ottoutil.Throw(vm, "invalid call to log.kv")
    }
    v, err := (&logger{ll: child}).load(vm)
    if err != nil {
        ottoutil.Throw(vm, err.Error())
    }
    return v
}

var q = otto.UndefinedValue()
func (ll *logger) info(all otto.FunctionCall) otto.Value {
    vm := all.Otto
    msg := ottoutil.String(vm, all.Argument(0))
    ll.ll.Info(msg)
    return q
}

func (ll *logger) fail(all otto.FunctionCall) otto.Value {
    vm := all.Otto
    msg := ottoutil.String(vm, all.Argument(0))
    ll.ll.Error(msg)
    ottoutil.Throw(all.Otto, msg)
    return q
}

log.kv('foo', 'bar').info('this is an example message');
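A call like `log.kv('foo', 'bar')` takes the "alternating key/value arguments" branch of `kv`. That conversion is easy to check in isolation. The following sketch uses a hypothetical `fieldsFromPairs` helper and a plain map in place of the logging library's field type; unlike the original, it rejects non-string keys instead of converting them.

```go
package main

import (
    "errors"
    "fmt"
)

// fieldsFromPairs converts alternating key/value arguments into a field map.
// It returns an error for an odd argument count or a non-string key.
func fieldsFromPairs(args ...interface{}) (map[string]interface{}, error) {
    if len(args)%2 != 0 {
        return nil, errors.New("expected an even number of arguments")
    }
    fields := make(map[string]interface{}, len(args)/2)
    for i := 0; i < len(args); i += 2 {
        key, ok := args[i].(string)
        if !ok {
            return nil, fmt.Errorf("key at position %d is not a string", i)
        }
        fields[key] = args[i+1]
    }
    return fields, nil
}

func main() {
    fields, err := fieldsFromPairs("foo", "bar", "attempt", 3)
    fmt.Println(fields, err)

    _, err = fieldsFromPairs("dangling")
    fmt.Println(err)
}
```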

Let’s add them to the VM in the runner

func Run(ctx context.Context, vm *otto.Otto, jsctx *js.Context, test *js.Test, id string) error {
    testVM := vm.Copy()
    reqConfig := func(req *http.Request) *http.Request {
        return req.WithContext(ctx)
    }
    if err := jscontext.LoadStdLib(ctx, testVM, "std"); err != nil {
        return fmt.Errorf("can't setup std package in VM: %v", err)
    }
    if err := jscontext.LoadHTTP(testVM, "http", jsctx.HTTPClient, reqConfig); err != nil {
        return fmt.Errorf("can't setup HTTP package in VM: %v", err)
    }
    if err := jscontext.LoadLog(testVM, "log", jsctx.Log); err != nil {
        return fmt.Errorf("can't setup LOG package in VM: %v", err)
    }
    …
}

Let's add the runner to the daemon.

func main() {
    log.SetFormatter(&log.JSONFormatter{})
    // Output to stdout for capturing.
    log.SetOutput(os.Stdout)
    var (
        cfgPath = flag.String("cfg", "config.js", "path to a JS config file")
        workDir = flag.String("work.dir", ".", "directory from which to run, should match expectations about relative paths in cfg.file")
    )
    flag.Parse()
    if *workDir != "" {
        if err := os.Chdir(*workDir); err != nil {
            log.WithError(err).WithField("work.dir", *workDir).Fatal("can't change dir")
        }
    }
    var (
        ctx = context.Background()
        vm  = otto.New()
    )
    canaryCfg, testCfgs := mustLoadConfigs(vm, *cfgPath)
    launchTests(vm, canaryCfg, testCfgs)
    // Block forever because we want the tests to run forever.
    select {}
}

// The database and metrics arguments are added to these two functions in later sections.
func launchTests(vm *otto.Otto, config *canary.Config, configs []*js.TestConfig) {
    for _, cfg := range configs {
        go runTestForever(vm, cfg, cfg.Test())
    }
}

func runTestForever(vm *otto.Otto, cfg *js.TestConfig, test *js.Test) {
    ll := log.WithFields(log.Fields{
        "test.name": test.Name,
    })
    for {
        go func(vm *otto.Otto) {
            testID := uuid.New()
            // Shadow ll so each run gets its own logger instead of
            // mutating the one shared by all runs of this test.
            ll := ll.WithField("test.id", testID)
            testCtx := &js.Context{
                Log: ll,
                HTTPClient: &http.Client{
                    Transport: http.DefaultTransport,
                },
            }
            ctx, cancel := context.WithTimeout(context.Background(), cfg.Timeout)
            defer cancel()
            terr := runner.Run(ctx, vm, testCtx, test, testID)
            if terr != nil {
                ll.WithError(terr).Error("test failed")
            }
        }(vm.Copy()) // copy VM to avoid polluting global namespace
        // We'll run the test again after the duration we defined.
        time.Sleep(cfg.Frequency)
    }
}

Saving Test Execution Results to a Database

// internal/db/db.go
type CanaryStore interface {
    StartTest(id string, testName string, startTime time.Time) (*TestInstance, error)
    EndTest(test *TestInstance, failure error, endAt time.Time) error
    ListTests() ([]TestInstance, error)
    ListOngoingTests() ([]TestInstance, error)
    FindTestByID(id string) (*TestInstance, error)
    Close() error
}

// internal/db/models.go
// TestInstance collects details about the instance of a unique
// test execution.
type TestInstance struct {
    TestID    string    `json:"id,omitempty"`
    TestName  string    `json:"name,omitempty"`
    StartAt   time.Time `json:"start_at,omitempty"`
    EndAt     time.Time `json:"end_at,omitempty"`
    Pass      bool      `json:"pass,omitempty"`
    FailCause string    `json:"fail_cause,omitempty"`
}
// BoltTestInstance is what gets serialized and saved to the Bolt database. The only
// difference is that we're going to be using strings for StartAt and EndAt.
type BoltTestInstance struct {
    TestID    string `json:"id,omitempty"`
    TestName  string `json:"name,omitempty"`
    StartAt   string `json:"start_at,omitempty"`
    EndAt     string `json:"end_at,omitempty"`
    Pass      bool   `json:"pass,omitempty"`
    FailCause string `json:"fail_cause,omitempty"`
}

var _ CanaryStore = (*boltStore)(nil)
type boltStore struct {
    db        *bolt.DB
    ongoingMu sync.RWMutex
    ongoing   map[string]TestInstance
    cancel    context.CancelFunc
}

// internal/db/db.go
func (db *boltStore) StartTest(id string, testName string, startTime time.Time) (*TestInstance, error) {
    test := &TestInstance{
        TestID:   id,
        TestName: testName,
        StartAt:  startTime.UTC(),
    }
    // keep in memory until it's finished
    db.ongoingMu.Lock()
    db.ongoing[id] = *test
    db.ongoingMu.Unlock()
    return test, nil
}

func (db *boltStore) EndTest(test *TestInstance, failure error, endAt time.Time) error {
    db.ongoingMu.Lock()
    defer db.ongoingMu.Unlock()
    t, ok := db.ongoing[test.TestID]
    if !ok {
        return fmt.Errorf("test with ID does not exist: %q", test.TestID)
    }
    delete(db.ongoing, test.TestID)
    t.Pass = failure == nil
    if failure != nil {
        t.FailCause = failure.Error()
    }
    t.EndAt = endAt
    return insertTest(db.db, &t)
}

func insertTest(db *bolt.DB, test *TestInstance) error {
    return db.Update(func(tx *bolt.Tx) error {
        b := tx.Bucket(testsBucket)
        // Make this something that is saveable by the database.
        dbTest := &BoltTestInstance{
            TestID:    test.TestID,
            TestName:  test.TestName,
            StartAt:   test.StartAt.UTC().Format(time.RFC3339),
            EndAt:     test.EndAt.UTC().Format(time.RFC3339), // was test.StartAt, which stored the start time twice
            Pass:      test.Pass,
            FailCause: test.FailCause,
        }
        // Marshal and save the encoded test.
        if buf, err := json.Marshal(dbTest); err != nil {
            return err
        } else if err := b.Put([]byte(dbTest.TestID), buf); err != nil {
            return err
        }
        return nil
    })
}

Let’s add it to the daemon.

// cmd/canaryd/main.go
func main() {
    log.SetFormatter(&log.JSONFormatter{})
    // Output to stdout for capturing.
    log.SetOutput(os.Stdout)
    var (
        cfgPath = flag.String("cfg", "config.js", "path to a JS config file")
        workDir = flag.String("work.dir", ".", "directory from which to run, should match expectations about relative paths in cfg.file")
        dbPath  = flag.String("db.file", "canary.db", "file for the canary database")
    )
    flag.Parse()
    var (
        ctx = context.Background()
        vm  = otto.New()
    )
    db := mustOpenBolt(*dbPath)
    defer db.Close()
    canaryCfg, testCfgs := mustLoadConfigs(vm, *cfgPath)
    launchTests(db, vm, canaryCfg, testCfgs)
    // Block forever because we want the tests to run forever.
    select {}
}

func mustOpenBolt(path string) dbpkg.CanaryStore {
    db, err := dbpkg.NewBoltStore(path)
    if err != nil {
        log.WithError(err).Fatal("can't open database")
    }
    return db
}

*js.Test) {
    …
    for {
        go func(vm *otto.Otto) {
            testID := uuid.New()
            // Shadow ll so each run gets its own logger.
            ll := ll.WithField("test.id", testID)
            testCtx := &js.Context{
                Log: ll,
                HTTPClient: &http.Client{
                    Transport: http.DefaultTransport,
                },
            }
            dbtest, err := db.StartTest(
                testID,
                test.Name,
                time.Now(),
            )
            if err != nil {
                ll.WithError(err).Error("could not start the test")
                return // dbtest is nil here, so there is nothing to run or end
            }
            ctx, cancel := context.WithTimeout(context.Background(), cfg.Timeout)
            defer cancel()
            terr := runner.Run(ctx, vm, testCtx, test, testID)
            if terr != nil {
                ll.WithError(terr).Error("test failed")
            }
            if err := db.EndTest(dbtest, terr, time.Now()); err != nil {
                ll.WithError(err).Error("couldn't mark test as being ended")
            }
        }(vm.Copy()) // copy VM to avoid polluting global namespace
        // We'll run the test again after the duration we defined.
        time.Sleep(cfg.Frequency)
    }
}

Front-End for Viewing Test Execution Results

// internal/app/serve.go
import (
    "github.com/99designs/gqlgen/handler"
    "github.com/gorilla/mux"
    "github.com/iheanyi/simple-canary/internal/db"
    "github.com/sirupsen/logrus"
)
// App is an instance of the dashboard for the canary.
type App struct {
    l  logrus.FieldLogger
    db db.CanaryStore
}
func New(db db.CanaryStore, r *mux.Router) *App {
    app := &App{
        l:  logrus.WithField("component", "app"),
        db: db,
    }
    r.Handle("/", handler.Playground("GraphQL Playground", "/query"))
    r.Handle("/query", handler.GraphQL(NewExecutableSchema(Config{Resolvers: &Resolver{
        db: db,
    }})))
    return app
}

type TestInstance {
    id: ID!
    name: String!
    start_at: Time!
    end_at: Time
    pass: Boolean
    fail_cause: String
}
type Query {
    tests: [TestInstance!]!
    test(id: String!): TestInstance
    ongoingTests: [TestInstance!]!
}
scalar Time

type Resolver struct {
    db dbpkg.CanaryStore
}
func (r *Resolver) Query() QueryResolver {
    return &queryResolver{r}
}
func (r *Resolver) TestInstance() TestInstanceResolver {
    return &testInstanceResolver{r}
}
type queryResolver struct{ *Resolver }
func (r *queryResolver) Tests(ctx context.Context) ([]dbpkg.TestInstance, error) {
    tests, err := r.db.ListTests()
    return tests, err
}
func (r *queryResolver) Test(ctx context.Context, id string) (*dbpkg.TestInstance, error) {
    test, err := r.db.FindTestByID(id)
    return test, err
}
func (r *queryResolver) OngoingTests(ctx context.Context) ([]dbpkg.TestInstance, error) {
    tests, err := r.db.ListOngoingTests()
    return tests, err
}
type testInstanceResolver struct{ *Resolver }
func (r *testInstanceResolver) ID(ctx context.Context, obj *dbpkg.TestInstance) (string, error) {
    return obj.TestID, nil
}
func (r *testInstanceResolver) Name(ctx context.Context, obj *dbpkg.TestInstance) (string, error) {
    return obj.TestName, nil
}
func (r *testInstanceResolver) StartAt(ctx context.Context, obj *dbpkg.TestInstance) (time.Time, error) {
    return obj.StartAt, nil
}
func (r *testInstanceResolver) EndAt(ctx context.Context, obj *dbpkg.TestInstance) (*time.Time, error) {
    return &obj.EndAt, nil
}
func (r *testInstanceResolver) FailCause(ctx context.Context, obj *dbpkg.TestInstance) (*string, error) {
    return &obj.FailCause, nil
}

Back to the daemon.

func main() {
    log.SetFormatter(&log.JSONFormatter{})
    // Output to stdout for capturing.
    log.SetOutput(os.Stdout)
    var (
        …
        listenHost = flag.String("listen.host", "", "interface on which to listen")
        listenPort = flag.String("listen.port", "8080", "port on which to listen")
    )
    …
    l := mustListen(*listenHost, *listenPort)
    db := mustOpenBolt(*dbPath)
    defer db.Close()
    if err := launchHTTP(ctx, l, db); err != nil {
        log.WithError(err).Fatal("can't launch http server")
    }
    …
}

func launchHTTP(
    ctx context.Context,
    l net.Listener,
    db dbpkg.CanaryStore,
) error {
    addr := l.Addr().(*net.TCPAddr)
    host, err := os.Hostname()
    if err != nil {
        return fmt.Errorf("can't get hostname: %v", err)
    }
    host = net.JoinHostPort(host, strconv.Itoa(addr.Port))
    r := mux.NewRouter().Host(host).Subrouter()
    _ = app.New(db, r)
    log.WithField("host", host).Info("API starting")
    go http.Serve(l, r)
    return nil
}

Last, but not least, metrics.

// internal/metrics/metrics.go
package metrics
import (
    "net/http"
    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)
// Node is a wrapper around Prometheus's registerer interface.
type Node struct {
    registry prometheus.Registerer
}
// Prometheus returns an instance of the metrics node and the prometheus
// handler.
func Prometheus() (*Node, http.Handler) {
    registry := prometheus.NewRegistry()
    handler := promhttp.HandlerFor(registry, promhttp.HandlerOpts{})
    registry.MustRegister(prometheus.NewProcessCollector(prometheus.ProcessCollectorOpts{}))
    registry.MustRegister(prometheus.NewGoCollector())
    return &Node{
        registry: registry,
    }, handler
}

// cmd/canaryd/main.go
func main() {
    …
    met, hdl := metrics.Prometheus()
    …
    if err := launchHTTP(ctx, l, hdl, db); err != nil {
        log.WithError(err).Fatal("can't launch http server")
    }
    canaryCfg, testCfgs := mustLoadConfigs(vm, *cfgPath)
    launchTests(db, met, vm, canaryCfg, testCfgs)
    // Block forever because we want the tests to run forever.
    select {}
}

func launchHTTP(
    ctx context.Context,
    l net.Listener,
    promhdl http.Handler,
    db dbpkg.CanaryStore,
) error {
    addr := l.Addr().(*net.TCPAddr)
    host, err := os.Hostname()
    if err != nil {
        return fmt.Errorf("can't get hostname: %v", err)
    }
    host = net.JoinHostPort(host, strconv.Itoa(addr.Port))
    r := mux.NewRouter().Host(host).Subrouter()
    _ = app.New(db, r)
    r.PathPrefix("/metrics").Handler(promhdl)
    log.WithField("host", host).Info("API starting")
    go http.Serve(l, r)
    return nil
}

*js.Test) {
    ll := log.WithFields(log.Fields{
        "test.name": test.Name,
    })
    tmet := met.Labels(map[string]string{
        "test_name": test.Name,
    })
    var (
        started  = tmet.Counter("test_started_count", "Number of tests that were started")
        finished = tmet.Counter("test_finished_count", "Number of tests that have finished", "result")
        running  = tmet.Gauge("test_running_total", "Tests that are currently running")
        _        = tmet.Summary("test_duration_seconds", "Duration of tests", []float64{0.5, 0.75, 0.9, 0.99, 1.0}, "result")
    )
    …
}

*js.Test) {
    …
    for {
        go func(vm *otto.Otto) {
            …
            started.WithLabelValues().Add(1)
            running.Add(1)
            defer running.Add(-1)
            ctx, cancel := context.WithTimeout(context.Background(), cfg.Timeout)
            defer cancel()
            terr := runner.Run(ctx, vm, testCtx, test, testID)
            if terr != nil {
                finished.With(prometheus.Labels{"result": "fail"}).Add(1)
                ll.WithError(terr).Error("test failed")
            } else {
                finished.With(prometheus.Labels{"result": "pass"}).Add(1)
            }
            if err := db.EndTest(dbtest, terr, time.Now()); err != nil {
                ll.WithError(err).Error("couldn't mark test as being ended")
            }
        }(vm.Copy()) // copy VM to avoid polluting global namespace
        // We'll run the test again after the duration we defined.
        time.Sleep(cfg.Frequency)
    }
}
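The bookkeeping around each run follows a fixed order: count the start, raise the running gauge, lower it again when the run returns, and count the outcome under a pass or fail label. To check that order without the Prometheus client, the sketch below swaps in plain atomic counters from the standard library. `testMetrics`, `observe`, `snapshot`, and `demoCounts` are hypothetical names, and atomic integers are a stand-in here, not how the framework exports metrics.

```go
package main

import (
    "errors"
    "fmt"
    "sync/atomic"
)

// testMetrics holds the four values tracked per test in this sketch.
type testMetrics struct {
    started int64 // runs begun
    running int64 // runs currently in flight
    passed  int64 // runs finished without error
    failed  int64 // runs finished with an error
}

// observe wraps a single run with the same bookkeeping order as the daemon:
// started and running go up first, running comes back down on return, and
// exactly one of passed or failed is incremented.
func (m *testMetrics) observe(run func() error) error {
    atomic.AddInt64(&m.started, 1)
    atomic.AddInt64(&m.running, 1)
    defer atomic.AddInt64(&m.running, -1)

    err := run()
    if err != nil {
        atomic.AddInt64(&m.failed, 1)
    } else {
        atomic.AddInt64(&m.passed, 1)
    }
    return err
}

// snapshot reads all four values.
func (m *testMetrics) snapshot() (started, running, passed, failed int64) {
    return atomic.LoadInt64(&m.started),
        atomic.LoadInt64(&m.running),
        atomic.LoadInt64(&m.passed),
        atomic.LoadInt64(&m.failed)
}

// demoCounts observes two passing runs and one failing run.
func demoCounts() (started, running, passed, failed int64) {
    m := &testMetrics{}
    m.observe(func() error { return nil })
    m.observe(func() error { return errors.New("boom") })
    m.observe(func() error { return nil })
    return m.snapshot()
}

func main() {
    started, running, passed, failed := demoCounts()
    fmt.Println(started, running, passed, failed)
}
```

A useful invariant for alerting follows from this order: once nothing is running, started equals passed plus failed.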

We're done! Let's see a demo.

https://github.com/iheanyi/simple-canary

Special thanks to Antoine Grondin

Questions?

Thank you! @kwuchu
