
Building a Canary Testing Framework


Originally given at Strange Loop in 2018, this talk walks through what canary testing is and how to build a simple canary testing framework.


Iheanyi Ekechukwu

September 28, 2018

Transcript

  1. Building a Canary Testing Framework Iheanyi Ekechukwu

  2. Iheanyi Ekechukwu, Senior Software Engineer @ GitHub, @kwuchu, https://github.com/iheanyi

  3. What is a canary?

  5. Canary tests should mimic user behavior and test a specific feature.
  6. Canary tests (normally) should run against your production systems.

  7. Opinion: Each product in your organization should have its own canary running.
  8. Is this an integration test?

  9. Well, yes and no.

  10. [ASCII art: a small bunny labeled "Integration Tests"]

  12. [ASCII art: a much larger figure labeled "Canaries" next to the small bunny labeled "Integration Tests"]
  13. You can do more than just CRUD operations (ex: SSH, shell commands)
  14. Allow for more specificity and granularity for the execution environment

  15. Allow for better observability into what’s going on within your systems
  16. So, how do we build a canary testing framework?

  17. First, what are our necessary features?

  18. Visualization of Test Output / Execution Status

  19. Recording and alerting on metrics via Prometheus

  20. Run all tests simultaneously over a set interval with a timeout
  21. Ability to write tests in a flexible, approachable language
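
The scheduling requirement in this feature list (every test running at the same time, on its own interval, with each run bounded by a timeout) can be sketched with the Go standard library alone. This is only an illustration under assumptions: the `check` type, `runOnce`, `runAll`, and `demoChecks` are hypothetical names, and the loop stops after a fixed number of runs so the program terminates, whereas a daemon would loop forever.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// check is a hypothetical stand-in for one canary test.
type check struct {
	name      string
	frequency time.Duration // pause between runs
	timeout   time.Duration // upper bound for a single run
	run       func(ctx context.Context) error
}

// runOnce executes c a single time, cancelling its context after c.timeout.
func runOnce(c check) error {
	ctx, cancel := context.WithTimeout(context.Background(), c.timeout)
	defer cancel()
	return c.run(ctx)
}

// runAll starts one goroutine per check so that all checks run at the same
// time. Each check is run `runs` times and failures are counted per name.
func runAll(checks []check, runs int) map[string]int {
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		failures = make(map[string]int)
	)
	for _, c := range checks {
		wg.Add(1)
		go func(c check) {
			defer wg.Done()
			for i := 0; i < runs; i++ {
				if err := runOnce(c); err != nil {
					mu.Lock()
					failures[c.name]++
					mu.Unlock()
				}
				time.Sleep(c.frequency)
			}
		}(c)
	}
	wg.Wait()
	return failures
}

// demoChecks returns one check that passes and one that always times out.
func demoChecks() []check {
	return []check{
		{
			name: "fast", frequency: time.Millisecond, timeout: 50 * time.Millisecond,
			run: func(ctx context.Context) error { return nil },
		},
		{
			name: "slow", frequency: time.Millisecond, timeout: 5 * time.Millisecond,
			run: func(ctx context.Context) error {
				select {
				case <-time.After(time.Second): // pretend to do slow work
					return nil
				case <-ctx.Done(): // the timeout fired first
					return ctx.Err()
				}
			},
		},
	}
}

func main() {
	failures := runAll(demoChecks(), 2)
	fmt.Println("fast failures:", failures["fast"])
	fmt.Println("slow failures:", failures["slow"])
}
```

A check only honors the timeout if it watches `ctx.Done()`, which is why the "slow" check selects on it.
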

  22. Canary Framework Architecture Components

  23. Go for the Framework

  24. JavaScript for the Test VM

  25. A database for storing test execution results

  26. API & Web UI for visualizing test execution results from our database
  27. Why Go for the actual framework?

  28. Concurrency is our friend

  29. Decent support for embedding languages

  30. Go API Clients

  31. Why JavaScript for the scripting language?

  32. Flexibility

  33. Low barrier to entry

  34. Embeddable within Go via Otto

  35. Canary Daemon Architecture

  36. Let’s Break This Down

  37. The JavaScript VM

  38. Allows us to write and run test scripts using JavaScript

  39. Extensions to the framework (such as logging, http, etc.) are attached to the VM
  41. The Test Runner

  42. Isolates and configures the VM, then runs the tests

  43. Returns errors (if any) from the test execution

  44. The Canary Daemon

  45. Executes the config file and registers the tests to run

  46. Repeatedly runs each test using the runner and saves the results to the database
  47. Also, records and exports metrics such as tests started, running, and finished via Prometheus.
  48. Serves the API and front-end for visualizing test execution results

  49. Okay, let’s walk through the code for a simple canary.

  51. Loading Test Configurations

  52. // internal/js/canary/config.go

    // Load a Javascript config script, returning all the TestConfigs that
    // are defined in it. The given VM is left unchanged, but its context
    // is available during parsing of the config script.
    func Load(vm *otto.Otto, src io.Reader) (*Config, []*js.TestConfig, error) {
        configVM := vm.Copy() // avoid polluting the global namespace

        ctx := new(ctx)
        configVM.Set("settings", ctx.ottoFuncSettings)
        configVM.Set("file", ctx.ottoFuncFile)
        configVM.Set("register_test", ctx.ottoFuncRegisterTest)

        if err := jsctx.LoadStdLib(context.Background(), configVM, "std"); err != nil {
            return nil, nil, fmt.Errorf("can't load stdlib: %v", err)
        }

        source, err := ioutil.ReadAll(src)
        if err != nil {
            return nil, nil, fmt.Errorf("can't read config file: %v", err)
        }
        if _, err := configVM.Run(source); err != nil {
            return nil, nil, fmt.Errorf("can't apply configuration: %v", err)
        }
        if ctx.cfg == nil {
            ctx.cfg = new(Config)
        }
        return ctx.cfg, ctx.tests, nil
    }
  53. ctx := new(ctx)

  54. // Config holds the global canary configuration.
    type Config struct {
        Name string
    }

    type ctx struct {
        cfg   *Config
        tests []*js.TestConfig
    }
  55. // A TestConfig describes how a class of tests must be run.
    type TestConfig struct {
        Name      string
        Script    *otto.Script
        Frequency time.Duration
        Timeout   time.Duration
    }
  57. configVM.Set("file", ctx.ottoFuncFile)
    configVM.Set("register_test", ctx.ottoFuncRegisterTest)

  58. // internal/js/canary/config.go
    func (ctx *ctx) ottoFuncFile(call otto.FunctionCall) otto.Value {
        filename := ottoutil.String(call.Otto, call.Argument(0))
        data, err := ioutil.ReadFile(filename)
        if err != nil {
            ottoutil.Throw(call.Otto, err.Error())
        }
        v, err := otto.ToValue(string(data))
        if err != nil {
            ottoutil.Throw(call.Otto, err.Error())
        }
        return v
    }
  59. // internal/js/canary/config.go
    func (ctx *ctx) ottoFuncRegisterTest(call otto.FunctionCall) otto.Value {
        cfg := new(testConfig)
        cfg.load(call.Otto, call.Argument(0))
        src := ottoutil.String(call.Otto, call.Argument(1))

        test := &js.TestConfig{
            Name:      cfg.Name,
            Frequency: cfg.Frequency,
            Timeout:   cfg.Timeout,
        }
        var err error
        test.Script, err = call.Otto.Compile("", src)
        if err != nil {
            ottoutil.Throw(call.Otto, err.Error())
        }
        ctx.tests = append(ctx.tests, test)
        return otto.UndefinedValue()
    }

    type testConfig struct {
        Name      string
        Frequency time.Duration
        Timeout   time.Duration
    }
  60. cfg := new(testConfig)
    cfg.load(call.Otto, call.Argument(0))

  61. // internal/js/canary/config.go
    func (cfg *testConfig) load(vm *otto.Otto, config otto.Value) {
        ottoutil.LoadObject(vm, config, map[string]func(otto.Value) error{
            "name": func(v otto.Value) (err error) {
                cfg.Name, err = v.ToString()
                return
            },
            "frequency": func(v otto.Value) error {
                cfg.Frequency = ottoutil.Duration(vm, v)
                return nil
            },
            "timeout": func(v otto.Value) error {
                cfg.Timeout = ottoutil.Duration(vm, v)
                return nil
            },
        })
    }
  62. func (ctx *ctx) ottoFuncRegisterTest(call otto.FunctionCall) otto.Value {
        cfg := new(testConfig)
        cfg.load(call.Otto, call.Argument(0))
        src := ottoutil.String(call.Otto, call.Argument(1))

        test := &js.TestConfig{
            Name:      cfg.Name,
            Frequency: cfg.Frequency,
            Timeout:   cfg.Timeout,
        }
        var err error
        test.Script, err = call.Otto.Compile("", src)
        if err != nil {
            ottoutil.Throw(call.Otto, err.Error())
        }
        ctx.tests = append(ctx.tests, test)
        return otto.UndefinedValue()
    }

    type ctx struct {
        cfg   *Config
        tests []*js.TestConfig
    }
  64. Let’s look at an example config.js file.

  65. settings({
        name: 'Example Canary',
    });

    var frequency = '10m';
    var timeout = '10m';

    register_test(
        {
            name: 'simple example test',
            frequency: frequency,
            timeout: timeout,
        },
        file('simple-example.js')
    );
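
As a rough illustration of what happens to the `frequency` and `timeout` strings from a config like this, the sketch below turns them into `time.Duration` values. It assumes the strings follow the syntax of Go's `time.ParseDuration` (the slides do not show how `ottoutil.Duration` parses them), and `testSettings` and `parseSettings` are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// testSettings mirrors the three fields passed to register_test in config.js.
type testSettings struct {
	Name      string
	Frequency time.Duration
	Timeout   time.Duration
}

// parseSettings is a hypothetical helper that validates the duration strings.
// Assumption: they use time.ParseDuration syntax such as "10m" or "30s".
func parseSettings(name, frequency, timeout string) (testSettings, error) {
	f, err := time.ParseDuration(frequency)
	if err != nil {
		return testSettings{}, fmt.Errorf("invalid frequency %q: %v", frequency, err)
	}
	t, err := time.ParseDuration(timeout)
	if err != nil {
		return testSettings{}, fmt.Errorf("invalid timeout %q: %v", timeout, err)
	}
	if f <= 0 || t <= 0 {
		return testSettings{}, fmt.Errorf("frequency and timeout must be positive")
	}
	return testSettings{Name: name, Frequency: f, Timeout: t}, nil
}

func main() {
	s, err := parseSettings("simple example test", "10m", "10m")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s runs every %s with a %s timeout\n", s.Name, s.Frequency, s.Timeout)
}
```

Rejecting bad values at load time means a typo in the config stops the daemon at startup instead of producing a test that never runs.
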
  66. How does config.Load even get called?

  67. Let's create our daemon and call this function.

  68. // cmd/canaryd/main.go func main() {}

  69. // cmd/canaryd/main.go
    func main() {
        log.SetFormatter(&log.JSONFormatter{})
        log.SetOutput(os.Stdout)

        var (
            cfgPath = flag.String("cfg", "config.js", "path to a JS config file")
            workDir = flag.String("work.dir", ".", "directory from which to run, should match expectations about relative paths in cfg.file")
        )
        flag.Parse()

        if *workDir != "" {
            if err := os.Chdir(*workDir); err != nil {
                log.WithError(err).WithField("work.dir", *workDir).Fatal("can't change dir")
            }
        }

        var (
            vm = otto.New()
        )

        canaryCfg, testCfgs := mustLoadConfigs(vm, *cfgPath)
    }
  70. func mustLoadConfigs(vm *otto.Otto, filename string) (*canary.Config, []*js.TestConfig) {
        cfg, err := os.Open(filename)
        if err != nil {
            log.WithError(err).WithField("filename", filename).Fatal("could not open configuration file")
        }
        defer cfg.Close()

        canaryConfig, testCfgs, err := canary.Load(vm, cfg)
        if err != nil {
            log.WithError(err).Fatal("cannot load canary test configs")
        }
        return canaryConfig, testCfgs
    }
  71. The Test Runner

  72. // internal/js/runner/runner.go
    func Run(ctx context.Context, vm *otto.Otto, jsctx *js.Context, test *js.Test, id string) error {
        testVM := vm.Copy()

        done := make(chan struct{})
        go func() {
            select {
            case <-ctx.Done():
                testVM.Interrupt <- func() { panic(ctx.Err()) }
            case <-done:
            }
        }()

        _, err := testVM.Run(test.Script)
        if oe, ok := err.(*otto.Error); ok {
            err = errors.New(oe.String())
        }
        close(done)
        return err
    }
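
The runner pairs a blocking call with a watcher goroutine that selects on `ctx.Done()` and a `done` channel. The sketch below isolates that pattern using only the standard library. It is not the framework's runner: the embedded VM is replaced by a plain function that receives a `stop` channel, and `runWithWatchdog`, `runWithTimeoutMillis`, `spin`, and `quick` are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// runWithWatchdog runs work on the calling goroutine. A second goroutine
// waits for either the context to expire (then it signals stop) or for the
// work to finish (then it exits quietly), so the watcher is never leaked.
func runWithWatchdog(ctx context.Context, work func(stop <-chan struct{}) error) error {
	stop := make(chan struct{})
	done := make(chan struct{})
	go func() {
		select {
		case <-ctx.Done():
			close(stop) // ask the work to abandon what it is doing
		case <-done:
			// work finished first; nothing to interrupt
		}
	}()
	err := work(stop)
	close(done)
	return err
}

// runWithTimeoutMillis bounds work by a timeout given in milliseconds.
func runWithTimeoutMillis(ms int, work func(stop <-chan struct{}) error) error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(ms)*time.Millisecond)
	defer cancel()
	return runWithWatchdog(ctx, work)
}

// spin simulates a long-running script that can be interrupted.
func spin(stop <-chan struct{}) error {
	select {
	case <-stop:
		return errors.New("interrupted")
	case <-time.After(time.Second):
		return nil
	}
}

// quick simulates a script that finishes immediately.
func quick(stop <-chan struct{}) error { return nil }

func main() {
	fmt.Println("spin: ", runWithTimeoutMillis(10, spin))
	fmt.Println("quick:", runWithTimeoutMillis(1000, quick))
}
```

Closing `done` after the work returns is what lets the watcher exit when no timeout occurs.
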
  73. Extending the VM with logging, HTTP, and standard library functionality

  74. Standard Library Functions

  75. // internal/js/context/stdlib.go
    func LoadStdLib(ctx context.Context, vm *otto.Otto, pkgname string) error {
        v, err := vm.Run(`({})`)
        if err != nil {
            return err
        }
        pkg := v.Object()

        for name, subpkg := range map[string]func(*otto.Otto) (otto.Value, error){
            "time": (&timePkg{ctx: ctx}).load,
            "os":   (&osPkg{ctx: ctx}).load,
            "do":   (newDoPkg(ctx)).load,
        } {
            v, err := subpkg(vm)
            if err != nil {
                return fmt.Errorf("can't load package %q: %v", name, err)
            }
            if err := pkg.Set(name, v); err != nil {
                return fmt.Errorf("can't set package %q: %v", name, err)
            }
        }
        return vm.Set(pkgname, pkg)
    }
  76. type osPkg struct {
        ctx context.Context
    }

    func (svc *osPkg) load(vm *otto.Otto) (otto.Value, error) {
        return ottoutil.ToPkg(vm, map[string]func(all otto.FunctionCall) otto.Value{
            "getenv": svc.getenv,
        }), nil
    }

    func (svc *osPkg) getenv(all otto.FunctionCall) otto.Value {
        vm := all.Otto
        key := ottoutil.String(vm, all.Argument(0))
        v, err := otto.ToValue(os.Getenv(key))
        if err != nil {
            ottoutil.Throw(vm, "can't set string value: %v", err)
        }
        return v
    }
  77. var secret = std.os.getenv("SOME_SECRET_VARIABLE");
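
The standard library is exposed to scripts as a nested namespace (`std.os.getenv`) built from a map of sub-packages, each of which is a map of functions. The sketch below imitates that layout without an embedded VM, purely to show the lookup structure. The `fn` and `pkg` types and the `loadStd`, `call`, and `demoGetenv` helpers are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// fn is a hypothetical signature for a function exposed to scripts.
type fn func(args ...string) (string, error)

// pkg groups exposed functions under one namespace, e.g. "os".
type pkg map[string]fn

// loadStd builds the namespace tree, mirroring the map-of-packages idea.
func loadStd() map[string]pkg {
	return map[string]pkg{
		"os": {
			"getenv": func(args ...string) (string, error) {
				if len(args) != 1 {
					return "", errors.New("getenv expects exactly one argument")
				}
				return os.Getenv(args[0]), nil
			},
		},
	}
}

// call resolves a dotted path such as "os.getenv" and invokes the function.
func call(std map[string]pkg, path string, args ...string) (string, error) {
	parts := strings.SplitN(path, ".", 2)
	if len(parts) != 2 {
		return "", fmt.Errorf("invalid path %q", path)
	}
	p, ok := std[parts[0]]
	if !ok {
		return "", fmt.Errorf("unknown package %q", parts[0])
	}
	f, ok := p[parts[1]]
	if !ok {
		return "", fmt.Errorf("unknown function %q", path)
	}
	return f(args...)
}

// demoGetenv sets a placeholder variable and reads it back through call.
func demoGetenv() (string, error) {
	if err := os.Setenv("SOME_SECRET_VARIABLE", "hunter2"); err != nil {
		return "", err
	}
	return call(loadStd(), "os.getenv", "SOME_SECRET_VARIABLE")
}

func main() {
	v, err := demoGetenv()
	fmt.Println(v, err)
}
```
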

  78. HTTP Support

  79. // internal/js/context/http.go

    // LoadHTTP loads an HTTP package in the VM that sends HTTP requests using the
    // given client.
    func LoadHTTP(vm *otto.Otto, pkgname string, client *http.Client, cfgReq func(*http.Request) *http.Request) error {
        v, err := (&httpPkg{client: client, cfgReq: cfgReq}).load(vm)
        if err != nil {
            return err
        }
        return vm.Set(pkgname, v)
    }
  80. type httpPkg struct {
        client *http.Client
        cfgReq func(*http.Request) *http.Request
    }

    func (hpkg *httpPkg) load(vm *otto.Otto) (otto.Value, error) {
        v, err := vm.Run(`({})`)
        if err != nil {
            return q, err
        }
        pkg := v.Object()
        for name, method := range map[string]func(all otto.FunctionCall) otto.Value{
            "do": hpkg.do,
        } {
            if err := pkg.Set(name, method); err != nil {
                return q, fmt.Errorf("can't set method %q, %v", name, err)
            }
        }
        return pkg.Value(), nil
    }
  81. func (hpkg *httpPkg) do(all otto.FunctionCall) otto.Value {
        vm := all.Otto

        var (
            method  = ottoutil.String(vm, all.Argument(0))
            url     = ottoutil.String(vm, all.Argument(1))
            headers = ottoutil.StringMapSlice(vm, all.Argument(2))
            body    = ottoutil.String(vm, all.Argument(3))
        )

        req, err := http.NewRequest(method, url, strings.NewReader(body))
        if err != nil {
            ottoutil.Throw(vm, err.Error())
        }
        for k, vals := range headers {
            for _, val := range vals {
                req.Header.Add(k, val)
            }
        }

        resp, err := hpkg.client.Do(hpkg.cfgReq(req))
        if err != nil {
            ottoutil.Throw(vm, err.Error())
        }
        defer resp.Body.Close()

        respBody, err := ioutil.ReadAll(resp.Body)
        if err != nil {
            ottoutil.Throw(vm, err.Error())
        }

        v, err := vm.Run(`({})`)
        if err != nil {
            ottoutil.Throw(vm, err.Error())
        }
        pkg := v.Object()
        for name, value := range map[string]interface{}{
            "code":    resp.StatusCode,
            "headers": map[string][]string(resp.Header),
            "body":    string(respBody),
        } {
            if err := pkg.Set(name, value); err != nil {
                ottoutil.Throw(vm, err.Error())
            }
        }
        return v
    }
  82. var response = http.do('GET', 'https://icanhazip.com', {});
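
To make the shape of the value returned by `http.do` concrete, here is a standard-library sketch of the same request flow: build the request, attach headers, bind it to a context, and collect `code`, `headers`, and `body`. The `response` type and the `do` and `probe` helpers are hypothetical, and a throwaway `httptest` server stands in for a real endpoint so the example does not touch the network.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// response mirrors the object handed back to the script: code, headers, body.
type response struct {
	Code    int
	Headers map[string][]string
	Body    string
}

// do is a hypothetical Go-side equivalent of the script-level http.do call.
func do(ctx context.Context, client *http.Client, method, url string, headers map[string][]string, body string) (*response, error) {
	req, err := http.NewRequestWithContext(ctx, method, url, strings.NewReader(body))
	if err != nil {
		return nil, err
	}
	for k, vals := range headers {
		for _, v := range vals {
			req.Header.Add(k, v)
		}
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return &response{
		Code:    resp.StatusCode,
		Headers: map[string][]string(resp.Header),
		Body:    string(b),
	}, nil
}

// probe starts a local test server, calls it once, and shuts it down.
func probe() (*response, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("X-Canary", r.Header.Get("X-Canary")) // echo the header back
		fmt.Fprint(w, "203.0.113.7\n")                        // placeholder documentation address
	}))
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	return do(ctx, srv.Client(), "GET", srv.URL, map[string][]string{"X-Canary": {"1"}}, "")
}

func main() {
	r, err := probe()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("code=%d body=%q\n", r.Code, r.Body)
}
```

Binding the request to the context is what lets a test timeout abort an in-flight HTTP call.
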

  83. Structured Logging

  84. // internal/js/context/log.go

    // LoadLog loads a log package in the VM that logs to the given logger.
    func LoadLog(vm *otto.Otto, pkgname string, ll log.FieldLogger) error {
        // Set up the logging formatter to emit structured JSON.
        log.SetFormatter(&log.JSONFormatter{})

        // Output to stdout for capturing.
        log.SetOutput(os.Stdout)

        v, err := (&logger{ll: ll}).load(vm)
        if err != nil {
            return err
        }
        return vm.Set(pkgname, v)
    }
  85. type logger struct { ll log.FieldLogger }

  86. func (ll *logger) load(vm *otto.Otto) (otto.Value, error) {
        v, err := vm.Run(`({})`)
        if err != nil {
            return q, err
        }
        pkg := v.Object()
        for name, method := range map[string]func(all otto.FunctionCall) otto.Value{
            "kv":    ll.kv,
            "info":  ll.info,
            "error": ll.error,
            "fail":  ll.fail,
        } {
            if err := pkg.Set(name, method); err != nil {
                return q, fmt.Errorf("can't set method %q, %v", name, err)
            }
        }
        return pkg.Value(), nil
    }
  87. func (ll *logger) kv(all otto.FunctionCall) otto.Value {
        vm := all.Otto

        var child *log.Entry
        switch {
        case all.Argument(0).IsObject():
            obj := all.Argument(0).Object()
            keys := obj.Keys()
            f := make(log.Fields, len(keys))
            for _, key := range obj.Keys() {
                v, err := obj.Get(key)
                if err != nil {
                    ottoutil.Throw(vm, err.Error())
                }
                gov, err := v.Export()
                if err != nil {
                    ottoutil.Throw(vm, err.Error())
                }
                f[key] = gov
            }
            child = ll.ll.WithFields(f)

        case len(all.ArgumentList)%2 == 0:
            args := all.ArgumentList
            f := make(log.Fields, len(args)/2)
            for i := 0; i < len(args); i += 2 {
                k := ottoutil.String(vm, args[i])
                v, err := args[i+1].Export()
                if err != nil {
                    ottoutil.Throw(vm, err.Error())
                }
                f[k] = v
            }
            child = ll.ll.WithFields(f)

        default:
            ottoutil.Throw(vm, "invalid call to log.kv")
        }

        v, err := (&logger{ll: child}).load(vm)
        if err != nil {
            ottoutil.Throw(vm, err.Error())
        }
        return v
    }
  88. var q = otto.UndefinedValue()

    func (ll *logger) info(all otto.FunctionCall) otto.Value {
        vm := all.Otto
        msg := ottoutil.String(vm, all.Argument(0))
        ll.ll.Info(msg)
        return q
    }
  89. func (ll *logger) fail(all otto.FunctionCall) otto.Value {
        vm := all.Otto
        msg := ottoutil.String(vm, all.Argument(0))
        ll.ll.Error(msg)
        ottoutil.Throw(all.Otto, msg)
        return q
    }
  90. log.kv('foo', 'bar').info('this is an example message');
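
The `kv` call accepts either a single object or a flat list of alternating keys and values, and turns both into structured fields. The sketch below reproduces that argument handling with plain Go values and the standard `encoding/json` package instead of the VM and the logging library. `fieldsFromArgs` and `infoLine` are hypothetical names, and the output omits details such as timestamps that a real logger would add.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

var errInvalidKV = errors.New("invalid call to log.kv")

// fieldsFromArgs accepts either one map (the "object" form) or an even
// number of arguments read as key, value, key, value, ...
func fieldsFromArgs(args ...interface{}) (map[string]interface{}, error) {
	fields := make(map[string]interface{})
	switch {
	case len(args) == 1:
		obj, ok := args[0].(map[string]interface{})
		if !ok {
			return nil, errInvalidKV
		}
		for k, v := range obj {
			fields[k] = v
		}
	case len(args)%2 == 0:
		for i := 0; i < len(args); i += 2 {
			k, ok := args[i].(string)
			if !ok {
				return nil, errInvalidKV
			}
			fields[k] = args[i+1]
		}
	default:
		return nil, errInvalidKV
	}
	return fields, nil
}

// infoLine renders fields plus a message as one JSON log line.
func infoLine(fields map[string]interface{}, msg string) (string, error) {
	out := map[string]interface{}{"level": "info", "msg": msg}
	for k, v := range fields {
		out[k] = v
	}
	buf, err := json.Marshal(out) // map keys are emitted in sorted order
	if err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	f, err := fieldsFromArgs("foo", "bar")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	line, _ := infoLine(f, "this is an example message")
	fmt.Println(line)
	// prints {"foo":"bar","level":"info","msg":"this is an example message"}
}
```
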

  91. Let’s add them to the VM in the runner

  92. func Run(ctx context.Context, vm *otto.Otto, jsctx *js.Context, test *js.Test, id string) error {
        testVM := vm.Copy()

        reqConfig := func(req *http.Request) *http.Request {
            return req.WithContext(ctx)
        }

        if err := jscontext.LoadStdLib(ctx, testVM, "std"); err != nil {
            return fmt.Errorf("can't setup std package in VM: %v", err)
        }
        if err := jscontext.LoadHTTP(testVM, "http", jsctx.HTTPClient, reqConfig); err != nil {
            return fmt.Errorf("can't setup HTTP package in VM: %v", err)
        }
        if err := jscontext.LoadLog(testVM, "log", jsctx.Log); err != nil {
            return fmt.Errorf("can't setup LOG package in VM: %v", err)
        }
        …
    }
  93. Let's add the runner to the daemon.

  94. func main() {
        log.SetFormatter(&log.JSONFormatter{})
        // Output to stdout for capturing.
        log.SetOutput(os.Stdout)

        var (
            cfgPath = flag.String("cfg", "config.js", "path to a JS config file")
            workDir = flag.String("work.dir", ".", "directory from which to run, should match expectations about relative paths in cfg.file")
        )
        flag.Parse()

        if *workDir != "" {
            if err := os.Chdir(*workDir); err != nil {
                log.WithError(err).WithField("work.dir", *workDir).Fatal("can't change dir")
            }
        }

        var (
            ctx = context.Background()
            vm  = otto.New()
        )

        canaryCfg, testCfgs := mustLoadConfigs(vm, *cfgPath)
        launchTests(vm, canaryCfg, testCfgs)

        // Block forever because we want the tests to run forever.
        select {}
    }
  95. func launchTests(vm *otto.Otto, config *canary.Config, configs []*js.TestConfig) {
        for _, cfg := range configs {
            go runTestForever(db, met, vm, cfg, cfg.Test())
        }
    }
  96. func runTestForever(db dbpkg.CanaryStore, met *metrics.Node, vm *otto.Otto, cfg *js.TestConfig, test *js.Test) {
        ll := log.WithFields(log.Fields{
            "test.name": test.Name,
        })

        for {
            go func(vm *otto.Otto) {
                testID := uuid.New()
                ll = ll.WithField("test.id", testID)
                testCtx := &js.Context{
                    Log: ll,
                    HTTPClient: &http.Client{
                        Transport: http.DefaultTransport,
                    },
                }

                ctx, cancel := context.WithTimeout(context.Background(), cfg.Timeout)
                defer cancel()
                terr := runner.Run(ctx, vm, testCtx, test, testID)
                if terr != nil {
                    ll.WithError(terr).Error("test failed")
                }
            }(vm.Copy()) // copy VM to avoid polluting global namespace

            // We'll run the test again after the duration we defined.
            time.Sleep(cfg.Frequency)
        }
    }
  97. Saving Test Execution Results to a Database

  99. // internal/db/db.go
    type CanaryStore interface {
        StartTest(id string, testName string, startTime time.Time) (*TestInstance, error)
        EndTest(test *TestInstance, failure error, endAt time.Time) error
        ListTests() ([]TestInstance, error)
        ListOngoingTests() ([]TestInstance, error)
        FindTestByID(id string) (*TestInstance, error)
        Close() error
    }
  100. // internal/db/models.go

    // TestInstance collects details about the instance of a unique
    // test execution.
    type TestInstance struct {
        TestID    string    `json:"id,omitempty"`
        TestName  string    `json:"name,omitempty"`
        StartAt   time.Time `json:"start_at,omitempty"`
        EndAt     time.Time `json:"end_at,omitempty"`
        Pass      bool      `json:"pass,omitempty"`
        FailCause string    `json:"fail_cause,omitempty"`
    }

    // BoltTestInstance is what gets serialized and saved to the Bolt database. Only
    // difference is that we're going to be using strings for StartAt and EndAt
    type BoltTestInstance struct {
        TestID    string `json:"id,omitempty"`
        TestName  string `json:"name,omitempty"`
        StartAt   string `json:"start_at,omitempty"`
        EndAt     string `json:"end_at,omitempty"`
        Pass      bool   `json:"pass,omitempty"`
        FailCause string `json:"fail_cause,omitempty"`
    }
  101. var _ CanaryStore = (*boltStore)(nil)

    type boltStore struct {
        db *bolt.DB

        ongoingMu sync.RWMutex
        ongoing   map[string]TestInstance

        cancel context.CancelFunc
    }
  102. // internal/db/db.go
    func (db *boltStore) StartTest(id string, testName string, startTime time.Time) (*TestInstance, error) {
        test := &TestInstance{
            TestID:   id,
            TestName: testName,
            StartAt:  startTime.UTC(),
        }

        // keep in memory until it's finished
        db.ongoingMu.Lock()
        db.ongoing[id] = *test
        db.ongoingMu.Unlock()
        return test, nil
    }
  103. func (db *boltStore) EndTest(test *TestInstance, failure error, endAt time.Time) error {
        db.ongoingMu.Lock()
        defer db.ongoingMu.Unlock()

        t, ok := db.ongoing[test.TestID]
        if !ok {
            return fmt.Errorf("test with ID does not exist: %q", test.TestID)
        }
        delete(db.ongoing, test.TestID)

        t.Pass = failure == nil
        if failure != nil {
            t.FailCause = failure.Error()
        }
        t.EndAt = endAt
        return insertTest(db.db, &t)
    }
  104. func insertTest(db *bolt.DB, test *TestInstance) error {
        return db.Update(func(tx *bolt.Tx) error {
            b := tx.Bucket(testsBucket)

            // Make this something that is saveable by the database.
            dbTest := &BoltTestInstance{
                TestID:    test.TestID,
                TestName:  test.TestName,
                StartAt:   test.StartAt.UTC().Format(time.RFC3339),
                EndAt:     test.EndAt.UTC().Format(time.RFC3339), // the slide used test.StartAt here, which would store the start time twice
                Pass:      test.Pass,
                FailCause: test.FailCause,
            }

            // Marshal and save the encoded test.
            if buf, err := json.Marshal(dbTest); err != nil {
                return err
            } else if err := b.Put([]byte(dbTest.TestID), buf); err != nil {
                return err
            }
            return nil
        })
    }
  105. Let’s add it to the daemon.

  106. // cmd/canaryd/main.go
    func main() {
        log.SetFormatter(&log.JSONFormatter{})
        // Output to stdout for capturing.
        log.SetOutput(os.Stdout)

        var (
            cfgPath = flag.String("cfg", "config.js", "path to a JS config file")
            workDir = flag.String("work.dir", ".", "directory from which to run, should match expectations about relative paths in cfg.file")
            dbPath  = flag.String("db.file", "canary.db", "file for the canary database")
        )
        flag.Parse()

        var (
            ctx = context.Background()
            vm  = otto.New()
        )

        db := mustOpenBolt(*dbPath)
        defer db.Close()

        canaryCfg, testCfgs := mustLoadConfigs(vm, *cfgPath)
        launchTests(db, vm, canaryCfg, testCfgs)

        // Block forever because we want the tests to run forever.
        select {}
    }
  107. func mustOpenBolt(path string) dbpkg.CanaryStore {
        db, err := dbpkg.NewBoltStore(path)
        if err != nil {
            log.WithError(err).Fatal("can't open database")
        }
        return db
    }
  108. func runTestForever(db dbpkg.CanaryStore, met *metrics.Node, vm *otto.Otto, cfg *js.TestConfig, test *js.Test) {
        …
        for {
            go func(vm *otto.Otto) {
                testID := uuid.New()
                ll = ll.WithField("test.id", testID)
                testCtx := &js.Context{
                    Log: ll,
                    HTTPClient: &http.Client{
                        Transport: http.DefaultTransport,
                    },
                }

                dbtest, err := db.StartTest(
                    testID,
                    test.Name,
                    time.Now(),
                )
                if err != nil {
                    ll.WithError(err).Error("could not start the test")
                }

                ctx, cancel := context.WithTimeout(context.Background(), cfg.Timeout)
                defer cancel()
                terr := runner.Run(ctx, vm, testCtx, test, testID)
                if terr != nil {
                    ll.WithError(terr).Error("test failed")
                }

                if err := db.EndTest(dbtest, terr, time.Now()); err != nil {
                    ll.WithError(err).Error("couldn't mark test as being ended")
                }
            }(vm.Copy()) // copy VM to avoid polluting global namespace

            // We'll run the test again after the duration we defined.
            time.Sleep(cfg.Frequency)
        }
    }
  109. Front-End for Viewing Test Execution Results

  111. // internal/app/serve.go
    import (
        "github.com/99designs/gqlgen/handler"
        "github.com/gorilla/mux"
        "github.com/iheanyi/simple-canary/internal/db"
        "github.com/sirupsen/logrus"
    )

    // App is an instance of the dashboard for the canary.
    type App struct {
        l  logrus.FieldLogger
        db db.CanaryStore
    }

    func New(db db.CanaryStore, r *mux.Router) *App {
        app := &App{
            l:  logrus.WithField("component", "app"),
            db: db,
        }

        r.Handle("/", handler.Playground("GraphQL Playground", "/query"))
        r.Handle("/query", handler.GraphQL(NewExecutableSchema(Config{Resolvers: &Resolver{
            db: db,
        }})))
        return app
    }
  112. type TestInstance {
        id: ID!
        name: String!
        start_at: Time!
        end_at: Time
        pass: Boolean
        fail_cause: String
    }

    type Query {
        tests: [TestInstance!]!
        test(id: String!): TestInstance
        ongoingTests: [TestInstance!]!
    }

    scalar Time
  113. type Resolver struct {
        db dbpkg.CanaryStore
    }

    func (r *Resolver) Query() QueryResolver {
        return &queryResolver{r}
    }

    func (r *Resolver) TestInstance() TestInstanceResolver {
        return &testInstanceResolver{r}
    }

    type queryResolver struct{ *Resolver }

    func (r *queryResolver) Tests(ctx context.Context) ([]dbpkg.TestInstance, error) {
        tests, err := r.db.ListTests()
        return tests, err
    }

    func (r *queryResolver) Test(ctx context.Context, id string) (*dbpkg.TestInstance, error) {
        test, err := r.db.FindTestByID(id)
        return test, err
    }

    func (r *queryResolver) OngoingTests(ctx context.Context) ([]dbpkg.TestInstance, error) {
        tests, err := r.db.ListOngoingTests()
        return tests, err
    }

    type testInstanceResolver struct{ *Resolver }

    func (r *testInstanceResolver) ID(ctx context.Context, obj *dbpkg.TestInstance) (string, error) {
        return obj.TestID, nil
    }

    func (r *testInstanceResolver) Name(ctx context.Context, obj *dbpkg.TestInstance) (string, error) {
        return obj.TestName, nil
    }

    func (r *testInstanceResolver) StartAt(ctx context.Context, obj *dbpkg.TestInstance) (time.Time, error) {
        return obj.StartAt, nil
    }

    func (r *testInstanceResolver) EndAt(ctx context.Context, obj *dbpkg.TestInstance) (*time.Time, error) {
        return &obj.EndAt, nil
    }

    func (r *testInstanceResolver) FailCause(ctx context.Context, obj *dbpkg.TestInstance) (*string, error) {
        return &obj.FailCause, nil
    }
  114. Back to the daemon.

  115. func main() {
        log.SetFormatter(&log.JSONFormatter{})
        // Output to stdout for capturing.
        log.SetOutput(os.Stdout)

        var (
            …
            listenHost = flag.String("listen.host", "", "interface on which to listen")
            listenPort = flag.String("listen.port", "8080", "port on which to listen")
        )
        …
        l := mustListen(*listenHost, *listenPort)

        db := mustOpenBolt(*dbPath)
        defer db.Close()

        if err := launchHTTP(ctx, l, db); err != nil {
            log.WithError(err).Fatal("can't launch http server")
        }
        …
    }
  116. func launchHTTP(
        ctx context.Context,
        l net.Listener,
        db dbpkg.CanaryStore,
    ) error {
        addr := l.Addr().(*net.TCPAddr)
        host, err := os.Hostname()
        if err != nil {
            return fmt.Errorf("can't get hostname: %v", err)
        }
        host = net.JoinHostPort(host, strconv.Itoa(addr.Port))

        r := mux.NewRouter().Host(host).Subrouter()
        _ = app.New(db, r)

        log.WithField("host", host).Info("API starting")
        go http.Serve(l, r)
        return nil
    }
  118. Last, but not least, metrics.

  119. // internal/metrics/metrics.go
    package metrics

    import (
        "net/http"

        "github.com/prometheus/client_golang/prometheus"
        "github.com/prometheus/client_golang/prometheus/promhttp"
    )

    // Node is a wrapper around Prometheus's registerer interface
    type Node struct {
        registry prometheus.Registerer
    }

    // Prometheus returns an instance of the metrics node and the prometheus
    // handler.
    func Prometheus() (*Node, http.Handler) {
        registry := prometheus.NewRegistry()
        handler := promhttp.HandlerFor(registry, promhttp.HandlerOpts{})
        registry.MustRegister(prometheus.NewProcessCollector(prometheus.ProcessCollectorOpts{}))
        registry.MustRegister(prometheus.NewGoCollector())
        return &Node{
            registry: registry,
        }, handler
    }
  120. // cmd/canaryd/main.go
    func main() {
        …
        met, hdl := metrics.Prometheus()
        …
        if err := launchHTTP(ctx, l, hdl, db); err != nil {
            log.WithError(err).Fatal("can't launch http server")
        }

        canaryCfg, testCfgs := mustLoadConfigs(vm, *cfgPath)
        launchTests(db, met, vm, canaryCfg, testCfgs)

        // Block forever because we want the tests to run forever.
        select {}
    }
  121. func launchHTTP(
        ctx context.Context,
        l net.Listener,
        promhdl http.Handler,
        db dbpkg.CanaryStore,
    ) error {
        addr := l.Addr().(*net.TCPAddr)
        host, err := os.Hostname()
        if err != nil {
            return fmt.Errorf("can't get hostname: %v", err)
        }
        host = net.JoinHostPort(host, strconv.Itoa(addr.Port))

        r := mux.NewRouter().Host(host).Subrouter()
        _ = app.New(db, r)
        r.PathPrefix("/metrics").Handler(promhdl)

        log.WithField("host", host).Info("API starting")
        go http.Serve(l, r)
        return nil
    }
  122. func runTestForever(db dbpkg.CanaryStore, met *metrics.Node, vm *otto.Otto, cfg *js.TestConfig, test *js.Test) {
        ll := log.WithFields(log.Fields{
            "test.name": test.Name,
        })
        tmet := met.Labels(map[string]string{
            "test_name": test.Name,
        })

        var (
            started  = tmet.Counter("test_started_count", "Number of tests that were started")
            finished = tmet.Counter("test_finished_count", "Number of tests that have finished", "result")
            running  = tmet.Gauge("test_running_total", "Tests that are currently running")
            _        = tmet.Summary("test_duration_seconds", "Duration of tests", []float64{0.5, 0.75, 0.9, 0.99, 1.0}, "result")
        )
        …
    }
  123. func runTestForever(db dbpkg.CanaryStore, met *metrics.Node, vm *otto.Otto, cfg *js.TestConfig, test *js.Test) {
        …
        for {
            go func(vm *otto.Otto) {
                …
                started.WithLabelValues().Add(1)
                running.Add(1)
                defer running.Add(-1)

                ctx, cancel := context.WithTimeout(context.Background(), cfg.Timeout)
                defer cancel()
                terr := runner.Run(ctx, vm, testCtx, test, testID)
                if terr != nil {
                    finished.With(prometheus.Labels{"result": "fail"}).Add(1)
                    ll.WithError(terr).Error("test failed")
                } else {
                    finished.With(prometheus.Labels{"result": "pass"}).Add(1)
                }

                if err := db.EndTest(dbtest, terr, time.Now()); err != nil {
                    ll.WithError(err).Error("couldn't mark test as being ended")
                }
            }(vm.Copy()) // copy VM to avoid polluting global namespace

            // We'll run the test again after the duration we defined.
            time.Sleep(cfg.Frequency)
        }
    }
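
The bookkeeping around each run (count it as started, track it as running, then count it as passed or failed) can be illustrated without the Prometheus client. The sketch below swaps in `sync/atomic` counters and a tiny handler that writes lines in the Prometheus text exposition style. It is a stand-in to show the pattern, not a replacement for the client library, and `testMetrics`, `observe`, and `scrape` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

var errBoom = errors.New("boom")

// testMetrics holds the counters and the gauge used around each test run.
type testMetrics struct {
	started, passed, failed, running int64
}

// observe wraps one test run with the same bookkeeping as the slide:
// increment started and running, decrement running on exit, count the result.
func (m *testMetrics) observe(run func() error) error {
	atomic.AddInt64(&m.started, 1)
	atomic.AddInt64(&m.running, 1)
	defer atomic.AddInt64(&m.running, -1)

	err := run()
	if err != nil {
		atomic.AddInt64(&m.failed, 1)
	} else {
		atomic.AddInt64(&m.passed, 1)
	}
	return err
}

// ServeHTTP writes the current values, one metric per line.
func (m *testMetrics) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "test_started_count %d\n", atomic.LoadInt64(&m.started))
	fmt.Fprintf(w, "test_finished_count{result=\"pass\"} %d\n", atomic.LoadInt64(&m.passed))
	fmt.Fprintf(w, "test_finished_count{result=\"fail\"} %d\n", atomic.LoadInt64(&m.failed))
	fmt.Fprintf(w, "test_running_total %d\n", atomic.LoadInt64(&m.running))
}

// scrape mounts the handler at /metrics and returns what a scraper would see.
func scrape(m *testMetrics) string {
	mux := http.NewServeMux()
	mux.Handle("/metrics", m)
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest("GET", "/metrics", nil))
	return rec.Body.String()
}

func main() {
	m := &testMetrics{}
	m.observe(func() error { return nil })
	m.observe(func() error { return errBoom })
	fmt.Print(scrape(m))
}
```

Splitting the finished counter by a `result` label keeps pass and fail counts queryable as a single metric, which is what makes alerting on a failure ratio straightforward.
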
  125. We're done! Let's see a demo.

  126. https://github.com/iheanyi/simple-canary

  127. Special thanks to Antoine Grondin

  128. Questions?

  129. Thank you! @kwuchu