
Building a Canary Testing Framework


Originally given at StrangeLoop in 2018, this talk walks through what canary testing is and how to build a simple canary testing framework.

Iheanyi Ekechukwu

September 28, 2018

Transcript

  1. Building a Canary Testing
    Framework
    Iheanyi Ekechukwu


  2. Iheanyi Ekechukwu
    Senior Software Engineer @ GitHub
    @kwuchu
    https://github.com/iheanyi


  3. What is a canary?



  5. Canary tests should mimic user
    behavior and test a specific
    feature.


  6. Canary tests (normally) should run
    against your production systems.


  7. Opinion: Each product in your
    organization should have its own
    canary running.


  8. Is this an integration test?


  9. Well, yes and no.


  10. (\__/)
    (•ź•) Integration Tests
    \>



  12. ⠀ ⠀ ⠀(\__/)
    (•ź•)
     _ノ ヽ ノ\_
    `/ `/ ⌒Y⌒ Y ヽ Canaries
    (  (三ヽ⼈人  /   |
    | ノ⌒\  ̄ ̄ヽ  ノ
    ヽ___>、___/
       |( 王 ノ〈 (\__/)
       /ミ`ー―⼺彡\ (•ź•) Integration Tests
      / ╰ ╯ \/ \>


  13. You can do more than just CRUD
    operations (ex: SSH, shell
    commands)


  14. Allow for more specificity and
    granularity for the execution
    environment


  15. Allow for better observability into
    what’s going on within your
    systems


  16. So, how do we build a canary
    testing framework?


  17. First, what are our necessary
    features?


  18. Visualization of Test Output /
    Execution Status


  19. Recording and alerting on
    metrics via Prometheus


  20. Run all tests simultaneously over
    a set interval with a timeout

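A minimal sketch of this requirement, using only the Go standard library. The `check` type, `runOnce`, and `runRounds` are hypothetical names for illustration, not part of the talk's framework. Each check runs in its own goroutine, sleeps for the interval between runs, and is bounded by a `context` timeout. The loop is capped at a fixed number of rounds so the example terminates, whereas a daemon would loop forever.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// check is a hypothetical canary test: a name plus a function that should
// return before its context expires.
type check struct {
	name string
	run  func(ctx context.Context) error
}

// runOnce executes one check under a deadline and returns its error.
func runOnce(c check, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return c.run(ctx)
}

// runRounds starts every check in its own goroutine, repeats each one
// `rounds` times with `interval` between runs, and counts failures per name.
func runRounds(checks []check, interval, timeout time.Duration, rounds int) map[string]int {
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		failures = make(map[string]int)
	)
	for _, c := range checks {
		wg.Add(1)
		go func(c check) {
			defer wg.Done()
			for i := 0; i < rounds; i++ {
				if err := runOnce(c, timeout); err != nil {
					mu.Lock()
					failures[c.name]++
					mu.Unlock()
				}
				time.Sleep(interval)
			}
		}(c)
	}
	wg.Wait()
	return failures
}

// demo runs one check that passes and one that always hits the timeout.
func demo() (fastFailures, slowFailures int) {
	checks := []check{
		{name: "fast", run: func(ctx context.Context) error { return nil }},
		{name: "slow", run: func(ctx context.Context) error {
			<-ctx.Done() // never finishes on its own, so the deadline fires
			return ctx.Err()
		}},
	}
	failures := runRounds(checks, 5*time.Millisecond, 20*time.Millisecond, 2)
	return failures["fast"], failures["slow"]
}

func main() {
	fast, slow := demo()
	fmt.Println("fast failures:", fast, "slow failures:", slow)
}
```

Running the checks in separate goroutines means a slow check cannot delay the others, and the per-run timeout keeps a stuck check from piling up work.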

  21. Ability to write tests in a
    flexible, approachable language


  22. Canary Framework
    Architecture Components


  23. Go for the Framework


  24. JavaScript for the Test VM


  25. A database for storing test
    execution results


  26. API & Web UI for visualizing test
    execution results from our
    database


  27. Why Go for the actual
    framework?


  28. Concurrency is our friend


  29. Decent support for embedding
    languages


  30. Go API Clients


  31. Why JavaScript for the
    scripting language?


  32. Flexibility


  33. Low barrier to entry


  34. Embeddable within Go via Otto


  35. Canary Daemon Architecture


  36. Let’s Break This Down


  37. The JavaScript VM


  38. Allows us to write and run test
    scripts using JavaScript


  39. Extensions to the framework (such
    as logging, http, etc.) are attached
    to the VM



  41. The Test Runner


  42. Isolates and configures the
    VM, then runs the tests


  43. Returns errors (if any) from
    the test execution


  44. The Canary Daemon


  45. Executes the config file and
    registers the tests to run


  46. Repeatedly runs each test using
    the runner, save results to the
    database


  47. Also, records and exports metrics
    such as tests started, running, and
    finished via Prometheus.


  48. Serves the API and front-end for
    visualizing test execution results


  49. Okay, let’s walk through the
    code for a simple canary.



  51. Loading Test Configurations


  52. // internal/js/canary/config.go
    // Load a Javascript config script, returning all the TestConfigs that
    // are defined in it. The given VM is left unchanged, but its context
    // is available during parsing of the config script.
    func Load(vm *otto.Otto, src io.Reader) (*Config, []*js.TestConfig, error) {
    configVM := vm.Copy() // avoid polluting the global namespace
    ctx := new(ctx)
    configVM.Set("settings", ctx.ottoFuncSettings)
    configVM.Set("file", ctx.ottoFuncFile)
    configVM.Set("register_test", ctx.ottoFuncRegisterTest)
    if err := jsctx.LoadStdLib(context.Background(), configVM, "std"); err != nil {
    return nil, nil, fmt.Errorf("can't load stdlib: %v", err)
    }
    source, err := ioutil.ReadAll(src)
    if err != nil {
    return nil, nil, fmt.Errorf("can't read config file: %v", err)
    }
    if _, err := configVM.Run(source); err != nil {
    return nil, nil, fmt.Errorf("can't apply configuration: %v", err)
    }
    if ctx.cfg == nil {
    ctx.cfg = new(Config)
    }
    return ctx.cfg, ctx.tests, nil
    }


  53. ctx := new(ctx)


  54. // Config holds the global canary configuration.
    type Config struct {
    Name string
    }
    type ctx struct {
    cfg *Config
    tests []*js.TestConfig
    }


  55. // A TestConfig describes how a class of tests must be run.
    type TestConfig struct {
    Name string
    Script *otto.Script
    Frequency time.Duration
    Timeout time.Duration
    }


  56. // internal/js/canary/config.go
    // Load a Javascript config script, returning all the TestConfigs that
    // are defined in it. The given VM is left unchanged, but its context
    // is available during parsing of the config script.
    func Load(vm *otto.Otto, src io.Reader) (*Config, []*js.TestConfig, error) {
    configVM := vm.Copy() // avoid polluting the global namespace
    ctx := new(ctx)
    configVM.Set("settings", ctx.ottoFuncSettings)
    configVM.Set("file", ctx.ottoFuncFile)
    configVM.Set("register_test", ctx.ottoFuncRegisterTest)
    if err := jsctx.LoadStdLib(context.Background(), configVM, "std"); err != nil {
    return nil, nil, fmt.Errorf("can't load stdlib: %v", err)
    }
    source, err := ioutil.ReadAll(src)
    if err != nil {
    return nil, nil, fmt.Errorf("can't read config file: %v", err)
    }
    if _, err := configVM.Run(source); err != nil {
    return nil, nil, fmt.Errorf("can't apply configuration: %v", err)
    }
    if ctx.cfg == nil {
    ctx.cfg = new(Config)
    }
    return ctx.cfg, ctx.tests, nil
    }


  57. configVM.Set("file", ctx.ottoFuncFile)
    configVM.Set("register_test", ctx.ottoFuncRegisterTest)


  58. // internal/js/canary/config.go
    func (ctx *ctx) ottoFuncFile(call otto.FunctionCall) otto.Value {
    filename := ottoutil.String(call.Otto, call.Argument(0))
    data, err := ioutil.ReadFile(filename)
    if err != nil {
    ottoutil.Throw(call.Otto, err.Error())
    }
    v, err := otto.ToValue(string(data))
    if err != nil {
    ottoutil.Throw(call.Otto, err.Error())
    }
    return v
    }


  59. // internal/js/canary/config.go
    func (ctx *ctx) ottoFuncRegisterTest(call otto.FunctionCall) otto.Value {
    cfg := new(testConfig)
    cfg.load(call.Otto, call.Argument(0))
    src := ottoutil.String(call.Otto, call.Argument(1))
    test := &js.TestConfig{
    Name: cfg.Name,
    Frequency: cfg.Frequency,
    Timeout: cfg.Timeout,
    }
    var err error
    test.Script, err = call.Otto.Compile("", src)
    if err != nil {
    ottoutil.Throw(call.Otto, err.Error())
    }
    ctx.tests = append(ctx.tests, test)
    return otto.UndefinedValue()
    }
    type testConfig struct {
    Name string
    Frequency time.Duration
    Timeout time.Duration
    }


  60. cfg := new(testConfig)
    cfg.load(call.Otto, call.Argument(0))


  61. // internal/js/canary/config.go
    func (cfg *testConfig) load(vm *otto.Otto, config otto.Value) {
    ottoutil.LoadObject(vm, config, map[string]func(otto.Value) error{
    "name": func(v otto.Value) (err error) {
    cfg.Name, err = v.ToString()
    return
    },
    "frequency": func(v otto.Value) error {
    cfg.Frequency = ottoutil.Duration(vm, v)
    return nil
    },
    "timeout": func(v otto.Value) error {
    cfg.Timeout = ottoutil.Duration(vm, v)
    return nil
    },
    })
    }


  62. func (ctx *ctx) ottoFuncRegisterTest(call otto.FunctionCall) otto.Value {
    cfg := new(testConfig)
    cfg.load(call.Otto, call.Argument(0))
    src := ottoutil.String(call.Otto, call.Argument(1))
    test := &js.TestConfig{
    Name: cfg.Name,
    Frequency: cfg.Frequency,
    Timeout: cfg.Timeout,
    }
    var err error
    test.Script, err = call.Otto.Compile("", src)
    if err != nil {
    ottoutil.Throw(call.Otto, err.Error())
    }
    ctx.tests = append(ctx.tests, test)
    return otto.UndefinedValue()
    }
    type ctx struct {
    cfg *Config
    tests []*js.TestConfig
    }


  63. // internal/js/canary/config.go
    // Load a Javascript config script, returning all the TestConfigs that
    // are defined in it. The given VM is left unchanged, but its context
    // is available during parsing of the config script.
    func Load(vm *otto.Otto, src io.Reader) (*Config, []*js.TestConfig, error) {
    configVM := vm.Copy() // avoid polluting the global namespace
    ctx := new(ctx)
    configVM.Set("settings", ctx.ottoFuncSettings)
    configVM.Set("file", ctx.ottoFuncFile)
    configVM.Set("register_test", ctx.ottoFuncRegisterTest)
    if err := jsctx.LoadStdLib(context.Background(), configVM, "std"); err != nil {
    return nil, nil, fmt.Errorf("can't load stdlib: %v", err)
    }
    source, err := ioutil.ReadAll(src)
    if err != nil {
    return nil, nil, fmt.Errorf("can't read config file: %v", err)
    }
    if _, err := configVM.Run(source); err != nil {
    return nil, nil, fmt.Errorf("can't apply configuration: %v", err)
    }
    if ctx.cfg == nil {
    ctx.cfg = new(Config)
    }
    return ctx.cfg, ctx.tests, nil
    }


  64. Let’s look at an example
    config.js file.


  65. settings({
    name: 'Example Canary',
    });
    var frequency = '10m';
    var timeout = '10m';
    register_test(
    {
    name: 'simple example test',
    frequency: frequency,
    timeout: timeout,
    },
    file('simple-example.js')
    );

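The config passes `frequency` and `timeout` as strings such as '10m'. The slides do not show how `ottoutil.Duration` converts them, so the following is only a sketch that assumes Go-style duration strings and uses `time.ParseDuration` from the standard library. `testSchedule` and `parseSchedule` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// testSchedule mirrors the frequency and timeout fields of the config
// object. The type name is hypothetical.
type testSchedule struct {
	Frequency time.Duration
	Timeout   time.Duration
}

// parseSchedule converts duration strings such as "10m" into
// time.Duration values and rejects non-positive ones.
func parseSchedule(frequency, timeout string) (testSchedule, error) {
	f, err := time.ParseDuration(frequency)
	if err != nil {
		return testSchedule{}, fmt.Errorf("invalid frequency %q: %v", frequency, err)
	}
	t, err := time.ParseDuration(timeout)
	if err != nil {
		return testSchedule{}, fmt.Errorf("invalid timeout %q: %v", timeout, err)
	}
	if f <= 0 || t <= 0 {
		return testSchedule{}, errors.New("frequency and timeout must be positive")
	}
	return testSchedule{Frequency: f, Timeout: t}, nil
}

func main() {
	s, err := parseSchedule("10m", "30s")
	fmt.Println(s.Frequency, s.Timeout, err)
}
```

Validating the values when the config is loaded surfaces a typo at startup instead of as a test that never runs or never times out.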

  66. How does config.Load even get
    called?


  67. Let's create our daemon and
    call this function.


  68. // cmd/canaryd/main.go
    func main() {}


  69. // cmd/canaryd/main.go
    func main() {
    log.SetFormatter(&log.JSONFormatter{})
    log.SetOutput(os.Stdout)
    var (
    cfgPath = flag.String("cfg", "config.js", "path to a JS config file")
    workDir = flag.String("work.dir", ".", "directory from which to run, should match expectations about relative paths in cfg.file")
    )
    flag.Parse()
    if *workDir != "" {
    if err := os.Chdir(*workDir); err != nil {
    log.WithError(err).WithField("work.dir", *workDir).Fatal("can't change dir")
    }
    }
    var (
    vm = otto.New()
    )
    canaryCfg, testCfgs := mustLoadConfigs(vm, *cfgPath)
    }


  70. func mustLoadConfigs(vm *otto.Otto, filename string) (*canary.Config, []*js.TestConfig) {
    cfg, err := os.Open(filename)
    if err != nil {
    log.WithError(err).WithField("filename", filename).Fatal("could not open configuration file")
    }
    defer cfg.Close()
    canaryConfig, testCfgs, err := canary.Load(vm, cfg)
    if err != nil {
    log.WithError(err).Fatal("cannot load canary test configs")
    }
    return canaryConfig, testCfgs
    }


  71. The Test Runner


  72. // internal/js/runner/runner.go
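    func Run(ctx context.Context, vm *otto.Otto, jsctx *js.Context, test *js.Test, id string) (err error) {
    testVM := vm.Copy()
    // otto only delivers interrupts once the channel exists; the buffer keeps the sender from blocking.
    testVM.Interrupt = make(chan func(), 1)
    done := make(chan struct{})
    defer close(done)
    // The interrupt panics inside testVM.Run, so turn that panic back into an error.
    defer func() {
    if r := recover(); r != nil {
    err = fmt.Errorf("test interrupted: %v", r)
    }
    }()
    go func() {
    select {
    case <-ctx.Done():
    testVM.Interrupt <- func() { panic(ctx.Err()) }
    case <-done:
    }
    }()
    _, err = testVM.Run(test.Script)
    if oe, ok := err.(*otto.Error); ok {
    err = errors.New(oe.String())
    }
    return err
    }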

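The runner stops a test by sending a panicking function to the VM when the context ends. As far as I know, otto's documented halting pattern also creates the `Interrupt` channel before running and recovers the panic around `Run`. The sketch below shows that control flow with a toy interpreter in place of the JavaScript VM, so it needs only the standard library. `machine` and `runWithContext` are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// machine is a toy interpreter standing in for the JavaScript VM. Between
// steps it checks an interrupt channel and calls whatever function it finds.
type machine struct {
	interrupt chan func()
}

func (m *machine) run(steps int) {
	for i := 0; i < steps; i++ {
		select {
		case fn := <-m.interrupt:
			fn() // expected to panic and unwind the interpreter
		default:
		}
		time.Sleep(time.Millisecond) // pretend to execute one instruction
	}
}

// runWithContext runs the machine and converts an interrupt panic into an error.
func runWithContext(ctx context.Context, m *machine, steps int) (err error) {
	m.interrupt = make(chan func(), 1) // buffered so the watcher never blocks
	done := make(chan struct{})
	defer close(done)
	defer func() {
		if r := recover(); r != nil {
			e, ok := r.(error)
			if !ok {
				panic(r) // re-raise anything that is not an error value
			}
			err = e
		}
	}()
	go func() {
		select {
		case <-ctx.Done():
			m.interrupt <- func() { panic(ctx.Err()) }
		case <-done:
		}
	}()
	m.run(steps)
	return nil
}

// demo runs one machine that exceeds its deadline and one that finishes.
func demo() (timedOut, finished bool) {
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond)
	defer cancel()
	timedOut = errors.Is(runWithContext(ctx, &machine{}, 1000), context.DeadlineExceeded)
	finished = runWithContext(context.Background(), &machine{}, 3) == nil
	return timedOut, finished
}

func main() {
	timedOut, finished := demo()
	fmt.Println("timed out:", timedOut, "finished:", finished)
}
```

The `done` channel lets the watcher goroutine exit when the run finishes first, so no goroutine is left waiting on a context that never ends.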

  73. Extending the VM with logging,
    HTTP, and standard library
    functionality


  74. Standard Library Functions


  75. // internal/js/context/stdlib.go
    func LoadStdLib(ctx context.Context, vm *otto.Otto, pkgname string) error {
    v, err := vm.Run(`({})`)
    if err != nil {
    return err
    }
    pkg := v.Object()
    for name, subpkg := range map[string]func(*otto.Otto) (otto.Value, error){
    "time": (&timePkg{ctx: ctx}).load,
    "os": (&osPkg{ctx: ctx}).load,
    "do": (newDoPkg(ctx)).load,
    } {
    v, err := subpkg(vm)
    if err != nil {
    return fmt.Errorf("can't load package %q: %v", name, err)
    }
    if err := pkg.Set(name, v); err != nil {
    return fmt.Errorf("can't set package %q: %v", name, err)
    }
    }
    return vm.Set(pkgname, pkg)
    }


  76. type osPkg struct {
    ctx context.Context
    }
    func (svc *osPkg) load(vm *otto.Otto) (otto.Value, error) {
    return ottoutil.ToPkg(vm, map[string]func(all otto.FunctionCall) otto.Value{
    "getenv": svc.getenv,
    }), nil
    }
    func (svc *osPkg) getenv(all otto.FunctionCall) otto.Value {
    vm := all.Otto
    key := ottoutil.String(vm, all.Argument(0))
    v, err := otto.ToValue(os.Getenv(key))
    if err != nil {
    ottoutil.Throw(vm, "can't set string value: %v", err)
    }
    return v
    }


  77. var secret = std.os.getenv("SOME_SECRET_VARIABLE");


  78. HTTP Support


  79. // internal/js/context/http.go
    // LoadHTTP loads an HTTP package in the VM that sends HTTP requests using the
    // given client.
    func LoadHTTP(vm *otto.Otto, pkgname string, client *http.Client, cfgReq func(*http.Request) *http.Request) error {
    v, err := (&httpPkg{client: client, cfgReq: cfgReq}).load(vm)
    if err != nil {
    return err
    }
    return vm.Set(pkgname, v)
    }


  80. type httpPkg struct {
    client *http.Client
    cfgReq func(*http.Request) *http.Request
    }
    func (hpkg *httpPkg) load(vm *otto.Otto) (otto.Value, error) {
    v, err := vm.Run(`({})`)
    if err != nil {
    return q, err
    }
    pkg := v.Object()
    for name, method := range map[string]func(all otto.FunctionCall) otto.Value{
    "do": hpkg.do,
    } {
    if err := pkg.Set(name, method); err != nil {
    return q, fmt.Errorf("can't set method %q, %v", name, err)
    }
    }
    return pkg.Value(), nil
    }


  81. func (hpkg *httpPkg) do(all otto.FunctionCall) otto.Value {
    vm := all.Otto
    var (
    method = ottoutil.String(vm, all.Argument(0))
    url = ottoutil.String(vm, all.Argument(1))
    headers = ottoutil.StringMapSlice(vm, all.Argument(2))
    body = ottoutil.String(vm, all.Argument(3))
    )
    req, err := http.NewRequest(method, url, strings.NewReader(body))
    if err != nil {
    ottoutil.Throw(vm, err.Error())
    }
    for k, vals := range headers {
    for _, val := range vals {
    req.Header.Add(k, val)
    }
    }
    resp, err := hpkg.client.Do(hpkg.cfgReq(req))
    if err != nil {
    ottoutil.Throw(vm, err.Error())
    }
    defer resp.Body.Close()
    respBody, err := ioutil.ReadAll(resp.Body)
    if err != nil {
    ottoutil.Throw(vm, err.Error())
    }
    v, err := vm.Run(`({})`)
    if err != nil {
    ottoutil.Throw(vm, err.Error())
    }
    pkg := v.Object()
    for name, value := range map[string]interface{}{
    "code": resp.StatusCode,
    "headers": map[string][]string(resp.Header),
    "body": string(respBody),
    } {
    if err := pkg.Set(name, value); err != nil {
    ottoutil.Throw(vm, err.Error())
    }
    }
    return v
    }


  82. var response = http.do('GET', 'https://icanhazip.com', {});

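A rough Go-side equivalent of that JavaScript call, as a sketch using only the standard library. It returns the same three fields that `http.do` hands back to the script (code, headers, body) and binds the request to a context so a timeout cancels it. The `result` type and `do` helper are hypothetical, and a local `httptest` server with a placeholder address stands in for the real site so the example runs offline.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// result carries the three fields the script receives. The type is hypothetical.
type result struct {
	Code    int
	Headers map[string][]string
	Body    string
}

// do sends one request bound to ctx and reads the whole response body.
func do(ctx context.Context, client *http.Client, method, url string) (*result, error) {
	req, err := http.NewRequestWithContext(ctx, method, url, nil)
	if err != nil {
		return nil, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return &result{Code: resp.StatusCode, Headers: resp.Header, Body: string(body)}, nil
}

// demo starts a local test server and performs one GET against it.
func demo() (int, string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "203.0.113.7\n") // placeholder address from a documentation range
	}))
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	res, err := do(ctx, srv.Client(), http.MethodGet, srv.URL)
	if err != nil {
		return 0, "", err
	}
	return res.Code, res.Body, nil
}

func main() {
	code, body, err := demo()
	fmt.Println(code, body, err)
}
```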

  83. Structured Logging


  84. // internal/js/context/log.go
    // LoadLog loads a log package in the VM that logs to the given logger
    func LoadLog(vm *otto.Otto, pkgname string, ll log.FieldLogger) error {
    // Set up the log formatter to emit structured JSON.
    log.SetFormatter(&log.JSONFormatter{})
    // Write to stdout so the output can be captured.
    log.SetOutput(os.Stdout)
    v, err := (&logger{ll: ll}).load(vm)
    if err != nil {
    return err
    }
    return vm.Set(pkgname, v)
    }


  85. type logger struct {
    ll log.FieldLogger
    }


  86. func (ll *logger) load(vm *otto.Otto) (otto.Value, error) {
    v, err := vm.Run(`({})`)
    if err != nil {
    return q, err
    }
    pkg := v.Object()
    for name, method := range map[string]func(all otto.FunctionCall) otto.Value{
    "kv": ll.kv,
    "info": ll.info,
    "error": ll.error,
    "fail": ll.fail,
    } {
    if err := pkg.Set(name, method); err != nil {
    return q, fmt.Errorf("can't set method %q, %v", name, err)
    }
    }
    return pkg.Value(), nil
    }


  87. func (ll *logger) kv(all otto.FunctionCall) otto.Value {
    vm := all.Otto
    var child *log.Entry
    switch {
    case all.Argument(0).IsObject():
    obj := all.Argument(0).Object()
    keys := obj.Keys()
    f := make(log.Fields, len(keys))
    for _, key := range obj.Keys() {
    v, err := obj.Get(key)
    if err != nil {
    ottoutil.Throw(vm, err.Error())
    }
    gov, err := v.Export()
    if err != nil {
    ottoutil.Throw(vm, err.Error())
    }
    f[key] = gov
    }
    child = ll.ll.WithFields(f)
    case len(all.ArgumentList)%2 == 0:
    args := all.ArgumentList
    f := make(log.Fields, len(args)/2)
    for i := 0; i < len(args); i += 2 {
    k := ottoutil.String(vm, args[i])
    v, err := args[i+1].Export()
    if err != nil {
    ottoutil.Throw(vm, err.Error())
    }
    f[k] = v
    }
    child = ll.ll.WithFields(f)
    default:
    ottoutil.Throw(vm, "invalid call to log.kv")
    }
    v, err := (&logger{ll: child}).load(vm)
    if err != nil {
    ottoutil.Throw(vm, err.Error())
    }
    return v
    }


  88. var q = otto.UndefinedValue()
    func (ll *logger) info(all otto.FunctionCall) otto.Value {
    vm := all.Otto
    msg := ottoutil.String(vm, all.Argument(0))
    ll.ll.Info(msg)
    return q
    }


  89. func (ll *logger) fail(all otto.FunctionCall) otto.Value {
    vm := all.Otto
    msg := ottoutil.String(vm, all.Argument(0))
    ll.ll.Error(msg)
    ottoutil.Throw(all.Otto, msg)
    return q
    }


  90. log.kv('foo', 'bar').info('this is an example message');

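The `log.kv('foo', 'bar')` form relies on the branch of `kv` that reads its arguments as alternating keys and values. A standalone sketch of that pairing logic, without otto or logrus, might look like this. `fieldsFromPairs` is a hypothetical helper, not a function from the talk's code.

```go
package main

import (
	"errors"
	"fmt"
)

// fieldsFromPairs turns alternating key/value arguments into a field map.
// Keys must be strings and the argument count must be even.
func fieldsFromPairs(args ...interface{}) (map[string]interface{}, error) {
	if len(args)%2 != 0 {
		return nil, errors.New("odd number of arguments")
	}
	fields := make(map[string]interface{}, len(args)/2)
	for i := 0; i < len(args); i += 2 {
		key, ok := args[i].(string)
		if !ok {
			return nil, fmt.Errorf("argument %d is not a string key", i)
		}
		fields[key] = args[i+1]
	}
	return fields, nil
}

func main() {
	fields, err := fieldsFromPairs("foo", "bar", "attempt", 2)
	fmt.Println(fields, err)
}
```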

  91. Let’s add them to the VM in
    the runner


  92. func Run(ctx context.Context, vm *otto.Otto, jsctx *js.Context, test *js.Test, id string) error {
    testVM := vm.Copy()
    reqConfig := func(req *http.Request) *http.Request {
    return req.WithContext(ctx)
    }
    if err := jscontext.LoadStdLib(ctx, testVM, "std"); err != nil {
    return fmt.Errorf("can't setup std package in VM: %v", err)
    }
    if err := jscontext.LoadHTTP(testVM, "http", jsctx.HTTPClient, reqConfig); err != nil {
    return fmt.Errorf("can't setup HTTP package in VM: %v", err)
    }
    if err := jscontext.LoadLog(testVM, "log", jsctx.Log); err != nil {
    return fmt.Errorf("can't setup LOG package in VM: %v", err)
    }

    }


  93. Let's add the runner to the
    daemon.


  94. func main() {
    log.SetFormatter(&log.JSONFormatter{})
    // Write to stdout so the output can be captured.
    log.SetOutput(os.Stdout)
    var (
    cfgPath = flag.String("cfg", "config.js", "path to a JS config file")
    workDir = flag.String("work.dir", ".", "directory from which to run, should match expectations about relative paths in cfg.file")
    )
    flag.Parse()
    if *workDir != "" {
    if err := os.Chdir(*workDir); err != nil {
    log.WithError(err).WithField("work.dir", *workDir).Fatal("can't change dir")
    }
    }
    var (
    ctx = context.Background()
    vm = otto.New()
    )
    canaryCfg, testCfgs := mustLoadConfigs(vm, *cfgPath)
    launchTests(vm, canaryCfg, testCfgs)
    // Block forever because we want the tests to run forever.
    select {}
    }


  95. func launchTests(vm *otto.Otto, config *canary.Config, configs []*js.TestConfig) {
    for _, cfg := range configs {
    go runTestForever(db, met, vm, cfg, cfg.Test())
    }
    }


  96. func runTestForever(db dbpkg.CanaryStore, met *metrics.Node, vm *otto.Otto, cfg *js.TestConfig, test *js.Test) {
    ll := log.WithFields(log.Fields{
    "test.name": test.Name,
    })
    for {
    go func(vm *otto.Otto) {
    testID := uuid.New()
    // Shadow ll so concurrent runs do not share or overwrite each other's test.id field.
    ll := ll.WithField("test.id", testID)
    testCtx := &js.Context{
    Log: ll,
    HTTPClient: &http.Client{
    Transport: http.DefaultTransport,
    },
    }
    ctx, cancel := context.WithTimeout(context.Background(), cfg.Timeout)
    defer cancel()
    terr := runner.Run(ctx, vm, testCtx, test, testID)
    if terr != nil {
    ll.WithError(terr).Error("test failed")
    }
    }(vm.Copy()) // copy VM to avoid polluting global namespace
    // We'll run the test again after the duration we defined.
    time.Sleep(cfg.Frequency)
    }
    }


  97. Saving Test Execution Results
    to a Database



  99. // internal/db/db.go
    type CanaryStore interface {
    StartTest(id string, testName string, startTime time.Time) (*TestInstance, error)
    EndTest(test *TestInstance, failure error, endAt time.Time) error
    ListTests() ([]TestInstance, error)
    ListOngoingTests() ([]TestInstance, error)
    FindTestByID(id string) (*TestInstance, error)
    Close() error
    }


  100. // internal/db/models.go
    // TestInstance collects details about the instance of a unique
    // test execution.
    type TestInstance struct {
    TestID string `json:"id,omitempty"`
    TestName string `json:"name,omitempty"`
    StartAt time.Time `json:"start_at,omitempty"`
    EndAt time.Time `json:"end_at,omitempty"`
    Pass bool `json:"pass,omitempty"`
    FailCause string `json:"fail_cause,omitempty"`
    }
    // BoltTestInstance is what gets serialized and saved to the Bolt database. Only
    // difference is that we're going to be using strings for StartAt and EndAt
    type BoltTestInstance struct {
    TestID string `json:"id,omitempty"`
    TestName string `json:"name,omitempty"`
    StartAt string `json:"start_at,omitempty"`
    EndAt string `json:"end_at,omitempty"`
    Pass bool `json:"pass,omitempty"`
    FailCause string `json:"fail_cause,omitempty"`
    }


  101. var _ CanaryStore = (*boltStore)(nil)
    type boltStore struct {
    db *bolt.DB
    ongoingMu sync.RWMutex
    ongoing map[string]TestInstance
    cancel context.CancelFunc
    }


  102. // internal/db/db.go
    func (db *boltStore) StartTest(id string, testName string, startTime time.Time) (*TestInstance, error) {
    test := &TestInstance{
    TestID: id,
    TestName: testName,
    StartAt: startTime.UTC(),
    }
    // keep in memory until it's finished
    db.ongoingMu.Lock()
    db.ongoing[id] = *test
    db.ongoingMu.Unlock()
    return test, nil
    }


  103. func (db *boltStore) EndTest(test *TestInstance, failure error, endAt time.Time) error {
    db.ongoingMu.Lock()
    defer db.ongoingMu.Unlock()
    t, ok := db.ongoing[test.TestID]
    if !ok {
    return fmt.Errorf("test with ID does not exist: %q", test.TestID)
    }
    delete(db.ongoing, test.TestID)
    t.Pass = failure == nil
    if failure != nil {
    t.FailCause = failure.Error()
    }
    t.EndAt = endAt
    return insertTest(db.db, &t)
    }

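The start/end lifecycle can be tried without Bolt. The sketch below is a hypothetical in-memory store (`memStore`, `instance`, `start`, and `end` are illustrative names). It does not implement the `CanaryStore` interface; it only follows the same idea of keeping a test in an "ongoing" map until it ends, then recording the result.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// instance is a simplified stand-in for TestInstance.
type instance struct {
	ID, Name       string
	StartAt, EndAt time.Time
	Pass           bool
	FailCause      string
}

// memStore keeps running tests in a map and finished ones in a slice.
type memStore struct {
	mu       sync.Mutex
	ongoing  map[string]instance
	finished []instance
}

func newMemStore() *memStore {
	return &memStore{ongoing: make(map[string]instance)}
}

// start records a test as ongoing.
func (s *memStore) start(id, name string, at time.Time) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.ongoing[id] = instance{ID: id, Name: name, StartAt: at.UTC()}
}

// end moves a test from ongoing to finished and records its outcome.
func (s *memStore) end(id string, failure error, at time.Time) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	t, ok := s.ongoing[id]
	if !ok {
		return fmt.Errorf("test with ID %q does not exist", id)
	}
	delete(s.ongoing, id)
	t.Pass = failure == nil
	if failure != nil {
		t.FailCause = failure.Error()
	}
	t.EndAt = at.UTC()
	s.finished = append(s.finished, t)
	return nil
}

// demo starts two tests, fails one, and tries to end an unknown ID.
func demo() (ongoing, finished int, cause string, unknownRejected bool) {
	s := newMemStore()
	now := time.Now()
	s.start("a", "simple example test", now)
	s.start("b", "simple example test", now)
	_ = s.end("a", errors.New("status 500"), now.Add(time.Second))
	unknownRejected = s.end("missing", nil, now) != nil
	return len(s.ongoing), len(s.finished), s.finished[0].FailCause, unknownRejected
}

func main() {
	fmt.Println(demo())
}
```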

  104. func insertTest(db *bolt.DB, test *TestInstance) error {
    return db.Update(func(tx *bolt.Tx) error {
    b := tx.Bucket(testsBucket)
    // Make this something that is saveable by the database.
    dbTest := &BoltTestInstance{
    TestID: test.TestID,
    TestName: test.TestName,
    StartAt: test.StartAt.UTC().Format(time.RFC3339),
    EndAt: test.EndAt.UTC().Format(time.RFC3339),
    Pass: test.Pass,
    FailCause: test.FailCause,
    }
    // Marshal and save the encoded test.
    if buf, err := json.Marshal(dbTest); err != nil {
    return err
    } else if err := b.Put([]byte(dbTest.TestID), buf); err != nil {
    return err
    }
    return nil
    })
    }

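A small round-trip sketch of the serialization step, assuming timestamps are stored as RFC 3339 strings as in `BoltTestInstance`. The `storedInstance` type and the `encode` and `decode` helpers are hypothetical. Each timestamp is formatted from its own field, so the stored start and end times stay distinct.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// storedInstance is a trimmed-down, hypothetical version of the stored record.
type storedInstance struct {
	TestID  string `json:"id,omitempty"`
	StartAt string `json:"start_at,omitempty"`
	EndAt   string `json:"end_at,omitempty"`
}

// encode formats both timestamps as RFC 3339 strings in UTC and marshals the record.
func encode(id string, start, end time.Time) ([]byte, error) {
	return json.Marshal(storedInstance{
		TestID:  id,
		StartAt: start.UTC().Format(time.RFC3339),
		EndAt:   end.UTC().Format(time.RFC3339),
	})
}

// decode reverses encode and parses the timestamps back into time.Time values.
func decode(buf []byte) (id string, start, end time.Time, err error) {
	var s storedInstance
	if err = json.Unmarshal(buf, &s); err != nil {
		return "", time.Time{}, time.Time{}, err
	}
	if start, err = time.Parse(time.RFC3339, s.StartAt); err != nil {
		return "", time.Time{}, time.Time{}, err
	}
	if end, err = time.Parse(time.RFC3339, s.EndAt); err != nil {
		return "", time.Time{}, time.Time{}, err
	}
	return s.TestID, start, end, nil
}

// demo encodes a 90-second test run and measures the decoded duration.
func demo() (encoded string, seconds float64, err error) {
	start := time.Date(2018, time.September, 28, 12, 0, 0, 0, time.UTC)
	end := start.Add(90 * time.Second)
	buf, err := encode("abc", start, end)
	if err != nil {
		return "", 0, err
	}
	_, gotStart, gotEnd, err := decode(buf)
	if err != nil {
		return "", 0, err
	}
	return string(buf), gotEnd.Sub(gotStart).Seconds(), nil
}

func main() {
	fmt.Println(demo())
}
```

Note that `time.RFC3339` keeps whole seconds only, so sub-second precision is dropped on the way into the database.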

  105. Let’s add it to the daemon.


  106. // cmd/canaryd/main.go
    func main() {
    log.SetFormatter(&log.JSONFormatter{})
    // Write to stdout so the output can be captured.
    log.SetOutput(os.Stdout)
    var (
    cfgPath = flag.String("cfg", "config.js", "path to a JS config file")
    workDir = flag.String("work.dir", ".", "directory from which to run, should match expectations about relative paths in cfg.file")
    dbPath = flag.String("db.file", "canary.db", "file for the canary database")
    )
    flag.Parse()
    var (
    ctx = context.Background()
    vm = otto.New()
    )
    db := mustOpenBolt(*dbPath)
    defer db.Close()
    canaryCfg, testCfgs := mustLoadConfigs(vm, *cfgPath)
    launchTests(db, vm, canaryCfg, testCfgs)
    // Block forever because we want the tests to run forever.
    select {}
    }


  107. func mustOpenBolt(path string) dbpkg.CanaryStore {
    db, err := dbpkg.NewBoltStore(path)
    if err != nil {
    log.WithError(err).Fatal("can't open database")
    }
    return db
    }


  108. func runTestForever(db dbpkg.CanaryStore, met *metrics.Node, vm *otto.Otto, cfg *js.TestConfig, test *js.Test) {

    for {
    go func(vm *otto.Otto) {
    testID := uuid.New()
    // Shadow ll so concurrent runs do not share or overwrite each other's test.id field.
    ll := ll.WithField("test.id", testID)
    testCtx := &js.Context{
    Log: ll,
    HTTPClient: &http.Client{
    Transport: http.DefaultTransport,
    },
    }
    dbtest, err := db.StartTest(
    testID,
    test.Name,
    time.Now(),
    )
    if err != nil {
    ll.WithError(err).Error("could not start the test")
    return // dbtest is nil here, so calling EndTest below would dereference a nil pointer
    }
    ctx, cancel := context.WithTimeout(context.Background(), cfg.Timeout)
    defer cancel()
    terr := runner.Run(ctx, vm, testCtx, test, testID)
    if terr != nil {
    ll.WithError(terr).Error("test failed")
    }
    if err := db.EndTest(dbtest, terr, time.Now()); err != nil {
    ll.WithError(err).Error("couldn't mark test as being ended")
    }
    }(vm.Copy()) // copy VM to avoid polluting global namespace
    // We'll run the test again after the duration we defined.
    time.Sleep(cfg.Frequency)
    }
    }


  109. Front-End for Viewing Test
    Execution Results



  111. // internal/app/serve.go
    import (
    "github.com/99designs/gqlgen/handler"
    "github.com/gorilla/mux"
    "github.com/iheanyi/simple-canary/internal/db"
    "github.com/sirupsen/logrus"
    )
    // App is an instance of the dashboard for the canary.
    type App struct {
    l logrus.FieldLogger
    db db.CanaryStore
    }
    func New(db db.CanaryStore, r *mux.Router) *App {
    app := &App{
    l: logrus.WithField("component", "app"),
    db: db,
    }
    r.Handle("/", handler.Playground("GraphQL Playground", "/query"))
    r.Handle("/query", handler.GraphQL(NewExecutableSchema(Config{Resolvers: &Resolver{
    db: db,
    }})))
    return app
    }


  112. type TestInstance {
    id: ID!
    name: String!
    start_at: Time!
    end_at: Time
    pass: Boolean
    fail_cause: String
    }
    type Query {
    tests: [TestInstance!]!
    test(id: String!): TestInstance
    ongoingTests: [TestInstance!]!
    }
    scalar Time


  113. type Resolver struct {
    db dbpkg.CanaryStore
    }
    func (r *Resolver) Query() QueryResolver {
    return &queryResolver{r}
    }
    func (r *Resolver) TestInstance() TestInstanceResolver {
    return &testInstanceResolver{r}
    }
    type queryResolver struct{ *Resolver }
    func (r *queryResolver) Tests(ctx context.Context) ([]dbpkg.TestInstance, error) {
    tests, err := r.db.ListTests()
    return tests, err
    }
    func (r *queryResolver) Test(ctx context.Context, id string) (*dbpkg.TestInstance, error) {
    test, err := r.db.FindTestByID(id)
    return test, err
    }
    func (r *queryResolver) OngoingTests(ctx context.Context) ([]dbpkg.TestInstance, error) {
    tests, err := r.db.ListOngoingTests()
    return tests, err
    }
    type testInstanceResolver struct{ *Resolver }
    func (r *testInstanceResolver) ID(ctx context.Context, obj *dbpkg.TestInstance) (string, error) {
    return obj.TestID, nil
    }
    func (r *testInstanceResolver) Name(ctx context.Context, obj *dbpkg.TestInstance) (string, error) {
    return obj.TestName, nil
    }
    func (r *testInstanceResolver) StartAt(ctx context.Context, obj *dbpkg.TestInstance) (time.Time, error) {
    return obj.StartAt, nil
    }
    func (r *testInstanceResolver) EndAt(ctx context.Context, obj *dbpkg.TestInstance) (*time.Time, error) {
    return &obj.EndAt, nil
    }
    func (r *testInstanceResolver) FailCause(ctx context.Context, obj *dbpkg.TestInstance) (*string, error) {
    return &obj.FailCause, nil
    }


  114. Back to the daemon.


  115. func main() {
    log.SetFormatter(&log.JSONFormatter{})
    // Write to stdout so the output can be captured.
    log.SetOutput(os.Stdout)
    var (

    listenHost = flag.String("listen.host", "", "interface on which to listen")
    listenPort = flag.String("listen.port", "8080", "port on which to listen")
    )

    l := mustListen(*listenHost, *listenPort)
    db := mustOpenBolt(*dbPath)
    defer db.Close()
    if err := launchHTTP(ctx, l, db); err != nil {
    log.WithError(err).Fatal("can't launch http server")
    }

    }


  116. func launchHTTP(
    ctx context.Context,
    l net.Listener,
    db dbpkg.CanaryStore,
    ) error {
    addr := l.Addr().(*net.TCPAddr)
    host, err := os.Hostname()
    if err != nil {
    return fmt.Errorf("can't get hostname: %v", err)
    }
    host = net.JoinHostPort(host, strconv.Itoa(addr.Port))
    r := mux.NewRouter().Host(host).Subrouter()
    _ = app.New(db, r)
    log.WithField("host", host).Info("API starting")
    go http.Serve(l, r)
    return nil
    }



  118. Last, but not least, metrics.


  119. // internal/metrics/metrics.go
    package metrics
    import (
    "net/http"
    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
    )
    // Node is a wrapper around Prometheus's registerer interface
    type Node struct {
    registry prometheus.Registerer
    }
    // Prometheus returns an instance of the metrics node and the prometheus
    // handler.
    func Prometheus() (*Node, http.Handler) {
    registry := prometheus.NewRegistry()
    handler := promhttp.HandlerFor(registry, promhttp.HandlerOpts{})
    registry.MustRegister(prometheus.NewProcessCollector(prometheus.ProcessCollectorOpts{}))
    registry.MustRegister(prometheus.NewGoCollector())
    return &Node{
    registry: registry,
    }, handler
    }


  120. // cmd/canaryd/main.go
    func main() {

    met, hdl := metrics.Prometheus()

    if err := launchHTTP(ctx, l, hdl, db); err != nil {
    log.WithError(err).Fatal("can't launch http server")
    }
    canaryCfg, testCfgs := mustLoadConfigs(vm, *cfgPath)
    launchTests(db, met, vm, canaryCfg, testCfgs)
    // Block forever because we want the tests to run forever.
    select {}
    }


  121. func launchHTTP(
    ctx context.Context,
    l net.Listener,
    promhdl http.Handler,
    db dbpkg.CanaryStore,
    ) error {
    addr := l.Addr().(*net.TCPAddr)
    host, err := os.Hostname()
    if err != nil {
    return fmt.Errorf("can't get hostname: %v", err)
    }
    host = net.JoinHostPort(host, strconv.Itoa(addr.Port))
    r := mux.NewRouter().Host(host).Subrouter()
    _ = app.New(db, r)
    r.PathPrefix("/metrics").Handler(promhdl)
    log.WithField("host", host).Info("API starting")
    go http.Serve(l, r)
    return nil
    }


  122. func runTestForever(db dbpkg.CanaryStore, met *metrics.Node, vm *otto.Otto, cfg *js.TestConfig, test *js.Test) {
    ll := log.WithFields(log.Fields{
    "test.name": test.Name,
    })
    tmet := met.Labels(map[string]string{
    "test_name": test.Name,
    })
    var (
    started = tmet.Counter("test_started_count", "Number of tests that were started")
    finished = tmet.Counter("test_finished_count", "Number of tests that have finished", "result")
    running = tmet.Gauge("test_running_total", "Tests that are currently running")
    _ = tmet.Summary("test_duration_seconds", "Duration of tests", []float64{0.5, 0.75, 0.9, 0.99, 1.0}, "result")
    )

    }


  123. func runTestForever(db dbpkg.CanaryStore, met *metrics.Node, vm *otto.Otto, cfg *js.TestConfig, test *js.Test) {

    for {
    go func(vm *otto.Otto) {

    started.WithLabelValues().Add(1)
    running.Add(1)
    defer running.Add(-1)
    ctx, cancel := context.WithTimeout(context.Background(), cfg.Timeout)
    defer cancel()
    terr := runner.Run(ctx, vm, testCtx, test, testID)
    if terr != nil {
    finished.With(prometheus.Labels{"result": "fail"}).Add(1)
    ll.WithError(terr).Error("test failed")
    } else {
    finished.With(prometheus.Labels{"result": "pass"}).Add(1)
    }
    if err := db.EndTest(dbtest, terr, time.Now()); err != nil {
    ll.WithError(err).Error("couldn't mark test as being ended")
    }
    }(vm.Copy()) // copy VM to avoid polluting global namespace
    // We'll run the test again after the duration we defined.
    time.Sleep(cfg.Frequency)
    }
    }

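The bookkeeping pattern here (count a start, raise the running gauge, lower it with `defer`, then count the result) can be sketched without Prometheus. The version below swaps in plain `sync/atomic` counters so it runs with the standard library alone. `counters` and `track` are hypothetical names, and a real deployment would export these values through a metrics client instead.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// counters is a stand-in for the started/running/finished metrics.
type counters struct {
	started int64
	running int64
	passed  int64
	failed  int64
}

// track wraps one test run with the same bookkeeping order as the slide:
// started and running go up first, running goes back down on exit, and the
// result decides which finished counter is incremented.
func (c *counters) track(run func() error) error {
	atomic.AddInt64(&c.started, 1)
	atomic.AddInt64(&c.running, 1)
	defer atomic.AddInt64(&c.running, -1)

	err := run()
	if err != nil {
		atomic.AddInt64(&c.failed, 1)
	} else {
		atomic.AddInt64(&c.passed, 1)
	}
	return err
}

// demo runs five tests concurrently; the odd-numbered ones fail.
func demo() (started, running, passed, failed int64) {
	var (
		c  counters
		wg sync.WaitGroup
	)
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			_ = c.track(func() error {
				if i%2 == 1 {
					return errors.New("canary failed")
				}
				return nil
			})
		}(i)
	}
	wg.Wait()
	return atomic.LoadInt64(&c.started), atomic.LoadInt64(&c.running),
		atomic.LoadInt64(&c.passed), atomic.LoadInt64(&c.failed)
}

func main() {
	fmt.Println(demo())
}
```

Lowering the gauge in a `defer` keeps the running count accurate even when a test returns early with an error.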


  125. We're done! Let's see a demo?


  126. https://github.com/iheanyi/simple-canary


  127. Special thanks to Antoine Grondin


  128. Questions?


  129. Thank you!
    @kwuchu
