Slide 1

Slide 1 text

Ruleguard vs Semgrep vs CodeQL Iskander (pronounced as “Alex”) @quasilyte

Slide 2

Slide 2 text

Me & static analysis go-critic NoVerify Ruleguard

Slide 3

Slide 3 text

Our starting point We assume that: ● You know that static analysis is cool

Slide 4

Slide 4 text

Our starting point We assume that: ● You know that static analysis is cool ● You’re using golangci-lint

Slide 5

Slide 5 text

Our starting point We assume that: ● You know that static analysis is cool ● You’re using golangci-lint ● You want to create custom code checkers

Slide 6

Slide 6 text

/browsing memes/ Trying to come up with a linting idea...

Slide 7

Slide 7 text

!

Slide 8

Slide 8 text

Somewhere on Twitter... Excellent!

Slide 9

Slide 9 text

func f(w io.Writer, b []byte) {
-   io.WriteString(w, string(b))
+   w.Write(b)
}
Bad code example

Slide 10

Slide 10 text

6 hours later...

Slide 11

Slide 11 text

6 hours later... Why?! WDYM AST type types are not “types”?! No results on stackoverflow?! How?!

Slide 12

Slide 12 text

Let’s create our own linter! We’ll use a fancy go/analysis framework

Slide 13

Slide 13 text

var analyzer = &analysis.Analyzer{
    Name: "writestring",
    Doc:  "find sloppy io.WriteString() usages",
    Run:  run,
}

func run(pass *analysis.Pass) (interface{}, error) {
    // Analyzer implementation...
    return nil, nil
}
Analyzer definition

Slide 14

Slide 14 text

for _, f := range pass.Files {
    ast.Inspect(f, func(n ast.Node) bool {
        // Check n node...
        return true // keep traversing child nodes
    })
}
Analyzer implementation

Slide 15

Slide 15 text

// 1. Is it a call expression?
call, ok := n.(*ast.CallExpr)
if !ok || len(call.Args) != 2 {
    return true
}
Check n node: part 1

Slide 16

Slide 16 text

// 2. Is it an io.WriteString() call?
fn, ok := call.Fun.(*ast.SelectorExpr)
if !ok || fn.Sel.Name != "WriteString" {
    return true
}
pkg, ok := fn.X.(*ast.Ident)
if !ok || pkg.Name != "io" {
    return true
}
Check n node: part 2

Slide 17

Slide 17 text

// 3. Is the second arg a string(b) expr?
stringCall, ok := call.Args[1].(*ast.CallExpr)
if !ok || len(stringCall.Args) != 1 {
    return true
}
stringFn, ok := stringCall.Fun.(*ast.Ident)
if !ok || stringFn.Name != "string" {
    return true
}
Check n node: part 3

Slide 18

Slide 18 text

// 4. Does b have a type of []byte?
b := stringCall.Args[0]
if pass.TypesInfo.TypeOf(b).String() != "[]byte" {
    return true
}
Check n node: part 4
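The type check in part 4 can be tried in isolation with the standard library. This is a sketch under assumptions: it type-checks an import-free sample (so no importer is needed) and compares types with types.Identical instead of comparing strings, which is a swapped-in alternative to the slide's String() check. The names typedSrc and byteSliceConversionLines are made up for this illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// typedSrc is a hypothetical sample with no imports.
const typedSrc = `package p

func f(b []byte, s string) {
	_ = string(b)
	_ = string(s)
}
`

// byteSliceConversionLines returns the lines of string(x) calls where x is []byte.
func byteSliceConversionLines(src string) ([]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		return nil, err
	}
	info := &types.Info{Types: make(map[ast.Expr]types.TypeAndValue)}
	conf := types.Config{} // no Importer: the sample imports nothing
	if _, err := conf.Check("p", fset, []*ast.File{file}, info); err != nil {
		return nil, err
	}
	byteSlice := types.NewSlice(types.Typ[types.Byte])
	var lines []int
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok || len(call.Args) != 1 {
			return true
		}
		fn, ok := call.Fun.(*ast.Ident)
		if !ok || fn.Name != "string" {
			return true
		}
		// Compare the argument type structurally rather than by its printed name.
		if t := info.TypeOf(call.Args[0]); t != nil && types.Identical(t, byteSlice) {
			lines = append(lines, fset.Position(call.Pos()).Line)
		}
		return true
	})
	return lines, nil
}

func main() {
	lines, err := byteSliceConversionLines(typedSrc)
	if err != nil {
		panic(err)
	}
	fmt.Println(lines)
}
```

Only string(b) on line 4 of the sample is reported; string(s) is skipped because s is a string.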

Slide 19

Slide 19 text

// 5. Report the issue
msg := "io.WriteString(w, string(b)) -> w.Write(b)"
pass.Reportf(call.Pos(), msg)
Check n node: part 5

Slide 20

Slide 20 text

func main() {
    singlechecker.Main(analyzer)
}
Main function definition
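The syntactic checks (parts 1 to 3) can also be sketched without go/analysis, using only go/parser and go/ast. This is an illustration, not the analyzer from the slides: the type check from part 4 is deliberately left out, and the names writeStringSample and findWriteString are made up.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// writeStringSample is a hypothetical input file.
const writeStringSample = `package p

import "io"

func f(w io.Writer, b []byte) {
	io.WriteString(w, string(b))
	io.WriteString(w, "literal")
}
`

// findWriteString returns the lines of io.WriteString(_, string(_)) calls.
// It is purely syntactic: it does not verify that the converted value is []byte.
func findWriteString(src string) ([]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sample.go", src, 0)
	if err != nil {
		return nil, err
	}
	var lines []int
	ast.Inspect(file, func(n ast.Node) bool {
		// 1. A call expression with two arguments.
		call, ok := n.(*ast.CallExpr)
		if !ok || len(call.Args) != 2 {
			return true
		}
		// 2. The callee is spelled io.WriteString.
		fn, ok := call.Fun.(*ast.SelectorExpr)
		if !ok || fn.Sel.Name != "WriteString" {
			return true
		}
		pkg, ok := fn.X.(*ast.Ident)
		if !ok || pkg.Name != "io" {
			return true
		}
		// 3. The second argument is spelled string(x).
		conv, ok := call.Args[1].(*ast.CallExpr)
		if !ok || len(conv.Args) != 1 {
			return true
		}
		convFn, ok := conv.Fun.(*ast.Ident)
		if !ok || convFn.Name != "string" {
			return true
		}
		lines = append(lines, fset.Position(call.Pos()).Line)
		return true
	})
	return lines, nil
}

func main() {
	lines, err := findWriteString(writeStringSample)
	if err != nil {
		panic(err)
	}
	fmt.Println(lines)
}
```

Only the call on line 6 of the sample is reported; the call with a string literal on line 7 fails step 3.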

Slide 21

Slide 21 text

It works. But not without problems...

Slide 22

Slide 22 text

func f(io InputController, b []byte) {
    io.WriteString(w, string(b))
}
io could be something else!

Slide 23

Slide 23 text

func f(io InputController, b []byte) {
    io.WriteString(w, string(b))
}
io could be something else!
Need to check that io is a package

Slide 24

Slide 24 text

import "github.com/quasilyte/io" // not stdlib!

func f(b []byte) {
    io.WriteString(w, string(b))
}
io could be something else!
But even if it is a package we can get confused
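One way to resolve both problems is to ask the type checker what the io identifier refers to: a *types.PkgName whose imported path is exactly "io". The sketch below uses only the standard library. To stay self-contained it relies on a stub importer (stubImporter is hypothetical and fabricates a package declaring WriteString), so it illustrates the lookup, not a production setup.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// stubImporter is a hypothetical importer: for any path it returns a package
// named "io" that declares func WriteString(w interface{}, s string).
type stubImporter struct{}

func (stubImporter) Import(path string) (*types.Package, error) {
	pkg := types.NewPackage(path, "io")
	params := types.NewTuple(
		types.NewVar(token.NoPos, pkg, "w", types.NewInterfaceType(nil, nil).Complete()),
		types.NewVar(token.NoPos, pkg, "s", types.Typ[types.String]),
	)
	sig := types.NewSignatureType(nil, nil, nil, params, nil, false)
	pkg.Scope().Insert(types.NewFunc(token.NoPos, pkg, "WriteString", sig))
	pkg.MarkComplete()
	return pkg, nil
}

const (
	srcStdlib = `package p
import "io"
func f(w interface{}, b []byte) { io.WriteString(w, string(b)) }
`
	srcThirdParty = `package p
import "github.com/quasilyte/io"
func f(w interface{}, b []byte) { io.WriteString(w, string(b)) }
`
	srcParam = `package p
type InputController struct{}
func (InputController) WriteString(w interface{}, s string) {}
func f(io InputController, w interface{}, b []byte) { io.WriteString(w, string(b)) }
`
)

// callsStdlibIO reports whether src has an X.WriteString selector whose X
// resolves to a package imported from the path "io".
func callsStdlibIO(src string) (bool, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		return false, err
	}
	info := &types.Info{Uses: make(map[*ast.Ident]types.Object)}
	conf := types.Config{Importer: stubImporter{}}
	if _, err := conf.Check("p", fset, []*ast.File{file}, info); err != nil {
		return false, err
	}
	found := false
	ast.Inspect(file, func(n ast.Node) bool {
		sel, ok := n.(*ast.SelectorExpr)
		if !ok || sel.Sel.Name != "WriteString" {
			return true
		}
		id, ok := sel.X.(*ast.Ident)
		if !ok {
			return true
		}
		// A parameter named io is a *types.Var, so this assertion fails for it.
		if pkgName, ok := info.Uses[id].(*types.PkgName); ok && pkgName.Imported().Path() == "io" {
			found = true
		}
		return true
	})
	return found, nil
}

func main() {
	for _, src := range []string{srcStdlib, srcThirdParty, srcParam} {
		ok, err := callsStdlibIO(src)
		if err != nil {
			panic(err)
		}
		fmt.Println(ok)
	}
}
```

Inside a go/analysis pass the same lookup would go through pass.TypesInfo.Uses, with a real importer supplied by the framework.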

Slide 25

Slide 25 text

The warning message is not perfect

Slide 26

Slide 26 text

The warning message is not perfect. The []byte variable is called “x”, not “b”

Slide 27

Slide 27 text

It could be worse

Slide 28

Slide 28 text

Let’s try again. Now with ruleguard

Slide 29

Slide 29 text

func writeString(m fluent.Matcher) {
    m.Match(`io.WriteString($w, string($b))`).
        Where(m["b"].Type.Is("[]byte")).
        Report("$$ -> $w.Write($b)")
}
writeString rule

Slide 30

Slide 30 text

func writeString(m fluent.Matcher) {
    m.Match(`io.WriteString($w, string($b))`).
        Where(m["b"].Type.Is("[]byte")).
        Report("$$ -> $w.Write($b)")
}
writeString rule
A rules group named writeString (may include several rules)

Slide 31

Slide 31 text

func writeString(m fluent.Matcher) {
    m.Match(`io.WriteString($w, string($b))`).
        Where(m["b"].Type.Is("[]byte")).
        Report("$$ -> $w.Write($b)")
}
writeString rule
AST pattern

Slide 32

Slide 32 text

func writeString(m fluent.Matcher) {
    m.Match(`io.WriteString($w, string($b))`).
        Where(m["b"].Type.Is("[]byte")).
        Report("$$ -> $w.Write($b)")
}
writeString rule
Result filter

Slide 33

Slide 33 text

func writeString(m fluent.Matcher) {
    m.Match(`io.WriteString($w, string($b))`).
        Where(m["b"].Type.Is("[]byte")).
        Report("$$ -> $w.Write($b)")
}
writeString rule
Warning message template

Slide 34

Slide 34 text

The warning message is perfect!

Slide 35

Slide 35 text

func writeString(m fluent.Matcher) {
    m.Match(`io.WriteString($w, string($b))`).
        Where(m["b"].Type.Is("[]byte")).
        Suggest("$w.Write($b)")
}
writeString rule
Auto fix replacement template (can be combined with Report)

Slide 36

Slide 36 text

With -fix, suggestions are applied automagically

Slide 37

Slide 37 text

Let’s try semgrep

Slide 38

Slide 38 text

rules:
  - id: writestring
    patterns:
      - pattern: io.WriteString($W, string($B))
    message: "use $W.Write($B)"
    languages: [go]
    severity: ERROR
writestring.yml

Slide 39

Slide 39 text

Something went wrong...

Slide 40

Slide 40 text

Something went wrong... False positive!

Slide 41

Slide 41 text

rules:
  - id: writestring
    patterns:
      - pattern: io.WriteString($W, string($B))
    message: "use $W.Write($B)"
    languages: [go]
    severity: ERROR
writestring.yml
TODO: type filters

Slide 42

Slide 42 text

By the way... Have you heard of YAML5?

Slide 43

Slide 43 text

{
  rules: [
    {
      id: 'writestring',
      patterns: [
        {pattern: 'io.WriteString($W, string($B))'},
      ],
      message: 'use $W.Write($B)',
      languages: ['go'],
      severity: 'ERROR',
    },
  ],
}
Using YAML5 format for semgrep rules

Slide 44

Slide 44 text

Let’s try CodeQL

Slide 50

Slide 50 text

import go

from CallExpr c, Expr w, ConversionExpr conv, SelectorExpr fn
where
  w = c.getArgument(0) and
  conv = c.getArgument(1) and
  fn = c.getCalleeExpr() and
  fn.getSelector().getName() = "WriteString" and
  fn.getBase().toString() = "io" and
  conv.getOperand().getType() instanceof ByteSliceType and
  conv.getType() instanceof StringType
select c, "use " + w + ".Write(" + conv.getOperand() + ")"
CodeQL query

Slide 51

Slide 51 text

How to run? ● Use the online query console ● Select quasilyte/codeql-test project ● Copy/paste query from the previous slide

Slide 52

Slide 52 text

CodeQL pros ● SSA support

Slide 53

Slide 53 text

CodeQL pros ● SSA support ● Taint analysis (source-sink)

Slide 54

Slide 54 text

CodeQL pros ● SSA support ● Taint analysis (source-sink) ● Not limited by (Go) syntax rules

Slide 55

Slide 55 text

CodeQL pros ● SSA support ● Taint analysis (source-sink) ● Not limited by (Go) syntax rules ● Real declarative programming language

Slide 56

Slide 56 text

CodeQL pros ● SSA support ● Taint analysis (source-sink) ● Not limited by (Go) syntax rules ● Real declarative programming language ● Backed by GitHub

Slide 57

Slide 57 text

CodeQL pros ● SSA support ● Taint analysis (source-sink) ● Not limited by (Go) syntax rules ● Real declarative programming language ● Backed by GitHub (Microsoft)

Slide 58

Slide 58 text

CodeQL pros ● SSA support ● Taint analysis (source-sink) ● Not limited by (Go) syntax rules ● Real declarative programming language ● Backed by GitHub (Microsoft) ● 1st class GitHub integration

Slide 59

Slide 59 text

Truth be told... Ruleguard and Semgrep CodeQL

Slide 60

Slide 60 text

CodeQL cons The main points that I want to cover: 1. Steep learning curve 2. Simple things are not simple 3. Non-trivial QL may look alien for many people

Slide 61

Slide 61 text

Why Ruleguard then? ● Very easy to get started (just “go get” it)

Slide 62

Slide 62 text

Why Ruleguard then? ● Very easy to get started (just “go get” it) ● Rules are written in pure Go

Slide 63

Slide 63 text

Why Ruleguard then? ● Very easy to get started (just “go get” it) ● Rules are written in pure Go ● Integrated in golangci-lint and go-critic

Slide 64

Slide 64 text

Why Ruleguard then? ● Very easy to get started (just “go get” it) ● Rules are written in pure Go ● Integrated in golangci-lint and go-critic ● Simple things are simple

Slide 65

Slide 65 text

Why Ruleguard then? ● Very easy to get started (just “go get” it) ● Rules are written in pure Go ● Integrated in golangci-lint and go-critic ● Simple things are simple ● Very Go-centric (both pro and con)

Slide 66

Slide 66 text

Using ruleguard from golangci

Slide 67

Slide 67 text

Enabling Ruleguard
1. Install golangci-lint in your pipeline (if you haven't already)
2. Prepare a rules file (a Go file with ruleguard rules)
3. Enable ruleguard in the golangci-lint config
You can also use Ruleguard directly or via go-critic.

Slide 68

Slide 68 text

ruleguard

Slide 69

Slide 69 text

go-critic ruleguard

Slide 70

Slide 70 text

go-critic golangci ruleguard

Slide 71

Slide 71 text

linters:
  enable:
    - gocritic
linters-settings:
  gocritic:
    enabled-checks:
      - ruleguard
    settings:
      ruleguard:
        rules: "rules.go"
.golangci.yml checklist

Slide 72

Slide 72 text

linters:
  enable:
    - gocritic
linters-settings:
  gocritic:
    enabled-checks:
      - ruleguard
    settings:
      ruleguard:
        rules: "rules.go"
.golangci.yml checklist
go-critic linter should be enabled

Slide 73

Slide 73 text

linters:
  enable:
    - gocritic
linters-settings:
  gocritic:
    enabled-checks:
      - ruleguard
    settings:
      ruleguard:
        rules: "rules.go"
.golangci.yml checklist
ruleguard checker should be enabled

Slide 74

Slide 74 text

linters:
  enable:
    - gocritic
linters-settings:
  gocritic:
    enabled-checks:
      - ruleguard
    settings:
      ruleguard:
        rules: "rules.go"
.golangci.yml checklist
rules param should be set

Slide 75

Slide 75 text

Ruleguard guide

Slide 76

Slide 76 text

m.Match(`pattern1`, `pattern2`) Match() does the syntax matching

Slide 77

Slide 77 text

m.Match(`pattern1`, `pattern2`) Match() does the syntax matching Matching alternations: pattern1|pattern2

Slide 78

Slide 78 text

`$x = $x` pattern string

Slide 79

Slide 79 text

`$x = $x` pattern string Parsed AST

Slide 80

Slide 80 text

`$x = $x` pattern string Parsed AST Modified AST (with meta nodes)

Slide 81

Slide 81 text

func match(pat, n ast.Node) bool
pat is a compiled pattern
n is a node being matched
AST matching engine

Slide 82

Slide 82 text

Algorithm ● Both pat and n are traversed ● Non-meta nodes are compared normally ● pat meta nodes are separate cases ● Named matches are collected (capture) ● Some patterns may involve backtracking

Slide 83

Slide 83 text

● $x is a simple “match any” named match ● $_ is a “match any” unnamed match ● $*_ matches zero or more nodes Meta node examples

Slide 84

Slide 84 text

Pattern matching. Pattern $x=$x parses to (= $x $x); target a+=10 parses to (+= a 10).

Slide 85

Slide 85 text

Pattern matching. Pattern $x=$x parses to (= $x $x); target a+=10 parses to (+= a 10).

Slide 86

Slide 86 text

Pattern matching. Pattern $x=$x parses to (= $x $x); target a=10 parses to (= a 10).

Slide 87

Slide 87 text

Pattern matching. Pattern $x=$x parses to (= $x $x); target a=10 parses to (= a 10).

Slide 88

Slide 88 text

Pattern matching. Pattern $x=$x parses to (= $x $x); target a=10 parses to (= a 10). $x is bound to a

Slide 89

Slide 89 text

Pattern matching. Pattern $x=$x parses to (= $x $x); target a=10 parses to (= a 10). a != 10

Slide 90

Slide 90 text

Pattern matching. Pattern $x=$x parses to (= $x $x); target a=a parses to (= a a).

Slide 91

Slide 91 text

Pattern matching. Pattern $x=$x parses to (= $x $x); target a=a parses to (= a a).

Slide 92

Slide 92 text

Pattern matching. Pattern $x=$x parses to (= $x $x); target a=a parses to (= a a). $x is bound to a

Slide 93

Slide 93 text

Pattern matching. Pattern $x=$x parses to (= $x $x); target a=a parses to (= a a). a = a, pattern matched
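The walk-through above can be reproduced with a toy matcher. This is a heavily simplified sketch, not the gogrep or ruleguard implementation: it handles only identifiers, basic literals and assignment statements, encodes $x as the identifier meta_x so the Go parser accepts it (an encoding made up here), and does no backtracking.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
	"strings"
)

// metaPrefix is a hypothetical encoding: "$x" becomes the Go identifier "meta_x".
const metaPrefix = "meta_"

// parseStmt parses one statement, rewriting "$" so the Go parser accepts it.
func parseStmt(src string) (ast.Stmt, error) {
	src = strings.ReplaceAll(src, "$", metaPrefix)
	file, err := parser.ParseFile(token.NewFileSet(), "", "package p; func _() { "+src+" }", 0)
	if err != nil {
		return nil, err
	}
	body := file.Decls[0].(*ast.FuncDecl).Body
	if len(body.List) != 1 {
		return nil, fmt.Errorf("want exactly 1 statement, got %d", len(body.List))
	}
	return body.List[0], nil
}

// match traverses pat and n together, collecting named captures in caps.
func match(pat, n ast.Node, caps map[string]ast.Expr) bool {
	switch pat := pat.(type) {
	case *ast.Ident:
		if strings.HasPrefix(pat.Name, metaPrefix) { // meta node
			expr, ok := n.(ast.Expr)
			if !ok {
				return false
			}
			name := strings.TrimPrefix(pat.Name, metaPrefix)
			if prev, bound := caps[name]; bound {
				// A repeated $x must match the same source text.
				return types.ExprString(prev) == types.ExprString(expr)
			}
			caps[name] = expr
			return true
		}
		id, ok := n.(*ast.Ident)
		return ok && id.Name == pat.Name
	case *ast.BasicLit:
		lit, ok := n.(*ast.BasicLit)
		return ok && lit.Kind == pat.Kind && lit.Value == pat.Value
	case *ast.AssignStmt:
		as, ok := n.(*ast.AssignStmt)
		return ok && as.Tok == pat.Tok &&
			matchExprs(pat.Lhs, as.Lhs, caps) && matchExprs(pat.Rhs, as.Rhs, caps)
	}
	return false // node kinds outside this sketch never match
}

func matchExprs(pats, nodes []ast.Expr, caps map[string]ast.Expr) bool {
	if len(pats) != len(nodes) {
		return false
	}
	for i := range pats {
		if !match(pats[i], nodes[i], caps) {
			return false
		}
	}
	return true
}

// matchStmt reports whether the target statement matches the pattern.
func matchStmt(pattern, target string) (bool, error) {
	pat, err := parseStmt(pattern)
	if err != nil {
		return false, err
	}
	n, err := parseStmt(target)
	if err != nil {
		return false, err
	}
	return match(pat, n, make(map[string]ast.Expr)), nil
}

func main() {
	for _, target := range []string{"a += 10", "a = 10", "a = a"} {
		ok, err := matchStmt("$x = $x", target)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%q: %v\n", target, ok)
	}
}
```

The three targets mirror the slides: a += 10 fails on the operator, a = 10 fails because $x is already bound to a, and a = a matches.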

Slide 94

Slide 94 text

m.Where(cond1 && cond2) Where() is for the match filtering

Slide 95

Slide 95 text

m.Where(cond1 && cond2) Where() is for the match filtering Where expression

Slide 96

Slide 96 text

m.Where(cond1 && cond2) Where() is for the match filtering Where expression operands

Slide 97

Slide 97 text

Where() expression operands ● Matched text predicates

Slide 98

Slide 98 text

Where() expression operands ● Matched text predicates ● Properties like AssignableTo/ConvertibleTo/Pure

Slide 99

Slide 99 text

Where() expression operands ● Matched text predicates ● Properties like AssignableTo/ConvertibleTo/Pure ● Check whether a value implements interface

Slide 100

Slide 100 text

Where() expression operands ● Matched text predicates ● Properties like AssignableTo/ConvertibleTo/Pure ● Check whether a value implements interface ● Type matching expressions

Slide 101

Slide 101 text

Where() expression operands ● Matched text predicates ● Properties like AssignableTo/ConvertibleTo/Pure ● Check whether a value implements interface ● Type matching expressions ● File-related filters (like “file imports X”)

Slide 102

Slide 102 text

Type matching examples
Pattern          | Matches
$t               | Arbitrary type
[]byte           | Byte slice type
[]$t             | Arbitrary slice type
map[$t]$t        | Map with $t key and value types
map[$t]struct{}  | Any set-like map
func($_) $_      | Any T1->T2 function type

Slide 103

Slide 103 text

Type matching examples (cont.)
Pattern               | Matches
struct{$*_}           | Arbitrary struct
struct{$x; $x}        | Struct of 2 $x-typed fields
struct{$_; $_}        | Struct with any 2 fields
struct{$x; $*_}       | Struct that starts with $x field
struct{$*_; $x}       | Struct that ends with $x field
struct{$*_; $x; $*_}  | Struct that contains $x field
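For comparison, here is roughly what a single type pattern, map[$t]struct{}, looks like when written by hand against go/types. It is an approximation and an assumption about intent: it inspects underlying types, so it also accepts named maps and named empty structs, which the typematch pattern itself may not. The names typesSample, isSetLikeMap and setLikeVars are made up.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// typesSample is a hypothetical import-free input.
const typesSample = `package p

var (
	set   map[string]struct{}
	index map[string]int
	names []string
)
`

// isSetLikeMap reports whether t has the shape map[K]struct{}.
func isSetLikeMap(t types.Type) bool {
	m, ok := t.Underlying().(*types.Map)
	if !ok {
		return false
	}
	s, ok := m.Elem().Underlying().(*types.Struct)
	return ok && s.NumFields() == 0
}

// setLikeVars type-checks src and classifies every package-level object.
func setLikeVars(src string) (map[string]bool, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		return nil, err
	}
	conf := types.Config{} // no Importer: the sample imports nothing
	pkg, err := conf.Check("p", fset, []*ast.File{file}, nil)
	if err != nil {
		return nil, err
	}
	result := make(map[string]bool)
	for _, name := range pkg.Scope().Names() {
		result[name] = isSetLikeMap(pkg.Scope().Lookup(name).Type())
	}
	return result, nil
}

func main() {
	result, err := setLikeVars(typesSample)
	if err != nil {
		panic(err)
	}
	for _, name := range []string{"set", "index", "names"} {
		fmt.Println(name, result[name])
	}
}
```

The pattern form expresses the same intent in one string, which is the convenience the type matching examples are meant to show.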

Slide 104

Slide 104 text

// Just report a message
m.Report("warning message")
// Report + do an auto fix in -fix mode
m.Suggest("autofix template")
Report() and Suggest() handle a match

Slide 105

Slide 105 text

More ruleguard examples

Slide 106

Slide 106 text

func printFmt(m fluent.Matcher) {
    m.Match(`fmt.Println($s, $*_)`).
        Where(m["s"].Text.Matches(`%[sdv]`)).
        Report("found formatting directives")
}
Find formatting directives in non-formatting fmt calls
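The Text.Matches filter applies a regular expression to the source text captured by $s. The regex from the rule can be tried on its own with the standard regexp package. The helper name hasDirective is made up, and the inputs are written as they would appear in source, quotes included.

```go
package main

import (
	"fmt"
	"regexp"
)

// directive is the same expression the rule passes to Text.Matches.
var directive = regexp.MustCompile(`%[sdv]`)

// hasDirective reports whether the given source text contains %s, %d or %v.
func hasDirective(text string) bool {
	return directive.MatchString(text)
}

func main() {
	fmt.Println(hasDirective(`"%s items left"`)) // contains %s
	fmt.Println(hasDirective(`"100% done"`))     // % is followed by a space
}
```

The filter is textual, so a directive anywhere in the first argument's text triggers the report.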

Slide 107

Slide 107 text

func badLock(m fluent.Matcher) {
    m.Match(`$mu.Lock(); $mu.Unlock()`).
        Report(`$mu unlocked immediately`)
    m.Match(`$mu.Lock(); defer $mu.RUnlock()`).
        Report(`maybe $mu.RLock() is intended?`)
}
Find mutex usage issues (real-world example)

Slide 108

Slide 108 text

func sprintErr(m fluent.Matcher) {
    m.Match(`fmt.Sprint($err)`,
        `fmt.Sprintf("%s", $err)`,
        `fmt.Sprintf("%v", $err)`).
        Where(m["err"].Type.Is(`error`)).
        Suggest(`$err.Error()`)
}
Suggest error.Error() instead

Slide 109

Slide 109 text

func arrayDeref(m fluent.Matcher) {
    m.Match(`(*$arr)[$i]`).
        Where(m["arr"].Type.Is(`*[$_]$_`)).
        Suggest(`$arr[$i]`)
}
Find redundant explicit array dereference expressions
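Why the dereference is redundant: Go indexes through a pointer to an array automatically, so both spellings read the same element. A small sketch with made-up helper names:

```go
package main

import "fmt"

// viaExplicitDeref is the form the rule reports.
func viaExplicitDeref(arr *[3]int, i int) int {
	return (*arr)[i]
}

// viaImplicitDeref is the suggested form; the pointer is dereferenced implicitly.
func viaImplicitDeref(arr *[3]int, i int) int {
	return arr[i]
}

func main() {
	arr := &[3]int{10, 20, 30}
	fmt.Println(viaExplicitDeref(arr, 1) == viaImplicitDeref(arr, 1))
}
```

This holds for pointers to arrays only, not pointers to slices, which is why the rule filters on the `*[$_]$_` type pattern.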

Slide 110

Slide 110 text

func osFilepath(m fluent.Matcher) {
    m.Match(`os.PathSeparator`).
        Where(m.File().Imports("path/filepath")).
        Suggest(`filepath.Separator`)
}
Suggest filepath.Separator instead of os.PathSeparator

Slide 111

Slide 111 text

# -e runs a single inline rule
ruleguard -e 'm.Match(`!($x != $y)`)' file.go
Running ruleguard with -e

Slide 112

Slide 112 text

Side-by-side comparison

Slide 113

Slide 113 text

Ruleguard vs Semgrep vs CodeQL
Tool          | Written in
go-ruleguard  | Go
Semgrep       | Mostly OCaml
CodeQL        | ??? (Compiler+Runtime are closed source)

Slide 114

Slide 114 text

Ruleguard vs Semgrep vs CodeQL
Tool          | Written in
go-ruleguard  | Go
Semgrep       | Not Go
CodeQL        | Probably not Go

Slide 115

Slide 115 text

Ruleguard vs Semgrep vs CodeQL
Tool          | Matching mechanism
go-ruleguard  | AST patterns
Semgrep       | AST patterns
CodeQL        | Dedicated query language

Slide 116

Slide 116 text

Ruleguard vs Semgrep vs CodeQL
Tool          | Type matching mechanism
go-ruleguard  | Typematch patterns + predicates
Semgrep       | N/A (planned, but not implemented yet)
CodeQL        | Type assertion-like API

Slide 117

Slide 117 text

Ruleguard vs Semgrep vs CodeQL
Tool          | DSL
go-ruleguard  | Go
Semgrep       | YAML files
CodeQL        | Dedicated query language

Slide 118

Slide 118 text

Ruleguard vs Semgrep vs CodeQL
Tool          | Supported languages
go-ruleguard  | Go
Semgrep       | Go + other languages
CodeQL        | Go + other languages

Slide 119

Slide 119 text

Ruleguard vs Semgrep vs CodeQL
Tool          | How much you can do
go-ruleguard  | Simple-medium diagnostics
Semgrep       | Simple-medium diagnostics
CodeQL        | Almost whatever you want

Slide 120

Slide 120 text

Links ● Ruleguard quickstart: EN, RU ● Ruleguard DSL documentation ● Ruleguard examples: one, two ● gogrep - AST patterns matching library for Go ● A list of similar tools ● .golangci.yml from go-critic (uses ruleguard)

Slide 121

Slide 121 text

Ruleguard vs Semgrep vs CodeQL Iskander (Alex) Sharipov @quasilyte