for _, f := range pass.Files {
	ast.Inspect(f, func(n ast.Node) bool {
		// Check n node...
		return true // keep walking into child nodes
	})
}
Analyzer implementation
Slide 15
Slide 15 text
// 1. Is it a call expression?
call, ok := n.(*ast.CallExpr)
if !ok || len(call.Args) != 2 {
	return true
}
Check n node: part 1
Slide 16
Slide 16 text
// 2. Is it io.WriteString() call?
fn, ok := call.Fun.(*ast.SelectorExpr)
if !ok || fn.Sel.Name != "WriteString" {
	return true
}
pkg, ok := fn.X.(*ast.Ident)
if !ok || pkg.Name != "io" {
	return true
}
Check n node: part 2
Slide 17
Slide 17 text
// 3. Is the second arg a string(b) expr?
stringCall, ok := call.Args[1].(*ast.CallExpr)
if !ok || len(stringCall.Args) != 1 {
	return true
}
stringFn, ok := stringCall.Fun.(*ast.Ident)
if !ok || stringFn.Name != "string" {
	return true
}
Check n node: part 3
Slide 18
Slide 18 text
// 4. Does b have the type []byte?
b := stringCall.Args[0]
if pass.TypesInfo.TypeOf(b).String() != "[]byte" {
	return true
}
Check n node: part 4
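Comparing the printed type against "[]byte" is the simplest check, but it is string-based: as far as I can tell, a value declared as []uint8 prints as "[]uint8" and would be skipped even though it is the same type. Below is a minimal sketch of a structural check using only go/types. The helper names isByteSlice and exprIsByteSlice are made up for this example; they are not part of the analyzer shown on the slides.

```go
package main

import (
	"fmt"
	"go/token"
	"go/types"
)

// isByteSlice reports whether t is a slice whose element type is byte
// (byte and uint8 are the same basic kind in go/types).
func isByteSlice(t types.Type) bool {
	if t == nil {
		return false
	}
	slice, ok := t.Underlying().(*types.Slice)
	if !ok {
		return false
	}
	elem, ok := slice.Elem().Underlying().(*types.Basic)
	return ok && elem.Kind() == types.Byte
}

// exprIsByteSlice is a hypothetical helper for trying the check out:
// it type-checks a standalone expression in the universe scope.
func exprIsByteSlice(expr string) bool {
	tv, err := types.Eval(token.NewFileSet(), nil, token.NoPos, expr)
	return err == nil && isByteSlice(tv.Type)
}

func main() {
	exprs := []string{"[]byte(nil)", "[]uint8(nil)", "[]rune(nil)", `"text"`}
	for _, expr := range exprs {
		fmt.Printf("%-14s %v\n", expr, exprIsByteSlice(expr))
	}
}
```

Inside the analyzer, the same function could be called as isByteSlice(pass.TypesInfo.TypeOf(b)). Note that it also accepts named types whose underlying type is []byte, which is a slightly wider rule than the string comparison.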
Slide 19
Slide 19 text
// 5. Report the issue
msg := "io.WriteString(w, string(b)) -> w.Write(b)"
pass.Reportf(call.Pos(), msg)
Check n node: part 5
Slide 20
Slide 20 text
func main() {
	singlechecker.Main(analyzer)
}
Main function definition
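To see steps 1-5 working together without the go/analysis driver, here is a rough sketch that runs the same checks with only the standard library (go/parser + go/types). Everything outside the five checks is an assumption for the example: the function name findWriteString, the sample input, and the choice to return line numbers. The sample declares its own io value so that no importer is needed; the name-based check in step 2 cannot tell the difference.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// sample is a hypothetical input; "io" here is a local variable, not a package.
const sample = `package p

type fakeIO struct{}

func (fakeIO) WriteString(w interface{}, s string) {}

var io fakeIO

func f(w interface{}, b []byte, r []rune) {
	io.WriteString(w, string(b))
	io.WriteString(w, string(r))
	io.WriteString(w, "text")
}
`

// findWriteString returns the lines of io.WriteString(w, string(b)) calls
// where b has the type []byte.
func findWriteString(src string) ([]int, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return nil, err
	}
	info := &types.Info{
		Types: make(map[ast.Expr]types.TypeAndValue),
		Defs:  make(map[*ast.Ident]types.Object),
		Uses:  make(map[*ast.Ident]types.Object),
	}
	conf := types.Config{}
	if _, err := conf.Check("p", fset, []*ast.File{f}, info); err != nil {
		return nil, err
	}
	var lines []int
	ast.Inspect(f, func(n ast.Node) bool {
		// 1. Is it a call expression with two arguments?
		call, ok := n.(*ast.CallExpr)
		if !ok || len(call.Args) != 2 {
			return true
		}
		// 2. Does it look like io.WriteString()? (names only)
		fn, ok := call.Fun.(*ast.SelectorExpr)
		if !ok || fn.Sel.Name != "WriteString" {
			return true
		}
		pkg, ok := fn.X.(*ast.Ident)
		if !ok || pkg.Name != "io" {
			return true
		}
		// 3. Is the second argument a string(b) expression?
		stringCall, ok := call.Args[1].(*ast.CallExpr)
		if !ok || len(stringCall.Args) != 1 {
			return true
		}
		stringFn, ok := stringCall.Fun.(*ast.Ident)
		if !ok || stringFn.Name != "string" {
			return true
		}
		// 4. Does b have the type []byte?
		t := info.TypeOf(stringCall.Args[0])
		if t == nil || t.String() != "[]byte" {
			return true
		}
		// 5. Record the finding (the analyzer would call pass.Reportf here).
		lines = append(lines, fset.Position(call.Pos()).Line)
		return true
	})
	return lines, nil
}

func main() {
	lines, err := findWriteString(sample)
	fmt.Println(lines, err)
}
```

Only the string(b) call with a []byte argument is recorded; the []rune conversion and the string literal fall out at steps 4 and 3.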
Slide 21
Slide 21 text
It works
But not without problems...
Slide 23
Slide 23 text
func f(io InputController, b []byte) {
	io.WriteString(w, string(b))
}
io could be something else!
Need to check that io is a package
Slide 24
Slide 24 text
import "github.com/quasilyte/io" // not stdlib!

func f(b []byte) {
	io.WriteString(w, string(b))
}
io could be something else!
But even if it is a package we can get confused
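Both problems can be addressed with type information instead of names. A minimal sketch, assuming the identifier was resolved by go/types: check that it refers to an imported package and that the import path is exactly "io". The names isStdlibIO, usesStdlibIO and stubImporter are hypothetical; the stub importer exists only so the example runs without real package data.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
	"strings"
)

// isStdlibIO reports whether id refers to the imported package with path "io".
func isStdlibIO(info *types.Info, id *ast.Ident) bool {
	pkgName, ok := info.Uses[id].(*types.PkgName)
	return ok && pkgName.Imported().Path() == "io"
}

// stubImporter is a stand-in for a real importer (assumption for this sketch):
// every import path resolves to a package exporting only
// func WriteString(w interface{}, s string).
type stubImporter struct{}

func (stubImporter) Import(path string) (*types.Package, error) {
	name := path[strings.LastIndex(path, "/")+1:]
	pkg := types.NewPackage(path, name)
	params := types.NewTuple(
		types.NewVar(token.NoPos, pkg, "w", types.NewInterfaceType(nil, nil)),
		types.NewVar(token.NoPos, pkg, "s", types.Typ[types.String]),
	)
	sig := types.NewSignatureType(nil, nil, nil, params, nil, false)
	pkg.Scope().Insert(types.NewFunc(token.NoPos, pkg, "WriteString", sig))
	pkg.MarkComplete()
	return pkg, nil
}

// usesStdlibIO type-checks src and reports whether any X.WriteString
// selector has X resolved to the stdlib io package.
func usesStdlibIO(src string) (bool, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "input.go", src, 0)
	if err != nil {
		return false, err
	}
	info := &types.Info{Uses: make(map[*ast.Ident]types.Object)}
	conf := types.Config{Importer: stubImporter{}}
	if _, err := conf.Check("p", fset, []*ast.File{f}, info); err != nil {
		return false, err
	}
	found := false
	ast.Inspect(f, func(n ast.Node) bool {
		sel, ok := n.(*ast.SelectorExpr)
		if ok && sel.Sel.Name == "WriteString" {
			if id, ok := sel.X.(*ast.Ident); ok && isStdlibIO(info, id) {
				found = true
			}
		}
		return true
	})
	return found, nil
}

const stdlibSrc = `package p
import "io"
func f(w interface{}, b []byte) { io.WriteString(w, string(b)) }
`

const thirdPartySrc = `package p
import "github.com/quasilyte/io"
func f(w interface{}, b []byte) { io.WriteString(w, string(b)) }
`

const paramSrc = `package p
type InputController struct{}
func (InputController) WriteString(w interface{}, s string) {}
func f(io InputController, w interface{}, b []byte) { io.WriteString(w, string(b)) }
`

func main() {
	for _, src := range []string{stdlibSrc, thirdPartySrc, paramSrc} {
		fmt.Println(usesStdlibIO(src))
	}
}
```

In the analyzer from the previous slides, the equivalent call would be isStdlibIO(pass.TypesInfo, pkg) in place of the pkg.Name != "io" comparison.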
Slide 26
Slide 26 text
The warning message is not perfect
The []byte variable is called “x”, not “b”
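One way to fix the message is to build it from the matched source text instead of hard-coded names. A small sketch, assuming the call already has the io.WriteString(w, string(b)) shape; the function name suggest is made up for this example, and types.ExprString is used to print the argument expressions.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/types"
)

// suggest builds the warning text from the actual argument expressions.
func suggest(src string) (string, error) {
	expr, err := parser.ParseExpr(src)
	if err != nil {
		return "", err
	}
	call, ok := expr.(*ast.CallExpr)
	if !ok || len(call.Args) != 2 {
		return "", fmt.Errorf("not a two-argument call: %s", src)
	}
	conv, ok := call.Args[1].(*ast.CallExpr)
	if !ok || len(conv.Args) != 1 {
		return "", fmt.Errorf("second argument is not a conversion: %s", src)
	}
	w := types.ExprString(call.Args[0])
	b := types.ExprString(conv.Args[0])
	return fmt.Sprintf("io.WriteString(%s, string(%s)) -> %s.Write(%s)", w, b, w, b), nil
}

func main() {
	msg, err := suggest("io.WriteString(buf, string(x))")
	fmt.Println(msg, err)
}
```

In the analyzer this would replace the constant msg, so a variable called x is reported as x.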
func writeString(m fluent.Matcher) {
	m.Match(`io.WriteString($w, string($b))`).
		Where(m["b"].Type.Is("[]byte")).
		Report("$$ -> $w.Write($b)")
}
writeString rule
A rules group named writeString
(May include several rules)
{
  rules: [
    {
      id: 'writestring',
      patterns: [
        {pattern: 'io.WriteString($W, string($B))'},
      ],
      message: 'use $W.Write($B)',
      languages: ['go'],
      severity: 'ERROR',
    },
  ],
}
Using YAML5 format for semgrep rules
Slide 44
Slide 44 text
Let’s try CodeQL
Slide 50
Slide 50 text
import go

from CallExpr c,
     Expr w,
     ConversionExpr conv,
     SelectorExpr fn
where w = c.getArgument(0)
  and conv = c.getArgument(1)
  and fn = c.getCalleeExpr()
  and fn.getSelector().getName() = "WriteString"
  and fn.getBase().toString() = "io"
  and conv.getOperand().getType() instanceof ByteSliceType
  and conv.getType() instanceof StringType
select c, "use " + w + ".Write(" + conv.getOperand() + ")"
CodeQL query
Slide 51
Slide 51 text
How to run?
● Use the online query console
● Select quasilyte/codeql-test project
● Copy/paste query from the previous slide
Slide 58
Slide 58 text
CodeQL pros
● SSA support
● Taint analysis (source-sink)
● Not limited by (Go) syntax rules
● Real declarative programming language
● Backed by GitHub (Microsoft)
● 1st class GitHub integration
Slide 59
Slide 59 text
Truth be told...
Ruleguard
and
Semgrep
CodeQL
Slide 60
Slide 60 text
CodeQL cons
The main points that I want to cover:
1. Steep learning curve
2. Simple things are not simple
3. Non-trivial QL may look alien for many people
Slide 65
Slide 65 text
Why Ruleguard then?
● Very easy to get started (just “go get” it)
● Rules are written in pure Go
● Integrated in golangci-lint and go-critic
● Simple things are simple
● Very Go-centric (both pro and con)
Slide 66
Slide 66 text
Using ruleguard from golangci
Slide 67
Slide 67 text
Enabling Ruleguard
1. Install golangci-lint in your pipeline (if you haven't yet)
2. Prepare a rules file (a Go file with ruleguard rules)
3. Enable ruleguard in golangci-lint config
You can also use Ruleguard directly or via go-critic.
func match(pat, n ast.Node) bool
pat is a compiled pattern
n is a node being matched
AST matching engine
Slide 82
Slide 82 text
Algorithm
● Both pat and n are traversed
● Non-meta nodes are compared normally
● pat meta nodes are separate cases
● Named matches are collected (capture)
● Some patterns may involve backtracking
Slide 83
Slide 83 text
● $x is a simple “match any” named match
● $_ is a “match any” unnamed match
● $*_ matches zero or more nodes
Meta node examples
Pattern matching
Pattern $x=$x     Target a=10
     =                 =
   $x  $x            a   10
The first $x is bound to a
Slide 89
Slide 89 text
Pattern matching
Pattern $x=$x     Target a=10
     =                 =
   $x  $x            a   10
The second $x must equal a, but a != 10: no match
Slide 92
Slide 92 text
Pattern matching
Pattern $x=$x     Target a=a
     =                 =
   $x  $x            a   a
The first $x is bound to a
Slide 93
Slide 93 text
Pattern matching
Pattern $x=$x     Target a=a
     =                 =
   $x  $x            a   a
The second $x also matches a (a = a): pattern matched
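The walkthrough can be reproduced with a toy matcher. This is only a sketch of the idea, not the engine ruleguard uses: it handles identifiers, literals and assignments, rewrites $x into a made-up meta_x identifier so the Go parser accepts the pattern, and compares repeated captures by their printed text. All names here (metaPrefix, parseStmt, matchMeta, matchPattern) are assumptions for the example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
	"strings"
)

// metaPrefix marks pattern variables after "$" is rewritten (hypothetical encoding).
const metaPrefix = "meta_"

// parseStmt parses a single statement, turning $x into meta_x first.
func parseStmt(src string) (ast.Stmt, error) {
	src = strings.ReplaceAll(src, "$", metaPrefix)
	file := "package p\nfunc _() {\n" + src + "\n}"
	f, err := parser.ParseFile(token.NewFileSet(), "", file, 0)
	if err != nil {
		return nil, err
	}
	body := f.Decls[0].(*ast.FuncDecl).Body.List
	if len(body) != 1 {
		return nil, fmt.Errorf("want 1 statement, got %d", len(body))
	}
	return body[0], nil
}

// match traverses pat and n together, collecting named captures in caps.
func match(pat, n ast.Node, caps map[string]ast.Expr) bool {
	switch pat := pat.(type) {
	case *ast.Ident:
		if strings.HasPrefix(pat.Name, metaPrefix) {
			return matchMeta(strings.TrimPrefix(pat.Name, metaPrefix), n, caps)
		}
		id, ok := n.(*ast.Ident)
		return ok && id.Name == pat.Name
	case *ast.BasicLit:
		lit, ok := n.(*ast.BasicLit)
		return ok && lit.Kind == pat.Kind && lit.Value == pat.Value
	case *ast.AssignStmt:
		as, ok := n.(*ast.AssignStmt)
		if !ok || as.Tok != pat.Tok ||
			len(as.Lhs) != len(pat.Lhs) || len(as.Rhs) != len(pat.Rhs) {
			return false
		}
		for i := range pat.Lhs {
			if !match(pat.Lhs[i], as.Lhs[i], caps) {
				return false
			}
		}
		for i := range pat.Rhs {
			if !match(pat.Rhs[i], as.Rhs[i], caps) {
				return false
			}
		}
		return true
	}
	return false // node kinds this sketch does not handle
}

// matchMeta handles a meta node: $_ matches anything, $name must be consistent.
func matchMeta(name string, n ast.Node, caps map[string]ast.Expr) bool {
	expr, ok := n.(ast.Expr)
	if !ok {
		return false
	}
	if name == "_" {
		return true
	}
	if prev, seen := caps[name]; seen {
		return types.ExprString(prev) == types.ExprString(expr)
	}
	caps[name] = expr
	return true
}

// matchPattern returns the captured texts and whether the pattern matched.
func matchPattern(pattern, target string) (map[string]string, bool) {
	pat, err := parseStmt(pattern)
	if err != nil {
		return nil, false
	}
	n, err := parseStmt(target)
	if err != nil {
		return nil, false
	}
	caps := map[string]ast.Expr{}
	if !match(pat, n, caps) {
		return nil, false
	}
	out := map[string]string{}
	for name, expr := range caps {
		out[name] = types.ExprString(expr)
	}
	return out, true
}

func main() {
	fmt.Println(matchPattern("$x = $x", "a = 10"))
	fmt.Println(matchPattern("$x = $x", "a = a"))
}
```

$*_ and backtracking are left out on purpose; they are the part that makes a real engine more involved.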
Slide 96
Slide 96 text
m.Where(cond1 && cond2)
Where() filters the matches
cond1 && cond2 is the Where() expression; cond1 and cond2 are its operands
Slide 101
Slide 101 text
Where() expression operands
● Matched text predicates
● Properties like AssignableTo/ConvertibleTo/Pure
● Check whether a value implements interface
● Type matching expressions
● File-related filters (like “file imports X”)
Slide 102
Slide 102 text
$t Arbitrary type
[]byte Byte slice type
[]$t Arbitrary slice type
map[$t]$t Map with $t key and value types
map[$t]struct{} Any set-like map
func($_) $_ Any T1->T2 function type
Type matching examples
Slide 103
Slide 103 text
struct{$*_} Arbitrary struct
struct{$x; $x} Struct of 2 $x-typed fields
struct{$_; $_} Struct with any 2 fields
struct{$x; $*_} Struct that starts with $x field
struct{$*_; $x} Struct that ends with $x field
struct{$*_; $x; $*_} Struct that contains $x field
Type matching examples (cont.)
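For comparison, here is roughly what one of these patterns, map[$t]struct{}, would look like as a hand-written go/types check. This is a sketch for illustration, not how ruleguard implements type matching; isSetLikeMap and exprIsSetLikeMap are hypothetical names, and named types are deliberately not unwrapped.

```go
package main

import (
	"fmt"
	"go/token"
	"go/types"
)

// isSetLikeMap reports whether t is a map type whose value type is struct{}.
func isSetLikeMap(t types.Type) bool {
	m, ok := t.(*types.Map)
	if !ok {
		return false
	}
	s, ok := m.Elem().(*types.Struct)
	return ok && s.NumFields() == 0
}

// exprIsSetLikeMap type-checks a standalone expression and applies the check.
func exprIsSetLikeMap(expr string) bool {
	tv, err := types.Eval(token.NewFileSet(), nil, token.NoPos, expr)
	return err == nil && isSetLikeMap(tv.Type)
}

func main() {
	exprs := []string{
		"(map[string]struct{})(nil)",
		"(map[string]bool)(nil)",
		"([]struct{})(nil)",
	}
	for _, expr := range exprs {
		fmt.Printf("%-28s %v\n", expr, exprIsSetLikeMap(expr))
	}
}
```

The pattern form states the same thing in one line, which is the point of the typematch syntax.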
Slide 104
Slide 104 text
// Just report a message
m.Report("warning message")
// Report + do an auto fix in -fix mode
m.Suggest("autofix template")
Report() and Suggest() handle a match
Slide 105
Slide 105 text
More ruleguard examples
Slide 106
Slide 106 text
func printFmt(m fluent.Matcher) {
m.Match(`fmt.Println($s, $*_)`).
Where(m["s"].Text.Matches(`%[sdv]`)).
Report("found formatting directives")
}
Find formatting directives in non-formatting fmt calls
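The Where() clause above is a regular expression applied to the source text of $s. A minimal sketch of that filter on its own, using the same %[sdv] expression with Go's regexp package; the function name hasDirective is made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// directiveRE is the same expression the rule passes to Text.Matches.
var directiveRE = regexp.MustCompile(`%[sdv]`)

// hasDirective reports whether the argument text contains %s, %d or %v.
func hasDirective(argText string) bool {
	return directiveRE.MatchString(argText)
}

func main() {
	for _, text := range []string{`"%s items"`, `"total: %d"`, `"100%"`, `name`} {
		fmt.Printf("%-14s %v\n", text, hasDirective(text))
	}
}
```

Because the filter sees text, not values, a non-literal first argument such as a variable is never flagged by this rule.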
# -e runs a single inline rule
ruleguard -e 'm.Match(`!($x != $y)`)' file.go
Running ruleguard with -e
Slide 112
Slide 112 text
Side-by-side comparison
Slide 113
Slide 113 text
Written in
go-ruleguard Go
Semgrep Mostly OCaml
CodeQL ??? (Compiler+Runtime are closed source)
Ruleguard vs Semgrep vs CodeQL
Slide 114
Slide 114 text
Written in
go-ruleguard Go
Semgrep Not Go
CodeQL Probably not Go
Ruleguard vs Semgrep vs CodeQL
Slide 115
Slide 115 text
Matching mechanism
go-ruleguard AST patterns
Semgrep AST patterns
CodeQL Dedicated query language
Ruleguard vs Semgrep vs CodeQL
Slide 116
Slide 116 text
Type matching mechanism
go-ruleguard Typematch patterns + predicates
Semgrep N/A (planned, but not implemented yet)
CodeQL Type assertion-like API
Ruleguard vs Semgrep vs CodeQL
Slide 117
Slide 117 text
DSL
go-ruleguard Go
Semgrep YAML files
CodeQL Dedicated query language
Ruleguard vs Semgrep vs CodeQL
Slide 118
Slide 118 text
Supported languages
go-ruleguard Go
Semgrep Go + other languages
CodeQL Go + other languages
Ruleguard vs Semgrep vs CodeQL
Slide 119
Slide 119 text
How much you can do
go-ruleguard Simple-medium diagnostics
Semgrep Simple-medium diagnostics
CodeQL Almost whatever you want
Ruleguard vs Semgrep vs CodeQL
Slide 120
Slide 120 text
Links
● Ruleguard quickstart: EN, RU
● Ruleguard DSL documentation
● Ruleguard examples: one, two
● gogrep - AST patterns matching library for Go
● A list of similar tools
● .golangci.yml from go-critic (uses ruleguard)
Slide 121
Slide 121 text
Ruleguard vs CodeQL vs Semgrep
Iskander (Alex) Sharipov @quasilyte