Slide 1

Slide 1 text

Analyzing Go programs
No reading required
Francesc Campoy
Gopher Developer Advocate at Google

Slide 2

Slide 2 text

Agenda
- Program Analysis
  - Dynamic
  - Static
- A cool demo

Slide 3

Slide 3 text

About me
Developer Advocate for Google Cloud Platform
@francesc

Slide 4

Slide 4 text

Program Analysis
Given a program, we extract properties:
- correctness
- robustness
- performance
- others (well formatted, idiomatic, ...)
This is done automatically; no human is involved.

Slide 5

Slide 5 text

Program analysis families
There are two families:
- Dynamic
- Static

Slide 6

Slide 6 text

Dynamic Analysis

Slide 7

Slide 7 text

Dynamic Analysis
We observe the behavior of the running program:
- it often requires instrumenting the program
- it doesn't prove a property, it looks for failures

Slide 8

Slide 8 text

Dynamic analysis: unit tests
Verify the correctness of a function.

    func Sum(vs []int) int {
        s := 0
        for _, v := range vs {
            s += v
        }
        return s
    }

Slide 9

Slide 9 text

Dynamic analysis: unit tests
The verification is done with more code; no instrumentation is needed.

    func TestSum(t *testing.T) {
        vs := []int{1, 2, 3}
        s := Sum(vs)
        if s != 6 {
            t.Errorf("sum(%v) should be 6; got %v", vs, s)
        }
    }

You can run tests:

    $ go test
    PASS
    ok      golang.org/x/talks/2015/program-analysis/sum    0.019s

Slide 10

Slide 10 text

Dynamic analysis: benchmarks
Benchmarks can be used to measure the performance of a function.

    func BenchmarkSum1(b *testing.B) {
        for i := 0; i < b.N; i++ {
            s = Sum([]int{1})
        }
    }

    func BenchmarkSum10(b *testing.B) {
        for i := 0; i < b.N; i++ {
            s = Sum([]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10})
        }
    }

You can run benchmarks:

    $ go test -bench=.
    PASS
    BenchmarkSum1-4         500000000                3.53 ns/op
    BenchmarkSum10-4        100000000               11.3 ns/op
    ok      golang.org/x/talks/2015/program-analysis/sum    3.290s

Slide 11

Slide 11 text

Dynamic analysis: bounds checks
Go programs are instrumented to detect accesses to invalid positions in a slice.

    func Sum(vs []int) int {
        s := 0
        for i := 0; i <= len(vs); i++ { // <= reads one element past the end
            s += vs[i]
        }
        return s
    }

    func main() {
        v := []int{1, 2, 3, 4}
        fmt.Println(Sum(v[:3]))
    }

    $ go run bounds.go
    panic: runtime error: index out of range

We can disable bounds checking with -gcflags=-B

Slide 12

Slide 12 text

Dynamic analysis: race detector
All memory accesses are instrumented to detect data races.
Is this code correct?

    go test -race race.go

    func main() {
        n := 0
        go func() {
            for range time.Tick(time.Second) {
                n++
            }
        }()
        for range time.Tick(time.Second / 5) {
            fmt.Println(n)
        }
    }
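
The code above is not correct: n is written by one goroutine and read by another with no synchronization, which is a data race. One possible fix, shown here as a minimal sketch, is to route every access through sync/atomic. The counter type and the shortened tick intervals are assumptions made for illustration and are not part of the original slides.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// counter is a hypothetical helper: its field is only touched via sync/atomic.
type counter struct{ n int64 }

// Inc atomically adds one to the counter.
func (c *counter) Inc() { atomic.AddInt64(&c.n, 1) }

// Load atomically reads the current value.
func (c *counter) Load() int64 { return atomic.LoadInt64(&c.n) }

func main() {
	var c counter
	go func() {
		// Shorter interval than on the slide so the demo finishes quickly.
		for range time.Tick(10 * time.Millisecond) {
			c.Inc()
		}
	}()
	// Print a few samples and exit (the loop on the slide runs forever).
	for i := 0; i < 5; i++ {
		time.Sleep(25 * time.Millisecond)
		fmt.Println(c.Load())
	}
}
```

Running this version with the -race flag should no longer report a race, since both goroutines use atomic operations on the same word. A sync.Mutex around n would work as well.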

Slide 13

Slide 13 text

Sieve of Eratosthenes
A method to obtain the list of prime numbers (source: Wikipedia)

Slide 14

Slide 14 text

Concurrent prime sieve
Using channels and goroutines, by Russ Cox
- Generate generates all the numbers starting from 2
- Filter removes all the multiples of a given value
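
The two stages described above can be sketched as follows. This is the well-known channel-based formulation and not necessarily the exact code shown in the talk; the Primes helper and its parameter n are assumptions added so the example terminates.

```go
package main

import "fmt"

// Generate sends the sequence 2, 3, 4, ... to ch.
func Generate(ch chan<- int) {
	for i := 2; ; i++ {
		ch <- i
	}
}

// Filter copies values from in to out, dropping multiples of prime.
func Filter(in <-chan int, out chan<- int, prime int) {
	for {
		i := <-in
		if i%prime != 0 {
			out <- i
		}
	}
}

// Primes (hypothetical helper) returns the first n primes by chaining
// one Filter goroutine per prime found so far.
func Primes(n int) []int {
	ch := make(chan int)
	go Generate(ch)
	primes := make([]int, 0, n)
	for i := 0; i < n; i++ {
		prime := <-ch // the first value to survive all filters is prime
		primes = append(primes, prime)
		next := make(chan int)
		go Filter(ch, next, prime)
		ch = next
	}
	return primes
}

func main() {
	fmt.Println(Primes(10)) // [2 3 5 7 11 13 17 19 23 29]
}
```

The goroutines are never stopped in this sketch; they simply block once Primes returns, which is acceptable for a short-lived demo but not for long-running code.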

Slide 15

Slide 15 text

Dynamic analysis: pprof
Analyzing the code for the concurrent prime sieve:

    func main() {
        c, err := os.Create("primes.prof")
        if err != nil {
            log.Fatal(err)
        }
        pprof.StartCPUProfile(c)
        defer pprof.StopCPUProfile()
        run()
    }

Execute the program first, then generate a graph:

    $ go tool pprof --svg pprof primes.prof > primes.svg

Similar analysis can be done on memory and locks.

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

Dynamic analysis: execution tracer
Captures with nanosecond precision:
- goroutine creation/start/end
- goroutine blocking/unblocking
- network blocking
- system calls
- GC events

Slide 19

Slide 19 text

Dynamic analysis: execution tracer
Run a test with -trace:

    $ go test -trace=trace.out

And start a tracer on it:

    $ go tool trace trace.test trace.out

Slide 20

Slide 20 text

trace of all the program activity

Slide 21

Slide 21 text

trace for a goroutine

Slide 22

Slide 22 text

Dynamic analysis: others
Other dynamic analysis tools:
- Debuggers
- Code coverage
- go-fuzz
- what else?

Slide 23

Slide 23 text

Dynamic analysis conclusion
Pros:
- often conceptually simple
- finds only issues that are ACTUALLY occurring
Cons:
- often makes your program slower
- finds ONLY issues that are actually occurring

Slide 24

Slide 24 text

Static Analysis

Slide 25

Slide 25 text

Static Analysis
The program is not executed; instead, the code is analyzed.
- no need to instrument any code
- unlike dynamic analysis, it can prove specific properties

Slide 26

Slide 26 text

The usual suspects
Some go* tools:
- gofmt: verify formatting conventions
      import ("math"; "fmt"; "io")
- golint: other conventions (naming, docs, etc.)
      var some_value int
- go vet: find possible errors
      fmt.Printf("%v + %v = %v", a, b)
- godoc: find exported identifiers and their docs

Slide 27

Slide 27 text

Errcheck
Finds all the errors that have been implicitly ignored.

    func main() {
        os.Remove("somefile") // this is reported
    }

Errors ignored explicitly are not reported.

    func main() {
        _ = os.Remove("somefile") // this is fine
    }

github.com/kisielk/errcheck
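
To give a feel for how such a checker works, here is a deliberately naive sketch built on the standard go/parser and go/ast packages. It is not how errcheck is implemented: the real tool uses type information to report only calls that return an error, whereas this hypothetical bareCalls function flags every call used as a statement.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// bareCalls (hypothetical helper) returns the positions of call expressions
// used as statements, that is, calls whose results (if any) are dropped.
// Assumption: no type checking is done, so calls without results are flagged too.
func bareCalls(src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return nil, err
	}
	var found []string
	ast.Inspect(f, func(n ast.Node) bool {
		stmt, ok := n.(*ast.ExprStmt)
		if !ok {
			return true // keep walking the tree
		}
		if call, ok := stmt.X.(*ast.CallExpr); ok {
			found = append(found, fset.Position(call.Pos()).String())
		}
		return true
	})
	return found, nil
}

// src holds the two cases from the slide in a single file.
const src = `package main

import "os"

func main() {
	os.Remove("somefile")     // reported
	_ = os.Remove("somefile") // not reported
}
`

func main() {
	calls, err := bareCalls(src)
	if err != nil {
		panic(err)
	}
	for _, c := range calls {
		fmt.Println("result ignored at", c)
	}
}
```

Only the first call is listed, because the second one is an assignment statement rather than a bare expression statement. The program under analysis is parsed, never executed.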

Slide 28

Slide 28 text

Analysis of "almost-correct" code Impossible to do with dynamic analysis. Useful for "lazy" code authors (like me).

Slide 29

Slide 29 text

goimports
- Finds unused imported packages and removes them
- Finds missing imports and looks them up in your GOPATH by matching the name of the package and the name of the identifier used
Limitation: which package defines template.Template?

Slide 30

Slide 30 text

goreturns
Finds return statements where fewer values than expected are returned and the last returned type is error, and adds the missing zero values.
goreturns turns:

    func foo() (int, bool, string, error) {
        return fmt.Errorf("awful stuff")
    }

into:

    func foo() (int, bool, string, error) {
        return 0, false, "", fmt.Errorf("awful stuff")
    }

github.com/sqs/goreturns

Slide 31

Slide 31 text

A bit harder
What's the value returned by this function?

    func foo(n int) interface{} {
        switch {
        case n < 0:
            return nil
        case n%2 == 0:
            return "even"
        case n == 42:
            return 42.0
        default:
            return false
        }
    }

We want to understand more than types: we want to follow the flow of a program.
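
Running the function on a few inputs shows why flow matters. In the sketch below the sample inputs and the comment on the third case are additions for illustration. Since 42 is even, the n == 42 case can never be selected, so foo(42) returns "even".

```go
package main

import "fmt"

// foo is the function from the slide, unchanged apart from the comment.
func foo(n int) interface{} {
	switch {
	case n < 0:
		return nil
	case n%2 == 0:
		return "even"
	case n == 42:
		return 42.0 // unreachable: 42 is even, so the previous case wins
	default:
		return false
	}
}

func main() {
	// Print the dynamic value and type returned for a few sample inputs.
	for _, n := range []int{-1, 3, 4, 42} {
		v := foo(n)
		fmt.Printf("foo(%d) = %v (%T)\n", n, v, v)
	}
}
```

The type system only says the result is an interface{}. Knowing that the float64 branch is dead requires reasoning about which conditions can hold at each point.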

Slide 32

Slide 32 text

Static Single Assignment Intermediate Representation

Slide 33

Slide 33 text

SSA IR

Slide 34

Slide 34 text

SSA IR
Static Single Assignment Intermediate Representation
- Intermediate representation for code
- Every variable is assigned at most once
- Data flow analysis is easier based on this form
Data flow analysis determines the possible values of a variable at some point and, therefore, the possible ways some code can be executed.

Slide 35

Slide 35 text

SSA
"A program is defined to be in SSA form if each variable is a target of exactly one assignment statement in the program text."
So given a piece of code like this:

    a := 0
    b := 1
    b = a + b

We can convert it to:

    a1 := 0
    b1 := 1
    b2 := a1 + b1

Slide 36

Slide 36 text

The importance of naming
Variables in Go have the same issue:

    x := 1
    y := x + 1
    x = 2
    z := x + 1

What's the value of x? Are y and z equal?

Slide 37

Slide 37 text

Referential transparency
SSA enforces one condition: one definition for each variable, so a variable's value never changes.
Similar to enforcing:
- const in C++
- final in Java
- but not const in Go

Slide 38

Slide 38 text

Referential transparency
The value of a variable is independent of its position.
Referentially transparent expressions are independent of the order of evaluation.

Slide 39

Slide 39 text

golang.org/x/tools/go/ssa
SSA IR is an intermediate representation
- golang.org/x/tools/go/ssa provides the building blocks
- golang.org/x/tools/cmd/ssadump provides a tool to display SSA forms of Go programs

Slide 40

Slide 40 text

ssadump
Given this factorial function:

    func fact(x int) int {
        if x == 0 {
            return 1
        }
        return x * fact(x-1)
    }

We can generate its SSA dump by running:

    ssadump -build=F fact.go

Slide 41

Slide 41 text

ssadump

    # Name: fact.fact
    # Package: fact
    # Location: fact.go:3:6
    func fact(x int) int:
    0:                                              entry P:0 S:2
            t0 = x == 0:int                                  bool
            if t0 goto 1 else 2
    1:                                            if.then P:1 S:0
            return 1:int
    2:                                            if.done P:1 S:0
            t1 = x - 1:int                                    int
            t2 = fact(t1)                                     int
            t3 = x * t2                                       int
            return t3

Slide 42

Slide 42 text

SSA and Go

Slide 43

Slide 43 text

SSA
New SSA backend planned for Go 1.6
Some of the expected improvements:
- better common subexpression elimination
- better dead code elimination
- better register allocation
- better stack frame allocation

Slide 44

Slide 44 text

Oracle
Source analysis tool:
- invoked by an editor
- answers questions about Go programs
Powered by:
- SSA IR
- Pointer Analysis

Slide 45

Slide 45 text

Demo time!

Slide 46

Slide 46 text

Pointer analysis on godoc

    godoc -analysis=pointer

(live)

Slide 47

Slide 47 text

Conclusion
Program analysis:
- dynamic
- static
Provides tools for:
- verification
- editing
- exploration
Use the tools that exist and build the ones that you want!

Slide 48

Slide 48 text

Thank you
Francesc Campoy
Gopher Developer Advocate at Google
@francesc