Slide 1

Slide 1 text

Refactoring Systems with Confidence Jesse Toth and Nathan Witmer OSCON 2015

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

Domain Model

Slide 4

Slide 4 text

Permissions

Slide 5

Slide 5 text

access control search result filtering event feeds

Slide 6

Slide 6 text

Repository User

Slide 7

Slide 7 text

Repository User Collaborator

Slide 8

Slide 8 text

Repository User Collaborator Organization

Slide 9

Slide 9 text

Repository User Collaborator Team Repository Team Organization Team Member

Slide 10

Slide 10 text

Repository User Collaborator Team Repository Team Admin Team Organization Team Member

Slide 11

Slide 11 text

repository.pullable_by?(user)

Slide 12

Slide 12 text

Organic growth and complexity

Slide 13

Slide 13 text

Performance problems

Slide 14

Slide 14 text

SELECT r.id
FROM r, (
    SELECT r.id as r_ids, 2 as perms from r
    WHERE r.owner_id = 99999
    AND (r.x = 0)
    UNION ALL
    SELECT r.id as r_ids, 1 as perms from r
    INNER JOIN p ON r.id = p.r_id
    WHERE p.u_id = 99999
    AND (r.x = 0)
    UNION ALL
    SELECT r.id as r_ids, 2 as perms from r
    INNER JOIN t ON r.o_id = t.o_id
    INNER JOIN t_m ON t.id = t_m.t_id
    WHERE t.name = 'X'
    AND t_m.u_id = 99999
    AND (r.x = 0)
    UNION ALL
    SELECT r.id as r_ids, GROUP_CONCAT(distinct t.p) as perms from r
    INNER JOIN t_m r_t ON r.id = r_t.r_id
    INNER JOIN t ON r_t.t_id = t.id
    INNER JOIN t_m u_t ON t.id = u_t.t_id
    WHERE u_t.u_id = 99999
    AND t.name != 'X'
    AND t.p in (2, 1, 0)
    AND (r.x = 0)
    GROUP BY r.id
    UNION ALL
    SELECT r.id as r_ids, 0 as perms from r
    JOIN u ON r.plan_owner_id = u.id
    JOIN t ON t.o_id = u.id
    JOIN t_m ON t.id = t_m.t_id
    WHERE u.type = 'XX'
    AND t.name = 'X'
    AND t_m.u_id = 99999
    AND (r.x = 0)
    AND r.parent_id IS NOT NULL
) AS unioned
WHERE r.id = r_ids;
user.accessible_repositories

Slide 15

Slide 15 text

Edge cases and transitional states

Slide 16

Slide 16 text

Difficult to build on

Slide 17

Slide 17 text

A New Permissions System

Slide 18

Slide 18 text

Goals

Slide 19

Slide 19 text

Simple, flexible interface

Slide 20

Slide 20 text

Fast lookups

Slide 21

Slide 21 text

Easy to integrate and operate

Slide 22

Slide 22 text

Abilities

Slide 23

Slide 23 text

action Actor Subject

Slide 24

Slide 24 text

Actor action Subject
Actor: User, Team

Slide 25

Slide 25 text

Actor action Subject
Actor: User, Team
action: read, write, admin

Slide 26

Slide 26 text

Actor action Subject
Actor: User, Team
action: read, write, admin
Subject: Team, Repository

Slide 27

Slide 27 text

User Team Repository action Actor Subject

Slide 28

Slide 28 text

User  Team  Repository  (diagram edge: read)
Actor  action  Subject
User   read    Team

Slide 29

Slide 29 text

User  Team  Repository  (diagram edges: read, write)
Actor  action  Subject
User   read    Team
Team   write   Repository

Slide 30

Slide 30 text

User  Team  Repository  (diagram edges: read, write, write)
Actor  action  Subject
User   read    Team
Team   write   Repository
User   write   Repository
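The third row on this slide (User write Repository) appears to be derived from the first two. A minimal sketch of that derivation follows, assuming that any ability a user holds on a team lets the user inherit the team's own abilities. The `Ability` struct, the `materialize` helper, and the string identifiers are hypothetical and are not taken from the talk.

```ruby
# Hypothetical in-memory model of an abilities row; not GitHub's schema.
Ability = Struct.new(:actor, :action, :subject)

# Derive indirect abilities: if an actor holds any ability on a team, and that
# team holds an ability on a subject, the actor inherits the team's action.
# (Assumption made for illustration only.)
def materialize(direct)
  indirect = direct.flat_map do |membership|
    direct
      .select { |grant| grant.actor == membership.subject }
      .map { |grant| Ability.new(membership.actor, grant.action, grant.subject) }
  end
  (direct + indirect).uniq
end

direct = [
  Ability.new("User:1", :read, "Team:7"),
  Ability.new("Team:7", :write, "Repository:42"),
]

rows = materialize(direct)
rows.each { |a| puts "#{a.actor} #{a.action} #{a.subject}" }
```

Storing the derived row trades extra writes and storage for lookups that never need to walk the team relationship.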

Slide 31

Slide 31 text

user.can? :read, repository
SELECT 1 FROM abilities
 WHERE actor_id     = user_id
   AND actor_type   = 'User'
   AND subject_id   = repository_id
   AND subject_type = 'Repository'

Slide 32

Slide 32 text

user.accessible_repositories
SELECT subject_id
  FROM abilities
 WHERE actor_id     = user_id
   AND actor_type   = 'User'
   AND subject_type = 'Repository'
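The two lookups on slides 31 and 32 can be sketched over an in-memory list of rows. This is only an illustration of the query shapes, not the production implementation: `AbilityRow` and `AbilityIndex` are hypothetical names, and the `action` filter in `can?` is an added assumption, since the query on the slide does not show one.

```ruby
# Hypothetical row type mirroring the column names used in the slide's SQL.
AbilityRow = Struct.new(:actor_id, :actor_type, :action, :subject_id, :subject_type)

class AbilityIndex
  def initialize(rows)
    @rows = rows
  end

  # Existence check for one actor/subject pair, like the "SELECT 1" query.
  # The action comparison is an assumption added for this sketch.
  def can?(actor_id, actor_type, action, subject_id, subject_type)
    @rows.any? do |row|
      row.actor_id == actor_id && row.actor_type == actor_type &&
        row.action == action &&
        row.subject_id == subject_id && row.subject_type == subject_type
    end
  end

  # All subject ids of one type for an actor, like the "SELECT subject_id" query.
  def accessible_ids(actor_id, actor_type, subject_type)
    @rows
      .select do |row|
        row.actor_id == actor_id && row.actor_type == actor_type &&
          row.subject_type == subject_type
      end
      .map(&:subject_id)
      .uniq
  end
end

index = AbilityIndex.new([
  AbilityRow.new(1, "User", :read, 42, "Repository"),
  AbilityRow.new(1, "User", :write, 43, "Repository"),
  AbilityRow.new(2, "User", :read, 44, "Repository"),
])

puts index.can?(1, "User", :read, 42, "Repository")
puts index.accessible_ids(1, "User", "Repository").inspect
```

Both lookups filter a single table on equality predicates, in contrast with the five-way UNION on slide 14.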

Slide 33

Slide 33 text

Replacing the permissions system

Slide 34

Slide 34 text

Refactoring with Scientist

Slide 35

Slide 35 text

You can’t be confident that test cases fully cover the complexity of real-world data

Slide 36

Slide 36 text

If test coverage is thin, you can’t be sure that the tests you add fully cover all behavior, especially if the intended behavior is not 100% clear in the collective knowledge of the team

Slide 37

Slide 37 text

Current behavior may have unintended bugs and side-effects that users are relying on!

Slide 38

Slide 38 text

Test suites don’t cover production performance

Slide 39

Slide 39 text

Production data is the real test

Slide 40

Slide 40 text

Diagram: repository.pullable_by?(user) → execute Legacy Code and execute Refactored Code → compare return values → publish Metrics; return result
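The flow in this diagram can be sketched in a few lines of plain Ruby. This is a hand-rolled illustration of the idea and not the Scientist library's implementation: `run_experiment`, `timed`, and the payload keys are hypothetical names chosen for this sketch.

```ruby
# Run a block and return its value together with the elapsed seconds.
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  value = yield
  [value, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

# Execute both code paths, compare the results, publish, return the legacy value.
def run_experiment(name, control:, candidate:, publish:)
  control_value, control_secs = timed { control.call }

  # The candidate must never break the caller, so its errors are captured.
  candidate_value, candidate_secs =
    begin
      timed { candidate.call }
    rescue StandardError => e
      [e, nil]
    end

  publish.call({
    name: name,
    matched: control_value == candidate_value,
    control: control_value,
    candidate: candidate_value,
    control_secs: control_secs,
    candidate_secs: candidate_secs,
  })

  control_value # the legacy result is always what the caller sees
end

published = []
result = run_experiment(
  "repository.pullable_by",
  control: -> { true },    # stands in for the legacy code path
  candidate: -> { false }, # stands in for the refactored code path
  publish: ->(payload) { published << payload }
)

puts result
puts published.first[:matched]
```

Because the caller only ever receives the control value, a wrong or failing candidate shows up in the published data rather than in user-facing behavior.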

Slide 41

Slide 41 text

Scientist A Ruby library for carefully refactoring critical paths

Slide 42

Slide 42 text

class Repository
  def pullable_by?(user)
    # old code...
  end
end

Slide 43

Slide 43 text

class Repository
  def pullable_by?(user)
    pullable_by_legacy?(user)
  end
  def pullable_by_legacy?(user)
    # old code
  end
end

Slide 44

Slide 44 text

class Repository
  def pullable_by?(user)
    pullable_by_legacy?(user)
  end
  def pullable_by_legacy?(user)
    # old code
  end
  def pullable_by_refactored?(user)
    # new code
  end
end

Slide 45

Slide 45 text

class Repository
  include Scientist
  def pullable_by?(user)
    pullable_by_legacy?(user)
  end
  def pullable_by_legacy?(user)
    # old code
  end
  def pullable_by_refactored?(user)
    # new code
  end
end

Slide 46

Slide 46 text

class Repository
  include Scientist
  def pullable_by?(user)
    # Let's do an experiment
    science "repository.pullable_by" do |experiment|
    end
  end
end

Slide 47

Slide 47 text

class Repository
  include Scientist
  def pullable_by?(user)
    # Let's do an experiment
    science "repository.pullable_by" do |experiment|
      # Return this value no matter what
      experiment.use { pullable_by_legacy?(user) }
    end
  end
end

Slide 48

Slide 48 text

class Repository
  include Scientist
  def pullable_by?(user)
    # Let's do an experiment
    science "repository.pullable_by" do |experiment|
      # Return this value no matter what
      experiment.use { pullable_by_legacy?(user) }
      # Run the new code too, and compare the results
      experiment.try { pullable_by_refactored?(user) }
    end
  end
end

Slide 49

Slide 49 text

class Repository
  include Scientist
  def pullable_by?(user)
    # Let's do an experiment
    science "repository.pullable_by" do |experiment|
      # Return this value no matter what
      experiment.use { pullable_by_legacy?(user) }
      # Run the new code too, and compare the results
      experiment.try { pullable_by_refactored?(user) }
      # Some context for published results
      experiment.context :user => user, :repo => self
    end
  end
end

Slide 50

Slide 50 text

class Repository
  include Scientist
  def pullable_by?(user)
    # Let's do an experiment
    science "repository.pullable_by" do |experiment|
      # Return this value no matter what
      experiment.use { pullable_by_legacy?(user) }
      # Run the new code too, and compare the results
      experiment.try { pullable_by_refactored?(user) }
      # Some context for published results
      experiment.context :user => user, :repo => self
    end
  end
end

Slide 51

Slide 51 text

Metrics
• number of times the experiment has run
• number of times the use and try blocks’ return values differed
• timings of each block’s execution
Redis
• return values of use and try blocks for experiments that mismatched
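A small recorder covering the items listed on this slide might look like the following. It is a sketch under stated assumptions: `ExperimentStats` and its method names are hypothetical, and plain in-memory hashes stand in for both the metrics backend and Redis.

```ruby
# Hypothetical recorder; in-memory hashes stand in for metrics and Redis.
class ExperimentStats
  attr_reader :runs, :mismatches, :timings, :mismatch_samples

  def initialize
    @runs = Hash.new(0)
    @mismatches = Hash.new(0)
    @timings = Hash.new { |hash, key| hash[key] = [] }
    @mismatch_samples = Hash.new { |hash, key| hash[key] = [] }
  end

  def record(name:, control:, candidate:, control_secs:, candidate_secs:)
    @runs[name] += 1
    @timings[name] << { control: control_secs, candidate: candidate_secs }
    return if control == candidate

    # Only mismatches keep their return values, for later inspection.
    @mismatches[name] += 1
    @mismatch_samples[name] << { control: control, candidate: candidate }
  end

  def mismatch_rate(name)
    return 0.0 if @runs[name].zero?

    @mismatches[name].to_f / @runs[name]
  end
end

stats = ExperimentStats.new
stats.record(name: "repository.pullable_by", control: true, candidate: true,
             control_secs: 0.004, candidate_secs: 0.001)
stats.record(name: "repository.pullable_by", control: true, candidate: false,
             control_secs: 0.005, candidate_secs: 0.001)

puts stats.mismatch_rate("repository.pullable_by")
```

Keeping return values only for mismatches keeps the stored data small while still leaving enough detail to debug each disagreement.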

Slide 52

Slide 52 text

No content

Slide 53

Slide 53 text

No content

Slide 54

Slide 54 text

No content

Slide 55

Slide 55 text

No content

Slide 56

Slide 56 text

No content

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

Backfill and Validation

Slide 59

Slide 59 text

Legacy Permissions Abilities read & verify

Slide 60

Slide 60 text

Legacy Permissions Abilities write read & verify

Slide 61

Slide 61 text

Legacy Permissions Abilities write read & verify backfill

Slide 62

Slide 62 text

Legacy Permissions Abilities write read & verify backfill

Slide 63

Slide 63 text

Legacy Permissions Abilities write read & verify backfill repair

Slide 64

Slide 64 text

Legacy Permissions Abilities write read & verify backfill repair

Slide 65

Slide 65 text

Legacy Permissions Abilities write read & verify backfill & validation repair

Slide 66

Slide 66 text

No content

Slide 67

Slide 67 text

Legacy Permissions Abilities write read & verify backfill & validation repair
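The backfill, validation, and repair steps named in these diagrams can be sketched as three small functions. This assumes the legacy data is treated as the source of truth while both stores are being written, which the slides suggest but do not state. The hash layout (`[user_id, repository_id] => action`) and the function names are hypothetical.

```ruby
# Backfill: copy every legacy grant into the new store.
def backfill(legacy, abilities)
  legacy.each { |key, action| abilities[key] = action }
end

# Validation: list keys where the two stores disagree, in either direction.
def validate(legacy, abilities)
  (legacy.keys | abilities.keys).reject { |key| legacy[key] == abilities[key] }
end

# Repair: overwrite or delete disagreeing rows so the new store matches legacy.
def repair(legacy, abilities, mismatched_keys)
  mismatched_keys.each do |key|
    if legacy.key?(key)
      abilities[key] = legacy[key]
    else
      abilities.delete(key)
    end
  end
end

legacy = { [1, 42] => :write, [2, 42] => :read }
abilities = {}
backfill(legacy, abilities)

# Simulate drift: one wrong action and one row with no legacy counterpart.
abilities[[2, 42]] = :admin
abilities[[3, 42]] = :read

bad = validate(legacy, abilities)
repair(legacy, abilities, bad)

puts bad.inspect
puts validate(legacy, abilities).empty?
```

Running validation again after repair gives a simple signal that the two stores have converged.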

Slide 68

Slide 68 text

The Results

Slide 69

Slide 69 text

No content

Slide 70

Slide 70 text

No content

Slide 71

Slide 71 text

No content

Slide 72

Slide 72 text

Lessons Learned

Slide 73

Slide 73 text

Production Data is the Real Test

Slide 74

Slide 74 text

Data Quality is Paramount

Slide 75

Slide 75 text

Math is Important

Slide 76

Slide 76 text

User Team Repository read write write 1 Team with 10,000 Users × 5,000 Repositories

Slide 77

Slide 77 text

User Team Repository read write write 1 Team with 10,000 Users × 5,000 Repositories = 50,000,000 rows
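The arithmetic on this slide is worth making explicit. The sketch below computes the row count when every user-to-repository ability is stored, using the slide's figures, and for comparison the count if only direct grants (user to team, team to repository) were stored. The function names are hypothetical, and the comparison is an illustration rather than a description of what the authors did.

```ruby
# One row per (user, repository) pair reachable through the team.
def materialized_rows(users:, repositories:)
  users * repositories
end

# One row per direct grant: each user on the team, and the team on each repository.
def direct_rows(users:, repositories:)
  users + repositories
end

flattened = materialized_rows(users: 10_000, repositories: 5_000)
direct = direct_rows(users: 10_000, repositories: 5_000)

puts flattened # 50000000
puts direct    # 15000
```

The multiplicative growth is why a single large team can dominate the size of a fully materialized table.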

Slide 78

Slide 78 text

User  Team  Repository  (diagram edges: read, write, write)
Actor  action  Subject
User   read    Team
Team   write   Repository
User   write   Repository

Slide 79

Slide 79 text

No content

Slide 80

Slide 80 text

No content

Slide 81

Slide 81 text

Double-Check your Queries

Slide 82

Slide 82 text

No content

Slide 83

Slide 83 text

More Uses of Scientist

Slide 84

Slide 84 text

Pipeline to render HTML, Markdown, and Blobs

Slide 85

Slide 85 text

Load-testing a new search cluster

Slide 86

Slide 86 text

Testing performance and correctness of query changes

Slide 87

Slide 87 text

github.com/github/scientist

Slide 88

Slide 88 text

Thanks! ♥