Refactoring Systems with Confidence

Refactoring Systems with Confidence

Over the last year and a half my team and I replaced a large subsystem in our application without (hardly) anyone noticing. I talked about that story, the tools and techniques we used, and what we learned along the way.

Scientist is open-source: https://github.com/github/scientist

Talk given at denver.rb and boulder.rb, December 2014.

Dc4c070ab7a2ea86886ca7f43cb49dbb?s=128

Nathan Witmer

December 09, 2014
Tweet

Transcript

  1. REFACTORING SYSTEMS with CONFIDENCE ! # $ # %

  2. NATHAN WITMER & zerowidth ' ( ) * !

  3. The DOMAIN (

  4. Permissions

  5. Who has access to what?

  6. + Access control , Search filtering - Event feeds

  7. Repository User

  8. Repository User Collaborator

  9. Repository Organization User Collaborator

  10. Repository Organization User Collaborator Team Team Repository Team Member

  11. Repository Organization User Collaborator Team Team Repository Team Member Admin

    Team
  12. repository.pullable_by?(user)

  13. user.accessible_repository_ids

  14. SELECT rs.id as ris, 2 as ps from rs WHERE

    rs.wid = 123 UNION ALL SELECT rs.id as ris, 1 as ps from rs INNER JOIN ps ON rs.id = ps.rid WHERE ps.uid = 123 UNION ALL SELECT rs.id as ris, 2 as ps from rs INNER JOIN ts ON rs.oid = ts.oid INNER JOIN tms ON ts.id = tms.tid WHERE ts.n = 1 AND tms.uid = 123 UNION ALL SELECT rs.id as ris, GROUP_CONCAT(distinct ts.p) as ps from rs INNER JOIN tms rt ON rs.id = rt.rid INNER JOIN ts ON rt.tid = ts.id INNER JOIN tms ut ON ts.id = ut.tid WHERE ut.uid = 123 AND ts.n != 1 AND ts.p in (0,1,2) GROUP BY rs.id UNION ALL SELECT r.id as ri, 0 as p from rs JOIN us ON rs.wid = us.id JOIN ts ON ts.oid = us.id JOIN tms ON ts.id = tms.tid WHERE us.type = 0 AND ts.n = 1 AND tms.uid = 123 AND (rs.p = 0) AND rs.pid IS NOT NULL WHERE rs.id = ris
  15. The PROBLEM .

  16. None
  17. + Correctness / Simplicity 0 Flexibility

  18. –@rick, 2012 We could do worse than cleaning up the

    repo / permissioning mess, but it will take quite a bit of effort. It's part of the core domain, it's in the middle of the app, and it's going to eventually bite everyone who comes near it.
  19. –@defunkt, 2012 I'd love to see someone take a step

    back and define all the possible relationships a user can have to a repository, then wrap that up in a tasty API.
  20. The REWRITE 1

  21. Abilities

  22. Subject Actor action

  23. Subject Actor action User Team

  24. Subject Actor action User Team read write admin

  25. Subject Actor action User Team Repository Team read write admin

  26. + Permissions 2 Membership

  27. User Repository Team Actor Action Subject

  28. User Repository Team read Actor Action Subject User read Team

  29. User Repository Team read write Actor Action Subject User read

    Team Team write Repository
  30. User Repository Team read write Actor Action Subject User read

    Team Team write Repository User write Repository write
  31. user.can? :read, repository

  32. user.can? :read, repository SELECT 1 FROM abilities WHERE actor_id =

    user_id AND actor_type = 'User' AND subject_id = repository_id AND subject_type = ‘Repository'
  33. user.accessible_repository_ids SELECT subject_id FROM abilities WHERE actor_type = 'User' AND

    actor_id = user_id AND subject_type = ‘Repository'
  34. The TOOLS $

  35. . Does it work? 3 Is it correct? % Is

    it fast?
  36. Legacy system New system reads

  37. scientist A Ruby library for carefully refactoring critical paths

  38. class Repository < ActiveRecord::Base def pullable_by?(user) # old code end

    end
  39. class Repository < ActiveRecord::Base def pullable_by?(user) pullable_by_legacy?(user) end def pullable_by_legacy?(user)

    # old code end end
  40. class Repository < ActiveRecord::Base def pullable_by?(user) pullable_by_legacy?(user) end def pullable_by_legacy?(user)

    # old code end def pullable_by_abilities?(user) user.can? :read, self end end
  41. class Repository < ActiveRecord::Base include Scientist def pullable_by?(user) pullable_by_legacy?(user) end

    def pullable_by_legacy?(user) # old code end def pullable_by_abilities?(user) user.can? :read, self end end
  42. class Repository < ActiveRecord::Base include Scientist def pullable_by?(user) # Let’s

    do an experiment: science "repository.pullable_by" do |experiment| end end # ... end

  43. class Repository < ActiveRecord::Base include Scientist def pullable_by?(user) # Let’s

    do an experiment: science "repository.pullable_by" do |experiment| # Return this value no matter what experiment.use { pullable_by_legacy?(user) } end end # ... end
  44. class Repository < ActiveRecord::Base include Scientist def pullable_by?(user) # Let’s

    do an experiment: science "repository.pullable_by" do |experiment| # Return this value no matter what experiment.use { pullable_by_legacy?(user) } # Run the new code too, and compare the results experiment.try { pullable_by_abilities?(user) } end end end
  45. class Repository < ActiveRecord::Base include Scientist def pullable_by?(user) # Let's

    do an experiment science "repository.pullable_by" do |experiment| # Return this value no matter what experiment.use { pullable_by_legacy?(user) } # Run the new code too, and compare the results experiment.try { pullable_by_abilities?(user) } end end end
  46. class MyExperiment < ActiveRecord::Base end

  47. class MyExperiment < ActiveRecord::Base include Scientist::Experiment end

  48. class MyExperiment < ActiveRecord::Base include Scientist::Experiment def enabled? percent_enabled >

    0 && rand(100) < percent_enabled end end
  49. None
  50. class MyExperiment < ActiveRecord::Base include Scientist::Experiment def enabled? percent_enabled >

    0 && rand(100) < percent_enabled end def publish(result) end end
  51. class MyExperiment < ActiveRecord::Base include Scientist::Experiment def enabled? percent_enabled >

    0 && rand(100) < percent_enabled end def publish(result) $statsd.timing "science.#{name}.control", result.control.duration $statsd.timing "science.#{name}.candidate", result.candidate.duration end end
  52. None
  53. class MyExperiment < ActiveRecord::Base include Scientist::Experiment def enabled? percent_enabled >

    0 && rand(100) < percent_enabled end def publish(result) $statsd.timing "science.#{name}.control", result.control.duration $statsd.timing "science.#{name}.candidate", result.candidate.duration if result.matched? $statsd.increment "science.#{name}.matched" elsif result.ignored? $statsd.increment "science.#{name}.ignored" else $statsd.increment "science.#{name}.mismatched" store_mismatch_data_in_redis(result) end end end
  54. None
  55. science "monster" do |experiment| experiment.class # => Scientist::Default end

  56. # replace `Scientist::Default` as the implementation module Scientist::Experiment def self.new(name)

    MyExperiment.find_or_initialize_by(name: name) end end
  57. # replace `Scientist::Default` as the implementation module Scientist::Experiment def self.new(name)

    MyExperiment.find_or_initialize_by(name: name) end end science "monster" do |experiment| experiment.class # => MyExperiment end
  58. class MyExperiment < ActiveRecord::Base include Scientist::Experiment def enabled? percent_enabled >

    0 && rand(100) < percent_enabled end def publish(result) $statsd.timing "science.#{name}.control", result.control.duration $statsd.timing "science.#{name}.candidate", result.candidate.duration if result.matched? $statsd.increment "science.#{name}.matched" elsif result.ignored? $statsd.increment "science.#{name}.ignored" else $statsd.increment "science.#{name}.mismatched" store_mismatch_data_in_redis(result) end end end
  59. class MyExperiment def enabled? percent_enabled > 0 && rand(100) <

    percent_enabled end end
  60. class MyExperiment def enabled? return true if Rails.env.test? percent_enabled >

    0 && rand(100) < percent_enabled end end
  61. class MyExperiment def enabled? return true if Rails.env.test? percent_enabled >

    0 && rand(100) < percent_enabled end end if Rails.env.test? MyExperiment.raise_on_mismatches = true end
  62. class Repository def pullable_by?(user) science "pullable_by" do |experiment| experiment.use {

    true } experiment.try { false } end end end
  63. class Repository def pullable_by?(user) science "pullable_by" do |experiment| experiment.use {

    true } experiment.try { false } end end end test "repository is pullable by a collaborator" do assert repository.pullable_by?(collaborator) # => raises Scientist::Experiment::MismatchError end
  64. class Repository < ActiveRecord::Base include Scientist def pullable_by?(user) science "repository.pullable_by"

    do |e| e.use { pullable_by_legacy?(user) } e.try { pullable_by_abilities?(user) } end end end
  65. class Repository < ActiveRecord::Base include Scientist def pullable_by?(user) science "repository.pullable_by"

    do |e| e.context :user => user.id, :repository => id e.use { pullable_by_legacy?(user) } e.try { pullable_by_abilities?(user) } end end end
  66. class Repository < ActiveRecord::Base include Scientist def pullable_by?(user) science "repository.pullable_by"

    do |e| e.context :user => user.id, :repository => id e.use { pullable_by_legacy?(user) } e.try { pullable_by_abilities?(user) } e.ignore { user.staff? } end end end
  67. class Repository < ActiveRecord::Base include Scientist def pullable_by?(user) science "repository.pullable_by"

    do |e| e.context :user => user.id, :repository => id e.use { pullable_by_legacy?(user) } e.try { pullable_by_abilities?(user) } e.ignore { user.staff? } e.ignore do |control, candidate| # unconfirmed users aren't handled yet: control && !candidate && user.unconfirmed? end end end end
  68. class User def teams science "user.teams" do |e| e.use {

    legacy_teams } # DatabaseTeams e.try { new_teams } # WebServiceTeams end end end
  69. class User def teams science "user.teams" do |e| e.use {

    legacy_teams } # DatabaseTeams e.try { new_teams } # WebServiceTeams e.compare do |control, candidate| control == candidate end end end end
  70. class User def teams science "user.teams" do |e| e.use {

    legacy_teams } # DatabaseTeams e.try { new_teams } # WebServiceTeams e.compare do |control, candidate| control.map(&:id) == candidate.map(&:id) end end end end
  71. . Does it work? 3 Is it correct? % Is

    it fast?
  72. Legacy system New system reads

  73. Legacy system New system reads writes

  74. class Team def add_member(user) legacy_members << user # old system

    end end
  75. class Team def add_member(user) legacy_members << user # old system

    grant user, :read # new system end end
  76. class Team def add_member(user) legacy_members << user # old system

    grant user, :read # new system end def grant(actor, permission) return unless Abilities.enabled? # ... end end
  77. Legacy system New system reads writes

  78. Legacy system New system reads writes validate and backfill

  79. The LESSONS 4

  80. Documentation

  81. Refactoring for understanding

  82. Refactoring for simplicity

  83. class Team has_many :memberships has_many :users, :through => :memberships end

    class Membership belongs_to :team belongs_to :user end class User has_many :memberships has_many :teams, :through => :memberships end
  84. # Team <--> Membership <--> User Membership.create! team.users << user

    user.teams << team membership.destroy team.users.delete user user.teams.delete team
  85. team.add_member(user) team.remove_member(user)

  86. Production data is the real test

  87. Math is important

  88. User Repository Team

  89. User Repository Team One team, with 10,000 users and 5,000

    repositories:
  90. User Repository Team read One team, with 10,000 users and

    5,000 repositories: 10,000 user → team records
  91. User Repository Team read write One team, with 10,000 users

    and 5,000 repositories: 10,000 user → team records 5,000 team → repository records
  92. User Repository Team read write write One team, with 10,000

    users and 5,000 repositories: 10,000 user → team records 5,000 team → repository records 50,000,000 user → repository records
  93. User Repository Team read write write One team, with 10,000

    users and 5,000 repositories: 10,000 user → team records 5,000 team → repository records 50,000,000 user → repository records
  94. Measurements give you confidence

  95. None
  96. Scientist for everything?

  97. class User def search_repositories(query) science "new-search-cluster" do e.use { search_old_cluster(query)

    } e.try { search_new_cluster(query) } e.compare { true } # don't care end end end
  98. Incremental development

  99. The RESULT %

  100. Abilities system is authoritative

  101. The code is cleaner

  102. + Correctness / Simplicity 0 Flexibility

  103. ?

  104. THANKS! ♥ & github/scientist