Refactoring Code with the Standard Library - NBPy 2018

John Reese
November 04, 2018

What if you could refactor your entire code base, safely and automatically? How much old code could you fix or replace if you didn't worry about updating every reference by hand? I'll show you how a concrete syntax tree (CST) can help you do just that using only the standard Python library.

Python includes a concrete syntax tree (CST) in the standard library, useful for mass refactoring code bases of all sizes. I'll walk through the differences between abstract and concrete syntax trees (AST and CST), why a CST is useful for refactoring, and how you can build basic refactoring tools on top of a CST to modify your entire code base quickly and safely. Lastly, I'll demonstrate what's possible with these tools, including upgrading code to new interfaces, or wholesale movement of code between modules. This talk assumes a basic understanding of how Python works, but will otherwise provide enough context to be useful to those who haven't previously worked with syntax trees.

Transcript

  1. None
  2. Refactoring Code With the Standard Library
    John Reese, Production Engineer, Facebook
    @n7cmdr
    github.com/jreese
  3. Python @ Facebook
    • Principal sponsor of the PSF
    • Third most popular in Facebook
    • Biggest language in Instagram
    • Frontend, backend, tools, and automation
    • Need safe, reliable refactoring tools
  4. Refactoring
    • Modify source code
    • Change names or interfaces
    • Update all references
  5. Why refactor?
    • Consistent style or formatting
    • Remove code smells
    • Enhance or replace an API
    • Support new use cases
    • Remove dead code
  6. None
  7. None
  8. Code mods
    • Usually automated refactoring
    • Atomic changes to the entire codebase
    • Update API and consumers simultaneously
    • Ensure no build/tests are broken
  9. Syntax Trees

  10. Syntax tree refactoring
    • Modify code as nested objects
    • Based on Python grammar
    • Semantic context for elements
    • “Guaranteed” valid syntax
  11. Abstract syntax tree
    • Tree structure, nodes and leaves
    • Semantic representation of code
    • Intended for execution
  12. None
  13. Call(func=Name(id='print'), args=[Str(s='Hello World')], keywords=[])
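
The abstract syntax tree on the slide above can be reproduced with the standard `ast` module. This is a minimal sketch, not the speaker's own demo: the sample source line and its comment are assumptions, and it assumes Python 3.8 or later, where string literals appear as `Constant` nodes with a `value` field rather than the `Str` node with an `s` field named on the slide.

```python
import ast

# Hypothetical sample input; the trailing comment is there to show what the AST drops.
source = "print('Hello World')  # greet the reader\n"
tree = ast.parse(source)

# The module body holds one expression statement wrapping a Call node.
call = tree.body[0].value
call_type = type(call).__name__    # "Call"
func_name = call.func.id           # "print"
first_arg = call.args[0].value     # "Hello World"
keywords = call.keywords           # []

# The comment never reaches the tree, so the commented and uncommented
# sources produce identical dumps.
same_tree = ast.dump(tree) == ast.dump(ast.parse("print('Hello World')\n"))
```

Because `same_tree` is true, code regenerated from an AST cannot restore the original comment or spacing, which is the gap a concrete syntax tree fills.
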
  14. None
  15. Concrete syntax tree
    • Tree structure, nodes and leaves
    • Decomposed units of syntax and grammar
    • Literal representation of on-disk code
    • Whitespace, formatting, comments, etc
  16. lib2to3

  17. lib2to3
    • Concrete syntax tree
    • Built for the 2to3 tool
    • Can parse all Python grammars
  18. Why lib2to3?
    • Part of the standard library
    • Always up to date with new syntax
    • Contains refactoring framework
  19. Tree structure
    • Leaf for each distinct token
    • Node for semantic groupings
    • Nodes contain one or more children
    • Generic objects, token/symbol type
    • Collapsed grammar
  20. None
  21. None
  22. power

  23. power atom_expr

  24. power atom_expr atom

  25. power atom_expr atom NAME

  26. power atom_expr trailer atom NAME

  27. power atom_expr trailer arglist atom NAME

  28. power atom_expr trailer arglist argument atom NAME

  29. power atom_expr trailer arglist argument atom NAME STRING

  30. power atom_expr trailer arglist argument atom NAME STRING

  31. power atom_expr trailer arglist argument atom NAME STRING
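
The tree built up across the preceding slides can be inspected directly with lib2to3. The sketch below is an illustration under assumptions, not code from the talk: `parse_source` and `describe` are hypothetical helper names, and the imports are deferred into the functions because lib2to3 is deprecated and no longer ships with Python 3.13 and later.

```python
def parse_source(source):
    """Parse source text into a lib2to3 concrete syntax tree."""
    # Imported lazily: lib2to3 is deprecated and absent from Python 3.13+.
    from lib2to3 import pygram, pytree
    from lib2to3.pgen2 import driver

    parser = driver.Driver(
        pygram.python_grammar_no_print_statement,  # treat print as a function
        convert=pytree.convert,
    )
    return parser.parse_string(source)


def describe(tree):
    """Return (token type name, text) pairs for every leaf, in source order."""
    from lib2to3.pgen2 import token

    return [(token.tok_name[leaf.type], leaf.value) for leaf in tree.leaves()]


def node_names(tree):
    """Return the grammar symbol name of every interior node, top down."""
    from lib2to3 import pytree

    return [
        pytree.type_repr(node.type)
        for node in tree.pre_order()
        if isinstance(node, pytree.Node)
    ]
```

For a source string ending in a newline, `str(parse_source(source))` gives back the input unchanged, comments and spacing included, because each leaf stores the whitespace and comments that preceded it in its `prefix`.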

  32. None
  33. None
  34. None
  35. None
  36. None
  37. Building Code Mods

  38. Fixers
    • Designed for 2to3 tools
    • Pattern match to find elements
    • In-place transforms to tree
  39. Pattern Matching
    • Search for grammar elements
    • Can be arbitrarily nested, combined
    • Capture specific nodes or leaves
    • Include literals or token types
  40. Transforms
    • Called for each match
    • Add, modify, remove, or replace elements
    • Not restricted to matched elements
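
The pattern-then-transform flow described on the last three slides can be sketched without writing a full fixer class. This is an illustrative assumption rather than the speaker's code: `rename_calls` is a hypothetical helper, the pattern is modeled on the style used by the stock 2to3 fixers, and the imports are deferred because lib2to3 is no longer available in Python 3.13 and later.

```python
def rename_calls(source, old_name, new_name):
    """Rename direct calls to old_name, leaving comments and spacing alone."""
    # Imported lazily: lib2to3 is deprecated and absent from Python 3.13+.
    from lib2to3 import pygram, pytree
    from lib2to3.fixer_util import Name
    from lib2to3.patcomp import PatternCompiler
    from lib2to3.pgen2 import driver

    parser = driver.Driver(
        pygram.python_grammar_no_print_statement, convert=pytree.convert
    )
    tree = parser.parse_string(source)

    # Pattern: a bare name followed directly by a call trailer. The
    # "name=" prefix captures that leaf into the results dict.
    pattern = PatternCompiler().compile_pattern(
        f"power< name='{old_name}' trailer< '(' [any] ')' > >"
    )

    # Collect matches first so the tree is not modified while walking it.
    captured = []
    for node in tree.pre_order():
        results = {}
        if pattern.match(node, results):
            captured.append(results["name"])

    # Transform: swap each captured leaf, carrying over its prefix
    # (the whitespace and comments that preceded it).
    for leaf in captured:
        leaf.replace(Name(new_name, prefix=leaf.prefix))

    return str(tree)
```

Given `"x = old_name(1, 2)  # keep me\n"`, the helper rewrites only the function name and keeps the comment. A method call such as `obj.old_name()` is left untouched, because the pattern requires the name to be the first child of the `power` node.
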
  41. None
  42. None
  43. None
  44. None
  45. None
  46. None
  47. None
  48. None
  49. None
  50. Refactoring Tool
    • Runs fixers on each file
    • Runs transforms at matching nodes
    • Collects final tree to diff/write
    • Defaults to loading 2to3 fixers
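
As a rough sketch of the refactoring tool in action, the snippet below runs one stock 2to3 fixer over a string. `run_stock_fixer` is a hypothetical helper name and the sample input is an assumption; `RefactoringTool` takes a list of fixer module names, and `lib2to3.refactor.get_fixers_from_package("lib2to3.fixes")` would list the full 2to3 set. The import is deferred because lib2to3 is no longer available in Python 3.13 and later.

```python
def run_stock_fixer(source, fixer="lib2to3.fixes.fix_print"):
    """Apply a single 2to3 fixer to source text and return the new text."""
    # Imported lazily: lib2to3 is deprecated and absent from Python 3.13+.
    from lib2to3.refactor import RefactoringTool

    # The tool loads each named fixer module and compiles its pattern.
    tool = RefactoringTool([fixer])

    # refactor_string parses the text, runs every fixer's transform at its
    # matching nodes, and returns the modified tree.
    tree = tool.refactor_string(source, "<example>")
    return str(tree)
```

With the default fixer, a Python 2 statement such as `print 'Hello World'` comes back as the call `print('Hello World')`.
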
  51. None
  52. None
  53. github.com/jreese/pycon

  54. Safe refactoring for modern Python

  55. Bowler
    • Code mod framework
    • Built on lib2to3 primitives
    • Fluent API to generate fixers
    • Optimized for large codebases
    • MIT Licensed
  56. Why Bowler?
    • Automatic support for new Python releases
    • Encourages reuse of components
    • Productionizes common refactoring
    • Useful as a tool and a library
  57. Query pipeline
    • Selectors build a search pattern
    • Optionally filter elements
    • Modify matched elements
    • Compose multiple transforms
    • Generate diffs or interactive results
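
A query pipeline of this shape might look like the sketch below. It is an assumption-laden illustration, not the talk's own example: `build_rename_query` is a hypothetical helper, Bowler is a third-party package that must be installed separately, and the method names (`Query`, `select_function`, `rename`) follow Bowler's published documentation as best recalled, so they should be checked against the installed version.

```python
def build_rename_query(path, old_name, new_name):
    """Build a Bowler query that renames a function and its references."""
    # Bowler is a third-party package (pip install bowler); imported lazily
    # so this sketch can be defined even where Bowler is not installed.
    from bowler import Query

    return (
        Query(path)                    # files or directories to search
        .select_function(old_name)     # selector: builds the search pattern
        .rename(new_name)              # modifier: applied to each match
    )
```

The returned query does nothing until it is executed. In Bowler's documented API, calling `.diff()` on it shows the proposed changes, and `.write()` applies them to the files.
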
  58. None
  59. None
  60. None
  61. None
  62. None
  63. None
  64. None
  65. Query Chaining

  66. None
  67. None
  68. None
  69. None
  70. None
  71. None
  72. None
  73. None
  74. None
  75. None
  76. None
  77. None
  78. None
  79. None
  80. None
  81. None
  82. None
  83. None
  84. None
  85. None
  86. None
  87. Bowler
    • Facebook Incubator project
    • Fluent API is fluid
    • Incomplete set of selectors, filters, transforms
    • Needs more unit testing
  88. Roadmap
    • More use cases
    • Linter features
    • Integrations
    • More testing
    • More contributors!
  89. https://pybowler.io

  90. John Reese, Production Engineer, Facebook
    @n7cmdr
    github.com/jreese
    https://pybowler.io