Refactoring Code with the Standard Library - NBPy 2018

John Reese
November 04, 2018

What if you could refactor your entire code base, safely and automatically? How much old code could you fix or replace if you didn't have to worry about updating every reference by hand? I'll show you how a concrete syntax tree (CST) can help you do just that using only the Python standard library.

Python includes a concrete syntax tree (CST) in the standard library, useful for mass refactoring code bases of all sizes. I'll walk through the differences between abstract and concrete syntax trees (AST and CST), why a CST is useful for refactoring, and how you can build basic refactoring tools on top of a CST to modify your entire code base quickly and safely. Lastly, I'll demonstrate what's possible with these tools, including upgrading code to new interfaces, or wholesale movement of code between modules. This talk assumes a basic understanding of how Python works, but will otherwise provide enough context to be useful to those who haven't previously worked with syntax trees.

Transcript

  1. Python @ Facebook
    • Principal sponsor of the PSF
    • Third most popular in Facebook
    • Biggest language in Instagram
    • Frontend, backend, tools, and automation
    • Need safe, reliable refactoring tools
  2. Why refactor?
    • Consistent style or formatting
    • Remove code smells
    • Enhance or replace an API
    • Support new use cases
    • Remove dead code
  3. Code mods
    • Usually automated refactoring
    • Atomic changes to the entire codebase
    • Update API and consumers simultaneously
    • Ensure no build/tests are broken
  4. Syntax tree refactoring
    • Modify code as nested objects
    • Based on Python grammar
    • Semantic context for elements
    • “Guaranteed” valid syntax
  5. Abstract syntax tree
    • Tree structure, nodes and leaves
    • Semantic representation of code
    • Intended for execution
  6. Concrete syntax tree
    • Tree structure, nodes and leaves
    • Decomposed units of syntax and grammar
    • Literal representation of on-disk code
    • Whitespace, formatting, comments, etc.
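The contrast between the two tree types can be sketched from the abstract side using only the standard `ast` module. This is a minimal illustration, not material from the talk: the names `compact`, `spaced`, and `semantic_shape` are made up for the example. It shows that two differently formatted spellings of the same statement collapse to the same abstract tree, and that the comment is gone.

```python
import ast

# Two spellings of the same assignment: one compact, one with extra
# whitespace and a trailing comment.
compact = "x=1\n"
spaced = "x  =  1  # the answer\n"


def semantic_shape(source: str) -> str:
    """Return a textual dump of the abstract syntax tree for `source`."""
    return ast.dump(ast.parse(source))


# The AST keeps only what matters for execution, so both spellings match.
same_ast = semantic_shape(compact) == semantic_shape(spaced)

# The comment text is not stored anywhere in the abstract tree.
comment_survives = "the answer" in semantic_shape(spaced)

print(same_ast, comment_survives)  # prints: True False
```

Because the abstract tree cannot reproduce the original file text, writing it back out would lose formatting and comments, which is the gap a concrete syntax tree fills.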
  7. lib2to3
    • Concrete syntax tree
    • Built for the 2to3 tool
    • Can parse all Python grammars
  8. Why lib2to3?
    • Part of the standard library
    • Always up to date with new syntax
    • Contains refactoring framework
  9. Tree structure
    • Leaf for each distinct token
    • Node for semantic groupings
    • Nodes contain one or more children
    • Generic objects, token/symbol type
    • Collapsed grammar
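The node and leaf structure described above can be inspected directly. The following is a rough sketch, assuming a Python version that still ships `lib2to3` (it was deprecated in Python 3.9 and removed in 3.13, so the import is guarded). The helper names `parse` and `describe` and the sample `SOURCE` are illustrative, not from the talk.

```python
try:
    from lib2to3 import pygram, pytree
    from lib2to3.pgen2 import driver, token
    HAVE_LIB2TO3 = True
except ImportError:  # lib2to3 is absent from Python 3.13 and later
    HAVE_LIB2TO3 = False

SOURCE = "x = 1  # the answer\n"


def parse(source):
    """Parse source text into a lib2to3 concrete syntax tree."""
    d = driver.Driver(
        pygram.python_grammar_no_print_statement, convert=pytree.convert
    )
    return d.parse_string(source)


def describe(tree):
    """Return (kind, type name, text) rows for every node and leaf, top down."""
    rows = []
    for item in tree.pre_order():
        if isinstance(item, pytree.Leaf):
            # Leaves are tokens; `prefix` holds the whitespace and
            # comments that appeared before the token in the file.
            rows.append(
                ("leaf", token.tok_name[item.type], item.prefix + item.value)
            )
        else:
            # Nodes are grammar symbols that group one or more children.
            rows.append(("node", pytree.type_repr(item.type), str(item)))
    return rows


if HAVE_LIB2TO3:
    tree = parse(SOURCE)
    for row in describe(tree):
        print(row)
    # Converting the tree back to text reproduces the input exactly.
    print(str(tree) == SOURCE)
```

Every object is a generic `Node` or `Leaf` distinguished only by its numeric `type`, and the comment is carried in a leaf's `prefix` rather than as a separate element.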
  10. Fixers
    • Designed for 2to3 tools
    • Pattern match to find elements
    • In-place transforms to tree
  11. Pattern Matching
    • Search for grammar elements
    • Can be arbitrarily nested, combined
    • Capture specific nodes or leaves
    • Include literals or token types
  12. Transforms
    • Called for each match
    • Add, modify, remove, or replace elements
    • Not restricted to matched elements
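Pattern matching and an in-place transform can be combined in a few lines without writing a full fixer class. This is a sketch under assumptions: `old_name` and `new_name` are hypothetical function names, `rename_calls` is an illustrative helper, and the import is guarded because `lib2to3` was removed in Python 3.13.

```python
try:
    from lib2to3 import patcomp, pygram, pytree
    from lib2to3.pgen2 import driver
    HAVE_LIB2TO3 = True
except ImportError:  # lib2to3 is absent from Python 3.13 and later
    HAVE_LIB2TO3 = False

# Pattern: a call to the (hypothetical) function `old_name`.
# `name=` captures the function-name leaf; `any*` accepts any arguments.
PATTERN = "power< name='old_name' trailer< '(' any* ')' > >"


def rename_calls(source, new_name):
    """Rename every matched call and return the rewritten source text."""
    d = driver.Driver(
        pygram.python_grammar_no_print_statement, convert=pytree.convert
    )
    tree = d.parse_string(source)
    pattern = patcomp.compile_pattern(PATTERN)
    # Snapshot the traversal first so edits cannot disturb iteration.
    for node in list(tree.pre_order()):
        results = {}
        if pattern.match(node, results):
            leaf = results["name"]   # the captured NAME leaf
            leaf.value = new_name    # modify the tree in place
            leaf.changed()           # mark the leaf and its parents as changed
    return str(tree)


if HAVE_LIB2TO3:
    before = "total = old_name(1,  2)  # keep me\n"
    print(rename_calls(before, "new_name"), end="")
```

Only the captured leaf is touched, so the odd spacing between arguments and the trailing comment come through unchanged.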
  13. Refactoring Tool
    • Runs fixers on each file
    • Runs transforms at matching nodes
    • Collects final tree to diff/write
    • Defaults to loading 2to3 fixers
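As a small illustration of the refactoring tool driving a fixer, the sketch below runs one of the stock 2to3 fixers over a string instead of a file. The helper name `run_fixers` and the sample input are assumptions made for the example, and the import is guarded because `lib2to3` was removed in Python 3.13.

```python
try:
    from lib2to3.refactor import RefactoringTool
    HAVE_LIB2TO3 = True
except ImportError:  # lib2to3 is absent from Python 3.13 and later
    HAVE_LIB2TO3 = False


def run_fixers(source, fixer_names):
    """Apply the named fixer modules to `source` and return the new text."""
    # Fixers are named by the importable module that defines them.
    tool = RefactoringTool(fixer_names)
    # refactor_string parses the text, runs each fixer's transform at the
    # nodes its pattern matches, and returns the resulting tree.
    tree = tool.refactor_string(source, "<example>")
    return str(tree)


if HAVE_LIB2TO3:
    # The stock has_key fixer rewrites dict.has_key(k) into an `in` test.
    result = run_fixers("found = d.has_key(k)\n", ["lib2to3.fixes.fix_has_key"])
    print(result, end="")
```

The same tool object also offers file-oriented entry points that write results back or emit diffs; the string form is used here only to keep the example self-contained.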
  14. Bowler
    • Code mod framework
    • Built on lib2to3 primitives
    • Fluent API to generate fixers
    • Optimized for large codebases
    • MIT Licensed
  15. Why Bowler?
    • Automatic support for new Python releases
    • Encourages reuse of components
    • Productionizes common refactoring
    • Useful as a tool and a library
  16. Query pipeline
    • Selectors build a search pattern
    • Optionally filter elements
    • Modify matched elements
    • Compose multiple transforms
    • Generate diffs or interactive results
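The select, filter, modify, diff flow can be approximated with a toy class on top of lib2to3. To be clear, `ToyQuery` below is a hypothetical stand-in written for this example and is not Bowler's API; the `fetch` and `fetch_v2` names are likewise invented. The import is guarded because `lib2to3` was removed in Python 3.13.

```python
import difflib

try:
    from lib2to3 import patcomp, pygram, pytree
    from lib2to3.pgen2 import driver
    HAVE_LIB2TO3 = True
except ImportError:  # lib2to3 is absent from Python 3.13 and later
    HAVE_LIB2TO3 = False


class ToyQuery:
    """Hypothetical fluent pipeline; NOT Bowler's real interface."""

    def __init__(self, source):
        self.source = source
        self.steps = []  # each step: (compiled pattern, filters, transforms)

    def select(self, pattern):
        """Start a new step that searches for a lib2to3 pattern."""
        self.steps.append((patcomp.compile_pattern(pattern), [], []))
        return self

    def filter(self, predicate):
        """Keep only matches for which predicate(node, results) is true."""
        self.steps[-1][1].append(predicate)
        return self

    def modify(self, transform):
        """Run transform(node, results) on every surviving match."""
        self.steps[-1][2].append(transform)
        return self

    def diff(self):
        """Apply all steps in order and return a unified diff as text."""
        d = driver.Driver(
            pygram.python_grammar_no_print_statement, convert=pytree.convert
        )
        tree = d.parse_string(self.source)
        for pattern, filters, transforms in self.steps:
            for node in list(tree.pre_order()):
                results = {}
                if pattern.match(node, results) and all(
                    keep(node, results) for keep in filters
                ):
                    for transform in transforms:
                        transform(node, results)
        return "".join(
            difflib.unified_diff(
                self.source.splitlines(True),
                str(tree).splitlines(True),
                "before",
                "after",
            )
        )


CALL = "power< name='fetch' trailer< '(' any* ')' > >"


def has_timeout(node, results):
    # Crude text check, good enough for a sketch.
    return "timeout" in str(node)


def to_v2(node, results):
    results["name"].value = "fetch_v2"


if HAVE_LIB2TO3:
    code = "a = fetch(url)\nb = fetch(url, timeout=5)\n"
    print(ToyQuery(code).select(CALL).filter(has_timeout).modify(to_v2).diff())
```

Each `select` opens a new step, so several selector and transform pairs can be chained on one query before a single diff is produced.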
  17. Bowler
    • Facebook Incubator project
    • Fluent API is fluid
    • Incomplete set of selectors, filters, transforms
    • Needs more unit testing
  18. Roadmap
    • More use cases
    • Linter features
    • Integrations
    • More testing
    • More contributors!