A Prettier Printer

A Prettier Printer A Different Take on Formatting Revolution Conf
2017 @jlongster

@jlongster github.com/jlongster James Long http://jlongster.com I’m James Long, blah blah

* Last December I wrote a tool called Prettier, which is a syntax formatting tool for JavaScript. It's been widely adopted and used in major projects like React and Jest, and we have over 10,000 stars on GitHub in about 4 months. It’s been tremendous to watch it grow. I’m going to talk about why I built it and how it can help you.
* A question I like to ask: "what are we wasting our time on?"
* We waste so much time when we build software. Some of it is social problems and team dynamics. Some of it is technical problems, because building software is difficult.

• Core architecture
• Testing
• Performance
• Documentation
• Design
• Tooling (CI/reviews/etc)
• Deployment
• Monitoring/reporting/logging
• Security
• Accessibility
• User experience
• And more!
* There are so many things to get right…
* You have to prioritize, and do it wisely. If you spend time optimizing a bunch of code and making it more accessible, only to find that you need to heavily refactor your code later due to new needs, you've completely wasted your time. The same goes for designing something only to find out you don't need it. You'll never get rid of waste, but good scheduling will help reduce it.
* All this work is "large scale", and I want to talk about "small scale" work…

Small Scale
* This focuses on the individual, and is all the stuff that you do every few minutes. Typing code, getting feedback from your editor, refactoring a function into multiple smaller functions, etc. Optimizing this kind of work is about streamlining your environment: making it as easy as possible to get the ideas from your head into working code.
* You end up wasting time here with small distractions. Anything that keeps bringing you out of focus makes you slower. It makes a huge difference whether your build step takes 1 second or 10 seconds, whether your tools error frequently, or whether your APIs are bad and you have to look up documentation every time you call one of them. The problem isn't the actual time wasted, but the context switching and distractions which prevent you from being able to focus.

* A good example of removing distractions is live coding, also known as hot module replacement. This establishes a direct connection from your brain to the screen, allowing you to make any changes and instantly see them without having to reload and fiddle with the UI. It only saves a few seconds, but the effect is quite transformative when you're experimenting with a lot of changes.
* I believe there is another place where we are constantly distracted when writing code, and it's slowing us down: syntax. It’s slow to write code that involves complex syntax and to think about how to format it, especially when continually refactoring code. It distracts you and wastes time.

S-expressions
paredit / parinfer
One answer is to simplify the syntax. Languages like Lisp and Scheme do this by using s-expressions, where the entire language is made up of parentheses and brackets. This allows powerful tools like paredit and parinfer, which perform structural editing of code. You work with expressions, not lines of code. (live demo, open up clojure.cljs, search for update-nl-state)
But what are we to do in languages like JavaScript? How can we narrow this gap between thinking of code and writing it, when complex syntax makes it slow to write and refactor code?

JavaScript has tools like eslint, which is great for catching semantic errors and pointing out pitfalls in your code, but it's not good for formatting because it's very piecemeal and only forces silly stuff like whitespace in the right place. We need a more holistic approach for formatting.
Last fall I started writing ReasonML code, which already has a tool out-of-the-box for formatting it. As a beginner it was enlightening to write some code and try to format it: it would tell me what was invalid, I'd fix it, and then it would automatically format my code in the standard ReasonML style. This opened my eyes, and I never wanted to manually format code again. (have gif)

I started researching how to build it for JavaScript. I wanted a tool that would cast aside any of the original formatting and completely reformat a file. This would not only let you write whatever kind of ugly code you want, but also force consistency across teams.

JavaScript -> (compile) -> AST -> (print) -> Formatted JavaScript
A straightforward way to do this is to compile JavaScript to an AST and pretty-print the AST. There wasn't any tool that already did this well enough. Recast was the best at the time, but it has different goals like respecting the original formatting, so I forked it and had a good place to start. I built it and it works! Let me show you an example. (live demo)
* open ReactChildFiber.js and search for useFiber, add conditional and function call with object as last param
* open budget.js and move JSX around

It turns out what we are doing is called "pretty printing" and there's already a large amount of research on this from the last few decades. In my research I found Wadler's paper, "A Prettier Printer", which describes an algorithm for doing this. This is the algorithm that prettier still uses today.
You might be wondering: why is this so hard? Can't you just apply a few fixed rules when printing each AST node type? No. Let me show you an example.

let config = { flag: false };
Say you have an object.

let config = { flag: false, makeErrorMessage(err) { return err.toString() }};
Now we add a new property. This is getting long and we already have to wrap the code on the slide.

}}; Our style enforces newlines at the beginning and end of function bodies, so it would do this, which fixes that problem. But that's clearly not an ideal format…

init(makeConfig({ flag: false, makeErrorMessage(err) { return err.toString() }}));
…especially if it's wrapped in something else, like function calls.

} }; What we need to do is "break" the object up like this

init(
  makeConfig({
    flag: false,
    makeErrorMessage(err) {
      return err.toString();
    }
  })
);
And for the function calls, we need to break them too. We need to break everything up to the top-level expression.

let config = { flag: false, count: 5 };
init(makeConfig({ flag: false, count: 5 }));
But all of this is done only because the `makeErrorMessage` function was added. Adding any other property that doesn't contain newlines shouldn't break anything up. There are *all sorts* of cases like this where we want to do a specific thing based on how syntax is composed.
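To make that rule concrete, here is a minimal sketch, not prettier's actual code. `printObject` and `indentLines` are hypothetical helpers I made up for illustration, and the properties are assumed to be already-printed strings. The object stays on one line unless a property contains a newline or the flat form exceeds the width.

```javascript
// Indent every line of a (possibly multi-line) string by two spaces.
function indentLines(text) {
  return text
    .split("\n")
    .map((line) => "  " + line)
    .join("\n");
}

// Hypothetical helper: `props` are already-printed property strings.
function printObject(props, printWidth) {
  const flat = "{ " + props.join(", ") + " }";
  const hasMultiLineProp = props.some((prop) => prop.includes("\n"));
  if (!hasMultiLineProp && flat.length <= printWidth) {
    return flat; // nothing forces a break, and it fits
  }
  // One property per line; multi-line properties keep their inner layout.
  return "{\n" + props.map((prop) => indentLines(prop)).join(",\n") + "\n}";
}

const method = "makeErrorMessage(err) {\n  return err.toString();\n}";

console.log(printObject(["flag: false", "count: 5"], 80));
console.log(printObject(["flag: false", method], 80));
```

The first call stays on one line; the second breaks the whole object only because the method body contains newlines.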

gofmt
`gofmt` is a similar tool for Go. It's worth noting that it's a much simpler printer that *does* mostly implement fixed rules for each type of syntax. The reason they can get away with it is because Go's syntax isn't as complex as JavaScript's, and because they've had that tool since Go came out, so they were able to enforce style from the beginning. JavaScript's syntax is a lot more complex, and how we use it is a lot more diverse. We do a lot of weird things in JS. For this tool to be successful, we need to accommodate all of this so people will actually use it.
There's something important that I haven't talked about yet, and it's a key part of all of this: the maximum line length. This is a really important factor if you want this to work. Take `gofmt`, for example: it allows this style.

func foo(arg1 int, arg2 int, arg3 int, arg4 int, arg5 int, arg6 int, arg7 int, arg8 int) (res int) {
	return 1
}
That first line is 100 characters long. gofmt doesn't ever enforce a maximum line length. It's fine for Go because the syntax and general patterns already encourage spreading out code vertically, and the syntax is simple enough that if something really gets too long, it's not hard to break it up in a consistent way. But JavaScript is more complex and there are a lot of diverse styles already, and it's too easy to write inconsistent code because we break it up differently in different places.

const variable = someCondition && otherCondition ? callFunction1(arg1, arg2) : callFunction2(arg1, arg2);
For example, take this piece of code.

const variable = someCondition && otherCondition ? callFunction1(arg1, arg2) : callFunction2(arg1, arg2);
const variable = someCondition && otherCondition ? callFunction1(arg1, arg2) : callFunction2(arg1, arg2);
const variable = someCondition && otherCondition ? callFunction1(arg1, arg2) : callFunction2(arg1, arg2);
Every time I need to break up code I need to decide what the format should be. It's usually a unique composition of syntax, and it takes a good bit of energy to think about the best way to break that specific piece of code up. This brings us out of focus. It wastes time. It's also very likely that your coworkers would break it up differently, causing style inconsistency.
The maximum line length is a critical piece of information for formatting your code. That's what really makes prettier different, and capable of taking on the problem of JavaScript formatting. Without it we can't actually enforce a consistent style and you would still be required to intervene all the time. We want you to be able to trust prettier completely and not think about formatting at all.

f(code: string, printWidth: int): string
Wadler's paper describes an algorithm that takes a string of code and a maximum line length, and returns a formatted string of code. That's all we need. `refmt`, the formatter for ReasonML, works this way too. It works by trying to fit as much code as it can within the width parameter, and breaking expressions apart if it can't fit. I'll show you an example.
DEMO: Here is the entire code of React, and right now the print width is set to 80. It's breaking up the code so that it all fits within 80. Here I have a special build of prettier that aggressively inlines code. This build never forces a newline, meaning it will fit the entire file into a single line of code if it can. The normal build of prettier forces hard newlines in many places and wouldn't do this. (live demo: show React file, bump up print width to a high number, then to a really small number)

JavaScript -> (compile) -> AST -> (print) -> Formatted JavaScript
Let's look at how this works. Earlier I mentioned 2 stages: parse to an AST and then pretty-print it. There are actually 3 stages…

JavaScript -> (compile) -> AST -> (compile) -> Document -> (print) -> Formatted JavaScript
First we parse the JavaScript using any of the currently available parsers. Then we take this AST and generate a "document" representation of it, which is a very simple language that describes groups of strings. Then we generate a string from the document representation given a line width.
The reason there's a sort of "intermediate" representation is that we need to measure the strings in order to see if they will fit on the same line. This has to be a second pass where all the strings are available; otherwise it'd be a lot more complicated, as we'd have to measure *after* the real code is generated and do a lot of reprinting. It's actually a lot more performant to emit a basic document representation that we can measure really fast and make a lot of decisions there.
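The three stages can be sketched in a few lines. This is a toy under stated assumptions, not prettier's code: the AST is hand-written instead of coming from a real parser, it only knows about a single call expression, and `toDoc`, `flatWidth` and `printDoc` are names invented for this example. The point is that measuring happens on the cheap intermediate document, before any output string is built.

```javascript
// Stage 1 (assumed to be done by a real parser): a hand-written toy AST.
const ast = { type: "Call", callee: "callFunction", args: ["arg1", "arg2"] };

// Stage 2: AST -> document. The toy document is just the strings to combine.
function toDoc(node) {
  return { callee: node.callee, parts: node.args };
}

// Measure the document: its width if everything were printed on one line.
function flatWidth(doc) {
  const separators = 2 * Math.max(doc.parts.length - 1, 0); // each ", "
  const argsWidth = doc.parts.reduce((sum, part) => sum + part.length, 0);
  return doc.callee.length + 1 + argsWidth + separators + 2; // "(" and ");"
}

// Stage 3: document -> string, deciding once based on the measurement.
function printDoc(doc, printWidth) {
  if (flatWidth(doc) <= printWidth) {
    return doc.callee + "(" + doc.parts.join(", ") + ");";
  }
  const lines = doc.parts.map((part) => "  " + part).join(",\n");
  return doc.callee + "(\n" + lines + "\n);";
}

console.log(printDoc(toDoc(ast), 80));
console.log(printDoc(toDoc(ast), 20));
```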

group
One of the fundamental document types is `group`, which tells the printer that it can break everything inside of it across lines. If you think about it, when the printer measures an entire program and it doesn't fit on one line, how does it know what to break? Without `group` it would either print the whole program on one line or, when it doesn't fit, break the entire program up. `group` limits the breaking to just that section.

callFunction(arg1, arg2);
reallyLongFunction(longArgument1, longArgument2, longArgument3, longArgument4);
Here are two expressions, both function calls, but the second one is a lot longer.

group(concat([
  "callFunction(", line, "arg1,", line, "arg2", line, ")"
]))
group(concat([
  "reallyLongFunction(", line, "longArgument1,", line, "longArgument2,", line, "longArgument3", line, ")"
]))
Here is the document representation of them (somewhat simplified). `group`, `concat`, and `line` are all “commands” that instruct the printer. Here we have 2 groups that simply combine a bunch of strings, with newlines in between them if they don’t fit. The printer will measure each group separately and break any of them that doesn’t fit.

callFunction(arg1, arg2);
reallyLongFunction(
  longArgument1,
  longArgument2,
  longArgument3,
  longArgument4
);
So the second group, since it’s a lot longer, is going to break before the first group does. Groups give the printer an atomic unit to measure and break if it doesn’t fit.
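Here is a small sketch of that measuring step. The `group`, `concat` and `line` builders below are toy stand-ins written for this example, not prettier's real implementations, and I'm assuming a flat `line` prints as a single space. Each group is measured on its own, so at a width of 40 only the long call needs to break.

```javascript
// Toy document builders (assumptions for illustration only).
const line = { type: "line" };
const concat = (parts) => ({ type: "concat", parts });
const group = (contents) => ({ type: "group", contents });

// Width of a document if it were printed entirely on one line.
function flatWidth(doc) {
  if (typeof doc === "string") return doc.length;
  if (doc.type === "line") return 1; // a flat `line` is one space
  if (doc.type === "concat") {
    return doc.parts.reduce((sum, part) => sum + flatWidth(part), 0);
  }
  return flatWidth(doc.contents); // group
}

const shortCall = group(concat(["callFunction(", "arg1,", line, "arg2", ");"]));
const longCall = group(
  concat([
    "reallyLongFunction(",
    "longArgument1,", line,
    "longArgument2,", line,
    "longArgument3,", line,
    "longArgument4",
    ");",
  ])
);

const printWidth = 40;
// Each group is an atomic unit: measure it, then decide whether to break it.
const needsBreak = [shortCall, longCall].map((doc) => flatWidth(doc) > printWidth);
console.log(needsBreak);
```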

outerFunction(reallyLongFunction(arg1, arg2, arg3, arg4));
Groups can be nested. Here we are passing the result of `reallyLongFunction` to another function, `outerFunction`.

group(concat([
  "outerFunction(", line,
  group(concat([
    "reallyLongFunction(", line, "arg1,", line, "arg2,", line, "arg3,", line, "arg4", line, ")"
  ])),
  line, ")"
]))
The document representation looks like this. There are two groups, and one is inside the other. Prettier will always break the outer group first. This makes sense if you think about it: the outer group always encloses the inner group, so it's always going to hit the max line length first.

outerFunction(
  reallyLongFunction(arg1, arg2, arg3, arg4)
);
So the above document could be printed like this, where the outer group was broken (it moved the call to `reallyLongFunction` down onto its own line) but that call was not broken up.

outerFunction(
  reallyLongFunction(
    arg1,
    arg2,
    arg3,
    arg4
  )
);
If both were broken it'd look like this.
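To see the outer-first behaviour end to end, here is a minimal printer sketch in the spirit of Wadler's algorithm. It is a simplified assumption of how this can work, not prettier's source: the builders are toys, `softline` (nothing when flat) and `indent` are added so the output matches the slides, and indentation is fixed at two spaces.

```javascript
// Toy document builders (assumptions for illustration only).
const line = { type: "line" }; // " " when flat, newline when broken
const softline = { type: "softline" }; // "" when flat, newline when broken
const concat = (parts) => ({ type: "concat", parts });
const indent = (contents) => ({ type: "indent", contents });
const group = (contents) => ({ type: "group", contents });

// Does `next`, plus whatever follows it up to the next real newline, fit?
function fits(next, rest, width) {
  let restIndex = rest.length;
  const cmds = [next];
  while (width >= 0) {
    if (cmds.length === 0) {
      if (restIndex === 0) return true;
      cmds.push(rest[--restIndex]);
      continue;
    }
    const { mode, doc } = cmds.pop();
    if (typeof doc === "string") {
      width -= doc.length;
    } else if (doc.type === "concat") {
      for (let i = doc.parts.length - 1; i >= 0; i--) {
        cmds.push({ mode, doc: doc.parts[i] });
      }
    } else if (doc.type === "indent" || doc.type === "group") {
      cmds.push({ mode, doc: doc.contents });
    } else if (mode === "break") {
      return true; // a real newline follows, so everything before it fit
    } else if (doc.type === "line") {
      width -= 1;
    }
  }
  return false;
}

function printDoc(root, printWidth) {
  const out = [];
  let pos = 0;
  const cmds = [{ ind: 0, mode: "break", doc: root }];
  while (cmds.length > 0) {
    const { ind, mode, doc } = cmds.pop();
    if (typeof doc === "string") {
      out.push(doc);
      pos += doc.length;
    } else if (doc.type === "concat") {
      for (let i = doc.parts.length - 1; i >= 0; i--) {
        cmds.push({ ind, mode, doc: doc.parts[i] });
      }
    } else if (doc.type === "indent") {
      cmds.push({ ind: ind + 2, mode, doc: doc.contents });
    } else if (doc.type === "group") {
      // Outer groups are reached first, so they are the first to break.
      const flat = { ind, mode: "flat", doc: doc.contents };
      const useFlat = mode === "flat" || fits(flat, cmds, printWidth - pos);
      cmds.push(useFlat ? flat : { ind, mode: "break", doc: doc.contents });
    } else if (mode === "flat") {
      if (doc.type === "line") {
        out.push(" ");
        pos += 1;
      }
    } else {
      out.push("\n" + " ".repeat(ind));
      pos = ind;
    }
  }
  return out.join("");
}

const inner = group(concat([
  "reallyLongFunction(",
  indent(concat([softline, "arg1,", line, "arg2,", line, "arg3,", line, "arg4"])),
  softline,
  ")",
]));
const nested = group(concat([
  "outerFunction(",
  indent(concat([softline, inner])),
  softline,
  ");",
]));

for (const width of [80, 50, 30]) {
  console.log(printDoc(nested, width));
}
```

At 80 columns nothing breaks, at 50 only the outer group breaks, and at 30 both do, which lines up with the three layouts shown on the slides.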

When Reality Hits
You've seen the basic rules by which prettier operates, but making a real-world JavaScript formatter is actually a lot harder. There are so many diverse patterns of JS code that naively applying a formatter to real code would make a lot of stuff ugly. So we have all kinds of hand-tuned formats embedded in the prettier logic. Here are some examples.

Break last argument first
foo(arg1, arg2, arg3);
foo(
  arg1,
  arg2,
  arg3
);

Break last argument first
foo(1, 2, 3, { pleaseJustWork: true, andBeFast: true });
foo(1, 2, 3, { pleaseJustWork: true, andBeFast: true });
foo(1, 2, 3, val => { return val * 2; });
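One way to picture this heuristic is the sketch below. It is my own simplification, not the logic prettier actually ships: `printCallHuggingLastObject` is a hypothetical helper, and it only handles the case where the last argument is an object literal given as already-printed properties. If the call doesn't fit, the leading arguments stay on the first line and only the trailing object is expanded.

```javascript
// Hypothetical helper: leading arguments are plain strings, and the last
// argument is an object whose properties are already-printed strings.
function printCallHuggingLastObject(callee, leadingArgs, objectProps, printWidth) {
  const flatObject = "{ " + objectProps.join(", ") + " }";
  const flat = callee + "(" + [...leadingArgs, flatObject].join(", ") + ");";
  if (flat.length <= printWidth) {
    return flat;
  }
  // Keep `callee(arg, arg, {` together and break only inside the object.
  const head = callee + "(" + [...leadingArgs, "{"].join(", ");
  const body = objectProps.map((prop) => "  " + prop).join(",\n");
  return head + "\n" + body + "\n});";
}

const props = ["pleaseJustWork: true", "andBeFast: true"];
console.log(printCallHuggingLastObject("foo", ["1", "2", "3"], props, 80));
console.log(printCallHuggingLastObject("foo", ["1", "2", "3"], props, 40));
```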

Member expressions & chains
object.propertyThatIsTooLong()
object
  .propertyThatIsTooLong()
  .chained();
object
  .propertyThatIsTooLong()
  .more.prop.accesses.chained();

Member expressions & chains
promise.then(x => { return x + 1 });
promise
  .then(x => {
    return x + 1;
  })
  .catch(err => {
    logError(err);
  });

Member expressions & chains
Object.keys(obj).map(name => {
  return name.toLowerCase();
});
Object.keys(obj)
  .map(name => {
    return name.toLowerCase();
  })
  .filter(name => name !== "james");
object
  .keys(obj)
  .map(name => {
    return name.toLowerCase();
  })
  .filter(name => name !== "james");

Comments
if(x) { foo(); } else { bar(); }

Comments
// In case this happens, call foo
if(x) {
  foo();
}
// Otherwise call bar!
else {
  bar();
}
// In case this happens, call foo
if (x) {
  foo();
} else {
  // Otherwise call bar!
  bar();
}

Comments
// In case this happens, call foo
if(x) {
  foo();
}
// Otherwise call bar!
else {
  bar();
}

And lots more…
*Holy crap* there are so many other cases like this.

Teachability
x && y || z
(x && y) || z;
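The added parentheses don't change behaviour; they only make the precedence visible, since `&&` binds tighter than `||` in JavaScript. A quick sketch to check that over every boolean input (the helper name `agreeOnAllBooleans` is made up for this example):

```javascript
const bare = (x, y, z) => x && y || z;
const explicit = (x, y, z) => (x && y) || z;
const otherGrouping = (x, y, z) => x && (y || z);

// Compare two three-argument functions over all 8 boolean combinations.
function agreeOnAllBooleans(f, g) {
  const values = [true, false];
  return values.every((x) =>
    values.every((y) => values.every((z) => f(x, y, z) === g(x, y, z)))
  );
}

console.log(agreeOnAllBooleans(bare, explicit)); // true
console.log(agreeOnAllBooleans(bare, otherGrouping)); // false
```

The second result is `false` because, for example, with `x = false` and `z = true` the bare expression is `true` while `x && (y || z)` is `false`.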

Takeaways
A few takeaways to remember from this talk.

Takeaways
• Consistency
• Teachability
• Freedom
The first and foremost win with this tool is consistency. Not necessarily consistency across the entire JS community, but at least consistency across a company or a team.
The second win is teachability. Prettier gives a certain tactile feeling to your code, allowing you to actually learn from what it does.
The third, and most important, win is freedom. Prettier is about a lot more than just consistency. You just don’t have to worry about how your code looks anymore. This is immensely freeing when writing code, and allows you to focus on the real problem and move code around quickly.
There’s an interesting side effect of all of this: if prettier gets used on a majority of projects, we have a chance to let you, as an individual, view the code however you want. You can load the code, print it with a custom printer yourself, and then re-print it in the project's style before committing. You'd always be able to view the code in your own style no matter where it came from. This is why we've relaxed our opinion on options, and prettier comes with several options for major things like semicolons and tabs. You could even write a completely custom printer that prints it in a totally different syntax.
For what it's worth, I think there's great value in standardizing on a general format for JavaScript. But the fact is that the JavaScript community is *massive* and I don't think we'll ever converge on a single style for all. It's possible that, given the flexibility to view code how you want, the JavaScript stored on disk becomes an artifact that you don't really care about, and we never have to debate syntax again.
The community has been phenomenal and we’ve received so many great contributions….

New Release: 1.4.0!
TypeScript and CSS support
I’d like to announce that we just released a new version that includes TypeScript and CSS support. We already have really good JSX and Flow support, and it’s great that we can support other variants as well. I want to thank Christopher Chedeau specifically for being such an active maintainer and contributor, and pushing me to complete this project. Thank you.

Thank you! James Long @jlongster github.com/jlongster
