Slide 1

Slide 1 text

The Missing Static Type Ballad quasilyte @ PHP Yoshkar-Ola 2019 vk.com infrastructure team

Slide 2

Slide 2 text

Before we begin...

This presentation was created in LibreOffice Impress. I didn’t like the experience at all.

Slide 4

Slide 4 text

A few words about NoVerify

● Fast: several times faster than most linters.
● Extensible: extensions in Go and PHP.
● Language server: supports LSP.

Telegram group: https://t.me/noverify_linter

Slide 5

Slide 5 text

What type of presentation is this?

I want to convince you that PHP needs more type facilities, even though it has gained a few new features recently.

Slide 6

Slide 6 text

Why do we need type info?

● Documentation: API contracts.
● IDE: navigation, autocomplete, refactoring, etc.
● Static analysis: find more bugs.
● JIT and AOT: more optimization space.
● Meta: more info for API/schema/code gen.

Slide 7

Slide 7 text

Dynamic (and implicit) types

● Documentation: API contracts.
● IDE: navigation, autocomplete, refactoring, etc.
● Static analysis: find more bugs.
● JIT and AOT: more optimization space.
● Meta: more info for API/schema/code gen. (As long as reflection is enough for you.)

Slide 8

Slide 8 text

How much type info do we need?

Slide 9

Slide 9 text

How much type info do we need?

For tools, it’s “as much as possible”.

Slide 10

Slide 10 text

How much type info do we need?

For tools, it’s “as much as possible”.

For humans, we need to strike a good balance, so people get enough information and do not feel overwhelmed.

Slide 11

Slide 11 text

“Losing static type info since 1995” © PHP

Slide 12

Slide 12 text

How exactly do we lose types?

Almost every PHP program has type leaks.

Slide 13

Slide 13 text

Late static binding (bad-1)

    class Foo {
        /** @return Foo */
        public static function create() {
            return new static();
        }
    }
    class Bar extends Foo {}
    $b = Bar::create(); // $b:Foo

Slide 14

Slide 14 text

Late static binding (bad-2)

    class Foo {
        /** @return self */
        public static function create() {
            return new static();
        }
    }
    class Bar extends Foo {}
    $b = Bar::create(); // $b:Foo

Slide 15

Slide 15 text

Late static binding (fixed)

    class Foo {
        /** @return static */
        public static function create() {
            return new static();
        }
    }
    class Bar extends Foo {}
    $b = Bar::create(); // $b:Bar
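
At run time, `new static()` already builds the called class; the annotation only decides what static tools get to see. A minimal check that reuses the slide's `Foo` and `Bar` classes (nothing beyond the slide is assumed):

```php
<?php
// Runtime check of the slide's example: `new static()` instantiates
// the class the method was called on, not the class that declares it.
class Foo {
    /** @return static */
    public static function create() {
        return new static();
    }
}

class Bar extends Foo {}

$f = Foo::create();
$b = Bar::create();

echo get_class($f), "\n"; // Foo
echo get_class($b), "\n"; // Bar
```

So `@return static` is the only one of the three annotations that matches what the code actually returns.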

Slide 16

Slide 16 text

array type hint (bad)

    function first_value(array $xs) {
        foreach ($xs as $x) {
            return $x->value; // $x:mixed
        }
        return null;
    }

Slide 17

Slide 17 text

array type hint (fixed)

    /** @param WithValue[] $xs */
    function first_value(array $xs) {
        foreach ($xs as $x) {
            return $x->value; // $x:WithValue
        }
        return null;
    }

Slide 18

Slide 18 text

Mixed type propagation 1

    function identity($x) {
        return $x;
    }
    $x = 10;           // $x:int
    $y = identity($x); // $y:mixed

Slide 19

Slide 19 text

Mixed type propagation 2

    $i = 1;           // $i:int
    $mixed = [$i];
    $i2 = $mixed[0];  // $i2:mixed

Slide 20

Slide 20 text

Guide: how not to lose type info

“You don’t know what you have until it’s gone”

Slide 21

Slide 21 text

Docs: human-only types

    // $xs is expected to be an array
    // of integers (only int keys).
    function last($xs) {
        return $xs[count($xs)-1];
    }
    // Losing all type info.

Slide 22

Slide 22 text

Add type hints

    function last(array $xs) : int {
        return $xs[count($xs)-1];
    }
    // Still missing a lot of info...

Slide 23

Slide 23 text

Add phpdoc tags

    /**
     * @param int[] $xs
     *
     * @return int
     */
    function last(array $xs) : int {
        return $xs[count($xs)-1];
    }
    // No int-keys restriction...

Slide 24

Slide 24 text

Add generics-aware tags

    /**
     * @param int[] $xs
     * @psalm-param array<int, int> $xs
     * @return int
     */
    function last(array $xs) : int {
        return $xs[count($xs)-1];
    }

Slide 25

Slide 25 text

It doesn’t look good... Can we do better?

Slide 26

Slide 26 text

If we had generic arrays

    function last(array<int, int> $xs) : int {
        return $xs[count($xs)-1];
    }

Slide 27

Slide 27 text

PHP 10 ʕ⊙ϖ⊙ʔ

    func last(xs []int) int {
        return xs[len(xs)-1]
    }

Slide 28

Slide 28 text

Seriously, we need changes

● Educate people: explain why we need generics, or at least “generic arrays”.
● Get noticeable community support.
● Work on the v2 proposal for generics.
● Convince PHP devs that this feature is needed.
● Find people who will implement generics.

Slide 29

Slide 29 text

Type information sources

Type inference and related problems

Slide 30

Slide 30 text

Type inference

Since type information is mostly implicit, we need to infer it from expressions.

It’s not always possible to get a precise result, since we almost always lose at least some type info along the way.

A type can also depend on run-time information that we don’t have.

Slide 31

Slide 31 text

Type guessing game

Since type information is mostly implicit, we need to infer it from expressions.

It’s not always possible to get a precise result, since we almost always lose at least some type info along the way.

A type can also depend on run-time information that we don’t have.

Trying to guess types

Slide 32

Slide 32 text

Precise type info

The most reliable type info sources

Slide 33

Slide 33 text

Scalars and literals

    $i = 1;         // $i:int
    $f = 1.3;       // $f:float
    $s = "x";       // $s:string
    $o = new Obj(); // $o:Obj
    $ii = [1, 2];   // $ii:int[]

Slide 34

Slide 34 text

Smart-casts

    if ($o instanceof Obj) {
        // $o is Obj inside this block.
    }
    if (!is_string($s)) {
        return;
    }
    // $s is string below this if statement.

Slide 35

Slide 35 text

Type hints (in strict mode)

    function f(int $i, array $xs) {
        // $i is int.
        // $xs is an array (of mixed).
    }

Slide 36

Slide 36 text

Typed properties

    class Typed {
        public int $i = 10;
        public ?string $s;
    }
    $t = new Typed();
    $i = $t->i; // $i:int
    $s = $t->s; // $s:?string

Slide 37

Slide 37 text

Imprecise type info

Secondary type info sources

Slide 38

Slide 38 text

phpdoc annotations

    /** @var DBConnection $db */
    global $db;
    // + function/method phpdocs.
    // + PhpStorm meta files (stubs).

Slide 39

Slide 39 text

Flow-related types

    if ($moon_phase) {
        $x = 10;
    } else {
        $x = "hello";
    }
    // $x:int|string

Slide 40

Slide 40 text

Optimistic inference

    $xs = [1, 2]; // $xs:int[]
    $x = $xs[$i]; // $x:int
    // In reality, if the $i key is not in $xs,
    // $x will be null, so the exact type
    // is more like ?int, but most
    // programs perform optimistic inference,
    // where we omit some details...
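
The gap between the optimistic `int` and the exact `?int` can be seen at run time. The sketch below uses a hypothetical helper, `element_at`, whose declared return type spells out the pessimistic answer; the helper is an illustration only and is not part of the talk or of any library.

```php
<?php
// element_at is a hypothetical helper used only for illustration.
// Its return type states the exact result of indexing: ?int.
function element_at(array $xs, int $i): ?int {
    // `??` yields null for a missing key without raising a warning.
    return $xs[$i] ?? null;
}

$xs = [1, 2];

var_dump(element_at($xs, 1)); // int(2)
var_dump(element_at($xs, 5)); // NULL
```

An optimistic analyzer reports `int` for both calls; a precise one would force a null check after every array access.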

Slide 41

Slide 41 text

Pragmatic sacrifices

It can be bad to be smarter than PhpStorm when we’re talking about types.

You don’t want to resolve fewer types, but it’s not always desirable to resolve more.

This is especially true with global type inference.

Slide 42

Slide 42 text

Local type inference

To get an expression’s type, local type inference only uses that expression plus the context info (variable types, function info, etc.).

Slide 43

Slide 43 text

Global type inference

With global type inference, an expression’s type might depend on very distant parts of a program.

Seemingly irrelevant code changes can cause a lot of changes in type inference results.

Slide 44

Slide 44 text

Local type inference

    // $p => ?
    // returns => ?
    function get_x($p) {
        return $p->x;
    }

    // Somewhere inside a code base:
    $p = new Point();
    $x = get_x($p);

Slide 45

Slide 45 text

Global type inference

    // $p => Point
    // returns => float
    function get_x($p) {
        return $p->x;
    }

    // Somewhere inside a code base:
    $p = new Point();
    $x = get_x($p);

Slide 46

Slide 46 text

Global type inference

    // $p => Point|int
    // returns => float|null
    function get_x($p) {
        return $p->x;
    }

    // Somewhere inside a code base:
    $p = new Point();
    $x = get_x($p);
    $x2 = get_x(19);
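
One way to picture why a distant call changes the result: a global analysis can collect the argument types from every call site and union them. This is a rough sketch under that assumption; the `$call_sites` layout and `infer_param_type` are invented for illustration and do not describe any particular tool.

```php
<?php
// Hypothetical sketch: every name here is invented for illustration.
// Each call site is recorded as [function name, list of argument types].
$call_sites = [
    ['get_x', ['Point']], // $x = get_x($p);
    ['get_x', ['int']],   // $x2 = get_x(19);
];

/**
 * Union of the types seen for parameter $index of $func
 * across all recorded call sites.
 */
function infer_param_type(array $call_sites, string $func, int $index): string {
    $types = [];
    foreach ($call_sites as [$name, $arg_types]) {
        if ($name === $func && isset($arg_types[$index])) {
            $types[$arg_types[$index]] = true; // array keys deduplicate types
        }
    }
    // No call sites at all: nothing is known about the parameter.
    return $types ? implode('|', array_keys($types)) : 'mixed';
}

echo infer_param_type($call_sites, 'get_x', 0), "\n"; // Point|int
```

Deleting the second call site turns the answer back into `Point`, which is the kind of non-local effect the slides describe.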

Slide 47

Slide 47 text

Side-by-side comparison

Local
+ Simplicity
+ Locality
+ Faster
Good enough for most static analysis tools.

Global
+ Completeness
+ Precision
Good for optimizing compilers and audit-oriented static analysis tools.

Slide 48

Slide 48 text

NoVerify types resolving

Slide 49

Slide 49 text

The problem

    function f3() { // f3:?
        return f2();
    }
    function f2() { // f2:?
        return f1();
    }
    function f1() { // f1:int
        return 10;
    }

Slide 50

Slide 50 text

The problem

● Dependent symbols can live far away from each other (different parts of a project).
● Projects can be too large to keep them in memory (several GB).
● We don’t want to make extra “passes” over the source code (too slow).
● We also don’t want to re-calculate all types when one file is changed.

Slide 51

Slide 51 text

Solution: lazy types

    function f3() { // f3:f2()
        return f2();
    }
    function f2() { // f2:f1()
        return f1();
    }
    function f1() { // f1:int
        return 10;
    }

Slide 52

Slide 52 text

Solution: lazy types

● First pass: index the entire project, record symbol info and lazy types.
● Second pass: do the analysis itself. When type info is needed, it’s “solved” on demand.

Only files that are currently being analyzed are loaded into memory.

Slide 53

Slide 53 text

Solving f3()

    $x = f3(); // $x:f3()

Slide 54

Slide 54 text

Solving f3()

    $x = f3(); // $x:f2()

    typeof(f3()) => call(f2)

Slide 55

Slide 55 text

Solving f3()

    $x = f3(); // $x:f1()

    typeof(f3()) => call(f2)
    typeof(call(f2)) => call(f1)

Slide 56

Slide 56 text

Solving f3()

    $x = f3(); // $x:int

    typeof(f3()) => call(f2)
    typeof(call(f2)) => call(f1)
    typeof(call(f1)) => int
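
The solving steps above can be sketched as a small resolver over an index built in the first pass. This is an illustration only, not NoVerify's actual implementation: the `$index` layout, the `call(name)` string encoding, and the `solve` function are all assumptions made for the example.

```php
<?php
// Hypothetical sketch of on-demand solving; not NoVerify's real code.
// The index maps a function name to its recorded type: either a
// concrete type or a lazy "call(name)" reference to another function.
$index = [
    'f3' => 'call(f2)',
    'f2' => 'call(f1)',
    'f1' => 'int',
];

function solve(array $index, string $type, array $seen = []): string {
    // Not a lazy reference: it is already a concrete type.
    if (!preg_match('/^call\((\w+)\)$/', $type, $m)) {
        return $type;
    }
    $name = $m[1];
    // Unknown function or a recursive cycle: give up with mixed.
    if (!isset($index[$name]) || isset($seen[$name])) {
        return 'mixed';
    }
    $seen[$name] = true;
    // Follow the chain one step, exactly like the slides do.
    return solve($index, $index[$name], $seen);
}

echo solve($index, 'call(f3)'), "\n"; // int
```

Nothing is resolved until someone asks for the type of `f3()`, so changing `f1` only requires updating one index entry.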

Slide 57

Slide 57 text

Challenges

● The two-pass limitation makes it harder to collect whole-program facts.
● Lazy types are slower than precalculated types. If we cache them, we lose some of their benefits.

Slide 58

Slide 58 text

Is single-pass possible?

If we have forward declarations, like in C, then yes.

But that’s not what you would expect from a modern programming language.

Slide 59

Slide 59 text

Metadata cache

● The “first pass” (indexing) is only executed if we don’t have cached info for a file. In that case, the file is indexed and the results are saved to disk. So in practice it’s one-pass in some cases.
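
A rough sketch of the caching idea, under stated assumptions: the slides do not say how the cache is keyed or stored, so the content-hash key, the JSON files, and both function names below are hypothetical choices made for illustration.

```php
<?php
// Hypothetical sketch: the cache layout and the hash-based key are
// assumptions for illustration, not how NoVerify stores its metadata.

/** Stand-in for the indexing pass: records which functions a file declares. */
function index_source(string $source): array {
    preg_match_all('/function\s+(\w+)\s*\(/', $source, $m);
    return $m[1];
}

/** Returns [metadata, cache_hit]; indexes only when no cached entry exists. */
function load_metadata(string $source, string $cache_dir): array {
    // Keying by content hash makes any edit to the file a cache miss.
    $path = $cache_dir . '/' . md5($source) . '.json';
    if (is_file($path)) {
        return [json_decode(file_get_contents($path), true), true];
    }
    $meta = index_source($source);
    file_put_contents($path, json_encode($meta));
    return [$meta, false];
}

$cache_dir = sys_get_temp_dir() . '/metadata_cache_demo_' . uniqid();
mkdir($cache_dir);

$source = 'function f1() { return 10; }';
[$meta, $hit] = load_metadata($source, $cache_dir);   // indexes and saves
[$meta2, $hit2] = load_metadata($source, $cache_dir); // served from disk
```

On the second call the indexing step is skipped entirely, which is the "one-pass in some cases" behavior.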

Slide 60

Slide 60 text

Get involved!

NoVerify is an open-source project, your contributions are welcome.

Slide 61

Slide 61 text

PhpStorm limitations

Slide 62

Slide 62 text

Imprecise suggestions (1/2)

    function f($x) {
        if (…) {
            return false;
        }
        return (int)g();
    }

Slide 63

Slide 63 text

Imprecise suggestions (2/2)

    /** @return int|bool */
    function f($x) {
        if (…) {
            return false;
        }
        return (int)g();
    }

Slide 64

Slide 64 text

Union-typed array elements

    /** @var (Foo|Bar)[] $a */
    $a = [
        new Foo(),
        new Bar(),
    ];
    $foo = $a[0]; // $foo:mixed

Slide 65

Slide 65 text

Homogeneous array literals

    $foos = [
        new Foo(),
        new Foo(),
    ];
    // $foo is not inferred to be Foo.
    // Need @var phpdoc.
    $foo = $foos[0]; // $foo:mixed

Slide 66

Slide 66 text

Tuples

    /** @return tuple(int,string) */
    function add1($x) {
        if (!is_numeric($x)) {
            return [0, '$x must be numerical'];
        }
        return [$x + 1, ''];
    }
    list($v, $err) = add1($x);
    // $v:mixed, $err:mixed

Slide 67

Slide 67 text

Resources

● Generic arrays RFC (2016)
● Generics RFC (2016)
● Generics and why we need them
● Typed properties RFC
● PhpStorm stubs
● PhpStorm deep-assoc plugin
● PHPDoc types format (ABNF)

Slide 68

Slide 68 text

NoVerify resources

● Habr: NoVerify public announcement
● Habr: NoVerify dynamic rules
● NoVerify Telegram group

Slide 69

Slide 69 text

Keep your types safe!
