Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
PHP code->rules
Search
Sponsored
·
Ship Features Fearlessly
Turn features on and off without deploys. Used by thousands of Ruby developers.
→
Iskander (Alex) Sharipov
October 24, 2020
Programming
78
1
Share
PHP code->rules
Iskander (Alex) Sharipov
October 24, 2020
More Decks by Iskander (Alex) Sharipov
See All by Iskander (Alex) Sharipov
quasigo
quasilyte
0
97
Go gamedev: XM music
quasilyte
0
140
Zero alloc pathfinding
quasilyte
0
670
Mycelium
quasilyte
0
96
Roboden game pitch
quasilyte
0
280
Ebitengine Ecosystem Overview
quasilyte
1
970
Go gamedev patterns
quasilyte
0
520
profile-guided code analysis
quasilyte
0
380
Go inlining
quasilyte
0
150
Other Decks in Programming
See All in Programming
t *testing.T は どこからやってくるの?
otakakot
1
690
ドメインイベントでビジネスロジックを解きほぐす #phpcon_odawara
kajitack
3
790
Server-Side Kotlin LT大会 vol.18 [Kotlin-lspの最新情報と Neovimのlsp設定例]
yasunori0418
1
160
GNU Makeの使い方 / How to use GNU Make
kaityo256
PRO
16
5.6k
Running Swift without an OS
kishikawakatsumi
0
840
アクセシビリティ試験の"その後"を仕組み化する
yuuumiravy
0
150
書籍「ユーザーストーリーマッピング」が私のバイブル
asumikam
4
380
Spec Driven Development | AI Summit Vilnius
danielsogl
PRO
1
110
Making the RBS Parser Faster
soutaro
0
440
年間50登壇、単著出版、雑誌寄稿、Podcast出演、YouTube、CM、カンファレンス主催……全部やってみたので面白さ等を比較してみよう / I’ve tried them all, so let’s compare how interesting they are.
nrslib
4
790
UIの境界線をデザインする | React Tokyo #15 メイントーク
sasagar
2
370
YJITとZJITにはイカなる違いがあるのか?
nakiym
0
220
Featured
See All Featured
Conquering PDFs: document understanding beyond plain text
inesmontani
PRO
4
2.6k
Design in an AI World
tapps
1
200
Become a Pro
speakerdeck
PRO
31
5.9k
Groundhog Day: Seeking Process in Gaming for Health
codingconduct
0
150
Measuring & Analyzing Core Web Vitals
bluesmoon
9
810
The MySQL Ecosystem @ GitHub 2015
samlambert
251
13k
How to Think Like a Performance Engineer
csswizardry
28
2.6k
How to build a perfect <img>
jonoalderson
1
5.4k
DevOps and Value Stream Thinking: Enabling flow, efficiency and business value
helenjbeal
1
170
We Analyzed 250 Million AI Search Results: Here's What I Found
joshbly
1
1.2k
What the history of the web can teach us about the future of AI
inesmontani
PRO
1
530
Keith and Marios Guide to Fast Websites
keithpitt
413
23k
Transcript
Code -> Linter rules Pattern-based static analysis
Right into the action!
$last = $a[count($a)]; Step 1: find the bad code example
Off-by-one mistake
Step 2: extract it as a pattern $last = $a[count($a)];
That’s our pattern!
Step 3: apply the pattern
phpgrep by examples
@$_ Find all usages of error suppress operator 4.7s /
6kk SLOC / 56 Cores
in_array($x, [$y]) Find in_array calls that can be replaced with
$x == $y 4.6s / 6kk SLOC / 56 Cores
$x ? true : false Find all ternary expressions that
could be replaced by just $x 4.7s / 6kk SLOC / 56 Cores
$_ == null null == $_ Find all non-strict comparisons
with null 4.5s / 6kk SLOC / 56 Cores
for ($_ == $_; $_; $_) $_ Find for loops
where == is used instead of = inside init clause 4.6s / 6kk SLOC / 56 Cores
Just like Semgrep?
None
Semgrep NoVerify+phpgrep
• A brief phpgrep history Main topics for today
• A brief phpgrep history • NoVerify dynamic rules Main
topics for today
• A brief phpgrep history • NoVerify dynamic rules •
AST pattern matching Main topics for today
• A brief phpgrep history • NoVerify dynamic rules •
AST pattern matching • Running rules efficiently Main topics for today
• A brief phpgrep history • NoVerify dynamic rules •
AST pattern matching • Running rules efficiently • Dynamic rules pros & cons Main topics for today
phpgrep history
gogrep
gogrep gogrep is cool!
gogrep phpgrep
phpgrep CLI phpgrep lib php-parser A
phpgrep CLI phpgrep lib NoVerify php-parser A php-parser B Incompatible
AST types :(
phpgrep CLI phpgrep lib NoVerify phpgrep lib fork php-parser A
php-parser B
phpgrep CLI NoVerify phpgrep lib fork php-parser B
NoVerify dynamic rules
Concepts overview phpgrep noverify dynamic rules Structural PHP search using
AST patterns
Concepts overview phpgrep noverify dynamic rules PHP linter capable of
running dynamic rules
Concepts overview phpgrep noverify dynamic rules NoVerify format for the
phpgrep-style rules
Concepts overview phpgrep noverify dynamic rules Written in
• Types info (NoVerify type inference) Dynamic rules vs phpgrep
• Types info (NoVerify type inference) • Efficient multi-pattern execution
Dynamic rules vs phpgrep
• Types info (NoVerify type inference) • Efficient multi-pattern execution
• Logical pattern grouping Dynamic rules vs phpgrep
• Types info (NoVerify type inference) • Efficient multi-pattern execution
• Logical pattern grouping • Documentation mechanisms Dynamic rules vs phpgrep
noverify PHP file PHP file PHP file rules1 rules2
rules2 noverify PHP file PHP file PHP file Dynamic rules
are loaded rules1
noverify PHP file PHP file PHP file Then files are
analyzed rules2 rules1
Dynamic rule example function ternarySimplify() { /** @warning rewrite as
$x ?: $y */ $x ? $x : $y; }
Dynamic rule example function ternarySimplify() { /** @warning rewrite as
$x ?: $y */ $x ? $x : $y; } Dynamic rules group name
Dynamic rule example function ternarySimplify() { /** @warning rewrite as
$x ?: $y */ $x ? $x : $y; } Warning message
Dynamic rule example function ternarySimplify() { /** @warning rewrite as
$x ?: $y */ $x ? $x : $y; } phpgrep pattern
Is this transformation safe? f() ? f() : 0 =>
f() ?: 0
Is this transformation safe? f() ? f() : 0 =>
f() ?: 0 Only if f() is free of side effects
Dynamic rule example (extended) function ternarySimplify() { /** * @warning
rewrite as $x ?: $y * @pure $x */ $x ? $x : $y; }
Dynamic rule example (extended) function ternarySimplify() { /** * @warning
rewrite as $x ?: $y * @pure $x */ $x ? $x : $y; } $x should be side effect free
Dynamic rule example (extended) function ternarySimplify() { /** * @warning
rewrite as $x ?: $y * @pure $x * @fix $x ?: $y */ $x ? $x : $y; } auto fix action for NoVerify
Dynamic rule example (@comment) /** * @comment Find ternary expr
that can be simplified * @before $x ? $x : $y * @after $x ?: $y */ function ternarySimplify() { // ...as before } Dynamic rule documentation
function argsOrder() { /** @warning suspicious args order */ any:
{ str_replace($_, $_, ${"char"}, ${"*"}); str_replace($_, $_, "", ${"*"}); } }
function argsOrder() { /** @warning suspicious args order */ any:
{ str_replace($_, $_, ${"char"}, ${"*"}); str_replace($_, $_, "", ${"*"}); } } “any” pattern grouping
function bitwiseOps() { /** * @warning maybe && is intended?
* @fix $x && $y * @type bool $x * @type bool $y */ $x & $y; }
function bitwiseOps() { /** * @warning maybe && is intended?
* @fix $x && $y * @type bool $x * @type bool $y */ $x & $y; } Type filters
T T typed expression object Arbitrary object type T[] Array
of T-typed elements !T Any type except T !(A|B) Any type except A and B ?T Same as (T|null) Type matching examples
function stringCmp() { /** * @warning compare strings with ===
* @fix $x === $y * @type string $x * @or * @type string $y */ $x == $y; }
function stringCmp() { /** * @warning compare strings with ===
* @fix $x === $y * @type string $x * @or * @type string $y */ $x == $y; } Or-connected constraints
1. Create a rules file 2. Run NoVerify with -rules
flag How to run custom rules $ noverify -rules rules.php target
AST pattern matching
“$x = $x” pattern string
“$x = $x” pattern string Parsed AST
“$x = $x” pattern string Parsed AST Modified AST (with
meta nodes)
function match(Node $pat, Node $n) $pat is a compiled pattern
$n is a node being matched Matching AST
• Both $pat and $n are traversed • Non-meta nodes
are compared normally • $pat meta nodes are separate cases • Named matches are collected (capture) Algorithm
• $x is a simple “match any” named match •
$_ is a “match any” unnamed match • ${"str"} matches string literals • ${"str:x"} is a capturing form of ${"str"} • ${"*"} matches zero or more nodes Valid PHP Syntax! Meta node examples
$_ = ${"str"} matches $foo->x = "abc"; $x = '';
$_ = ${"str"} rejects $foo->x = f(); $x = $y;
f() matches f() F() Unless explicitly marked as case-sensitive
new T() matches new T() new t() Unless explicitly marked
as case-sensitive
Pattern matching = $x $x += $a 10 Pattern $x=$x
Target $a+=10
Pattern matching = $x $x += $a 10 Pattern $x=$x
Target $a+=10
Pattern matching = $x $x = $a 10 Pattern $x=$x
Target $a=10
Pattern matching = $x $x = $a 10 Pattern $x=$x
Target $a=10
Pattern matching = $x $x = $a 10 Pattern $x=$x
Target $a=10 $x is bound to $a
Pattern matching = $x $x = $a 10 Pattern $x=$x
Target $a=10 $a != 10
Pattern matching = $x $x = $a $a Pattern $x=$x
Target $a=$a
Pattern matching = $x $x = $a $a Pattern $x=$x
Target $a=$a
Pattern matching = $x $x = $a $a Pattern $x=$x
Target $a=$a $x is bound to $a
Pattern matching = $x $x = $a $a Pattern $x=$x
Target $a=$a $a = $a, pattern matched
Trying to make pattern matching work faster...
“$x = $x” pattern string Parsed AST Modified AST
“$x = $x” pattern string Parsed AST Polish notation +
stack
Stack-based matching = $a $a Pattern $x=$x Target $a=$a Instructions
Stack <Assign> = <NamedAny x> <NamedAny x>
Stack-based matching = $a $a Pattern $x=$x Target $a=$a Instructions
Stack <Assign> $a <NamedAny x> $a <NamedAny x>
Stack-based matching = $a $a Pattern $x=$x Target $a=$a Instructions
Stack <Assign> $a <NamedAny x> <NamedAny x>
Stack-based matching = $a $a Pattern $x=$x Target $a=$a Instructions
Stack <Assign> <NamedAny x> <NamedAny x>
• 2-4 times faster matching • No AST types dependency
• More optimization opportunities Stack-based matching
Running rules efficiently
Imagine that we have a lot of rules... rule-1 ...
rule-N PHP file PHP file
Imagine that we have a lot of rules... rule-1 ...
rule-N PHP file PHP file
Imagine that we have a lot of rules... rule-1 ...
rule-N PHP file PHP file
Imagine that we have a lot of rules... rule-1 ...
rule-N PHP file PHP file N * M problem
• AST is traversed only once • For every node,
run only relevant rules We can tune the matching engine to work very fast N*M cure: categorized rules
rule PHP file ... Assign rule ... TernaryExpr
rule PHP file ... Assign rule ... TernaryExpr Node categories
rule PHP file ... Assign rule ... TernaryExpr Categorized rules
• Local: run rules only inside functions • Root: run
rules only inside global scope • Universal: run rules everywhere Extra registry layer: scopes
rule PHP file ... Assign rule ... TernaryExpr Global scope
rule ... Assign rule ... TernaryExpr Local scope
rule PHP file ... Assign rule ... TernaryExpr Global scope
rule ... Assign rule ... TernaryExpr Local scope Scoped group
• Expression can’t contain a statement • Some statements are
top-level only We don’t use this knowledge right now. Extra registry layer: expr vs stmt
If any rule from a group matched, all other rules
inside the group are skipped for the current node. • Helps to avoid matching conflicts • Improves performance Group cutoff
// input: $a[0] = $a[0] + 1 function assignOp() {
/** @fix ++$x */ $x = $x + 1; /** @fix $x += $y */ $x = $x + $y; }
// input: $a[0] = $a[0] + 1 function assignOp() {
/** @fix ++$x */ $x = $x + 1; /** @fix $x += $y */ $x = $x + $y; } Matched, ++$a[0] suggested
// input: $a[0] = $a[0] + 1 function assignOp() {
/** @fix ++$x */ $x = $x + 1; /** @fix $x += $y */ $x = $x + $y; } Skipped
Dynamic rules pros & cons
• No need to re-compile NoVerify Dynamic rules advantages
• No need to re-compile NoVerify • Simple things are
simple Dynamic rules advantages
• No need to re-compile NoVerify • Simple things are
simple • No Go coding required Dynamic rules advantages
• No need to re-compile NoVerify • Simple things are
simple • No Go coding required • Rules are declarative Dynamic rules advantages
• No need to re-compile NoVerify • Simple things are
simple • No Go coding required • Rules are declarative • No need to know linter internals Dynamic rules advantages
• Not very composable • Too verbose for non-trivial cases
• Hard to get the autocompletion working PHPDoc-based attributes
• Hard to express flow-based rules • PHP syntax limitations
• Recursive block search is problematic AST pattern limitations
Comparison with Ruleguard
None
Rule group name
gogrep pattern
Type filter
Auto fix action
None
Target language go-ruleguard Go NoVerify rules PHP NoVerify vs Ruleguard
DSL core go-ruleguard Fluent API DSL NoVerify rules Top-level patterns
+ PHPDoc NoVerify vs Ruleguard
Filtering mechanism go-ruleguard Go expressions NoVerify rules PHPDoc annotations NoVerify
vs Ruleguard
Type filters go-ruleguard Type matching patterns NoVerify rules Simple type
expressions NoVerify vs Ruleguard
• NoVerify - static analyzer (linter) • phpgrep - structural
PHP search • phpgrep VS Code extension • Dynamic rules example • Dynamic rules for static analysis article • Ruleguard - dynamic rules for Go Links
Code -> Linter rules Pattern-based static analysis