Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
PHP code->rules
Search
Iskander (Alex) Sharipov
October 24, 2020
Programming
1
55
PHP code->rules
Iskander (Alex) Sharipov
October 24, 2020
Tweet
Share
More Decks by Iskander (Alex) Sharipov
See All by Iskander (Alex) Sharipov
quasigo
quasilyte
0
47
Go gamedev: XM music
quasilyte
0
110
Zero alloc pathfinding
quasilyte
0
470
Mycelium
quasilyte
0
58
Roboden game pitch
quasilyte
0
190
Ebitengine Ecosystem Overview
quasilyte
1
840
Go gamedev patterns
quasilyte
0
460
profile-guided code analysis
quasilyte
0
350
Go inlining
quasilyte
0
120
Other Decks in Programming
See All in Programming
プロダクト開発でも使おう 関数のオーバーロード
yoiwamoto
0
160
Webの外へ飛び出せ NativePHPが切り拓くPHPの未来
takuyakatsusa
2
170
なぜ適用するか、移行して理解するClean Architecture 〜構造を超えて設計を継承する〜 / Why Apply, Migrate and Understand Clean Architecture - Inherit Design Beyond Structure
seike460
PRO
1
260
Go Modules: From Basics to Beyond / Go Modulesの基本とその先へ
kuro_kurorrr
0
120
型付きアクターモデルがもたらす分散シミュレーションの未来
piyo7
0
800
AIネイティブなプロダクトをGolangで挑む取り組み
nmatsumoto4
0
120
たった 1 枚の PHP ファイルで実装する MCP サーバ / MCP Server with Vanilla PHP
okashoi
0
130
C++20 射影変換
faithandbrave
0
500
GoのGenericsによるslice操作との付き合い方
syumai
2
670
すべてのコンテキストを、 ユーザー価値に変える
applism118
2
430
Gleamという選択肢
comamoca
6
740
Haskell でアルゴリズムを抽象化する / 関数型言語で競技プログラミング
naoya
17
4.8k
Featured
See All Featured
Unsuck your backbone
ammeep
671
58k
A Modern Web Designer's Workflow
chriscoyier
693
190k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
137
34k
Practical Orchestrator
shlominoach
188
11k
Reflections from 52 weeks, 52 projects
jeffersonlam
351
20k
Principles of Awesome APIs and How to Build Them.
keavy
126
17k
Fireside Chat
paigeccino
37
3.5k
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
46
9.6k
Mobile First: as difficult as doing things right
swwweet
223
9.7k
We Have a Design System, Now What?
morganepeng
52
7.6k
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
228
22k
Faster Mobile Websites
deanohume
307
31k
Transcript
Code -> Linter rules Pattern-based static analysis
Right into the action!
$last = $a[count($a)]; Step 1: find the bad code example
Off-by-one mistake
Step 2: extract it as a pattern $last = $a[count($a)];
That’s our pattern!
Step 3: apply the pattern
phpgrep by examples
@$_ Find all usages of error suppress operator 4.7s /
6kk SLOC / 56 Cores
in_array($x, [$y]) Find in_array calls that can be replaced with
$x == $y 4.6s / 6kk SLOC / 56 Cores
$x ? true : false Find all ternary expressions that
could be replaced by just $x 4.7s / 6kk SLOC / 56 Cores
$_ == null null == $_ Find all non-strict comparisons
with null 4.5s / 6kk SLOC / 56 Cores
for ($_ == $_; $_; $_) $_ Find for loops
where == is used instead of = inside init clause 4.6s / 6kk SLOC / 56 Cores
Just like Semgrep?
None
Semgrep NoVerify+phpgrep
• A brief phpgrep history Main topics for today
• A brief phpgrep history • NoVerify dynamic rules Main
topics for today
• A brief phpgrep history • NoVerify dynamic rules •
AST pattern matching Main topics for today
• A brief phpgrep history • NoVerify dynamic rules •
AST pattern matching • Running rules efficiently Main topics for today
• A brief phpgrep history • NoVerify dynamic rules •
AST pattern matching • Running rules efficiently • Dynamic rules pros & cons Main topics for today
phpgrep history
gogrep
gogrep gogrep is cool!
gogrep phpgrep
phpgrep CLI phpgrep lib php-parser A
phpgrep CLI phpgrep lib NoVerify php-parser A php-parser B Incompatible
AST types :(
phpgrep CLI phpgrep lib NoVerify phpgrep lib fork php-parser A
php-parser B
phpgrep CLI NoVerify phpgrep lib fork php-parser B
NoVerify dynamic rules
Concepts overview phpgrep noverify dynamic rules Structural PHP search using
AST patterns
Concepts overview phpgrep noverify dynamic rules PHP linter capable of
running dynamic rules
Concepts overview phpgrep noverify dynamic rules NoVerify format for the
phpgrep-style rules
Concepts overview phpgrep noverify dynamic rules Written in
• Types info (NoVerify type inference) Dynamic rules vs phpgrep
• Types info (NoVerify type inference) • Efficient multi-pattern execution
Dynamic rules vs phpgrep
• Types info (NoVerify type inference) • Efficient multi-pattern execution
• Logical pattern grouping Dynamic rules vs phpgrep
• Types info (NoVerify type inference) • Efficient multi-pattern execution
• Logical pattern grouping • Documentation mechanisms Dynamic rules vs phpgrep
noverify PHP file PHP file PHP file rules1 rules2
rules2 noverify PHP file PHP file PHP file Dynamic rules
are loaded rules1
noverify PHP file PHP file PHP file Then files are
analyzed rules2 rules1
Dynamic rule example function ternarySimplify() { /** @warning rewrite as
$x ?: $y */ $x ? $x : $y; }
Dynamic rule example function ternarySimplify() { /** @warning rewrite as
$x ?: $y */ $x ? $x : $y; } Dynamic rules group name
Dynamic rule example function ternarySimplify() { /** @warning rewrite as
$x ?: $y */ $x ? $x : $y; } Warning message
Dynamic rule example function ternarySimplify() { /** @warning rewrite as
$x ?: $y */ $x ? $x : $y; } phpgrep pattern
Is this transformation safe? f() ? f() : 0 =>
f() ?: 0
Is this transformation safe? f() ? f() : 0 =>
f() ?: 0 Only if f() is free of side effects
Dynamic rule example (extended) function ternarySimplify() { /** * @warning
rewrite as $x ?: $y * @pure $x */ $x ? $x : $y; }
Dynamic rule example (extended) function ternarySimplify() { /** * @warning
rewrite as $x ?: $y * @pure $x */ $x ? $x : $y; } $x should be side effect free
Dynamic rule example (extended) function ternarySimplify() { /** * @warning
rewrite as $x ?: $y * @pure $x * @fix $x ?: $y */ $x ? $x : $y; } auto fix action for NoVerify
Dynamic rule example (@comment) /** * @comment Find ternary expr
that can be simplified * @before $x ? $x : $y * @after $x ?: $y */ function ternarySimplify() { // ...as before } Dynamic rule documentation
function argsOrder() { /** @warning suspicious args order */ any:
{ str_replace($_, $_, ${"char"}, ${"*"}); str_replace($_, $_, "", ${"*"}); } }
function argsOrder() { /** @warning suspicious args order */ any:
{ str_replace($_, $_, ${"char"}, ${"*"}); str_replace($_, $_, "", ${"*"}); } } “any” pattern grouping
function bitwiseOps() { /** * @warning maybe && is intended?
* @fix $x && $y * @type bool $x * @type bool $y */ $x & $y; }
function bitwiseOps() { /** * @warning maybe && is intended?
* @fix $x && $y * @type bool $x * @type bool $y */ $x & $y; } Type filters
T T typed expression object Arbitrary object type T[] Array
of T-typed elements !T Any type except T !(A|B) Any type except A and B ?T Same as (T|null) Type matching examples
function stringCmp() { /** * @warning compare strings with ===
* @fix $x === $y * @type string $x * @or * @type string $y */ $x == $y; }
function stringCmp() { /** * @warning compare strings with ===
* @fix $x === $y * @type string $x * @or * @type string $y */ $x == $y; } Or-connected constraints
1. Create a rules file 2. Run NoVerify with -rules
flag How to run custom rules $ noverify -rules rules.php target
AST pattern matching
“$x = $x” pattern string
“$x = $x” pattern string Parsed AST
“$x = $x” pattern string Parsed AST Modified AST (with
meta nodes)
function match(Node $pat, Node $n) $pat is a compiled pattern
$n is a node being matched Matching AST
• Both $pat and $n are traversed • Non-meta nodes
are compared normally • $pat meta nodes are separate cases • Named matches are collected (capture) Algorithm
• $x is a simple “match any” named match •
$_ is a “match any” unnamed match • ${"str"} matches string literals • ${"str:x"} is a capturing form of ${"str"} • ${"*"} matches zero or more nodes Valid PHP Syntax! Meta node examples
$_ = ${"str"} matches $foo->x = "abc"; $x = '';
$_ = ${"str"} rejects $foo->x = f(); $x = $y;
f() matches f() F() Unless explicitly marked as case-sensitive
new T() matches new T() new t() Unless explicitly marked
as case-sensitive
Pattern matching = $x $x += $a 10 Pattern $x=$x
Target $a+=10
Pattern matching = $x $x += $a 10 Pattern $x=$x
Target $a+=10
Pattern matching = $x $x = $a 10 Pattern $x=$x
Target $a=10
Pattern matching = $x $x = $a 10 Pattern $x=$x
Target $a=10
Pattern matching = $x $x = $a 10 Pattern $x=$x
Target $a=10 $x is bound to $a
Pattern matching = $x $x = $a 10 Pattern $x=$x
Target $a=10 $a != 10
Pattern matching = $x $x = $a $a Pattern $x=$x
Target $a=$a
Pattern matching = $x $x = $a $a Pattern $x=$x
Target $a=$a
Pattern matching = $x $x = $a $a Pattern $x=$x
Target $a=$a $x is bound to $a
Pattern matching = $x $x = $a $a Pattern $x=$x
Target $a=$a $a = $a, pattern matched
Trying to make pattern matching work faster...
“$x = $x” pattern string Parsed AST Modified AST
“$x = $x” pattern string Parsed AST Polish notation +
stack
Stack-based matching = $a $a Pattern $x=$x Target $a=$a Instructions
Stack <Assign> = <NamedAny x> <NamedAny x>
Stack-based matching = $a $a Pattern $x=$x Target $a=$a Instructions
Stack <Assign> $a <NamedAny x> $a <NamedAny x>
Stack-based matching = $a $a Pattern $x=$x Target $a=$a Instructions
Stack <Assign> $a <NamedAny x> <NamedAny x>
Stack-based matching = $a $a Pattern $x=$x Target $a=$a Instructions
Stack <Assign> <NamedAny x> <NamedAny x>
• 2-4 times faster matching • No AST types dependency
• More optimization opportunities Stack-based matching
Running rules efficiently
Imagine that we have a lot of rules... rule-1 ...
rule-N PHP file PHP file
Imagine that we have a lot of rules... rule-1 ...
rule-N PHP file PHP file
Imagine that we have a lot of rules... rule-1 ...
rule-N PHP file PHP file
Imagine that we have a lot of rules... rule-1 ...
rule-N PHP file PHP file N * M problem
• AST is traversed only once • For every node,
run only relevant rules We can tune the matching engine to work very fast N*M cure: categorized rules
rule PHP file ... Assign rule ... TernaryExpr
rule PHP file ... Assign rule ... TernaryExpr Node categories
rule PHP file ... Assign rule ... TernaryExpr Categorized rules
• Local: run rules only inside functions • Root: run
rules only inside global scope • Universal: run rules everywhere Extra registry layer: scopes
rule PHP file ... Assign rule ... TernaryExpr Global scope
rule ... Assign rule ... TernaryExpr Local scope
rule PHP file ... Assign rule ... TernaryExpr Global scope
rule ... Assign rule ... TernaryExpr Local scope Scoped group
• Expression can’t contain a statement • Some statements are
top-level only We don’t use this knowledge right now. Extra registry layer: expr vs stmt
If any rule from a group matched, all other rules
inside the group are skipped for the current node. • Helps to avoid matching conflicts • Improves performance Group cutoff
// input: $a[0] = $a[0] + 1 function assignOp() {
/** @fix ++$x */ $x = $x + 1; /** @fix $x += $y */ $x = $x + $y; }
// input: $a[0] = $a[0] + 1 function assignOp() {
/** @fix ++$x */ $x = $x + 1; /** @fix $x += $y */ $x = $x + $y; } Matched, ++$a[0] suggested
// input: $a[0] = $a[0] + 1 function assignOp() {
/** @fix ++$x */ $x = $x + 1; /** @fix $x += $y */ $x = $x + $y; } Skipped
Dynamic rules pros & cons
• No need to re-compile NoVerify Dynamic rules advantages
• No need to re-compile NoVerify • Simple things are
simple Dynamic rules advantages
• No need to re-compile NoVerify • Simple things are
simple • No Go coding required Dynamic rules advantages
• No need to re-compile NoVerify • Simple things are
simple • No Go coding required • Rules are declarative Dynamic rules advantages
• No need to re-compile NoVerify • Simple things are
simple • No Go coding required • Rules are declarative • No need to know linter internals Dynamic rules advantages
• Not very composable • Too verbose for non-trivial cases
• Hard to get the autocompletion working PHPDoc-based attributes
• Hard to express flow-based rules • PHP syntax limitations
• Recursive block search is problematic AST pattern limitations
Comparison with Ruleguard
None
Rule group name
gogrep pattern
Type filter
Auto fix action
None
Target language go-ruleguard Go NoVerify rules PHP NoVerify vs Ruleguard
DSL core go-ruleguard Fluent API DSL NoVerify rules Top-level patterns
+ PHPDoc NoVerify vs Ruleguard
Filtering mechanism go-ruleguard Go expressions NoVerify rules PHPDoc annotations NoVerify
vs Ruleguard
Type filters go-ruleguard Type matching patterns NoVerify rules Simple type
expressions NoVerify vs Ruleguard
• NoVerify - static analyzer (linter) • phpgrep - structural
PHP search • phpgrep VS Code extension • Dynamic rules example • Dynamic rules for static analysis article • Ruleguard - dynamic rules for Go Links
Code -> Linter rules Pattern-based static analysis