Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
PHP code->rules
Search
Iskander (Alex) Sharipov
October 24, 2020
Programming
1
44
PHP code->rules
Iskander (Alex) Sharipov
October 24, 2020
Tweet
Share
More Decks by Iskander (Alex) Sharipov
See All by Iskander (Alex) Sharipov
quasigo
quasilyte
0
23
Go gamedev: XM music
quasilyte
0
81
Zero alloc pathfinding
quasilyte
0
400
Mycelium
quasilyte
0
42
Roboden game pitch
quasilyte
0
160
Ebitengine Ecosystem Overview
quasilyte
1
710
Go gamedev patterns
quasilyte
0
410
profile-guided code analysis
quasilyte
0
320
Go inlining
quasilyte
0
98
Other Decks in Programming
See All in Programming
Simple組み合わせ村から大都会Railsにやってきた俺は / Coming to Rails from the Simple
moznion
3
3.5k
為你自己學 Python
eddie
0
540
PHPとAPI Platformで作る本格的なWeb APIアプリケーション(入門編) / phpcon 2024 Intro to API Platform
ttskch
0
410
Pythonでもちょっとリッチな見た目のアプリを設計してみる
ueponx
0
110
Amazon Bedrock Multi Agentsを試してきた
tm2
1
190
DMMオンラインサロンアプリのSwift化
hayatan
0
230
AWSマネコンに複数のアカウントで入れるようになりました
yuhta28
2
150
ASP.NET Core の OpenAPIサポート
h455h1
0
150
AWS Lambda functions with C# 用の Dev Container Template を作ってみた件
mappie_kochi
0
210
Fixstars高速化コンテスト2024準優勝解法
eijirou
0
200
定理証明プラットフォーム lapisla.net
abap34
1
640
ゼロからの、レトロゲームエンジンの作り方
tokujiros
3
1.1k
Featured
See All Featured
Visualizing Your Data: Incorporating Mongo into Loggly Infrastructure
mongodb
44
9.4k
Reflections from 52 weeks, 52 projects
jeffersonlam
348
20k
Documentation Writing (for coders)
carmenintech
67
4.6k
Designing for humans not robots
tammielis
250
25k
Rebuilding a faster, lazier Slack
samanthasiow
79
8.8k
Code Reviewing Like a Champion
maltzj
521
39k
[RailsConf 2023] Rails as a piece of cake
palkan
53
5.2k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
98
18k
Being A Developer After 40
akosma
89
590k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
365
25k
Making Projects Easy
brettharned
116
6k
Intergalactic Javascript Robots from Outer Space
tanoku
270
27k
Transcript
Code -> Linter rules Pattern-based static analysis
Right into the action!
$last = $a[count($a)]; Step 1: find the bad code example
Off-by-one mistake
Step 2: extract it as a pattern $last = $a[count($a)];
That’s our pattern!
Step 3: apply the pattern
phpgrep by examples
@$_ Find all usages of error suppress operator 4.7s /
6kk SLOC / 56 Cores
in_array($x, [$y]) Find in_array calls that can be replaced with
$x == $y 4.6s / 6kk SLOC / 56 Cores
$x ? true : false Find all ternary expressions that
could be replaced by just $x 4.7s / 6kk SLOC / 56 Cores
$_ == null null == $_ Find all non-strict comparisons
with null 4.5s / 6kk SLOC / 56 Cores
for ($_ == $_; $_; $_) $_ Find for loops
where == is used instead of = inside init clause 4.6s / 6kk SLOC / 56 Cores
Just like Semgrep?
None
Semgrep NoVerify+phpgrep
• A brief phpgrep history Main topics for today
• A brief phpgrep history • NoVerify dynamic rules Main
topics for today
• A brief phpgrep history • NoVerify dynamic rules •
AST pattern matching Main topics for today
• A brief phpgrep history • NoVerify dynamic rules •
AST pattern matching • Running rules efficiently Main topics for today
• A brief phpgrep history • NoVerify dynamic rules •
AST pattern matching • Running rules efficiently • Dynamic rules pros & cons Main topics for today
phpgrep history
gogrep
gogrep gogrep is cool!
gogrep phpgrep
phpgrep CLI phpgrep lib php-parser A
phpgrep CLI phpgrep lib NoVerify php-parser A php-parser B Incompatible
AST types :(
phpgrep CLI phpgrep lib NoVerify phpgrep lib fork php-parser A
php-parser B
phpgrep CLI NoVerify phpgrep lib fork php-parser B
NoVerify dynamic rules
Concepts overview phpgrep noverify dynamic rules Structural PHP search using
AST patterns
Concepts overview phpgrep noverify dynamic rules PHP linter capable of
running dynamic rules
Concepts overview phpgrep noverify dynamic rules NoVerify format for the
phpgrep-style rules
Concepts overview phpgrep noverify dynamic rules Written in
• Types info (NoVerify type inference) Dynamic rules vs phpgrep
• Types info (NoVerify type inference) • Efficient multi-pattern execution
Dynamic rules vs phpgrep
• Types info (NoVerify type inference) • Efficient multi-pattern execution
• Logical pattern grouping Dynamic rules vs phpgrep
• Types info (NoVerify type inference) • Efficient multi-pattern execution
• Logical pattern grouping • Documentation mechanisms Dynamic rules vs phpgrep
noverify PHP file PHP file PHP file rules1 rules2
rules2 noverify PHP file PHP file PHP file Dynamic rules
are loaded rules1
noverify PHP file PHP file PHP file Then files are
analyzed rules2 rules1
Dynamic rule example function ternarySimplify() { /** @warning rewrite as
$x ?: $y */ $x ? $x : $y; }
Dynamic rule example function ternarySimplify() { /** @warning rewrite as
$x ?: $y */ $x ? $x : $y; } Dynamic rules group name
Dynamic rule example function ternarySimplify() { /** @warning rewrite as
$x ?: $y */ $x ? $x : $y; } Warning message
Dynamic rule example function ternarySimplify() { /** @warning rewrite as
$x ?: $y */ $x ? $x : $y; } phpgrep pattern
Is this transformation safe? f() ? f() : 0 =>
f() ?: 0
Is this transformation safe? f() ? f() : 0 =>
f() ?: 0 Only if f() is free of side effects
Dynamic rule example (extended) function ternarySimplify() { /** * @warning
rewrite as $x ?: $y * @pure $x */ $x ? $x : $y; }
Dynamic rule example (extended) function ternarySimplify() { /** * @warning
rewrite as $x ?: $y * @pure $x */ $x ? $x : $y; } $x should be side effect free
Dynamic rule example (extended) function ternarySimplify() { /** * @warning
rewrite as $x ?: $y * @pure $x * @fix $x ?: $y */ $x ? $x : $y; } auto fix action for NoVerify
Dynamic rule example (@comment) /** * @comment Find ternary expr
that can be simplified * @before $x ? $x : $y * @after $x ?: $y */ function ternarySimplify() { // ...as before } Dynamic rule documentation
function argsOrder() { /** @warning suspicious args order */ any:
{ str_replace($_, $_, ${"char"}, ${"*"}); str_replace($_, $_, "", ${"*"}); } }
function argsOrder() { /** @warning suspicious args order */ any:
{ str_replace($_, $_, ${"char"}, ${"*"}); str_replace($_, $_, "", ${"*"}); } } “any” pattern grouping
function bitwiseOps() { /** * @warning maybe && is intended?
* @fix $x && $y * @type bool $x * @type bool $y */ $x & $y; }
function bitwiseOps() { /** * @warning maybe && is intended?
* @fix $x && $y * @type bool $x * @type bool $y */ $x & $y; } Type filters
T T typed expression object Arbitrary object type T[] Array
of T-typed elements !T Any type except T !(A|B) Any type except A and B ?T Same as (T|null) Type matching examples
function stringCmp() { /** * @warning compare strings with ===
* @fix $x === $y * @type string $x * @or * @type string $y */ $x == $y; }
function stringCmp() { /** * @warning compare strings with ===
* @fix $x === $y * @type string $x * @or * @type string $y */ $x == $y; } Or-connected constraints
1. Create a rules file 2. Run NoVerify with -rules
flag How to run custom rules $ noverify -rules rules.php target
AST pattern matching
“$x = $x” pattern string
“$x = $x” pattern string Parsed AST
“$x = $x” pattern string Parsed AST Modified AST (with
meta nodes)
function match(Node $pat, Node $n) $pat is a compiled pattern
$n is a node being matched Matching AST
• Both $pat and $n are traversed • Non-meta nodes
are compared normally • $pat meta nodes are separate cases • Named matches are collected (capture) Algorithm
• $x is a simple “match any” named match •
$_ is a “match any” unnamed match • ${"str"} matches string literals • ${"str:x"} is a capturing form of ${"str"} • ${"*"} matches zero or more nodes Valid PHP Syntax! Meta node examples
$_ = ${"str"} matches $foo->x = "abc"; $x = '';
$_ = ${"str"} rejects $foo->x = f(); $x = $y;
f() matches f() F() Unless explicitly marked as case-sensitive
new T() matches new T() new t() Unless explicitly marked
as case-sensitive
Pattern matching = $x $x += $a 10 Pattern $x=$x
Target $a+=10
Pattern matching = $x $x += $a 10 Pattern $x=$x
Target $a+=10
Pattern matching = $x $x = $a 10 Pattern $x=$x
Target $a=10
Pattern matching = $x $x = $a 10 Pattern $x=$x
Target $a=10
Pattern matching = $x $x = $a 10 Pattern $x=$x
Target $a=10 $x is bound to $a
Pattern matching = $x $x = $a 10 Pattern $x=$x
Target $a=10 $a != 10
Pattern matching = $x $x = $a $a Pattern $x=$x
Target $a=$a
Pattern matching = $x $x = $a $a Pattern $x=$x
Target $a=$a
Pattern matching = $x $x = $a $a Pattern $x=$x
Target $a=$a $x is bound to $a
Pattern matching = $x $x = $a $a Pattern $x=$x
Target $a=$a $a = $a, pattern matched
Trying to make pattern matching work faster...
“$x = $x” pattern string Parsed AST Modified AST
“$x = $x” pattern string Parsed AST Polish notation +
stack
Stack-based matching = $a $a Pattern $x=$x Target $a=$a Instructions
Stack <Assign> = <NamedAny x> <NamedAny x>
Stack-based matching = $a $a Pattern $x=$x Target $a=$a Instructions
Stack <Assign> $a <NamedAny x> $a <NamedAny x>
Stack-based matching = $a $a Pattern $x=$x Target $a=$a Instructions
Stack <Assign> $a <NamedAny x> <NamedAny x>
Stack-based matching = $a $a Pattern $x=$x Target $a=$a Instructions
Stack <Assign> <NamedAny x> <NamedAny x>
• 2-4 times faster matching • No AST types dependency
• More optimization opportunities Stack-based matching
Running rules efficiently
Imagine that we have a lot of rules... rule-1 ...
rule-N PHP file PHP file
Imagine that we have a lot of rules... rule-1 ...
rule-N PHP file PHP file
Imagine that we have a lot of rules... rule-1 ...
rule-N PHP file PHP file
Imagine that we have a lot of rules... rule-1 ...
rule-N PHP file PHP file N * M problem
• AST is traversed only once • For every node,
run only relevant rules We can tune the matching engine to work very fast N*M cure: categorized rules
rule PHP file ... Assign rule ... TernaryExpr
rule PHP file ... Assign rule ... TernaryExpr Node categories
rule PHP file ... Assign rule ... TernaryExpr Categorized rules
• Local: run rules only inside functions • Root: run
rules only inside global scope • Universal: run rules everywhere Extra registry layer: scopes
rule PHP file ... Assign rule ... TernaryExpr Global scope
rule ... Assign rule ... TernaryExpr Local scope
rule PHP file ... Assign rule ... TernaryExpr Global scope
rule ... Assign rule ... TernaryExpr Local scope Scoped group
• Expression can’t contain a statement • Some statements are
top-level only We don’t use this knowledge right now. Extra registry layer: expr vs stmt
If any rule from a group matched, all other rules
inside the group are skipped for the current node. • Helps to avoid matching conflicts • Improves performance Group cutoff
// input: $a[0] = $a[0] + 1 function assignOp() {
/** @fix ++$x */ $x = $x + 1; /** @fix $x += $y */ $x = $x + $y; }
// input: $a[0] = $a[0] + 1 function assignOp() {
/** @fix ++$x */ $x = $x + 1; /** @fix $x += $y */ $x = $x + $y; } Matched, ++$a[0] suggested
// input: $a[0] = $a[0] + 1 function assignOp() {
/** @fix ++$x */ $x = $x + 1; /** @fix $x += $y */ $x = $x + $y; } Skipped
Dynamic rules pros & cons
• No need to re-compile NoVerify Dynamic rules advantages
• No need to re-compile NoVerify • Simple things are
simple Dynamic rules advantages
• No need to re-compile NoVerify • Simple things are
simple • No Go coding required Dynamic rules advantages
• No need to re-compile NoVerify • Simple things are
simple • No Go coding required • Rules are declarative Dynamic rules advantages
• No need to re-compile NoVerify • Simple things are
simple • No Go coding required • Rules are declarative • No need to know linter internals Dynamic rules advantages
• Not very composable • Too verbose for non-trivial cases
• Hard to get the autocompletion working PHPDoc-based attributes
• Hard to express flow-based rules • PHP syntax limitations
• Recursive block search is problematic AST pattern limitations
Comparison with Ruleguard
None
Rule group name
gogrep pattern
Type filter
Auto fix action
None
Target language go-ruleguard Go NoVerify rules PHP NoVerify vs Ruleguard
DSL core go-ruleguard Fluent API DSL NoVerify rules Top-level patterns
+ PHPDoc NoVerify vs Ruleguard
Filtering mechanism go-ruleguard Go expressions NoVerify rules PHPDoc annotations NoVerify
vs Ruleguard
Type filters go-ruleguard Type matching patterns NoVerify rules Simple type
expressions NoVerify vs Ruleguard
• NoVerify - static analyzer (linter) • phpgrep - structural
PHP search • phpgrep VS Code extension • Dynamic rules example • Dynamic rules for static analysis article • Ruleguard - dynamic rules for Go Links
Code -> Linter rules Pattern-based static analysis