Database Firewall from Scratch

Slide 1

Slide 1 text

ptsecurity.ru Database Firewall from Scratch ptsecurity.com Arseny Reutov [email protected] @ru_raz0r Denis Kolegov [email protected] @dnkolegov

Slide 2

Slide 2 text

About us • Arseny Reutov • Head of application security research at Positive Technologies • Member of Positive Hack Days (https://phdays.com) conference board • Occasional web security blogger (https://raz0r.name) • Denis Kolegov • Team lead of Application Firewall research at Positive Technologies • PhD, associate professor at Tomsk State University • Web security micro blogger (https://twitter.com/dnkolegov)

Slide 3

Slide 3 text

Outline • The stuff we are going to talk about is joint work of PT Application Firewall Research Team developing a database firewall prototype as a part of our application firewall • Thanks to  Arseny Reutov  Denis Kolegov  Vladimir Kochetkov  Igor Kanygin  Nikolay Tkachenko  Ivan Hudyashov  Sergey Grechnev  Sergey Reshetnikov

Slide 4

Slide 4 text

Agenda • Intro • WAF and DBFW • Related Work • Our Prototype • Parser • Protectors  Profiler  Dejector  SQLi  Access Control  IAM • Roadmap

Slide 5

Slide 5 text

WAF vs DBFW

Slide 6

Slide 6 text

What is WAF for most people? WAF is a blackbox system that applies a set of rules to an HTTP conversation

Slide 7

Slide 7 text

What is WAF for most people? WAF protection is pattern matching

Slide 8

Slide 8 text

What is WAF for most people? For a WAF web application is just a series of HTTP transactions

Slide 9

Slide 9 text

Web Applications in 2017 by Jeff Atwood @codinghorror

Slide 10

Slide 10 text

WAF nowadays Web Application Firewall

Slide 11

Slide 11 text

WAF nowadays Client-Side Firewall (waf.js) Web Application Firewall

Slide 12

Slide 12 text

WAF nowadays Inspected Application Module (IAM) Client-Side Firewall (waf.js) Web Application Firewall

Slide 13

Slide 13 text

WAF nowadays Inspected Application Module (IAM) Client-Side Firewall (waf.js) Database Firewall (DBFW) Web Application Firewall (WAF)

Slide 14

Slide 14 text

What is Database Firewall? Database firewalls are a type of application firewalls which • Monitor database activity • Detect database specific attacks • Protect sensitive information stored in the databases • Implement adequate access control models

Slide 15

Slide 15 text

Database Firewall Deployment Like WAFs database firewalls can be deployed • in proxy mode • in sniffer mode via a SPAN port (mirrored traffic) • as a host-based agent

Slide 16

Slide 16 text

What Database Firewall Can Do? Database firewalls can do several actions on each query: • Pass • Log for monitoring purposes • Alert • Rewrite query • Block (either by dropping connection or by generating a native error code)

Slide 17

Slide 17 text

WAF vs DBFW WAF XSS XXE RCE LFI SSRF IDOR CSRF Path Traversal Open Redirect Object Injection Session Fixation …

Slide 18

Slide 18 text

WAF vs DBFW WAF XSS XXE RCE LFI SSRF IDOR CSRF Path Traversal Open Redirect Object Injection Session Fixation … DBFW Segregation of Duties Audit & Monitoring Sensitive Data Discovery

Slide 19

Slide 19 text

WAF vs DBFW WAF XSS XXE RCE LFI SSRF IDOR CSRF Path Traversal Open Redirect Object Injection Session Fixation … DBFW Segregation of Duties Audit & Monitoring Sensitive Data Discovery Access Control SQL Injection Buffer Overflow Data Leakage Prevention Data Masking

Slide 20

Slide 20 text

Related Work

Slide 21

Slide 21 text

SQL Injection Detection: Green SQL • Green SQL have been mod_security of DBFWs for many years, but open source project is no longer maintained • SQL Injection detection is based on risk score using metrics:  SQL comments  Sensitive tables  OR token  UNION token  Variable comparison  Always true expressions  and more

Slide 22

Slide 22 text

SQL Injection Detection: Machine Learning SOFIA: An Automated Security Oracle for Black-Box Testing of SQL-Injection Vulnerabilities

Slide 23

Slide 23 text

SQL Injection Detection: Machine Learning • ”SOFIA is significantly more accurate than antiSQLi and GreenSQL and significantly faster than antiSQLi in classifying legitimate SQL statements and SQLi attacks.” • However, it takes lots of computing power to train the model since tree operations are time expensive • The algorithm is not tolerant to attacks during training

Slide 24

Slide 24 text

Prototype Architecture

Slide 25

Slide 25 text

Architecture • Core  Go  Python/Twisted • Parser  ANTLR 4  ANTLR Grammars-v4  Python/C++ • Storage  Mongo • UI  React  Redux  GraphQL

Slide 26

Slide 26 text

DBFW Data Flow dbfwgo (1) xsql query (6) check result dbfwpython (2) request context Web server Parser DBMS (10) xsql result (7) response context (3) parser request (4) response (AST, tokens, etc.) (8) xsql query (9) xsql result Protector 1 … Protector N (5) query attributes

Slide 27

Slide 27 text

Parser Design

Slide 28

Slide 28 text

Naïve Approach Parser Syntax analysis Protectors Protector 1 … Protector N Concrete syntax tree mysql query

Slide 29

Slide 29 text

Naïve Approach Syntax analysis Protectors Protector 1 … Protector N Concrete syntax tree Parser mysql Parser tsql Parser plsql query

Slide 30

Slide 30 text

CST vs AST … select 1 CST AST

Slide 31

Slide 31 text

SQL Parsing Flow Mysql parser AST tsqlParser plsqlParser Syntax analysis Semantic analysis … Collector 1 … Collector N CST AST Abstract syntax analysis query

Slide 32

Slide 32 text

SQL Parsing Implementation ANTLR4 xsql parser Code generator xsql grammar SQL AST specification Parser’s classes AST Node classes query xsql visitor CST AST

Slide 33

Slide 33 text

MySQL Parsing Time Results Python 2.7 ~ 0.3 / 0.001 sec ~ 0.4 / 0.001 sec ~ 1.5 / 0.003 sec ~ 1.6 / 0.004 sec C++ 11 ~ 0.01 / 0.00008 sec ~ 0.01 / 0.00029 sec ~ 0.06 / 0.0001 sec ~ 0.06 / 0.004 sec Test Minimal query* parsing Minimal query full parsing Heavy query** parsing Heavy query full parsing ~ 983 sec ~ 2 KB/sec ~ 48 sec ~ 40 KB/sec Bitrix queries file (2 MB) ~ 432 sec ~ 420 query/sec ~ 16 sec ~ 11000 query/sec Wordpress queries (181540) * Minimal query: select 1 ** Heavy query: select * from (((((((select col1 from t1) as ttt))))))

Slide 34

Slide 34 text

Profiler

Slide 35

Slide 35 text

Profiler • SQL profiler is a basic protection mechanism implemented in all database firewalls • It works like linting utilities or linters (e.g. eslint, pylint, cppint, etc.), but analyses SQL queries and check if they satisfy security policy (SQL profile) • The main goal is to prevent using of SQLi automatic tools and exploits • SQL profile can be  Static: created by manual configuration  Dynamic: created by source code analysis tools

Slide 36

Slide 36 text

Example of SQL Profile Rules Options ["log", 20] Rule no-union ["block", 150] max-query-length ["block", 6] max-columns ["block", 0] max-union ["block", "benchmark", "md5", "if"] function-blacklist ["block"] entity-length "off" no-subqueries ["block"] no-hex ["block"] no-stacked-queries ["block"] no-comments ["block"] no-os-commands

Slide 37

Slide 37 text

Dejector

Slide 38

Slide 38 text

AST Example Dejector is a context-free parse tree validation approach to preventing SQL Injection, proposed by Hansen and Patterson in 2005 Given a set of known-good queries and the base formal grammar, Dejector builds a new subgrammar that contains only the rules required to produce exactly the queries in the known-good set Strings recognized by the subgrammar are guaranteed to be structurally identical to those in the known-good set The subgrammar is then used with a parser generator such as bison or ANTLR to produce a recognizer for the sublanguage

Slide 39

Slide 39 text

ANTLR-based Dejector xsql Parser Compiler (3) Trusted xsql queries Aggregator (4) CST list (6) Union CST (7) Grammar subxsql.g4 ANTLR (8) subxsql parser (1) xsql.g4 (2) xsql Parser (5) Dejector mode Application Security language mechanism Developer

Slide 40

Slide 40 text

Strict Mode a b c f h g i j l k m UCST ANTLR v4 grammar: a: b | c | pd ; c: f; f: g | h; g: i | j; i: l; … p d

Slide 41

Slide 41 text

Strict Mode a b c f d h g i j l k m a b f d UCST CST

Slide 42

Slide 42 text

Strict Mode a b c f h g i j l k m b1 f d1 new UCST ANTLR v4 grammar: a: b | c | pd | b1 | d1 ; b1: f; c: f; f: g | h; g: i | j; i: l; … p d

Slide 43

Slide 43 text

Strict Dejector Parsing Time Results Python 2.7 MySQL ~ 0.643 / 0.0019 sec ~ 0.67 / 0.002 sec ~ 0.33 / 0.003 sec ~ 0.32 / 0.009 sec Python 2.7 SubMySQL ~ 0.09 / 0.0011 sec ~ 0.102 / 0.0011 sec ~ 0.09 / 0.001 sec ~ 0.18 / 0.005 sec Test SELECT * FROM a WHERE b='c' SELECT * FROM a WHERE b BETWEEN 'c' AND 'd' INSERT INTO passbook VALUES('a','b','c','d','e','f','g','h') CREATE TABLE a (b int(5) AUTO_INCREMENT, c date, d VARCHAR(255), e VARCHAR(255), f VARCHAR(255), g int(10), h int(10), i float(10,2), j VARCHAR(255), PRIMARY KEY (b)) ~ 1.54 / 0.003 sec ~ 0.09 / 0.001 sec SELECT * FROM (((((((SELECT col1 FROM t1) AS ttt))))))* * Query can not be derived in SubMySQL grammar

Slide 44

Slide 44 text

HTTP & SQL Correlation

Slide 45

Slide 45 text

WAF + DBFW • Suppose that we have both WAF and DBFW deployed: Client WAF Web Server DBFW Database

Slide 46

Slide 46 text

HTTP & SQL Correlation In order to correlate SQL queries with HTTP packets a host-based module can be deployed on the web server which will append session cookie into each SQL query in a comment section

Slide 47

Slide 47 text

HTTP & SQL Correlation • When these modified queries reach DBFW it can look up those session identifiers in the database shared with WAF • WAF holding access control policy for web users acts as information point, i.e. it provides user information given a session cookie • DBFW serves as enforcement point, effectively blocking or allowing queries

Slide 48

Slide 48 text

HTTP & SQL Correlation • What if we do not have a chance to deploy a host module (agent)? • We can still try to correlate HTTP and SQL using time-throttled request processing • Idea is that we process HTTP requests synchronously, observe emitted SQL queries, and associate them with HTTP requests

Slide 49

Slide 49 text

SQL Injection Detection • Using host-based agent we can effectively detect SQL Injections • Agent injects into an SQL comment data about HTTP parameters that were observed when executing SQL query

Slide 50

Slide 50 text

SQLi Detection Approach • DBFW replaces each occurrence of HTTP parameter value found in SQL query with a constant • Then it tries to parse and get tokens firstly for the original query and then for the second one with replaced constants • If a number of tokens is different, an SQL Injection is reported since constant replacement have caused changes in the query structure

Slide 51

Slide 51 text

AST-based Detection • A better approach is to compare ASTs instead • After traversal of the ASTs, if differences are found, an SQL Injection is reported because constant replacement have caused changes in AST

Slide 52

Slide 52 text

It decreases number of false positives. Does this mechanism decrease false negatives too? One of bypasses for owasp-modsecurity-crs found by Ivan Novikov It is not detected by libinjection too due to the context issue From Theory to Practice curl 'localhost/index.html?id=1%20or%20true' 1%20or%20true id=1.or-id id=.1or-UTC_DATE— )-sleep(9999 sleep(9999) */UNION SELECT password FROM users--

Slide 53

Slide 53 text

Show Me Impact Michael Stepankin. Advanced Web Application Fuzzing

Slide 54

Slide 54 text

Slide 55

Slide 55 text

SQLi Detection Example GET /app/?id=1%20or%20true HTTP/1.1 Host: example.com Connection: close Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8 Accept-Encoding: gzip, deflate, sdch Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4 /*{"get_args":[{"id":"1 or true"}]}*/ select * from users where clientid = 1 or true

Slide 56

Slide 56 text

Slide 57

Slide 57 text

SQLi Detection Example select * from users where clientid = "" select * from users where clientid = 1 or true

Slide 58

Slide 58 text

SQLi Detection Example select * from users where clientid = "" select * from users where clientid = 1 or true select * from users where id = "" Lexems select * from users where id = 1 or true Lexems

Slide 59

Slide 59 text

SQLi Detection Example select * from users where clientid = "" select * from users where clientid = 1 or true select * from users where id = "" Lexems select * from users where id = 1 or true Lexems 8 ≠ 10

Slide 60

Slide 60 text

SQLi Detection Example select * from users where id = "" select * from users where id = 1 or true select * from users where clientid = "" Lexems select * from users where clientid = 1 or true Lexems 8 ≠ 10

Slide 61

Slide 61 text

Access Control

Slide 62

Slide 62 text

Access Control • All types of application firewalls should have access control mechanisms • The main statement of any access policy: All entities must be identified • Entities identification in account-based systems: at least it is necessary to identify web application subjects (users) that initiate queries to DBMS • Approaches  Many-to-many applications  HTTP and SQL user tracking  RASP • Angine - ABAC eNgine

Slide 63

Slide 63 text

XACML Data Flow Model

Slide 64

Slide 64 text

Angine Data Flow Model Subjects, resources, environment PAP (6) attribute query (1) ALFAScript policy (5) request context (12) response (4) request Compiler Access requester PDP (2) Lua code Context handler PEP (3) access request PIP (9) attribute (7) attribute query (11) response context (10) attributes (8) attributes Generator Data structures Language ALFAScript IDL policy spec

Slide 65

Slide 65 text

ALFAScript Policy Example interface Entity { abstract id: String; } interface SQLEntity <: Entity { database: String; table: String; column: String; level: Number; tags: [String]; } interface Subject <: Entity { name: String; level: Number; tags: [String]; }

Slide 66

Slide 66 text

ALFAScript Policy Example namespace example { export policy Main { target clause action == "select“ apply denyUnlessPermit rule mls { permit target clause subject.level > entity.level } rule rbac { permit condition rbacCheckRoles() == true } } }

Slide 67

Slide 67 text

Generated Classes class Entity(object): def __init__(self): super(Entity, self).__init__() @abstractproperty def id(self): pass class SQLEntity(Entity): def __init__(self, database=None, table=None, column=None, level=None, tags=None): super(SQLEntity, self).__init__() self.database = database self.table = table self.column = column self.level = level self.tags = tags class Subject(Entity): def __init__(self, name=None, level=None, tags=None): super(Subject, self).__init__() self.name = name self.level = level self.tags = tags

Slide 68

Slide 68 text

Angine Example from Angine.policy import Policy from Angine.pip import PIP from Angine.pdp import PDP def pep(): ... request = get_request(network) policy = Policy(alfa_mysql_policy) pip = PIP.init_data(mongo_connection) pdp = PDP(policy.get_lua_policy()) ctx = pip.create_ctx(request) response = pdp.evaluate(ctx) if response["result"]["decision"] != "permit": return None else: return process(request)

Slide 69

Slide 69 text

Angine Results • ALFAScript IDL to runtime language code generator • ALFAScript language • ALFAScript to Lua transcompiler • PDP and PIP implementations for runtime language • Common parsers (HTTP, mysql, tsql)

Slide 70

Slide 70 text

Roadmap • Host agents for C#, Java • ANTLR-based C++ parser • Release MySQL grammar for ANTLR4 • PT Application Firewall integration • SQL user tracking • Machine learning for sensitive data discovery • Inspected Application Module for DBFW

Slide 71

Slide 71 text

Thank you! ptsecurity.com

Slide 72

Slide 72 text

Inspected Application Module

Slide 73

Slide 73 text

Inspected Application Module Vladimir Kochetkov. Do WAFs dream of static analyzers? Peculiarities  Web-only IAM can not process non HTTP attack vectors  There are some cases when CompFG is not adequate to detect attacks • Loops, recursion • Internal and external dependencies The idea is to build SQL profile based on application code, compile it to binary module and run on the DBFW This approach can be used to detect second order SQL injection attacks

Slide 74

Slide 74 text

Inspected Application Module Flow Application Inspector Deployed application Configuration WAF IAM WAF Front-end Web client (2) HTTP request context Application CompFG (5) HTTP response context (11) SQL response context Computation flow model (8) SQL request context DBMS (7) SQL request (6) HTTP request (12) SQL request (13) SQL response (14) SQL response (15) HTTP response (16) HTTP response (1) HTTP request Compiled binary module Source code Application Firewall

Slide 75

Slide 75 text

Slide 76

Slide 76 text

SQL IAM Example … $sql = "select * from data where id= intval(".$_POST["id"]) $result = mysql_query($sql, $connection) $row = mysql_fetch_row($result) $sql = "select * from data where fname=' ".$row[2]. " ' " Untrusted data read from database. What if fname is ' or '1' = '1 ? Second order SQL injection

Slide 77

Slide 77 text

Slide 78

Slide 78 text

Slide 79

Slide 79 text

SQL IAM Example … $sql = "select * from data where id= intval(".$_POST["id"]) $result = mysql_query($sql, $connection) $row = mysql_fetch_row($result) $sql = "select * from data where fname=' ".$row[2]. " ' " (concat "select * from data where fname=" ( concat (index-access row 2) "'")) (call mysql_fetch_row (call mysql_query (concat "select * from data where id=intval(" (concat (index- access POST, "id") ")")) connection))

Slide 80

Slide 80 text

Slide 81

Slide 81 text

SQLi IAM Example GET /app/?id=1000 HTTP/1.1 Host: example.com Connection: close Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8 Accept-Encoding: gzip, deflate, sdch Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4 select * from data where id=1000 select * from data where fname='john' or '1'='1' 1 ≠ 2