Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Database Firewall from Scratch

Database Firewall from Scratch

Denis Kolegov

May 22, 2017
Tweet

More Decks by Denis Kolegov

Other Decks in Programming

Transcript

  1. ptsecurity.com
    Database Firewall From Scratch
    Arseny Reutov
    [email protected]
    Denis Kolegov
    [email protected]

    View Slide

  2. About us
    • Arseny Reutov
    • Head of application security research at Positive Technologies
    • Member of Positive Hack Days (https://phdays.com) conference board
    • Occasional web security blogger (https://raz0r.name)
    • Denis Kolegov
    • Team lead of Application Firewall research at Positive Technologies
    • PhD, associate professor at Tomsk State University
    • Web security micro blogger (https://twitter.com/dnkolegov)

    View Slide

  3. Outline
    • The stuff we are going to talk about is joint work of PT Application Firewall
    Research Team developing a database firewall prototype as a part of our application
    firewall
    • Thanks to
     Arseny Reutov
     Denis Kolegov
     Vladimir Kochetkov
     Igor Kanygin
     Nikolay Tkachenko
     Ivan Hudyashov
     Sergey Grechnev
     Sergey Reshetnikov

    View Slide

  4. Agenda
    • Intro
    • WAF and DBFW
    • Related Work
    • Our Prototype
    • Parser
    • Protectors
     Profiler
     Dejector
     SQLi
     Access Control
     IAM
    • Roadmap

    View Slide

  5. WAF vs DBFW

    View Slide

  6. What is WAF for most people?
    WAF is a blackbox system that applies a
    set of rules to an HTTP conversation

    View Slide

  7. What is WAF for most people?
    WAF protection is pattern
    matching

    View Slide

  8. What is WAF for most people?
    For a WAF web application is just a
    series of HTTP transactions

    View Slide

  9. Web Applications in 2017
    by Jeff Atwood @codinghorror

    View Slide

  10. WAF nowadays
    Web
    Application
    Firewall

    View Slide

  11. WAF nowadays
    Client-Side
    Firewall
    (waf.js)
    Web
    Application
    Firewall

    View Slide

  12. WAF nowadays
    Inspected Application
    Module (IAM)
    Client-Side
    Firewall
    (waf.js)
    Web
    Application
    Firewall

    View Slide

  13. WAF nowadays
    Inspected Application
    Module (IAM)
    Client-Side
    Firewall
    (waf.js)
    Database Firewall
    (DBFW)
    Web
    Application
    Firewall
    (WAF)

    View Slide

  14. What is Database Firewall?
    Database firewalls are a type of application firewalls which
    • Monitor database activity
    • Detect database specific attacks
    • Protect sensitive information stored in the databases
    • Implement adequate access control models

    View Slide

  15. Database Firewall Deployment
    Like WAFs database firewalls can be deployed
    • in proxy mode
    • in sniffer mode via a SPAN port (mirrored traffic)
    • as a host-based agent

    View Slide

  16. What Database Firewall Can Do?
    Database firewalls can do several actions on each query:
    • Pass
    • Log for monitoring purposes
    • Alert
    • Rewrite query
    • Block (either by dropping connection or by generating a native error code)

    View Slide

  17. WAF vs DBFW
    WAF
    XSS
    XXE
    RCE
    LFI
    SSRF
    IDOR
    CSRF
    Path Traversal
    Open Redirect
    Object Injection
    Session Fixation

    View Slide

  18. WAF vs DBFW
    WAF
    XSS
    XXE
    RCE
    LFI
    SSRF
    IDOR
    CSRF
    Path Traversal
    Open Redirect
    Object Injection
    Session Fixation

    DBFW
    Segregation of
    Duties
    Audit &
    Monitoring
    Sensitive Data
    Discovery

    View Slide

  19. WAF vs DBFW
    WAF
    XSS
    XXE
    RCE
    LFI
    SSRF
    IDOR
    CSRF
    Path Traversal
    Open Redirect
    Object Injection
    Session Fixation

    DBFW
    Segregation of
    Duties
    Audit &
    Monitoring
    Sensitive Data
    Discovery
    Access Control
    SQL Injection
    Buffer Overflow
    Data Leakage Prevention
    Data Masking

    View Slide

  20. Related Work

    View Slide

  21. SQL Injection Detection: Green SQL
    • Green SQL have been mod_security of DBFWs for many
    years, but open source project is no longer maintained
    • SQL Injection detection is based on risk score using metrics:
     SQL comments
     Sensitive tables
     OR token
     UNION token
     Variable comparison
     Always true expressions
     and more

    View Slide

  22. SQL Injection Detection: Machine Learning
    SOFIA: An Automated Security Oracle for Black-Box Testing of SQL-Injection Vulnerabilities

    View Slide

  23. SQL Injection Detection: Machine Learning
    • ”SOFIA is significantly more accurate than antiSQLi and GreenSQL and
    significantly faster than antiSQLi in classifying legitimate SQL
    statements and SQLi attacks.”
    • However, it takes lots of computing power to train the model since tree
    operations are time expensive
    • The algorithm is not tolerant to attacks during training

    View Slide

  24. Prototype Architecture

    View Slide

  25. Architecture
    • Core
     Go
     Python/Twisted
    • Parser
     ANTLR 4
     ANTLR Grammars-v4
     Python/C++
    • Storage
     Mongo
    • UI
     React
     Redux
     GraphQL

    View Slide

  26. DBFW Data Flow
    dbfwgo
    (1) xsql query
    (6) check
    result
    dbfwpython
    (2) request
    context
    Web
    server
    Parser
    DBMS
    (10) xsql result
    (7) response
    context
    (3) parser
    request
    (4) response
    (AST, tokens, etc.)
    (8) xsql query
    (9) xsql result
    Protector 1

    Protector N
    (5) query
    attributes

    View Slide

  27. Parser Design

    View Slide

  28. Naïve Approach
    Parser
    Syntax analysis Protectors
    Protector 1

    Protector N
    Concrete
    syntax tree
    mysql
    query

    View Slide

  29. Naïve Approach
    Syntax analysis Protectors
    Protector 1

    Protector N
    Concrete
    syntax tree
    Parser
    mysql
    Parser
    tsql
    Parser
    plsql
    query

    View Slide

  30. CST vs AST

    select 1
    CST AST

    View Slide

  31. SQL Parsing Flow
    Mysql parser
    AST
    tsqlParser
    plsqlParser
    Syntax analysis Semantic analysis

    Collector 1

    Collector N
    CST AST
    Abstract syntax
    analysis
    query

    View Slide

  32. SQL Parsing Implementation
    ANTLR4 xsql parser
    Code
    generator
    xsql grammar
    SQL AST
    specification
    Parser’s
    classes
    AST Node
    classes
    query
    xsql visitor
    CST
    AST

    View Slide

  33. MySQL Parsing Time Results
    Python 2.7
    ~ 0.3 / 0.001 sec
    ~ 0.4 / 0.001 sec
    ~ 1.5 / 0.003 sec
    ~ 1.6 / 0.004 sec
    C++ 11
    ~ 0.01 / 0.00008 sec
    ~ 0.01 / 0.00029 sec
    ~ 0.06 / 0.0001 sec
    ~ 0.06 / 0.004 sec
    Test
    Minimal query* parsing
    Minimal query full parsing
    Heavy query** parsing
    Heavy query full parsing
    ~ 983 sec
    ~ 2 KB/sec
    ~ 48 sec
    ~ 40 KB/sec
    Bitrix queries file (2 MB)
    ~ 432 sec
    ~ 420 query/sec
    ~ 16 sec
    ~ 11000 query/sec
    Wordpress queries (181540)
    * Minimal query: select 1
    ** Heavy query: select * from (((((((select col1 from t1) as ttt))))))

    View Slide

  34. Profiler

    View Slide

  35. Profiler
    • SQL profiler is a basic protection mechanism implemented in all database firewalls
    • It works like linting utilities or linters (e.g. eslint, pylint, cppint, etc.), but analyses
    SQL queries and check if they satisfy security policy (SQL profile)
    • The main goal is to prevent using of SQLi automatic tools and exploits
    • SQL profile can be
     Static: created by manual configuration
     Dynamic: created by source code analysis tools

    View Slide

  36. Example of SQL Profile Rules
    Options
    ["log", 20]
    Rule
    no-union
    ["block", 150]
    max-query-length
    ["block", 6]
    max-columns
    ["block", 0]
    max-union
    ["block", "benchmark", "md5", "if"]
    function-blacklist
    ["block"]
    entity-length
    "off"
    no-subqueries
    ["block"]
    no-hex
    ["block"]
    no-stacked-queries
    ["block"]
    no-comments
    ["block"]
    no-os-commands

    View Slide

  37. Dejector

    View Slide

  38. AST Example
    Dejector is a context-free parse tree validation approach to preventing SQL Injection,
    proposed by Hansen and Patterson in 2005
    Given a set of known-good queries and the base formal grammar, Dejector builds a
    new subgrammar that contains only the rules required to produce exactly the queries
    in the known-good set
    Strings recognized by the subgrammar are guaranteed to be structurally identical to
    those in the known-good set
    The subgrammar is then used with a parser generator such as bison or ANTLR to
    produce a recognizer for the sublanguage

    View Slide

  39. ANTLR-based Dejector
    xsql Parser
    Compiler
    (3) Trusted
    xsql queries
    Aggregator
    (4) CST
    list
    (6) Union CST
    (7) Grammar
    subxsql.g4
    ANTLR
    (8) subxsql
    parser
    (1) xsql.g4
    (2) xsql
    Parser
    (5) Dejector
    mode
    Application
    Security
    language
    mechanism
    Developer

    View Slide

  40. Strict Mode
    a
    b c
    f
    h
    g
    i j
    l
    k
    m
    UCST
    ANTLR v4 grammar:
    a: b | c | pd ;
    c: f;
    f: g | h;
    g: i | j;
    i: l;

    p d

    View Slide

  41. Strict Mode
    a
    b c
    f
    d
    h
    g
    i j
    l
    k
    m
    a
    b
    f
    d
    UCST CST

    View Slide

  42. Strict Mode
    a
    b c
    f
    h
    g
    i j
    l
    k
    m
    b1
    f
    d1
    new UCST
    ANTLR v4 grammar:
    a: b | c | pd | b1 | d1 ;
    b1: f;
    c: f;
    f: g | h;
    g: i | j;
    i: l;

    p d

    View Slide

  43. Strict Dejector Parsing Time Results
    Python 2.7 MySQL
    ~ 0.643 / 0.0019 sec
    ~ 0.67 / 0.002 sec
    ~ 0.33 / 0.003 sec
    ~ 0.32 / 0.009 sec
    Python 2.7 SubMySQL
    ~ 0.09 / 0.0011 sec
    ~ 0.102 / 0.0011 sec
    ~ 0.09 / 0.001 sec
    ~ 0.18 / 0.005 sec
    Test
    SELECT * FROM a WHERE b='c'
    SELECT * FROM a WHERE b BETWEEN 'c' AND 'd'
    INSERT INTO passbook VALUES('a','b','c','d','e','f','g','h')
    CREATE TABLE a (b int(5) AUTO_INCREMENT, c date, d
    VARCHAR(255), e VARCHAR(255), f VARCHAR(255), g int(10), h
    int(10), i float(10,2), j VARCHAR(255), PRIMARY KEY (b))
    ~ 1.54 / 0.003 sec ~ 0.09 / 0.001 sec
    SELECT * FROM (((((((SELECT col1 FROM t1) AS ttt))))))*
    * Query can not be derived in SubMySQL grammar

    View Slide

  44. HTTP & SQL Correlation

    View Slide

  45. WAF + DBFW
    • Suppose that we have both WAF and DBFW deployed:
    Client WAF Web Server DBFW Database

    View Slide

  46. HTTP & SQL Correlation
    In order to correlate SQL queries with HTTP packets a host-based module can be
    deployed on the web server which will append session cookie into each SQL query in
    a comment section

    View Slide

  47. HTTP & SQL Correlation
    • When these modified queries reach DBFW it can look up those
    session identifiers in the database shared with WAF
    • WAF holding access control policy for web users acts as
    information point, i.e. it provides user information given a
    session cookie
    • DBFW serves as enforcement point, effectively blocking or
    allowing queries

    View Slide

  48. HTTP & SQL Correlation
    • What if we do not have a chance to deploy a host module (agent)?
    • We can still try to correlate HTTP and SQL using time-throttled request processing
    • Idea is that we process HTTP requests synchronously, observe emitted SQL queries, and associate
    them with HTTP requests

    View Slide

  49. SQL Injection Detection
    • Using host-based agent we can effectively detect SQL Injections
    • Agent injects into an SQL comment data about HTTP parameters that were
    observed when executing SQL query

    View Slide

  50. SQLi Detection Approach
    • DBFW replaces each occurrence of HTTP parameter value
    found in SQL query with a constant
    • Then it tries to parse and get tokens firstly for the original query
    and then for the second one with replaced constants
    • If a number of tokens is different, an SQL Injection is reported
    since constant replacement have caused changes in the query
    structure

    View Slide

  51. AST-based Detection
    • A better approach is to compare ASTs instead
    • After traversal of the ASTs, if differences are found, an SQL Injection is
    reported because constant replacement have caused changes in AST

    View Slide

  52. It decreases number of false positives. Does this mechanism decrease
    false negatives too?
    One of bypasses for owasp-modsecurity-crs found by Ivan Novikov
    It is not detected by libinjection too due to the context issue
    From Theory to Practice
    curl 'localhost/index.html?id=1%20or%20true'
    1%20or%20true
    id=1.or-id
    id=.1or-UTC_DATE—
    )-sleep(9999
    sleep(9999)
    */UNION SELECT password FROM users--

    View Slide

  53. Show Me Impact
    Michael Stepankin. Advanced Web Application Fuzzing

    View Slide

  54. SQLi Detection Example
    GET /app/?id=1%20or%20true HTTP/1.1
    Host: example.com
    Connection: close
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    Accept-Encoding: gzip, deflate, sdch
    Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4

    View Slide

  55. SQLi Detection Example
    GET /app/?id=1%20or%20true HTTP/1.1
    Host: example.com
    Connection: close
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    Accept-Encoding: gzip, deflate, sdch
    Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4
    /*{"get_args":[{"id":"1 or true"}]}*/ select * from users where clientid = 1 or true

    View Slide

  56. SQLi Detection Example
    GET /app/?id=1%20or%20true HTTP/1.1
    Host: example.com
    Connection: close
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    Accept-Encoding: gzip, deflate, sdch
    Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4
    /*{"get_args":[{"id":"1 or true"}]}*/ select * from users where clientid = 1 or true
    select * from users where clientid = ""
    select * from users where clientid = 1 or true

    View Slide

  57. SQLi Detection Example
    select * from users where clientid = ""
    select * from users where clientid = 1 or true

    View Slide

  58. SQLi Detection Example
    select * from users where clientid = ""
    select * from users where clientid = 1 or true
    select
    *
    from
    users
    where
    id
    =
    ""
    Lexems
    select
    *
    from
    users
    where
    id
    =
    1
    or
    true
    Lexems

    View Slide

  59. SQLi Detection Example
    select * from users where clientid = ""
    select * from users where clientid = 1 or true
    select
    *
    from
    users
    where
    id
    =
    ""
    Lexems
    select
    *
    from
    users
    where
    id
    =
    1
    or
    true
    Lexems
    8 ≠ 10

    View Slide

  60. SQLi Detection Example
    select * from users where id = ""
    select * from users where id = 1 or true
    select
    *
    from
    users
    where
    clientid
    =
    ""
    Lexems
    select
    *
    from
    users
    where
    clientid
    =
    1
    or
    true
    Lexems
    8 ≠ 10

    View Slide

  61. Access Control

    View Slide

  62. Access Control
    • All types of application firewalls should have access control mechanisms
    • The main statement of any access policy: All entities must be identified
    • Entities identification in account-based systems: at least it is necessary to identify
    web application subjects (users) that initiate queries to DBMS
    • Approaches
     Many-to-many applications
     HTTP and SQL user tracking
     RASP
    • Angine - ABAC eNgine

    View Slide

  63. XACML Data Flow Model

    View Slide

  64. Angine Data Flow Model
    Subjects,
    resources,
    environment
    PAP
    (6) attribute query
    (1) ALFAScript
    policy
    (5) request context
    (12) response
    (4) request
    Compiler
    Access
    requester
    PDP
    (2) Lua code
    Context
    handler
    PEP
    (3) access
    request
    PIP
    (9) attribute
    (7) attribute
    query
    (11) response context
    (10) attributes
    (8) attributes
    Generator Data
    structures
    Language
    ALFAScript
    IDL policy
    spec

    View Slide

  65. ALFAScript Policy Example
    interface Entity {
    abstract id: String;
    }
    interface SQLEntity <: Entity {
    database: String;
    table: String;
    column: String;
    level: Number;
    tags: [String];
    }
    interface Subject <: Entity {
    name: String;
    level: Number;
    tags: [String];
    }

    View Slide

  66. ALFAScript Policy Example
    namespace example {
    export policy Main {
    target clause action == "select“
    apply denyUnlessPermit
    rule mls {
    permit
    target clause subject.level > entity.level
    }
    rule rbac {
    permit
    condition rbacCheckRoles() == true
    }
    }
    }

    View Slide

  67. Generated Classes
    class Entity(object):
    def __init__(self):
    super(Entity, self).__init__()
    @abstractproperty
    def id(self):
    pass
    class SQLEntity(Entity):
    def __init__(self, database=None, table=None, column=None, level=None, tags=None):
    super(SQLEntity, self).__init__()
    self.database = database
    self.table = table
    self.column = column
    self.level = level
    self.tags = tags
    class Subject(Entity):
    def __init__(self, name=None, level=None, tags=None):
    super(Subject, self).__init__()
    self.name = name
    self.level = level
    self.tags = tags

    View Slide

  68. Angine Example
    from Angine.policy import Policy
    from Angine.pip import PIP
    from Angine.pdp import PDP
    def pep():
    ...
    request = get_request(network)
    policy = Policy(alfa_mysql_policy)
    pip = PIP.init_data(mongo_connection)
    pdp = PDP(policy.get_lua_policy())
    ctx = pip.create_ctx(request)
    response = pdp.evaluate(ctx)
    if response["result"]["decision"] != "permit":
    return None
    else:
    return process(request)

    View Slide

  69. Angine Results
    • ALFAScript IDL to runtime language code generator
    • ALFAScript language
    • ALFAScript to Lua transcompiler
    • PDP and PIP implementations for runtime language
    • Common parsers (HTTP, mysql, tsql)

    View Slide

  70. Roadmap
    • Host agents for C#, Java
    • ANTLR-based C++ parser
    • Release MySQL grammar for ANTLR4
    • PT Application Firewall integration
    • SQL user tracking
    • Machine learning for sensitive data discovery
    • Inspected Application Module for DBFW

    View Slide

  71. Thank you!
    ptsecurity.com

    View Slide

  72. Inspected Application Module

    View Slide

  73. Inspected Application Module
    Vladimir Kochetkov. Do WAFs dream of static analyzers?
    Peculiarities
     Web-only IAM can not process non HTTP attack vectors
     There are some cases when CompFG is not adequate to detect attacks
    • Loops, recursion
    • Internal and external dependencies
    The idea is to build SQL profile based on application code, compile it to binary module
    and run on the DBFW
    This approach can be used to detect second order SQL injection attacks

    View Slide

  74. Inspected Application Module Flow
    Application
    Inspector
    Deployed
    application
    Configuration
    WAF IAM WAF Front-end
    Web client
    (2) HTTP request context
    Application
    CompFG
    (5) HTTP response context
    (11) SQL response context
    Computation flow
    model
    (8) SQL request context
    DBMS
    (7) SQL
    request
    (6) HTTP
    request
    (12) SQL
    request
    (13) SQL
    response
    (14) SQL
    response
    (15) HTTP
    response
    (16) HTTP
    response (1) HTTP
    request
    Compiled
    binary
    module
    Source code
    Application Firewall

    View Slide

  75. SQL IAM Example

    $sql = "select * from data where id= intval(".$_POST["id"])
    $result = mysql_query($sql, $connection)
    $row = mysql_fetch_row($result)
    $sql = "select * from data where fname=' ".$row[2]. " ' "

    View Slide

  76. SQL IAM Example

    $sql = "select * from data where id= intval(".$_POST["id"])
    $result = mysql_query($sql, $connection)
    $row = mysql_fetch_row($result)
    $sql = "select * from data where fname=' ".$row[2]. " ' "
    Untrusted data read from database.
    What if fname is ' or '1' = '1 ?
    Second order SQL injection

    View Slide

  77. SQL IAM Example

    $sql = "select * from data where id= intval(".$_POST["id"])
    $result = mysql_query($sql, $connection)
    $row = mysql_fetch_row($result)
    $sql = "select * from data where fname=' ".$row[2]. " ' "
    The main SQL injection feature:
    a number of tokens more that one

    View Slide

  78. SQL IAM Example

    $sql = "select * from data where id= intval(".$_POST["id"])
    $result = mysql_query($sql, $connection)
    $row = mysql_fetch_row($result)
    $sql = "select * from data where fname=' ".$row[2]. " ' "
    (concat "select * from data where fname=" (
    concat (index-access row 2) "'"))

    View Slide

  79. SQL IAM Example

    $sql = "select * from data where id= intval(".$_POST["id"])
    $result = mysql_query($sql, $connection)
    $row = mysql_fetch_row($result)
    $sql = "select * from data where fname=' ".$row[2]. " ' "
    (concat "select * from data where fname=" (
    concat (index-access row 2) "'"))
    (call mysql_fetch_row (call mysql_query (concat
    "select * from data where id=intval(" (concat (index-
    access POST, "id") ")")) connection))

    View Slide

  80. SQL IAM Example

    $sql = "select * from data where id= intval(".$_POST["id"])
    $result = mysql_query($sql, $connection)
    $row = mysql_fetch_row($result)
    $sql = "select * from data where fname=' ".$row[2]. " ' "
    № Query hash Index Tokens
    1 87248237482347 [(28,-1)] 1
    2 13475837458758 [(32,-1)] 1

    View Slide

  81. SQLi IAM Example
    GET /app/?id=1000 HTTP/1.1
    Host: example.com
    Connection: close
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    Accept-Encoding: gzip, deflate, sdch
    Accept-Language: ru-RU,ru;q=0.8,en-US;q=0.6,en;q=0.4
    select * from data where id=1000
    select * from data where fname='john' or '1'='1'
    1 ≠ 2

    View Slide