How to Choose (Or write) your own source code scanner #appsecapac2014

OWASP Japan
March 20, 2014

Transcript

  1. How to Choose (Or write) your own source code scanner
     Yu-Lu "Chris" Liu
  2. About Me
     • Chris Liu
     • Security team member at Rakuten, Inc.
     • Pentester / Auditor
     • Malware Analyst
  3. The purpose of this talk
     • Is NOT to discourage you from trying to use a source code scanner
     • Is TO SHARE my perspectives when choosing a commercial solution
     • Is TO SHARE how to use available solutions to build your own scanner
  4. It all started when
     • We decided to evaluate some commercial scanners
     • So we reached out, and got several replies
  5. What we hoped
     • Have a solution that scans different languages
     • Integrate it into the Continuous Integration server
     • Have code scanned before it ever comes to us
     • Make our life easier
  6. So we started talking
     • Talked and started evaluating several solutions in actual projects; I was assigned 2 of them
     • Compared their results against the countermeasures we currently have
  7. Scan Targets
     • Scanned PHP, Java, Ruby…
       – 80% md5 issues, and maybe 19.9% others, resulting in false positives
     • Not much support for mobile apps
       – However, most solutions have functionality that allows us to do some customization
  8. After customizing it
     • After some customization, the results for mobile apps became better
     • But all I did was increase the keywords to look for; more sophisticated checks could be customized, but that also steepens our learning curve
       – Kind of makes me feel like just writing a grep script
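The "just writing a grep script" remark on slide 8 can be made concrete. The following is a minimal sketch in Python of keyword-based scanning, not the configuration of any commercial product. The rule names, the `KEYWORDS` table, and the sample input are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical keyword table; a real rule set would be much longer.
KEYWORDS = {
    "weak-hash": re.compile(r"\bmd5\s*\("),
    "eval-modifier": re.compile(r"preg_replace\s*\(\s*'/.*/[a-z]*e[a-z]*'"),
}


def grep_scan(source):
    """Return (line_number, rule_name, line) for every keyword hit."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        for name, pattern in KEYWORDS.items():
            if pattern.search(line):
                findings.append((number, name, line.strip()))
    return findings


# Hypothetical sample input, loosely based on the snippets in this talk.
SAMPLE = """<?php
$hash = md5($password);
// md5() is mentioned in this comment only
$string = preg_replace('/(.*)/e', 'strtoupper($1)', $string);
?>"""

for finding in grep_scan(SAMPLE):
    print(finding)
```

The comment on line 3 of the sample is reported as well, because a text search cannot tell code from comments. That is one reason the later slides move to a syntax tree.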
  9. Additional problems
     • Some PHP apps bundled with a template engine had fairly simple XSS that yielded zero detections
     • Suddenly we realized that framework support is also crucial
  10. Further inquiries imply
     • Major frameworks like Symfony are supported, but not others
       – Especially frameworks that are mostly used in Japan, like Seasar2 or the Smarty template engine
  11. To sum things up
     • There are a lot of challenges
       – High False Positive/Negative Rate
       – Framework support
       – Customization issues
       – Need more support for mobile applications
       – And maybe more …
  12. One last try though
     • Looking back at the high false positive/negative rate, maybe it is because of the small sample space
     • So I just googled 2 random XSS from phpMyAdmin with a CVE number assigned
       – CVE-2008-4775
       – CVE-2013-1937
  13. Results first
     • They could not find either of the XSS specified in the CVEs
     • But wait, maybe it is only because the source code of phpMyAdmin is too complicated already
       – So I decided to scan just the code snippets for both XSS
  14. So let's simplify things
     • CVE-2008-4775: echo of user input assigned into PHP's $GLOBALS variable
         <?php
         $GLOBALS['db'] = $_REQUEST['db'];
         echo $db;
         ?>
  15. And also for CVE-2013-1937
     • It's about echoing user input saved into an array
         <?php
         $visualizationSettings = array();
         $visualizationSettings = $_REQUEST['visualizationSettings'];
         echo $visualizationSettings['width'];
         ?>
  16. One last straw
     • So far I had spent time on just XSS, so what about one of the most critical vulnerabilities, code injection?
     • And this time I would go directly to scanning just a code snippet
  17. How about just eval in preg_replace ?
     • Although the eval feature is deprecated and thus should not be used, it is still everywhere and should be considered a serious threat
         <?php
         $string = $_REQUEST['foo'];
         $string = preg_replace('/(.*)/e', 'strtoupper($1)', $string);
         ?>
  18. What we have so far
     • What I mentioned
       – False Positive/Negative Rate
       – Framework support issues
       – Customization issues
       – Need more support for mobile applications
       – Can't detect simple patterns of XSS
       – No support for detecting the eval modifier
     • What I didn't mention
       – Code execution flow analysis is prone to mistakes
     • Probably more
  19. What I had learned
     • Scanners have more problems than we had expected, and will probably never be perfect
     • To understand more about it, I decided to write my own scanner and see
       – I targeted Android because that's what I was mostly doing at that time
  20. But then how do we write a code scanner?
     • Strategies
       – Grep ?
       – Build our own tree and parse through it ?
       – Write our own compiler ?
     • Or just take a look at what people are doing in the open source community
  21. And I ran across PMD
     • It's basically a Java scanner, but hey, Android apps are basically written in Java
     • Custom rulesets can be created using a W3C-defined XML Path Language called XPath.
  22. So what is XPath ?
     • A query language to select nodes from an XML document. It also has the ability to compute or compare values found in the XML document.
  23. A little tut on XPath here
     • Nodename : Selects all nodes with the name "Nodename"
     • / : Selects from the root node
     • // : Selects nodes from the current node no matter where they are
     • . : Selects the current node
     • .. : Selects the parent of the current node
     • [] : Predicates, used to find a specific node or a node that contains a specific value
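To try the XPath building blocks from slide 23, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It implements only a subset of XPath 1.0, so treat it as an approximation of a full XPath engine. The `<library>` wrapper and the `<price>` elements are assumptions added so the sample is a single well-formed document.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, wrapped in one root so it is well-formed.
DOC = """<library>
  <book><title>Hello</title><price>10</price></book>
  <book><title>World</title><price>99</price></book>
</library>"""

root = ET.fromstring(DOC)

# Nodename: direct children of the current node named "book".
books = root.findall("book")

# // : descendants at any depth. ElementTree needs the leading "." and
# rejects absolute paths that start with "/" when called on an element.
titles = [t.text for t in root.findall(".//title")]

# . : the current node itself.
same_node = root.find(".")

# .. : the parent of each selected node, here the <book> around each <price>.
parents = root.findall(".//price/..")

# [] : a predicate, here "a book whose title child has the text Hello".
hello_books = root.findall(".//book[title='Hello']")

print(len(books), titles, same_node.tag, len(parents), len(hello_books))
```

The predicate form `book[title='Hello']` returns the matching nodes. An expression such as `//title="Hello"` is a comparison, which a full XPath engine evaluates to true or false.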
  24. A simple example
     • Given the following XML, find the node with title of "Hello" with XPath
         <book>
             <title>Hello</title>
         </book>
         <book>
             <title>World</title>
         </book>
     • /book/title="Hello"
     • //title="Hello"
  25. A simple example with predicate
     • Find the book titled "Hello World", whose price is over 90
         <book>
             <title>Hello World</title>
             <price>10</price>
         </book>
         <book>
             <title>Hello World</title>
             <price>99</price>
         </book>
     • /book[.//price>90]/title="Hello World"
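The predicate example from slide 25 can be approximated in Python. This is a sketch under two assumptions: the `<books>` wrapper element is added to make the sample well-formed, and the numeric comparison is done in Python because the XPath subset in the standard `xml.etree.ElementTree` does not support `>` inside a predicate. `expensive_books` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The two books from the slide, wrapped in an assumed <books> root.
DOC = """<books>
  <book><title>Hello World</title><price>10</price></book>
  <book><title>Hello World</title><price>99</price></book>
</books>"""


def expensive_books(xml_text, title, min_price):
    """Return <book> elements with the given title and price above min_price."""
    root = ET.fromstring(xml_text)
    # The title test uses an XPath predicate. Titles that contain a quote
    # character would need escaping, which this sketch does not handle.
    candidates = root.findall(f"./book[title='{title}']")
    # The price test is plain Python, standing in for [.//price>90].
    return [b for b in candidates if float(b.findtext("price")) > min_price]


matches = expensive_books(DOC, "Hello World", 90)
print([b.findtext("price") for b in matches])
```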
  26. Ok…, XPath searches for node, so what?
     • The PMD Eclipse plugin uses the XML-represented AST (Abstract Syntax Tree) generated by Eclipse from the source code
     • It scans the AST with rulesets written in XPath :D
  27. Abstract Syntax Tree
     • A tree representation of the abstract syntactic structure of source code written in a programming language
     • Can be represented in XML
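To see what "an AST represented in XML" can look like, here is a minimal sketch that uses Python's own `ast` module as a stand-in parser. It is not the output format of PMD or PHP-Parser. The element naming scheme and the `to_xml` helper are assumptions made for illustration, and the sketch assumes Python 3.8 or later, where string literals are `Constant` nodes.

```python
import ast
import xml.etree.ElementTree as ET


def to_xml(node):
    """Convert a Python ast node into an ElementTree element, recursively."""
    element = ET.Element(type(node).__name__)
    for field, value in ast.iter_fields(node):
        children = value if isinstance(value, list) else [value]
        for child in children:
            if isinstance(child, ast.AST):
                element.append(to_xml(child))
            elif child is not None:
                # Scalars (identifiers, constants) become leaf elements
                # named after the field that holds them.
                leaf = ET.SubElement(element, field)
                leaf.text = str(child)
    return element


tree = to_xml(ast.parse('print("Hello World")'))
print(ET.tostring(tree, encoding="unicode"))

# Once the tree is XML, path queries work on it like on any other document.
called_names = [name.findtext("id") for name in tree.findall(".//Call/Name")]
print(called_names)
```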
  28. But this is nothing new
     • At AppSec in 2013, there was already a talk about using the PMD plugin
       – "Teaching an old dog new tricks securing development with PMD"
       – From the good people of Gotham Digital Science
  29. Android app, here I come
     • Let's use the PMD plugin !!
     • So I started studying XPath, and created several dozen different rules
       – SSL certificate checks
       – Insecure storage
       – And more ..
     • 50% of the time could be saved in actual projects, while still finding most of the issues I found when doing it manually
  30. But of course manual testing can't be avoided
     • Some dynamically generated content still needs to be checked, which I solved by writing some Python scripts
     • Won't discuss this here
  31. After the experience in Android
     • I am now certain that if we can have an XML-formatted AST structure, we can XPath it. And that goes for every language.
  32. Some examples include
     • Java: Just use PMD
     • PHP: PHP-Parser by nikic + XPath
     • Ruby: parser by whitequark + XPath
     • JavaScript: Esprima + XPath
     • I decided to take on PHP, because I wanted to know how hard it is to scan for the 2 XSS I mentioned
  33. This is what the AST looks like in PHP-Parser
     • <?php echo "Hello World"; ?>
         <AST>
             <node:Stmt_Echo>
                 <scalar:string>Hello World</scalar:string>
             </node:Stmt_Echo>
         </AST>
  34. A very simple XSS
     • <?php echo $_GET['foo']; ?>
         <AST>
             <node:Stmt_Echo>
                 <scalar:string>_GET</scalar:string>
                 <scalar:string>foo</scalar:string>
             </node:Stmt_Echo>
         </AST>
  35. A very simple XSS
     • //node:Stmt_Echo//scalar:string="_GET"
         <AST>
             <node:Stmt_Echo>
                 <scalar:string>_GET</scalar:string>
                 <scalar:string>foo</scalar:string>
             </node:Stmt_Echo>
         </AST>
  36. A very simple XSS
     • What if we want to check whether htmlspecialchars() is used or not?
     • <?php echo htmlspecialchars($_GET['foo']); ?>
         <node:Stmt_Echo>
             <node:Expr_FuncCall>
                 <scalar:string>htmlspecialchars</scalar:string>
                 <scalar:string>_GET</scalar:string>
                 <scalar:string>foo</scalar:string>
             </node:Expr_FuncCall>
         </node:Stmt_Echo>
  37. A very simple XSS
     • //node:Stmt_Echo[not(.//scalar:string="htmlspecialchars")]//scalar:string="_GET"
     • <?php echo htmlspecialchars($_GET['foo']); ?>
         <node:Stmt_Echo>
             <node:Expr_FuncCall>
                 <scalar:string>htmlspecialchars</scalar:string>
                 <scalar:string>_GET</scalar:string>
                 <scalar:string>foo</scalar:string>
             </node:Expr_FuncCall>
         </node:Stmt_Echo>
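The rule from slides 35 to 37 (an echo that mentions `_GET` but not `htmlspecialchars`) can be sketched in Python. The standard `xml.etree.ElementTree` has no `not()` function, so the negation is written in plain Python. The namespace URIs below are placeholders: the slides show the `node:` and `scalar:` prefixes but not their declarations, and `unsanitized_echoes` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumption), needed so the prefixes parse.
NS = {"node": "urn:example:node", "scalar": "urn:example:scalar"}

# The two echo statements from the slides: one raw, one sanitized.
AST_XML = """<AST xmlns:node="urn:example:node" xmlns:scalar="urn:example:scalar">
  <node:Stmt_Echo>
    <scalar:string>_GET</scalar:string>
    <scalar:string>foo</scalar:string>
  </node:Stmt_Echo>
  <node:Stmt_Echo>
    <node:Expr_FuncCall>
      <scalar:string>htmlspecialchars</scalar:string>
      <scalar:string>_GET</scalar:string>
      <scalar:string>foo</scalar:string>
    </node:Expr_FuncCall>
  </node:Stmt_Echo>
</AST>"""


def unsanitized_echoes(xml_text):
    """Return echo statements that use $_GET without htmlspecialchars()."""
    root = ET.fromstring(xml_text)
    findings = []
    for echo in root.findall(".//node:Stmt_Echo", NS):
        strings = [s.text for s in echo.findall(".//scalar:string", NS)]
        # Same idea as [not(.//scalar:string="htmlspecialchars")].
        if "_GET" in strings and "htmlspecialchars" not in strings:
            findings.append(echo)
    return findings


print(len(unsanitized_echoes(AST_XML)))
```

Like the XPath rule, this only checks that the function name appears somewhere inside the echo statement, not that it actually wraps the user input.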
  38. So first let's tackle CVE-2008-4775
     • Remember the code that echoes user input stored as a global variable without sanitization?
         <?php
         $GLOBALS['db'] = $_REQUEST['db'];
         echo $db;
         ?>
  39. Here is the AST for that snippet
         <node:Expr_Assign>
             <scalar:string>GLOBALS</scalar:string>
             <scalar:string>db</scalar:string>
             <scalar:string>_REQUEST</scalar:string>
             <scalar:string>db</scalar:string>
         </node:Expr_Assign>
         <node:Stmt_Echo>
             <scalar:string>db</scalar:string>
         </node:Stmt_Echo>
  40. XPath strategy
     • 1. We know that $db got assigned before it got echo-ed
         //node:Stmt_Echo//scalar:string=.//scalar:string
  41. XPath strategy
     • 2. Next, the variable $db was assigned from $_REQUEST, and was assigned into the $GLOBALS variable
         //node:Stmt_Echo
         [.//scalar:string=preceding-sibling::node:*
           [.//scalar:string="GLOBALS"][.//scalar:string="_REQUEST"]
           //scalar:string]
         //scalar:string=.//scalar:string
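The two-step strategy for CVE-2008-4775 can be restated in Python as a way to check the logic of the XPath rule. This is a sketch, not the author's scanner. The namespace URIs are placeholders, the `<AST>` wrapper is an assumption, and the `preceding-sibling` axis (absent from the standard `xml.etree.ElementTree`) is emulated by walking the earlier top-level statements. `strings_in` and `tainted_global_echoes` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumption); the slides do not show them.
NS = {"node": "urn:example:node", "scalar": "urn:example:scalar"}
ECHO_TAG = "{urn:example:node}Stmt_Echo"

AST_XML = """<AST xmlns:node="urn:example:node" xmlns:scalar="urn:example:scalar">
  <node:Expr_Assign>
    <scalar:string>GLOBALS</scalar:string>
    <scalar:string>db</scalar:string>
    <scalar:string>_REQUEST</scalar:string>
    <scalar:string>db</scalar:string>
  </node:Expr_Assign>
  <node:Stmt_Echo>
    <scalar:string>db</scalar:string>
  </node:Stmt_Echo>
</AST>"""


def strings_in(element):
    """Collect the text of every scalar:string below an element."""
    return {s.text for s in element.findall(".//scalar:string", NS) if s.text}


def tainted_global_echoes(xml_text):
    """Return the names echoed after a $GLOBALS assignment from $_REQUEST."""
    statements = list(ET.fromstring(xml_text))
    findings = []
    for index, statement in enumerate(statements):
        if statement.tag != ECHO_TAG:
            continue
        echoed = strings_in(statement)
        # statements[:index] plays the role of preceding-sibling::node:*.
        for earlier in statements[:index]:
            assigned = strings_in(earlier)
            if {"GLOBALS", "_REQUEST"} <= assigned and echoed & assigned:
                findings.append(sorted(echoed & assigned))
                break
    return findings


print(tainted_global_echoes(AST_XML))
```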
  42. Also CVE-2013-1937
     • This one is about echoing user input saved into an array
         <?php
         $visualizationSettings = array();
         $visualizationSettings = $_REQUEST['visualizationSettings'];
         echo $visualizationSettings['width'];
         ?>
  43. Here is the AST for that snippet
         <node:Expr_Assign>
             <scalar:string>visualizationSettings</scalar:string>
             <node:Expr_ArrayDimFetch>
                 <scalar:string>_REQUEST</scalar:string>
                 <scalar:string>visualizationSettings</scalar:string>
             </node:Expr_ArrayDimFetch>
         </node:Expr_Assign>
         <node:Stmt_Echo>
             <scalar:string>visualizationSettings</scalar:string>
             <scalar:string>width</scalar:string>
         </node:Stmt_Echo>
  44. XPath strategy
     • 1. Variable got echo-ed without sanitization, just like the previous example
         //node:Stmt_Echo//scalar:string=.//scalar:string
  45. XPath strategy
     • 2. The variable comes from the $_REQUEST variable
         //node:Stmt_Echo
         [.//scalar:string=preceding-sibling::node:*[.//scalar:string="_REQUEST"]//scalar:string]
         //scalar:string=.//scalar:string
  46. Of course, the eval modifier
         //node:Expr_FuncCall
         [.//scalar:string[substring(., string-length(.)-1)="/e"]]
         [.//scalar:string=//node:*[.//scalar:string="_REQUEST"]//scalar:string]
         //scalar:string="preg_replace"
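The eval-modifier rule combines three conditions: the call is `preg_replace`, one of its strings ends in `/e`, and one of its strings also appears in a node that mentions `_REQUEST`. Here is a Python sketch of the same logic. The talk does not show the AST for this snippet, so the XML below is a hypothetical guess in the same simplified style as the earlier slides. The namespace URIs are placeholders too, and `eval_modifier_calls` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumption).
NS = {"node": "urn:example:node", "scalar": "urn:example:scalar"}
NODE_PREFIX = "{urn:example:node}"

# Hypothetical AST for the preg_replace snippet on slide 17.
AST_XML = """<AST xmlns:node="urn:example:node" xmlns:scalar="urn:example:scalar">
  <node:Expr_Assign>
    <scalar:string>string</scalar:string>
    <scalar:string>_REQUEST</scalar:string>
    <scalar:string>foo</scalar:string>
  </node:Expr_Assign>
  <node:Expr_Assign>
    <scalar:string>string</scalar:string>
    <node:Expr_FuncCall>
      <scalar:string>preg_replace</scalar:string>
      <scalar:string>/(.*)/e</scalar:string>
      <scalar:string>strtoupper($1)</scalar:string>
      <scalar:string>string</scalar:string>
    </node:Expr_FuncCall>
  </node:Expr_Assign>
</AST>"""


def strings_in(element):
    """Collect the text of every scalar:string below an element."""
    return {s.text for s in element.findall(".//scalar:string", NS) if s.text}


def eval_modifier_calls(xml_text):
    """Return preg_replace calls with a /e pattern and a request-derived name."""
    root = ET.fromstring(xml_text)
    # Every string that shares a node:* element with "_REQUEST".
    tainted = set()
    for element in root.iter():
        if element.tag.startswith(NODE_PREFIX):
            found = strings_in(element)
            if "_REQUEST" in found:
                tainted |= found
    findings = []
    for call in root.findall(".//node:Expr_FuncCall", NS):
        found = strings_in(call)
        ends_with_e = any(text.endswith("/e") for text in found)
        if "preg_replace" in found and ends_with_e and found & tainted:
            findings.append(call)
    return findings


print(len(eval_modifier_calls(AST_XML)))
```

As with the XPath version, only patterns ending exactly in `/e` are matched, so combined modifiers such as `/ie` would need an extra check.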
  47. The designer
     • To aid XPath writing, I also coded a simple web page called the designer
  48. So, end of story?
     • We have succeeded in the following
       – Detecting the 2 CVEs, and the simple code injection
       – Writing rules in a W3C-defined language
         • No need to learn a new set of APIs even for other languages like Java or Ruby
       – Almost no cost, hey it is all online
  49. Or is it !?
     • Actually no! Of course there are constraints in the current solution
       – Specificity vs coverage
       – Can't handle dynamically generated content, like the file path when calling include/require
       – Framework support
       – Template engine support
         • Smarty: scan the files in the templates_c folder
       – The AST largely depends on how good the parser is
       – Well, and no execution flow charts
       – Maybe more …
  50. But hey, defense in depth !!
     • Although whitebox scanners are probably never going to be perfect, they should still be a part of your countermeasures