OWASP Japan
March 20, 2014

How to Choose (Or write) your own source code scanner #appsecapac2014

Transcript

  1. How to Choose (Or write) your own source code scanner
     Yu-Lu "Chris" Liu
  2. About Me
     • Chris Liu
     • Security team member at Rakuten, Inc.
     • Pentester / Auditor
     • Malware Analyst
  3. The purpose of this talk
     • Is NOT to discourage you from trying to use a source code scanner
     • Is TO SHARE my perspectives on choosing a commercial solution
     • Is TO SHARE how to use available solutions to build your own scanner
  4. It all started when
     • We decided to evaluate some commercial scanners
     • So we reached out, and got several replies
  5. What we hoped
     • Have a solution that scans different languages
     • Integrate it into the Continuous Integration server
     • Have code scanned before it ever comes to us
     • Make our life easier
  6. So we started talking
     • Talked and started evaluating several solutions on actual projects, two of which were assigned to me
     • Compared their results against the countermeasures we currently have
  7. Scan Targets
     • Scanned PHP, Java, Ruby…
       – 80% MD5 issues and maybe 19.9% others, resulting in false positives
     • Not much support for mobile apps
       – However, most solutions have functionality that allows some customization
  8. After customizing it
     • After some customization, the results for mobile apps became better
     • But all I did was increase the number of keywords to look for. More sophisticated checks could be customized, but that also steepens our learning curve
       – Kind of makes me feel like I am just writing a grep script
  9. Additional problems
     • A PHP app bundled with a template engine had some fairly simple XSS that yielded zero detections
     • Suddenly we realized that framework support is also crucial
  10. Further inquiries implied
      • Major frameworks like Symfony are supported, but not others
        – Especially frameworks that are mostly used in Japan, like Seasar2, or the Smarty template engine
  11. To sum things up
      • There are a lot of challenges
        – High false positive/negative rate
        – Framework support
        – Customization issues
        – Need more support for mobile applications
        – And maybe more …
  12. One last try though
      • Looking back at the high false positive/negative rate, maybe it was because of the small sample
      • So I googled two random phpMyAdmin XSS issues with CVE numbers assigned
        – CVE-2008-4775
        – CVE-2013-1937
  13. Results first
      • The scanners could not find either of the XSS issues specified in the CVEs
      • But wait, maybe that is only because the phpMyAdmin source code is too complicated
        – So I decided to scan just the code snippets for both XSS issues
  14. So let's simplify things
      • CVE-2008-4775: echo of user input assigned into PHP's $GLOBALS variable
          <?php
          $GLOBALS['db'] = $_REQUEST['db'];
          echo $db;
          ?>
  15. And also for CVE-2013-1937
      • It's about echoing user input saved into an array
          <?php
          $visualizationSettings = array();
          $visualizationSettings = $_REQUEST['visualizationSettings'];
          echo $visualizationSettings['width'];
          ?>
  16. One last straw
      • So far I had spent time only on XSS, so what about one of the most critical vulnerabilities, code injection?
      • This time I went directly to scanning just a code snippet
  17. How about just eval in preg_replace?
      • Although the eval feature is deprecated and should not be used, it is still everywhere and should be considered a serious threat
          <?php
          $string = $_REQUEST['foo'];
          $string = preg_replace('/(.*)/e', 'strtoupper($1)', $string);
          ?>
  18. What we have so far
      • What I mentioned
        – False positive/negative rate
        – Framework support issues
        – Customization issues
        – Need more support for mobile applications
        – Can't detect simple patterns of XSS
        – No support for detecting the eval modifier
      • What I didn't mention
        – Code execution flows are prone to mistakes
      • Probably more
  19. What I had learned
      • Scanners have more problems than we had expected, and will probably never be perfect
      • To understand more about them, I decided to write my own scanner and see
        – I targeted Android because that's what I was mostly working on at that time
  20. But then how do we write a code scanner?
      • Strategies
        – Grep?
        – Build our own tree and parse through it?
        – Write our own compiler?
      • Or just take a look at what people are doing in the open source community
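For comparison, the "Grep?" strategy can be sketched in a few lines. This is only an illustration: the keyword list and the sample Java lines are assumptions made up here, not rules or code from the talk. Note how a plain keyword match also flags a comment, which is one way false positives creep in.

```python
import re

# Illustrative keyword list (an assumption, not the rules used in the talk).
KEYWORDS = ["ALLOW_ALL_HOSTNAME_VERIFIER", "MODE_WORLD_READABLE"]


def grep_source(text, keywords=KEYWORDS):
    """Return (line_number, keyword) pairs for every keyword occurrence."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords))
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            findings.append((lineno, match.group(0)))
    return findings


# Hypothetical Android-style source used only to exercise the function.
sample = (
    "SSLSocketFactory sf = new SSLSocketFactory(ks);\n"
    "sf.setHostnameVerifier(SSLSocketFactory.ALLOW_ALL_HOSTNAME_VERIFIER);\n"
    "// ALLOW_ALL_HOSTNAME_VERIFIER is only mentioned in this comment\n"
)
print(grep_source(sample))
```

A text match has no idea whether the keyword sits in code, a comment, or a string literal, which is the motivation for working on a parsed tree instead.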
  21. And I ran across PMD
      • It's basically a Java scanner, but hey, Android apps are basically written in Java
      • Custom rulesets can be created using XPath, the W3C-defined XML Path Language
  22. So what is XPath?
      • A query language to select nodes from an XML document. It also has the ability to compute or compare values that the XML document contains.
  23. A little tutorial on XPath here
      • Nodename : Selects all nodes with the name "Nodename"
      • / : Selects from the root node
      • // : Selects nodes from the current node no matter where they are
      • . : Selects the current node
      • .. : Selects the parent of the current node
      • [] : Predicates, used to find a specific node or a node that contains a specific value
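The operators listed above can be tried out with Python's standard `xml.etree.ElementTree`, which implements a limited subset of XPath. This is a minimal sketch: the `<shelf>` document is a made-up sample, and ElementTree wants relative paths (a leading `.`) rather than a bare `/`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names are made up for illustration.
doc = ET.fromstring(
    "<shelf>"
    "<book><title>Hello</title><price>10</price></book>"
    "<book><title>World</title><price>99</price></book>"
    "</shelf>"
)

# Nodename: children of the current node named "book"
books = doc.findall("book")

# // : matching nodes at any depth below the current node
titles = doc.findall(".//title")

# . : the current node itself
same = doc.find(".")

# .. : the parent of each matched node (here, the parent of every <title>)
parents = doc.findall(".//title/..")

# [] : predicate, here "a book whose <title> child has the text World"
world = doc.findall(".//book[title='World']")

print([t.text for t in titles])
print([b.findtext("price") for b in world])
```

Full XPath 1.0 features such as `not()` or the `preceding-sibling` axis are outside what ElementTree supports, so a complete XPath engine is needed for rules that rely on them.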
  24. A simple example
      • Given the following XML, find the node with the title "Hello" with XPath
          <book>
              <title>Hello</title>
          </book>
          <book>
              <title>World</title>
          </book>
      • /book/title="Hello"
      • //title="Hello"
  25. A simple example with a predicate
      • Find the book titled "Hello World" whose price is over 90
          <book>
              <title>Hello World</title>
              <price>10</price>
          </book>
          <book>
              <title>Hello World</title>
              <price>99</price>
          </book>
      • /book[.//price>90]/title="Hello World"
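The same lookup can be sketched with Python's standard library. ElementTree's XPath subset has no numeric comparison, so in this sketch the title is matched with a predicate and the price test is done in Python. The `<books>` wrapper element and the function name are assumptions added here so the sample has a single root.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<books>"  # hypothetical wrapper so the XML has a single root
    "<book><title>Hello World</title><price>10</price></book>"
    "<book><title>Hello World</title><price>99</price></book>"
    "</books>"
)


def expensive_books(root, title, min_price):
    """Return <book> elements with the given title and a price above min_price."""
    matches = []
    # Note: the title is interpolated as-is, so it must not contain a single quote.
    for book in root.findall(f".//book[title='{title}']"):
        price = book.findtext("price")
        if price is not None and float(price) > min_price:
            matches.append(book)
    return matches


hits = expensive_books(doc, "Hello World", 90)
print([b.findtext("price") for b in hits])
```

With a full XPath engine, selecting the element itself (rather than evaluating a true/false comparison) would typically put both conditions inside predicates, for example `//book[title="Hello World"][price>90]`.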
  26. OK…, XPath searches for nodes, so what?
      • The PMD Eclipse plugin uses an XML representation of the AST (Abstract Syntax Tree) that Eclipse generates from the source code
      • It scans that AST with rulesets written in XPath :D
  27. Abstract Syntax Tree
      • A tree representation of the abstract syntactic structure of source code written in a programming language
      • Can be represented in XML
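To make "an AST can be represented in XML" concrete, here is a small sketch that uses Python's own `ast` module as a stand-in parser (the talk itself uses PMD for Java and PHP-Parser for PHP). The element layout produced by `to_xml` is an assumption of this sketch, not the format of either tool.

```python
import ast
import xml.etree.ElementTree as ET


def to_xml(node):
    """Convert a Python ast node into an ElementTree element, recursively."""
    elem = ET.Element(type(node).__name__)
    for name, value in ast.iter_fields(node):
        children = value if isinstance(value, list) else [value]
        for child in children:
            if isinstance(child, ast.AST):
                elem.append(to_xml(child))
            elif child is not None:
                # Scalars (identifiers, constants) become text-bearing leaves.
                leaf = ET.SubElement(elem, name)
                leaf.text = str(child)
    return elem


tree = to_xml(ast.parse('print("Hello World")'))
print(ET.tostring(tree, encoding="unicode"))
```

Once the tree is XML, any XPath engine can query it, for example `.//Call/Name/id` to list the names of called functions.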
  28. But this is nothing new
      • At AppSec 2013 there was already a talk about using the PMD plugin
        – "Teaching an old dog new tricks: securing development with PMD"
        – From the good people of Gotham Digital Science
  29. Android app, here I come
      • Let's use the PMD plugin!!
      • So I started studying XPath, and created several dozen different rules
        – SSL certificate checks
        – Insecure storage
        – And more…
      • 50% of the time could be saved on actual projects, while still finding most of the issues I had found manually
  30. But of course manual testing can't be avoided
      • Some dynamically generated content still needs to be checked, which I solved by writing some Python scripts
      • Won't discuss this here
  31. After the experience with Android
      • I am now certain that if we can get an XML-formatted AST, we can XPath it. And that goes for every language.
  32. Some examples include
      • Java: Just use PMD
      • PHP: PHP-Parser by nikic + XPath
      • Ruby: parser by whitequark + XPath
      • JavaScript: Esprima + XPath
      • I decided to take on PHP, because I wanted to know how hard it is to detect the 2 XSS issues I mentioned
  33. This is what an AST looks like in PHP-Parser
      • <?php echo "Hello World"; ?>
          <AST>
              <node:Stmt_Echo>
                  <scalar:string>Hello World</scalar:string>
              </node:Stmt_Echo>
          </AST>
  34. A very simple XSS
      • <?php echo $_GET['foo']; ?>
          <AST>
              <node:Stmt_Echo>
                  <scalar:string>_GET</scalar:string>
                  <scalar:string>foo</scalar:string>
              </node:Stmt_Echo>
          </AST>
  35. A very simple XSS
      • //node:Stmt_Echo//scalar:string="_GET"
          <AST>
              <node:Stmt_Echo>
                  <scalar:string>_GET</scalar:string>
                  <scalar:string>foo</scalar:string>
              </node:Stmt_Echo>
          </AST>
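The rule above can be approximated with Python's standard library. This is a sketch under stated assumptions: the XML is the simplified AST shown on the slide, the namespace URIs are placeholders invented so the `node:` and `scalar:` prefixes parse, and `echoes_of` is a hypothetical helper name. The descendant test is done in Python because ElementTree predicates only look at direct children.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs chosen for this sketch (assumption).
NS = {"node": "urn:example:node", "scalar": "urn:example:scalar"}

AST_XML = """
<AST xmlns:node="urn:example:node" xmlns:scalar="urn:example:scalar">
  <node:Stmt_Echo>
    <scalar:string>_GET</scalar:string>
    <scalar:string>foo</scalar:string>
  </node:Stmt_Echo>
  <node:Stmt_Echo>
    <scalar:string>Hello World</scalar:string>
  </node:Stmt_Echo>
</AST>
"""


def echoes_of(root, source):
    """Return echo statements that have a descendant string equal to `source`."""
    return [
        echo
        for echo in root.findall(".//node:Stmt_Echo", NS)
        if any(s.text == source for s in echo.findall(".//scalar:string", NS))
    ]


root = ET.fromstring(AST_XML)
hits = echoes_of(root, "_GET")
print(len(hits))
```

The constant-string echo is ignored and only the statement that touches `_GET` is reported.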
  36. A very simple XSS
      • What if we want to check whether htmlspecialchars() is used or not?
      • <?php echo htmlspecialchars($_GET['foo']); ?>
          <node:Stmt_Echo>
              <node:Expr_FuncCall>
                  <scalar:string>htmlspecialchars</scalar:string>
                  <scalar:string>_GET</scalar:string>
                  <scalar:string>foo</scalar:string>
              </node:Expr_FuncCall>
          </node:Stmt_Echo>
  37. A very simple XSS
          //node:Stmt_Echo
          [not(.//scalar:string="htmlspecialchars")]
          //scalar:string="_GET"
      • <?php echo htmlspecialchars($_GET['foo']); ?>
          <node:Stmt_Echo>
              <node:Expr_FuncCall>
                  <scalar:string>htmlspecialchars</scalar:string>
                  <scalar:string>_GET</scalar:string>
                  <scalar:string>foo</scalar:string>
              </node:Expr_FuncCall>
          </node:Stmt_Echo>
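A rough Python equivalent of the `not(...)` rule, again on the simplified AST from the slides. The namespace URIs, the second (unsanitized) echo statement, and the names `SANITIZERS`, `strings_under`, and `unsanitized_echoes` are all assumptions of this sketch. ElementTree has no `not()`, so the exclusion is written as a set test.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs chosen for this sketch (assumption).
NS = {"node": "urn:example:node", "scalar": "urn:example:scalar"}
SANITIZERS = {"htmlspecialchars"}  # assumption: extend with other escaping functions

AST_XML = """
<AST xmlns:node="urn:example:node" xmlns:scalar="urn:example:scalar">
  <node:Stmt_Echo>
    <node:Expr_FuncCall>
      <scalar:string>htmlspecialchars</scalar:string>
      <scalar:string>_GET</scalar:string>
      <scalar:string>foo</scalar:string>
    </node:Expr_FuncCall>
  </node:Stmt_Echo>
  <node:Stmt_Echo>
    <scalar:string>_GET</scalar:string>
    <scalar:string>bar</scalar:string>
  </node:Stmt_Echo>
</AST>
"""


def strings_under(elem):
    """Collect the text of every scalar string below `elem`."""
    return {s.text or "" for s in elem.findall(".//scalar:string", NS)}


def unsanitized_echoes(root, source="_GET"):
    """Echo statements that mention `source` and no known sanitizer."""
    hits = []
    for echo in root.findall(".//node:Stmt_Echo", NS):
        found = strings_under(echo)
        if source in found and not (found & SANITIZERS):
            hits.append(echo)
    return hits


root = ET.fromstring(AST_XML)
hits = unsanitized_echoes(root)
print(len(hits))
```

Like the XPath rule, this only checks that the sanitizer name appears somewhere in the statement; it does not prove that the sanitizer wraps the tainted value.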
  38. So first let's tackle CVE-2008-4775
      • Remember the code that echoes user input stored as a global variable without sanitization?
          <?php
          $GLOBALS['db'] = $_REQUEST['db'];
          echo $db;
          ?>
  39. Here is the AST for that snippet
          <node:Expr_Assign>
              <scalar:string>GLOBALS</scalar:string>
              <scalar:string>db</scalar:string>
              <scalar:string>_REQUEST</scalar:string>
              <scalar:string>db</scalar:string>
          </node:Expr_Assign>
          <node:Stmt_Echo>
              <scalar:string>db</scalar:string>
          </node:Stmt_Echo>
  40. XPath strategy
      • 1. We know that $db got assigned before it got echoed
          //node:Stmt_Echo
          //scalar:string=.//scalar:string
  41. XPath strategy
      • 2. Next, the variable $db was assigned from $_REQUEST, and was assigned into the $GLOBALS variable
          //node:Stmt_Echo
          [.//scalar:string=preceding-sibling::node:*
          [.//scalar:string="GLOBALS"][.//scalar:string="_REQUEST"]
          //scalar:string]
          //scalar:string=.//scalar:string
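The preceding-sibling idea can be sketched in Python as well. ElementTree does not support the `preceding-sibling` axis, so this sketch walks each parent's children in order and remembers what the earlier siblings contained. The namespace URIs, the extra harmless echo statement, and the function names are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs chosen for this sketch (assumption).
NS = {"node": "urn:example:node", "scalar": "urn:example:scalar"}
ECHO = "{urn:example:node}Stmt_Echo"

AST_XML = """
<AST xmlns:node="urn:example:node" xmlns:scalar="urn:example:scalar">
  <node:Expr_Assign>
    <scalar:string>GLOBALS</scalar:string>
    <scalar:string>db</scalar:string>
    <scalar:string>_REQUEST</scalar:string>
    <scalar:string>db</scalar:string>
  </node:Expr_Assign>
  <node:Stmt_Echo>
    <scalar:string>db</scalar:string>
  </node:Stmt_Echo>
  <node:Stmt_Echo>
    <scalar:string>other</scalar:string>
  </node:Stmt_Echo>
</AST>
"""


def strings_under(elem):
    """Collect the text of every scalar string below `elem`."""
    return {s.text or "" for s in elem.findall(".//scalar:string", NS)}


def tainted_global_echoes(root, markers=frozenset({"GLOBALS", "_REQUEST"})):
    """Echoes sharing a name with an earlier sibling that holds all `markers`."""
    hits = []
    for parent in root.iter():
        earlier = []  # string sets of the preceding siblings seen so far
        for child in parent:
            names = strings_under(child)
            if child.tag == ECHO and any(
                markers <= prev and names & prev for prev in earlier
            ):
                hits.append(child)
            earlier.append(names)
    return hits


root = ET.fromstring(AST_XML)
hits = tainted_global_echoes(root)
print([sorted(strings_under(h)) for h in hits])
```

Only the echo of `db` is reported. Passing a different `markers` set (for example just `{"_REQUEST"}`) adapts the same walk to other source patterns.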
  42. Also CVE-2013-1937
      • This one is about echoing user input saved into an array
          <?php
          $visualizationSettings = array();
          $visualizationSettings = $_REQUEST['visualizationSettings'];
          echo $visualizationSettings['width'];
          ?>
  43. Here is the AST for that snippet
          <node:Expr_Assign>
              <scalar:string>visualizationSettings</scalar:string>
              <node:Expr_ArrayDimFetch>
                  <scalar:string>_REQUEST</scalar:string>
                  <scalar:string>visualizationSettings</scalar:string>
              </node:Expr_ArrayDimFetch>
          </node:Expr_Assign>
          <node:Stmt_Echo>
              <scalar:string>visualizationSettings</scalar:string>
              <scalar:string>width</scalar:string>
          </node:Stmt_Echo>
  44. XPath strategy
      • 1. The variable got echoed without sanitization, just like in the previous example
          //node:Stmt_Echo
          //scalar:string=.//scalar:string
  45. XPath strategy
      • 2. The variable comes from the $_REQUEST variable
          //node:Stmt_Echo
          [.//scalar:string=preceding-sibling::node:*[.//scalar:string="_REQUEST"]//scalar:string]
          //scalar:string=.//scalar:string
  46. Of course, the eval modifier
          //node:Expr_FuncCall
          [.//scalar:string[substring(., string-length(.)-1)="/e"]]
          [.//scalar:string=//node:*[.//scalar:string="_REQUEST"]//scalar:string]
          //scalar:string="preg_replace"
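The three conditions of this rule (a `preg_replace` call, a string ending in `/e`, and a name shared with a node that touches `_REQUEST`) can be sketched in Python. The AST below is a hypothetical, simplified rendering of the preg_replace snippet from earlier in the talk; the namespace URIs, the second harmless call, and the function names are assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs chosen for this sketch (assumption).
NS = {"node": "urn:example:node", "scalar": "urn:example:scalar"}
NODE = "{urn:example:node}"

AST_XML = """
<AST xmlns:node="urn:example:node" xmlns:scalar="urn:example:scalar">
  <node:Expr_Assign>
    <scalar:string>string</scalar:string>
    <scalar:string>_REQUEST</scalar:string>
    <scalar:string>foo</scalar:string>
  </node:Expr_Assign>
  <node:Expr_FuncCall>
    <scalar:string>preg_replace</scalar:string>
    <scalar:string>/(.*)/e</scalar:string>
    <scalar:string>strtoupper($1)</scalar:string>
    <scalar:string>string</scalar:string>
  </node:Expr_FuncCall>
  <node:Expr_FuncCall>
    <scalar:string>preg_replace</scalar:string>
    <scalar:string>/a/i</scalar:string>
    <scalar:string>b</scalar:string>
    <scalar:string>string</scalar:string>
  </node:Expr_FuncCall>
</AST>
"""


def strings_under(elem):
    """Collect the text of every scalar string below `elem`."""
    return {s.text or "" for s in elem.findall(".//scalar:string", NS)}


def eval_modifier_calls(root):
    """preg_replace calls with a '/e' pattern that share a name with tainted code."""
    tainted = set()
    for elem in root.iter():
        if elem.tag.startswith(NODE):
            names = strings_under(elem)
            if "_REQUEST" in names:
                tainted |= names
    hits = []
    for call in root.findall(".//node:Expr_FuncCall", NS):
        names = strings_under(call)
        if (
            "preg_replace" in names
            and any(n.endswith("/e") for n in names)
            and names & tainted
        ):
            hits.append(call)
    return hits


root = ET.fromstring(AST_XML)
hits = eval_modifier_calls(root)
print(len(hits))
```

As with the XPath version, the suffix test only catches patterns that end exactly in `/e`; a pattern with combined modifiers such as `/ie` would need a looser check.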
  47. The designer
      • To aid XPath writing, I also coded a simple web page called the designer
  48. So, end of story?
      • We have succeeded in the following
        – Detecting the 2 CVEs and the simple code injection
        – Writing rules in a W3C-defined language
          • No need to learn a new set of APIs, even for other languages like Java or Ruby
        – Almost no cost, hey, it is all online
  49. Or is it!?
      • Actually no! Of course there are constraints in the current solution
        – Specificity vs. coverage
        – Can't handle dynamically generated content, like the file path passed to include/require
        – Framework support
        – Template engine support
          • Smarty: scan the files in the templates_c folder
        – The AST largely depends on how good the parser is
        – Well, and no execution flow charts
        – Maybe more …
  50. But hey, defense in depth!!
      • Although whitebox scanners are probably never going to be perfect, they should still be one part of your countermeasures