Komei Nomura (1), Kenji Rikitake (1,2), Ryosuke Matsumoto (3) 1. Pepabo R&D Institute, GMO pepabo, Inc. / 2. KRPEO / 3. SAKURA Research Center, SAKURA Internet Inc. 2019.07.15 The 9th IEEE International Workshop on Network Technologies for Security, Administration and Protection Automatic Whitelist Generation for SQL Queries Using Web Application Tests
• Theft of confidential information from databases has become a severe security issue for web applications • e.g., SQL injection, OS command injection, and so on • These attacks work by executing illegal queries against the database • An illegal query is a query that the web application developers do not expect • To prevent these attacks, illegal queries must be detected before the database executes them 4 Background
• Blacklist method • Defines illegal query patterns in a list and detects queries that match the list • Whitelist method • Defines normal query patterns in a list and detects queries that do not match the list 5 Illegal query detection method A blacklist alone cannot detect unknown illegal queries → A whitelist is required to detect illegal queries with unknown patterns
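The difference between the two detection methods can be illustrated with a minimal Python sketch. The pattern lists and the example queries are assumptions made for illustration and do not come from the slides; real lists would hold many more entries.

```python
import re

# Hypothetical pattern lists, for illustration only.
BLACKLIST = [re.compile(r"\bunion\s+select\b", re.IGNORECASE)]
WHITELIST = [re.compile(r"SELECT \* FROM users WHERE id = \d+")]


def flagged_by_blacklist(query: str) -> bool:
    """Blacklist method: a query is illegal if it matches a listed illegal pattern."""
    return any(p.search(query) for p in BLACKLIST)


def flagged_by_whitelist(query: str) -> bool:
    """Whitelist method: a query is illegal if it matches no listed normal pattern."""
    return not any(p.fullmatch(query) for p in WHITELIST)


normal = "SELECT * FROM users WHERE id = 42"
known_attack = "SELECT * FROM users WHERE id = 1 UNION SELECT name, password FROM admins"
unknown_attack = "SELECT * FROM users WHERE id = 1 OR 1=1"

for q in (normal, known_attack, unknown_attack):
    print(flagged_by_blacklist(q), flagged_by_whitelist(q), q)
```

In this sketch the blacklist misses `unknown_attack` because no pattern for it was listed in advance, while the whitelist flags it simply because it is not a known normal query.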
• Developers manually create a whitelist of the queries issued by the web application • A large-scale web application issues an enormous number of queries → Registering all queries in the whitelist is difficult • The queries issued by the web application change whenever the application is updated → Developers need to update the whitelist each time 6 Whitelist creation and its issue The method imposes an impractical burden on developers
• Realize a mechanism with which • developers can create a whitelist without much effort • and illegal queries can be detected using that whitelist → The whitelist should be generated automatically, following changes in the queries issued by the web application 7 Purpose of our research
• This method generates a whitelist by collecting the queries issued while the web application is running • It can create a whitelist independently of the web application implementation • Programming language, framework 9 Generating a whitelist using issued queries [Figure labels: HTTP request, Web application, Query, Database, Whitelist, While the web application is running]
• The method cannot detect illegal queries immediately after the web application starts running • It needs a period to collect queries while the web application runs • Periods in which illegal queries cannot be detected occur frequently • Queries change frequently because web services are updated frequently 10 Generating a whitelist using issued queries The whitelist generation should be done before running the web application
• This method generates a whitelist by analyzing how queries are issued in the web application source code • It can generate a whitelist before the web application runs because it takes the source code as input 11 Generating a whitelist using static analysis [Figure labels: Source code, Analyzer, Whitelist, Before web application runs]
• The method cannot be shared across multiple web applications with different implementations • Source code analysis depends on the implementation of the web application • If a web service is built with various languages and frameworks, implementing an analyzer for each application imposes a high workload 12 Generating a whitelist using static analysis Whitelist generation should be performed independently of the web application implementation
1. Whitelist generation should be done before running the web application • to detect illegal queries immediately after the web application starts running 2. Whitelist generation should be performed independently of the web application implementation • to reduce the workload of implementing it for each web application 14 Requirements of proposed method
• Automatic whitelist generation method using queries issued during testing • Whitelist generation is incorporated into the development process through automated tests • A database proxy collects the queries issued during testing 15 Proposed method
Register the query structure with a whitelist [Figure labels: Query: SELECT … FROM users WHERE id … / Query: SELECT … FROM users WHERE id … / Example of query structure] Collecting queries with the database proxy makes whitelist generation independent of the web application implementation
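Registering query structures rather than raw queries can be sketched as follows. The slides do not state the normalization rule, so the rule used here (replace string and numeric literals with `?`, collapse whitespace) is an assumption, and `query_structure`, `build_whitelist`, and `is_illegal` are hypothetical names. A production system would more likely use a real SQL parser than regular expressions.

```python
import re


def query_structure(query: str) -> str:
    """Reduce a query to its structure (assumed rule: literals become '?')."""
    s = re.sub(r"'(?:[^']|'')*'", "?", query)   # quoted string literals
    s = re.sub(r"\b\d+(?:\.\d+)?\b", "?", s)    # numeric literals
    return re.sub(r"\s+", " ", s).strip()       # normalize whitespace


def build_whitelist(test_queries):
    """Collect the structures of all queries observed during testing."""
    return {query_structure(q) for q in test_queries}


def is_illegal(query: str, whitelist) -> bool:
    """A query is illegal when its structure is not registered."""
    return query_structure(query) not in whitelist


# Hypothetical queries captured by the database proxy during a test run.
captured = [
    "SELECT * FROM users WHERE id = 1",
    "SELECT * FROM users WHERE id = 2",
    "SELECT * FROM users WHERE name = 'alice'",
]
whitelist = build_whitelist(captured)
print(sorted(whitelist))
print(is_illegal("SELECT * FROM users WHERE id = 99", whitelist))
print(is_illegal("SELECT * FROM users WHERE id = 1 OR 1 = 1", whitelist))
```

The first two captured queries collapse into one structure, so a query with a value never seen in testing still passes, while an injected `OR 1 = 1` changes the structure and is flagged.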
• We define two indicators of detection accuracy • A false positive means that a normal query is judged illegal • A normal query is an expected query that the web application issues when it receives user input • A false negative means that an illegal query is judged normal • An illegal query is an unexpected query caused by attacks such as those exploiting web application vulnerabilities 21 Indicator of detection accuracy
• The relation between the queries issued during testing and those issued during running affects the detection accuracy 22 Queries that cause false positive / negative [Figure labels: A: Queries during testing, B: Queries during running] • Reasons for queries issued only during testing (the cause of false negatives) • Registering test data • Deleting all test data • Reasons for queries issued only during running (the cause of false positives) • Test cases cover only a subset of the usage during running
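The relation between the two query sets can be written as set operations. In this sketch, A holds the query structures seen during testing and B those seen during running; the function name and the example structures are hypothetical.

```python
def compare_query_sets(test_structures, running_structures):
    """Split query structures by where they were observed."""
    a = set(test_structures)      # A: issued during testing (the whitelist)
    b = set(running_structures)   # B: issued during running
    return {
        # Whitelisted but never issued in normal operation: an attack that
        # produces one of these would pass undetected (false negative).
        "only_in_testing": a - b,
        # Issued in normal operation but not whitelisted: these normal
        # queries would be judged illegal (false positive).
        "only_in_running": b - a,
        "in_both": a & b,
    }


result = compare_query_sets(
    ["SELECT * FROM users WHERE id = ?", "DELETE FROM users"],
    ["SELECT * FROM users WHERE id = ?", "SELECT * FROM users WHERE email = ?"],
)
for name, structures in result.items():
    print(name, sorted(structures))
```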
• We verified which queries cause false positives/negatives in production • We obtained a production query log covering 3 days of holidays • to exclude changes in the queries issued by the web application • We ran the tests of the web application that was running during the query log period and obtained the queries issued during testing 23 Experiment in production
• All queries in this red area were issued by normal processing • These queries are not issued in the tests because test cases are missing or the tests skip database access • The whitelist needs to be complemented with the missing queries • Applying the proposed method only to database tables with confidential information is important • False positives decrease when fewer queries are subject to detection 25 Consideration of false positive cases [Figure labels: A: Query structures issued in test, B: Query structures issued in production]
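Restricting detection to tables with confidential information could look like the sketch below. The table names in `PROTECTED_TABLES` are hypothetical, and the regular expression is a rough assumption that only recognizes a table name directly after `FROM`, `JOIN`, `INTO`, or `UPDATE`; it does not handle comma-separated table lists or schema-qualified names.

```python
import re

# Hypothetical set of tables that hold confidential information.
PROTECTED_TABLES = {"users", "payments"}

# Rough table-name extraction; an assumption, not a full SQL parser.
TABLE_REF = re.compile(r"\b(?:FROM|JOIN|INTO|UPDATE)\s+`?(\w+)`?", re.IGNORECASE)


def referenced_tables(query: str) -> set:
    """Return the lower-cased table names the query appears to touch."""
    return {name.lower() for name in TABLE_REF.findall(query)}


def is_detection_target(query: str) -> bool:
    """Only queries touching a protected table are checked against the whitelist."""
    return bool(referenced_tables(query) & PROTECTED_TABLES)


print(is_detection_target("SELECT * FROM users WHERE id = 1"))
print(is_detection_target("INSERT INTO access_logs (path) VALUES ('/top')"))
```

Queries that touch only non-confidential tables skip the whitelist check entirely, so they can no longer produce false positives.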
• The green area includes two categories of queries 1. Queries not issued during the query log period 2. Queries issued only in the test • These include a query that deletes all confidential data in the database table • Detection that combines the whitelist with a blacklist is necessary • Register queries that handle a large amount of data in the blacklist • e.g., a query deleting all data in a database table 26 Consideration of false negative cases [Figure labels: A: Query structures issued in test, B: Query structures issued in production]
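Combining the two lists can be sketched as a check in which the blacklist takes precedence over the whitelist. The function name `judge` and the example structures are hypothetical; the slides state only that the two lists should be combined, not how.

```python
def judge(structure: str, whitelist: set, blacklist: set) -> str:
    """Return "illegal" or "normal" for one query structure."""
    if structure in blacklist:
        return "illegal"   # blacklisted even if the tests also issued it
    if structure not in whitelist:
        return "illegal"   # unknown structure
    return "normal"


# The test suite deletes all test data, so that structure ends up whitelisted.
whitelist = {"SELECT * FROM users WHERE id = ?", "DELETE FROM users"}
# Queries that handle a large amount of data are registered in the blacklist.
blacklist = {"DELETE FROM users"}

print(judge("SELECT * FROM users WHERE id = ?", whitelist, blacklist))
print(judge("DELETE FROM users", whitelist, blacklist))
print(judge("SELECT * FROM users", whitelist, blacklist))
```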
• The existing methods of automatic whitelist generation have the following issues • They cannot detect illegal queries immediately after the web application starts running • They cannot be shared across multiple web applications with different implementations • The proposed method solves these issues • by incorporating whitelist generation into the development process • by collecting queries during testing using the database proxy 28 Conclusion
• The experimental results show that the proposed method produces both false positives and false negatives • Regarding false positive cases • Complement the whitelist with the missing queries • Apply the proposed method only to tables with confidential information • Regarding false negative cases • Combine the whitelist with a blacklist to detect illegal queries 29 Conclusion