Group meeting: Identifying Information Disclosure in Web Applications with Retroactive Auditing

Identifying Information Disclosure in Web Applications with Retroactive Auditing
Haogang Chen, Taesoo Kim, Xi Wang, Nickolai Zeldovich, and M. Frans Kaashoek
MIT CSAIL
Computer Systems Security Group
Parallel & Distributed Operating Systems Group

Outline
• Introduction
• Related Work
• Using Rail
• System Overview
• Replay
• Evaluation
• Conclusion

Introduction

What to do about it?
• Before data breach: prevention techniques
  • privilege separation
  • encryption
  • information flow control
• After data breach: damage control

Damage control is costly
• Notify the victims
  • Data Breach Notification Laws (40/50 states)
• Pay for credit monitoring & fraud protection

University of Maryland paid for 5 years of credit monitoring for 309,079 potentially affected users

Opportunity: Some data might not be leaked
• The vulnerability might not have been exploited yet
• Attackers might not steal all data that they can
• Goal: precisely identify breached data items
  • Target damage control at real victims only

State of the art
• Log all accesses to sensitive data
• Inspect logs after an intrusion
• Problems
  • Need to know what is sensitive data beforehand
  • Hard to tell legal vs. illegal accesses
  • May take a long time

Solution: Rail
• Goal: precisely identify previously breached data after a vulnerability is fixed
• Contribution: apply record and replay to identify improper disclosures
• State during replay can diverge from the original execution
• Prior systems use record and replay for integrity
• Rail focuses on confidentiality
• Provide APIs for application developers
• For precision, Rail must match up state and minimize state divergence between the two executions

Related Work

Related Work
• Focus on logging all accesses to confidential data
  • Keypad [EuroSys ’11], Pasture [OSDI ’12]
• Information flow control and taint tracking
  • TaintDroid [OSDI ’10], TightLip [NSDI ’07]
• Record and replay
  • MIT CSAIL (see next slide)

Undo Computing
• Intrusion Recovery Using Selective Re-execution [OSDI ’10]
• Intrusion Recovery for Database-backed Web Applications [SOSP ’11]
• Efficient patch-based auditing for web application vulnerabilities [OSDI ’12]
• Asynchronous intrusion recovery for interconnected web services [SOSP ’13]
• Identifying information disclosure in web applications with retroactive auditing [OSDI ’14]

Ideas
• Action history graph and selective replay
  • Retro [OSDI ’10]
• Comparison of normal execution and replay
  • Rad [APSys ’11], Poirot [OSDI ’12]
• Prior replay systems were focused on restoring integrity

Using Rail

Example: a real-world mistake
• Rail will reproduce the same date and random number from the context
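
Reproducing the same date and random number during replay can be illustrated with a small record-and-replay wrapper. This is only a sketch under assumptions: the `Context` class, its `now()` and `random()` methods, and the toy `handler` are hypothetical names chosen for illustration, not Rail's actual API.

```python
import random
import time


class Context:
    """Hypothetical per-action context that logs non-deterministic inputs."""

    def __init__(self, log=None):
        # With no log we are recording; with a log we replay its values.
        self.replaying = log is not None
        self.log = list(log) if log is not None else []
        self._cursor = 0

    def _value(self, produce):
        if self.replaying:
            # Return the value recorded at the same position originally.
            value = self.log[self._cursor]
            self._cursor += 1
            return value
        value = produce()
        self.log.append(value)
        return value

    def now(self):
        return self._value(time.time)

    def random(self):
        return self._value(random.random)


def handler(ctx):
    """Toy request handler whose output depends on the clock and an RNG."""
    return {"created_at": ctx.now(), "token": ctx.random()}


# Original execution: values are generated and logged.
recorded = Context()
original_output = handler(recorded)

# Replay: the same values are fed back from the log.
replayed = Context(log=recorded.log)
replay_output = handler(replayed)
```

Because the handler only reaches the clock and the random number generator through the context, the replayed output matches the original unless the code itself changed.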

Basic approach
• Record and replay the web application
• Compare the outputs of the two executions

System Overview

Assumptions
• Focus on web applications
  • Prototype is based on, but not limited to, Meteor
• Trusts the software stack below the web application
• Requests do not change during replay, except for fixes
• Deal only with data leaked through the web application, and assume mistakes lead to disclosures

Action APIs
• Actions are the unit of dependency tracking
• event (RPC request) -> action -> handler
• Each action has a timestamp

Object APIs
• Developer must wrap all global objects in the app using Rail’s object API
• Most wrappers can be done once in the framework

Object APIs
• For each shared object, Rail maintains
  • A set of dependencies between actions and objects
  • Multiple versions of the object’s state at different points in time
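
The two pieces of per-object bookkeeping listed above (dependencies and timestamped versions) might look roughly like the following. This is a sketch under assumptions: `VersionedObject`, `read`, and `write` are hypothetical names, timestamps are assumed to be positive integers, and the real Rail object API may differ.

```python
import bisect


class VersionedObject:
    """Hypothetical shared-object wrapper; not Rail's real object API."""

    def __init__(self, initial):
        self._times = [0]         # sorted version timestamps
        self._values = [initial]  # value written at each timestamp
        self.readers = set()      # ids of actions that read this object
        self.writers = set()      # ids of actions that wrote this object

    def read(self, action_id, ts):
        """Return the latest version written at or before time ts."""
        self.readers.add(action_id)
        idx = bisect.bisect_right(self._times, ts) - 1
        return self._values[idx]

    def write(self, action_id, ts, value):
        """Add a new version at time ts, keeping versions sorted by time."""
        self.writers.add(action_id)
        idx = bisect.bisect_right(self._times, ts)
        self._times.insert(idx, ts)
        self._values.insert(idx, value)


acl = VersionedObject({"alice"})
acl.write("a1", 10, {"alice", "bob"})
before = acl.read("a2", 5)   # sees the version from before the write
after = acl.read("a3", 15)   # sees the version written at ts=10
```

Keeping old versions is what lets a replayed action read the state as it was at that action's timestamp rather than the current state.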

Tracking data items in output
• Rail maintains a view object for every session
  • Tracks all data items sent to the client
• During replay, Rail reruns actions and recomputes the view objects for every session
  • if old_view − new_view ≠ ∅ ➜ Breach!
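
The breach check above is a per-session set difference. A minimal sketch, assuming each view is modeled as a Python set of data-item identifiers (the function name `find_breaches` and the item names are made up for illustration):

```python
def find_breaches(old_views, new_views):
    """Return, per session, items sent originally but not during replay."""
    breaches = {}
    for session, old_view in old_views.items():
        # old_view - new_view: items the fixed code would not have sent.
        leaked = old_view - new_views.get(session, set())
        if leaked:
            breaches[session] = leaked
    return breaches


old_views = {"s1": {"profile:alice"}, "s2": {"profile:bob", "ssn:alice"}}
new_views = {"s1": {"profile:alice"}, "s2": {"profile:bob"}}
breaches = find_breaches(old_views, new_views)
```

Here session `s2` received `ssn:alice` in the original run but not in the replay with the fix applied, so only that item is reported.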

AHG: action history graph

Logs and dependency graph
• Rail assumes that actions are atomic
  • The web framework provides serializability
• Rail stores AHG in a persistent log
• Objects that do not store actual state (i.e., just a placeholder) in Rail’s shared object must maintain their own versioning outside of Rail’s log
  • time-travel database [SOSP ’11]

Replay

Selective re-execution
• Inspired by Retro [OSDI ’10]
• Rail replays each action sequentially in time order
• Replays an action if any of its inputs or outputs are changed
• Replay is guaranteed to terminate
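
The replay loop described above can be sketched in simplified form. The assumptions are significant: the `Action` record and `selective_replay` function are hypothetical, state is a flat dictionary, only patched code and changed inputs trigger a rerun (changed outputs and rollback are not modeled), and this is not Rail's or Retro's actual algorithm.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    ts: int                # timestamp from the original run
    name: str
    reads: tuple           # keys the handler read originally
    handler: Callable      # handler(state) -> dict of writes (fixed code)
    orig_writes: dict      # writes logged during the original run
    patched: bool = False  # True if the fix changed this action's code


def selective_replay(actions, initial_state):
    orig_state = dict(initial_state)  # state as it evolved originally
    new_state = dict(initial_state)   # state as it evolves during replay
    rerun = []
    for action in sorted(actions, key=lambda a: a.ts):
        inputs_changed = any(
            orig_state.get(k) != new_state.get(k) for k in action.reads
        )
        if action.patched or inputs_changed:
            writes = action.handler(new_state)  # re-execute the action
            rerun.append(action.name)
        else:
            writes = action.orig_writes         # reuse the logged result
        orig_state.update(action.orig_writes)
        new_state.update(writes)
    return new_state, rerun


actions = [
    Action(1, "a1", (), lambda s: {"public": False},
           {"public": True}, patched=True),
    Action(2, "a2", ("public",),
           lambda s: {"feed": "note" if s["public"] else ""},
           {"feed": "note"}),
    Action(3, "a3", ("motd",), lambda s: {"banner": s["motd"]},
           {"banner": "hi"}),
]
final, rerun = selective_replay(actions, {"public": False, "motd": "hi"})
```

In this toy run only `a1` (patched) and `a2` (its input changed) are re-executed; `a3` is skipped because nothing it read differs from the original run.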

[Figure annotations: monotonically increase; due to rollback, input changed => rerun; reconstruct output; patched code, args; optimization for future actions; consider Cin; rollback to “right before”]

Context matching
• Application stability: assign context identifiers

Evaluation

Developer effort
• Most of the changes are related to non-deterministic inputs
• Framework wrappers (422 lines in Meteor) offload most burdens from the application developer

Benchmarks

Auditing precision

Technique effectiveness
Changed code is on the critical path of all login requests

Performance and overhead
• Performance
  • 5% for an under-loaded server
  • 22% for an over-loaded server
• Storage
  • ~0.5 KB / request

Conclusion
• Rail can precisely identify breached data items after a disclosure in web applications
• Provides developers with APIs that help to identify data items, track dependencies, and match up states
• Requires few changes to applications
• Precise, efficient, and practical
