Déjà Vu: Uncovering Stolen Algorithms in Commercial Products

In an ideal world, members of a community work together towards a common goal or greater good. Unfortunately, we do not (yet) live in such a world.

In this talk, we discuss what appears to be a systemic issue impacting our cyber-security community: the theft and unauthorized use of algorithms by corporate entities, some of which may themselves be part of the community.

First, we’ll present a variety of search techniques that can automatically point to unauthorized code in commercial products. Then we’ll show how reverse-engineering and binary comparison techniques can confirm such findings.

Next, we will apply these approaches in a real-world case study. Specifically, we’ll focus on a popular tool from a non-profit organization that was reverse-engineered by multiple entities so that its core algorithm could be recovered and used, without authorization, in multiple commercial products.

The talk will end with actionable takeaways and recommendations, as who knows, this may happen to you too! For one, we'll present strategic approaches to (and the challenges of) confronting culpable commercial entities (and their legal teams). Moreover, we’ll provide recommendations for corporations to ensure this doesn’t happen in the first place, so that our community can remain cohesively focused on its mutual goals.

Patrick Wardle

August 13, 2022

Transcript

  1. Déjà Vu
    Uncovering Stolen Algorithms in Commercial Products

  2. WHOIS
    PATRICK WARDLE (OBJECTIVE-SEE FOUNDATION)
    TOM MCGUIRE (JOHNS HOPKINS UNIVERSITY)

  3. OUTLINE
    Goals: Describe approaches to uncovering, confirming, and resolving the unauthorized use of "stolen" code in commercial products.
    The Victim
    The Hunt
    Proving Equivalence
    Resolutions

  4. The Victim
    OverSight

  5. OVERSIGHT
    a mic & webcam monitor (macOS)
    Released @ Virus Bulletin 2016
    Mic & webcam monitor
    Identify process
    Detect piggy-backing
    ...prior to any macOS protections :)
    "killer" feature
    free, but closed-source (open-sourced as of 2021)

  6. OVERSIGHT
    vs. malware
    Webcam "aware" (macOS) malware: Mokes, Crisis, Eleanor, FruitFly ... and many more!
    malware: Mokes (webcam capture via AVFoundation)
        AVFMediaRecorderControl::setupSessionForCapture(void) proc
        ...
        call AVFCameraSession::state(void)
        call AVFAudioInputSelectorControl::createCaptureDevice(void)
        lea rdx, "Could not connect the video recorder"
        ...
        call QMediaRecorderControl::error(int,QString const&)
    OverSight vs. Mokes

  7. OVERSIGHT
    vs. 0day vulnerabilities
    "A [0day] vulnerability in the Mac Zoom Client allows any malicious website to enable your camera without your permission" (bug credit: Jonathan Leitschuh)
    OverSight vs. 0day
    "A zero-day [exploited by XCSSET malware] allows an attacker to bypass Apple’s TCC protections which safeguard privacy" (analysis: Jamf)
    OverSight vs. 0day

  8. OVERSIGHT
    vs. legitimate apps
    "Forget the NSA, it's Shazam that's always listening!" (objective-see.org/blog/blog_0x13.html)
    OverSight vs. Shazam
    Shazam was always listening :/

  9. PROCESS IDENTIFICATION
    a must! but the APIs aren't helpful
    code for: "is camera on"
        //ID of the camera device (value elided on the slide)
        CMIODeviceID cameraID = ;
        UInt32 isRunning = 0;
        UInt32 propertySize = 0;
        CMIOObjectPropertyAddress propertyStruct = {0};

        propertySize = sizeof(isRunning);
        propertyStruct.mScope = kCMIOObjectPropertyScopeGlobal;
        propertyStruct.mSelector = kAudioDevicePropertyDeviceIsRunningSomewhere;

        //query the device: is it running (in use) somewhere?
        // the data size is that of the output buffer (isRunning)
        CMIOObjectGetPropertyData(cameraID, &propertyStruct, 0, NULL,
            sizeof(isRunning), &propertySize, &isRunning);
    The system APIs do not tell us which process is accessing the webcam (or microphone)

  10. OVERSIGHT'S PROCESS IDENTIFICATION
    step 1: enumerate mach messages (ports)
    Camera/mic daemon (e.g. VDCAssistant, AppleCameraAssistant, etc.)
    mach msg
    Candidate processes
    OverSight
    Listing mach ports (via lsmp)
        # lsmp -p
        ...
        0x00009f1f 0xbf7f289d send-once -> 0x0000000000000000 0x0001f60b (34770) zoom.us
    /usr/bin/lsmp is entitled with "com.apple.system-task-ports.read" (allows the listing of mach ports)

  11. OVERSIGHT'S PROCESS IDENTIFICATION
    programmatically enumerating "mach senders"
        -(NSMutableDictionary*)enumMachSenders:(pid_t)targetProcess {

            //exec 'lsmp' w/ pid of camera assistant to get mach ports
            results = [[NSString alloc] initWithData:execTask(LSMP,
                @[@"-p", @(targetProcess).stringValue], YES) encoding:NSUTF8StringEncoding];

            //parse results
            // split on new line (then look for () process name)
            for(NSString* line in [results componentsSeparatedByCharactersInSet:
                [NSCharacterSet characterSetWithCharactersInString:@"\n"]]) {

                //skip blank lines
                if(0 == line.length) continue;

                //parse on '()'
                subStrings = [line componentsSeparatedByCharactersInSet:
                    [NSCharacterSet characterSetWithCharactersInString:@"()"]];

                //sanity check
                if(subStrings.count < 3) continue;

                //extract process id (inside '()', so will be second substring)
                processID = @([[subStrings objectAtIndex:0x1] integerValue]);

                //add/inc to dictionary
                senders[processID] = @([senders[processID] unsignedIntegerValue] + 1);
            }
            ...
        }
    Executing lsmp (-p pid) & parsing its output
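
The parsing steps on this slide can be tried off-device against captured `lsmp` output. What follows is a minimal Python sketch under stated assumptions, not OverSight's implementation: the name `count_mach_senders` and the sample text are hypothetical, and, unlike `integerValue` (which yields 0 for non-numeric text), the sketch skips lines whose parenthesized field is not a number.

```python
import re
from collections import Counter


def count_mach_senders(lsmp_output: str) -> dict:
    """Count how often each pid appears in parentheses in lsmp-style output."""
    senders = Counter()
    for line in lsmp_output.split("\n"):
        # skip blank lines
        if not line:
            continue
        # split on '(' and ')', so the pid is the second substring
        parts = re.split(r"[()]", line)
        # sanity check: need text before, inside, and after the parentheses
        if len(parts) < 3:
            continue
        pid_text = parts[1].strip()
        # assumption: ignore non-numeric fields instead of treating them as 0
        if not pid_text.isdigit():
            continue
        senders[int(pid_text)] += 1
    return dict(senders)


# hypothetical sample; only the first line is modeled on the lsmp slide
SAMPLE = (
    "0x00009f1f 0xbf7f289d send-once -> 0x0000000000000000 0x0001f60b (34770) zoom.us\n"
    "\n"
    "0x00001234 0x00005678 recv\n"
    "0x0000a001 0xbf7f0001 send -> 0x0000000000000000 0x0001f60c (34770) zoom.us\n"
)
print(count_mach_senders(SAMPLE))
```

Splitting on both parenthesis characters mirrors the character-set split in the Objective-C code, which is why a line needs at least three substrings before its second one is trusted.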

  12. OVERSIGHT'S PROCESS IDENTIFICATION
    step 2: query I/O registry (IOUserClientCreators)
    I/O registry
    Candidate processes
    Listing "user clients" (via ioreg)
        # ioreg -l
        ...
        | | +-o RootDomainUserClient
        | | | {
        | | |   "IOUserClientCreator" = "pid 34770, zoom.us"
        | | | }

  13. OVERSIGHT'S PROCESS IDENTIFICATION
    programmatically querying the I/O registry
        -(NSMutableDictionary*)enumDomainUserClients {

            //get IOPMrootDomain obj
            matchingService = IOServiceGetMatchingService(kIOMasterPortDefault, IOServiceMatching("IOPMrootDomain"));
            ...

            //get iterator
            IORegistryEntryGetChildIterator(matchingService, kIOServicePlane, &iterator);

            //iterate over all children (looking for 'IOUserClientCreator')
            while((child = IOIteratorNext(iterator))) {

                //get 'IOUserClientCreator'
                creator = IORegistryEntryCreateCFProperty(child, CFSTR("IOUserClientCreator"), kCFAllocatorDefault, 0);

                //parse
                components = [creator componentsSeparatedByCharactersInSet:
                    [NSCharacterSet characterSetWithCharactersInString:@" ,"]];

                //extract pid and save
                // stored as an int: a short would truncate pids above 32767
                processID = [NSNumber numberWithInt:[components[0x1] intValue]];

                //add/inc to dictionary
                clients[processID] = @([clients[processID] unsignedIntegerValue] + 1);
            }
            ...
        }
    Querying the I/O registry ("IOPMrootDomain/IOService/IOUserClientCreator")
    parse results
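
The "split on space and comma" step can be illustrated without touching IOKit. The Python sketch below assumes the `IOUserClientCreator` values have already been collected as plain strings; `count_user_clients` and the example values are hypothetical, and the malformed-value check is an addition that the slide's code does not show.

```python
import re
from collections import Counter


def count_user_clients(creators: list) -> dict:
    """Count pids found in creator strings such as 'pid 34770, zoom.us'."""
    clients = Counter()
    for creator in creators:
        # split on spaces and commas, e.g. ['pid', '34770', '', 'zoom.us']
        components = re.split(r"[ ,]", creator)
        # assumption: skip malformed values rather than indexing blindly
        if len(components) < 2 or not components[1].isdigit():
            continue
        # Python ints are unbounded, so large pids are kept intact
        clients[int(components[1])] += 1
    return dict(clients)


# hypothetical values standing in for I/O registry properties
EXAMPLE = ["pid 34770, zoom.us", "pid 101, loginwindow", "pid 34770, zoom.us"]
print(count_user_clients(EXAMPLE))
```

The pid sits at index 1 because the leading token is the literal word `pid`; the empty string at index 2 comes from the comma being followed by a space.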

  14. OVERSIGHT'S PROCESS IDENTIFICATION
    step 3: sample process and examine stack trace
        #define SAMPLE @"/usr/bin/sample"

        //invoke 'sample' to confirm candidates are using CMIO/video/av inputs
        // note: audio/video invoke 'CMIOGraph::DoWork' (will be in call stack)
        -(NSMutableArray*)sampleCandidates:(NSArray*)currentSenders {

            ...

            //exec 'sample' to get stack trace of process
            results = [[NSString alloc] initWithData:execTask(SAMPLE,
                @[processID.stringValue, @"1"], YES) ...];

            //check for 'CMIOGraph::DoWork'
            // note: both audio/video invoke this
            if(YES == [results containsString:@"CMIOGraph::DoWork"]) {

                //save process
            }
            ...
        }
    Executing sample (w/ pid) to obtain stack trace
    Are the mic/webcam APIs (CMIOGraph::DoWork) in the stack trace?
    /usr/bin/sample is entitled with "task_for_pid-allow" (which allows it to read (sample) other processes' memory)
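
The confirmation step boils down to a substring test on each candidate's sampled stack trace. A minimal Python sketch of that filter, assuming the traces were already captured as text (for example from `/usr/bin/sample`); `confirm_candidates` and the example traces are hypothetical:

```python
MARKER = "CMIOGraph::DoWork"


def confirm_candidates(stack_traces: dict) -> list:
    """Return the pids whose sampled stack trace mentions the marker frame."""
    confirmed = []
    for pid, trace in stack_traces.items():
        # keep only processes with the CMIO frame in their call stack
        if MARKER in trace:
            confirmed.append(pid)
    return sorted(confirmed)


# hypothetical traces: pid -> text captured from a sampling tool
TRACES = {
    34770: "Thread_1\n  CMIOGraph::DoWork(...)\n  start_wqthread",
    101: "Thread_1\n  mach_msg_trap\n  CFRunLoopRun",
}
print(confirm_candidates(TRACES))
```

Only processes whose trace contains the marker are kept, which narrows the candidate set gathered in the earlier steps to those actually driving the capture pipeline.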

  15. RELIABLE PROCESS IDENTIFICATION
    ...for both the mic & webcam
    ...and popularity!

  16. The Hunt
    OverSight ...in other products !?

  17. HOW WOULD THIS EVEN BE POSSIBLE?
    ...by reverse-engineering the OverSight application
    OverSight's disassembly
    Reconstructed code
        -(NSMutableDictionary*)enumMachSenders:(pid_t)targetProcess {
            ...
            NSArray* arguments = @[@"-p", @(targetProcess).stringValue];
            NSMutableDictionary* senders = [NSMutableDictionary dictionary];

            NSData* results = execTask(@"/usr/bin/lsmp", arguments, YES);
            ...
        }
    Free! (OverSight)
    Commercial (...not OverSight)
    Technically trivial ...ethically & legally questionable!

  18. OVERSIGHT'S ALGORITHM IS QUITE UNIQUE
    ...and is a bit janky (+ as we'll see, broken)
    Approach is a bit "janky" (what, no regex?)
    Approach is very unique:
    No Google results (for "CMIOGraph::DoWork")
    No "great matches" (for I/O registry keys)

  19. HOW IT ALL BEGAN (2018)
    ...via malware analysis
    Flagged binary (on VirusTotal)
    "Riskware" / "PUP" (potentially unwanted program)
    Familiar code?
    method "enumMachSendersForProcess:" which execs /usr/bin/lsmp !? ...just like OverSight !?

  20. HUNTING VIA GOOGLE
    bug(s) in OverSight ...in others too!?
    Bugs in OverSight
    Various changes in macOS triggered issues in other (non-Objective-See) programs too !?
    Same bug(s)!? ...but these aren't OverSight!

  21. HUNTING VIA YARA RULES
    ...and running across corpuses of binaries
        private rule Macho {
            condition:
                uint32(0) == 0xfeedface or uint32(0) == 0xcefaedfe ...
        }

        rule lsmp {
            strings:
                $a = "lsmp"
            condition:
                Macho and $a
        }
    A match, with method named "getVDCDictionaryWithPid:" (that invokes lsmp)
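
For readers without YARA at hand, the same two checks (Mach-O magic, then the string "lsmp") can be approximated in a few lines of Python. This is a sketch, not a YARA replacement: `is_macho` and `matches_lsmp_rule` are hypothetical names, and the extra magic values are assumptions added here, since the slide's rule elides the rest of its condition.

```python
import struct

# magics listed in the rule on the slide
SLIDE_MAGICS = {0xFEEDFACE, 0xCEFAEDFE}
# assumption: 64-bit and universal ("fat") magics, not shown on the slide
EXTRA_MAGICS = {0xFEEDFACF, 0xCFFAEDFE, 0xCAFEBABE, 0xBEBAFECA}


def is_macho(data: bytes) -> bool:
    """Mimic YARA's uint32(0): read the first four bytes as little-endian."""
    if len(data) < 4:
        return False
    (magic,) = struct.unpack("<I", data[:4])
    return magic in SLIDE_MAGICS | EXTRA_MAGICS


def matches_lsmp_rule(data: bytes) -> bool:
    """True when the blob looks like a Mach-O and contains the string 'lsmp'."""
    return is_macho(data) and b"lsmp" in data


# synthetic blob: a 64-bit Mach-O magic followed by a path string
blob = struct.pack("<I", 0xFEEDFACF) + b"\x00/usr/bin/lsmp\x00"
print(matches_lsmp_rule(blob))
```

A four-character string such as "lsmp" will produce false positives on a large corpus (and the universal-binary magic is shared with Java class files), so every hit still needs the manual confirmation described in the next part of the talk.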

  22. Confirming Equivalency
    ...the code doesn't lie !

  23. PRODUCT #1
    use of lsmp & parsing
    OverSight method: "enumMachSenders:"
    not OverSight method: "enumMachSendersForProcess:"
    Same:
    Invoke lsmp (-p )
    Split line on "()"
    Ignore if fewer than 3 substrings
    Check for "-" character

  24. PRODUCT #1
    I/O registry query & parsing
    OverSight method: "enumDomainUserClients:"
    not OverSight method: "enumDomainUserClients:"
    Same:
    Query: "IOPMrootDomain/IOService/IOUserClientCreator"
    Split on " ,"

  25. PRODUCT #1
    sample process & string matching
    OverSight vs. not OverSight
    Same:
    Sample (candidate) process
    Look for "CMIOGraph::DoWork"
    Delete sample's output file

  26. PRODUCT #2
    use of lsmp & parsing
    OverSight vs. not OverSight
    Same:
    Split (lsmp) output on newlines ("\n")
    Leave if no lines (count = 0)
    Split line on "()"
    Ignore if fewer than 3 substrings

  27. PRODUCT #2
    I/O registry query & parsing
    OverSight vs. not OverSight
    Same:
    Matching Service: "IOPMrootDomain"
    Get Child Iterator: "IOService"
    Get Property Value: "IOUserClientCreator"
    Parse result, by splitting on " ,"

  28. PRODUCT #3
    use of lsmp & parsing
    OverSight vs. not OverSight
    Same:
    Invoke lsmp (-p )
    Split on new lines ("\n")

  29. PRODUCT #3
    I/O registry query & parsing
    OverSight vs. not OverSight
    Same:
    Matching Service: "IOPMrootDomain"
    Get Child Iterator: "IOService"
    Get Property Value: "IOUserClientCreator"

  30. PRODUCT #3
    sample process & string matching
    OverSight vs. not OverSight
    Same:
    Invoke sample
    Look for "CMIOGraph::DoWork" (in output from sample)
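
Before a manual side-by-side comparison like the ones on these slides, a string-level triage can hint at which binaries deserve a closer look. The Python sketch below assumes both binaries are already loaded as bytes; `shared_markers` and `MARKERS` are hypothetical names, the marker list is simply collected from strings shown in this talk, and shared strings only suggest, never prove, a shared implementation.

```python
# markers taken from the slides; treating them as a checklist is an assumption
MARKERS = (
    b"/usr/bin/lsmp",
    b"IOPMrootDomain",
    b"IOUserClientCreator",
    b"/usr/bin/sample",
    b"CMIOGraph::DoWork",
)


def shared_markers(reference: bytes, suspect: bytes) -> list:
    """List the markers present in both binaries, in checklist order."""
    return [m.decode() for m in MARKERS if m in reference and m in suspect]


# synthetic stand-ins for two binaries' contents
reference = b"\x00/usr/bin/lsmp\x00IOPMrootDomain\x00CMIOGraph::DoWork\x00"
suspect = b"\x00CMIOGraph::DoWork\x00junk\x00/usr/bin/lsmp\x00"
print(shared_markers(reference, suspect))
```

Strings that are obfuscated or stored in another encoding will be missed, so an empty result rules nothing out; confirming equivalence still requires comparing the disassembly.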

  31. Resolutions
    and now what?

  32. APPROACH
    to resolve the matter
    Define your goals (Money? Talks? Fixes?)
    Create proof
    Reach out directly: professionally, + your proof, + goals for resolution
    Consult a lawyer (eff.org): optional, but recommended!

  33. REASONABLE RESOLUTIONS
    ...what corporations (generally) want
    An amicable resolution
    A license agreement (covering them legally)
    Non-disparagement ("optics")
    ...and will pay!
    In the majority of cases, the infringement is the work of a single (naive?) developer rather than the malice of the entire corporation.

  34. WIN-WIN RESOLUTION
    ack'd + code removal & financial compensation
    Acknowledgement
    Code Removal
    Retroactive license (w/ compensation)

  35. ANOTHER WIN-WIN RESOLUTION
    ack'd + code removal & financial compensation
    Acknowledgement
    Code Removal
    Financial compensation

  36. Conclusions
    +takeaways

  37. FOR DEVELOPERS
    Assume your code will be stolen (regardless of whether it's closed or open source).
    Hunt, confirm, approach ...and resolve (+recover reparations)

  38. FOR CORPORATIONS
    Educate your employees (developers)
    tl;dr don't steal other people's algorithms for commercial gain
    Implement internal procedures (scans, etc.) ("Is this original code? Or where did you get it from?")
    If approached, work to amicably resolve any issues (legal issues, "optics")

  39. MORE ON PROGRAM ANALYSIS
    ...topics of reversing, debugging, & malware
    Free: taomm.org

    For sale: amazon, etc...

  40. MAHALO!
    "Friends of Objective-See"
    Guardian Mobile Firewall
    SmugMug iVerify Halo Privacy

  41. RESOURCES:
    "Zoom Zero Day: 4+ Million Webcams & maybe an RCE?"
    https://infosecwriteups.com/zoom-zero-day-4-million-webcams-maybe-an-rce-just-get-them-to-visit-your-website-ac75c83f4ef5

    "OverSight: Exposing Spies on macOS"
    https://speakerdeck.com/patrickwardle/hack-in-the-box-2017-oversight-exposing-spies-on-macos

    "Zero-Day TCC bypass discovered in XCSSET malware"
    https://www.jamf.com/blog/zero-day-tcc-bypass-discovered-in-xcsset-malware/

    Electronic Frontier Foundation
    https://eff.org