Operating Operator

Jun Kokatsu @shhnjk

What is Operator?
Operator is a Computer-Using Agent (CUA) by OpenAI.

Operator Demo https://youtu.be/CSE77wAdDLg?t=162

Prompt Injection - Major threat to AI agents
• AI agents (and LLMs in general) follow instructions in data sources fed to them.
• For Operator, attacker-controlled data are mainly the URL and the screenshot.
• To mitigate Prompt Injection and other risks such as misaligned actions, Operator has (at least) 3 mitigations in place.

Safety/Security mitigations in Operator
1. Malicious instruction detection: evaluates the screenshot image and checks if it contains adversarial content that may change the model's behavior.
2. Irrelevant domain detection: evaluates the current_url and checks if the current domain is considered relevant given the conversation history.
3. Sensitive domain detection: checks the current_url and raises a warning when it detects the user is on a sensitive domain.
Reference: https://platform.openai.com/docs/guides/tools-computer-use#acknowledge-safety-checks
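For API clients, the referenced guide describes surfacing these checks to a human before the agent continues. A minimal sketch of that step follows. The field names (`pending_safety_checks`, `acknowledged_safety_checks`, `computer_call_output`, `call_id`) are assumptions based on our reading of that guide and should be verified against it. `buildCallOutput`, `confirmWithUser`, and the opaque `screenshotOutput` payload are hypothetical.

```javascript
// Hypothetical helper: turns a `computer_call` item into the follow-up
// `computer_call_output` item, acknowledging safety checks only after a
// human has confirmed each one. Field names are assumptions taken from the
// referenced guide, not a verified API contract.
function buildCallOutput(computerCall, screenshotOutput, confirmWithUser) {
  const pending = computerCall.pending_safety_checks ?? [];

  // Never auto-acknowledge: every pending check goes to the human first.
  for (const check of pending) {
    if (!confirmWithUser(check)) {
      throw new Error(`Safety check declined: ${check.code}`);
    }
  }

  return {
    type: "computer_call_output",
    call_id: computerCall.call_id,
    acknowledged_safety_checks: pending,
    // Screenshot payload in whatever shape the guide specifies.
    output: screenshotOutput,
  };
}
```

Throwing on a declined check keeps the agent loop from continuing silently, which is the point of the acknowledgement step.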

Restrictions in Operator’s browser
The following features are disabled by Operator’s Chrome enterprise policy.
• Download of active content (e.g. HTML, executables, etc.).
• Chrome DevTools access.
• Additional Chrome extension installation.
• Navigation to certain URLs (e.g. javascript:, devtools:, most/all chrome:, etc.).
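Operator's actual policy file is not public, so the following is only a sketch of how documented Chrome enterprise policies could express three of these restrictions. The values are our assumption, not Operator's configuration. The download restriction is left out because the talk does not say which setting is used.

```javascript
// Illustrative only: NOT Operator's real policy. Policy names are documented
// Chrome enterprise policies; the chosen values are assumptions.
const illustrativePolicy = {
  // 2 = developer tools disallowed.
  DeveloperToolsAvailability: 2,
  // "*" blocks installation of every extension not explicitly allowlisted.
  ExtensionInstallBlocklist: ["*"],
  // Block navigation to script and internal schemes.
  URLBlocklist: ["javascript://*", "devtools://*", "chrome://*"],
};

// Managed policies are deployed as JSON, so serialise the object.
const policyJson = JSON.stringify(illustrativePolicy, null, 2);
console.log(policyJson);
```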

Finding bugs in Operator
• Session cookies are persisted across tasks in Operator, so users stay authenticated to their frequently used websites.
• Common exploitations of Prompt Injection are rogue actions and exfiltration.
• But exploits need to evade all 3 mitigations in Operator.
• Or find attacks outside of the mitigations’ threat model :)

Thinking deeply about browsing…
When we want to read news on the Web (as an example), we sometimes have to perform unrelated tasks, such as:
• Solving a CAPTCHA.
• Allowing or denying a cookie popup.
• Dismissing promotion/subscription/notification popups.

Prompt Inception
• An attacker presents one or more sub-tasks to an agent.
• While these sub-tasks are required or relevant to the main task, performing the crafted sub-tasks would result in rogue actions or exfiltration.
• Let’s dive into real examples of this attack!

Sensitive cross-origin iframes
Embeddable cross-origin resources (without X-Frame-Options) can sometimes contain secrets. Typical reasons they are left embeddable:
• The page is read-only, and there is no threat of clickjacking (e.g. API endpoints).
• The page has implemented other forms of mitigation against clickjacking, such as the Intersection Observer API.
Operator is not restricted by the Same-Origin Policy (SOP): it works from screenshots, so it can see such cross-origin resources.
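To make "embeddable" concrete, here is a rough sketch of checking response headers for framing restrictions. `isFrameableByAnyone` is a hypothetical helper. As a simplification it treats any CSP `frame-ancestors` directive as a restriction, although a permissive source list such as `*` would still allow embedding.

```javascript
// Returns true when the response headers place no framing restriction,
// i.e. any origin may embed the resource in an iframe.
function isFrameableByAnyone(headers) {
  // Header names are case-insensitive, so normalise them first.
  const h = {};
  for (const [name, value] of Object.entries(headers)) {
    h[name.toLowerCase()] = String(value);
  }

  // X-Frame-Options: DENY and SAMEORIGIN both block cross-origin framing.
  const xfo = (h["x-frame-options"] ?? "").trim().toUpperCase();
  if (xfo === "DENY" || xfo === "SAMEORIGIN") return false;

  // Simplification: any frame-ancestors directive counts as a restriction.
  const csp = h["content-security-policy"] ?? "";
  const hasFrameAncestors = csp
    .split(";")
    .some((d) => d.trim().toLowerCase().startsWith("frame-ancestors"));
  return !hasFrameAncestors;
}
```

A resource for which this returns true can be rendered inside an attacker's page, and therefore ends up in the screenshot an agent sees.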

Google One Tap
• One such example is Google One Tap.
• It shows the user’s name and email address inside an accounts.google.com iframe (when set up without FedCM).

Data exfiltration from cross-origin iframes
• Crafted a CAPTCHA-like page that shows only the email address portion of the Google One Tap iframe.
• Operator successfully(?) solved the sub-task!

Detecting Operator from a website
We only want to show crafted pages to Operator, and not to a user.
• Operator’s browser comes with an unpublished Chrome extension installed by default.
• locale.js is exposed as a web accessible resource to all sites.
• We can use the onload event in a script tag to detect Operator.
<script src="chrome-extension://kcdongibgcplmaagnmgpjhpjgmmaaaaa/locale.js" onload="operatorDetected()"></script>
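The same probe can be written in script form. This is a sketch, not the code used in the talk: `probeWebAccessibleResource` is a hypothetical helper, the document object is passed in so the function can be exercised outside a browser, and the timeout value is an arbitrary assumption.

```javascript
// Resolves to true if the script at `url` loads, false if it errors or
// does not load within `timeoutMs`.
function probeWebAccessibleResource(doc, url, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const script = doc.createElement("script");
    const timer = setTimeout(() => resolve(false), timeoutMs);

    // onload fires only when the extension exists and exposes the file.
    script.onload = () => {
      clearTimeout(timer);
      resolve(true);
    };
    // onerror fires when the extension is absent or the file is not exposed.
    script.onerror = () => {
      clearTimeout(timer);
      resolve(false);
    };

    script.src = url;
    doc.head.appendChild(script);
  });
}
```

In a browser this would be called as `probeWebAccessibleResource(document, "chrome-extension://kcdongibgcplmaagnmgpjhpjgmmaaaaa/locale.js")`, using the extension ID from the slide.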

Demo & Details
Video: https://youtu.be/wDVIvoaGZRQ
Details: https://github.com/google/security-research/security/advisories/GHSA-5289-qv3f-x67g

Sensitive cross-origin URLs
Cross-origin URLs sometimes contain sensitive information, such as:
• OAuth code in the URL parameter during an OAuth flow.
• Profile redirection URL such as facebook.com/me.
How can we steal post-redirect cross-origin URLs through Operator? 🤔
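As a small illustration of the first case, the sketch below flags URLs that carry OAuth-style secrets. `findSensitiveParams` is a hypothetical helper and the parameter list (`code`, `access_token`, `id_token`) is an illustrative assumption, not an exhaustive one.

```javascript
// Parameter names that commonly carry secrets in OAuth/OIDC redirects.
const SENSITIVE_PARAMS = new Set(["code", "access_token", "id_token"]);

// Returns the names of sensitive parameters present in the URL.
function findSensitiveParams(urlString) {
  const url = new URL(urlString);
  const found = [];
  // The authorization code flow uses the query string; the implicit
  // flow puts tokens in the fragment, so both are inspected.
  const sources = [url.searchParams, new URLSearchParams(url.hash.slice(1))];
  for (const params of sources) {
    for (const name of params.keys()) {
      if (SENSITIVE_PARAMS.has(name) && !found.includes(name)) {
        found.push(name);
      }
    }
  }
  return found;
}
```

A URL for which this returns a non-empty list is exactly the kind of value that should not be copied into a third-party page.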

Exfiltration of cross-origin URLs
Craft a page with a link that:
1. Redirects to an OAuth flow which stops when the OAuth code is present in the URL.
2. (When Operator returns to the main page) Tells Operator that there was an error and asks it to report the error by sharing the URL.

Demo & Details
Video: https://youtu.be/i9zbeiw-gTo
Details: https://github.com/google/security-research/security/advisories/GHSA-25j5-vvch-9rf3

Can we abuse a browser feature?
Browsers are built to be used by humans, not AI agents. Some critical decisions are delegated to humans, such as:
• Permission prompts.
• Fullscreen mode notification.
• etc.
Can we craft a page to abuse these features?

Fullscreen mode notification
• Any website can use the Fullscreen API to trigger fullscreen mode with a user interaction (e.g. click).
• When a website enters fullscreen mode, a notification will appear for 5 seconds.
• Operator actually notices this and exits fullscreen mode!!
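For reference, a minimal sketch of the standard Fullscreen API calls involved. `enableFullscreenOnClick` and `exitIfFullscreen` are hypothetical helper names. The second one only shows the API for leaving fullscreen and is not a claim about how Operator exits it.

```javascript
// Enters fullscreen when `button` is clicked. Browsers only honour
// requestFullscreen() during a user interaction such as this click.
function enableFullscreenOnClick(button, target) {
  button.addEventListener("click", () => {
    // The returned promise rejects when the browser refuses the request.
    target.requestFullscreen().catch((err) => {
      console.warn("Fullscreen request refused:", err);
    });
  });
}

// Leaves fullscreen if the document is currently in it.
function exitIfFullscreen(doc) {
  return doc.fullscreenElement ? doc.exitFullscreen() : Promise.resolve();
}
```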

Misdirection to the rescue
• When the fullscreen notification appears on the screen, a malicious site can show a more attention-grabbing popup (e.g. a cookie consent dialog).
• When this happens, Operator focuses on closing popups and forgets about the fullscreen notification.
• A crafted page can show a fake browser once it has entered fullscreen mode, and Operator will act inside the fake browser thereafter (within the same conversation).
◦ E.g. an attacker can show the login screen of an arbitrary site, and a user won’t be able to tell it’s a fake website because everything looks legit.
• This technique is called Misdirection in magic.

Demo & Details
Video: https://youtu.be/vc8O5MylUUE
Details: https://github.com/google/security-research/security/advisories/GHSA-mmgx-755h-wr74

Conclusion
• As AI agents become more capable and personalized, the nature of tasks assigned to AI agents will become more complex and vague.
◦ This will open more avenues for Prompt Inception in the future.
• Users will demand more autonomy and fewer confirmations.
◦ This might look doable from the perspective of evals, but we can only evaluate risks we know about.
• The world is built around humans, not AI agents. There may be consequences of increasing the autonomy of AI agents that we don’t realize until they are deployed.
