Slide 1

Slide 1 text

Proof of Inference: Verifying the Integrity of AI Predictions
● Michele Dallachiesa - Build & Derisk your ML/AI applications
● Warden Labs - Blockchain infrastructure for safe AI

Slide 2

Slide 2 text

“Everyone cheats if the incentives are right.” — The first rule of Freakonomics

Slide 3

Slide 3 text

Volkswagen emissions scandal (2015)
● Deliberately modified emissions software to cheat regulatory tests, reducing emissions during tests while exceeding legal limits in real driving conditions
● $33.3 billion in fines, penalties, settlements, and buyback costs
Source: https://en.wikipedia.org/wiki/Volkswagen_emissions_scandal

Slide 4

Slide 4 text

CMA investigation into Ticketmaster over Oasis concert sales (2024)
● Ticketmaster's failure to inform Oasis fans of dynamic pricing
● 2.2x increase in revenues, from $200 million to $450 million
● Investigation by the CMA (UK's Competition and Markets Authority)
Source: https://www.gov.uk/government/news/cma-launches-investigation-into-ticketmaster-over-oasis-concert-sales

Slide 5

Slide 5 text

GenAI - Cheating on text summarisation (near future)
● Monthly cost to summarise meeting notes at fireflies.ai*
● Financial incentive to reduce computational costs
*Hypothetical!
Source: https://aws.amazon.com/bedrock/pricing | https://openai.com/api/pricing/ | https://www.youtube.com/watch?v=YCKVxXrcZ-0

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

How can we enforce transparency for ML/AI inference workloads?
● TEEML
● ZKML
● SPEX (our proposal)

Slide 8

Slide 8 text

ML on Trusted Execution Environments (TEEML)
● Data confidentiality blocks external entities from reading data
● Code integrity prevents unauthorized code changes

Slide 9

Slide 9 text

2024.08.26 - Root Provisioning Key and Root Sealing Key compromised on Intel SGX
● 20-30% time overhead
● Requires specialized HW
● Not all AI models supported
Source: https://news.ycombinator.com/item?id=41359152

Slide 10

Slide 10 text

Zero-Knowledge Machine Learning (ZKML)
● Private inference by proving model predictions without revealing model or input data
● Ensures correctness of ML outputs without exposing underlying computations
Prover: demonstrates knowledge of a secret without revealing it
Verifier: confirms the proof's validity without learning the secret

Slide 11

Slide 11 text

I will show you a picture. Where is Stephen Hawking?

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

You can prove to me that you know its location without disclosing it…

Slide 14

Slide 14 text

The image is folded and concealed behind a sheet with a hole, keeping the precise location hidden.
Prover: demonstrates knowledge of a secret without revealing it
Verifier: confirms the proof's validity without learning the secret
Source: https://www.youtube.com/watch?v=fOGdb1CTu5c

Slide 15

Slide 15 text

● 1000x slower and more expensive
● Long setup time
  ○ Model adaptation
  ○ Compilation to ZK circuit
● Not all ML/AI models supported (Gb)

Slide 16

Slide 16 text

Statistical Proof of Execution (SPEX)

Slide 17

Slide 17 text

Hashing functions
● Five unique messages M0, M1, M2, M3, M4
● Each message is hashed using hash functions H1, H2
● H1, H2 map data of arbitrary size to fixed-size values
Source: https://en.wikipedia.org/wiki/Hash_function
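A minimal sketch of the idea in Python, deriving two fixed-size hash functions H1 and H2 from SHA-256 with distinct prefixes (the deck does not specify which hash functions are used; the prefix trick is an illustrative choice):

```python
import hashlib

def h1(message: bytes) -> str:
    # H1: SHA-256 with a "H1:" domain-separation prefix.
    return hashlib.sha256(b"H1:" + message).hexdigest()

def h2(message: bytes) -> str:
    # H2: SHA-256 with a different prefix, giving an independent-looking hash.
    return hashlib.sha256(b"H2:" + message).hexdigest()

messages = [b"M0", b"M1", b"M2", b"M3", b"M4"]
for m in messages:
    # Each digest has a fixed size (256 bits) regardless of input length.
    print(m.decode(), h1(m)[:16], h2(m)[:16])
```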

Slide 18

Slide 18 text

Bloom filters
● Space-efficient, probabilistic data structure to test set membership
● Example: set {x, y, z} with hash functions H1, H2, H3
  ○ Colored arrows show the bit positions for each set element
  ○ Element w is not in the set because it hashes to at least one zero bit
Source: https://en.wikipedia.org/wiki/Bloom_filter
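A toy Bloom filter in Python illustrating the {x, y, z} example; the filter size, the number of hash functions, and the SHA-256-based hashing below are illustrative choices, not parameters from the deck:

```python
import hashlib

class BloomFilter:
    """Space-efficient probabilistic set-membership test (sketch)."""

    def __init__(self, m: int = 64, k: int = 3):
        self.m = m            # number of bits
        self.k = k            # number of hash functions (H1..Hk)
        self.bits = [0] * m

    def _positions(self, item: bytes):
        # Derive k hash functions from SHA-256 with distinct one-byte salts.
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: bytes) -> bool:
        # "Maybe present" if all k positions are set; definitely absent otherwise.
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
for element in (b"x", b"y", b"z"):
    bf.add(element)

print(b"x" in bf)  # True
print(b"w" in bf)  # False (w hits at least one zero bit), barring a false positive
```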

Slide 19

Slide 19 text

Computational pipelines
● A computational pipeline transforms input state s into output state e, passing through states a1 … b3
● Flow can be sequential or parallel
● Computing a3 doesn't require b1
● Computing b3 requires first computing b2 and b1
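The dependency structure can be sketched as a small graph; the state names below follow the slide, while the exact graph shape is an assumption:

```python
# Toy dependency graph: input state s, two branches a1->a2->a3 and
# b1->b2->b3 that can run in parallel, merging into output state e.
DEPENDENCIES = {
    "s": [],
    "a1": ["s"], "a2": ["a1"], "a3": ["a2"],  # branch a: sequential
    "b1": ["s"], "b2": ["b1"], "b3": ["b2"],  # branch b: sequential
    "e": ["a3", "b3"],                         # branches join at the output
}

def required_states(target: str) -> set:
    """All states that must be computed before `target` is reachable."""
    needed, stack = set(), [target]
    while stack:
        for dep in DEPENDENCIES[stack.pop()]:
            if dep not in needed:
                needed.add(dep)
                stack.append(dep)
    return needed

print(sorted(required_states("a3")))  # ['a1', 'a2', 's'] -- no b-states needed
print(sorted(required_states("b3")))  # ['b1', 'b2', 's'] -- b1, b2 required first
```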

Slide 20

Slide 20 text

ML/AI computational pipelines with parallel and sequential flows

Slide 21

Slide 21 text

Hashing computation pipelines with Bloom filters
● In ML/AI pipelines, states are inputs, model weights, intermediate results, outputs, …
● Bloom filter as “computation certificate”
● Answers questions like “Was a3 reached?”
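A self-contained sketch of a Bloom filter used as a computation certificate, assuming states are fingerprinted by hashing the state name together with its value (the deck does not show SPEX's actual serialization; the filter parameters are illustrative):

```python
import hashlib

M, K = 1024, 3  # filter size in bits, number of hash functions

def positions(item: bytes):
    # Derive K bit positions from SHA-256 with distinct one-byte salts.
    for i in range(K):
        d = hashlib.sha256(bytes([i]) + item).digest()
        yield int.from_bytes(d[:8], "big") % M

def fingerprint(name: str, value: bytes) -> bytes:
    # Bind the state's name to its content so distinct states can't be confused.
    return hashlib.sha256(name.encode() + b"|" + value).digest()

certificate = 0  # Bloom filter stored as an integer bitmask
reached = {"s": b"input", "a1": b"h1", "a2": b"h2", "a3": b"partial-output"}
for name, value in reached.items():
    for p in positions(fingerprint(name, value)):
        certificate |= 1 << p

def maybe_reached(name: str, value: bytes) -> bool:
    return all(certificate >> p & 1 for p in positions(fingerprint(name, value)))

# Answering "Was a3 reached (with this exact value)?"
print(maybe_reached("a3", b"partial-output"))  # True
print(maybe_reached("b1", b"anything"))        # False (barring a false positive)
```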

Slide 22

Slide 22 text

Protocol

Slide 23

Slide 23 text

Example: Batch inference (Solver/Prover)
The y5 prediction depends only on the x5 input

Slide 24

Slide 24 text

Example: Batch inference (Verifier)
Verification requires a partial rerun
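Putting the two roles together, a hedged end-to-end sketch: the Solver/Prover runs the full batch and commits a Bloom filter certificate, and the Verifier reruns only a small random sample. The linear stand-in model, filter parameters, and fingerprint format are all illustrative assumptions, not SPEX's actual formats:

```python
import hashlib
import random

M, K = 4096, 4  # Bloom filter size in bits, number of hash functions

def positions(item: bytes):
    for i in range(K):
        d = hashlib.sha256(bytes([i]) + item).digest()
        yield int.from_bytes(d[:8], "big") % M

def fingerprint(x: float, y: float) -> bytes:
    return hashlib.sha256(f"{x:.6f}->{y:.6f}".encode()).digest()

def model(x: float) -> float:
    # Stand-in for the real model; y_i depends only on x_i, so the
    # Verifier can recompute any single prediction independently.
    return 2.0 * x + 1.0

# --- Solver/Prover: run the full batch, commit a Bloom filter certificate.
inputs = [float(i) for i in range(100)]
certificate = 0
for x in inputs:
    for p in positions(fingerprint(x, model(x))):
        certificate |= 1 << p

# --- Verifier: partial rerun on a small random sample of inputs.
for x in random.sample(inputs, 5):
    y = model(x)  # recompute the prediction from scratch
    ok = all(certificate >> p & 1 for p in positions(fingerprint(x, y)))
    print(f"x={x}: {'verified' if ok else 'FAILED'}")
```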

Slide 25

Slide 25 text

Lazy Solver
● How can I construct a Bloom filter that always returns “found” for any input state?
● How can I figure out which predictions the Verifier will check?

Slide 26

Slide 26 text

Attacking and protecting Bloom filters
Attack: easy to fabricate “full” Bloom filters that always return “found” - just fill them with ones
Protect:
1. Estimate the expected false positive rate from the Bloom filter size and insertion count
2. If the actual false positive rate exceeds the expected rate, always return Failed
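The protection step can be sketched with the standard fill-rate estimate 1 - e^(-kn/m) for a filter of m bits after n insertions with k hash functions; the slack threshold below is an assumed parameter, not one from the deck:

```python
import math
import random

def expected_fill(m: int, n: int, k: int) -> float:
    """Expected fraction of set bits after n insertions with k hash functions."""
    return 1.0 - math.exp(-k * n / m)

def check_certificate(bits: list, n_claimed: int, k: int, slack: float = 0.05) -> str:
    m = len(bits)
    actual = sum(bits) / m
    # A filter deliberately saturated with ones answers "found" for everything;
    # reject it if the fill rate is far above what n_claimed insertions explain.
    return "Failed" if actual > expected_fill(m, n_claimed, k) + slack else "OK"

# An honest filter (simulated by setting k*n random bits) vs. an all-ones filter.
random.seed(0)
honest = [0] * 1024
for _ in range(100 * 3):  # n=100 insertions, k=3 hash functions
    honest[random.randrange(1024)] = 1
print(check_certificate(honest, n_claimed=100, k=3))      # OK
print(check_certificate([1] * 1024, n_claimed=100, k=3))  # Failed
```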

Slide 27

Slide 27 text

Game-theoretic guarantees on verifiable inference
● Verifier randomly selects inputs to check, with limited computational overhead
● Solver doesn’t know which predictions the Verifier will reproduce and validate
● Works best with parallel flows, where the Verifier can bypass prior independent states
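A back-of-the-envelope illustration of why random sampling deters cheating, assuming the Verifier samples s predictions uniformly at random and the Solver faked a fraction f of them (the numbers are illustrative, not from the deck):

```python
def detection_probability(f: float, s: int) -> float:
    # Probability that at least one of s uniformly sampled predictions is fake.
    return 1.0 - (1.0 - f) ** s

for s in (1, 5, 10, 20):
    print(s, round(detection_probability(0.10, s), 3))
# Faking 10% of predictions gives a ~65% chance of being caught with just
# 10 samples, so the rational Solver computes everything honestly.
```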

Slide 28

Slide 28 text

Hashing floats
1. Apply a scaling factor
2. Cast to numpy.int64
This absorbs minor variations due to floating-point precision limitations and differences in the execution order of arithmetic operations
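A minimal numpy sketch of the two steps; the scaling factor of 1e6 and the rounding before the cast are assumptions (rounding avoids truncation flips right at quantization boundaries):

```python
import numpy as np

def quantize_for_hashing(values: np.ndarray, scale: float = 1e6) -> np.ndarray:
    """Map floats to integers so near-equal values hash identically.

    1. Apply a scaling factor (here 1e6, i.e. ~6 decimal digits).
    2. Cast to numpy.int64.
    """
    return np.round(values * scale).astype(np.int64)

# Two computations that differ only in floating-point summation order.
a = np.float64(0.1) + np.float64(0.2) + np.float64(0.3)
b = np.float64(0.3) + np.float64(0.2) + np.float64(0.1)
print(a == b)  # False: 0.6000000000000001 vs 0.6
print(quantize_for_hashing(np.array([a])) == quantize_for_hashing(np.array([b])))  # [ True]
```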

Slide 29

Slide 29 text

Semantic hashing
● What if scaling and casting floats is not sufficient?
● Equally correct outputs: LLMs producing text with similarities yet differences, …
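One possible scheme is a locality-sensitive hash such as SimHash, where similar texts land on hashes with small Hamming distance; the deck does not say which semantic-hashing method SPEX uses, so this is purely illustrative:

```python
import hashlib

def simhash(text: str, bits: int = 64) -> int:
    """SimHash sketch: similar texts yield hashes with small Hamming distance."""
    counts = [0] * bits
    for token in text.lower().split():
        h = int.from_bytes(hashlib.sha256(token.encode()).digest()[:8], "big")
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    # Each output bit is the majority vote of the token hashes at that position.
    return sum(1 << i for i, c in enumerate(counts) if c > 0)

def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

s1 = "the meeting covered budget planning and hiring for the next quarter"
s2 = "the meeting covered hiring and budget planning for the next quarter"
s3 = "a recipe for sourdough bread with a long cold fermentation"
print(hamming(simhash(s1), simhash(s2)))  # 0: same words in a different order
print(hamming(simhash(s1), simhash(s3)))  # large (tens of bits): unrelated texts
```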

Slide 30

Slide 30 text

Conclusion
● SPEX supports all ML/AI models and data pipelines
● SPEX integrates new models 10-20x faster
● SPEX is 1000x faster and cheaper than ZKML
● SPEX is 20% faster and cheaper than TEEML
● SPEX offers no privacy for model or data
● SPEX provides game-theoretic probabilistic guarantees
● SPEX has no dependency on ZK circuits/VMs or specialized HW

Slide 31

Slide 31 text

Thank You!
● Michele Dallachiesa - Build & Derisk your ML/AI applications
● Warden Labs - Blockchain infrastructure for safe AI
[email protected]

Slide 32

Slide 32 text

References
● Warden Labs, https://wardenprotocol.org
● “When Bloom filters don't bloom”, https://blog.cloudflare.com/when-bloom-filters-dont-bloom/
● “Proof of Sampling: A Nash Equilibrium-Secured Verification Protocol for Decentralized Systems”, Hyperbolic Labs
● “Atoma Network Whitepaper”, Atoma
● “opML: Optimistic Machine Learning on Blockchain”, Hyper Oracle
● “Proof-of-Learning: Definitions and Practice”, University of Toronto / Vector Institute / University of Wisconsin-Madison
● “Experimenting with Zero-Knowledge Proofs of Training”, University of California, Berkeley / Meta AI / NTT Research / University of Wisconsin, Madison
● “ZKML: An Optimizing System for ML Inference in Zero-Knowledge Proofs”, UIUC / UC Berkeley / Stanford University
● “Freakonomics: A Rogue Economist Explores the Hidden Side of Everything”, https://en.wikipedia.org/wiki/Freakonomics

Slide 33

Slide 33 text

Artificial Intelligence Blockchain Interface (AIBI)
● SPEX on blockchain
● Coordination layer
● No single point of failure
● Consensus
● Auditability
● Explainability
● Transparency
● Privacy (MPC over SPEX)

Slide 34

Slide 34 text

GenAI - Cheating on RAGs and embeddings (simulation)
● Cost to embed 2.5 million arXiv papers
● Financial incentive to use cheaper models
Source: https://aws.amazon.com/bedrock/pricing - https://openai.com/api/pricing/

Slide 35

Slide 35 text

No content

Slide 36

Slide 36 text

No content