

Proof of Inference: Verifying the Integrity of AI Predictions

Michele Dallachiesa

September 14, 2024

Transcript

  1. Proof of Inference: Verifying the Integrity of AI Predictions
     • Michele Dallachiesa - Build & Derisk your ML/AI applications
     • Warden Labs - Blockchain infrastructure for safe AI
  2. Volkswagen emissions scandal (2015)
     • Deliberately modified emissions software to cheat regulatory tests, reducing emissions during tests while exceeding legal limits in real driving conditions
     • $33.3 billion in fines, penalties, settlements and buyback costs
     Source: https://en.wikipedia.org/wiki/Volkswagen_emissions_scandal
  3. CMA investigation into Ticketmaster over Oasis concert sales (2024)
     • Ticketmaster's failure to inform Oasis fans of dynamic pricing
     • 2.2x increase in revenues, from $200 million to $450 million
     • Investigation launched by the CMA (UK's Competition and Markets Authority)
     Source: https://www.gov.uk/government/news/cma-launches-investigation-into-ticketmaster-over-oasis-concert-sales
  4. GenAI - Cheating on text summarisation (near future)
     • Monthly cost to summarise meeting notes at fireflies.ai (hypothetical!)
     • Financial incentive to reduce computational costs
     Sources: https://aws.amazon.com/bedrock/pricing | https://openai.com/api/pricing/ | https://www.youtube.com/watch?v=YCKVxXrcZ-0
  5. ML on Trusted Execution Environments (TEEML)
     • Data confidentiality blocks external entities from reading data
     • Code integrity prevents unauthorized code changes
  6. 2024.08.26 - Root Provisioning Key and Root Sealing Key compromised on Intel SGX
     • 20-30% time overhead
     • Requires specialized HW
     • Not all AI models supported
     Source: https://news.ycombinator.com/item?id=41359152
  7. Zero-Knowledge Machine Learning (ZKML)
     • Private inference by proving model predictions without revealing model or input data
     • Ensures correctness of ML outputs without exposing underlying computations
     Prover: demonstrates knowledge of a secret without revealing it
     Verifier: confirms the proof's validity without learning the secret
  8. Image folded and concealed behind a sheet with a hole, keeping the precise location hidden
     Prover: demonstrates knowledge of a secret without revealing it
     Verifier: confirms the proof's validity without learning the secret
     Source: https://www.youtube.com/watch?v=fOGdb1CTu5c
  9. • 1000x slower and more expensive
     • Long setup time
       ◦ Model adaptation
       ◦ Compilation to ZK circuit
     • No support for all ML/AI models (Gb)
  10. Hashing functions
      • Five unique messages M0, M1, M2, M3, M4
      • Each message is hashed using hash functions H1, H2
      • H1, H2 map data of arbitrary size to fixed-size values
      Source: https://en.wikipedia.org/wiki/Hash_function
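
A minimal sketch of the idea on this slide, assuming two concrete hash functions (SHA-256 and BLAKE2b, both truncated to 64 bits; the slide itself does not say what H1 and H2 are):

```python
import hashlib

def h1(data: bytes) -> int:
    # Assumed H1: SHA-256 truncated to 64 bits
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def h2(data: bytes) -> int:
    # Assumed H2: BLAKE2b truncated to 64 bits
    return int.from_bytes(hashlib.blake2b(data).digest()[:8], "big")

# Five unique messages M0 .. M4, each mapped to fixed-size values by H1 and H2
for m in [f"M{i}".encode() for i in range(5)]:
    print(m, hex(h1(m)), hex(h2(m)))
```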
  11. Bloom filters
      • Space-efficient, probabilistic data structure to test set membership
      • Example: set {x, y, z}
        ◦ Colored arrows show the bit positions for each set element
        ◦ Element w is not in the set because it hashes to at least one zero bit
      • H1, H2, H3 hash functions
      Source: https://en.wikipedia.org/wiki/Bloom_filter
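
A small, self-contained Bloom filter sketch (toy size and hash count of my own choosing, not the slide's):

```python
import hashlib

class BloomFilter:
    def __init__(self, size: int, num_hashes: int):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [0] * size

    def _positions(self, item: bytes):
        # Derive k bit positions by salting a single hash function with the index
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: bytes) -> bool:
        # All positions set -> "maybe present"; any zero bit -> definitely absent
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter(size=64, num_hashes=3)
for element in (b"x", b"y", b"z"):
    bf.add(element)
print(b"x" in bf)  # True
print(b"w" in bf)  # False, barring a false positive
```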
  12. Computational pipelines
      • A computational pipeline transforms input state s into output state e, passing through states a1 … b3
      • The flow can be sequential or parallel
      • Computing a3 doesn't require b1
      • Computing b3 requires first computing b2 and b1
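
A toy rendering of such a pipeline, assuming two independent branches a1→a2→a3 and b1→b2→b3 (the concrete operations are made up for illustration):

```python
def run_pipeline(s: int) -> dict:
    # Branch A and branch B are independent and could run in parallel
    a1 = s + 1
    a2 = a1 * 2
    a3 = a2 - 3          # a3 never needs b1
    b1 = s * 10
    b2 = b1 + 5
    b3 = b2 // 2         # b3 needs b2, which needs b1
    e = a3 + b3          # the output state depends on both branches
    return {"s": s, "a1": a1, "a2": a2, "a3": a3,
            "b1": b1, "b2": b2, "b3": b3, "e": e}

print(run_pipeline(7))
```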
  13. Hashing computation pipelines with Bloom filters
      • In ML/AI pipelines, states are inputs, model weights, intermediate results, outputs, …
      • The Bloom filter acts as a "computation certificate"
      • It answers questions like "Was a3 reached?"
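
Combining the two sketches above (run_pipeline and BloomFilter), a solver could publish a Bloom filter over hashes of every state it computed; the "name=value" state encoding below is an assumption for illustration:

```python
# Build a "computation certificate" from the toy pipeline run
states = run_pipeline(7)
certificate = BloomFilter(size=256, num_hashes=3)
for name, value in states.items():
    certificate.add(f"{name}={value}".encode())

# "Was a3 reached?" -- recompute the expected state and test membership
expected_a3 = (7 + 1) * 2 - 3
print(f"a3={expected_a3}".encode() in certificate)  # True if the solver really computed a3
```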
  14. Lazy Solver
      • How can I construct a Bloom filter that always returns "found" for any input state?
      • How can I figure out which predictions the Validator will verify?
  15. Attacking and protecting Bloom filters
      Attack: it is easy to fabricate "full" Bloom filters that always return "found" - just fill them with ones
      Protect:
        1. Estimate the expected false positive rate from the Bloom filter size and insertion count
        2. If the actual false positive rate exceeds the expected rate, always return Failed
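
A hedged sketch of that defence, using the textbook false-positive estimate (1 - e^(-kn/m))^k and the toy BloomFilter above; the slack factor is an assumption:

```python
import math

def expected_fpr(m: int, k: int, n: int) -> float:
    # Textbook estimate for a Bloom filter with m bits, k hash functions, n insertions
    return (1.0 - math.exp(-k * n / m)) ** k

def looks_fabricated(bf: BloomFilter, n_insertions: int, slack: float = 1.5) -> bool:
    # Compare the observed fill ratio with what n honest insertions would imply.
    # A filter stuffed with ones has a fill ratio near 1.0 and gets rejected.
    expected_fill = 1.0 - math.exp(-bf.num_hashes * n_insertions / bf.size)
    observed_fill = sum(bf.bits) / bf.size
    return observed_fill > slack * expected_fill
```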
  16. Game-theoretic guarantees on verifiable inference
      • The Verifier randomly selects inputs, keeping its computational overhead limited
      • The Solver doesn't know which predictions the Verifier will reproduce and validate
      • Works best with parallel flows, where the Verifier can bypass prior independent states
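
A minimal sketch of the spot-check, reusing the certificate and run_pipeline from the sketches above; the sampling rate and state encoding are made-up parameters:

```python
import random

def spot_check(inputs, certificate, recompute, sample_rate: float = 0.1) -> bool:
    # The verifier re-runs a random subset of inputs and checks that every
    # recomputed state appears in the solver's Bloom-filter certificate.
    k = max(1, int(sample_rate * len(inputs)))
    for i in random.sample(range(len(inputs)), k):
        for name, value in recompute(inputs[i]).items():
            if f"{name}={value}".encode() not in certificate:
                return False  # the solver skipped or faked this computation
    return True

print(spot_check([7], certificate, run_pipeline))  # True for an honest solver
```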
  17. Hashing floats
      1. Apply a scaling factor
      2. Cast to numpy.int64
      Minor variations are due to floating-point precision limitations and differences in the execution order of arithmetic operations
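
A sketch of the scale-then-cast quantization before hashing (the scale of 1e4 is an arbitrary choice for illustration):

```python
import hashlib
import numpy as np

def hash_floats(values: np.ndarray, scale: float = 1e4) -> bytes:
    # Quantize so that numerical noise below 1/scale hashes to the same value
    quantized = np.round(values * scale).astype(np.int64)
    return hashlib.sha256(quantized.tobytes()).digest()

a = np.array([0.1 + 0.2, 1.0000001])   # results from one execution order
b = np.array([0.3, 1.0])               # "the same" results from another
print(hash_floats(a) == hash_floats(b))  # True: tiny differences are absorbed
```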
  18. Semantic hashing
      • What if scaling and casting floats is not sufficient?
      • Equally correct outputs, LLMs, text with similarities yet differences, …
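
The slide leaves the mechanism open; one common option is a locality-sensitive hash over output embeddings, so that semantically close outputs collide. A sketch under that assumption (the random-hyperplane SimHash and the dimensions are my choices, not the talk's):

```python
import numpy as np

def simhash(embedding: np.ndarray, planes: np.ndarray) -> int:
    # Random-hyperplane LSH: similar embeddings agree on most sign bits
    bits = (planes @ embedding) > 0
    return int("".join("1" if b else "0" for b in bits), 2)

rng = np.random.default_rng(0)
planes = rng.normal(size=(64, 384))          # 64-bit hash over 384-dim embeddings
emb_a = rng.normal(size=384)                 # embedding of one correct output
emb_b = emb_a + 0.01 * rng.normal(size=384)  # a slightly different, equally correct output
distance = bin(simhash(emb_a, planes) ^ simhash(emb_b, planes)).count("1")
print(distance)  # small Hamming distance -> treated as a match
```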
  19. Conclusion
      • SPEX supports all ML/AI models and data pipelines
      • SPEX is 10-20x faster for integrating new models
      • SPEX is 1000x faster and cheaper than ZKML
      • SPEX is 20% faster and cheaper than TEEML
      • SPEX provides no privacy for model or data
      • SPEX offers game-theoretic probabilistic guarantees
      • SPEX has no dependency on a circuit/VM or specialized HW
  20. Thank You!
      • Michele Dallachiesa - Build & Derisk your ML/AI applications
      • Warden Labs - Blockchain infrastructure for safe AI
      [email protected]
  21. References
      • Warden Labs, https://wardenprotocol.org
      • “When Bloom filters don't bloom”, https://blog.cloudflare.com/when-bloom-filters-dont-bloom/
      • “Proof of Sampling: A Nash Equilibrium-Secured Verification Protocol for Decentralized Systems”, Hyperbolic Labs
      • “Atoma Network Whitepaper”, Atoma
      • “opML: Optimistic Machine Learning on Blockchain”, Hyper Oracle
      • “Proof-of-Learning: Definitions and Practice”, University of Toronto / Vector Institute / University of Wisconsin-Madison
      • “Experimenting with Zero-Knowledge Proofs of Training”, University of California, Berkeley / Meta AI / NTT Research / University of Wisconsin, Madison
      • “ZKML: An Optimizing System for ML Inference in Zero-Knowledge Proofs”, UIUC / UC Berkeley / Stanford University
      • “Freakonomics: A Rogue Economist Explores the Hidden Side of Everything”, https://en.wikipedia.org/wiki/Freakonomics
  22. Artificial Intelligence Blockchain Interface (AIBI)
      • SPEX on blockchain
      • Coordination layer
      • No single point of failure
      • Consensus
      • Auditability
      • Explainability
      • Transparency
      • Privacy (MPC over SPEX)
  23. GenAI - Cheating on RAGs and embeddings (simulation)
      • Cost to embed 2.5 million arXiv papers
      • Financial incentive to use cheaper models
      Sources: https://aws.amazon.com/bedrock/pricing | https://openai.com/api/pricing/