
[ACL 2026 Demo] Fast-MIA: Efficient and Scalable Membership Inference for LLMs


Shotaro Ishihara

May 12, 2026

Transcript

  1. Hiromu Takahashi and Shotaro Ishihara. ACL 2026 System Demonstrations.
     Fast-MIA: Efficient and Scalable Membership Inference for LLMs
  2. Fast-MIA: Efficient and Scalable
     uv run --with vllm python main.py \
       --config config/sample.yaml
     1. High-throughput batch inference using vLLM (about 5 times faster for individual methods).
     2. Cross-method caching architecture (reduces the total processing time when benchmarking multiple methods).
     https://github.com/Nikkei/fast-mia
     [Architecture diagram: a vLLM backend runs batch inference over the LLM and fills a shared cache that is reused across methods such as LOSS, PPL/zlib, Min-K% Prob, DC-PDD, Lowercase, PAC, ReCaLL, Con-ReCall, and SaMIA.]
  3. Membership Inference Attack (MIA) on LLMs
     [Diagram: given a text, ask whether that text is included in the LLM's pre-training data.]
     • Calculate the log-likelihood, etc.
     • Various methods have been proposed.
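
A minimal sketch of the loss-based signal mentioned above, using Hugging Face Transformers rather than Fast-MIA itself; the model name and the decision threshold are placeholders, and real methods calibrate the score far more carefully:

    # Loss-based MIA score: average negative log-likelihood of a candidate text.
    # Generic illustration, not Fast-MIA's code; model name and threshold are placeholders.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "huggyllama/llama-7b"  # placeholder target model
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    model.eval()

    def loss_score(text: str) -> float:
        """Average negative log-likelihood; lower values hint at membership."""
        inputs = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            out = model(**inputs, labels=inputs["input_ids"])
        return out.loss.item()

    score = loss_score("Is this text included in the pre-training data?")
    predicted_member = score < 3.0  # threshold is model/dataset dependent (placeholder)
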
  4. Challenges in MIA on LLMs
     [Same diagram as the previous slide: is this text included in the pre-training data?]
     • Calculate the log-likelihood, etc.
     • Various methods have been proposed.
     1. Growing computational demands for individual MIA methods.
     2. Redundant computation across methods for benchmarking.
  5. We introduce Fast-MIA
     Challenges: 1. Growing computational demands for individual MIA methods. 2. Redundant computation across methods for benchmarking.
     Solutions: 1. High-throughput batch inference using vLLM. 2. Cross-method caching architecture.
     [Architecture diagram: a vLLM backend runs batch inference over the LLM and fills a shared cache that is reused across methods such as LOSS, PPL/zlib, Min-K% Prob, DC-PDD, Lowercase, PAC, ReCaLL, Con-ReCall, and SaMIA.]
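
Both ideas can be sketched in plain Python (a rough approximation, not the Fast-MIA implementation: the model name is a placeholder and the LOSS, PPL, zlib-normalized, and Min-K% Prob formulas are simplified). One batched vLLM call fills a per-token log-probability cache, and each method is then scored from that cache without querying the model again:

    # Sketch: batched prompt log-probs from vLLM plus a shared cache reused by several methods.
    import math
    import zlib
    from vllm import LLM, SamplingParams

    llm = LLM(model="huggyllama/llama-7b")  # placeholder model
    params = SamplingParams(max_tokens=1, prompt_logprobs=0)  # log-probs of the prompt tokens only

    texts = ["candidate text 1", "candidate text 2"]
    outputs = llm.generate(texts, params)  # one high-throughput batched pass

    # Shared cache: per-token log-probs of each text, computed once.
    cache = {}
    for text, out in zip(texts, outputs):
        lps = []
        for token_id, entry in zip(out.prompt_token_ids, out.prompt_logprobs):
            if entry is None:  # the first prompt token has no conditional log-prob
                continue
            lps.append(entry[token_id].logprob)
        cache[text] = lps

    # Reuse the cache for several methods without touching the LLM again.
    for text, lps in cache.items():
        loss = -sum(lps) / len(lps)                  # LOSS (average NLL)
        ppl = math.exp(loss)                         # perplexity
        zlib_score = loss / len(zlib.compress(text.encode("utf-8")))  # simplified PPL/zlib ratio
        k = max(1, int(len(lps) * 0.2))
        min_k = sum(sorted(lps)[:k]) / k             # simplified Min-K% Prob (k = 20%)
        print(text, loss, ppl, zlib_score, min_k)
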
  6. How to Use: https://github.com/Nikkei/fast-mia
     uv run --with vllm python main.py \
       --config config/sample.yaml
     model:
       model_id: "huggyllama/llama-30b"
     data:
       data_path: "swj0419/WikiMIA"
       format: "huggingface"
       text_length: 32
     methods:
       - type: "loss"
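
As a side note on the sample config, the data section points at the WikiMIA benchmark hosted on the Hugging Face Hub, which can be inspected directly with the datasets library; the split and field names below come from the dataset card and should be double-checked there:

    # Peek at the WikiMIA benchmark referenced by data_path in the sample config.
    from datasets import load_dataset

    wikimia = load_dataset("swj0419/WikiMIA", split="WikiMIA_length32")  # 32-token snippets
    example = wikimia[0]
    print(example["input"])  # candidate text
    print(example["label"])  # membership label (1 = likely seen during pre-training)
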
  7. Contributions Welcome
     uv run --with vllm python main.py \
       --config config/sample.yaml
     1. High-throughput batch inference using vLLM (about 5 times faster for individual methods).
     2. Cross-method caching architecture (reduces the total processing time when benchmarking multiple methods).
     https://github.com/Nikkei/fast-mia
     [Architecture diagram: a vLLM backend runs batch inference over the LLM and fills a shared cache that is reused across methods such as LOSS, PPL/zlib, Min-K% Prob, DC-PDD, Lowercase, PAC, ReCaLL, Con-ReCall, and SaMIA.]