Add multi-GPU vLLM CAA steering hook by linmou · Pull Request #694 · zjunlp/EasyEdit

linmou · 2026-06-19T04:06:26Z

Summary

This PR adds a lightweight CAA activation-add hook for vLLM inference, including tensor-parallel worker support.

The hook lets EasyEdit apply already-computed CAA vectors to vLLM-loaded models by installing activation additions on selected decoder layers inside each worker process.
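To make the activation-add idea concrete, here is a minimal plain-Python sketch, not the code in steer/vllm_caa_hooks.py. It assumes the hook adds `multiplier * vector` to the output of each selected layer. `FakeDecoderLayer`, `install_caa`, `clear_caa`, and `run` are hypothetical names used only for illustration, and lists of floats stand in for hidden-state tensors.

```python
from typing import Callable, Dict, List

Vector = List[float]


class FakeDecoderLayer:
    """Hypothetical stand-in for a decoder layer: scales its input, then runs post-hooks."""

    def __init__(self, scale: float) -> None:
        self.scale = scale
        self.post_hooks: List[Callable[[Vector], Vector]] = []

    def __call__(self, hidden: Vector) -> Vector:
        out = [self.scale * x for x in hidden]
        for hook in self.post_hooks:
            out = hook(out)
        return out


def make_caa_hook(vector: Vector, multiplier: float) -> Callable[[Vector], Vector]:
    """Build a hook that returns hidden + multiplier * vector (assumed CAA update rule)."""

    def hook(hidden: Vector) -> Vector:
        if len(hidden) != len(vector):
            raise ValueError("steering vector size must match hidden size")
        return [h + multiplier * v for h, v in zip(hidden, vector)]

    return hook


def install_caa(layers: List[FakeDecoderLayer], vectors: Dict[int, Vector], multiplier: float) -> None:
    """Attach one activation-add hook to each selected layer index."""
    for idx, vec in vectors.items():
        layers[idx].post_hooks.append(make_caa_hook(vec, multiplier))


def clear_caa(layers: List[FakeDecoderLayer]) -> None:
    """Remove every installed hook so the model behaves as before."""
    for layer in layers:
        layer.post_hooks.clear()


def run(layers: List[FakeDecoderLayer], hidden: Vector) -> Vector:
    """Pass the hidden state through the layers in order."""
    for layer in layers:
        hidden = layer(hidden)
    return hidden


layers = [FakeDecoderLayer(1.0), FakeDecoderLayer(2.0)]
x = [1.0, -1.0]

baseline = run(layers, x)
install_caa(layers, {0: [0.5, 0.5]}, multiplier=2.0)
steered = run(layers, x)
clear_caa(layers)
restored = run(layers, x)
```

In this toy setup the steered output differs from the baseline only while the hook is installed, and clearing the hook restores the baseline. The optional GPU smoke test records the same three states: baseline, steered, and restored.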

What Changed

- Added steer/vllm_caa_hooks.py
  - install and clear CAA vectors on vLLM model layers
  - worker RPC helpers for tensor-parallel vLLM engines
  - hook call/configuration stats
  - support for common vLLM worker model layouts
- Added tests/test_vllm_caa_hooks.py
  - focused hook-only unit tests with fake vLLM-style models/workers
  - no dataset dependency
  - no LLM judge dependency
  - no generated artifacts
- Added examples/vllm_caa_gpu_e2e.py
  - lightweight optional real-GPU smoke test
  - records baseline, steered, and restored outputs
  - records worker install/clear results and hook stats
- Added docs/vllm_caa_multigpu_hook.md
  - documents runtime API and lightweight validation commands
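The per-worker install/clear flow and the stats it reports can be pictured with fake workers, in the same spirit as the hook-only unit tests. This is a rough sketch under stated assumptions, not the PR's API. `FakeWorker`, `broadcast`, and the result-dictionary keys are hypothetical. The real RPC mechanism for reaching vLLM workers is not shown in this description, so a plain loop stands in for it.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FakeWorker:
    """Hypothetical stand-in for one tensor-parallel worker process."""

    rank: int
    num_layers: int
    installed: Dict[int, List[float]] = field(default_factory=dict)
    hook_calls: int = 0

    def install(self, layer: int, vector: List[float]) -> Dict[str, Any]:
        # Reject layer indices the worker's model does not have.
        if not 0 <= layer < self.num_layers:
            return {"rank": self.rank, "ok": False, "error": f"layer {layer} out of range"}
        self.installed[layer] = list(vector)
        return {"rank": self.rank, "ok": True, "layers": sorted(self.installed)}

    def clear(self) -> Dict[str, Any]:
        removed = len(self.installed)
        self.installed.clear()
        return {"rank": self.rank, "ok": True, "removed": removed}

    def forward(self) -> None:
        # Each hooked layer fires once per simulated forward pass.
        self.hook_calls += len(self.installed)

    def stats(self) -> Dict[str, Any]:
        return {"rank": self.rank, "hook_calls": self.hook_calls, "layers": sorted(self.installed)}


def broadcast(workers: List[FakeWorker], method: str, *args: Any) -> List[Any]:
    """Call the same method on every worker and collect per-rank results (loop replaces real RPC)."""
    return [getattr(worker, method)(*args) for worker in workers]


workers = [FakeWorker(rank=0, num_layers=4), FakeWorker(rank=1, num_layers=4)]

install_results = broadcast(workers, "install", 2, [0.0, 0.0])
broadcast(workers, "forward")
stats = broadcast(workers, "stats")
clear_results = broadcast(workers, "clear")
bad_install = broadcast(workers, "install", 9, [0.0, 0.0])
```

Collecting one result per rank makes a partial failure visible. If one worker rejects the request, the caller can clear the others rather than run with inconsistent steering across ranks.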

Validation

pytest -q tests/test_vllm_caa_hooks.py

Result:

9 passed

python -m compileall steer/vllm_caa_hooks.py examples/vllm_caa_gpu_e2e.py tests/test_vllm_caa_hooks.py

Result: passed.

git diff --check main...HEAD

Result: passed.

Optional GPU smoke command:

CUDA_VISIBLE_DEVICES=0,1 \
VLLM_USE_FLASHINFER_SAMPLER=0 \
VLLM_ALLOW_INSECURE_SERIALIZATION=1 \
python examples/vllm_caa_gpu_e2e.py \
  --model /path/to/model \
  --tensor-parallel-size 2 \
  --layer 12 \
  --multiplier 0.0 \
  --vector-value 0.0 \
  --output /tmp/vllm_caa_gpu_e2e.json \
  --monitor-output /tmp/vllm_caa_gpu_e2e.gpu.csv

Notes

The larger steering-effect consistency experiments used during development are not included in this PR. This keeps the runtime contribution focused on the multi-GPU vLLM CAA hook and lightweight validation.

Related Issue

Closes #695

Codex added 4 commits June 18, 2026 23:53

add vllm caa multigpu hook

df6f2cc

docs: finalize vllm caa hook note

6c2dde0

test: clean vllm caa hook pr polish

2e92b60

docs: use branch metadata for vllm caa note

f347eb5

linmou mentioned this pull request Jun 19, 2026

Add multi-GPU vLLM support for CAA steering hooks #695

Open

linmou wants to merge 4 commits into zjunlp:main from linmou:pr/vllm-caa-multigpu-hook
