BPF cgroup-deny enforcement
Phase 4 of the BPF Incident Response Roadmap. Optional in-kernel denial of outbound connections that match a Phase 3 detection (direct SMTP egress is the only gate landed today). Defaults are all-safe; operators flip live denial only after Phase 3 telemetry review.
What it does
When bpf_enforcement.enabled=true, direct_smtp_egress=true, the
connection tracker is running on BPF, and all dry-run layers are false:
- The cgroup/connect4 + cgroup/connect6 BPF program inspects each outbound TCP connect.
- If destination port is in the protected set AND the source UID is not in the safe-UID map AND the gated detector matches, the program returns 0 (kernel denies the connect).
- Userspace observes the decision via the decision field on the ringbuf event and emits an audit-log entry.
When any dry-run layer is true (the default), the program emits the decision but always returns 1 (allow). Operators can run dry-run for as long as they need to gather telemetry before flipping to live denial.
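The decision rule above can be modelled in a few lines of userspace Python. This is only an illustrative sketch, not the BPF program itself: the function name, the `Decision` enum, and the contents of `PROTECTED_PORTS` are assumptions made for the example (the document does not list the protected ports). The decision labels mirror the `decision` values exposed by the metrics, and the integer is the program's return code (1 = allow, 0 = deny).

```python
from enum import Enum
from typing import FrozenSet, Tuple


class Decision(str, Enum):
    ALLOW = "allow"
    DRY_RUN = "dry_run"
    DENY = "deny"


# Assumption for illustration only: the real protected set is not documented here.
PROTECTED_PORTS: FrozenSet[int] = frozenset({25, 465, 587})


def decide(
    dport: int,
    uid: int,
    safe_uids: FrozenSet[int],
    detector_match: bool,
    dry_run: bool,
    protected_ports: FrozenSet[int] = PROTECTED_PORTS,
) -> Tuple[Decision, int]:
    """Return (decision, return_code) for one outbound TCP connect."""
    would_deny = (
        dport in protected_ports      # destination port is protected
        and uid not in safe_uids      # source UID is not in the safe-UID map
        and detector_match            # the gated detector matches
    )
    if not would_deny:
        return Decision.ALLOW, 1
    if dry_run:
        # Record the would-be denial, but let the connect through.
        return Decision.DRY_RUN, 1
    return Decision.DENY, 0
```

Note that dry-run only changes the return code, never the matching logic, so dry-run telemetry should predict what live denial would block.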
What it does NOT do
- It does NOT wait on remote verdict callbacks in-kernel. That would add HTTP latency to every connect. The verdict callback (if enabled) runs in userspace after the BPF decision and enriches the emitted finding; it cannot undo a kernel denial.
- It does NOT enforce on UDP, ICMP, or non-cgroup paths.
- It does NOT replace any Phase 3 detection. Detections still run regardless; enforcement is a separate, layered control.
Configuration
bpf_enforcement:
  enabled: false              # master switch; default off
  dry_run: true               # safety default; flip after telemetry review
  direct_smtp_egress: false   # gate enforcement on the Phase 3 detector
  verdict_callback: false     # userspace post-decision callback
bpf_enforcement.enabled=true requires at least one feature gate.
Today the only gate is direct_smtp_egress, which itself requires
detection.direct_smtp_egress.enabled=true. The connection tracker
backend must be auto or bpf, and the direct SMTP backend must be
auto or bpf.
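The constraints above could be checked roughly as follows. This is a hedged sketch, not CSM's actual validator: the function name, the error strings, and the assumption that an unset backend defaults to auto are all hypothetical. The config is taken as an already-parsed dictionary keyed the same way as the YAML.

```python
from typing import Any, Dict, List

BPF_CAPABLE_BACKENDS = ("auto", "bpf")


def validate_enforcement(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the settings are consistent."""
    errors: List[str] = []
    enf = cfg.get("bpf_enforcement", {})
    det = cfg.get("detection", {})
    smtp = det.get("direct_smtp_egress", {})

    if not enf.get("enabled", False):
        return errors  # master switch off: nothing to validate

    if not enf.get("direct_smtp_egress", False):
        errors.append("bpf_enforcement.enabled=true requires at least one feature gate")
    elif not smtp.get("enabled", False):
        errors.append(
            "bpf_enforcement.direct_smtp_egress requires "
            "detection.direct_smtp_egress.enabled=true"
        )

    # Assumption: a missing backend key behaves like "auto".
    if det.get("connection_tracker_backend", "auto") not in BPF_CAPABLE_BACKENDS:
        errors.append("detection.connection_tracker_backend must be auto or bpf")
    if smtp.get("backend", "auto") not in BPF_CAPABLE_BACKENDS:
        errors.append("detection.direct_smtp_egress.backend must be auto or bpf")
    return errors
```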
Kernel requirements
- Linux >= 4.10 with CONFIG_CGROUP_BPF=y.
- cgroup/connect4 and cgroup/connect6 BPF program types (upstream since Linux 4.17).
- The capability surface bpf_enforcement.available.v1 is the wire signal that the binary supports the feature; combined with bpf_enforcement_active on the health snapshot, operators can detect both feature presence and runtime state.
On older kernels or default builds without the BPF tag,
detection.connection_tracker_backend: auto falls back to the legacy
/proc/net/tcp[6] poller. In that state direct SMTP findings still
work when detection.direct_smtp_egress.backend is auto or
legacy, but BPF enforcement is inactive.
When CSM attempts BPF and cannot start it, it emits a
bpf_unavailable finding. The message reports whether the daemon is
running on a fallback backend or has no live fallback active.
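One way to picture the fallback behaviour is the sketch below. It is a model under stated assumptions, not CSM's code: the `BackendResolution` type and `resolve_backend` function are hypothetical, it assumes `legacy` is an accepted value for the tracker backend, and it assumes that an explicit `bpf` setting has no live fallback when BPF fails to start.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BackendResolution:
    active: Optional[str]          # "bpf", "legacy", or None when nothing is live
    finding: Optional[str]         # "bpf_unavailable" when BPF was tried and failed
    enforcement_possible: bool     # BPF enforcement needs the BPF backend


def resolve_backend(configured: str, bpf_started: bool) -> BackendResolution:
    """Model which tracker backend ends up live for a configured value."""
    if configured not in ("auto", "bpf", "legacy"):
        raise ValueError(f"unknown backend: {configured!r}")
    if configured == "legacy":
        # BPF is never attempted, so no finding is emitted.
        return BackendResolution("legacy", None, False)
    if bpf_started:
        return BackendResolution("bpf", None, True)
    if configured == "auto":
        # BPF attempted and failed: fall back to the /proc/net/tcp[6] poller.
        return BackendResolution("legacy", "bpf_unavailable", False)
    # Explicit "bpf" (assumption): no live fallback is active.
    return BackendResolution(None, "bpf_unavailable", False)
```

In every branch except a successful BPF start, `enforcement_possible` is false, which matches the statement that enforcement is inactive on the legacy poller.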
Metrics
- csm_bpf_enforcement_decisions_total{decision="allow|dry_run|deny"}
- csm_bpf_enforcement_uid_map_refresh_total – successful periodic refreshes of the safe-UID BPF map.
- csm_bpf_enforcement_uid_map_refresh_failures_total – failed refreshes (e.g. /etc/passwd unreadable).
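During telemetry review it can help to turn the decision counter into a ratio. The following is a small sketch that reads Prometheus text-format output and computes the share of would-be denials. It assumes `decision` is the only label on the series; the helper names are hypothetical, and how the metrics text is fetched is left out.

```python
import re
from typing import Dict

# Assumes the series carries exactly one label, decision="...".
_DECISION_LINE = re.compile(
    r'^csm_bpf_enforcement_decisions_total\{decision="(?P<decision>[a-z_]+)"\}'
    r"\s+(?P<value>\S+)"
)


def decision_counts(exposition: str) -> Dict[str, float]:
    """Extract per-decision counter values from Prometheus text output."""
    counts: Dict[str, float] = {}
    for line in exposition.splitlines():
        match = _DECISION_LINE.match(line.strip())
        if match:
            counts[match.group("decision")] = float(match.group("value"))
    return counts


def dry_run_share(counts: Dict[str, float]) -> float:
    """Fraction of all decisions that were dry-run (would-be) denials."""
    total = sum(counts.values())
    return counts.get("dry_run", 0.0) / total if total else 0.0
```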
Dry-run precedence
Three independent dry_run knobs interact:
- auto_response.dry_run (global): suppresses every automatic action (firewall block, kill, etc.).
- detection.direct_smtp_egress.dry_run: detector-scoped action knob.
- bpf_enforcement.dry_run: kernel-side denial knob.
Rule: any dry_run=true wins. Live denial requires all three to be false at the layer they apply, plus a BPF runtime backend. Defaults are dry_run=true everywhere on first install.
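The precedence rule reduces to a logical OR over the three knobs. A minimal sketch follows; the `DryRunLayers` type and `live_denial_active` function are hypothetical names, and the defaults mirror the dry_run=true first-install state described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DryRunLayers:
    auto_response: bool = True   # auto_response.dry_run
    detector: bool = True        # detection.direct_smtp_egress.dry_run
    enforcement: bool = True     # bpf_enforcement.dry_run

    def effective_dry_run(self) -> bool:
        """Any layer set to dry-run wins."""
        return self.auto_response or self.detector or self.enforcement


def live_denial_active(layers: DryRunLayers, backend_is_bpf: bool) -> bool:
    """Live denial needs every layer out of dry-run plus a BPF runtime backend."""
    return backend_is_bpf and not layers.effective_dry_run()
```

With the defaults, `live_denial_active(DryRunLayers(), True)` is false; an operator has to clear all three flags before the kernel starts denying connects.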
Rollout recipe
- Phase 3 detector enabled, no Phase 4 wiring. Watch csm_direct_smtp_egress_findings_total for a week.
- Phase 4 enabled with dry_run: true. Watch csm_bpf_enforcement_decisions_total{decision="dry_run"} and confirm dry-run denials track expected hosted-account egress.
- Phase 4 dry_run=false on a single canary host. Audit incidents for false positives.
- Roll out to fleet.