CSM - Continuous Security Monitor
Security monitoring and response for Linux web servers. Single Go binary that detects compromise, phishing, mail abuse, and suspicious activity - then auto-responds and alerts within seconds.
Originally designed as a full Imunify360 replacement for cPanel/WHM on CloudLinux/AlmaLinux. Also runs on plain Ubuntu/Debian + Nginx/Apache and on plain AlmaLinux/Rocky/RHEL + Apache/Nginx: the daemon auto-detects the OS, control panel, and web server at startup and picks the correct log paths, config candidates, and check set.
Includes an nftables firewall (replaces LFD/fail2ban), ModSecurity management, email security, threat intelligence, a hardening audit, performance monitoring, and a web dashboard.
See installation.md for supported platforms and how the check set differs between cPanel and non-cPanel hosts.
What CSM Does
csm daemon
+-- fanotify file monitor       < 1s detection on /home, /tmp, /dev/shm
+-- inotify log watchers        ~2s detection on auth, access, exim, FTP logs
+-- PAM brute-force listener    Real-time login failure tracking
+-- PHP runtime shield          auto_prepend_file protection
+-- critical scanner (10 min)   Processes, network, tokens, logins, firewall
+-- deep scanner (60 min)       WP/CMS integrity, package integrity, DB injection, phishing
+-- nftables firewall engine    Kernel netlink API, IP sets, rate limiting
+-- threat intelligence         IP reputation, attack scoring, GeoIP
+-- ModSecurity manager         Rule deployment, overrides, escalation
+-- email security              AV scanning, quarantine, password/forwarder audit
+-- challenge server            Proof-of-work pages for suspicious IPs
+-- alert dispatcher            Email, Slack, Discord, webhooks
+-- web UI                      HTTPS dashboard with authenticated operator pages
+-- hardening audit             On-demand server hardening checks + scoring
+-- performance monitor         PHP, MySQL, Redis, WordPress metrics
Built From Real Incidents
CSM was built after real attacks where GSocket reverse shells, LEVIATHAN webshell toolkits, credential-stuffed cPanel accounts, and phishing kits were found across production servers.
Installation
Supported Platforms
| Platform | Web server | Package | Notes |
|---|---|---|---|
| cPanel/WHM on CloudLinux / AlmaLinux / Rocky | Apache (EA4) or LiteSpeed | .rpm | Primary target. Full cPanel account, WordPress, Exim, and WHM plugin coverage. |
| Plain AlmaLinux / Rocky / RHEL 8+ / CentOS Stream 8+ | Apache (httpd) or Nginx | .rpm | Generic Linux + web server checks. cPanel-specific checks are skipped cleanly. |
| Plain Ubuntu 20.04+ / Debian 11+ | Apache (apache2) or Nginx | .deb | Same as above, with debsums/dpkg --verify in place of rpm -V. |
The daemon auto-detects the OS, control panel (cPanel/Plesk/DirectAdmin/none), and web server (Apache/Nginx/LiteSpeed) at startup. The detected platform is logged at startup as:
[2026-04-10 08:13:37] platform: os=ubuntu/24.04 panel=none webserver=nginx
Check it with journalctl -u csm.service | grep platform: after starting the daemon.
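As a rough illustration of where the os= field could come from, the sketch below reads the ID and VERSION_ID keys from an os-release file and prints them in the same os=<id>/<version> shape as the log line. That CSM derives the field this way is an assumption, and os_field is a hypothetical helper name, not part of the csm CLI.

```shell
# Hypothetical helper: print an os=<id>/<version> field from an os-release file.
# Assumption: the field mirrors the ID and VERSION_ID keys; CSM's real
# detection logic may differ.
os_field() {
    # Source the file in a subshell so its variables do not leak into the caller.
    (
        . "$1" || exit 1
        printf 'os=%s/%s\n' "$ID" "$VERSION_ID"
    )
}

# Example: os_field /etc/os-release
```

Comparing this output with the platform: line in the journal is a quick way to see whether the OS part of the detection matches what the host itself reports.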
APT repository (Debian / Ubuntu) – recommended
The package repository at mirrors.pidginhost.com/csm/ is the preferred install method for Debian and Ubuntu. Future updates are picked up automatically via apt upgrade, and the repository metadata is GPG-signed, so apt verifies the trust chain before dpkg installs anything.
# 1. Install the signing key (older releases may not ship /etc/apt/keyrings)
sudo install -d -m 0755 /etc/apt/keyrings
curl -fsSL https://mirrors.pidginhost.com/csm/csm-signing.gpg | \
    sudo gpg --dearmor -o /etc/apt/keyrings/csm.gpg
# 2. Add the repository
echo "deb [signed-by=/etc/apt/keyrings/csm.gpg] https://mirrors.pidginhost.com/csm/deb stable main" | \
sudo tee /etc/apt/sources.list.d/csm.list
# 3. Install
sudo apt update
sudo apt install csm
Works on Ubuntu 20.04+, Debian 11+, and any derivative. The single stable suite serves all Debian/Ubuntu releases – the Go binary is statically linked and has no per-release glibc dependency.
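To upgrade only CSM later: sudo apt update && sudo apt install --only-upgrade csm. A regular sudo apt upgrade also picks it up along with any other pending updates.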
DNF repository (AlmaLinux / Rocky / RHEL / CloudLinux / cPanel) – recommended
# 1. Import the signing key into the RPM keyring
sudo rpm --import https://mirrors.pidginhost.com/csm/csm-signing.gpg
# 2. Add the repository
sudo tee /etc/yum.repos.d/csm.repo >/dev/null <<'EOF'
[csm]
name=CSM - Continuous Security Monitor
baseurl=https://mirrors.pidginhost.com/csm/rpm/el$releasever/$basearch
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.pidginhost.com/csm/csm-signing.gpg
EOF
# 3. Install
sudo dnf install csm
The explicit rpm --import is important: without it, the first dnf install csm prompts “Is this ok [y/N]:” to trust the repo key, and dnf install -y answers package install prompts but not the key-trust prompt. If the prompt goes unanswered on a non-interactive install, dnf fails with repomd.xml GPG signature verification error: Signing key not found.
The $releasever variable auto-selects the matching EL major (8, 9, or 10). Both x86_64 and aarch64 are published. Works on AlmaLinux 8+, Rocky 8+, RHEL 8+, CloudLinux 8+, and cPanel-managed hosts.
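To make the variable substitution concrete, the sketch below prints the URL that the baseurl template resolves to for a given EL major and architecture. csm_rpm_baseurl is a hypothetical helper used only for illustration; dnf performs this substitution itself.

```shell
# Hypothetical helper: print the repository URL the baseurl template resolves to.
# $1 = EL major version (the $releasever value), $2 = architecture ($basearch).
csm_rpm_baseurl() {
    printf 'https://mirrors.pidginhost.com/csm/rpm/el%s/%s\n' "$1" "$2"
}

# Example: csm_rpm_baseurl 9 x86_64
# prints https://mirrors.pidginhost.com/csm/rpm/el9/x86_64
```

If dnf reports a 404 for repomd.xml, comparing the URL in the error with this expected shape shows whether $releasever resolved to something other than a bare major version.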
To upgrade later: sudo dnf upgrade csm.
Quick Install (all platforms, one-shot)
For situations where you can’t add a package repository (disconnected hosts, air-gapped mirrors, Docker base images):
curl -sSL https://raw.githubusercontent.com/pidginhost/csm/main/scripts/install.sh | bash
The script auto-detects the hostname and email, generates a WebUI auth token, and prompts for confirmation before applying. It works on Debian/Ubuntu and RHEL-family distros. Non-interactive mode:
curl -sSL https://raw.githubusercontent.com/pidginhost/csm/main/scripts/install.sh | bash -s -- --email admin@example.com --non-interactive
Manual .rpm / .deb download
If you need a specific version or want to install without adding the repository:
# RHEL family
curl -LO https://github.com/pidginhost/csm/releases/latest/download/csm-VERSION-1.x86_64.rpm
sudo dnf install -y ./csm-VERSION-1.x86_64.rpm
# Debian/Ubuntu
curl -LO https://github.com/pidginhost/csm/releases/latest/download/csm_VERSION_amd64.deb
sudo apt install -y ./csm_VERSION_amd64.deb
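Replace VERSION with a real version (e.g. 2.2.2). The releases/latest/download path only resolves files attached to the newest release, so VERSION must match that release. Both files are also available at https://mirrors.pidginhost.com/csm/deb/pool/main/c/csm/ and https://mirrors.pidginhost.com/csm/rpm/elN/ARCH/ if you prefer to pin versions from the mirror.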
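The two file name patterns differ in their separators, which is easy to get wrong when scripting a download. The sketch below builds the name for a given version; csm_pkg_filename is a hypothetical helper, and it only covers the x86_64/amd64 names shown above.

```shell
# Hypothetical helper: build the release file name for a version and format.
# $1 = version (e.g. 2.2.2), $2 = rpm | deb
# Only the x86_64 / amd64 names from the commands above are covered.
csm_pkg_filename() {
    case "$2" in
        rpm) printf 'csm-%s-1.x86_64.rpm\n' "$1" ;;
        deb) printf 'csm_%s_amd64.deb\n' "$1" ;;
        *)   echo "unknown package format: $2" >&2; return 1 ;;
    esac
}

# Example: csm_pkg_filename 2.2.2 rpm
# prints csm-2.2.2-1.x86_64.rpm
```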
Filesystem layout
The package uses FHS paths for config, state, drop-ins, and shipped profiles. Upgrades keep /opt/csm/csm.yaml as a compatibility link for older scripts:
| Concern | Current path |
|---|---|
| Main config | /etc/csm/csm.yaml |
| Legacy config link | /opt/csm/csm.yaml |
| Drop-in fragments | /etc/csm/conf.d/*.yaml |
| State directory | /var/lib/csm/state/ |
| Shipped profiles | /usr/lib/csm/profiles/ |
| Audit log | /var/log/csm/audit.jsonl |
| Binary | /opt/csm/csm |
| Quarantine | /opt/csm/quarantine/ |
| YARA / signature rules | /opt/csm/rules/ |
The systemd unit declares StateDirectory=csm and ConfigurationDirectory=csm so systemd manages permissions for the FHS directories. On upgrade, the package copies a real legacy main config into /etc/csm/csm.yaml when needed and points /opt/csm/csm.yaml at it. On first start the daemon copies a non-empty legacy /opt/csm/state/ into /var/lib/csm/state/ (only when the new directory is empty), then continues using the FHS state path. See Upgrading - FHS migration for the manual-binary-swap case.
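The first-start state migration rule (copy only when the legacy directory has content and the new one is empty) can be sketched as a small predicate. This is an illustration of the rule as described, not the daemon's code; both function names are hypothetical.

```shell
# True when the directory exists and contains at least one entry.
dir_has_entries() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Hypothetical predicate for the migration rule described above.
# $1 = legacy state dir (e.g. /opt/csm/state), $2 = FHS state dir.
# Succeeds only when the legacy dir is non-empty and the new dir is empty or missing.
needs_state_migration() {
    dir_has_entries "$1" && ! dir_has_entries "$2"
}

# Example:
#   needs_state_migration /opt/csm/state /var/lib/csm/state && echo "would migrate"
```

Because the check refuses to run once the new directory has content, a second start never overwrites state that the daemon has already written under the FHS path.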
Post-install (all methods)
sudo vi /etc/csm/csm.yaml # Set hostname, alert email, infra IPs
sudo csm validate # Check config syntax (validates merged conf.d too)
sudo systemctl enable --now csm.service
sudo csm baseline # Record current state as known-good via the daemon
Rollback to an older version
Both the APT and DNF repositories retain the last 5 tagged releases at any time. To downgrade:
# Debian/Ubuntu
sudo apt-cache policy csm # Show available versions
sudo apt install csm=2.2.0-1
# RHEL family
sudo dnf --showduplicates list csm # Show available versions
sudo dnf downgrade csm
Verifying platform auto-detection
After systemctl start csm.service, the first line after “CSM daemon starting” reports what CSM detected:
[2026-04-10 08:13:37] CSM daemon starting
[2026-04-10 08:13:37] platform: os=almalinux/10.0 panel=none webserver=apache
[2026-04-10 08:13:37] Watching: /var/log/secure
[2026-04-10 08:13:37] Watching: /var/log/httpd/error_log
[2026-04-10 08:13:37] Watching: /var/log/httpd/access_log
If any field shows none or unknown when you expect something, the auto-detect missed it. File a bug with the output of cat /etc/os-release, systemctl is-active nginx apache2 httpd, and which nginx apache2 httpd.
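For fleets where reading each journal by hand is impractical, a small filter can flag undetected fields. The sketch below assumes the platform: line format shown above; check_platform_line is a hypothetical helper, not a csm subcommand.

```shell
# Hypothetical helper: report platform fields logged as "none" or "unknown".
# $1 = one "platform:" log line in the format shown above.
# Returns 1 when at least one field is undetected, 0 otherwise.
check_platform_line() {
    undetected=0
    # Drop everything up to and including "platform: ", leaving key=value pairs.
    for pair in ${1#*platform: }; do
        case "${pair#*=}" in
            none|unknown)
                echo "undetected: $pair"
                undetected=1
                ;;
        esac
    done
    return "$undetected"
}

# Example:
#   check_platform_line "$(journalctl -u csm.service | grep platform: | tail -n 1)"
```

A panel=none result is expected on a plain Linux host, so treat the output as a prompt to compare against what is actually installed rather than as an error by itself.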
Optional system dependencies
CSM runs as a single static Go binary and has no hard dependencies beyond systemd, but a few host packages enable additional checks:
| Package | Platforms | Enables |
|---|---|---|
| auditd | All | Shadow file / SSH key tamper detection via auditd |
| debsums | Debian/Ubuntu | Cleaner system binary integrity output vs. dpkg --verify fallback |
| logrotate | All | Rotation of /var/log/csm/monitor.log |
| wp-cli | Optional | WordPress core integrity check |
| ModSecurity | All | WAF enforcement checks (see platform-specific install below) |
Installing ModSecurity
CSM detects ModSecurity but doesn’t install it for you. Platform-specific commands:
# Ubuntu/Debian + Nginx
sudo apt install libnginx-mod-http-modsecurity modsecurity-crs
# Ubuntu/Debian + Apache
sudo apt install libapache2-mod-security2 modsecurity-crs && sudo a2enmod security2
# AlmaLinux/Rocky/RHEL + Apache (requires EPEL)
sudo dnf install -y epel-release
sudo dnf install -y mod_security
sudo systemctl restart httpd
# AlmaLinux/Rocky/RHEL + Nginx (requires EPEL)
sudo dnf install -y epel-release
sudo dnf install -y nginx-mod-http-modsecurity
sudo systemctl restart nginx
After installing ModSecurity, run csm check and the waf_status finding should disappear.
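For provisioning scripts, the package choice above reduces to a lookup on OS family and web server. The sketch below encodes only the packages listed in this section; modsec_packages and its debian/rhel family labels are hypothetical names chosen for the example.

```shell
# Hypothetical helper: print the ModSecurity packages for a host.
# $1 = debian | rhel (OS family), $2 = apache | nginx
# On the rhel family the packages come from EPEL, so epel-release
# must be installed first, as in the commands above.
modsec_packages() {
    case "$1/$2" in
        debian/nginx)  echo "libnginx-mod-http-modsecurity modsecurity-crs" ;;
        debian/apache) echo "libapache2-mod-security2 modsecurity-crs" ;;
        rhel/apache)   echo "mod_security" ;;
        rhel/nginx)    echo "nginx-mod-http-modsecurity" ;;
        *) echo "no known ModSecurity packages for $1/$2" >&2; return 1 ;;
    esac
}

# Example: sudo apt install $(modsec_packages debian nginx)
```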
Manual (deploy.sh)
/opt/csm/deploy.sh install
vi /etc/csm/csm.yaml # set hostname, alert email, infra IPs
csm validate
systemctl enable --now csm.service
csm baseline
Post-Install
- Edit /etc/csm/csm.yaml – set hostname, alert email, infrastructure IPs
- Run csm validate to check config syntax (add --deep for connectivity probes)
- Start the daemon: systemctl enable --now csm.service
- Run csm baseline to record current state for change tracking (see below)
- Open the Web UI: https://<server>:9443/login
All installation methods produce the same installed state. RPM/DEB packages auto-detect hostname and email, and generate the auth token.
Baseline Scan
The csm baseline command scans the entire server and records the current state for change tracking. This is required on first install so CSM knows what’s “normal” for your server. Findings that should never be silently trusted, such as non-standard MySQL superusers or WHM root API tokens, can still be reported on this first scan.
What it does:
- Scans all cPanel accounts for malware, permissions, and configuration issues
- Records file hashes, email forwarder hashes, and plugin versions
- Stores everything in the bbolt database (/var/lib/csm/state/csm.db)
How long it takes: Depends on server size. A server with 100+ cPanel accounts and thousands of WordPress sites can take 5-10 minutes. The daemon must be running because the baseline is coordinated through the control socket.
When to re-run:
- After a fresh install
- After restoring from backup
- After an intentional state reset approved by the operator
- You do NOT need to re-run for normal deploys/upgrades – the daemon handles incremental state
Important: Start csm.service before running csm baseline. If existing history would be cleared, rerun with csm baseline --confirm only after verifying that reset is intended.
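In automation, the ordering requirement can be enforced with a small guard. The sketch below uses systemctl is-active and the csm baseline command from this section; baseline_when_running is a hypothetical wrapper name, and it deliberately never passes --confirm.

```shell
# Hypothetical wrapper: run the baseline only when the daemon is active.
# It never adds --confirm, so a history reset still needs an explicit
# operator decision.
baseline_when_running() {
    if systemctl is-active --quiet csm.service; then
        csm baseline
    else
        echo "csm.service is not active; start it before running csm baseline" >&2
        return 1
    fi
}

# Example: sudo sh -c '. ./baseline-guard.sh && baseline_when_running'
# (baseline-guard.sh is a placeholder file name for this snippet)
```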
Configuration
CSM is configured via /etc/csm/csm.yaml, with --config <path> to override. Legacy installs that only have /opt/csm/csm.yaml keep working; packaged upgrades migrate that file into /etc/csm/csm.yaml and leave the old path as a compatibility link. Optional drop-in fragments under /etc/csm/conf.d/*.yaml are merged on top of the main file at startup; see conf.d drop-ins below.
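When a setting does not behave as expected, it helps to list which files contribute to the merged config. The sketch below prints the main file followed by the drop-in fragments. Lexical ordering of the fragments is an assumption made for this example (the text above only says they are merged on top of the main file), and csm_config_load_order is a hypothetical helper.

```shell
# Hypothetical helper: list config files in the order this sketch assumes
# they are applied: the main file first, then conf.d fragments.
# Assumption: fragments apply in lexical (glob) order.
# $1 = config root, defaults to /etc/csm
csm_config_load_order() {
    conf_root="${1:-/etc/csm}"
    [ -f "$conf_root/csm.yaml" ] && printf '%s\n' "$conf_root/csm.yaml"
    for fragment in "$conf_root"/conf.d/*.yaml; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$fragment" ] && printf '%s\n' "$fragment"
    done
    return 0
}

# Example: csm_config_load_order /etc/csm
```

Running csm validate afterwards remains the authoritative check, since it validates the merged result.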
Platform & Web Server
CSM auto-detects the host OS (Ubuntu, Debian, AlmaLinux, Rocky, RHEL, CloudLinux), control panel (cPanel, Plesk, DirectAdmin, or none), and web server (Apache, Nginx, LiteSpeed, or none) at daemon startup. The detected platform is logged as:
[2026-04-10 08:13:37] platform: os=ubuntu/24.04 panel=none webserver=nginx
The daemon then chooses the correct log paths, config candidates, and check set without any configuration from you. Verify with:
journalctl -u csm.service | grep platform:
Web server overrides
For hosts with a custom layout (reverse proxy, non-standard package locations, chroot), add a web_server: section to csm.yaml. Every field is optional – anything left blank falls back to auto-detection.
web_server:
  type: "nginx" # apache | nginx | litespeed -- overrides auto-detect
  config_dir: "/etc/nginx" # for info/diagnostics only
  access_logs: # tried in order until one exists
    - "/var/log/nginx/access.log"
    - "/srv/logs/nginx/access.log"
  error_logs: # used by ModSecurity deny watcher
    - "/var/log/nginx/error.log"
  modsec_audit_logs:
    - "/var/log/nginx/modsec_audit.log"
modsec_error_log (legacy single-path override) is still honored and takes precedence over web_server.error_logs for the ModSecurity watcher only:
modsec_error_log: "/opt/myapp/logs/modsec_audit.log"
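The "tried in order until one exists" rule for access_logs can be reproduced from a shell to predict which path the daemon would settle on. This is a sketch of the rule as described, with first_existing_log as a hypothetical helper; treating "exists" as "is a regular file" is an assumption.

```shell
# Hypothetical helper: print the first candidate path that exists.
# Assumption: "exists" means a regular file. Returns 1 when no candidate matches.
first_existing_log() {
    for candidate in "$@"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example:
#   first_existing_log /var/log/nginx/access.log /srv/logs/nginx/access.log
```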
Account roots (plain Linux web-scan coverage)
By default, the account-scan based checks (perf_error_logs, perf_wp_config, perf_wp_transients, and related) iterate /home/*/public_html, which is the cPanel layout. On plain Ubuntu / AlmaLinux with Nginx or Apache, point CSM at your actual web roots:
account_roots:
  - "/var/www/*/public" # e.g. Laravel/Symfony sites
  - "/srv/http/*" # Arch / generic layouts
  - "/home/*/public_html" # add if you also have cPanel-style accounts
Each entry is a glob pattern expanded at scan time. Non-existent matches are silently dropped. If account_roots is empty and CSM is not on a cPanel host, the account-scan checks return no findings (they run but find nothing, which is the correct behavior for a plain-Linux host with no configured web roots).
Today, three checks consume this: perf_error_logs, perf_wp_config, perf_wp_transients. The remaining account-scan checks (WordPress core integrity, phishing kit detection, htaccess tampering, fileindex, etc.) still assume the cPanel /home/*/public_html layout and will be migrated in a follow-up release.
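Before committing patterns to account_roots, it can be useful to preview what they match. The sketch below expands each pattern with the shell's own globbing and drops anything that is not an existing directory. It approximates the behavior described above and is not CSM's matcher; expand_account_roots is a hypothetical helper, and patterns containing spaces are not handled.

```shell
# Hypothetical helper: preview which directories a set of glob patterns match.
# Each argument is one pattern; unmatched patterns are silently dropped.
# Limitation: patterns containing whitespace are split by the shell.
expand_account_roots() {
    for pattern in "$@"; do
        # Leaving $pattern unquoted lets the shell expand the glob.
        for root_dir in $pattern; do
            [ -d "$root_dir" ] && printf '%s\n' "$root_dir"
        done
    done
    return 0
}

# Example:
#   expand_account_roots '/var/www/*/public' '/srv/http/*'
```

An empty result for a pattern you expected to match usually points at a typo in the path or a layout that differs from the one assumed.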
Minimal Config
hostname: "csm.example.com"
alerts:
  email:
    enabled: true
    to: ["admin@example.com"]
    disabled_checks: [] # optional: suppress these checks from email only
    smtp: "localhost:25"
webui:
  enabled: true
  listen: "0.0.0.0:9443"
  auth_token: "your-secret-token"
infra_ips: ["10.0.0.0/8"]
Full Reference
hostname: "csm.example.com"
# --- Alerts ---
alerts:
  email:
    enabled: true
    to: ["admin@example.com"]
    from: "csm@csm.example.com"
    smtp: "localhost:25"
    disabled_checks: [] # check names to keep in web/history but exclude from email
  webhook:
    enabled: false
    url: ""
    type: "slack" # slack, discord, generic, phpanel
    hmac_secret: "" # phpanel webhook signing secret
    hmac_secret_env: "" # env var containing phpanel signing secret
    per_finding: false # phpanel sends one signed POST per finding
  heartbeat:
    enabled: false
    url: "" # healthchecks.io, cronitor, dead man's switch
  max_per_hour: 10 # default: 10
  audit_log: # SIEM-friendly per-finding stream
    file:
      enabled: false
      path: /var/log/csm/audit.jsonl # default; logrotate fragment ships with the package
    syslog:
      enabled: false
      network: udp # udp | tcp | unix | unixgram | tls
      address: 127.0.0.1:514 # host:port, or filesystem path for unix variants
      facility: local0 # default: local0
      tls_ca: "" # optional CA cert for tls transport
# --- Integrity ---
integrity:
  binary_hash: "" # auto-populated by install/rehash
  config_hash: "" # auto-populated by install/rehash
  confd_hash: "" # auto-populated by install/rehash
  immutable: false # prevent config changes at runtime
# --- Thresholds ---
thresholds:
  mail_queue_warn: 500 # default: 500
  mail_queue_crit: 2000 # default: 2000
  state_expiry_hours: 24 # default: 24
  deep_scan_interval_min: 60 # minutes between deep scans (default: 60)
  wp_core_check_interval_min: 60 # WordPress core checksum interval (default: 60)
  webshell_scan_interval_min: 30 # webshell scan interval (default: 30)
  filesystem_scan_interval_min: 30 # filesystem scan interval (default: 30)
  multi_ip_login_threshold: 3 # IPs per account before alert (default: 3)
  multi_ip_login_window_min: 60 # time window for multi-IP check (default: 60)
  cred_stuffing_distinct_accounts: 5 # failed accounts from one IP before credential_stuffing (default: 5)
  plugin_check_interval_min: 1440 # WordPress plugin check interval (default: 1440)
  brute_force_window: 5000 # failed auth attempts window (default: 5000)
  domlog_max_files: 500 # per-domain access logs per WP brute-force scan (default: 500)
  domlog_tail_lines: 500 # trailing lines tailed from each domlog per scan (default: 500)
  domlog_max_age_min: 30 # skip per-domain access logs untouched in this many minutes (default: 30)
  mail_log_tail_lines: 500 # trailing lines of /var/log/exim_mainlog read by the mail-per-account scanner (default: 500)
  syslog_messages_tail_lines: 200 # trailing lines of /var/log/messages read by the FTP login scanner (default: 200)
  account_scan_max_files: 10000 # account and mail-domain paths per scanner cycle (default: 10000)
  # If this cap clips /home/<account>/ paths, account_scan_truncated names the affected account.
  crontab_base64_blob_max_bytes: 16384 # encoded bytes per crontab base64 candidate before decoded-content matching; must be a multiple of 4 (default: 16384)
  # HTTP request flood, User-Agent spoof, and distributed HTTP detection.
  # These detectors scan the same per-vhost access-log stream as the WP
  # brute-force scanner; no extra log tailer is needed.
  #
  # http_flood_threshold: minimum per-IP request count inside the window
  # that emits http_request_flood. 0 disables the detector. The detector
  # ships disabled so operators can sample local baseline traffic first.
  # Adjust up for CDNs or CGNAT-heavy visitor pools before enabling.
  http_flood_threshold: 0 # 0 = disabled; set after sampling baseline traffic
  http_flood_window_min: 5 # rate window in minutes (default: 5)
  # http_ua_spoof_threshold: per-IP per-window count for non-browser UA
  # kinds before http_ua_spoof fires. Claimed search-engine bots (Googlebot,
  # Bingbot, Applebot) that fail reverse-DNS confirmation fire regardless of
  # this threshold once the rDNS cache confirms the IP is not the real bot.
  http_ua_spoof_threshold: 30 # default: 30
  # http_distributed_min_ips: distinct already-abusive source IPs that hit
  # the same vhost in one scan window before a per-vhost distributed flood
  # finding fires. 0 disables the rollup for existing configs that do not
  # opt in.
  http_distributed_min_ips: 10 # sample setting; omit or set 0 to disable
  # These three opt-in flags extend UA spoof detection to additional UA
  # classes. Leave disabled on busy shared hosts; scripting-language agents
  # and headless browsers appear on many legitimate monitoring stacks.
  http_ua_scripting_enabled: false # flag curl/wget/python-requests/Go-http style UAs
  http_ua_headless_enabled: false # flag Puppeteer/Playwright/PhantomJS UAs
  http_ua_empty_enabled: false # flag requests with no UA at all
  # SMTP brute-force tracker (Exim mainlog, dovecot SASL on submission ports)
  smtp_bruteforce_threshold: 5 # per-IP failed auths before block (default: 5)
  smtp_bruteforce_window_min: 10 # sliding window in minutes (default: 10)
  smtp_bruteforce_suppress_min: 60 # cooldown between repeat findings (default: 60)
  smtp_bruteforce_subnet_threshold: 8 # unique IPs per /24 before subnet block (default: 8)
  smtp_account_spray_threshold: 12 # unique IPs targeting one mailbox before visibility finding (default: 12)
  smtp_bruteforce_max_tracked: 20000 # soft cap on tracked entries; oldest evicted (default: 20000)
  # SMTP probe-abuse tracker (raw connect-rate per IP; catches scanners that
  # never reach AUTH). Threshold sized well above any legitimate MUA usage.
  smtp_probe_threshold: 100 # per-IP connects before block (default: 100; explicit 0 disables)
  smtp_probe_window_min: 5 # sliding window in minutes (default: 5)
  smtp_probe_suppress_min: 60 # cooldown between repeat findings (default: 60)
  smtp_probe_max_tracked: 20000 # soft cap on tracked entries; oldest evicted (default: 20000)
  # Mail brute-force tracker (IMAP/POP3/ManageSieve via mail_logs source)
  mail_bruteforce_threshold: 5 # per-IP failed auths before block (default: 5)
  mail_bruteforce_window_min: 10 # sliding window in minutes (default: 10)
  mail_bruteforce_suppress_min: 60 # cooldown between repeat findings (default: 60)
  mail_bruteforce_subnet_threshold: 8 # unique IPs per /24 before subnet block (default: 8)
  mail_account_spray_threshold: 12 # unique IPs targeting one mailbox before visibility finding (default: 12)
  mail_bruteforce_max_tracked: 20000 # soft cap on tracked entries; oldest evicted (default: 20000)
  mail_brute_account_key: "builtin:dovecot-user" # builtin:dovecot-user | builtin:postfix-sasl | regex:<capture>
  modsec_escalation_hits: 3 # denies from one IP before ModSecurity escalation (default: 3)
  modsec_escalation_window_min: 10 # ModSecurity escalation window in minutes (default: 10)
# --- Web server overrides ---
# Leave these empty to use auto-detected paths for the running platform.
web_server:
  # Override the per-vhost access-log glob patterns. Empty uses the
  # auto-detected default for the panel (cPanel, Plesk, DirectAdmin,
  # bare Apache, or bare Nginx).
  domlog_globs: []
  # IPs or CIDRs whose X-Forwarded-For header is trusted for client-IP
  # extraction. Leave empty to ignore XFF and use RemoteIP as-is.
  trusted_proxies: []
# --- Infrastructure ---
infra_ips: [] # management IPs/CIDRs/hostnames - never blocked
# --- Mail Logs ---
# Packaged releases include journald support. Custom builds need
# `make JOURNAL=1 build-yara` before `source: journal` can be selected.
mail_logs:
  source: auto # auto | file | journal
  file: "" # optional path override for file source
  units: ["postfix", "dovecot"] # journal units for source=journal or auto fallback
# --- State ---
state_path: "/var/lib/csm/state" # bbolt DB and state files
# --- Suppressions ---
suppressions:
  upcp_window_start: "00:30" # cPanel nightly update window start
  upcp_window_end: "02:00" # cPanel nightly update window end
  known_api_tokens: [] # API tokens to ignore in auth logs (e.g. ["phclient"])
  ignore_paths: # glob patterns to skip in filesystem scans
    - "*/cache/*"
    - "*/vendor/*"
  suppress_webmail_alerts: true # don't alert on webmail logins
  suppress_cpanel_login_alerts: false # don't alert on cPanel direct logins
  suppress_blocked_alerts: true # don't alert on IPs that were auto-blocked
  trusted_countries: ["RO"] # ISO 3166-1 alpha-2 - suppress cPanel login alerts from these
# --- Auto-Response ---
auto_response:
  enabled: false
  kill_processes: false # kill malicious processes
  quarantine_files: false # move malware to quarantine
  block_ips: false # block attacker IPs via firewall
  block_expiry: "24h" # duration for temp blocks (e.g. "24h", "12h")
  max_blocks_per_hour: 50 # per-IP blocks per hour; 0/omitted uses default
  enforce_permissions: false # auto-chmod 644 world/group-writable PHP files
  block_cpanel_logins: false # block IPs on cPanel/webmail/FTP/API thresholded brute findings (multi-IP login, webmail/API brute, FTP brute). Single direct cPanel form logins stay audit-only regardless of this flag.
  netblock: false # auto-block IPv4 /24 or IPv6 /64 subnets
  netblock_threshold: 3 # IPs from same IPv4 /24 or IPv6 /64 before subnet block
  permblock: false # promote temp blocks to permanent
  permblock_count: 4 # temp blocks before promotion
  permblock_interval: "24h" # window for counting temp blocks
  clean_database: false # auto-drop confirmed malicious DB objects after backup
  clean_htaccess: false # auto-clean .htaccess directives flagged by hardened detectors (backups under /opt/csm/quarantine/pre_clean/)
  disable_enforce_af_alg: false # suspend periodic AF_ALG hardening re-assertion
  copy_fail_kill_process: false # kill processes caught opening AF_ALG sockets via the live listener
  dry_run: true # safe default; logs intended IP blocks without mutating nftables
  verdict_callback:
    enabled: false # call panel before each auto-block
    url: "" # POST target for verdict requests
    hmac_secret: "" # signing secret, or use hmac_secret_env
    hmac_secret_env: "" # env var read at call time
    allow_unsigned: false # true only for staged unsigned rollouts
    require_response_signature: true # reject unsigned callback replies
    timeout_sec: 2 # callback request timeout
  # PHP-relay auto-freeze. Off by default; only kicks in on cPanel hosts
  # where email_protection.php_relay.enabled is true. dry_run defaults to
  # true even when freeze is true, so an operator who enables freeze
  # without thinking gets a dry-run rather than a live exim -Mf storm.
  # Override at runtime with `csm phprelay dry-run on|off|reset`.
  php_relay:
    freeze: false # opt in to wire the exim -Mf hook into the alert pipeline
    dry_run: true # safe default; flip with `csm phprelay dry-run off [--persist]`
    max_actions_per_minute: 60 # rolling 60s cap on exim -Mf invocations
# --- Detection ---
detection:
  # db_object_scanning is tri-state: omit for the default (on),
  # `false` to explicitly disable. When off, the MySQL persistence
  # scanner emits no findings; the manual `csm db-clean --drop-object`
  # CLI keeps working for operator-driven cleanup.
  # db_object_scanning: true
  db_object_allowlist: [] # entries: <account>:<schema>:<type>:<name> -- suppresses db_unexpected_* warnings only
  admin_overlap_min_accounts: 2 # raise only if routine shared-admin accounts are expected on this host
  admin_overlap_trusted_emails: [] # exact reviewed admin emails that may manage multiple cPanel accounts
  admin_overlap_trusted_domains: [] # exact reviewed email domains for developer or reseller admin accounts
  # rescan_on_signature_update: true # tri-state; omit for default-on, false to disable retroactive sweeps
  af_alg_backend: "auto" # auto | bpf | auditd | none
  connection_tracker_backend: "auto" # auto | bpf | legacy | none
  connection_poll_interval: 30s # legacy connection tracker interval
  exec_monitor_backend: "auto" # auto | bpf | legacy | none
  exec_monitor_poll_interval: 30m # legacy process monitor interval
  sensitive_files_backend: "auto" # auto | bpf | legacy | none
  sensitive_files_poll_interval: 5m # sensitive-file poll/watchset refresh interval
  direct_smtp_egress:
    enabled: false # detect non-MTA local processes opening outbound SMTP
    backend: "auto" # auto | bpf | legacy | none
    dry_run: true # safe default for detector-scoped action
    ports: [25, 465, 587] # destination ports to inspect
# --- BPF Enforcement ---
bpf_enforcement:
  enabled: false # master switch for in-kernel denial
  dry_run: true # log intended denials, allow the connect
  direct_smtp_egress: false # gate enforcement on direct SMTP egress matches
  verdict_callback: false # userspace advisory callback after the BPF decision
# --- Challenge Pages ---
challenge:
  enabled: false # enable PoW challenge pages instead of hard block
  listen_addr: 127.0.0.1 # bind address; use 0.0.0.0 for public direct redirects
  listen_port: 8439 # port for challenge server; must fit the TCP port range
  tls_cert: "" # optional HTTPS cert for direct/public challenge listener
  tls_key: "" # optional HTTPS key for direct/public challenge listener
  public_url: "" # required by webserver-integration, e.g. https://host:8439/challenge
  secret: "" # HMAC secret for tokens (auto-generated if empty)
  difficulty: 2 # SHA-256 proof-of-work difficulty 0-5 (default: 2)
  trusted_proxies: [] # IPs/CIDRs allowed to supply X-Forwarded-For
  port_gate:
    enabled: false # nftables gate for non-loopback challenge listener
  captcha_fallback: # widget for JS-disabled visitors (default off)
    provider: "" # "turnstile" | "hcaptcha" | "" (off)
    site_key: "" # public key embedded in the widget
    secret_key: "" # verified server-side
    timeout: 10s
  verified_session: # signed-cookie bypass for authenticated operators
    enabled: false
    cookie_name: csm_admin_session
    ttl: 4h
    admin_secret: "" # POST'd to /challenge/admin-token to mint cookie
  verified_crawlers: # reverse-DNS forward-confirm for search crawlers
    enabled: false
    providers: [] # names: googlebot | bingbot
    cache_ttl: 15m
# --- PHP Shield ---
php_shield:
  enabled: false # watch the PHP Shield event log for alerts
# --- Reputation ---
reputation:
  abuseipdb_key: "" # AbuseIPDB API key for IP reputation lookups
  whitelist: [] # IPs to never flag as malicious
  # Async PTR + forward-A verification for IPs that claim search-engine
  # bot UAs (Googlebot, Bingbot, Applebot). When an IP claims a bot UA
  # but reverse DNS does not confirm it, the request counts toward
  # http_ua_spoof. Transient DNS lookup failures fail open and are
  # retried later. Set false only if your resolver is unreliable. See
  # docs/src/auto-response.md for the always-block behavior.
  bot_verify_enabled: true # default: true
  rspamd:
    enabled: false # include rspamd rolling history in IP reputation
    url: "http://127.0.0.1:11334" # rspamd controller URL
    token: "" # controller password, or use token_env
    token_env: "" # env var read at query time
  upstream:
    enabled: false # include panel-side threat-intel cache scores
    url: "" # HTTPS base URL; HTTP only allowed for loopback
    token: "" # bearer token, or use token_env
    token_env: "" # env var read at query time
    cache_ttl_min: 15 # local cache TTL for upstream scores
    timeout_sec: 5 # upstream request timeout
  report:
    enabled: false # opt-in abuse report delivery; restart required
    classes: [] # bruteforce | php_relay | credential_stuffing | bad_asn_egress
    spool_path: "" # default: <state_path>/abuse_reports.db
    spool_max: 10000 # max queued reports per target
    targets:
      - name: "" # stable target name
        url: "" # HTTPS collector URL; HTTP only allowed for loopback
        transport: "hmac" # hmac | ed25519
        node_id: "" # sender node ID
        key_id: "" # receiver key ID
        key_env: "" # HMAC secret or Ed25519 private key env var
        token_env: "" # optional bearer token env var for HMAC targets
  central:
    enabled: false # opt-in central scored-set consume; restart required
    set_url: "" # HTTPS scored-set endpoint; HTTP only for loopback
    pubkey_env: "" # env var with Ed25519 public key hex
    refresh_interval: 6h # pull interval; default 6h
    action: "challenge" # off | challenge | block_if_local_corroborated
    block_threshold: 80 # score needed before local corroboration can block
# --- Signatures ---
signatures:
  rules_dir: "/opt/csm/rules" # YAML signature rules directory
  update_url: "" # remote URL to fetch rule updates
  auto_update: false # auto-download rules on schedule
  update_interval: "" # how often to check (e.g. "24h")
  signing_key: "" # required for any remote rule update path; 64-char hex Ed25519 public key
  yara_forge:
    enabled: false # auto-fetch YARA Forge community rules
    tier: "core" # "core", "extended", "full" (default: "core")
    update_interval: "168h" # how often to check for updates (default: weekly)
    download_url: "" # signed ZIP URL/template; supports {tier} and {version}
    disabled_rules: [] # YARA rule names to exclude from Forge downloads
  # yara_worker_enabled: true # tri-state: omit for the default (on), `false` to explicitly disable
# signatures.signing_key is mandatory whenever either signatures.update_url
# is set or signatures.yara_forge.enabled is true. It must be the hex
# Ed25519 public key used to verify detached .sig files for rule bundles.
# Remote update URLs must use HTTP or HTTPS and must not point at localhost,
# loopback, link-local, unspecified, or RFC1918 / ULA private addresses.
#
# YARA Forge upstream GitHub releases do not publish CSM detached signatures.
# To enable automatic Forge updates, mirror the ZIPs, sign each ZIP, publish
# the signature at the ZIP URL plus .sig, and set yara_forge.download_url to
# that signed mirror. Otherwise leave update_url empty and yara_forge.enabled
# false.
# --- Web UI ---
webui:
  enabled: true
  listen: "0.0.0.0:9443" # address:port for HTTPS server
  auth_token: "" # Bearer/cookie auth token (auto-generated on install)
  tokens: [] # optional scoped tokens: name/token/scope (admin or read)
  metrics_token: "" # optional Bearer token for /metrics only
  tls_cert: "" # path to TLS certificate PEM file
  tls_key: "" # path to TLS private key PEM file
  ui_dir: "" # path to UI files on disk (default: /opt/csm/ui)
# --- Email AV ---
email_av:
enabled: false
clamd_socket: "/var/run/clamd.scan/clamd.sock" # path to ClamAV daemon socket
scan_timeout: "30s" # per-attachment scan timeout
max_attachment_size: 26214400 # max single attachment size in bytes (25MB)
max_archive_depth: 1 # max nested archive extraction depth
max_archive_files: 50 # max files extracted from a single archive
max_extraction_size: 104857600 # max total extraction size in bytes (100MB)
quarantine_infected: true # quarantine emails with infected attachments
scan_concurrency: 4 # parallel scan workers
# --- Email Protection ---
email_protection:
password_check_interval_min: 1440 # how often to audit email passwords (default: 1440)
high_volume_senders: [] # accounts expected to send high volume (skip rate alerts)
rate_warn_threshold: 50 # emails per window before warning (default: 50)
rate_crit_threshold: 100 # emails per window before critical (default: 100)
rate_window_min: 10 # rate check window in minutes (default: 10)
known_forwarders: [] # expected plain mail forwarders
# PHP-relay detector (cPanel only; gated by platform.IsCPanel at startup).
# Off by default. When enabled, the daemon spawns the inotify spool
# watcher, runs a startup spool walk, and starts the Path 2b retro scan
# on /var/log/exim_mainlog. See docs/src/detection-realtime.md#php-relay
# for what each path actually triggers on.
php_relay:
enabled: false # opt in to start the watcher
rate_window_min: 5 # Path 1 rolling window
header_score_volume_min: 5 # Path 1: don't score until script has emitted N msgs
absolute_volume_per_hour: 30 # Path 2 threshold per script
account_volume_per_hour: 0 # Path 2b operator override; 0 = auto-derive from cpanel.config maxemailsperhour
reputation_failures_per_24h: 3 # Path 3 threshold (Stage 2)
fanout_distinct_scripts: 3 # Path 4 threshold
fanout_window_min: 5 # Path 4 window
baseline_sigma: 3.0 # Path 5 (Stage 3)
baseline_observation_days: 7 # Path 5 (Stage 3)
policies_dir: "/opt/csm/policies/php_relay" # mailer_classes.yaml + http_proxy_ranges.yaml; SIGHUP-reloadable
cloud_relay:
allow_users: [] # full mailbox opt-outs for cloud-relay detection
allow_domains: [] # domain-wide opt-outs for cloud-relay detection
# Email forward guard (cPanel only). Opt-in MTA-native enforcement for
# external forward copies. Enforce mode can hold null-sender backscatter and
# bad-sender-IP copies before they relay to an external provider, while the
# local mailbox copy still delivers. Spam, malware, and auth-fail signals are
# accounted in dry-run until Exim content scanning is wired. CSM is not in the
# live mail path; an installed Exim rule can keep holding matching copies even
# if the daemon is down. Held copies can be released or deleted from the Email page.
forward_guard:
enabled: false # master switch (default off)
dry_run: true # account/log only, do not actually hold (default true)
quarantine_retention_days: 14 # held-copy retention window
skip_forwarders: [] # reserved forwarder exemptions; not enforced yet
hold_signals: # signal toggles, each default true
bounce_backscatter: true # null-sender bounce backscatter (enforceable)
spam_flagged: true # message flagged as spam (dry-run/accounting only)
malware: true # message carries malware (dry-run/accounting only)
bad_sender_ip: true # originating IP has bad reputation (enforceable)
auth_fail: true # sender failed SPF/DKIM/DMARC auth (dry-run/accounting only)
# --- Firewall ---
firewall:
enabled: false
# Open ports (IPv4). SSH (22) is intentionally absent; uncomment in
# the YAML lists if sshd listens on 22. TCP 853 is DNS-over-TLS;
# UDP 853 is DNS-over-QUIC.
# 6277/24441 are DCC/Pyzor network checks used by SpamAssassin.
tcp_in: [20,21,25,26,53,80,110,143,443,465,587,853,993,995,2077,2078,2079,2080,2082,2083,2091,2095,2096]
tcp_out: [20,21,25,26,37,43,53,80,110,113,443,465,587,853,873,993,995,2082,2083,2086,2087,2089,2195,2325,2703]
udp_in: [53,443,853]
udp_out: [53,113,123,443,853,873,6277,24441]
# IPv6
ipv6: false
tcp6_in: [] # if empty, uses tcp_in
tcp6_out: [] # if empty, uses tcp_out
udp6_in: [] # if empty, uses udp_in
udp6_out: [] # if empty, uses udp_out
# Restricted ports (infra IPs only)
restricted_tcp: [2086,2087,2325] # WHM ports
# Passive FTP range
passive_ftp_start: 49152
passive_ftp_end: 65534
# Infra IPs/CIDRs/hostnames for firewall rules
infra_ips: []
# Rate limiting
conn_rate_limit: 200 # new connections/min per IP (CGNAT-tolerant)
syn_flood_protection: true
conn_limit: 400 # max concurrent connections per IP (0 = disabled)
# Per-port flood protection: rate-limit new connections per source IP and IP family.
# Defaults are sized for a busy mail host: 600/300s = 120 new conns/min/IP,
# which tolerates a Thunderbird/iPhone client opening 5-15 parallel sessions
# while still capping single-IP flood storms.
port_flood:
- port: 25
proto: tcp
hits: 600
seconds: 300
- port: 465
proto: tcp
hits: 600
seconds: 300
- port: 587
proto: tcp
hits: 600
seconds: 300
# UDP flood protection
udp_flood: true
udp_flood_rate: 100 # packets per second
udp_flood_burst: 500 # burst allowance
# Country blocking
country_block: [] # ISO country codes to block
country_db_path: "" # path to MaxMind DB (uses geoip config if empty)
# Silent drop (no logging)
drop_nolog: [23,67,68,111,113,135,136,137,138,139,445,500,513,520]
# IP limits
deny_ip_limit: 3000 # max permanent blocked IPs
deny_temp_ip_limit: 500 # max temporary blocked IPs
# Outbound SMTP restriction
smtp_block: false # block outgoing mail except allowed users
smtp_allow_users: [] # usernames allowed to send
smtp_ports: [25,465,587]
# Dynamic DNS
dyndns_hosts: [] # hostnames to resolve and whitelist periodically
# Logging
log_dropped: true # log dropped packets
log_rate: 5 # log entries per minute
# --- GeoIP ---
geoip:
account_id: "" # MaxMind account ID
license_key: "" # MaxMind license key
editions: # MaxMind database editions
- GeoLite2-City
- GeoLite2-ASN
auto_update: true # auto-update GeoIP databases (default: true when credentials set)
update_interval: "24h" # update check interval
# --- ModSecurity ---
modsec_error_log: "" # path to Apache/LiteSpeed error log for ModSec parsing
modsec:
rules_file: "" # path to modsec2.user.conf
overrides_file: "" # path to csm-overrides.conf
reload_command: "" # command to reload web server (e.g. "/usr/sbin/apachectl graceful")
# --- Performance ---
performance:
enabled: true
load_high_multiplier: 1.0 # load average / CPU cores multiplier for warning (default: 1.0)
load_critical_multiplier: 2.0 # load average / CPU cores multiplier for critical (default: 2.0)
php_process_warn_per_user: 20 # per-user PHP process count warning (default: 20)
php_process_critical_total_multiplier: 5 # total PHP processes / CPU cores for critical (default: 5)
error_log_warn_size_mb: 50 # error log size warning threshold (default: 50)
mysql_join_buffer_max_mb: 64 # MySQL join_buffer_size warning threshold (default: 64)
mysql_wait_timeout_max: 3600 # MySQL wait_timeout warning threshold (default: 3600)
mysql_max_connections_per_user: 10 # per-user MySQL connections warning (default: 10)
redis_bgsave_min_interval: 900 # minimum seconds between Redis BGSAVE (default: 900)
redis_large_dataset_gb: 4 # Redis dataset size warning threshold in GB (default: 4)
wp_memory_limit_max_mb: 512 # WordPress memory_limit warning threshold (default: 512)
wp_transient_warn_mb: 1 # WordPress transient data warning in MB (default: 1)
wp_transient_critical_mb: 10 # WordPress transient data critical in MB (default: 10)
# --- Cloudflare ---
cloudflare:
enabled: false # auto-whitelist Cloudflare IP ranges
refresh_hours: 6 # how often to refresh Cloudflare IPs (default: 6)
# --- Threat Intel ---
c2_blocklist: [] # known C2 server IPs to block permanently
backdoor_ports: [4444,5555,55553,55555,31337] # ports indicating backdoor activity
# --- Update check ---
updates:
check_enabled: true # notify only; CSM never downloads or applies updates
interval: "24h" # release check interval
github_api_url: "" # optional release API mirror or test endpoint
package_name: "csm" # apt/dnf package name for package-manager fallback
# --- Incidents ---
incidents:
auto_close:
enabled: true # auto-close idle open/contained incidents
dry_run: false # log decisions without writing status changes
by_kind:
mailbox_takeover: 24h
credential_spray: 24h
web_account_compromise: 168h
spray_suppression:
enabled: false # collapse one-source credential spray into one incident
dry_run: true
distinct_mailboxes: 10
severity_escalate_at: 50
per_check: [email_auth_failure_realtime, pam_auth_failure, ssh_bruteforce]
max_tracked_ips: 10000
block_at_severity: "" # "" | high | critical
auto_block:
enabled: false # block source IPs from incident correlations
block_at_severity: "" # "" | high | critical
kinds: [] # empty means all non-spray kinds with remote_ip
# --- Disabled checks (skip whole categories per host) ---
# Listed finding names disable the scheduled check runner(s) that emit them,
# including sibling findings from the same runner. Realtime findings are not
# affected. Use for whole categories that don't apply to a host (e.g. WAF/web
# checks on DNS-only cPanel servers, where httpd is installed but no virtual
# hosts serve traffic).
# For email-only suppression, use `alerts.email.disabled_checks` instead.
disabled_checks: [] # e.g. [waf_status, waf_rules, waf_detection_only]
# --- Retention (bbolt growth control) ---
retention:
enabled: false # opt-in; when true, a daily sweep prunes old entries and compacts bbolt
findings_days: 90 # keep active findings this long (0 disables the findings sweep)
history_days: 30 # keep findings-history entries this long
reputation_days: 180 # keep IP reputation/attack entries this long
sweep_interval: "24h" # how often the retention goroutine runs
compact_min_size_mb: 128 # don't consider compaction below this file size
compact_fill_ratio: 0.5 # compact when used_bytes / file_size drops below this
# --- Sentry (error reporting) ---
sentry:
enabled: false # ship panics and selected errors to a Sentry server
dsn: "" # Sentry project DSN
environment: "production" # e.g. "production", "staging"
sample_rate: 1.0 # 0.0 -> 1.0 (capture all errors)
debug: false # SDK debug logs to stderr
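The retention block above describes compaction in terms of two knobs, compact_min_size_mb and compact_fill_ratio. As a rough illustration of how those two values could combine, here is a minimal Python sketch. It is not CSM's Go implementation: the function name should_compact is hypothetical, and treating "mb" as mebibytes is an assumption.

```python
MIB = 1024 * 1024  # assumption: "mb" in the config means mebibytes


def should_compact(file_size_bytes, used_bytes, min_size_mb=128, fill_ratio=0.5):
    """Return True when a compaction pass looks worthwhile.

    Mirrors the two documented knobs: files smaller than
    compact_min_size_mb are never considered, and larger files are
    compacted when used_bytes / file_size drops below compact_fill_ratio.
    """
    if file_size_bytes <= 0:
        return False  # nothing to compact, and avoids division by zero
    if file_size_bytes < min_size_mb * MIB:
        return False
    return used_bytes / file_size_bytes < fill_ratio
```

With the defaults, a 200 MiB file holding 60 MiB of live data would qualify (ratio 0.3), while a 64 MiB file would be skipped regardless of how empty it is.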
TLS Certificates
The Web UI serves over HTTPS. Configure TLS certificates under webui:
webui:
  tls_cert: "/var/cpanel/ssl/cpanel/mycpanel.pem" # certificate PEM file
  tls_key: "/var/cpanel/ssl/cpanel/mycpanel.pem" # private key PEM file
On cPanel servers, you can reuse the cPanel self-signed certificate (both cert and key are in the same PEM file). For production, use a proper certificate from Let’s Encrypt or your CA.
If tls_cert and tls_key are empty, the Web UI will not start.
Validation
csm validate # syntax check
csm validate --deep # syntax + connectivity probes (SMTP, webhooks)
csm config show # display config with secrets redacted
Editing csm.yaml by hand
CSM stores a sha256 of the main config in integrity.config_hash and
a separate digest of loaded drop-ins in integrity.confd_hash. It
refuses to start if the on-disk files disagree with those values. This
is a tamper-detection feature. There are two supported edit workflows
depending on which fields you touch.
Fast path: SIGHUP reload (safe fields only)
For fields tagged as hot-reload-safe (alerts, thresholds,
detection, suppressions, auto_response, bpf_enforcement,
reputation, email_protection, disabled_checks), the daemon can
accept the change without a restart:
sudo cp /etc/csm/csm.yaml /etc/csm/csm.yaml.bak-$(date +%s)
# edit /etc/csm/csm.yaml with your favourite editor
sudo systemctl reload csm
sudo journalctl -u csm -n 20 --no-pager
systemctl reload sends SIGHUP (wired via ExecReload= in the unit
file). The daemon re-reads the file, validates it, diffs it against
the running config, and if every change is on a field tagged
hotreload:"safe" it swaps the new values into
the live config and re-signs the integrity hashes on disk. The
next check tick sees the new thresholds; fanotify marks are not
dropped.
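The reload decision described above (diff the file against the running config, apply only when every change is on a safe field) can be sketched in a few lines of Python. This is an illustration only, not the daemon's Go code: classify_changes and can_hot_reload are hypothetical names, and the sketch compares top-level keys only, whereas the daemon works from per-field hotreload tags.

```python
# Top-level sections this document lists as hot-reload-safe.
SAFE_FIELDS = {
    "alerts", "thresholds", "detection", "suppressions", "auto_response",
    "bpf_enforcement", "reputation", "email_protection", "disabled_checks",
}


def classify_changes(running, on_disk, safe_fields=SAFE_FIELDS):
    """Split changed top-level keys into (safe, restart_required) lists."""
    changed = sorted(
        key for key in set(running) | set(on_disk)
        if running.get(key) != on_disk.get(key)
    )
    safe = [key for key in changed if key in safe_fields]
    restart_required = [key for key in changed if key not in safe_fields]
    return safe, restart_required


def can_hot_reload(running, on_disk):
    """A reload is applied only when no restart-required field changed."""
    _, restart_required = classify_changes(running, on_disk)
    return not restart_required
```

A single restart-required change is enough to reject the whole reload, which matches the "live config unchanged" behavior: the safe changes in the same edit are not applied piecemeal.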
The tagged-safe top-level fields are alerts, thresholds,
detection, suppressions, auto_response, bpf_enforcement,
reputation, email_protection, and disabled_checks. The Settings
API derives its restart hints from the same manifest that drives
config.Diff, so UI hints and SIGHUP behavior cannot drift silently.
Changes to their sub-keys are picked up on the next tick by the
periodic scanners, the auto-response helpers
(block/kill/quarantine/challenge/permission-fix), alert dispatch, and
the heartbeat.
Two sub-keys are exceptions. They live under a safe-tagged parent but seed a long-lived in-memory structure at daemon startup; the reload accepts the edit and re-signs the hash, but the running structure keeps the old value until the next restart:
- reputation.whitelist – seeded into the threat database at startup. The threat database exposes its own runtime API for adding and removing whitelist entries (via the Threat Intelligence page in the Web UI or the /api/v1/threat/* endpoints); those paths survive restarts because the threat database persists the runtime list to disk. Reloading reputation.whitelist from csm.yaml does not automatically propagate to the running threat database.
- email_protection.known_forwarders – captured by the forwarder watcher at startup and read by scheduled forwarder and mail-filter checks. No runtime API yet; restart the daemon if you edit this list.
If you change either of the above, run systemctl restart csm
instead of a reload. The rest of the sub-keys in every safe-tagged
section are read per-call (inside check functions, auto-response
helpers, alert dispatchers) and hot-reload cleanly on the next
tick.
Look for one of three log shapes in the journal:
- SIGHUP: config reloaded; safe fields updated: [thresholds] – success. The new values are live.
- config_reload_restart_required: SIGHUP reload: restart-required fields changed: [hostname ...]; live config unchanged – the edit touched a field that cannot be hot-swapped. A Warning config_reload_restart_required finding is also emitted. Fall back to the restart path below.
- config_reload_error: SIGHUP reload: parse failed ... or ... validation error ... – the file on disk is not loadable or fails csm validate. A Critical config_reload_error finding is emitted. The live config is unchanged; fix the file and repeat.
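For automation that reloads first and only restarts when needed, the three log shapes can be told apart with plain substring checks. The following is a minimal Python sketch under that assumption; classify_reload_log is a hypothetical helper, and the matched substrings are taken from the log shapes listed here, not from a stable log-format contract.

```python
def classify_reload_log(line):
    """Map one journal line to 'ok', 'restart_required', 'error', or 'unknown'."""
    if "config_reload_error" in line:
        return "error"
    if ("config_reload_restart_required" in line
            or "restart-required fields changed" in line):
        return "restart_required"
    if "SIGHUP: config reloaded" in line:
        return "ok"
    return "unknown"
```

A wrapper script could feed it the last few lines of journalctl -u csm output and run the restart path only when the result is restart_required.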
Restart path: unsafe fields
Fields not tagged hotreload:"safe" (the majority, including
hostname, state_path, webui.listen, firewall.*, email_av.*,
and anything that is initialized only once per daemon lifetime)
require a full restart. The integrity hashes must be re-signed first:
sudo cp /etc/csm/csm.yaml /etc/csm/csm.yaml.bak-$(date +%s)
# edit /etc/csm/csm.yaml with your favourite editor
sudo /opt/csm/csm rehash # re-signs integrity hashes
sudo /opt/csm/csm validate # syntax + value sanity
sudo systemctl restart csm
sudo systemctl status csm # confirm active, no crash-loop
If the restart fails (most commonly because rehash was skipped),
roll back with
sudo cp <backup> /etc/csm/csm.yaml && sudo systemctl restart csm.
The backup carries its own matching hash so no second rehash is
needed.
Config-management tools
Config-management workflows (Ansible, Puppet, Chef) should:
- For safe changes, notify systemctl reload csm instead of restart. The daemon re-signs the hash itself; no separate csm rehash step is required.
- For any change that may touch a restart-required field, run csm rehash before the restart notify fires. Or always send reload first, read the journal, and promote to restart only when the reload logs restart-required.
conf.d drop-ins
Files matching /etc/csm/conf.d/*.yaml are loaded after the main config and deep-merged on top of it. Override with --config-dir <path> or CSM_CONFIG_DIR; the flag wins when both are set.
- Order: lexicographic by filename. Scalar keys in 20-overrides.yaml override the same keys in 10-base.yaml. Use a numeric prefix.
- Merge semantics: maps merge recursively; scalars replace the value from the main file; lists append in fragment order. All-scalar lists drop duplicate entries while keeping the first occurrence; structured lists such as webui.tokens keep every entry.
- Trust: override directories must be absolute, must exist, and must be owned by root or the running process. The directory and every loaded fragment must not be group- or world-writable. Safe symlinked fragments are allowed, so packaged profiles can still be linked into /etc/csm/conf.d/.
- Integrity ownership: drop-ins cannot set the integrity block. Integrity metadata is stored only in the main config.
- Hash: integrity.config_hash covers the main file and integrity.confd_hash covers loaded drop-ins. After editing a drop-in by hand, run csm rehash before restarting, or use systemctl reload csm so the daemon can re-sign after validating the merged config. Web settings saves refuse to bless a drop-in change that has not already been re-signed.
- Use cases: packaged integration profiles (e.g. /usr/lib/csm/profiles/phpanel-agent.yaml symlinked into conf.d/), per-host automation that should not touch the operator’s csm.yaml, secret material rendered from a vault.
ls /etc/csm/conf.d/
# 10-phpanel-agent.yaml 20-tenant-overrides.yaml
csm validate # validates the merged config
csm config show # prints the merged, redacted config
csm config schema --json # JSON Schema for editor / CI validation
csm validate and csm config show always operate on the merged config so you can audit the effective state without grepping fragments.
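To make the ordering and merge rules concrete, here is a small Python sketch of a deep merge with the documented semantics: maps merge recursively, scalars replace, lists append, and all-scalar lists drop later duplicates. It is an illustration, not CSM's merge code. The names merge and load_merged are hypothetical, fragments are assumed to be already parsed into dicts, and replacing on a type mismatch is an assumption the text does not cover.

```python
from copy import deepcopy


def _is_scalar(value):
    return not isinstance(value, (dict, list))


def merge(base, fragment):
    """Deep-merge fragment on top of base and return a new value."""
    if isinstance(base, dict) and isinstance(fragment, dict):
        out = deepcopy(base)
        for key, value in fragment.items():
            out[key] = merge(out[key], value) if key in out else deepcopy(value)
        return out
    if isinstance(base, list) and isinstance(fragment, list):
        combined = deepcopy(base) + deepcopy(fragment)
        if all(_is_scalar(item) for item in combined):
            # All-scalar list: keep the first occurrence of each entry.
            seen, unique = set(), []
            for item in combined:
                marker = (type(item), item)  # keeps 1 and True distinct
                if marker not in seen:
                    seen.add(marker)
                    unique.append(item)
            return unique
        return combined  # structured list: keep every entry
    return deepcopy(fragment)  # scalar, or mismatched types (assumption): replace


def load_merged(main, fragments):
    """Apply fragments (filename -> parsed dict) in lexicographic filename order."""
    merged = deepcopy(main)
    for name in sorted(fragments):
        merged = merge(merged, fragments[name])
    return merged
```

Because fragments are applied in sorted filename order, a scalar set in 20-overrides.yaml wins over the same key in 10-base.yaml, which is why a numeric prefix is the practical way to control precedence.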
detection.direct_smtp_egress
Phase 3 detector. backend accepts auto, bpf, legacy, or none;
ports must contain TCP ports in the 1-65535 range. See
Direct SMTP egress.
bpf_enforcement
Phase 4 enforcement. Requires a BPF-capable connection tracker at
runtime; auto falls back to legacy detection on older servers. See
BPF enforcement.
Upgrading
deploy.sh (recommended)
/opt/csm/deploy.sh upgrade
This will:
- Stop the daemon
- Back up the current binary
- Download the new version
- Verify SHA256 checksum
- Extract UI assets and rules
- Rehash config
- Restart the daemon
Rolls back automatically on failure.
Troubleshooting
“store: opening bbolt: timeout” – Most operator commands that need live state now route through the control socket at /var/run/csm/control.sock. This error should only appear from commands that intentionally open the bbolt file directly, such as csm store compact, csm store import, csm store reset-bot-verify, csm db-clean --drop-object, or a second daemon start while one daemon already owns the database.
Fix: stop the daemon before direct-store maintenance commands, then retry:
systemctl stop csm
csm store compact
systemctl start csm
If systemctl says CSM is stopped but bbolt still times out, find the process holding /var/lib/csm/state/csm.db and stop that process after review. Do not delete csm.lock; it is only the daemon instance guard and does not release bbolt’s file lock.
“csm: daemon not running” – CLI commands that talk to the daemon exit 2 with this message when the control socket is missing. This includes csm run*, csm check*, csm baseline, csm status, csm firewall ..., csm store export, csm export --since, and csm phprelay .... Start the daemon with systemctl start csm. Bootstrap commands that run before the daemon exists (csm install, csm validate, csm config schema, csm verify, csm rehash) do not require it.
Never delete csm.db – it contains all historical findings, firewall state, email forwarder baselines, and per-account data. If you delete it, the web UI will show empty data until the next full scan cycle (up to 60 minutes for deep scan findings). Restore from backup when possible; for an intentional reset, run csm baseline --confirm rather than removing the database by hand.
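Config changes require rehash – After editing csm.yaml, run csm rehash twice (the config hash is stored inside the config file, creating a circular dependency – the second run stabilizes it) before restarting. For hot-reload-safe fields, systemctl reload csm re-signs the hashes for you; a plain systemctl restart csm without a prior rehash fails the integrity check.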
RPM/DEB
yum update csm # RPM
dpkg -i csm_NEW.deb # DEB
Package managers handle stop/start automatically.
FHS migration (state, config, drop-ins, and profiles)
Current packages use FHS paths for state, config, drop-ins, and shipped profiles. Legacy main configs continue to work during the transition.
| Concern | Legacy path | Current path |
|---|---|---|
| Drop-in fragments | n/a | /etc/csm/conf.d/*.yaml |
| State directory | /opt/csm/state | /var/lib/csm/state |
| Shipped profiles | n/a | /usr/lib/csm/profiles |
| Binary | /opt/csm/csm | /opt/csm/csm (unchanged) |
| Main config | /opt/csm/csm.yaml | /etc/csm/csm.yaml |
| Legacy config path | n/a | /opt/csm/csm.yaml symlink |
The package postinstall creates the FHS directories with the right ownership. If /opt/csm/csm.yaml is a real file and /etc/csm/csm.yaml is absent or still the shipped placeholder, the package copies the legacy config into /etc/csm/csm.yaml and then replaces the old path with a symlink. If both paths are real files with different operator content, CSM refuses the implicit default path until you move one aside or pass --config <path>.
The daemon copies a non-empty legacy /opt/csm/state/ into the new state directory on first start, but only when the new directory is empty (so a partial migration cannot corrupt it). The legacy directory is left in place; remove it after you have verified the new install.
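The migration guard in the previous paragraph boils down to one condition: the legacy directory has content and the new directory does not. A minimal Python sketch of that check follows. It is not the daemon's code, should_migrate_state is a hypothetical name, and treating a missing new directory the same as an empty one is an assumption.

```python
from pathlib import Path


def _non_empty_dir(path):
    """True when path is a directory with at least one entry."""
    return path.is_dir() and any(path.iterdir())


def should_migrate_state(legacy_dir, new_dir):
    """Copy legacy state only when it has content and the target is empty."""
    legacy, new = Path(legacy_dir), Path(new_dir)
    if not _non_empty_dir(legacy):
        return False
    # Assumption: a missing new directory counts as empty.
    return not _non_empty_dir(new)
```

Once the new directory holds anything at all, the check stays false, so a half-finished copy is never overwritten on the next start.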
Operators upgrading by manual binary swap (without re-running the package postinstall) keep the legacy state path if state_path: /opt/csm/state is pinned in the existing csm.yaml. To move state to the FHS layout, either reinstall the package or create the directories by hand and remove the state_path: override.
systemd Type=notify drop-in
The packaged unit file is Type=notify with WatchdogSec=300. The daemon signals READY=1 after watchers attach and pings WATCHDOG=1 on schedule, so systemctl is-active reflects truth and the watchdog kills a hung daemon.
Older units shipped Type=simple. The watchdog still functions because the daemon pings regardless of unit type, but systemctl status only sees the process, not “watchers attached.” If you need the new behavior on an older unit, drop in:
# /etc/systemd/system/csm.service.d/notify.conf
[Service]
Type=notify
NotifyAccess=main
Then systemctl daemon-reload && systemctl restart csm. Verify with systemctl show csm -p Type -p StatusText.
Auto-response dry-run safety default
auto_response.dry_run defaults to true when the key is absent. The daemon records every IP it would have blocked but does not touch nftables. If your auto_response: block sets enabled: true and block_ips: true but does not set dry_run, add dry_run: false explicitly before relying on auto-block. Verify with:
csm status --json | jq '.capabilities, .severities'
csm firewall status # check that "Recently Blocked" picks up new entries after the restart
Manual csm firewall ... operations bypass dry-run and always apply.
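The safety default is easy to misread, so here is a short Python sketch of the effective behavior for a parsed auto_response block. It is an illustration only: will_auto_block is a hypothetical helper, and the false defaults for enabled and block_ips are assumptions; only the dry_run default of true comes from the text above.

```python
def will_auto_block(auto_response):
    """True only when blocking is enabled and dry-run is explicitly off."""
    enabled = auto_response.get("enabled", False)      # assumed default
    block_ips = auto_response.get("block_ips", False)  # assumed default
    dry_run = auto_response.get("dry_run", True)       # absent key means dry-run
    return bool(enabled and block_ips and not dry_run)
```

A block with enabled: true and block_ips: true but no dry_run key therefore records would-be blocks without touching nftables.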
CLI Commands
Global flags
| Flag | Description |
|---|---|
| --config <path> | Override the main config path. Default: /etc/csm/csm.yaml, with fallback to /opt/csm/csm.yaml on legacy installs. |
| --config-dir <path> | Override the conf.d directory. Default: /etc/csm/conf.d. Wins over CSM_CONFIG_DIR when both are set. Override paths must be absolute, trusted, and not group- or world-writable; loaded fragments must meet the same write-safety check. |
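The write-safety requirement on override paths can be checked before handing a directory to --config-dir. The Python sketch below shows one way to express it on Linux. It is not CSM's validation code: is_write_safe and is_trusted_config_dir are hypothetical names, and the ownership rule (root or the current effective user) is taken from the conf.d trust description.

```python
import os
import stat


def is_write_safe(path):
    """Owned by root or the current user, and not group- or world-writable."""
    st = os.stat(path)
    owner_ok = st.st_uid in (0, os.geteuid())
    mode_ok = (st.st_mode & (stat.S_IWGRP | stat.S_IWOTH)) == 0
    return owner_ok and mode_ok


def is_trusted_config_dir(path):
    """Absolute, existing directory that passes the write-safety check."""
    return os.path.isabs(path) and os.path.isdir(path) and is_write_safe(path)
```

Running the same is_write_safe check over each fragment file covers the second half of the rule, since loaded fragments must meet the same condition as the directory.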
Daemon
| Command | Description |
|---|---|
| csm daemon | Run as persistent daemon (fanotify + inotify + PAM + periodic checks). Signals systemd READY=1 after watchers attach and pings WATCHDOG=1 on the configured interval. |
Checks
| Command | Description |
|---|---|
| csm run | Run all checks now via the daemon, send alerts |
| csm run-critical | Critical checks now via the daemon (the daemon also schedules critical checks internally every 10 min) |
| csm run-deep | Deep checks now via the daemon (the daemon also schedules deep checks internally every 60 min) |
| csm check | Run all checks via the daemon, print findings to stdout, no alerts / auto-response |
| csm check-critical | Test critical checks only (dry-run via daemon) |
| csm check-deep | Test deep checks only (dry-run via daemon) |
| csm scan <user> | Scan single cPanel account |
Management
| Command | Description |
|---|---|
| csm install | Deploy config, systemd, auditd rules, logrotate, WHM plugin |
| csm uninstall | Clean removal |
| csm baseline | Full server scan via the daemon, records current state for change tracking. Dangerous privileged accounts or WHM root tokens can still be reported on first scan. Takes 5-10 min on large servers. Required on first install. Add --confirm when existing history would be cleared. The daemon must be running. |
| csm rehash | Update binary/config hashes without scanning. Use after config edits. Run twice (circular hash). |
| csm status | Show current state, last run, active findings, and automation rollout state. Add --json for the full health snapshot (watchers, severity counts, store health, blocklist size, capabilities, version, hashes, automation). |
| csm doctor | Config + daemon + watchers + store sanity check. csm doctor challenge checks challenge public URL, TLS, port gate, webserver snippets, configtest, and the live /challenge/gate endpoint. Add --json for machine-readable output. |
| csm validate | Validate config (--deep for connectivity probes) |
| csm config show | Display config with secrets redacted |
| csm config schema --json | Print a JSON Schema reflected from the Config struct. Use for CI validation of conf.d drop-ins or panel-side editor schemas. |
| csm verify | Verify binary and config integrity |
| csm version | Version and build info |
Backup & restore
| Command | Description |
|---|---|
| csm backup <path> | Bundle csm.yaml, /etc/csm/conf.d/, and the state directory into a tar.gz at <path>. Use for clean DR snapshots. Daemon may be running. |
| csm restore <archive> | Extract a backup archive into the live csm.yaml + conf.d + state directory. Rejects path-traversal entries and pre-existing symlinks under restore targets. Stop the daemon first. |
csm store export / csm store import (below) is the lower-level alternative: tar+zstd, sha256-verified, finer-grained --only= flags. csm backup/restore is the convenience wrapper most operators want.
Hardening
Operator-driven mitigations applied to the host. Run csm harden with no arguments to print the available subcommands on the current host (the audit detects kernel build, panel, and existing mitigations and only offers what’s relevant). Background, full list, and live-detection details: CVE Mitigations.
| Command | Description |
|---|---|
| csm harden | Print the hardening menu for this host. |
| csm harden --copy-fail | Apply the CVE-2026-31431 (Copy Fail) modprobe mitigation: blacklist algif_aead + af_alg, unload them. Refuses on built-in-AF_ALG kernels. |
| csm harden --copy-fail-seccomp | Apply the CVE-2026-31431 seccomp mitigation: write systemd RestrictAddressFamilies=~AF_ALG drop-ins for LiteSpeed, Apache/Nginx, every PHP-FPM pool, cron, and mail units. The right path on built-in-AF_ALG kernels (typical cPanel/CloudLinux 8). |
Remediation
| Command | Description |
|---|---|
| csm clean <path> | Clean infected PHP file (backs up original) |
| csm db-clean --option <account> <option_name> [--preview] | Sanitize malicious WordPress option values (e.g. injected siteurl / home) |
| csm db-clean --revoke-user <account> <user_id> [--demote] [--preview] | Revoke or demote a compromised WordPress admin and invalidate their sessions |
| csm db-clean --delete-spam <account> [--preview] | Purge spam comments and trackbacks from a WordPress account |
| csm db-clean --drop-object <account> <schema> <type> <name> [--preview] | Drop a MySQL trigger / event / stored procedure / stored function, capturing its CREATE SQL into the db_object_backups bbolt bucket first. <type> must be trigger, event, procedure, or function. <schema> must match a database discovered for <account>. Daemon must be stopped. |
| csm enable --php-shield | Enable PHP runtime protection |
| csm disable --php-shield | Disable PHP runtime protection |
State database
| Command | Description |
|---|---|
| csm store compact | Reclaim unused space in the bbolt state file (atomic rename over the live DB). Requires the daemon to be stopped (systemctl stop csm) because bbolt holds an exclusive file lock while running. |
| csm store compact --preview | Snapshot into a temp file next to the live DB and print src/dst sizes without replacing anything. Use to estimate reclaim before scheduling a maintenance window. |
| csm store export <path> | Write a tar+zstd backup containing the bbolt store, the state directory, and the signature-rules cache. A sibling <path>.sha256 companion file holds the archive hash for verification. Daemon must be running. |
| csm store import <path> | Restore from a backup archive. Daemon must be stopped. Default restores everything; --only=baseline restores only state JSON files (file hashes); --only=firewall merges only firewall buckets into the existing bbolt; --force-platform-mismatch allows restoring an archive captured on a different OS / panel / web server. |
| csm store reset-bot-verify | Drop cached bot PTR verification results so the next scan re-runs reverse DNS checks. Requires the daemon to be stopped because bbolt holds an exclusive file lock while running. |
| csm export --since <when> | Dump audit-log events for SIEM backfill. <when> is RFC 3339 (2026-04-01T00:00:00Z) or a duration relative to now (24h, 7d). One JSON event per line on stdout, in the same v=1 schema the live audit_log sinks emit. Pipe to a file or directly into a log shipper. Daemon must be running. |
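Scripts that compute a --since value, or that want to check one before calling csm export, need to accept both documented forms. The Python sketch below parses an RFC 3339 timestamp or a relative duration into a UTC datetime. It is an illustration, not CSM's parser: parse_since is a hypothetical name, and only the h and d suffixes shown in the table are handled, since the text does not say which other units are accepted.

```python
import re
from datetime import datetime, timedelta, timezone

_DURATION = re.compile(r"(\d+)([hd])")
_UNITS = {"h": "hours", "d": "days"}


def parse_since(when, now=None):
    """Return an aware UTC datetime for an RFC 3339 stamp or a duration."""
    now = now or datetime.now(timezone.utc)
    match = _DURATION.fullmatch(when)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return now - timedelta(**{_UNITS[unit]: amount})
    # datetime.fromisoformat accepts a trailing "Z" only on Python 3.11+,
    # so normalize it to an explicit offset first.
    stamp = datetime.fromisoformat(when.replace("Z", "+00:00"))
    if stamp.tzinfo is None:
        raise ValueError("timestamp needs a UTC offset")
    return stamp.astimezone(timezone.utc)
```

Passing now explicitly keeps the function deterministic, which is convenient when backfill windows are computed in tests or CI.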
Updates
| Command | Description |
|---|---|
| csm update-rules | Download latest signature rules |
| csm update-geoip | Update MaxMind GeoLite2 databases |
PHP-relay (mail abuse, cPanel only)
Operator controls for the email PHP-relay detector. Talks to the daemon’s control socket; the daemon must be running. See Real-time detection for what the detector fires on, and Auto-response for the freeze action.
| Command | Description |
|---|---|
| csm phprelay status | Print the detector’s current state as JSON: enabled, platform, effective dry-run + source (runtime/bbolt/csm.yaml), Path 2b effective account limit, scripts/IPs/accounts tracked, msgID-index size, active ignores. Use to confirm the watcher is wired on a fresh install. |
| csm phprelay ignore-script <scriptKey> [--for-hours N] [--persist] [--reason ...] | Suppress all 4 paths for a host:/path scriptKey. Default TTL 168h (7d). --persist writes to the bbolt phprelay:ignore bucket so the suppression survives daemon restarts; without it the entry is in-memory only. <scriptKey> is the value the daemon prints in email_php_relay_abuse findings (e.g. shop.example.com:/wp-admin/admin-ajax.php). |
| csm phprelay unignore <scriptKey> [--persist] | Remove an active ignore. --persist also deletes the bbolt row. |
| csm phprelay ignore-list | List all active ignores as JSON: scriptKey, expiresAt, addedBy, reason. |
| csm phprelay dry-run on\|off\|reset [--persist] | Override the auto-freeze dry-run state at runtime. on = freeze findings emitted but no exim -Mf runs; off = live freezes; reset clears the runtime override and falls back to bbolt or csm.yaml. Precedence: runtime > bbolt > yaml. --persist writes the on/off choice to the bbolt phprelay:settings bucket so it survives restarts; on reset --persist the bbolt row is also deleted. |
| csm phprelay thaw <msgID> | Manually thaw a frozen Exim message. Wraps exim -Mt with msgID validation (rejects anything that isn’t [A-Za-z0-9-]{16,32}) and writes a thaw entry to the auto-freeze JSONL audit at /var/log/csm/php_relay_audit.jsonl. |
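Two rules from the table above are compact enough to sketch directly: the msgID pattern enforced by thaw, and the runtime > bbolt > yaml precedence for dry-run. The Python below is an illustration, not the daemon's code. The function names are hypothetical, and representing an unset layer as None is an assumption.

```python
import re

# Pattern quoted in the thaw row: 16 to 32 characters from [A-Za-z0-9-].
_MSG_ID = re.compile(r"[A-Za-z0-9-]{16,32}")


def is_valid_msg_id(msg_id):
    """Accept only strings that match the documented msgID pattern in full."""
    return _MSG_ID.fullmatch(msg_id) is not None


def effective_dry_run(yaml_value, bbolt=None, runtime=None):
    """Resolve dry-run with precedence runtime > bbolt > yaml.

    None means "not set at this layer" (assumption).
    """
    for value in (runtime, bbolt):
        if value is not None:
            return value
    return yaml_value
```

Using fullmatch rather than search matters for the validation: anything appended to an otherwise valid ID, such as shell metacharacters, makes the whole string fail.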
Firewall
See Firewall for the full reference.
csm firewall status
csm firewall deny <ip> [reason]
csm firewall allow <ip> [reason]
csm firewall tempban <ip> <dur> [reason]
csm firewall deny-subnet <cidr> [reason]
csm firewall grep <pattern>
csm firewall flush
csm firewall rollback status|confirm|revert
# ...
Real-Time Detection
CSM detects threats within about 2 seconds using three kernel-level watchers running inside the daemon.
fanotify File Monitor (< 1 second)
Monitors /home, /tmp, /dev/shm for filesystem events.
Detects:
- Webshell creation (PHP files in web directories)
- PHP in uploads, languages, upgrade directories
- PHP in .ssh, .cpanel, mail directories (critical escalation)
- Executable drops in .config
- .htaccess injection (auto_prepend, eval, base64 handlers)
- .user.ini tampering
- Obfuscated PHP (encoded, packed, concatenated)
- Fragmented base64 evasion ($a="base"; $b="64_decode" – function name split across variables)
- Concatenation payloads (hundreds of $z .= "xxxx" lines with eval at end)
- Tail scanning: payloads appended to the end of large legitimate PHP files (beyond the 32KB head window)
- CGI backdoors: Perl, Python, Bash, Ruby scripts in web directories (e.g., LEVIATHAN toolkit)
- SEO spam: gambling/togel dofollow link injection in PHP/HTML files
- Phishing pages and credential harvest logs
- Phishing kit ZIP archives
- YAML signature matches (PHP, HTML, .htaccess, .user.ini)
- YARA-X rule matches (if built with -tags yara)
Features:
- Per-path alert deduplication (30s cooldown)
- Process info enrichment (PID, command, UID)
- Auto-quarantine on high-confidence matches (category + entropy validation)
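The per-path deduplication listed above can be pictured as a small cooldown table keyed by file path. The sketch below is a guess at the shape, not the daemon's code: the class name PathCooldown is hypothetical, and it assumes that suppressed events do not extend the 30-second window.

```python
import time


class PathCooldown:
    """Suppress repeat alerts for the same path inside a cooldown window."""

    def __init__(self, cooldown_s=30.0, clock=time.monotonic):
        self.cooldown_s = cooldown_s
        self.clock = clock          # injectable so tests can control time
        self._last_alert = {}       # path -> time of the last emitted alert

    def should_alert(self, path):
        now = self.clock()
        last = self._last_alert.get(path)
        if last is not None and now - last < self.cooldown_s:
            return False            # still cooling down: drop the duplicate
        self._last_alert[path] = now
        return True
```

A production version would also need to expire old entries so the table does not grow without bound on a busy host.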
inotify Log Watchers (~2 seconds)
Tails auth, access, and mail logs in real-time. The exact file paths are chosen per platform at daemon startup – see the platform: ... line in the daemon log.
| Log | Platforms | What it detects |
|---|---|---|
| cPanel session log (/usr/local/cpanel/logs/session_log) | cPanel only | Logins from non-infra IPs, password changes, File Manager uploads |
| cPanel access log (/usr/local/cpanel/logs/access_log) | cPanel only | cPanel-API auth patterns |
| Auth log | All | SSH logins and failures. /var/log/auth.log on Debian/Ubuntu, /var/log/secure on RHEL family and cPanel |
| Exim mainlog (/var/log/exim_mainlog) | cPanel; non-cPanel when the file exists | Mail anomalies, queue issues, SMTP brute force, probe abuse, and cloud relay abuse |
| Apache/LiteSpeed/Nginx access log | All | WordPress brute force (wp-login.php, xmlrpc.php), real-time. Paths: /var/log/apache2/access.log (Debian), /var/log/httpd/access_log (RHEL), /var/log/nginx/access.log (Nginx), /usr/local/apache/logs/access_log (cPanel) |
| Mail log (platform file or journal) | All hosts with Postfix/Dovecot logs | IMAP/POP3/ManageSieve account compromise and mail brute-force |
| FTP log (/var/log/messages) | cPanel only | FTP logins and failures |
| ModSecurity error log | All (if ModSec installed) | WAF blocks and attacks. Auto-discovered from the detected web server |
| Nginx error log (/var/log/nginx/error.log) | Nginx hosts | General web errors, ModSecurity denies |
cPanel-only log watchers are not registered on non-cPanel hosts, so you will not see “not found, retrying every 60s” warnings for them on plain Ubuntu or AlmaLinux.
SMTP / Dovecot Brute-Force Tracker
Detects credential stuffing, password spray, and raw SMTP probe storms. Runs as part of the Exim mainlog watcher on cPanel hosts and on non-cPanel Exim hosts where /var/log/exim_mainlog exists.
Four attack patterns:
| Signal | What triggers it | Auto-response |
|---|---|---|
| smtp_bruteforce | A single attacker IP exceeds the per-IP failed-auth threshold within the configured window | IP blocked via nftables |
| smtp_probe_abuse | A single attacker IP exceeds the raw SMTP connect-rate threshold before AUTH | IP blocked via nftables |
| smtp_subnet_spray | Multiple distinct attacker IPs from the same /24 subnet exceed the subnet threshold | Entire /24 subnet blocked via nftables |
| smtp_account_spray | Many distinct attacker IPs targeting the same mailbox exceed the account threshold | Visibility finding only. No auto-block, because attackers span many subnets and no single-IP action helps |
Tunable via the thresholds.smtp_bruteforce_* and thresholds.smtp_probe_* keys in csm.yaml. Infrastructure IPs (from infra_ips) are never counted or blocked.
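To make the per-IP and /24 signals concrete, here is a minimal sliding-window sketch. It is not CSM's implementation: the class name, constructor parameters, and the choice to count a subnet by its distinct currently-failing IPs are all assumptions, and it handles IPv4 only.

```python
import ipaddress
from collections import defaultdict, deque


class BruteForceTracker:
    """Sliding-window failed-auth counter keyed by IP, with a /24 rollup."""

    def __init__(self, window_s, per_ip_threshold, subnet_threshold, infra_ips=()):
        self.window_s = window_s
        self.per_ip_threshold = per_ip_threshold
        self.subnet_threshold = subnet_threshold
        self.infra_ips = set(infra_ips)
        self._failures = defaultdict(deque)   # ip -> failure timestamps

    def record_failure(self, ip, now):
        """Record one failed auth and return the signals it triggers."""
        if ip in self.infra_ips:
            return []                         # infra IPs are never counted
        self._failures[ip].append(now)
        self._expire(now)
        signals = []
        if len(self._failures[ip]) > self.per_ip_threshold:
            signals.append(("smtp_bruteforce", ip))
        subnet = ipaddress.ip_network(f"{ip}/24", strict=False)
        active = [a for a in self._failures if ipaddress.ip_address(a) in subnet]
        if len(active) > self.subnet_threshold:
            signals.append(("smtp_subnet_spray", str(subnet)))
        return signals

    def _expire(self, now):
        """Drop failures older than the window and forget idle IPs."""
        for ip in list(self._failures):
            q = self._failures[ip]
            while q and now - q[0] > self.window_s:
                q.popleft()
            if not q:
                del self._failures[ip]
```

The account-spray signal would need a second index keyed by mailbox rather than by IP, which is omitted here.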
Cloud-Relay Credential Abuse
Detects authenticated outbound Exim deliveries where the same mailbox is sending through public-cloud relay sources. The realtime Exim mainlog watcher evaluates new accepted deliveries, and a bounded startup replay covers recent lines already on disk.
The finding is email_cloud_relay_abuse. Auto-response actions follow the global dry-run and block settings plus the email hold path. Operators with legitimate cloud mailers can opt out specific mailboxes or domains under email_protection.cloud_relay, or use email_protection.high_volume_senders for known high-volume senders.
Mail Auth Brute-Force Tracker
Detects credential stuffing and password spray against IMAP, POP3, and ManageSieve. Runs through the mail_logs reader: file source uses /var/log/mail.log on Debian-family hosts and /var/log/maillog on RHEL-family and cPanel hosts, while journal source reads configured Postfix/Dovecot units. The wrapper composes with the existing geo-based login monitor, so email_suspicious_geo keeps firing for successful logins from novel countries.
Four attack patterns:
| Signal | What triggers it | Auto-response |
|---|---|---|
| mail_bruteforce | A single attacker IP exceeds the per-IP failed-auth threshold within the configured window | IP blocked via nftables |
| mail_subnet_spray | Multiple distinct attacker IPs from the same /24 subnet exceed the subnet threshold | Entire /24 subnet blocked via nftables |
| mail_account_spray | Many distinct attacker IPs targeting the same mailbox exceed the account threshold | Visibility finding only. No auto-block, because attackers span many subnets and no single-IP action helps |
| mail_account_compromised | A successful login comes from an IP that just failed auth against the same account | IP blocked immediately. Rotate the password and revoke sessions |
Tunable via the thresholds.mail_bruteforce_* keys in csm.yaml. Independent from the SMTP tracker so the Dovecot noise floor can be tuned separately. Infrastructure IPs are never counted or blocked.
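The mail_account_compromised signal is a success-after-failure correlation. The sketch below shows one way to express it; the class name, the explicit window_s parameter, and the interpretation of "just failed" as "failed within window_s seconds" are assumptions, since the source does not state the window.

```python
class CompromiseDetector:
    """Flag a successful login that follows a recent failure from the same IP."""

    def __init__(self, window_s):
        self.window_s = window_s
        self._last_failure = {}     # (ip, account) -> time of last failure

    def on_failure(self, ip, account, now):
        self._last_failure[(ip, account)] = now

    def on_success(self, ip, account, now):
        """Return True when this success should raise mail_account_compromised."""
        last = self._last_failure.pop((ip, account), None)
        return last is not None and now - last <= self.window_s
```

A user who mistypes a password once and then logs in would also match this simple rule, so a real detector would presumably require more than one prior failure or other context.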
Admin-Panel Brute-Force Tracker
Counts repeated POST requests to high-value non-WordPress admin login endpoints. Runs as part of the web access-log watcher.
Covered endpoints (tight set to avoid false positives on shared hosting):
- phpMyAdmin: /phpmyadmin/index.php, /pma/index.php, /phpMyAdmin/index.php
- Joomla: /administrator/index.php
When an IP crosses the POST-rate threshold, admin_panel_bruteforce fires and the attacker IP is auto-blocked.
Drupal /user/login and Tomcat Manager /manager/html are intentionally out of scope here. Drupal’s path is too generic on shared hosting, and Tomcat Manager uses HTTP Basic auth (repeated GET requests with 401 responses), not POST form submissions. Both need different detectors and are tracked as follow-up work.
PHP-Relay (Mail Abuse, cPanel Only)
Real-time inotify watcher on /var/spool/exim/input catches WordPress contact-form spam relays where an attacker uses PHPMailer (or similar) with a spoofed From, an external Reply-To, and a script URL that doesn’t belong to the cPanel account. The occonsultingcy incident (2026-04) drove the design: a legitimate site running a vulnerable contact-form plugin became a per-message spam relay through the operator’s own mail account.
The detector runs four paths and only fires email_php_relay_abuse (Critical) when one of them crosses threshold. Paths 1 and 2 are scoped per-script, using the host:/path from the X-PHP-Script Exim header. Path 2b is per cPanel user. Path 4 is per HTTP source IP across distinct scripts.
| Path | What triggers it | Why it exists |
|---|---|---|
| Path 1: header score | Per-script: From domain not in the account’s authorised domains AND additional signal (PHPMailer / suspicious Reply-To / suspicious User-Agent), evaluated over a rolling 5-min window once the script has emitted at least header_score_volume_min messages | The shape that matched the original incident: spoofed sender, contact-form-style. FromMismatch is a HARD precondition – the score never accumulates without it |
| Path 2: absolute volume per script | A single script emits more than absolute_volume_per_hour messages in the last hour | Catches a compromised script even if the headers themselves are legit-shaped |
| Path 2b: account log-tail volume | Per cPanel user: more than effective_account_limit outbound messages through the redirect_resolver router in the last hour. The effective limit is auto-derived from /var/cpanel/cpanel.config’s maxemailsperhour (60% of it, clamped to 20-60), capped at 95% of the cPanel limit when an operator override is set | Backstop for when Path 2 misses the window. Reads /var/log/exim_mainlog directly; only fires on lines tagged B=redirect_resolver so forwarders don’t trip it |
| Path 4: HTTP-IP fanout | Per HTTP source IP: one source IP appears in more than fanout_distinct_scripts distinct script keys in fanout_window_min minutes, after excluding loaded HTTP-proxy ranges, loopback, and the host’s own interface addresses | Catches one client walking many scripts while avoiding CDN/proxy traffic and local cron or panel callbacks |
Path 5 (behavioural baseline) is deferred to Stage 2.
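The Path 2b limit arithmetic can be worked through in a few lines. This is a reading of the table row above, not the daemon's code: the function name, integer floor rounding, and the assumption that an operator override replaces the derived value before the 95% cap are all mine. Behaviour when maxemailsperhour is unset or zero is not described in the source and is not handled here.

```python
def effective_account_limit(cpanel_max_per_hour, override=None):
    """Derive the per-account hourly limit from cPanel's maxemailsperhour."""
    if override is not None:
        # Operator override: capped at 95% of the cPanel limit.
        return min(override, cpanel_max_per_hour * 95 // 100)
    # Auto-derived: 60% of the cPanel limit, clamped to the 20-60 range.
    derived = cpanel_max_per_hour * 60 // 100
    return max(20, min(60, derived))
```

With maxemailsperhour at 50 the derived limit is 30; at 500 the clamp holds it at 60; with a limit of 100 and an override of 200 the cap brings the override down to 95.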
The detector starts a one-shot retrospective scan of exim_mainlog at daemon startup so Path 2b can fire on history already on disk. IN_Q_OVERFLOW triggers a bounded recovery walk of the spool (capped at 1000 files; if more were skipped, an email_php_relay_overflow_scan_truncated Critical fires too – Path 2b backstops the missed messages).
Operator suppressions (csm phprelay ignore-script <host:/path>) short-circuit the pipeline before any path scoring runs, so a known-noisy contact form can be opted out individually without disabling the detector. See PHP-relay CLI for the full operator surface.
PAM Brute-Force Listener
Real-time authentication monitoring across all PAM-enabled services.
- SSH login tracking with geolocation
- cPanel, FTP, and webmail authentication
- Credential stuffing / password spray breadth: one source IP failing against many distinct accounts inside thresholds.multi_ip_login_window_min. The finding is credential_stuffing; tune the account floor with thresholds.cred_stuffing_distinct_accounts (default 5).
- Blocks IPs within seconds of threshold breach
- Integrates with the nftables firewall for instant blocking
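The credential-stuffing breadth rule counts distinct accounts per source IP rather than raw failures. A minimal sketch follows, assuming the floor is inclusive (five distinct accounts is enough) and that an account drops out of the count once its last failure ages past the window; the class and method names are hypothetical.

```python
from collections import defaultdict


class CredentialStuffingTracker:
    """Count distinct accounts one IP has failed against inside a window."""

    def __init__(self, window_min, distinct_accounts=5):
        self.window_s = window_min * 60
        self.distinct_accounts = distinct_accounts
        self._seen = defaultdict(dict)   # ip -> {account: last failure time}

    def record_failure(self, ip, account, now):
        """Return True when the IP reaches the distinct-account floor."""
        accounts = self._seen[ip]
        accounts[account] = now
        # Forget accounts whose most recent failure fell out of the window.
        for name in [a for a, ts in accounts.items() if now - ts > self.window_s]:
            del accounts[name]
        return len(accounts) >= self.distinct_accounts
```

Keying on distinct accounts means one user retrying their own password many times never trips this rule; that case is left to the per-IP failure thresholds.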
Process Context
Exec and outbound-connection findings carry an optional process object with
PID, PPID, UID, user, cPanel account (when known), comm, exe, sanitized cmdline,
and a parent chain up to depth 5. The chain is materialized from an in-memory
LRU+TTL cache (cap 16384 entries, 30-minute TTL) populated from BPF exec
events. Cache misses trigger a bounded async /proc read, so process-context
enrichment does not add blocking work to the connection event loop. When
neither cache nor enricher has data (e.g., a process that exited before
userspace reads its event), the process field is omitted entirely and the
finding still emits.
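The LRU+TTL cache described above combines two eviction rules: a size cap and a per-entry lifetime. The sketch below shows the idea with the stated defaults (16384 entries, 30 minutes). The class name, the counters' attribute names, and the choice not to refresh an entry's TTL on read are assumptions; the async /proc enricher is out of scope.

```python
from collections import OrderedDict


class LruTtlCache:
    """Bounded cache: least-recently-used eviction plus a fixed TTL per entry."""

    def __init__(self, capacity=16384, ttl_s=1800.0):
        self.capacity = capacity
        self.ttl_s = ttl_s
        self._items = OrderedDict()   # key -> (expires_at, value)
        self.evictions = 0            # dropped because the cache was full
        self.ttl_purges = 0           # dropped because the entry expired
        self.misses = 0               # lookups that returned nothing

    def put(self, key, value, now):
        self._items.pop(key, None)
        self._items[key] = (now + self.ttl_s, value)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)   # oldest-used entry goes first
            self.evictions += 1

    def get(self, key, now):
        entry = self._items.get(key)
        if entry is None:
            self.misses += 1
            return None
        expires_at, value = entry
        if now >= expires_at:
            del self._items[key]
            self.ttl_purges += 1
            self.misses += 1          # a TTL purge also counts as a miss
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value
```

Counting a TTL purge as a miss mirrors the note on csm_process_context_cache_misses_total in the counter list.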
Counters exposed at /metrics:
- csm_process_context_cache_entries
- csm_process_context_cache_evictions_total (LRU)
- csm_process_context_cache_ttl_purges_total
- csm_process_context_cache_misses_total (includes TTL purges)
- csm_process_context_enrich_queue_drops_total
- csm_process_context_enrich_reads_total
- csm_process_context_enrich_errors_total
- csm_process_context_enrich_stale_total
- csm_process_context_enrich_latency_seconds
Caveats:
- started_at is emitted only when the event source supplies a trustworthy start timestamp. Phase 1 does not infer process start time from procfs directory metadata. A future refinement may add /proc/<pid>/stat field 22 plus /proc/stat btime for kernel-tick precision.
- After daemon restart, the csm_process_context_enrich_* counters may show a small enqueued - reads delta. Pending requests in the enricher queue are dropped on shutdown by design.
- Hosts without BPF support fall back to /proc/net/tcp[6] polling. That path has no PID, so emitted findings do not carry a process field. A future refinement could resolve the socket inode to a PID via /proc/<pid>/fd, but that is out of scope for Phase 1.
HTTP Flood, UA Spoof, and Distributed Flood
http_request_flood, http_ua_spoof, and http_distributed_flood are periodic, not real-time. They run inside the same wp_bruteforce scheduled check that scans per-vhost access logs every 10 minutes. A real-time inotify tailer would need to hold per-IP state across log rotations and is out of scope for the initial release (see the plan non-goals). For attack types where sub-minute response matters, the access-log inotify watcher already covers wp_login_bruteforce and xmlrpc_abuse; the periodic scan adds volume-based rate enforcement and per-vhost distributed attack rollups on top.
Direct SMTP Egress
Outbound connections to SMTP ports from non-MTA local processes
emit a direct_smtp_egress finding. See
Direct SMTP egress for the full rule set,
config schema, and metric.
Direct SMTP egress
CSM watches the local mail stack via spool + log scanning. Non-MTA processes that open outbound SMTP connections directly bypass that path. The direct SMTP egress detector catches that at connect time and feeds the incident correlator from Phase 2.
What fires
A finding with check: "direct_smtp_egress" is emitted when:
- A non-root process opens an outbound TCP connection.
- Destination port is one of the configured SMTP ports (default 25, 465, 587).
- Destination IP is not loopback, infra, or in the operator’s infra_ips list.
- The process user is NOT a known MTA user (mail, mailnull, postfix, dovecot, dovenull, mailman, plus exim on cPanel).
Process names are never a standalone allow condition. A hosted account
renaming malware to smtp or smtpd still emits a finding.
The detector always emits findings when enabled. The dry_run knob does not suppress findings; it participates in the Phase 4 BPF enforcement gate, where any dry_run=true layer keeps kernel denial in observe-only mode.
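The four conditions under "What fires" reduce to a single predicate. The sketch below is an illustration under stated assumptions: the function name and parameters are hypothetical, infra entries are treated as IPs or CIDR ranges, and the process name is deliberately not an input because the text says it is never a standalone allow condition.

```python
import ipaddress

DEFAULT_SMTP_PORTS = frozenset({25, 465, 587})
MTA_USERS = frozenset({"mail", "mailnull", "postfix", "dovecot", "dovenull", "mailman"})


def is_direct_smtp_egress(uid, user, dst_ip, dst_port,
                          infra_nets=(), ports=DEFAULT_SMTP_PORTS, cpanel=False):
    """Return True when an outbound connect should emit direct_smtp_egress."""
    if uid == 0:
        return False                      # only non-root processes qualify
    if dst_port not in ports:
        return False                      # not a configured SMTP port
    addr = ipaddress.ip_address(dst_ip)
    if addr.is_loopback:
        return False
    if any(addr in ipaddress.ip_network(n, strict=False) for n in infra_nets):
        return False                      # destination is operator infrastructure
    mta_users = MTA_USERS | ({"exim"} if cpanel else set())
    return user not in mta_users          # matched on user, never on process name
```

Because the process name is not consulted, a binary renamed to smtpd and run by a hosted account still returns True.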
Configuration
detection:
  direct_smtp_egress:
    enabled: true
    backend: auto        # auto / bpf / legacy / none
    dry_run: true        # safety default for detector-scoped action
    ports:               # each value must be 1-65535
      - 25
      - 465
      - 587
Backends
- auto – allow both BPF and legacy scan paths. Live backend choice still follows detection.connection_tracker_backend.
- bpf – emit only from the cgroup/connect4,6 consumer.
- legacy – emit only from the /proc/net/tcp[6] polling path (live poller or scheduled critical scan). This path lacks PID/comm; MTA matching is user-only.
- none – detector disabled even when enabled: true is set elsewhere; useful for staged rollout.
The generic outbound connection tracker is still governed by
detection.connection_tracker_backend; this setting only gates
direct_smtp_egress findings.
Metric
csm_direct_smtp_egress_findings_total – monotonic counter,
incremented per finding emitted by the BPF connection consumer. The
legacy poller does not bump this counter today; operators who run
backend=legacy should track findings via the audit log.
rDNS enrichment
When the BPF backend is active, finding details include a Domain field populated from a TTL-cached reverse lookup (30 min TTL, 1 second per-lookup deadline). The lookup runs only after the cheap direct-SMTP filters match. On resolver miss or timeout the field is omitted; the finding still fires.
Caveats
- 2525 is intentionally NOT in the default port list. Many operators run unrelated services on it. Add it to ports if your infra uses it for submission.
- The detector emits regardless of the dry_run knob. Kernel denial requires auto_response.dry_run, this dry_run key, and bpf_enforcement.dry_run to all be explicitly false.
BPF cgroup-deny enforcement
Phase 4 of the BPF Incident Response Roadmap. Optional in-kernel denial of outbound connections that match a Phase 3 detection (direct SMTP egress is the only gate landed today). Defaults are all-safe; operators flip live denial only after Phase 3 telemetry review.
What it does
When bpf_enforcement.enabled=true, direct_smtp_egress=true, the
connection tracker is running on BPF, and all dry-run layers are false:
- The cgroup/connect4 + cgroup/connect6 BPF program inspects each outbound TCP connect.
- If destination port is in the protected set AND the source UID is not in the safe-UID map AND the gated detector matches, the program returns 0 (kernel denies the connect).
- Userspace observes the decision via the decision field on the ringbuf event and emits an audit-log entry.
When any dry-run layer is true (the default), the program emits the decision but always returns 1 (allow). Operators can run dry-run for as long as they need to gather telemetry before flipping to live denial.
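The per-connect decision can be modelled in userspace to make the return codes explicit. This is a Python model of the logic described above, not the BPF program: the function name and parameters are hypothetical, and the decision labels are borrowed from the csm_bpf_enforcement_decisions_total metric.

```python
ALLOW, DRY_RUN, DENY = "allow", "dry_run", "deny"


def connect_verdict(dst_port, uid, protected_ports, safe_uids,
                    detector_matches, enforce):
    """Return (decision, return_code); code 0 denies the connect, 1 allows it."""
    would_deny = (
        dst_port in protected_ports
        and uid not in safe_uids
        and detector_matches
    )
    if not would_deny:
        return ALLOW, 1
    if enforce:
        return DENY, 0          # live denial: the kernel rejects the connect
    return DRY_RUN, 1           # decision is recorded, connect still proceeds
```

Keeping the dry-run branch on the same code path as the deny branch is what lets the dry-run telemetry predict what live denial would do.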
What it does NOT do
- It does NOT wait on remote verdict callbacks in-kernel. That would add HTTP latency to every connect. The verdict callback (if enabled) runs in userspace after the BPF decision and enriches the emitted finding; it cannot undo a kernel denial.
- It does NOT enforce on UDP, ICMP, or non-cgroup paths.
- It does NOT replace any Phase 3 detection. Detections still run regardless; enforcement is a separate, layered control.
Configuration
bpf_enforcement:
  enabled: false              # master switch; default off
  dry_run: true               # safety default; flip after telemetry review
  direct_smtp_egress: false   # gate enforcement on the Phase 3 detector
  verdict_callback: false     # userspace post-decision callback
bpf_enforcement.enabled=true requires at least one feature gate.
Today the only gate is direct_smtp_egress, which itself requires
detection.direct_smtp_egress.enabled=true. The connection tracker
backend must be auto or bpf, and the direct SMTP backend must be
auto or bpf.
Kernel requirements
- Linux >= 4.10 with CONFIG_CGROUP_BPF=y.
- cgroup/connect4 and cgroup/connect6 BPF program types (upstream added these attach types in Linux 4.17).
- The capability surface bpf_enforcement.available.v1 is the wire signal that the binary supports the feature; combined with bpf_enforcement_active on the health snapshot, operators can detect both feature presence and runtime state.
On older kernels or default builds without the BPF tag,
detection.connection_tracker_backend: auto falls back to the legacy
/proc/net/tcp[6] poller. In that state direct SMTP findings still
work when detection.direct_smtp_egress.backend is auto or
legacy, but BPF enforcement is inactive.
When CSM attempts BPF and cannot start it, it emits a
bpf_unavailable finding. The message reports whether the daemon is
running on a fallback backend or has no live fallback active.
Metrics
- csm_bpf_enforcement_decisions_total{decision="allow|dry_run|deny"}
- csm_bpf_enforcement_uid_map_refresh_total – successful periodic refreshes of the safe-UID BPF map.
- csm_bpf_enforcement_uid_map_refresh_failures_total – failed refreshes (e.g. /etc/passwd unreadable).
Dry-run precedence
Three independent dry_run knobs interact:
- auto_response.dry_run (global): suppresses every automatic action (firewall block, kill, etc.).
- detection.direct_smtp_egress.dry_run: detector-scoped action knob.
- bpf_enforcement.dry_run: kernel-side denial knob.
Rule: any dry_run=true wins. Live denial requires all three to be false at the layer they apply, plus a BPF runtime backend. Defaults are dry_run=true everywhere on first install.
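The precedence rule is a plain conjunction, which a few lines make unambiguous. The dataclass and function below are hypothetical names for illustration; the field defaults mirror the stated first-install defaults of dry_run=true everywhere.

```python
from dataclasses import dataclass


@dataclass
class DryRunLayers:
    auto_response: bool = True        # auto_response.dry_run
    direct_smtp_egress: bool = True   # detection.direct_smtp_egress.dry_run
    bpf_enforcement: bool = True      # bpf_enforcement.dry_run


def live_denial_enabled(layers, bpf_backend_active):
    """Live kernel denial needs every layer false and a BPF runtime backend."""
    any_dry_run = (
        layers.auto_response
        or layers.direct_smtp_egress
        or layers.bpf_enforcement
    )
    return bpf_backend_active and not any_dry_run
```

Flipping only bpf_enforcement.dry_run to false therefore changes nothing while either of the other two layers is still in dry-run.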
Rollout recipe
- Phase 3 detector enabled, no Phase 4 wiring. Watch csm_direct_smtp_egress_findings_total for a week.
- Phase 4 enabled with dry_run: true. Watch csm_bpf_enforcement_decisions_total{decision="dry_run"} and confirm dry-run denials track expected hosted-account egress.
- Phase 4 dry_run=false on a single canary host. Audit incidents for false positives.
- Roll out to fleet.
Critical Checks
Critical checks run every 10 minutes. Typical wall-clock cost on a busy shared host is a few seconds; the runner enforces the 10-minute cadence even when a tick takes longer.
Process & System
| Check | Description |
|---|---|
| fake_kernel_threads | Non-root processes masquerading as kernel threads (rootkit indicator) |
| suspicious_processes | Reverse shells, interactive shells, GSocket, suspicious executables |
| php_processes | PHP process execution, working dirs, environment variables |
| shadow_changes | /etc/shadow modification outside maintenance windows |
| uid0_accounts | Unauthorized root (UID 0) accounts |
| kernel_modules | Kernel module loading (post-baseline) |
| af_alg_socket_use | AF_ALG socket use that may indicate Copy Fail exploit activity |
| af_alg_enforcement | AF_ALG hardening policy drift and correction status |
SSH & Access
| Check | Description |
|---|---|
| ssh_keys | Unauthorized entries in /root/.ssh/authorized_keys |
| sshd_config | SSH hardening (PermitRootLogin, PasswordAuthentication, etc.) |
| ssh_logins | SSH access anomalies with geolocation |
| api_tokens | cPanel/WHM API token usage |
| whm_access | WHM/root login patterns, multi-IP access |
| cpanel_logins | cPanel login anomalies, multi-IP correlation |
| cpanel_filemanager | File Manager usage for unauthorized access |
Network
| Check | Description |
|---|---|
| outbound_connections | Root-level outbound to non-infra IPs (C2, backdoor ports) |
| user_outbound | Per-user outbound connections (non-standard ports) |
| bad_asn_outbound | Outbound connection whose destination resolves (via GeoLite2-ASN) to a bad or unexpected autonomous system. Config detection.bad_asn_outbound: blocked_asns (always bad) and/or allowed_asns (allowlist mode – anything outside is bad). Classified for every process including root (the periodic connection scan); non-root connections are also flagged in real time by the live BPF tracker. Off by default; the third leg of the host_takeover incident chain |
| dns_connections | DNS exfiltration and suspicious queries |
| firewall | Firewall status and rule integrity |
Brute Force & Auth
| Check | Description |
|---|---|
| wp_bruteforce | WordPress login brute force (wp-login.php, xmlrpc.php) |
| http_ua_spoof | IP claiming a search-engine bot UA (Googlebot, Bingbot, Applebot) that fails reverse-DNS verification, or exceeding the per-IP spoof threshold for scripting/headless/empty UAs when those opt-in flags are enabled |
| http_distributed_flood | Many already-abusive HTTP source IPs hitting the same vhost in one scheduled scan window |
| ftp_logins | FTP access patterns and failed auth |
| webmail_logins | Roundcube/Horde access anomalies |
| api_auth_failures | API authentication failure patterns |
Email
| Check | Description |
|---|---|
| mail_queue | Mail queue buildup (spam outbreak indicator) |
| mail_per_account | Per-account email volume spikes |
Data & Integrity
| Check | Description |
|---|---|
| crontabs | Suspicious cron jobs and scheduled commands |
| mysql_users | MySQL user accounts and privileges |
| database_dumps | Database exfiltration attempts |
| exfiltration_paste | Connections to pastebin/code-sharing sites |
Threat Intelligence
| Check | Description |
|---|---|
| ip_reputation | IPs against external threat databases and optional rspamd history |
| local_threat_score | Aggregated score from internal attack database |
| modsec_audit | ModSecurity audit log parsing |
Performance
| Check | Description |
|---|---|
| perf_load | CPU load average thresholds |
| perf_php_processes | PHP process count and memory |
| perf_memory | Swap usage and OOM killer activity |
Health
| Check | Description |
|---|---|
| health | Daemon health, binary integrity, required services |
Platform Support
Runs on every supported platform unless noted below. The daemon auto-detects OS and panel at startup and silently skips cPanel-specific checks on plain Linux hosts (no “not found” spam).
cPanel-only (skipped on plain Ubuntu/AlmaLinux):
- api_tokens, whm_access, cpanel_logins, cpanel_filemanager – read WHM API and cPanel session logs
- wp_bruteforce – iterates /home/*/public_html/*/wp-login.php and per-domain access logs. The domlog pass ranks recent logs first and honors thresholds.domlog_max_files, thresholds.domlog_tail_lines, and thresholds.domlog_max_age_min.
- webmail_logins – parses cPanel Roundcube/Horde logs
- mail_queue, mail_per_account – read Exim queue and /var/log/exim_mainlog
Plain Linux equivalents that still provide coverage:
- Access log brute-force detection (wp_login_bruteforce, xmlrpc_abuse) runs against the detected web server’s access log (/var/log/nginx/access.log or /var/log/httpd/access_log), so WordPress brute-force alerts still fire on non-cPanel hosts – they just rely on the live log watcher rather than per-domain domlog scanning.
- modsec_audit runs on any host with ModSecurity installed.
- ssh_logins, SSH brute force, PAM listener, firewall, kernel modules, RPM/DEB integrity, and threat intelligence all run on every supported platform.
Deep Checks
Deep checks run every 60 minutes and cover thorough filesystem, CMS, email, and database scans.
Filesystem
| Check | Description |
|---|---|
| filesystem | Backdoors, hidden executables, suspicious SUID binaries |
| webshells | Known webshell patterns (c99, r57, b374k, etc.) |
| htaccess | .htaccess injection (auto_prepend_file, eval, base64 handlers) plus seven hardened per-pattern detectors – htaccess_php_in_uploads, htaccess_auto_prepend, htaccess_user_agent_cloak, htaccess_spam_redirect, htaccess_filesmatch_shield, htaccess_header_injection, htaccess_errordocument_hijack. Auto-cleaning gated by auto_response.clean_htaccess. |
| file_index | Indexed file listing to detect new/unauthorized files |
| php_content | Suspicious PHP functions (exec, eval, system, passthru) |
| group_writable_php | World/group-writable PHP files (privilege escalation) |
| symlink_attacks | Symlink-based privilege escalation attempts |
WordPress
| Check | Description |
|---|---|
| wp_core | Core file integrity via official WordPress.org checksums |
| nulled_plugins | Cracked/nulled plugin detection |
| outdated_plugins | Plugins with known CVEs |
| db_content | Database injection, siteurl hijacking, rogue admins, spam. Multisite-aware: when wp-config.php declares define('MULTISITE', true), secondary blogs (wp_<N>_options / wp_<N>_posts for active blog IDs from wp_blogs) are scanned alongside the unprefixed main-site tables. |
| db_content_joomla | Joomla database content scanning. Discovers installs via configuration.php containing class JConfig, parses credentials from public $...; assignments. Scans <prefix>extensions params, <prefix>content article bodies, and joins <prefix>users with <prefix>user_usergroup_map for Super User detection (group_id=8). Findings: joomla_extensions_injection, joomla_content_injection, joomla_admin_injection. |
| db_content_drupal | Drupal 8+ database content scanning. Discovers installs via sites/default/settings.php plus the core/lib/Drupal.php marker. Credentials parsed from the $databases array. Scans config, node_revision__body, and users_field_data joined with user__roles (administrator role). Findings: drupal_settings_injection, drupal_content_injection, drupal_admin_injection. Drupal 7 not yet covered. |
| db_content_magento | Magento 1.x and 2.x database content scanning. Discovers installs via app/etc/env.php (M2, preferred) or app/etc/local.xml (M1). Credentials parsed via encoding/xml for M1 (CDATA-aware) or field-level regex for M2. Scans core_config_data, catalog_product_entity_text, cms_block, cms_page, and admin_user (with the configured db.prefix). Findings: magento_settings_injection, magento_content_injection, magento_admin_injection. |
| db_content_opencart | OpenCart database content scanning. Discovers installs via the config.php + admin/config.php pair both containing define('DB_DRIVER'. Credentials parsed from DB_HOSTNAME / DB_USERNAME / DB_PASSWORD / DB_DATABASE / DB_PREFIX defines. Scans <prefix>setting (config_url / config_ssl are canonical hijack targets), <prefix>product_description, <prefix>information_description, and <prefix>user (admin/staff). Findings: opencart_settings_injection, opencart_content_injection, opencart_admin_injection. |
| db_objects | MySQL persistence mechanisms: triggers, events, stored procedures, stored functions. Critical when the body matches known-malware patterns (sys_+exec, INTO OUTFILE, LOAD_FILE, etc.); Warning when an object exists at all (vanilla CMSes ship none). Toggle with detection.db_object_scanning; suppress Warnings via detection.db_object_allowlist. Manual drop via csm db-clean --drop-object. |
| admin_overlap | WordPress administrator email overlap across cPanel accounts. Reports when the same admin email appears on the configured number of accounts, with reviewed emails and domains suppressible in detection. |
| credential_reuse | WordPress administrator password-hash reuse across cPanel accounts. Groups identical hashes with an in-memory fingerprint and reports only the affected accounts and count. |
| supply_chain | Composer and npm lockfile advisory matching against the local advisory database. Silent when no advisory file is present. |
CMS Scanner Support Policy
New CMS scanner work targets upstream-supported major versions. EOL versions are best-effort when the existing scanner covers them through the same low-risk layout or schema. Adding a new EOL-only scanner needs operator fleet data and an explicit security reason.
Current scanner scope:
- WordPress single-site and multisite.
- Joomla installs using the common configuration.php / JConfig layout and standard content/user tables used by supported Joomla releases.
- Drupal 8 and newer. Drupal 7 is not a planned support target.
- Magento 1 and 2.
- OpenCart installs using the standard storefront and admin config pair.
Phishing & Malware
| Check | Description |
|---|---|
| phishing | 8-layer phishing detection (kit directories, credential harvesting) |
| email_content | Outbound email body scanning for credentials and suspicious URLs |
System Integrity
| Check | Description |
|---|---|
| rpm_integrity | System binary verification via rpm -V |
| open_basedir | open_basedir restriction validation |
| php_config_changes | php.ini modifications |
DNS & SSL
| Check | Description |
|---|---|
| dns_zones | DNS zone file changes (MX record hijacking) |
| ssl_certs | SSL certificate issuance (subdomain takeover) |
| waf_status | WAF mode, staleness, bypass detection |
Email Security
| Check | Description |
|---|---|
| email_weak_password | Email accounts with weak passwords |
| email_forwarder_audit | Forwarders redirecting to external addresses |
| email_mail_filters | Exim mail filters that intercept mail (copy to an external address while keeping a local copy), forward externally, pipe to a command, or blackhole all mail |
Performance
| Check | Description |
|---|---|
| perf_php_handler | PHP handler configuration (DSO vs CGI vs FPM) |
| perf_mysql_config | MySQL my.cnf optimization |
| perf_redis_config | Redis configuration |
| perf_error_logs | Error log file growth (bloat) |
| perf_wp_config | WordPress wp-config.php settings |
| perf_wp_transients | WordPress database transient bloat |
| perf_wp_cron | WordPress cron scheduling (missed crons) |
Platform Support
The deep checks are the most cPanel-biased part of CSM because they iterate account home directories and per-user public_html trees. On plain Ubuntu/AlmaLinux the account-scan based checks do not run today:
cPanel-only (skipped on plain Linux):
- htaccess, file_index, php_content, group_writable_php, symlink_attacks – iterate /home/*/public_html/**
- wp_core, nulled_plugins, outdated_plugins, db_content – find WordPress installs under /home/*/public_html
- supply_chain – scans composer.lock and package-lock.json under /home/* and /home/*/public_html
- phishing, email_content – scan user home directories and Exim spool
- dns_zones, ssl_certs – read cPanel’s DNS zone store and SSL installation records
- email_weak_password, email_forwarder_audit – read /etc/valiases, Dovecot/Courier auth databases
- email_mail_filters – read per-mailbox Exim filters under /home/*/etc/<domain>/<localpart>/filter and domain filters under /etc/vfilters
- open_basedir, php_config_changes – read EA-PHP php.ini under /opt/cpanel/ea-php*/
- perf_wp_config, perf_wp_transients, perf_wp_cron, perf_php_handler – WordPress and PHP handler introspection via cPanel’s EA-PHP layout
Runs on every platform:
- filesystem, webshells – fanotify and file-tree scans over /home, /tmp, /dev/shm
- rpm_integrity – dispatches to rpm -V on RHEL family or debsums / dpkg --verify on Debian family
- waf_status – detects ModSecurity on Apache, Nginx, and LiteSpeed across all supported distros
- perf_mysql_config, perf_redis_config, perf_error_logs – rely on standard service locations
Operators on plain Linux can opt a subset of the account-scan perf checks (perf_error_logs, perf_wp_config, perf_wp_transients) into scanning generic webroots by configuring the account_roots glob list in csm.yaml (see configuration.md). The remaining account-scan checks still assume the cPanel /home/*/public_html layout.
Auto-Response
When enabled, CSM automatically responds to detected threats. All actions are logged in the audit trail.
Actions
| Action | Description |
|---|---|
| Kill processes | Fake kernel threads, reverse shells, GSocket. Never kills root or system processes. |
| Quarantine files | Moves webshells, backdoors, phishing to /opt/csm/quarantine/ with full metadata (owner, permissions, mtime). Restorable from the web UI. |
| Block IPs | Adds attacker IPs to the nftables firewall with configurable expiry. Rate-limited by auto_response.max_blocks_per_hour (default 50/hour). |
| Clean malware | 7 strategies: @include removal, prepend/append stripping, inline eval removal, base64 chain decoding, chr/pack cleanup, hex injection removal, confirmed database cleanup. |
| Drop malicious DB objects | When clean_database is on, confirmed-malicious stored triggers/events/procedures/functions are dropped after a SHOW CREATE backup is recorded, so the drop is reversible. Detection runs regardless; the drop is gated on the operator opt-in. |
| PHP shield | Blocks PHP execution from uploads/tmp directories, detects webshell parameters. |
| PAM blocking | Instant IP block on brute force threshold breach. |
| Subnet blocking | Auto-blocks IPv4 /24 or IPv6 /64 when 3+ IPs from the same range attack. |
| Permblock escalation | Promotes temporary blocks to permanent after N repeated offenses. |
| Auto-freeze (PHP relay) | When the email PHP-relay detector fires (Path 1 / 2 / 4), runs exim -Mf against the message IDs the offending script is currently sending. Snapshots activeMsgs from the per-script window first, falls back to a spool walk if the snapshot was capped or if the finding is a late reputation event. Default dry-run; flip to live with csm phprelay dry-run off. Skips volume_account (per-cpuser, no scriptKey). Rate-limited to auto_response.php_relay.max_actions_per_minute (default 60). cPanel only. See PHP-relay CLI. |
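The subnet-blocking rule in the table can be illustrated with a short sketch. It is not CSM's implementation (CSM is written in Go); the function names `covering_range` and `ranges_to_block` are hypothetical, and the threshold of 3 is taken from the "3+ IPs from the same range" wording.

```python
import ipaddress
from collections import defaultdict

NETBLOCK_THRESHOLD = 3  # assumption: mirrors the "3+ IPs" rule described above


def covering_range(ip):
    """Return the IPv4 /24 or IPv6 /64 network that contains ip."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 64
    return ipaddress.ip_network(f"{addr}/{prefix}", strict=False)


def ranges_to_block(attacker_ips, threshold=NETBLOCK_THRESHOLD):
    """Group attacker IPs by covering range; return ranges with enough distinct IPs."""
    seen = defaultdict(set)
    for ip in attacker_ips:
        seen[covering_range(ip)].add(ip)
    return sorted(str(net) for net, ips in seen.items() if len(ips) >= threshold)
```

Distinct addresses are counted, so one IP attacking repeatedly never escalates to a subnet block on its own.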
Configuration
auto_response:
  enabled: true
  kill_processes: true
  quarantine_files: true
  block_ips: true
  block_expiry: "24h"            # default temp block duration
  max_blocks_per_hour: 50        # per-IP blocks per hour; 0/omitted uses default
  netblock: true                 # enable subnet blocking
  netblock_threshold: 3          # IPs from same IPv4 /24 or IPv6 /64 before subnet block
  permblock: true                # promote temp blocks to permanent
  permblock_count: 4             # temp blocks before promotion
  # SAFETY DEFAULT: dry_run defaults to TRUE when this key is absent.
  # In dry-run, BlockIP records the intended block to bbolt but does
  # NOT touch nftables. Manual operator commands (`csm firewall ...`)
  # bypass via BlockIPForce and always apply. Flip to false only after
  # verifying the policy in dry-run.
  dry_run: false
  # Advisory verdict callback. CSM POSTs each impending auto-block
  # to the panel before applying. The panel can downgrade to "allow"
  # (audit-only), attach `tenant_id` for downstream correlation, or
  # add a reason. CSM fails open on hook errors. Wire contract:
  # docs/verdict-callback-contract.md.
  verdict_callback:
    enabled: false
    url: ""                          # POST target
    hmac_secret: ""                  # signing secret, or use hmac_secret_env
    hmac_secret_env: ""
    allow_unsigned: false            # true only for staged unsigned rollouts
    require_response_signature: true # reject unsigned callback replies
    timeout_sec: 2
  # PHP-relay auto-freeze (cPanel only). Off by default; opt in
  # explicitly. dry_run defaults to true even when freeze=true so an
  # operator who enables freeze without thinking gets a dry-run.
  php_relay:
    freeze: true                 # enable the exim -Mf hook
    dry_run: true                # safe default; flip with `csm phprelay dry-run off`
    max_actions_per_minute: 60   # rolling 60s window cap on exim -Mf invocations
Dry-run safety default
auto_response.dry_run defaults to true when the key is absent. This is deliberate: an operator who turns on block_ips: true without thinking through policy gets recorded-but-not-applied blocks. The dry-run count surfaces in csm status --json and /api/v1/status so dashboards can verify the policy before flipping live. CSM clears those records when auto-response starts or reloads in live mode, and ages out records older than a week while dry-run remains enabled.
IP auto-blocking still requires firewall.enabled: true. The firewall engine owns both live nftables mutations and dry-run block records; with the firewall disabled there is no engine to call, so csm validate warns on auto_response.enabled: true plus block_ips: true.
Verify dry-run state explicitly:
csm status --json | jq '.severities, .blocklist_size'
csm firewall status # "Recently Blocked" entries with timestamps after the restart confirm live mode
To go live: set dry_run: false, run csm rehash (twice, due to the circular hash), then restart or SIGHUP-reload (the field is hot-reload-safe).
Verdict callback (advisory)
When verdict_callback.enabled: true, every auto-block call POSTs a
signed JSON request to the panel before mutating nftables. CSM refuses
to start without hmac_secret or a non-empty hmac_secret_env value
unless allow_unsigned: true is set for a staged unsigned rollout.
Without that opt-in, an unsigned allow response is rejected and the
default block continues.
When a secret is configured, CSM also requires the panel to sign the
response body unless require_response_signature: false is set for a
staged rollout. With that opt-out, CSM still checks any echoed nonce
or timestamp when a secret is configured; a legacy response that
omits both keeps working. The panel can return {"verdict": "block"}
(apply), {"verdict": "allow"} (audit-only; CSM logs the decision and
skips nftables), or attach metadata (tenant_id, note). The callback
runs after local validation and infra-IP safety checks, and before the
dry-run gate, so panels can observe dry-run decisions too.
CSM fails open on hook errors (timeout, non-2xx, malformed body): the block continues as if the hook were disabled, or is recorded as dry-run when dry-run is active. The failure is written to the daemon log. Full request/response schema: docs/verdict-callback-contract.md.
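The verdict handling described above can be sketched as a pure function. This is illustrative only: the actual wire format lives in docs/verdict-callback-contract.md, and the choice of HMAC-SHA256 with a hex digest, plus the names `sign_body` and `resolve_verdict`, are assumptions made for the example.

```python
import hashlib
import hmac
import json


def sign_body(secret, body):
    """Hex HMAC-SHA256 of the raw body (assumed scheme, see the contract doc)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def resolve_verdict(status_code, raw_body, signature, secret, require_signature=True):
    """Return "allow" or "block"; every hook failure falls back to "block"."""
    # Timeout (modelled as None) or non-2xx: the default block continues.
    if status_code is None or not 200 <= status_code < 300:
        return "block"
    # Unsigned or badly signed replies are rejected unless signatures are optional.
    if require_signature:
        if not signature or not hmac.compare_digest(signature, sign_body(secret, raw_body)):
            return "block"
    try:
        verdict = json.loads(raw_body).get("verdict")
    except (ValueError, AttributeError):
        return "block"  # malformed body
    return "allow" if verdict == "allow" else "block"
```

Only a well-formed, correctly signed "allow" downgrades the action; anything else leaves the block in place, which matches the fail-open behaviour described in the text.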
Infrastructure IP DNS guard
Hostnames listed in top-level infra_ips or firewall.infra_ips are resolved every 5 minutes and their current addresses feed the infra auto-block guard. If a hostname stops resolving, the daemon emits an infra_ips_unresolvable Warning finding and keeps the last known addresses protected during the grace period (default 10 min). The finding auto-clears when resolution recovers.
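A minimal sketch of the grace-period behaviour, assuming a resolver callable that returns an empty set (or raises OSError) when a hostname stops resolving. The class name `InfraHostGuard` is hypothetical, and dropping protection once the grace period expires is an assumption; the text only states what happens during the grace period.

```python
import time

GRACE_SECONDS = 10 * 60  # default grace period from the text


class InfraHostGuard:
    def __init__(self, resolver, grace=GRACE_SECONDS, clock=time.monotonic):
        self.resolver = resolver  # callable: hostname -> iterable of IP strings
        self.grace = grace
        self.clock = clock
        self.last_good = {}       # hostname -> (addresses, resolved_at)

    def refresh(self, hostname):
        """Return (protected_addresses, unresolvable) for one hostname."""
        now = self.clock()
        try:
            addrs = set(self.resolver(hostname))
        except OSError:
            addrs = set()
        if addrs:
            self.last_good[hostname] = (addrs, now)
            return addrs, False
        cached = self.last_good.get(hostname)
        if cached and now - cached[1] <= self.grace:
            return cached[0], True  # keep last known addresses protected
        return set(), True          # assumption: protection lapses after the grace period
```

The `unresolvable` flag corresponds to the condition under which the daemon would raise the infra_ips_unresolvable finding.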
Findings that always trigger IP block
When auto_response.block_ips: true and the firewall is enabled, the source IP is blocked for every finding in this list. The dry-run gate still applies if dry_run: true.
| Finding | Description |
|---|---|
| wp_login_bruteforce | WordPress login flood via wp-login.php |
| xmlrpc_abuse | XML-RPC endpoint flood |
| http_request_flood | Per-IP HTTP request volume exceeds threshold (disabled by default; enable by setting thresholds.http_flood_threshold > 0) |
| http_ua_spoof | IP spoofing a search-engine bot UA or exceeding the UA anomaly threshold (periodic; see configuration.md for opt-in flags) |
| ftp_bruteforce | FTP authentication flood |
| smtp_bruteforce | SMTP authentication flood |
| smtp_probe_abuse | Raw SMTP connect-rate flood before AUTH |
| mail_bruteforce | IMAP/POP3/ManageSieve authentication flood |
| mail_account_compromised | Successful login from an IP that just failed auth on the same mailbox |
| admin_panel_bruteforce | phpMyAdmin or Joomla admin POST flood |
| ssh_login_unknown_ip | SSH login from an IP with no prior history |
| ssh_login_realtime | SSH login anomaly detected by realtime watcher |
| c2_connection | Outbound connection to a known C2 server |
| ip_reputation | IP flagged by AbuseIPDB / rspamd / upstream threat-intel |
| local_threat_score | IP crosses the aggregated internal attack-history threshold |
| modsec_block_escalation | ModSecurity deny escalation |
| waf_attack_blocked | WAF high-volume attacker |
| email_compromised_account | Email account compromise indicator |
| email_cloud_relay_abuse | Cloud relay abuse |
Distributed HTTP flood rollups do not trigger a direct IP block because they describe one targeted vhost, not one source IP. The per-IP findings that feed the rollup still drive normal block decisions.
Safety Guards
- Never kills root processes, system daemons, or cPanel services
- Infrastructure IPs (infra_ips in config) are never blocked
- Subnet blocks refuse the default route and any range that covers infrastructure, local host, allowed, or port-specific allowed IPs
- Quarantined files preserve full metadata for restoration
- Auto-quarantine requires high confidence: category match (webshell/backdoor/dropper) + entropy >= 4.8 or hex density > 20%. This prevents legitimate WordPress plugins from being quarantined.
- IP block rate limited by auto_response.max_blocks_per_hour (default 50/hour) to prevent runaway blocking
- CRITICAL alerts always bypass the email rate limit (default 30/hour)
- Trusted countries (trusted_countries) suppress login alerts from expected geolocations
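The auto-quarantine confidence gate can be sketched as follows. Two things here are assumptions rather than documented behaviour: the rule is read as "category match AND (entropy >= 4.8 OR hex density > 20%)", and hex density is taken to mean the share of bytes that sit inside `\xNN` escape sequences. The function names are hypothetical.

```python
import math
import re
from collections import Counter

HIGH_RISK_CATEGORIES = {"webshell", "backdoor", "dropper"}


def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(n / total * math.log2(n / total) for n in Counter(data).values())


def hex_density(data):
    """Assumed definition: fraction of bytes that belong to \\xNN escapes."""
    if not data:
        return 0.0
    escapes = re.findall(rb"\\x[0-9a-fA-F]{2}", data)
    return len(escapes) * 4 / len(data)


def should_auto_quarantine(category, data):
    """Quarantine only on a high-risk category plus an obfuscation signal."""
    if category not in HIGH_RISK_CATEGORIES:
        return False
    return shannon_entropy(data) >= 4.8 or hex_density(data) > 0.20
```

Requiring both a category match and an obfuscation signal is what keeps ordinary plugin code, which is low-entropy and rarely hex-escaped, out of quarantine.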
What CSM Detects in Real-Time
Beyond standard malware patterns, CSM detects advanced evasion techniques:
- Fragmented function names: attackers split base64_decode across variables ($a="base"; $b="64_decode") to evade simple string matching
- Appended payloads: malicious code added to the end of large legitimate files, beyond typical scan windows. Realtime PHP checks scan the first and last 32KB, and periodic PHP content analysis scans a larger head window plus the tail.
- Non-PHP backdoors: Perl, Python, Bash CGI scripts in web directories (detects toolkits like LEVIATHAN)
- SEO spam injection: gambling/togel dofollow link injection into theme files
- WordPress brute force: real-time access log monitoring for wp-login.php and xmlrpc.php floods (blocks within seconds, not the 10-minute periodic scan)
- Admin-panel brute force: same access-log path, tracks POSTs to /phpmyadmin/index.php, /pma/index.php, /phpMyAdmin/index.php, and Joomla /administrator/index.php. Emits admin_panel_bruteforce and auto-blocks the IP. Path matcher is intentionally tight to avoid false positives on shared hosting; Drupal and Tomcat Manager use different attack shapes and need separate detectors.
- SMTP brute force and probes: tails /var/log/exim_mainlog on cPanel and non-cPanel Exim hosts where the file exists. Emits smtp_probe_abuse and smtp_bruteforce (per-IP, auto-blocks), smtp_subnet_spray (per-/24, auto-blocks the whole subnet), and smtp_account_spray (per-mailbox, visibility only).
- Mail brute force: tails /var/log/maillog for direct IMAP, POP3, and ManageSieve auth failures. Composes with the existing geo-login monitor so email_suspicious_geo keeps working. Emits mail_bruteforce, mail_subnet_spray, mail_account_spray, and mail_account_compromised (the last one fires when a successful login arrives from an IP that just failed auth against the same mailbox; auto-blocks with no false positives by construction).
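Two of the techniques above, the head-and-tail scan window and fragmented function names, can be illustrated with a deliberately naive sketch. The helper names are hypothetical and the literal-joining heuristic is a simplification for the example, not CSM's detection logic.

```python
import re

WINDOW = 32 * 1024  # realtime checks scan the first and last 32KB


def head_tail_windows(data, window=WINDOW):
    """Return the first and last `window` bytes, or everything for small files."""
    if len(data) <= 2 * window:
        return data
    return data[:window] + data[-window:]


def joined_literals_contain(data, needle=b"base64_decode"):
    """Naive check: concatenate all quoted string literals and look for needle."""
    literals = re.findall(rb"""["']([^"']*)["']""", data)
    return needle in b"".join(literals)
```

A payload appended after a large legitimate file still falls inside the tail window, and a name split across `$a="base"; $b="64_decode"` reappears once the literals are joined, even though a plain substring search on the file would miss it.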
Dry-run precedence (Phase 4)
CSM has three independent dry_run knobs after Phase 4. Any dry_run that is true wins; live actions require all applicable knobs to be false.
| Layer | Knob | Default | Effect when true |
|---|---|---|---|
| Global | auto_response.dry_run | true | Suppress all automatic actions |
| Detector | detection.direct_smtp_egress.dry_run | true | Suppress detector-scoped action |
| Kernel | bpf_enforcement.dry_run | true | BPF program emits decision but allows traffic |
The kernel knob is consulted by the BPF program itself; the others gate userspace action paths. All three default to true on a first install so a configuration mistake cannot start blocking traffic.
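The precedence rule reduces to a single predicate. The sketch below uses a flat dictionary of knob names as a stand-in for the parsed config; `is_live` is a hypothetical helper, and treating an absent knob as true follows the first-install default described above.

```python
def is_live(knobs, applicable):
    """Live only when every applicable dry_run knob is explicitly false.

    A knob that is missing from `knobs` counts as true (the safe default).
    """
    return not any(knobs.get(name, True) for name in applicable)
```

For example, a detector-scoped action would pass both `auto_response.dry_run` and `detection.direct_smtp_egress.dry_run` as the applicable knobs, so either one being true keeps the action in dry-run.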
Incidents
CSM groups related findings into Incident objects so operators see one escalating story per account, mailbox, or process instead of a stream of unrelated findings. Original findings are not mutated or suppressed – the Incident is layered on top.
Lifecycle
| Status | Meaning |
|---|---|
| open | Active. New findings for the same correlation key keep merging in. |
| contained | Operator marked under control. Findings still merge in window. |
| resolved | Closed. Future findings start a new incident. |
| dismissed | False positive. Future findings start a new incident. |
Resolved and dismissed incidents are pruned 30 days after their last update. Open and contained incidents are never auto-pruned by the retention loop, but they may be auto-resolved by the per-kind idle threshold described under “Auto-close” below.
Auto-close
To stop the open-incident backlog from growing without bound on busy
hosts, the daemon scans Open / Contained incidents shortly after startup
and then once an hour, auto-resolving any whose updated_at exceeds the
per-kind idle threshold. A live sweep closes at most 1000 stale incidents
at a time; if more stale incidents remain, follow-up sweeps run every 30
seconds until the backlog drains. Dry-run sweeps still scan the full set
so the counters show every would-close decision. Auto-resolved incidents
carry closed_by: "auto:stale" and an incident_auto_closed action in
their timeline so reporting can distinguish them from operator closes.
Defaults (configurable in csm.yaml):
incidents:
  auto_close:
    enabled: true    # set false to disable
    dry_run: false   # set true to log decisions without writing back
    by_kind:
      mailbox_takeover: 24h
      credential_spray: 24h
      web_account_compromise: 168h
Kinds absent from by_kind are never auto-closed. The default map
omits host_integrity_risk, host_takeover, and post_exploit_process
because those host-level incidents should stay open until an operator
reviews them. host_takeover is the compound escalation raised when any
two of three host-takeover legs (a new uid-0 account, a planted suid
binary, an outbound connection to a bad ASN) are correlated for the same
host inside the merge window.
If a fresh finding for the same correlation key arrives after the auto-close, the merge-window stale-binding logic creates a new open incident – nothing about auto-close blocks re-detection. History is preserved on the closed record.
Tuning on high-volume hosts. Each by_kind threshold is the idle
time before a kind auto-resolves; they are independent and operator-set.
A host under sustained brute-force keeps a large open set mostly from the
longer-lived kinds (web_account_compromise defaults to 168h). If the
open-incident count is higher than you want to triage, shorten the
relevant by_kind entry (e.g. web_account_compromise: 72h) rather than
disabling auto-close. The closed records are retained 30 days regardless,
measured from when the incident resolves, so shortening the threshold also
moves the eventual prune point earlier relative to the last finding.
Auto-close still keeps a resolved record for follow-up instead of deleting
history at close time.
Metrics: csm_incidents_auto_closed_total and
csm_incidents_auto_close_dry_run_total.
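The sweep logic can be sketched in a few lines. Incidents are modelled as plain dictionaries purely for illustration; the field names `status`, `kind`, `updated_at`, and `closed_by` follow the text, while the `sweep` function and its return shape are hypothetical.

```python
from datetime import timedelta

DEFAULT_BY_KIND = {
    "mailbox_takeover": timedelta(hours=24),
    "credential_spray": timedelta(hours=24),
    "web_account_compromise": timedelta(hours=168),
}
MAX_PER_SWEEP = 1000  # a live sweep closes at most this many incidents


def sweep(incidents, now, by_kind=DEFAULT_BY_KIND, dry_run=False, limit=MAX_PER_SWEEP):
    """Return (closed_or_would_close, more_remaining)."""
    stale = [
        inc for inc in incidents
        if inc["status"] in ("open", "contained")
        and inc["kind"] in by_kind                      # absent kinds never auto-close
        and now - inc["updated_at"] > by_kind[inc["kind"]]
    ]
    if dry_run:
        return stale, False                             # full scan, nothing written back
    batch = stale[:limit]
    for inc in batch:
        inc["status"] = "resolved"
        inc["closed_by"] = "auto:stale"
    return batch, len(stale) > limit
```

When `more_remaining` is true, a caller would schedule the follow-up sweep rather than waiting for the next hourly run.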
Credential-spray suppression
Without this path, an attacker IP that brute-forces 6500 distinct
usernames produces 6500 mailbox_takeover incidents because the
correlator keys on the mailbox, not the source IP. The
spray-suppression detector tracks the distinct-mailbox set per source
IP across the merge window and, once an IP exceeds distinct_mailboxes,
opens a single credential_spray super-incident keyed on the IP.
Subsequent findings from that IP attach to the spray incident’s
timeline instead of opening per-mailbox incidents.
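A minimal sketch of the per-IP distinct-mailbox tracking, assuming "exceeds" means strictly greater than the threshold. The `SprayTracker` class is hypothetical, and merge-window expiry and the max_tracked_ips cap are left out for brevity.

```python
from collections import defaultdict


class SprayTracker:
    def __init__(self, distinct_mailboxes=10):
        self.threshold = distinct_mailboxes
        self.mailboxes = defaultdict(set)  # source IP -> distinct mailboxes seen

    def route(self, source_ip, mailbox):
        """Return the (incident_kind, correlation_key) a finding should attach to."""
        seen = self.mailboxes[source_ip]
        seen.add(mailbox)
        if len(seen) > self.threshold:
            return ("credential_spray", source_ip)   # one super-incident per IP
        return ("mailbox_takeover", mailbox)          # legacy per-mailbox path
```

Because the set holds distinct mailboxes, an attacker hammering a single account never trips the spray path; only breadth across mailboxes does.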
Defaults (configurable in csm.yaml):
incidents:
  spray_suppression:
    enabled: false             # default OFF; opt-in
    dry_run: true              # default ON; counters move, routing unchanged
    distinct_mailboxes: 10     # threshold to trip
    severity_escalate_at: 50   # bump severity to CRITICAL at this many
    per_check:
      - email_auth_failure_realtime
      - pam_auth_failure
      - ssh_bruteforce
    max_tracked_ips: 10000
    block_at_severity: ""      # "" detection-only, "high" block on open,
                               # "critical" block on escalation
Setting block_at_severity hands the source IP to the firewall as soon
as the spray detector trips at the chosen tier, once
spray_suppression.dry_run is false. The detector also requires
auto_response.enabled and auto_response.block_ips; the firewall still
honors auto_response.dry_run, so a dry-run host logs the would-be block
without applying nftables rules. Live accepted requests are recorded on
the incident timeline as a credential_spray_block_requested action.
Non-live outcomes (dry-run, verdict-allow, already blocked) and failed
attempts do not latch the incident, so a later finding can retry after
blocking is live again. Concurrent findings for the same incident share
one in-flight firewall call, and resolved or dismissed spray incidents do
not make new block decisions.
Whitelisted IPs (entries in reputation.whitelist and the live bbolt
whitelist updated via the Web UI) are skipped from spray detection so
internal mail relays, NAT egresses, and known-good infrastructure
never produce a spray incident.
Choosing block_at_severity:
- "" (default) – detection-only. Spray incidents open, no firewall hand-off. Use during dry-run validation and on hosts where blocking is owned by a separate system.
- high – block at the distinct_mailboxes trip. Recommended once the dry-run counter looks clean. Trips on the first sustained burst before the source IP goes idle for longer than the merge window.
- critical – block only after severity escalates, i.e. one IP hits severity_escalate_at distinct mailboxes before the source IP is idle for more than the merge window. A low-and-slow attacker that stays below that count before each idle reset never escalates and never blocks. Pick this only when you have strong shared-NAT exposure and accept that slow sprayers evade the gate.
Rollout:
- Ship the daemon with enabled: false, dry_run: true. The detector tracks per-IP mailbox sets and increments csm_credential_spray_dry_run_total whenever the threshold would have tripped, but routing stays on the legacy per-mailbox path.
- Validate the counter on your own infrastructure for 24h. If a trusted IP shows up in the dry-run trips, add it to reputation.whitelist.
- Flip enabled: true, dry_run: false. New attacker IPs route through the spray path; existing per-mailbox backlog drains via the auto-close path.
- After another 24h, set block_at_severity: high. The firewall hand-off runs on every spray decision (open + merge), so an incident opened before the flag was armed still blocks on the next finding from the same IP.
Metrics: csm_credential_spray_opened_total,
csm_credential_spray_suppressed_mailbox_takeover_total,
csm_credential_spray_dry_run_total,
csm_credential_spray_tracked_ips.
Incident auto-block
spray_suppression only handles the credential_spray super-incident
kind. Low-and-slow scanners that never trip a per-detector window
(modsec escalation, mail brute-force, smtp probe) still produce
mailbox_takeover or web_account_compromise incidents but never get
firewalled. The incidents.auto_block block adds a generic
incident-driven firewall hand-off:
incidents:
  auto_block:
    enabled: false          # default OFF; opt-in
    block_at_severity: ""   # "" / "high" / "critical"
    kinds: []               # empty = any non-spray kind with one source IP
When the gate trips, the correlator hands the source IP to the firewall
through the same dry-run / block_ips gate as the spray path. A live
accepted request records incident_block_requested; non-live outcomes
(dry-run, verdict-allow, already blocked) do not latch the incident, so
an operator who arms auto_block AFTER an incident has already crossed
the gate still gets a block on the next finding while the incident is
open or contained. Incidents with multiple source IPs are left for manual
review.
If a long-running incident’s timeline was truncated and the source IP is
not part of the incident key, auto-block also stays off because the
remaining visible timeline may not contain every source IP.
credential_spray is explicitly excluded from this path; the dedicated
spray hand-off owns it. Set kinds to narrow the surface (e.g. only
web_account_compromise) if you do not want every CRITICAL
mailbox_takeover incident to block its source IP.
This pairs naturally with the ModSecurity escalation thresholds
(thresholds.modsec_escalation_hits,
thresholds.modsec_escalation_window_min) – raising the window from
the shipped default of 10 minutes to e.g. 4 hours lets the modsec
detector promote paced scanners to a Critical escalation finding,
which then trips the generic auto_block gate.
Kinds
- web_account_compromise – default for findings attributable to a hosted account or script (PHP relay, webshell, login bruteforce, etc.).
- mailbox_takeover – SMTP/SASL, suspicious-login, credential-abuse, and rate signals tied to a mailbox or cPanel-local mail account.
- post_exploit_process – process exec from /tmp, /var/tmp, /dev/shm.
- host_integrity_risk – daemon/kernel-level signals (sensitive file writes, fake kernel threads, auditd disabled).
- host_takeover – any two of a new uid-0 account, a planted suid binary, and an outbound connection to a bad ASN, seen for the same host inside the merge window.
- credential_spray – one source IP brute-forcing many distinct mailboxes/accounts inside the merge window. Keyed on the source IP rather than per-mailbox, so a scanner spraying thousands of usernames produces one super-incident instead of thousands of mailbox_takeover rows. Findings from the same IP after the trip attach to this incident’s timeline. See “Credential-spray suppression” above.
Severity policy
Severity escalates only. Each incident keeps the highest severity any
joined finding has carried. Findings themselves are never re-emitted at
a higher severity. The audit trail records an
incident_severity_changed action when an incident’s severity bumps.
Correlation window
15 minutes by default. Findings outside the window for the same key start a new incident. The window is a named constant in code; not yet exposed via config.
Open threshold
Non-Critical findings need at least two correlated sightings inside the merge window before an incident opens. The first sighting is held in a pending bucket and counted toward the threshold; the second promotes both into a new incident with a two-event timeline. Stale pending entries are pruned by the daily retention sweep.
Critical-severity findings (account compromise, cloud-relay abuse, modsec rule escalations) bypass the threshold and open immediately so escalations still page on first hit.
The threshold suppresses one-shot scanner noise (a single modsec
deny from a wandering scanner, an isolated mistyped password) without
hiding sustained activity. The current pending-bucket size is exposed
as the csm_incidents_pending gauge.
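The two-sighting gate can be sketched as a small state machine. The `OpenGate` class is hypothetical, timestamps are plain seconds, and the 15-minute window is taken from the correlation-window default; how a Critical finding interacts with an already-pending first sighting is not specified in the text, so this sketch simply leaves the pending entry untouched.

```python
WINDOW_SECONDS = 15 * 60  # default correlation window


class OpenGate:
    def __init__(self, window=WINDOW_SECONDS):
        self.window = window
        self.pending = {}  # correlation key -> (finding, seen_at)

    def observe(self, key, finding, severity, now):
        """Return the findings that open an incident, or [] if the finding is held."""
        if severity == "critical":
            return [finding]                      # bypasses the threshold
        held = self.pending.get(key)
        if held and now - held[1] <= self.window:
            del self.pending[key]
            return [held[0], finding]             # two-event timeline
        self.pending[key] = (finding, now)        # first sighting, or stale one replaced
        return []
```

`len(gate.pending)` is the quantity that the csm_incidents_pending gauge would report in this model.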
The stored incident includes the full correlation key, including process PID/UID and remote IP when those are the only available dimensions, so active incidents keep merging after daemon restart.
API
- GET /api/v1/incidents – list, newest first. Without query parameters the response is a bare JSON array (compat with the existing wire shape phpanel/SIEM consumers decode against). When ?limit=, ?offset=, or ?status= is present the response switches to an envelope: {"items":[...], "total":N, "offset":N, "limit":N, "status":"..."}. Status accepts the four spec values plus active (open + contained, the default web UI filter). Limit is capped server-side at a safe maximum.
- GET /api/v1/incidents/<id> – one incident.
- POST /api/v1/incidents/<id>/status – transition status.
See api.md for endpoint detail.
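Because the list endpoint returns either a bare array or an envelope depending on the query string, a client may want to normalize both shapes. The helper below is a hypothetical client-side sketch that operates on already-decoded JSON; reporting `len(items)` as the total for the bare-array shape is an assumption.

```python
def normalize_incident_list(payload):
    """Accept either wire shape and return (items, total)."""
    if isinstance(payload, list):
        # Bare array: returned when no limit/offset/status parameter was sent.
        return payload, len(payload)
    if isinstance(payload, dict) and isinstance(payload.get("items"), list):
        # Envelope: returned when any paging or status parameter is present.
        return payload["items"], payload.get("total", len(payload["items"]))
    raise ValueError("unexpected incident list shape")
```

Consumers that always send `?limit=` can skip the first branch, but keeping it makes the client tolerant of both call styles.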
Web UI
Open Monitor -> Incidents. The page has three tabs:
- Correlated – the default flat list of incidents with status filter, page size, and detail panel. The detail panel shows the current firewall block state for the incident’s source IP (permanent, temporary, cphulk, or not blocked) when an IP is known.
- Grouped – rolls up incidents by (kind, source) so a credential spray that produced thousands of mailbox_takeover incidents shows as one row per attacker IP. Pageable with the same page-size selector as Correlated. Click a group to see member incidents in the detail panel, which also surfaces the source IP’s firewall block state; clicking a sample id jumps back to the Correlated tab focused on that incident.
- Timeline Search – the older IP/account history search across the audit log.
Admin tokens can transition incident status (open / contained / resolved / dismissed); read-scope tokens can browse all three tabs.
Control socket
csm incidents list [--status all|active|open|contained|resolved|dismissed] [--limit N] [--offset N] [--all]
csm incidents show <id>
csm incidents status <id> <open|contained|resolved|dismissed> [details]
csm incidents bulk-status --older-than 24h [--last-seen-before RFC3339] [--status active|open|contained] [--kind K] [--domain D] [--account A] [--mailbox M] [--limit N] [--to resolved|dismissed] [--apply --confirm]
csm incidents list returns the first 100 incidents by default. Use
--offset for the next page, --status active for open + contained
incidents, or --all for an explicit full dump.
csm incidents bulk-status defaults to dry-run. It prints the total
match count and a bounded preview of the incidents that would change.
At least one age guard is required: --older-than, --last-seen-before,
or both. To mutate incidents, pass both --apply and --confirm.
Metrics
- csm_incidents_open – gauge of currently open + contained incidents.
- csm_incidents_created_total
- csm_incidents_severity_changed_total
- csm_incidents_status_changed_total
- csm_incidents_findings_merged_total
- csm_incidents_compacted_total
- csm_incidents_pending – gauge of findings held in the threshold gate, awaiting a second correlated sighting.
Incident Response Runbook
Use this flow when CSM flags account compromise, mailbox takeover, malicious database triggers, or outbound spam on a production cPanel host.
Safety rules
- Do not delete customer files during first response.
- Do not thaw, release, or purge queued mail until affected credentials are rotated or an operator approves the specific queue action.
- Do not close incidents until the account has been reviewed, credentials have been rotated or explicitly deferred, and a fresh scan is clean.
- Take a CSM backup before upgrading CSM or changing incident state.
1. Verify the deployed binary
Deploy only after the required GitLab pipeline has passed and the package has been published.
/root/deploy-csm.sh check
/root/deploy-csm.sh upgrade
/opt/csm/csm version
/opt/csm/csm doctor --json
2. Take a backup
mkdir -p /root/csm-backups
/opt/csm/csm backup /root/csm-backups/csm-pre-response-$(date +%Y%m%d%H%M%S).tar.gz
sha256sum /root/csm-backups/csm-pre-response-*.tar.gz
Confirm the archive is readable:
gzip -t /root/csm-backups/csm-pre-response-*.tar.gz
tar -tzf /root/csm-backups/csm-pre-response-*.tar.gz | sed -n '1,80p'
3. Preserve evidence
mkdir -p /root/csm-forensics
/opt/csm/csm forensic-snapshot <account> --out /root/csm-forensics/<account>-$(date +%Y%m%d%H%M%S).tar.gz
sha256sum -c /root/csm-forensics/<account>-*.sha256
tar -xOzf /root/csm-forensics/<account>-*.tar.gz manifest.txt
Check the manifest for private-path exclusions, schema count, capture
errors, and recent_mtimes_status=ok.
4. Map affected accounts
Map incident domains and queued local senders to cPanel users before rotating credentials or changing mail queue state.
/opt/csm/csm incidents list --status open --all
exim -bpc
exim -bp | exiqsumm
grep -E '^example.com:' /etc/trueuserdomains /etc/userdomains
whmapi1 listaccts searchtype=user search=<account> --output=json
Use native cPanel APIs for inventory:
uapi --user=<account> Email list_pops --output=json
uapi --user=<account> Ftp list_ftp --output=json
uapi --user=<account> Mysql list_users --output=json
5. Rotate credentials
Rotate the cPanel account password, FTP accounts, affected mailboxes, WordPress administrator users, database users, and application secrets for the affected account. Prefer WHM and UAPI calls or the control panel over direct file edits.
Do this before releasing mail or marking incidents resolved unless the operator explicitly defers rotation for a documented reason.
6. Review queued mail
Start with read-only summaries:
exim -bpc
exim -bp | exiqsumm
exim -bp
Review headers before any queue action:
exim -Mvh <message-id>
Group messages into:
- safe to remove: frozen bounces, obvious backscatter, duplicate failed delivery notices with no customer value
- do not touch: current customer conversations, invoices, form leads, or any message where the business value is unclear
- needs review: suspicious local sender messages, mixed external bulk mail, or messages tied to an account whose credentials are not rotated
Only remove or thaw message IDs that were reviewed:
exim -Mrm <message-id>
exim -Mt <message-id>
7. Review stale incidents
Preview first:
/opt/csm/csm incidents bulk-status --older-than 72h --status active --kind web_account_compromise --limit 20
/opt/csm/csm incidents bulk-status --older-than 24h --status active --kind mailbox_takeover --limit 20
Apply in bounded batches only after review:
/opt/csm/csm incidents bulk-status --older-than 72h --status active --kind web_account_compromise --limit 100 --apply --confirm --details "operator cleanup after review"
For mailbox incidents, confirm mailbox rotation or explicit operator deferral before applying status changes.
8. Confirm recovery
/opt/csm/csm status --json
/opt/csm/csm doctor --json
exim -bpc
Keep the forensic archives, CSM backup, command notes, and queue decisions with the incident record.
CVE Mitigations
CSM treats CVEs as a three-layer problem:
- Operator-driven hardening via csm harden ... - applies the right preventive control for the host (modprobe blacklist, seccomp drop-ins, sysctl tweaks).
- Continuous enforcement by the daemon - re-asserts the control on every startup and as a periodic check, so a package upgrade or manual modprobe does not silently undo the mitigation.
- Live detection - auditd/BPF listeners flag exploit signatures the moment they fire, even on hosts where the kernel itself cannot be patched.
The hardening audit detects what the host actually has (kernel build, KernelCare livepatches, seccomp coverage) and refuses to claim protection it cannot deliver. Run csm harden with no arguments for the full list of available mitigations on the current host.
Active mitigations
CVE-2026-31431 - “Copy Fail” (Linux kernel AF_ALG)
Two operator paths depending on whether AF_ALG is loadable on the kernel:
- csm harden --copy-fail - writes /etc/modprobe.d/csm-copy-fail-mitigation.conf blacklisting algif_aead and af_alg, then unloads them. Refuses on kernels where AF_ALG is built in (typical cPanel / CloudLinux 8), since the blacklist would have no effect there.
- csm harden --copy-fail-seccomp - writes systemd RestrictAddressFamilies=~AF_ALG drop-ins for the units that spawn untrusted code: LiteSpeed, Apache/Nginx, every PHP-FPM pool, cron, mail. The right path on built-in-AF_ALG kernels.
The audit recognises KernelCare CVE-2026-31431 livepatches via kcarectl --patch-info and reports pass only when the host is genuinely mitigated (module blacklisted, seccomp drop-ins applied, or KernelCare livepatch active).
Live detection and BPF blocking
BPF-tagged builds (make BPF=1 or go build -tags bpf) load an LSM socket_create program on kernels with BPF LSM and ringbuf support. That program refuses socket(AF_ALG, ...) from non-root UIDs before the AF_ALG socket is allocated, returns EPERM, emits a ringbuf event, and feeds the existing Critical af_alg_socket_use finding path. UID 0 keeps AF_ALG access for system crypto users. There are no UID-range or alert-only BPF tunables today; use detection.af_alg_backend to select auditd or none if a host needs to avoid kernel-side refusal.
Default builds, BPF-tagged builds on unsupported kernels, and hosts forced to detection.af_alg_backend: auditd keep the audit-log listener. The audit path catches non-system AF_ALG socket attempts at Critical within about 500 ms but cannot stop the syscall before it reaches the kernel. Hosts that are not exploitable skip the live listener.
If CSM attempts the AF_ALG BPF path and cannot start it, it emits a
bpf_unavailable finding. The finding says whether the audit fallback
is active or no live fallback is available.
Auto-response
- auto_response.copy_fail_kill_process: true - SIGKILL the offending process when the live listener fires. Default off (alert-only).
- auto_response.disable_enforce_af_alg: true - suspend the periodic re-assertion of the module blacklist without removing the hardening marker. For triage only.
The daemon self-heals its auditd rule file on startup if it has drifted from the embedded copy, closing the upgrade gap where a new binary shipped without re-running postinstall would leave detection silently inactive.
Configuration knob
- detection.af_alg_backend - auto (default) | bpf | auditd | none. auditd is the kill switch for a misbehaving BPF-tagged release. bpf is strict mode (no fallback). none disables the live listener entirely.
The csm_af_alg_backend{kind="bpf-lsm"|"auditd-tail"|"none"} Prometheus gauge surfaces which backend the coordinator selected at startup, so dashboards can see the active path without parsing logs.
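The backend choice described above can be sketched as a small decision function. This is a minimal illustration, not the coordinator's actual code: the function name and the bpf_usable flag (BPF-tagged build on a kernel with BPF LSM and ringbuf support, and the program loaded successfully) are hypothetical, and reporting "none" when strict bpf mode cannot start is an assumption drawn from the "no fallback" wording. The "not exploitable host" shortcut is not modelled.

```python
def select_af_alg_backend(configured: str, bpf_usable: bool) -> str:
    """Return the kind label the csm_af_alg_backend gauge would carry.

    configured -- value of detection.af_alg_backend
    bpf_usable -- hypothetical flag: the BPF LSM program could be started
    """
    if configured == "none":
        return "none"                 # live listener disabled entirely
    if configured == "auditd":
        return "auditd-tail"          # kill switch: never try BPF
    if configured == "bpf":
        # Strict mode: no fallback (assumed to surface as "none").
        return "bpf-lsm" if bpf_usable else "none"
    if configured == "auto":
        # Prefer kernel-side refusal, fall back to the audit-log listener.
        return "bpf-lsm" if bpf_usable else "auditd-tail"
    raise ValueError(f"unknown detection.af_alg_backend: {configured!r}")
```

With auto on a default (non-BPF) build this returns "auditd-tail", which matches the audit-log path described under Live detection and BPF blocking.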
BPF validation
On a BPF LSM host with a BPF-tagged CSM build:
- Set detection.af_alg_backend: bpf for strict validation, or leave it as auto and confirm BPF was selected.
- Start the daemon and check metrics:
curl -k -H "Authorization: Bearer $CSM_TOKEN" https://127.0.0.1:9443/metrics \
  | grep -E 'csm_af_alg_backend|csm_bpf_backend'
Expected selected series:
csm_af_alg_backend{kind="bpf-lsm"} 1
csm_bpf_backend{feature="af_alg",kind="bpf"} 1
- As a non-root user, attempt an AF_ALG socket:
sudo -u nobody python3 - <<'PY'
import errno
import socket
import sys
try:
    socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0)
except OSError as exc:
    print(exc.errno)
    sys.exit(0 if exc.errno == errno.EPERM else 1)
raise SystemExit("AF_ALG socket unexpectedly succeeded")
PY
- Confirm the command prints 1 (EPERM) and CSM emits a Critical af_alg_socket_use finding. The finding details should say the call was refused by the BPF LSM program.
CVE-2026-41940 - cPanel/WHM auth-bypass
Detection in the access-log path:
- Non-infra WHM login attempts surface at Warning (suppressible alongside other cPanel-login alerts) so an operator can see brute force traffic against the bypass surface.
- The tokenless WHM-script request the published exploit uses for cache promotion surfaces at Critical, always-on, and feeds auto-block.
No operator hardening command is required. The host fix is to apply the cPanel patched build. CSM provides the detection layer for windows where patching has not yet rolled out.
Future CVEs
New mitigations land here as they ship. The bar is:
- The host can be measurably hardened (modprobe / seccomp / sysctl / config), and/or
- An exploit-signature detector can fire reliably without false positives.
CVEs that are purely “patch the package”, with no preventive control we can apply and no signature we can detect, do not get a CSM mitigation; the right answer is the vendor patch. The daemon’s package-integrity check (rpm -V / debsums) covers the “did the operator actually apply the patch” question.
Firewall (nftables)
CSM includes a native nftables firewall engine that replaces LFD and fail2ban. It uses the kernel netlink API directly via google/nftables - no iptables, no Perl, no shell commands.
Features
- Atomic ruleset - single netlink transaction, no partial application
- Named IP sets with per-element timeouts (blocked, allowed, infra, country)
- Rate limiting - SYN flood, UDP flood, per-IP connection rate, per-port flood
- Country blocking via MaxMind GeoIP CIDR ranges
- Outbound SMTP restriction by UID (prevent spam from compromised accounts)
- Subnet/CIDR blocking with auto-escalation from individual IPs and safety guards for infra, local, and allowed addresses
- Permanent block escalation after repeated temp blocks
- Dynamic DNS hostname resolution (updated every 5 min) with grace-period guard against transient resolver failures
- IPv6 dual-stack with separate sets
- Commit-confirmed safety - Juniper-style auto-rollback timer
- Infra IP protection - refuses to block infrastructure IPs
- Auto-response dry-run - safety default that records intended blocks without touching nftables
- Verdict callback - optional advisory hook to the panel before each auto-block (allow / block / attach metadata)
- cphulk integration - unblock flushes cphulk too
- Audit trail - JSONL log with 10MB rotation
- State persistence with atomic writes
CLI Commands
# Status
csm firewall status # Show status and statistics
csm firewall ports # Show configured port rules
# Block / Allow
csm firewall deny <ip> [reason] # Block IP permanently
csm firewall allow <ip> [reason] # Allow IP (all ports)
csm firewall allow-port <ip> <port> [reason] # Allow IP on specific port
csm firewall remove <ip> # Remove from blocked and allowed
csm firewall remove-port <ip> <port> # Remove port-specific allow
# Temporary
csm firewall tempban <ip> <dur> [reason] # Temporary block
csm firewall tempallow <ip> <dur> [reason] # Temporary allow
# Subnets
csm firewall deny-subnet <cidr> [reason] # Block subnet
csm firewall remove-subnet <cidr> # Remove subnet block
# Search
csm firewall grep <pattern> # Search blocked/allowed IPs
csm firewall lookup <ip> # GeoIP + block status lookup
# Bulk operations
csm firewall deny-file <path> # Bulk block from file
csm firewall allow-file <path> # Bulk allow from file
csm firewall flush # Clear all dynamic blocks
# Safety
csm firewall apply-confirmed <minutes> # Apply with auto-rollback timer
csm firewall confirm # Confirm applied changes
csm firewall rollback status|confirm|revert # Manage pending config rollback
csm firewall restart # Reapply full ruleset
# Profiles
csm firewall profile save|list|restore <name> # Profile management
# Audit
csm firewall audit [limit] # View audit log
# GeoIP
csm firewall update-geoip # Download country IP blocks
# Cloudflare
csm firewall cf-status # Show Cloudflare IP whitelist status
Configuration
Firewall defaults can be edited in two places:
- Web UI: Settings -> Firewall section. Port lists, rate limits, flood protection, deny caps, country block, and outbound SMTP restriction are all editable. Changes are restart-class. The save endpoint warns if the WebUI listen port is missing from tcp_in. The port_flood per-port rule list is YAML-only for now.
- YAML: edit /etc/csm/csm.yaml directly. Run csm rehash then systemctl restart csm.
Tentative apply (rollback timer)
The Firewall section in the Web UI offers two save buttons. Save writes
the new config and prompts you to restart. Apply with rollback timer
writes the new config, restarts the daemon, and starts a timer (default 5
minutes, range 1-30). If you do not click Confirm before the timer
expires, the daemon restores the previous config and restarts again. This
protects against locking yourself out by, for example, removing the WebUI
port from tcp_in.
When the Web UI is unreachable (firewall mistuned, daemon broken), use the CLI escape hatch:
csm firewall rollback status
csm firewall rollback confirm
csm firewall rollback revert
Rollback state survives daemon restarts (the snapshot is persisted in bbolt). On startup the daemon checks for a pending rollback: if the deadline has already passed it restores the previous config and restarts; otherwise it rearms the timer for the remaining window.
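The startup check for a pending rollback can be sketched as follows. This is an illustrative model only: the function name and the "none" / "revert" / "rearm" labels are hypothetical, and the persisted deadline is assumed to be available as a timezone-aware timestamp.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def pending_rollback_action(deadline: Optional[datetime],
                            now: datetime) -> Tuple[str, timedelta]:
    """Decide what to do with a persisted rollback snapshot at startup.

    Returns (action, remaining); the action names are illustrative.
    """
    if deadline is None:
        return ("none", timedelta(0))      # nothing pending
    remaining = deadline - now
    if remaining <= timedelta(0):
        # Deadline passed while the daemon was down: restore and restart.
        return ("revert", timedelta(0))
    # Still inside the window: rearm the timer for what is left.
    return ("rearm", remaining)
```

For example, a snapshot whose deadline is three minutes away yields ("rearm", 3 minutes), so an operator who restarts the daemon mid-window does not gain extra time.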
firewall:
enabled: true
ipv6: false
conn_rate_limit: 200 # new connections per minute per IP (CGNAT-tolerant)
syn_flood_protection: true
conn_limit: 400 # max concurrent connections per IP (0 = disabled)
smtp_block: false # restrict outbound SMTP
log_dropped: true
dyndns_hosts: # resolved every 5 min and whitelisted
- "monitoring.example.com"
Full firewall reference: Configuration - Firewall.
Auto-response interaction
Auto-block calls require firewall.enabled: true because they go through the firewall engine. The engine consults two policy hooks first:
- auto_response.verdict_callback - when enabled, the engine POSTs a signed JSON request to the panel after local validation and infra-IP safety checks. When a secret is configured, CSM rejects unsigned callback replies by default. The panel can downgrade to allow (audit-only), attach tenant_id for downstream correlation, or add a note. CSM fails open on hook errors. Wire contract: docs/verdict-callback-contract.md.
- auto_response.dry_run - when true (or absent; safety default), BlockIP() records the intended block to bbolt and returns success without touching nftables. Manual csm firewall ... operator commands bypass via BlockIPForce and always apply. Verify with csm firewall status after policy changes; “Recently Blocked” timestamps newer than the last restart confirm live mode. See Auto-response - Dry-run safety default.
Subnet blocks refuse the default route and any range that contains an infrastructure IP, a resolved infra hostname, a local host address, a full-IP allow, or a port-specific allow. Remove the allow or narrow the CIDR before applying the block.
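The subnet safety guard can be illustrated with Python's ipaddress module. This is a sketch under assumptions: the function name is hypothetical, and the protected list stands in for the union of infrastructure IPs, resolved infra hostnames, local host addresses, and allow entries that the daemon would gather.

```python
import ipaddress
from typing import Iterable, Optional


def subnet_block_refusal(cidr: str, protected: Iterable[str]) -> Optional[str]:
    """Return a refusal reason, or None when the subnet block may proceed."""
    net = ipaddress.ip_network(cidr, strict=False)
    if net.prefixlen == 0:
        # 0.0.0.0/0 or ::/0 would block everything.
        return "refusing to block the default route"
    for addr in protected:
        ip = ipaddress.ip_address(addr)
        # Only compare addresses of the same family as the subnet.
        if ip.version == net.version and ip in net:
            return f"{cidr} contains protected address {addr}"
    return None
```

A caller would narrow the CIDR or remove the conflicting allow entry and retry when a reason is returned.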
Infrastructure IP DNS guard
Hostnames listed in top-level infra_ips or firewall.infra_ips are resolved every 5 minutes and their current addresses feed the infra auto-block guard. If a hostname stops resolving, the daemon emits an infra_ips_unresolvable Warning finding and keeps the last known addresses protected during the grace period (default 10 min). This prevents a transient DNS outage from deprotecting the management plane. The finding auto-clears when resolution recovers.
ModSecurity Integration
CSM detects and manages ModSecurity (WAF) on Apache, Nginx, and LiteSpeed across cPanel, plain Debian/Ubuntu, and plain AlmaLinux/Rocky/RHEL hosts. It deploys custom rules (cPanel only) and provides a web UI for rule overrides and escalation.
Supported Web Servers
| Web server | Config candidates | Status check | Custom rule deployment |
|---|---|---|---|
| Apache on cPanel EA4 | /usr/local/apache/conf/*, /etc/apache2/conf.d/modsec*, whmapi1 modsec_is_installed | Yes | Yes (via cPanel modsec user conf) |
| Apache on Debian/Ubuntu | /etc/apache2/mods-enabled/security2.conf, /etc/apache2/conf-enabled/*, /etc/apache2/conf.d/modsec2.conf | Yes | Not yet (plain Linux) |
| Apache on RHEL/Alma/Rocky | /etc/httpd/conf.d/mod_security.conf, /etc/httpd/conf.modules.d/* | Yes | Not yet (plain Linux) |
| Nginx on any distro | /etc/nginx/nginx.conf, /etc/nginx/modules-enabled/50-mod-http-modsecurity.conf, /etc/nginx/modsec/main.conf | Yes | Not yet (plain Linux) |
| LiteSpeed | /usr/local/lsws/conf/httpd_config.xml, /usr/local/lsws/conf/modsec2.conf | Yes | Not yet |
When ModSecurity is not installed, the waf_status check emits a platform-specific install hint:
# On Ubuntu + Nginx:
Install: apt install libnginx-mod-http-modsecurity modsecurity-crs
# On Ubuntu + Apache:
Install: apt install libapache2-mod-security2 modsecurity-crs && a2enmod security2
# On AlmaLinux + Apache:
Install (requires EPEL): dnf install -y epel-release && dnf install -y mod_security
# On AlmaLinux + Nginx:
Install (requires EPEL): dnf install -y epel-release && dnf install -y nginx-mod-http-modsecurity
# On cPanel:
Install: WHM > Security Center > ModSecurity
Rule-staleness alerts scan both the flat CRS layout (/usr/share/modsecurity-crs/rules/*.conf) used by distro packages and the per-vendor subdirectory layout used by cPanel (/usr/local/apache/conf/modsec_vendor_configs/VENDOR/*.conf). Update instructions are also platform-specific (apt update && apt upgrade modsecurity-crs, dnf upgrade modsecurity-crs, or WHM on cPanel).
Features
- Custom CSM rules - IDs 900000-900999 in configs/csm_modsec_custom.conf (cPanel only today)
- Rule override management - SecRuleRemoveById directives for false positive suppression
- Escalation control - change rule severity or action per-rule
- Live deny escalation - repeated ModSecurity deny events from one IP emit an escalation finding that feeds auto-response blocking. CSM-owned rules keep their existing per-rule escalation controls.
- WAF event log parsing - correlates events by IP, URI, and rule ID
- Hot-reload - apply changes without Apache restart (cPanel only)
Web UI Pages
ModSecurity (/modsec) - WAF status overview, event log, active block list
ModSec Rules (/modsec/rules) - per-rule management:
- View loaded rules with descriptions
- Enable/disable individual rules
- Override rule severity or action
- Deploy custom rules
API Endpoints
GET /api/v1/modsec/stats WAF statistics
GET /api/v1/modsec/blocks Blocked request log
GET /api/v1/modsec/events WAF event details
GET /api/v1/modsec/rules Loaded rules list
POST /api/v1/modsec/rules/apply Apply custom rules
POST /api/v1/modsec/rules/escalation Change rule severity/action
Signature Rules
CSM uses YAML and YARA-X rules for malware detection. Rules are stored in /opt/csm/rules/ and scanned both in real-time (fanotify) and during deep scans.
YAML Rules
rules:
- name: webshell_c99
severity: critical
category: webshell
file_types: [".php"]
patterns: ["c99shell", "c99_buff_prepare"]
min_match: 1
- name: phishing_login
severity: high
category: phishing
file_types: [".html", ".php"]
patterns: ["password.*submit", "credit.*card.*number"]
exclude: ["legitimate_form_handler"]
min_match: 2
Fields:
- name - unique rule identifier
- severity - critical, high, or warning
- category - webshell, backdoor, phishing, dropper, exploit
- file_types - file extensions to match (or ["*"] for all)
- patterns - literal strings or regex patterns
- exclude - patterns that prevent a match (false positive reduction)
- min_match - minimum patterns that must match
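How these fields could combine into a match decision is sketched below. This is not CSM's scanner: the Rule class and rule_matches function are hypothetical, every pattern is treated as a case-sensitive regular expression (a literal such as c99shell is also a valid regex), and matching on the lower-cased file extension is an assumption.

```python
import re
from dataclasses import dataclass, field
from pathlib import PurePath
from typing import List


@dataclass
class Rule:
    name: str
    file_types: List[str]
    patterns: List[str]
    min_match: int = 1
    exclude: List[str] = field(default_factory=list)


def rule_matches(rule: Rule, filename: str, content: str) -> bool:
    """Apply file_types, exclude, patterns, and min_match in that order."""
    suffix = PurePath(filename).suffix.lower()
    if "*" not in rule.file_types and suffix not in rule.file_types:
        return False
    # Any exclude hit vetoes the rule (false positive reduction).
    if any(re.search(p, content) for p in rule.exclude):
        return False
    hits = sum(1 for p in rule.patterns if re.search(p, content))
    return hits >= rule.min_match
```

With the phishing_login rule above, a page containing only a password form does not match, because min_match: 2 requires both patterns to hit.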
YARA-X Rules (Optional)
Build CSM with YARA-X support:
CGO_LDFLAGS="$(pkg-config --libs --static yara_x_capi)" go build -tags yara ./cmd/csm/
Place .yar or .yara files alongside YAML rules in /opt/csm/rules/. CSM compiles them at startup and uses them for:
- Real-time fanotify file scanning
- Deep scan filesystem sweeps
- Email attachment scanning
Without the yara build tag, YARA rules are silently ignored.
Updating Rules
csm update-rules # download latest rules and reload the running daemon
csm update-rules now asks the daemon to reload through the control socket once the download completes. If the daemon is not running, the next start picks the files up automatically. kill -HUP $(pidof csm) still works.
Or from the web UI: Rules page > Reload Rules button.
Remote rule updates are now signature-verified. Any configuration that enables signatures.update_url or signatures.yara_forge.enabled must also set signatures.signing_key to the 64-character hex-encoded Ed25519 public key that verifies the downloaded .sig files.
Remote update URLs must use HTTP or HTTPS and must not point at localhost, loopback, link-local, unspecified, or RFC1918 / ULA private addresses.
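The URL restriction can be approximated with the standard library. This sketch is an assumption about shape, not the daemon's validator: update_url_allowed is a hypothetical name, it only vets IP-literal hosts and the localhost name, and a production check would also have to vet the addresses a DNS name resolves to. Note that Python's is_private covers more ranges than RFC1918 and ULA alone (for example documentation and benchmarking networks).

```python
import ipaddress
from urllib.parse import urlsplit


def update_url_allowed(url: str) -> bool:
    """Reject non-HTTP(S) schemes and obviously internal destinations."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    host = parts.hostname
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: resolved addresses are NOT checked in this sketch.
        return True
    return not (ip.is_loopback or ip.is_link_local
                or ip.is_unspecified or ip.is_private)
```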
YARA Forge Integration
CSM can automatically fetch curated YARA rules from YARA Forge, which aggregates and quality-tests rules from 40+ public sources including signature-base, Elastic, Malpedia, and ESET.
Configuration
signatures:
signing_key: "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
yara_forge:
enabled: true
tier: "core" # core (5K rules, low FP), extended (10K), full (12K)
update_interval: "168h" # weekly
download_url: "https://mirrors.pidginhost.com/csm/yara-forge/{version}/yara-forge-rules-{tier}.zip"
disabled_rules: # rule names to exclude from Forge downloads
- SUSP_Example_Rule
The project operates the signed mirror shown above. A ready-to-use drop-in is shipped at /usr/lib/csm/profiles/yara-forge.example.yaml; copy or include it under /etc/csm/conf.d/ to enable Forge without editing the main csm.yaml. The matching signing_key is the project Ed25519 public key (hex), published on the release signing page.
signing_key must be a hex string for the Ed25519 public key that matches the private key used to sign the remote Forge artifact. It is not a PEM block and not a file path.
YARA Forge’s upstream GitHub releases publish ZIP files, but not CSM detached signatures. CSM therefore requires yara_forge.download_url to point at a signed mirror: either the project mirror shown above or one you operate. The URL may contain {tier} and {version} placeholders. The detached signature must be available at the resolved ZIP URL plus .sig.
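The key format and URL rules above can be checked before deploying a config. The helpers below are a hypothetical pre-flight sketch: the function names are invented for illustration, the version string format is an assumption, and no Ed25519 verification is performed here.

```python
import string
from typing import Tuple


def validate_signing_key(key: str) -> bytes:
    """Accept only a 64-character hex string (a 32-byte Ed25519 public key)."""
    if len(key) != 64 or any(c not in string.hexdigits for c in key):
        raise ValueError("signing_key must be 64 hex characters, "
                         "not a PEM block or a file path")
    return bytes.fromhex(key)


def resolve_forge_urls(template: str, tier: str, version: str) -> Tuple[str, str]:
    """Fill the {tier} and {version} placeholders; the signature is ZIP URL + .sig."""
    zip_url = template.replace("{tier}", tier).replace("{version}", version)
    return zip_url, zip_url + ".sig"
```

For the download_url shown in the configuration above, tier core and a hypothetical version 20250101 resolve to .../yara-forge/20250101/yara-forge-rules-core.zip and the same URL with .sig appended.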
If you do not have a signed update source yet, disable remote updates instead:
signatures:
signing_key: ""
update_url: ""
yara_forge:
enabled: false
Tiers
| Tier | Rules | Size | False Positive Risk |
|---|---|---|---|
| core | ~5,000 | 1.6 MB | Low (quality >= 70, score >= 65) |
| extended | ~10,500 | 3.3 MB | Medium |
| full | ~11,600 | 3.7 MB | Higher (includes score >= 40) |
Update Flow
- CSM checks the latest YARA Forge release tag on GitHub
- If newer than the installed version, downloads the ZIP for the configured tier from yara_forge.download_url and its detached signature
- Verifies the download against signatures.signing_key
- Filters out any rules listed in disabled_rules
- Compile-tests the rules with YARA-X before installing
- Atomically replaces the previous Forge rules file
- Reloads the YARA scanner
Custom rules in malware.yar are never overwritten by the Forge fetcher.
Disabling Rules
If a Forge rule produces false positives, add its name to disabled_rules in the config and reload:
signatures:
disabled_rules:
- SUSP_XOR_Encoded_URL
- HKTL_Mimikatz_Strings
After editing, send SIGHUP or restart the daemon to apply.
How Rules Avoid False Positives
Signature rules require structural nesting, not co-presence of strings. Two dangerous function calls appearing in the same file but in unrelated code paths won’t trigger a rule. The call must directly wrap or chain with the other for a match.
Auto-quarantine adds a safety gate: files need Shannon entropy >= 4.8 or hex density > 20% before automatic quarantine. Legitimate plugin code (~4.2 entropy) passes through; obfuscated malware (~5.5+) is caught.
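The entropy gate can be illustrated as below. Shannon entropy over bytes is standard; the hex-density definition is an assumption, since the text does not say how it is measured. Here it is taken to be the fraction of bytes that sit inside runs of 16 or more hex digits, and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Assumption: "hex density" counts bytes inside long runs of hex digits.
HEX_RUN = re.compile(rb"[0-9a-fA-F]{16,}")


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte (0.0 for empty input, at most 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())


def hex_density(data: bytes) -> float:
    if not data:
        return 0.0
    return sum(len(m.group()) for m in HEX_RUN.finditer(data)) / len(data)


def passes_quarantine_gate(data: bytes) -> bool:
    """Thresholds taken from the text: entropy >= 4.8 or hex density > 20%."""
    return shannon_entropy(data) >= 4.8 or hex_density(data) > 0.20
```

A file that matches a signature but fails this gate would still be reported; only the automatic quarantine step is held back.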
Alert Rate Limiting
Default: 30 emails/hour (configurable via max_per_hour). CRITICAL findings always get through regardless of rate limit. Only lower-severity alerts are rate-limited.
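A sliding-window limiter with a critical bypass might look like the sketch below. It is an illustration only: the class name is hypothetical, the one-hour sliding window is an assumption about how max_per_hour is counted, and so is the choice that critical alerts do not consume the hourly budget.

```python
import time
from collections import deque
from typing import Deque, Optional


class AlertRateLimiter:
    def __init__(self, max_per_hour: int = 30) -> None:
        self.max_per_hour = max_per_hour
        self._sent: Deque[float] = deque()   # send times of limited alerts

    def allow(self, severity: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Drop send times that have left the one-hour window.
        while self._sent and now - self._sent[0] >= 3600:
            self._sent.popleft()
        if severity.lower() == "critical":
            return True                      # always delivered (not counted here)
        if len(self._sent) >= self.max_per_hour:
            return False
        self._sent.append(now)
        return True
```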
Suppressions
Create suppression rules to silence known false positives:
- From the Findings page: click the suppress button on any finding
- From the Rules page: manage suppression rules directly
- Via API:
POST /api/v1/suppressions
To suppress email alerts for specific checks while keeping them visible in the web UI, use disabled_checks in your config:
alerts:
email:
disabled_checks:
- "email_spam_outbreak"
- "perf_memory"
Email AV
CSM scans email attachments in real-time using ClamAV and YARA-X on the Exim mail spool.
How It Works
- fanotify watches the Exim spool directory for new messages
- Attachments are extracted and scanned by ClamAV (socket) and YARA-X (if available)
- Zip and tar.gz attachments are unpacked within configured size and file limits
- Extracted parts are staged under state_path/emailav-tmp, which must stay daemon-owned and private
- Attachment names written to logs and the UI use sanitized base names
- Infected messages are quarantined with full metadata
- Sender, recipient, and message ID are logged
Web UI
The Email page (/email) shows:
- AV watcher status (active, engine health)
- Scan statistics (scanned, infected, quarantined)
- Quarantined email list with release/delete actions
API Endpoints
GET /api/v1/email/stats Scan statistics
GET /api/v1/email/quarantine Quarantined email list
GET /api/v1/email/av/status AV watcher status
POST /api/v1/email/quarantine/ Release or delete quarantined email
Related Checks
- email_content - scans outbound email body for credentials and suspicious URLs
- email_weak_password - detects email accounts with weak passwords
- email_forwarder_audit - audits forwarders for exfiltration redirects
- mail_queue - alerts on queue buildup (spam outbreak indicator)
- mail_per_account - per-account sending volume spikes
Threat Intelligence
CSM tracks, scores, and correlates attacks using a local attack database enriched with external feeds and GeoIP data.
Attack Database
- Per-IP event tracking (brute force, webshell upload, phishing, C2, WAF block)
- Threat score calculation with temporal decay (older attacks weighted less)
- Auto-block on reputation threshold
- Top attackers leaderboard
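Temporal decay can be modelled in several ways; the sketch below uses exponential decay with a half-life. The decay function, the 24-hour half-life, the per-event weights, and the function name are all assumptions for illustration, since the scoring formula is not given here.

```python
from typing import Iterable, Tuple


def threat_score(events: Iterable[Tuple[float, float]], now: float,
                 half_life_s: float = 86400.0) -> float:
    """Sum event weights, halving each one per half_life_s of age.

    events -- (unix_timestamp, weight) pairs; weights are hypothetical
    """
    return sum(weight * 0.5 ** (max(0.0, now - ts) / half_life_s)
               for ts, weight in events)
```

Under these assumptions an event with weight 10 contributes 10 when fresh, 5 after one day, and 2.5 after two, so an IP that stops attacking drifts back below an auto-block threshold without manual cleanup.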
IP Intelligence
Combines multiple sources into a unified verdict:
| Source | Data |
|---|---|
| Local attack DB | Event count, types, score |
| AbuseIPDB | External reputation (if API key configured) |
| Rspamd | Per-IP rolling history (if controller access configured) |
| Upstream HTTP cache | Panel-side shared score (if reputation.upstream configured) |
| Permanent blocklist | Operator-managed persistent blocks |
| Firewall state | Currently blocked/allowed status |
| GeoIP | Country, city, ASN, ISP |
| RDAP | Network name, organization (cached 24h) |
Verdicts: clean, suspicious, malicious, blocked
Pluggable sources
Threat-intel sources implement a small Source interface (lookup-by-IP returning a score + reason). The aggregator queries every enabled source in parallel, applies per-source weighting, and produces the unified verdict above. Adding a new source means implementing the interface and registering it; no existing source code changes.
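The shape of such an aggregator is sketched below in Python for brevity; the daemon itself is a Go binary, so this is a conceptual model rather than its code. The Source protocol, the StaticSource stand-in, and the weighted-average combination are hypothetical; the text only states that sources are queried in parallel and weighted per source.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Protocol, Tuple


class Source(Protocol):
    name: str
    weight: float

    def lookup(self, ip: str) -> Tuple[float, str]:
        """Return (score, reason) for the IP."""
        ...


class StaticSource:
    """Stand-in source that always returns the same answer."""

    def __init__(self, name: str, weight: float, score: float, reason: str):
        self.name, self.weight = name, weight
        self._score, self._reason = score, reason

    def lookup(self, ip: str) -> Tuple[float, str]:
        return self._score, self._reason


def aggregate(ip: str, sources: List[Source]) -> Tuple[float, List[str]]:
    """Query every source in parallel and combine by weighted average."""
    if not sources:
        return 0.0, []
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        results = list(pool.map(lambda s: s.lookup(ip), sources))
    total_weight = sum(s.weight for s in sources)
    score = sum(s.weight * r[0] for s, r in zip(sources, results)) / total_weight
    reasons = [f"{s.name}: {r[1]}" for s, r in zip(sources, results)]
    return score, reasons
```

Registering a new source then amounts to appending another object that satisfies the protocol to the list passed to aggregate.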
Currently shipped:
- AbuseIPDB (reputation.abuseipdb_key) - external IP reputation feed. CSM caps uncached lookups per cycle and reserves store-backed daily quota before sending requests.
- Rspamd (reputation.rspamd.*) - per-IP rolling-history signals from the local rspamd controller. Token resolves from token_env at query time so rotation does not require a daemon restart.
- Upstream HTTP cache (reputation.upstream.*) - shared panel-side cache of AbuseIPDB or proprietary scores. Useful in fleets: agents pay a bounded local cache hit (cache_ttl_min, default 15 m) instead of hammering the upstream once per agent. CSM temporarily opens a fail-open circuit breaker after repeated upstream failures and lets only one cooldown probe through at a time. Use HTTPS for remote panels; plain HTTP is accepted only for loopback. Wire contract: docs/upstream-threat-intel-contract.md.
Abuse Reporting
reputation.report can send minimized confirmed-abuse reports to a central
database or private collector. It is off by default. Remote targets must use
HTTPS; plain HTTP is accepted only for loopback collectors. Keys and target
wiring are read at daemon startup, so changes to this block require a restart.
Web UI
The Threat Intel page (/threat) provides:
- IP lookup with composite scoring
- Top attackers with GeoIP enrichment
- Attack type breakdown chart
- Hourly trend chart
- Whitelist management (permanent and temporary)
API Endpoints
GET /api/v1/threat/stats Attack stats and type breakdown
GET /api/v1/threat/top-attackers Top attacking IPs with GeoIP
GET /api/v1/threat/ip IP threat lookup
GET /api/v1/threat/events IP event history
GET /api/v1/threat/whitelist Whitelisted IPs
GET /api/v1/threat/db-stats Attack database statistics
POST /api/v1/threat/block-ip Block IP permanently
POST /api/v1/threat/whitelist-ip Permanent whitelist
POST /api/v1/threat/temp-whitelist-ip Temporary whitelist
POST /api/v1/threat/clear-ip Clear from attack DB
POST /api/v1/threat/unwhitelist-ip Remove from whitelist
GeoIP
MaxMind GeoLite2 integration for IP geolocation and ASN enrichment.
Features
- City database - country, city, latitude/longitude
- ASN database - ISP, organization, autonomous system number
- Auto-download on first use
- Auto-update every 24 hours (configurable)
- RDAP fallback for detailed ISP/org info (cached 24h)
Where It’s Used
- Threat intel page (top attackers, IP lookup)
- Firewall audit log (country flags)
- Login alerts (geographic context)
- Country-based login suppression (trusted_countries)
- Country blocking (firewall CIDR ranges)
Configuration
geoip:
account_id: "YOUR_MAXMIND_ACCOUNT_ID"
license_key: "YOUR_MAXMIND_LICENSE_KEY"
editions:
- GeoLite2-City
- GeoLite2-ASN
auto_update: true
update_interval: 24h
Free account: maxmind.com/en/geolite2/signup
CLI
csm update-geoip # Manual database update
csm firewall update-geoip # Download country CIDR blocks
csm firewall lookup <ip> # GeoIP + block status lookup
API
GET /api/v1/geoip IP geolocation (?ip=&detail=1)
POST /api/v1/geoip/batch Batch lookup (array of IPs)
Challenge Pages
JavaScript proof-of-work challenge pages - a CAPTCHA alternative for suspicious IPs.
How It Works
- Suspicious IP hits a protected resource
- CSM serves a challenge page requiring client-side SHA-256 proof-of-work
- Browser computes the proof (shows progress bar)
- On valid solution, CSM issues an HMAC-verified token
- Subsequent requests pass through automatically
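The flow above can be sketched end to end. Everything specific here is an assumption made for illustration: the challenge:nonce hashing layout, difficulty expressed as a count of leading zero hex digits, the ip|expiry|signature token format, and all function names.

```python
import hashlib
import hmac
import itertools


def pow_valid(challenge: str, nonce: int, difficulty: int) -> bool:
    """True when SHA-256(challenge:nonce) starts with `difficulty` zero hex digits."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve_pow(challenge: str, difficulty: int) -> int:
    """Brute-force search, which is the work the browser performs client-side."""
    for nonce in itertools.count():
        if pow_valid(challenge, nonce, difficulty):
            return nonce


def issue_token(secret: bytes, client_ip: str, expires_at: int) -> str:
    payload = f"{client_ip}|{expires_at}"
    sig = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"


def token_valid(secret: bytes, token: str, client_ip: str, now: int) -> bool:
    parts = token.split("|")
    if len(parts) != 3:
        return False
    ip, expires, sig = parts
    expected = hmac.new(secret, f"{ip}|{expires}".encode(),
                        hashlib.sha256).hexdigest()
    return (hmac.compare_digest(sig, expected) and ip == client_ip
            and expires.isdigit() and now < int(expires))
```

Verification costs the server one hash, while each extra zero digit multiplies the client's expected work by 16, which is why the cost lands on the visitor rather than on the host.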
Features
- SHA-256 based difficulty - configurable 0-5 levels
- Client-side computation - no server load
- HMAC token verification - prevents token forgery and tampering
- Nonce-based anti-replay
- User-friendly - progress bar, instant feedback
- Bot filtering - headless browsers and scripts fail the challenge
Use Cases
- Gray-listing alternative to hard IP blocks
- Protecting WordPress login pages
- Rate limiting without blocking legitimate users
- DDoS mitigation layer
Routing Behavior
When challenge.enabled: true, CSM routes eligible IPs to the challenge page instead of hard-blocking them. This works independently of auto_response settings.
Challenge-Eligible Checks
Pre-auth, browser-visible attack signals only: wp_login_bruteforce,
xmlrpc_abuse, wp_user_enumeration, webmail_bruteforce, ip_reputation,
local_threat_score. Post-auth audit events (cPanel, webmail, file upload,
WHM logins), WAF high-volume attacker findings, and non-browser protocols
(SSH, FTP, DNS recursion, outbound traffic, API auth) are excluded - their IPs
have no useful challenge step or no browser session to render the PoW page.
Always Hard-Blocked
Confirmed malware (webshells, YARA/signature matches), WAF high-volume attackers, C2 connections, backdoor ports, phishing pages, database injections, and spam outbreaks are hard-block candidates immediately, even when challenge is enabled.
Timeout Escalation
If an IP doesn’t solve the PoW challenge within 30 minutes, it is automatically escalated to a hard firewall block.
Bind address
The listener binds to 127.0.0.1 by default, so enabling the challenge
server alone does not expose a new public port. The webserver integration
uses direct redirects to challenge.public_url; installed direct mode
therefore needs a non-loopback listener and a public URL ending in
/challenge.
challenge:
enabled: true
listen_addr: 0.0.0.0
listen_port: 8439
public_url: https://cpanel.example.com:8439/challenge
tls_cert: /var/cpanel/ssl/cpanel/mycpanel.pem
tls_key: /var/cpanel/ssl/cpanel/mycpanel.pem
When CSM’s firewall is enabled and challenge.port_gate.enabled is true,
the daemon also opens challenge.listen_port in the main firewall rules.
The port-gate chain still drops traffic to that port unless the source is
loopback, an infra_ips entry, or an IP currently on the challenge list.
Port-gate rules follow the configured listener address family. An IPv6-only
listener gates only IPv6 clients; IPv4 challenge entries stay in the
webserver map but are ignored by the IPv6 nftables set.
Run csm doctor challenge after changing these fields. The command checks
the public URL shape, TLS files, port-gate setting, installed webserver
snippet version, webserver configtest, and the live /challenge/gate
endpoint. Add --json for automation.
TLS
The challenge listener serves HTTPS when challenge-specific TLS material is configured. Loopback listeners stay on plain HTTP by default. Direct/public listeners can reuse the Web UI cert.
Resolution order:
- challenge.tls_cert + challenge.tls_key (explicit per-service).
- webui.tls_cert + webui.tls_key (shared cert; cPanel mycpanel.pem covers both webui and the challenge port without extra config) only when challenge.listen_addr is not loopback.
- Plain HTTP. This is expected for the default loopback-only path.
Public listeners without TLS log a startup warning.
HSTS-pinned parent domains (cPanel, phpanel, customer apex) will
fail with ERR_SSL_PROTOCOL_ERROR because the browser auto-upgrades
the URL to https; ship TLS material in production.
challenge:
tls_cert: /var/cpanel/ssl/cpanel/mycpanel.pem
tls_key: /var/cpanel/ssl/cpanel/mycpanel.pem
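The TLS resolution order described in this section can be expressed as a short function. This is a sketch of the documented order, not the daemon's code: the function names and the dict-based config shape are hypothetical.

```python
import ipaddress
from typing import Optional, Tuple


def is_loopback(addr: str) -> bool:
    if addr == "localhost":
        return True
    try:
        return ipaddress.ip_address(addr).is_loopback
    except ValueError:
        return False


def resolve_challenge_tls(challenge: dict, webui: dict) -> Optional[Tuple[str, str]]:
    """Return (cert, key) for the challenge listener, or None for plain HTTP."""
    # 1. Explicit per-service material always wins.
    if challenge.get("tls_cert") and challenge.get("tls_key"):
        return challenge["tls_cert"], challenge["tls_key"]
    # 2. Shared Web UI cert, but only for non-loopback listeners.
    listen = challenge.get("listen_addr", "127.0.0.1")
    if not is_loopback(listen) and webui.get("tls_cert") and webui.get("tls_key"):
        return webui["tls_cert"], webui["tls_key"]
    # 3. Plain HTTP (expected on the default loopback-only path).
    return None
```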
Trusted Proxies
By default, the challenge server uses RemoteAddr to identify clients.
The shipped webserver integration redirects browsers directly to
challenge.public_url, so it does not need trusted_proxies. Configure
trusted proxies only for a custom proxy deployment where CSM receives
traffic from a proxy and must trust X-Forwarded-For from that proxy.
challenge:
enabled: true
trusted_proxies:
- "127.0.0.1"
- "::1"
Without trusted_proxies, X-Forwarded-For is ignored to prevent IP spoofing.
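The trust rule can be sketched as follows. The function name is hypothetical, remote_addr is assumed to be a bare IP without a port, and walking X-Forwarded-For from right to left while skipping trusted hops is a common convention that is assumed here rather than taken from the text.

```python
import ipaddress
from typing import Iterable


def client_ip(remote_addr: str, xff: str, trusted_proxies: Iterable[str]) -> str:
    """Pick the client address, honouring X-Forwarded-For only from trusted peers."""
    trusted = [ipaddress.ip_network(p, strict=False) for p in trusted_proxies]
    peer = ipaddress.ip_address(remote_addr)
    if not xff or not any(peer in net for net in trusted):
        return remote_addr                 # header ignored: peer is not trusted
    for hop in reversed([h.strip() for h in xff.split(",")]):
        try:
            ip = ipaddress.ip_address(hop)
        except ValueError:
            return remote_addr             # malformed header: fall back
        if not any(ip in net for net in trusted):
            return str(ip)                 # first untrusted hop from the right
    return remote_addr
```

With an empty trusted_proxies list the header is never consulted, which matches the default described above.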
Successful Verification
When a client passes the challenge:
- The IP is temporarily allowed through the firewall for 4 hours
- A verification cookie is set
- The IP is removed from the challenge list so the webserver stops sending that visitor to the challenge flow
Webserver Integration
The webserver integration redirects challenge-listed IPs to
challenge.public_url. The installer refuses to run until that URL is
an absolute http or https URL ending in /challenge, and the
configured challenge listener is non-loopback.
csm webserver-integration install # initial wire-up
csm webserver-integration upgrade # re-apply after a CSM upgrade
csm webserver-integration status # show detected stack + version drift
csm webserver-integration validate # run the webserver's configtest
csm webserver-integration remove # uninstall the snippet
The installer auto-detects the active webserver via
internal/platform. Supported stacks and snippet paths:
| Stack | Snippet path |
|---|---|
| cPanel + Apache (EasyApache) | /etc/apache2/conf.d/csm-challenge.conf |
| Debian/Ubuntu Apache | /etc/apache2/conf-enabled/csm-challenge.conf |
| RHEL family Apache (httpd) | /etc/httpd/conf.d/csm-challenge.conf |
| LiteSpeed (LSWS) | /usr/local/lsws/conf/templates/csm-challenge.conf |
| Nginx (plain + Engintron + phpanel) | /etc/nginx/conf.d/csm-challenge.conf |
The snippets are rendered from the effective CSM config. Apache and LSWS
read their RewriteMap from /run/csm/challenge_ips.txt; Nginx reads a
native map include from /run/csm/challenge_ips.nginx.map. Both live
outside the private state directory so the webserver user can read them.
CSM rewrites the Nginx include on challenge-list changes and reloads
Nginx only when the file content changes.
On every run, the installer:
- Writes the new snippet to a sibling temp file and renames it into place atomically.
- Runs the webserver’s own configtest (apachectl configtest, nginx -t, lswsctrl conftest).
- On pass: reloads the webserver gracefully and exits 0.
- On fail: restores the previous snippet bytes (or removes the file if it did not exist before) and exits non-zero with the captured configtest output. The webserver is never reloaded with a broken config.
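The write, test, reload-or-restore sequence above can be sketched as follows. This is a model of the steps, not the installer: the function names are hypothetical, and configtest and reload are caller-supplied callables standing in for the webserver's own commands.

```python
import os
import tempfile
from typing import Callable


def _atomic_write(path: str, data: bytes) -> None:
    """Write to a sibling temp file, then rename into place."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise


def install_snippet(path: str, content: bytes,
                    configtest: Callable[[], bool],
                    reload: Callable[[], None]) -> bool:
    previous = None
    if os.path.exists(path):
        with open(path, "rb") as fh:
            previous = fh.read()           # keep the old bytes for rollback
    _atomic_write(path, content)
    if configtest():
        reload()                           # reload only after a passing test
        return True
    if previous is None:
        os.remove(path)                    # snippet did not exist before
    else:
        _atomic_write(path, previous)      # restore the previous bytes
    return False
```

The rename keeps the webserver from ever reading a half-written snippet, and the reload callable is never invoked on the failure path.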
The snippet header carries a version marker; upgrade is a no-op when
the on-disk version matches the shipped version. Hand-edited files
(missing or mismatched marker) trip a “modified” status and the
installer refuses to overwrite them - remove or rename first.
Hosts with no detectable webserver exit with status=skipped so
package post-install hooks succeed cleanly on, e.g., a plain phpanel
worker that doesn’t run nginx locally.
Bypass Paths
Three opt-in bypass mechanisms let legitimate traffic skip the PoW page entirely. All default off; an upgraded csm.yaml with no new blocks behaves exactly as before.
CAPTCHA Fallback (JS-Disabled Visitors)
The PoW solver requires JavaScript. Visitors with JS off (older mobile browsers, accessibility tooling, text browsers, scripted integrations) would otherwise be locked out. When configured, CSM renders a Cloudflare Turnstile or hCaptcha widget inside a <noscript> block; on completion the form posts to /challenge/captcha-verify and CSM validates the token server-side against the provider’s siteverify endpoint.
Provider rejections do not spend the page nonce, so a visitor can retry the
same challenge page after a mistyped, expired, or failed widget response.
challenge:
captcha_fallback:
provider: turnstile # turnstile | hcaptcha | "" (off)
site_key: "0xAAAA..." # public key embedded in the widget
secret_key: "0xBBBB..." # verified server-side; never sent to client
timeout: 10s
Verified Operator Sessions
Operators who repeatedly hit the challenge during normal admin work can mint a signed cookie that bypasses PoW for the cookie’s TTL. The signing key is generated at daemon startup and rotates on every restart – old cookies stop working automatically.
challenge:
verified_session:
enabled: true
cookie_name: csm_admin_session # default
ttl: 4h # default
admin_secret: "long-shared-secret" # required
To issue a cookie, POST the secret to the challenge server:
curl -i -X POST -d "secret=long-shared-secret" \
https://your-host:8439/challenge/admin-token
# 204 No Content
# Set-Cookie: csm_admin_session=...; Path=/; HttpOnly; Secure; SameSite=Lax
The cookie binds to the requester’s IP, so a stolen cookie does not work from a different network.
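A signed, IP-bound cookie of this kind can be illustrated with an HMAC over the client IP and an expiry time. The sketch below is an assumption about the general technique, not CSM's cookie format: the payload layout, `mint_cookie`, and `verify_cookie` are all hypothetical.

```python
import base64
import hashlib
import hmac
import os
import time
from typing import Optional

# Regenerated on every process start, so cookies from an earlier run
# stop verifying (mirrors the restart behaviour described above).
SIGNING_KEY = os.urandom(32)


def mint_cookie(client_ip: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Return '<payload>.<signature>' binding the cookie to one IP."""
    issued = time.time() if now is None else now
    payload = f"{client_ip}|{int(issued + ttl_seconds)}".encode()
    sig = hmac.new(SIGNING_KEY, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode() + "."
            + base64.urlsafe_b64encode(sig).decode())


def verify_cookie(value: str, client_ip: str,
                  now: Optional[float] = None) -> bool:
    """Accept only an untampered, unexpired cookie from the bound IP."""
    try:
        payload_b64, sig_b64 = value.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
        bound_ip, expires = payload.decode().rsplit("|", 1)
        expires_at = int(expires)
    except (ValueError, UnicodeDecodeError):
        return False
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(sig, expected):
        return False
    current = time.time() if now is None else now
    return bound_ip == client_ip and current < expires_at
```

Keeping the key only in memory is what makes restart-time rotation automatic: there is nothing on disk to rotate or revoke.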
Verified Search Crawlers
Googlebot and Bingbot can be allow-passed by reverse-DNS forward-confirm. CSM looks up the visitor’s PTR, checks it ends in a known crawler suffix (e.g. .googlebot.com), then forward-resolves that name to confirm the original IP appears in the result. A spoofed User-Agent: Googlebot from a residential IP fails forward-confirm and falls through to PoW.
challenge:
verified_crawlers:
enabled: true
providers: [googlebot, bingbot]
cache_ttl: 15m
Positive results cache for cache_ttl; negative results cache for one-fifth that long so a transiently-broken resolver does not lock out a real crawler for the full window.
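The reverse-DNS forward-confirm check can be sketched as below. The suffix table reflects the hostnames the crawler operators publish, but the function name, table layout, and injectable resolver parameters are assumptions made for illustration; CSM's internals may differ.

```python
import ipaddress
import socket
from typing import Callable, Iterable, List

# Hostname suffixes per provider key (illustrative table layout).
CRAWLER_SUFFIXES = {
    "googlebot": (".googlebot.com", ".google.com"),
    "bingbot": (".search.msn.com",),
}


def _ptr_lookup(ip: str) -> str:
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(host: str) -> List[str]:
    return sorted({info[4][0] for info in socket.getaddrinfo(host, None)})


def is_verified_crawler(
    ip: str,
    providers: Iterable[str],
    ptr_lookup: Callable[[str], str] = _ptr_lookup,
    forward_lookup: Callable[[str], List[str]] = _forward_lookup,
) -> bool:
    """True only when PTR suffix and forward resolution both agree."""
    try:
        host = ptr_lookup(ip).rstrip(".").lower()
    except OSError:
        return False  # no PTR record: fall through to PoW
    suffixes = tuple(s for p in providers
                     for s in CRAWLER_SUFFIXES.get(p, ()))
    if not suffixes or not host.endswith(suffixes):
        return False
    try:
        target = ipaddress.ip_address(ip)
        # Forward-confirm: the PTR name must resolve back to the visitor.
        return any(ipaddress.ip_address(a) == target
                   for a in forward_lookup(host))
    except (OSError, ValueError):
        return False
```

The resolvers are parameters so the logic can be exercised without network access; the defaults use the standard `socket` module.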
Operational
Backups
csm store export and csm store import capture the bbolt store, state JSON files (baseline file hashes), and signature-rules cache into a single tar+zstd archive. Use these for re-provisioning, cluster cloning, and disaster recovery rather than re-baselining a 200k-file account tree from scratch.
csm store export /var/backups/csm-$(date +%F).csmbak
sha256sum -c /var/backups/csm-$(date +%F).csmbak.sha256
# transfer the .csmbak + .sha256 to the target host
systemctl stop csm
csm store import /var/backups/csm-2026-04-27.csmbak
systemctl start csm
Partial restore: --only=baseline restores only the file-hash state (useful after a full re-install where firewall and history should stay fresh); --only=firewall merges the firewall buckets into an existing daemon (useful for cloning blocklists across a cluster).
Performance Monitor
CSM monitors server performance metrics and generates findings when thresholds are exceeded.
Critical Checks (every 10 min)
| Check | What it monitors |
|---|---|
| perf_load | CPU load average vs core count (critical/high/warning thresholds) |
| perf_php_processes | PHP process count and total memory usage |
| perf_memory | Swap usage percentage and OOM killer activity |
Deep Checks (every 60 min)
| Check | What it monitors |
|---|---|
| perf_php_handler | PHP handler type (DSO vs CGI vs FPM) and configuration |
| perf_mysql_config | MySQL my.cnf settings (buffer pool, connections, query cache) |
| perf_redis_config | Redis memory limits, persistence, eviction policy |
| perf_error_logs | Error log file sizes (bloat detection) |
| perf_wp_config | WordPress wp-config.php hardening and debug settings |
| perf_wp_transients | WordPress database transient bloat |
| perf_wp_cron | WordPress cron scheduling (missed crons, excessive events) |
Web UI
The Performance page (/performance) shows real-time metrics:
- Server load and CPU usage
- PHP process and memory charts
- MySQL and Redis health
- WordPress performance indicators
The findings list also exposes admin-only fixes, per-row and as a Bulk fix dropdown that applies one fix to every matching finding at once:
- perf_error_logs: truncate a bloated error_log in place. The inode is preserved so running PHP processes keep writing to the same file.
- perf_wp_config: disable display_errors in .user.ini, php.ini, or .htaccess by commenting the matched line and appending an Off override.
- perf_wp_cron: add define('DISABLE_WP_CRON', true) to wp-config.php and install a per-user system cron that runs wp-cron.php on a fixed interval. Disabling WP-Cron alone would stop scheduled WordPress tasks, so the cron is installed in the account owner’s own crontab (visible and editable by the customer). The cron is installed before the define is written, so a crontab failure leaves WordPress scheduling unchanged. The define is inserted before the “stop editing” marker (or the wp-settings.php require); insertion points inside multiline comments or heredocs are ignored, and the fix refuses a wp-config.php with no safe insertion point rather than corrupt it.
These actions are limited to configured account roots, reject symlinks and unsupported file types, and remove the fixed row from the active findings view after a successful edit.
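Truncating in place, as the perf_error_logs fix does, differs from deleting and recreating the file: the inode stays the same, so processes holding the log open keep writing to it. A minimal sketch of that technique follows; `truncate_in_place` is a hypothetical helper, not the daemon's code, and it assumes a POSIX system where `os.O_NOFOLLOW` is available.

```python
import os
import stat


def truncate_in_place(path: str) -> int:
    """Empty a regular file without replacing it; return bytes reclaimed."""
    # O_NOFOLLOW makes open() fail on a symlink instead of following it;
    # O_NONBLOCK avoids hanging if the path turns out to be a FIFO.
    fd = os.open(path, os.O_WRONLY | os.O_NOFOLLOW | os.O_NONBLOCK)
    try:
        st = os.fstat(fd)
        if not stat.S_ISREG(st.st_mode):
            raise ValueError(f"{path} is not a regular file")
        os.ftruncate(fd, 0)  # same inode, length reset to zero
        return st.st_size
    finally:
        os.close(fd)
```

Writers that opened the log in append mode continue at the new end of file, which is how PHP error logging normally behaves.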
WP-Cron fix settings
Tune the WP-Cron remediation under Settings -> Performance:
- performance.wp_cron_fix.interval_minutes (default 5, range 1-60): how often the installed system cron runs wp-cron.php. 5 minutes balances scheduled-task responsiveness against the load that HTTP-triggered WP-Cron creates.
- performance.wp_cron_fix.php_bin (default empty = auto-detect): the PHP interpreter for the cron line. CLI php is used instead of an HTTP request so the job never ties up a web worker.
To let the daemon apply this fix automatically on every WP-Cron finding, set
auto_response.fix_wp_cron: true (default false; requires
auto_response.enabled: true). It is opt-in because it edits customer
wp-config.php files and crontabs.
MySQL telemetry auth
The MySQL panel runs mysql -e "SHOW STATUS LIKE 'Threads_connected'" from
the csm process. The client needs to authenticate against the local server,
and csm supports two setups out of the box:
- A ~/.my.cnf for the csm runtime user with credentials for a MySQL account that holds at least the PROCESS privilege. cPanel and CloudLinux ship /root/.my.cnf for the root user; csm running as root picks it up automatically.
- A unix-socket grant for the csm OS user, e.g. on Debian/Ubuntu MariaDB:
CREATE USER 'root'@'localhost' IDENTIFIED VIA unix_socket;
GRANT PROCESS ON *.* TO 'root'@'localhost';
If neither is configured, the MYSQL card renders n/a / n/a instead of a
misleading 0 conn. csm makes no attempt to connect over TCP or store
credentials on its own.
Redis telemetry auth
The Redis panel connects to local Redis at 127.0.0.1:6379. If Redis
requires a password, set REDISCLI_AUTH in the csm daemon environment.
The dashboard uses that password for its in-process Redis client.
API
GET /api/v1/performance Current performance metrics snapshot
POST /api/v1/perf/fix-error-log
POST /api/v1/perf/fix-display-errors
POST /api/v1/perf/fix-wp-cron
Web UI
HTTPS dashboard with polling-based live updates (10s feed, 60s stats). Dark/light theme toggle.
Navigation
The sidebar groups pages by operator workflow. URLs are stable; the groups only reorder visibility:
- Overview - Dashboard
- Triage - Incidents, Findings (Active and History tabs)
- Response - Firewall, Quarantine, Cleanup, Email, ModSecurity, Threat Intel
- Operations - Performance, Hardening, Rules, ModSec Rules, Audit
- Configuration - Settings
Sidebar group expand/collapse state is saved in the browser. On
viewports under 992px the sidebar collapses into a top-bar drawer
toggled from the hamburger button. Account detail (/account) is
hidden from the sidebar; it is reached from finding rows, incident
detail, and Threat Intel result panels. Read-scope sessions hide
admin-only navigation entries such as Configuration and ModSec Rules.
Pages
| Page | URL | Purpose |
|---|---|---|
| Dashboard | /dashboard | Triage queue, daemon status strip, Components matrix, system posture, 24h stats, recent activity, accounts at risk, auto-response summary, brute-force summary, timeline charts |
| Findings | /findings | Active findings with search, check/account filters, header grouping toggle, detail panel, fix/dismiss/suppress actions, sticky bulk operations, modal account scan |
| Findings > History | /findings?tab=history | Paginated archive of all findings with date range and severity filters, CSV export |
| Quarantine | /quarantine | Quarantined files with content preview, restore capability |
| Cleanup | /cleanup-history | File pre-clean backups and DB-object backups with preview and restore controls |
| Firewall | /firewall | Subview-tabbed page (?view=overview/lookup/blocks/allow/config/audit/danger): blocked IPs/subnets with GeoIP, whitelist management, search, audit log; destructive actions live under the Danger tab |
| ModSecurity | /modsec | WAF workbench: status strip, Active WAF pressure summary list (top attackers by hits), top rules / domains side panel, and Blocked IPs / Events / Rules tabs. Block detail panels show first-seen, top URIs, sample events, and direct links to Threat Intel, Firewall lookup, and rule management |
| ModSec Rules | /modsec/rules | Per-rule management, overrides, escalation control |
| Email | /email | Email workbench: status strip (queue, frozen, oldest, AV, group counts), grouped action rows on the left (compromised, spam outbreak, auth failure, queue, malware), Mail protection state on the right, and Findings / Auth failures / Queue / Quarantine / Senders / Forwarders / Deliverability / Outbound abuse tabs below. Queue breaks the spool into real mail vs null-sender bounce backscatter (frozen count, oldest age, top stuck recipients) and flushes frozen backscatter in one click without touching real or retrying mail. Forwarders lists cPanel forwarders – destination provider, owner, and whether a local copy is also kept – so off-server relays to free providers are visible at a glance; held forward copies appear here to release or delete. Enforce mode currently holds null-sender backscatter and bad-sender-IP copies before external relay while the local copy still delivers. Deliverability shows which providers are throttling the server, the affected sending IPs, and each provider’s stated reason. Outbound abuse lists recent PHP-mail relay detections (spam outbreaks from one source IP across many sites, high-volume scripts or accounts) with the contributing site/script breakdown and a one-click 24h block. |
| Threat Intel | /threat | IP lookup with scoring/GeoIP/ASN, top attackers, attack type charts, trends |
| Hardening | /hardening | On-demand hardening audit, stored report, score, and remediation guidance |
| Incidents | /incident | Correlated incident list with detail panel plus forensic timeline search by IP or account |
| Rules | /rules | YAML/YARA rule management, suppressions, state export/import, test alerts |
| Account | /account | Per-account analysis: findings, quarantine, history, on-demand scan |
| Audit | /audit | System-wide action log with search, action and date filters, URL state, and export |
| Performance | /performance | Server load, PHP processes, MySQL, Redis, WordPress metrics |
| Settings | /settings | Searchable config editor with grouped large sections, field-level validation errors, restart notices, redacted secret updates, and firewall tentative apply with rollback timer |
Security
- Authentication - Bearer token (header or HttpOnly/Secure/SameSite=Strict cookie)
- CSRF - HMAC-derived token on all POST mutations
- Headers - X-Frame-Options DENY, Content-Security-Policy, HSTS, nosniff
- TLS - Auto-generated self-signed certificate
- Rate limiting - 5 login attempts/min, 600 API requests/min per IP
- Bearer auth skips CSRF (for API-to-API calls)
Keyboard Shortcuts
General
| Key | Action |
|---|---|
| ? | Show shortcut help |
| / | Focus search input |
| Ctrl-K / Cmd-K | Open command palette |
Navigate
| Key | Action |
|---|---|
| g d | Go to Dashboard |
| g f | Go to Findings |
| g h | Go to Findings > History tab |
| g t | Go to Threat Intel |
| g r | Go to Rules |
| g b | Go to Blocked IPs (Firewall) |
Findings page
| Key | Action |
|---|---|
| j / k | Move selection down/up |
| d | Dismiss selected finding |
| f | Fix selected finding |
WHM Plugin
CSM installs a WHM plugin (addon_csm.cgi) that redirects operators from WHM to the daemon Web UI. After the redirect, API calls are same-origin requests to the daemon.
API Reference
Machine-readable HTTPS API. All endpoints require token authentication. State-changing POST, PUT, PATCH, and DELETE requests require CSRF protection for browser cookie sessions.
Authentication
# Bearer token (header)
curl -H "Authorization: Bearer YOUR_TOKEN" https://server:9443/api/v1/status
# Cookie-based (after login)
curl -b "csm_auth=YOUR_TOKEN" https://server:9443/api/v1/status
Cookie-authenticated state-changing requests require the X-CSRF-Token header (obtained from the login response or page meta tag). Admin-scope Bearer requests are CSRF-exempt because the Authorization header is the write credential.
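The two authentication modes and the CSRF rule can be summarised in a small client-side helper. This is a sketch for orchestrator scripts; `auth_headers` is a hypothetical function name, while the header names, cookie name, and CSRF rule are taken from the text above.

```python
from typing import Dict, Optional

STATE_CHANGING = {"POST", "PUT", "PATCH", "DELETE"}


def auth_headers(token: str, method: str = "GET",
                 use_cookie: bool = False,
                 csrf_token: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for Bearer or cookie authentication."""
    if not use_cookie:
        # Bearer requests are CSRF-exempt: the header is the credential.
        return {"Authorization": f"Bearer {token}"}
    headers = {"Cookie": f"csm_auth={token}"}
    if method.upper() in STATE_CHANGING:
        if not csrf_token:
            raise ValueError("cookie sessions need X-CSRF-Token on writes")
        headers["X-CSRF-Token"] = csrf_token
    return headers
```

The returned dictionary can be passed as `headers=` to `urllib.request.Request` or any other HTTP client.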
Token scopes
Configure tokens under webui.tokens: with a scope of admin or read:
webui:
tokens:
- name: "operator"
token: "..."
scope: admin # full read+write
- name: "panel-readonly"
token: "..."
scope: read # status, findings, history, stats, blocked IPs, health, components, capabilities, SSE
The legacy single-token webui.auth_token: is migrated automatically to a legacy-auth-token admin entry on first start. Read-scope tokens are intended for orchestrators and dashboards that consume status, findings, history, stats, blocked-IP summaries, health, components, capabilities, and SSE events. Admin scope is still required for write routes and for sensitive reads such as quarantine, settings, firewall internals, threat-intel detail, rules, ModSecurity, account detail, exports, incident timelines, and audit history. metrics_token: is a separate, read-only credential for /metrics only.
Status & Data
GET /api/v1/status Full health snapshot: version, uptime, watchers, severity counts,
store health, blocklist size, capabilities[], config_hash, binary_hash,
automation rollout state, challenge pending count, rollback state.
`latest_scan` is the canonical last-scan timestamp; `last_scan_time`
is a legacy alias kept for older clients and will be removed.
GET /api/v1/capabilities Static feature list (e.g. `confd.dropins.v1`, `events.sse.v1`,
`webhook.phpanel.v1`, `webui.prefs.v1`, `webui.undo.v1`,
`mail.queue.composition.v1`). Use for orchestrator feature-detect.
GET /api/v1/components Watcher/component matrix with attachment, event, and upstream freshness state.
GET /api/v1/events Server-Sent Events stream of findings as they dispatch.
Read-scope token sufficient. One JSON event per `data:` line.
GET /api/v1/health Daemon health (fanotify, watchers, engines)
GET /api/v1/findings Current active findings
GET /api/v1/findings/enriched Enriched findings with GeoIP, accounts, fix info
GET /api/v1/finding-detail Finding detail with action history (?check=&message=)
GET /api/v1/history Paginated history (?limit=&offset=&from=&to=&severity=&search=)
GET /api/v1/history/csv CSV export (up to 5,000 entries)
GET /api/v1/stats 24h severity counts, accounts at risk, auto-response summary
GET /api/v1/stats/trend 30-day daily severity counts
GET /api/v1/stats/timeline Event timeline
GET /api/v1/quarantine Quarantined files with metadata (incl. htaccess pre_clean backups)
GET /api/v1/quarantine-preview Preview quarantined file content (?id=)
GET /api/v1/db-object-backups db_object_backups bucket (MySQL trigger/event/procedure/function drops)
GET /api/v1/db-object-backup-preview Preview captured CREATE SQL (?key=)
GET /api/v1/blocked-ips Blocked IPs with reason and expiry
GET /api/v1/accounts cPanel account list
GET /api/v1/account Per-account findings, quarantine, history (?name=)
GET /api/v1/audit UI audit log
GET /api/v1/export Export state (suppressions, whitelist)
GET /api/v1/incident Incident timeline (?ip=&account=&hours=)
GET /api/v1/performance Performance metrics snapshot
POST /api/v1/perf/fix-error-log Truncate a fixed-row error_log finding
POST /api/v1/perf/fix-display-errors
Disable display_errors for a fixed-row config finding
GET /api/v1/hardening Last stored hardening audit report
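Since /api/v1/events emits one JSON event per data: line, a consumer only needs a small line filter. The sketch below works on any iterable of text lines (for example a streamed HTTP response body); `iter_findings` is a hypothetical helper, and the field names in the usage note are illustrative rather than a documented schema.

```python
import json
from typing import Dict, Iterable, Iterator


def iter_findings(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one decoded finding per SSE 'data:' line."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # comments, keep-alives, and blank separators
        payload = line[len("data:"):].lstrip(" ")
        if payload:
            yield json.loads(payload)
```

For example, feeding it `[": keep-alive\n", 'data: {"check": "webshells"}\n', "\n"]` yields a single dictionary with the key `check`.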
GeoIP
GET /api/v1/geoip IP geolocation (?ip=&detail=1)
POST /api/v1/geoip/batch Batch GeoIP lookup (JSON array of IPs)
Threat Intelligence
GET /api/v1/threat/stats Attack stats, type breakdown, hourly trend
GET /api/v1/threat/top-attackers Top attacking IPs with GeoIP (?limit=)
GET /api/v1/threat/ip IP threat lookup (?ip=)
GET /api/v1/threat/events IP event history (?ip=&limit=)
GET /api/v1/threat/whitelist Whitelisted IPs
GET /api/v1/threat/db-stats Attack database statistics
POST /api/v1/threat/block-ip Block IP permanently
POST /api/v1/threat/whitelist-ip Permanent whitelist
POST /api/v1/threat/temp-whitelist-ip Temporary whitelist (with expiry)
POST /api/v1/threat/clear-ip Clear IP from attack database
POST /api/v1/threat/unwhitelist-ip Remove from whitelist
POST /api/v1/threat/bulk-action Bulk block/clear/whitelist across many IPs
Firewall
GET /api/v1/firewall/status Config, blocked/allowed counts
GET /api/v1/firewall/allowed Whitelisted IPs
GET /api/v1/firewall/subnets Blocked subnets
GET /api/v1/firewall/audit Firewall audit log
GET /api/v1/firewall/check Check if IP is blocked (?ip=)
POST /api/v1/block-ip Block an IP
POST /api/v1/unblock-ip Unblock an IP
POST /api/v1/unblock-bulk Bulk unblock IPs
POST /api/v1/firewall/allow-ip Allow an IP
POST /api/v1/firewall/remove-allow Remove IP from allow list
POST /api/v1/firewall/deny-subnet Block subnet
POST /api/v1/firewall/remove-subnet Remove subnet block
POST /api/v1/firewall/flush Clear all blocks
POST /api/v1/firewall/unban Unblock IP + flush cphulk
POST /api/v1/firewall/cphulk-clear Flush cphulk bans only
ModSecurity
GET /api/v1/incidents/groups Roll up open/contained incidents by (kind, source) so credential spray collapses into one row per attacker IP. Read scope. Accepts ?status=active|all|open|contained|resolved|dismissed, ?kind=, ?limit=.
GET /api/v1/modsec/stats WAF statistics (read scope)
GET /api/v1/modsec/blocks Blocked requests log, aggregated per IP (read scope)
GET /api/v1/modsec/events WAF event details (read scope)
GET /api/v1/modsec/rules Loaded rules list
POST /api/v1/modsec/rules/apply Apply custom rules
POST /api/v1/modsec/rules/escalation Change rule severity/action
Rules & Suppressions
GET /api/v1/rules/status YAML/YARA rule counts, version
GET /api/v1/rules/list Rule files
GET /api/v1/suppressions Suppression rules
POST /api/v1/rules/reload Reload signature rules from disk
POST /api/v1/suppressions Add or delete suppression rule
POST /api/v1/rules/modsec-escalation ModSec escalation override
Email
GET /api/v1/email/stats Email scanning statistics
GET /api/v1/email/forwarders Mail forwarder inventory with destination providers and local-copy flags (read scope)
GET /api/v1/email/deferrals Outbound deferral rollup by provider and sending IP with reason codes, parsed from exim_mainlog (read scope)
GET /api/v1/email/queue-composition Mail queue makeup: real vs null-sender bounce backscatter, frozen count, oldest age, top stuck recipients (read scope)
POST /api/v1/email/queue/flush-backscatter Remove only frozen null-sender (backscatter) messages from the exim queue on cPanel hosts; returns removed count or 503 when unavailable (admin scope, CSRF)
GET /api/v1/email/held Forward copies held by the forward guard (admin scope)
POST /api/v1/email/held/{id}/release Re-inject a held forward copy to its external recipient (admin scope, CSRF)
DELETE /api/v1/email/held/{id} Discard a held forward copy (admin scope, CSRF)
GET /api/v1/email/groups Server-grouped action rows (kind=compromised_account|spam_outbreak|auth_failure|queue_alert|malware) with from/to/limit (read scope)
GET /api/v1/email/relay-abuse Outbound PHP-mail abuse detections (spam outbreaks, high-volume scripts/accounts) with per-site script breakdown; from/to/limit (read scope)
GET /api/v1/email/quarantine Quarantined email list
GET /api/v1/email/av/status Email AV watcher status
POST /api/v1/email/quarantine/ Release or delete quarantined email
Hardening
GET /api/v1/hardening Load last hardening audit report
POST /api/v1/hardening/run Run hardening audit and save report
Actions
POST /api/v1/fix Apply fix for a finding
POST /api/v1/fix-bulk Bulk fix multiple findings
POST /api/v1/dismiss Dismiss a finding
POST /api/v1/scan-account On-demand account scan
POST /api/v1/quarantine-restore Restore quarantined file
POST /api/v1/quarantine/bulk-delete Bulk-delete quarantined files
POST /api/v1/db-object-backup-restore Restore a dropped MySQL object from its db_object_backups record
POST /api/v1/test-alert Send test alert through all channels
POST /api/v1/import Import state bundle (suppressions, whitelist)
Settings
GET /api/v1/settings List editable config sections
GET /api/v1/settings/<section> Read a config section (secrets redacted)
POST /api/v1/settings/<section> Update a config section (safe fields reload, restart fields queue)
POST /api/v1/settings/restart Request a daemon restart (after editing restart-required fields)
POST /api/v1/settings/firewall/tentative-apply Save firewall config, restart, and arm rollback timer
GET /api/v1/settings/firewall/rollback Read pending rollback state
POST /api/v1/settings/firewall/confirm Confirm tentative firewall changes
POST /api/v1/settings/firewall/revert Revert tentative firewall changes now
Sections map to top-level config keys: alerts, auto_response, challenge, reputation, performance, infra_ips, sentry, etc. Writes persist to csm.yaml, re-sign the integrity hash, and hot-reload where possible; restart-required changes are queued for /api/v1/settings/restart. Invalid field values return 422 and do not touch disk. Firewall tentative apply is restart-class by design: it snapshots the previous config, writes the new one, restarts the daemon, and auto-reverts unless the operator confirms before the timer expires.
Operator preferences
Per-operator state (UI density, timestamp display, default auto-refresh,
saved filter views) is keyed server-side by SHA-256 of the auth token,
so preferences follow the operator across browsers and devices without
the daemon ever storing the raw credential. Capability flag:
webui.prefs.v1. These endpoints require admin scope because they read
or mutate operator-private UI state.
GET /api/v1/prefs/user Read this operator's UI preferences
PUT /api/v1/prefs/user Replace the prefs blob (CSRF on cookie sessions)
GET /api/v1/prefs/views List saved views; `?page=findings` filters by page
PUT /api/v1/prefs/views Upsert one view {page, name, params} (CSRF on cookie sessions)
DELETE /api/v1/prefs/views Delete one view {page, name} (CSRF on cookie sessions)
Response shape for GET /api/v1/prefs/user:
{
"density": "comfortable",
"timezone": "local",
"auto_refresh": "on",
"table_columns": { "findings-table": ["check","severity","when"] }
}
density is comfortable or compact. timezone is server, local,
or an IANA-shaped zone string (e.g. Europe/Bucharest). auto_refresh
is on or off. Server-side sanitisation drops any other value. Unset
prefs encode as empty strings; the UI applies comfortable, local, and
on defaults.
Response shape for GET /api/v1/prefs/views:
[
{
"name": "Critical SSH",
"page": "findings",
"params": { "severity": "critical", "check": "smtp_bruteforce" },
"updated": 1779743255
}
]
Saved views are operator-scoped and capped at 200 per operator. The saved
view collection is stored as one 64 KiB preference blob. page and
params keys must be simple identifiers: ASCII letters, digits,
underscore, hyphen, or dot, up to 64 bytes. Each view has at most 32
params, and param string values are capped at 256 bytes. name must be
1-80 bytes with no control characters. PUT and DELETE return
{"status":"ok"} on success.
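A client can pre-check a view against these limits before sending the PUT. The validator below is a sketch derived only from the limits listed above; `validate_view` is a hypothetical name, and treating "control characters" as Unicode category Cc is an assumption about the server's definition.

```python
import re
import unicodedata
from typing import Dict, List

_KEY_RE = re.compile(r"[A-Za-z0-9_.-]+")


def _is_key(value: object) -> bool:
    # Simple identifier: ASCII letters, digits, '_', '-', '.', <= 64 bytes.
    return (isinstance(value, str)
            and _KEY_RE.fullmatch(value) is not None
            and len(value.encode("utf-8")) <= 64)


def validate_view(view: Dict) -> List[str]:
    """Return a list of problems; an empty list means the view looks valid."""
    errors: List[str] = []
    name = view.get("name", "")
    if (not isinstance(name, str)
            or not 1 <= len(name.encode("utf-8")) <= 80
            or any(unicodedata.category(c) == "Cc" for c in name)):
        errors.append("name must be 1-80 bytes with no control characters")
    if not _is_key(view.get("page", "")):
        errors.append("page must be a simple identifier")
    params = view.get("params", {})
    if len(params) > 32:
        errors.append("at most 32 params are allowed")
    for key, value in params.items():
        if not _is_key(key):
            errors.append(f"param key {key!r} is not a simple identifier")
        if isinstance(value, str) and len(value.encode("utf-8")) > 256:
            errors.append(f"param {key!r} value exceeds 256 bytes")
    return errors
```

Lengths are measured in UTF-8 bytes rather than characters, matching the byte-based limits in the text.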
Bulk-action undo
Bulk threat block / whitelist and bulk firewall unblock responses return
an undo_token when the daemon queues an inverse operation server-side
for 30 seconds. The UI surfaces a banner with the same TTL; CLI callers
can act on the token through the endpoints below. Each successful undo
writes an undo_<original_action> audit entry. Capability flag:
webui.undo.v1. These endpoints require admin scope because they read
or mutate operator-private action state.
GET /api/v1/undo/pending Latest pending undo entry for this operator (empty object if none)
POST /api/v1/undo/run Consume an entry and dispatch its inverse {id}; empty id uses latest
Non-empty response shape for GET /api/v1/undo/pending:
{
"id": "188d1f2a6c8b0000",
"action": "threat_bulk_block",
"inverse": "threat_bulk_unblock",
"summary": "Blocked 2 IPs",
"recorded_at": "2026-05-26T00:07:09Z",
"expires_at": "2026-05-26T00:07:39Z"
}
POST /api/v1/undo/run returns {status, action, inverse, count} on
success, or 410 Gone when the entry is missing, already consumed, or
past its 30-second TTL. Recognised inverse action keys are
threat_bulk_unblock, threat_bulk_block, threat_bulk_unwhitelist,
threat_bulk_whitelist, and firewall_bulk_reblock. Other bulk actions
(quarantine delete, generic fix) do not surface an undo token because
they have no clean inverse.
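CLI callers may want to know how much of the 30-second window is left before calling /api/v1/undo/run. The sketch below computes that from the pending entry's expires_at field; `seconds_left` is a hypothetical helper, and it assumes timestamps use the UTC "Z" form shown in the example response.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def seconds_left(pending: Dict, now: Optional[datetime] = None) -> float:
    """Seconds until the pending undo entry expires (0.0 if none or past)."""
    if not pending:
        return 0.0  # the endpoint returns an empty object when idle
    expires = datetime.fromisoformat(
        pending["expires_at"].replace("Z", "+00:00"))
    current = now if now is not None else datetime.now(timezone.utc)
    return max(0.0, (expires - current).total_seconds())
```

A result of 0.0 means a subsequent undo call would be expected to return 410 Gone, so the caller can skip the request.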
Finding fields
Every finding in /api/v1/findings, /api/v1/events, and the JSONL audit log carries optional correlation fields when CSM can attribute them:
| Field | Meaning |
|---|---|
| tenant_id | Tenant attribution from the verdict callback or panel-side webhook reply |
| domain | Domain associated with the event (e.g. PHP-relay scriptKey host, mailbox domain) |
| mailbox | Mailbox attribution (e.g. mail brute-force target, PHP-relay envelope-from) |
| relay_total | PHP-relay trigger count for the path that fired |
| relay_breakdown | PHP-relay script samples that contributed to the alert, with script key, hit count, last seen time, and a bounded sample subject when available |
Fields are omitted when the daemon could not attribute them. Orchestrators should treat absence as “unknown,” not “global.”
Cleanup fields
GET /api/v1/quarantine also powers the Cleanup page’s file-backup list. Entries include:
| Field | Meaning |
|---|---|
| kind | quarantine or pre_clean |
| live_state | original_missing, live_differs, original_not_file, archive_missing, archive_not_file, or unknown. Byte-identical restored entries are hidden. |
GET /api/v1/db-object-backups returns restored and restored_at when a captured MySQL trigger/event/procedure/function backup has already been replayed.
Incidents
GET /api/v1/incidents
Returns every incident (open, contained, resolved, dismissed) sorted by
updated_at descending.
GET /api/v1/incidents/<id>
Returns one incident by id. 404 if not found.
POST /api/v1/incidents/<id>/status
Body:
{"status": "resolved", "details": "operator-marked"}
Status values: open, contained, resolved, dismissed. Closing an
incident (resolved/dismissed) means future findings for the same
correlation key start a fresh incident. Reopening an incident binds the
same key again. Incident JSON includes correlation_key when CSM has a
stored account, mailbox, domain, process, or remote-IP key.
Metrics (Prometheus)
CSM exposes a /metrics endpoint on its HTTPS web UI port
(default 9443). The endpoint serves the Prometheus text exposition
format (Content-Type: text/plain; version=0.0.4) and is safe to
scrape every 15 seconds.
“Available metrics” below is the shipped set. New call sites are
instrumented in ongoing releases; check CHANGELOG.md under
## [Unreleased] for the latest additions.
Enabling
Metrics are on whenever webui.enabled: true is set in csm.yaml.
The endpoint has its own auth knob:
webui:
enabled: true
auth_token: "<UI login token>"
metrics_token: "<long random string for Prometheus scraper>"
metrics_token is optional. When set, a Bearer header containing
this exact value unlocks /metrics. The UI auth_token or a valid
UI session cookie is also accepted so the dashboard can self-scrape,
but keeping the two tokens separate is recommended: rotating
auth_token does not then break Prometheus scraping, and giving
your monitoring stack the scrape token does not also give it UI
access.
Prometheus scrape config
scrape_configs:
- job_name: csm
scheme: https
tls_config:
# CSM serves a self-signed cert by default; either skip
# verification here or pin the CA you chose.
insecure_skip_verify: true
authorization:
type: Bearer
credentials: "<metrics_token from csm.yaml>"
static_configs:
- targets:
- csm-host-1.example.internal:9443
- csm-host-2.example.internal:9443
A complete, validated version of this snippet (with global: block)
ships as docs/src/examples/prometheus-scrape.yml. The CI pipeline
runs promtool check config against that file in the promtool-check
job; if the example ever stops validating, the pipeline fails.
Quick check
curl -sk -H "Authorization: Bearer $METRICS_TOKEN" \
https://localhost:9443/metrics | head
Available metrics
Build / process
- csm_build_info{version} (gauge, always 1): build metadata. Scrape once to discover the running version. Join on it in queries via group_left(version).
YARA-X worker (default-on; off only if signatures.yara_worker_enabled: false)
- csm_yara_worker_restarts_total (counter): cumulative number of times the supervisor has restarted the csm yara-worker child. Alert on sustained growth: a single restart is routine (rule deploys), a steady climb means the worker is crash-looping and real-time YARA scans are degraded.
Findings
- csm_findings_total{severity} (counter): every finding CSM records is counted here. Severities are CRITICAL, HIGH, and WARNING (matching the alert.Severity enum). Use rate(...) for arrival velocity; watch for sudden CRITICAL spikes.
Alert delivery
- csm_alert_dispatch_failures_total (counter): alert channel sends that failed after CSM detected findings. Counts email, webhook, and phpanel delivery failures. Sustained growth means findings are not reaching operators; check SMTP, webhook reachability, and credentials.
State
- csm_store_size_bytes (gauge): on-disk size of the bbolt state database (/var/lib/csm/state/csm.db by default). Enable the retention: block to bound logical growth and run csm store compact during maintenance to reclaim freelisted pages; without either, this gauge only climbs.
Fanotify realtime monitor
- csm_fanotify_queue_depth (gauge): current number of queued events waiting for the analyzer pool. The queue capacity is 4000; sustained values near that cap mean drops are imminent. Alert target: max_over_time(csm_fanotify_queue_depth[5m]) > 3500.
- csm_fanotify_events_dropped_total (counter): cumulative events dropped because the analyzer queue was full. The reconcile pass still rescans drop-affected directories 60 s later, so dropped events do not disappear from detection – they arrive delayed. Alert target: rate(csm_fanotify_events_dropped_total[5m]) > 0 paired with a short for-clause.
- csm_fanotify_reconcile_latency_seconds (histogram): how long the post-overflow reconcile pass takes to walk drop-affected directories and rescan recent files. Buckets: 0.01 s .. 60 s. Watch p95: reconcile stealing tens of seconds means bulk events are piling up faster than the walker can keep up.
- csm_checks_domlog_discovery_dropped_total{reason} (counter): per-vhost access-log paths the WP brute-force domlog discovery helper dropped before scanning. Labels: reason is evalsymlinks_error (broken symlink, attacker-removed log file) or stat_error (file vanished between glob and stat, permission regression on the log directory). Steady growth means a real chunk of vhosts is being silently skipped each cycle. Stale-mtime drops are intentional filtering and are NOT counted here.
- csm_realtime_content_scan_truncated_total{check} (counter): cumulative real-time content checks where the underlying file was larger than the main read window, so the full-rule pass saw only the leading window. The read cap protects RE2 cost on huge files; sustained growth on a label means full-rule coverage is capped on large files. Labels currently emitted: phpcontent_inline (known webshell filename), phpcontent_uploads (PHP in uploads), php_check (generic PHP content scan), crontab (per-user /var/spool/cron write), htaccess (per-vhost .htaccess write), user_ini (per-vhost .user.ini write), html_phishing (HTML phishing heuristic), and cgi_backdoor (CGI backdoor heuristic). Compare against finding history for webshell_content_realtime (or the matching check name for non-PHP labels) to judge whether a raised cap would surface real findings.
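For hosts without a Prometheus server, the two fanotify alert targets above can be approximated by comparing consecutive /metrics scrapes. The sketch below is a rough stand-in, not a replacement for the PromQL rules: `parse_samples` and `fanotify_warnings` are hypothetical helpers, the parser handles only plain sample lines, and the 3500 threshold is copied from the alert target.

```python
from typing import Dict, List

QUEUE = "csm_fanotify_queue_depth"
DROPPED = "csm_fanotify_events_dropped_total"


def parse_samples(text: str) -> Dict[str, float]:
    """Map 'name{labels}' to its value from Prometheus text exposition."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip HELP/TYPE comments and blank lines
        if "}" in line:
            series, _, rest = line.rpartition("}")
            series += "}"
        else:
            series, _, rest = line.partition(" ")
        fields = rest.split()
        if fields:
            samples[series] = float(fields[0])  # ignore optional timestamp
    return samples


def fanotify_warnings(prev: Dict[str, float],
                      curr: Dict[str, float]) -> List[str]:
    """Compare two scrapes and report queue pressure or new drops."""
    warnings: List[str] = []
    if curr.get(QUEUE, 0.0) > 3500:
        warnings.append("queue depth is near the 4000-event cap")
    if curr.get(DROPPED, 0.0) > prev.get(DROPPED, 0.0):
        warnings.append("events were dropped since the previous scrape")
    return warnings
```

Unlike max_over_time, this sees only the instant of each scrape, so short spikes between scrapes go unnoticed.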
Periodic check runner
- csm_checks_crontab_base64_truncated_total (counter): crontab base64 candidates that exceeded the per-blob decode cap before decoded-content pattern matching ran. Sustained growth means encoded cron content is larger than the scanner currently inspects; review affected crontabs and tune the scanner before redeploying.
- csm_check_duration_seconds{name,tier} (histogram): wall-clock time each check takes to complete. Label name is one of the 62 checks (fake_kernel_threads, webshells, …); label tier is critical, deep, or all. Buckets: 0.01 s .. 900 s. Most checks keep the 300 s timeout ceiling; heavy filesystem checks can run up to 900 s. Useful aggregations:

  # p95 per check in the critical tier; the largest series is the slowest check:
  histogram_quantile(0.95, sum by (le, name) (rate(csm_check_duration_seconds_bucket{tier="critical"}[10m])))
  # total time each cycle spends in deep-tier checks:
  sum by (tier) (rate(csm_check_duration_seconds_sum{tier="deep"}[1h]))
Threat intelligence
Registered when reputation.upstream.enabled: true.
- csm_threatintel_cache_hits_total (counter): upstream threat-intel lookups served from CSM’s local per-IP cache.
- csm_threatintel_cache_misses_total (counter): upstream threat-intel lookups not served from the local cache. A miss may still fail open without an HTTP request when the breaker is open.
- csm_threatintel_backend_failures_total (counter): upstream backend failures from network errors, non-200 responses, malformed JSON, response IP mismatches, or out-of-range scores.
- csm_threatintel_breaker_open (gauge): 1 while the upstream circuit breaker is refusing calls, 0 when closed or allowing its single cooldown probe.
Firewall
- csm_blocked_ips_total (gauge): number of IPs currently on the firewall block list. Excludes expired temp bans – the store’s LoadFirewallState filters those before the gauge reads.
- csm_firewall_rules_total (gauge): total firewall rules across all four categories (blocked IPs, allowed IPs, blocked subnets, port-specific allows). Excludes expired temp blocks and allow-list rows. Sudden drops are worth investigating; expected drops happen when temporary block or allow deadlines pass.
Config reloads
- csm_config_reloads_total{result} (counter): SIGHUP reload attempts, by outcome. Labels: result is one of:
  - success – safe fields swapped in place, integrity hash re-signed, live config updated.
  - restart_required – one or more fields that need a full restart changed; live config unchanged.
  - error – YAML parse failure, validation failure, or re-sign failure; live config unchanged.
  - noop – file edit produced no semantic change (identical values, whitespace edit, etc.).
  Alert target: rate(csm_config_reloads_total{result="error"}[5m]) > 0 paired with a short for-clause.
Auto-response
- csm_auto_response_actions_total{action} (counter): every auto-response action fired, by class. Labels: action is kill, quarantine, or block. Incremented once per finding the corresponding Auto* helper produces, so a batch blocking four IPs in one cycle adds 4 to action=block. Useful for detecting response storms: rate(csm_auto_response_actions_total[5m]).
Retention (when retention.enabled: true)
- csm_retention_sweeps_total (counter): number of retention sweep cycles completed since daemon start. A flatline after a restart means the sweep goroutine is not scheduling; a healthy daemon increments this on every sweep_interval tick.
- csm_retention_deleted_total (counter): cumulative entries deleted across the history, attacks:events, and reputation buckets. Spikes on the first sweep after enabling retention (initial backlog), then settles to the steady-state churn. Useful for estimating when the file might benefit from a csm store compact maintenance window.
PHP-relay (email abuse, cPanel only)
All series are prefixed csm_php_relay_. Registered when email_protection.php_relay.enabled: true and the host is cPanel; otherwise zero across the board. See Real-time detection.
- csm_php_relay_findings_total{path} (counter): findings emitted per detection path. Labels: path is one of header, volume, volume_account, fanout (and later baseline, reputation for Stages 2-3). Use rate(...) to spot detection storms; a sudden rate jump on header typically means a contact-form vulnerability is being exploited, on volume_account typically means an account password was leaked.
- csm_php_relay_actions_total{action,result} (counter): auto-freeze invocations attempted. Labels: action is currently freeze; result is ok or fail. Pair with csm_php_relay_findings_total to confirm freeze keeps up with detection.
- csm_php_relay_action_gone_total (counter): messages already absent from the spool by the time exim -Mf ran. Normal queue churn; not a failure. Sustained growth means the spool is moving fast and the freezer is racing the queue runner.
- csm_php_relay_path_skipped_total{path,reason} (counter): path evaluation that bailed before producing a finding. Labels: path matches the finding labels above; reason enumerates the gate that fired (e.g. ignore-list match, missing scriptKey).
- csm_php_relay_spool_scan_fallbacks_total{reason} (counter): AutoFreeze fell back to a full spool walk to find msgIDs. Labels: reason is capped (the in-memory activeMsgs per script hit its cap, so a fresh disk walk was needed) or reputation (a late reputation finding arrived for a script with no live activeMsgs). Sustained growth on capped means a single script is firing faster than the in-memory window keeps state for; consider raising header_score_volume_min or adding an ignore.
- csm_php_relay_active_msgs_capped_total (counter): per-script activeMsgs set hit its cap and dropped the oldest entry. Counts the eviction event itself; the next freeze for that script will land in csm_php_relay_spool_scan_fallbacks_total{reason="capped"}.
- csm_php_relay_windows_active{kind} (gauge): retained per-script / per-IP / per-account window state. Labels: kind is script, ip, or account. Sized by Flow E sweep cadence (5 min for windows, 24 h retention for accounts); flat values across hours are normal.
- csm_php_relay_msgid_index_size{layer} (gauge): msgID dedup index size by storage layer. Labels: layer is memory (in-process map) or bbolt (persisted batch writer). Memory ceiling is 200k entries; bbolt grows freely until the 25 h Flow E sweep prunes it.
- csm_php_relay_msgindex_persist_dropped_total (counter): bbolt persist queue overflow drops (the 4096-deep buffered channel was full when the watcher tried to enqueue). Should be zero in steady state; a non-zero value means the bbolt writer is blocked on disk and the in-memory dedup is the only thing protecting against double-fire on a queue-runner re-write.
- csm_php_relay_msgindex_persist_errors_total (counter): bbolt commit failures from the async batch writer. Each bump also emits a Critical email_php_relay_msgindex_persist_failed finding. Disk-full or permissions issue on /var/lib/csm/state/csm.db.
- csm_php_relay_inotify_overflows_total (counter): kernel IN_Q_OVERFLOW events on the spool watcher. Each one triggers a bounded recovery scan (default cap 1000 files); if the cap fires, also emits email_php_relay_overflow_scan_truncated Critical. Sustained growth means the spool is churning faster than inotify can keep up – usually a backup restore or a real attack.
- csm_php_relay_spool_read_errors_total (counter): emailspool.ParseHeaders errors on -H files the watcher tried to consume. Usually transient (file disappeared between inotify event and open) and self-correcting; sustained growth points at a permissions or filesystem problem.
- csm_php_relay_userdata_errors_total (counter): cpanelUserDomains resolver errors reading /var/cpanel/userdata/. Used by the Path 1 From mismatch check; errors here mean Path 1 is potentially undercounting until the read recovers.
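The drop counter for the msgID persist queue describes a non-blocking enqueue onto a buffered channel. The sketch below shows that general pattern only; the type and method names (persistQueue, Enqueue) are hypothetical and are not taken from the CSM source, and the depth is shortened so the overflow is easy to see.

```go
package main

import "fmt"

// persistQueue is a hypothetical bounded queue in front of a slow writer.
// It assumes a single producer, so the drop counter needs no locking.
type persistQueue struct {
	ch      chan string
	dropped int
}

func newPersistQueue(depth int) *persistQueue {
	return &persistQueue{ch: make(chan string, depth)}
}

// Enqueue never blocks the caller. When the buffer is full it counts a
// drop and reports false instead of waiting for the writer to catch up.
func (q *persistQueue) Enqueue(msgID string) bool {
	select {
	case q.ch <- msgID:
		return true
	default:
		q.dropped++
		return false
	}
}

func main() {
	q := newPersistQueue(2) // the text above describes a depth of 4096
	for _, id := range []string{"msg-a", "msg-b", "msg-c"} {
		fmt.Println(id, "queued:", q.Enqueue(id))
	}
	fmt.Println("dropped:", q.dropped)
}
```

With nothing draining the channel, the third enqueue is the one that is dropped, which is the situation a non-zero drop counter points at.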
Signature retroactive rescans
csm_signature_rescans_total(counter): full deep-tier sweeps completed because a signature file’s mtime advanced. Steady-state zero on hosts that don’t auto-update rules; ticks once perupdate-rulesinvocation otherwise.
Counter reset semantics
Prometheus counters in CSM live in process memory. They reset to zero
whenever the daemon restarts (config change, binary upgrade, crash
recovery). This is the standard behaviour for every
Prometheus-instrumented daemon; Prometheus’s scrape pipeline detects
counter resets on its own and rate(), increase(), and
irate() all handle them correctly.
Operators should not alert on “counter decreased across a scrape” as
a failure condition. Alert on rate() or increase() of a counter
over a window long enough to absorb expected restarts.
Persisting counters across restarts would require writing to bbolt on every increment, which would not pay for itself. If a specific metric needs restart-stable behaviour later, a gauge-over-the-bbolt-counter pattern can be added for that one case without affecting the rest.
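To illustrate why a counter reset is harmless for windowed queries, here is a minimal sketch of reset-aware accumulation. It is a simplification of what Prometheus does (it ignores extrapolation at the window edges), and the function name increaseWithResets is made up for this example.

```go
package main

import "fmt"

// increaseWithResets sums the growth of a counter across ordered samples.
// A sample lower than its predecessor is treated as a restart from zero,
// so the new value counts in full instead of producing a negative delta.
func increaseWithResets(samples []float64) float64 {
	var total float64
	for i := 1; i < len(samples); i++ {
		prev, cur := samples[i-1], samples[i]
		if cur < prev {
			total += cur // counter reset: the process restarted at zero
			continue
		}
		total += cur - prev
	}
	return total
}

func main() {
	// The daemon restarts between the third and fourth scrape.
	scrapes := []float64{100, 104, 109, 2, 5}
	fmt.Println(increaseWithResets(scrapes)) // prints 14
}
```

The raw difference between the last and first sample would be -95; treating the drop as a reset recovers the 14 events that actually happened.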
Caveats
- Scrape the web UI’s HTTPS port, not a separate listener. curl -k / insecure_skip_verify is appropriate only when the cert is self-signed and the network path is trusted. Pin a CA for anything else.
- Prometheus label cardinality: per-account and per-IP labels are deliberately not exposed. Shared-hosting deployments with 1000+ cPanel users would otherwise overwhelm a Prometheus server.
- Metric vectors cap label-value combinations at 1000 children per metric, including the overflow bucket. Once a vector reaches that cap, new combinations are aggregated under _overflow_.
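The cardinality cap can be pictured with a small sketch. This is not CSM's metrics code: cappedCounterVec and its methods are hypothetical, the example is single-threaded, and it assumes one slot is always reserved for the _overflow_ child so the total never exceeds the cap.

```go
package main

import "fmt"

const overflowLabel = "_overflow_"

// cappedCounterVec is a hypothetical stand-in for a labelled counter
// whose number of children is bounded.
type cappedCounterVec struct {
	maxChildren int
	children    map[string]float64
}

func newCappedCounterVec(maxChildren int) *cappedCounterVec {
	return &cappedCounterVec{maxChildren: maxChildren, children: make(map[string]float64)}
}

// Inc adds one to the child for label. A label that would create a child
// beyond the cap is folded into the overflow child instead.
func (v *cappedCounterVec) Inc(label string) {
	if _, exists := v.children[label]; !exists && label != overflowLabel {
		distinct := len(v.children)
		if _, hasOverflow := v.children[overflowLabel]; hasOverflow {
			distinct-- // the overflow child does not count as a real label
		}
		if distinct >= v.maxChildren-1 {
			label = overflowLabel
		}
	}
	v.children[label]++
}

func main() {
	v := newCappedCounterVec(3) // the documented cap is 1000
	for _, l := range []string{"a", "b", "c", "d", "a"} {
		v.Inc(l)
	}
	fmt.Println(len(v.children), v.children["a"], v.children[overflowLabel])
}
```

With a cap of 3, labels a and b get their own children while c and d are aggregated, so existing series keep counting and only new combinations lose their identity.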
Not instrumented (yet)
- Per-account labels on any metric. Deliberately off: shared-hosting deployments with 1000+ cPanel users would blow out Prometheus cardinality.
- Fanotify inline auto-response actions (the quarantine-while-seeing-the-write path in fanotify.go). The periodic csm_auto_response_actions_total does not count those; a follow-up may split the metric or add a source label.
- bbolt per-bucket size breakdown, csm_store_used_bytes, and csm_store_last_compact_ts. Deferred to the online-compaction follow-up of the retention work (see ROADMAP.md).
Audit Log (SIEM)
CSM ships every deduplicated finding to one or more SIEM-friendly sinks before the operator-alert rate limiter runs, so Splunk, Loki, Elastic, and friends always see the complete picture even when email and webhook traffic is throttled.
Two sink types ship today, both opt-in via csm.yaml. They can be
enabled together or independently.
Schema
Every event, regardless of transport, has the same shape:
{
"v": 1,
"ts": "2026-04-28T10:32:14.512938Z",
"finding_id": "8e3f1c204c1d8b95",
"severity": "CRITICAL",
"check": "webshell_realtime",
"message": "PHP execution primitive in uploads/",
"details": "...",
"file_path": "/home/customer/public_html/uploads/x.php",
"hostname": "host.example.com"
}
The v field is the schema version. CSM bumps it on incompatible
changes and will not bump it for additive fields, so SIEM parsers
can pin on v: 1 and ignore unknown keys.
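A consumer that follows this contract might look like the sketch below. The struct and function names are illustrative only; the field names come from the schema example above. Go's encoding/json skips unknown keys by default, which is the behaviour the versioning rule relies on.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// auditEvent mirrors the documented v1 fields. Keys added later are
// ignored by json.Unmarshal, so additive changes do not break parsing.
type auditEvent struct {
	V         int    `json:"v"`
	TS        string `json:"ts"`
	FindingID string `json:"finding_id"`
	Severity  string `json:"severity"`
	Check     string `json:"check"`
	Message   string `json:"message"`
	Details   string `json:"details"`
	FilePath  string `json:"file_path"`
	Hostname  string `json:"hostname"`
}

// parseAuditEvent decodes one JSON event and rejects any schema version
// other than 1, since a bump signals an incompatible change.
func parseAuditEvent(line []byte) (auditEvent, error) {
	var ev auditEvent
	if err := json.Unmarshal(line, &ev); err != nil {
		return ev, fmt.Errorf("decode audit event: %w", err)
	}
	if ev.V != 1 {
		return ev, fmt.Errorf("unsupported schema version %d", ev.V)
	}
	return ev, nil
}

func main() {
	line := []byte(`{"v":1,"severity":"CRITICAL","check":"webshell_realtime","some_future_field":true}`)
	ev, err := parseAuditEvent(line)
	fmt.Println(ev.Check, ev.Severity, err)
}
```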
finding_id is a stable 16-hex-char hash of the canonical fields
(timestamp, check, severity, message, file path). Two emits of the
same finding produce the same ID, so downstream dedup works across
re-runs.
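The property that matters for downstream dedup is determinism: the same canonical fields always give the same ID. The sketch below demonstrates that property only. The hash function (SHA-256), the NUL separator, and the field order are assumptions made for illustration; the text above does not specify how CSM derives the value.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// findingID is a hypothetical derivation of a 16-hex-char ID from the
// canonical fields. SHA-256 and the NUL separator are assumptions.
func findingID(ts, check, severity, message, filePath string) string {
	canonical := strings.Join([]string{ts, check, severity, message, filePath}, "\x00")
	sum := sha256.Sum256([]byte(canonical))
	return hex.EncodeToString(sum[:8]) // 8 bytes encode to 16 hex characters
}

func main() {
	a := findingID("2026-04-28T10:32:14.512938Z", "webshell_realtime", "CRITICAL",
		"PHP execution primitive in uploads/", "/home/customer/public_html/uploads/x.php")
	b := findingID("2026-04-28T10:32:14.512938Z", "webshell_realtime", "CRITICAL",
		"PHP execution primitive in uploads/", "/home/customer/public_html/uploads/x.php")
	fmt.Println(len(a), a == b)
}
```

A separator that cannot appear inside a field keeps ("ab", "c") and ("a", "bc") from hashing to the same value.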
Process context
Exec and outbound-connection findings on BPF-backed hosts carry an
optional process object with PID, PPID, UID, user, cPanel account
(when known), comm, exe, sanitized cmdline, and a parent chain. The
field is omitted when no context is available, so existing parsers
that ignore unknown keys see no schema change.
{
"severity": "HIGH",
"check": "outbound_connection",
"message": "Suspicious outbound connection",
"process": {
"pid": 4242,
"ppid": 4200,
"uid": 1001,
"user": "alice",
"account": "alice",
"comm": "ncat",
"exe": "/usr/bin/ncat",
"cmdline": ["ncat", "203.0.113.10", "587"],
"parent": {
"pid": 4200,
"ppid": 4100,
"uid": 1001,
"comm": "sh"
}
},
"timestamp": "2026-05-07T12:34:56Z"
}
The parent chain may be truncated at depth 5 and may stop early if an intermediate parent has been evicted from the cache.
File sink (JSONL)
alerts:
audit_log:
file:
enabled: true
path: /var/log/csm/audit.jsonl # default
The default path is created with mode 0640 and the parent dir
with 0750. The packaged logrotate fragment uses copytruncate
mode so the daemon’s open file descriptor stays valid across
rotation – no SIGHUP needed.
Tail it for an interactive view:
tail -F /var/log/csm/audit.jsonl | jq -c
Or hand it to a log shipper like Vector, Filebeat, or Fluentbit.
Syslog sink (RFC 5424)
alerts:
audit_log:
syslog:
enabled: true
network: udp # udp | tcp | unix | unixgram | tls
address: 127.0.0.1:514 # host:port for udp/tcp/tls, path for unix*
facility: local0 # default
tls_ca: "" # optional PEM file for tls transport
Wire-line is RFC 5424 with the JSON event embedded as the MSG body, so receivers that already understand the JSONL schema parse it the same way regardless of transport. UDP and unix-datagram emit one datagram per message; TCP, TLS, and unix-stream use LF framing.
Severity mapping onto the standard syslog level set:
| CSM severity | Syslog level | Numeric |
|---|---|---|
| CRITICAL | crit | 2 |
| HIGH | err | 3 |
| WARNING | warning | 4 |
Tested against rsyslog and syslog-ng receivers in integration.
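For receivers that filter on the numeric priority, the PRI value at the start of each RFC 5424 line is facility × 8 + level. The sketch below works that out for the default local0 facility (code 16) and the severity table above; the function names are illustrative.

```go
package main

import "fmt"

const facilityLocal0 = 16 // RFC 5424 facility code for local0

// syslogLevel maps a CSM severity to the syslog level from the table above.
func syslogLevel(severity string) (int, bool) {
	switch severity {
	case "CRITICAL":
		return 2, true
	case "HIGH":
		return 3, true
	case "WARNING":
		return 4, true
	}
	return 0, false
}

// priority computes the RFC 5424 PRI value: facility*8 + level.
func priority(facility, level int) int {
	return facility*8 + level
}

func main() {
	for _, sev := range []string{"CRITICAL", "HIGH", "WARNING"} {
		lvl, _ := syslogLevel(sev)
		fmt.Printf("%s -> <%d>\n", sev, priority(facilityLocal0, lvl))
	}
	// CRITICAL -> <130>
	// HIGH -> <131>
	// WARNING -> <132>
}
```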
Backfill
When you first turn on the audit log, the SIEM has no history. Use
csm export --since <when> to dump prior findings in the same JSONL
schema:
csm export --since 24h > recent.jsonl
csm export --since 2026-04-01T00:00:00Z > q2.jsonl
<when> is either an RFC 3339 timestamp or a duration relative to
now (24h, 7d). The output is one JSON event per line on stdout,
identical in shape to what the live sinks emit, so you can pipe it
straight into the same ingest pipeline.
Requires a running daemon.
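Scripts that wrap csm export sometimes need to interpret the same two forms of the --since value. The sketch below is one way to do that, not CSM's own parser: parseSince is a hypothetical helper, and the day suffix is handled by hand because Go's time.ParseDuration has no "d" unit.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// exampleNow is a fixed reference time so the output is reproducible.
var exampleNow = time.Date(2026, 4, 28, 12, 0, 0, 0, time.UTC)

// parseSince accepts an RFC 3339 timestamp or a duration relative to now
// (for example 24h or 7d) and returns the resulting cut-off time.
func parseSince(s string, now time.Time) (time.Time, error) {
	if t, err := time.Parse(time.RFC3339, s); err == nil {
		return t, nil
	}
	if strings.HasSuffix(s, "d") {
		n, err := strconv.Atoi(strings.TrimSuffix(s, "d"))
		if err != nil || n < 0 {
			return time.Time{}, fmt.Errorf("invalid day count %q", s)
		}
		return now.Add(-time.Duration(n) * 24 * time.Hour), nil
	}
	d, err := time.ParseDuration(s)
	if err != nil || d < 0 {
		return time.Time{}, fmt.Errorf("invalid since value %q", s)
	}
	return now.Add(-d), nil
}

func main() {
	for _, in := range []string{"24h", "7d", "2026-04-01T00:00:00Z"} {
		t, err := parseSince(in, exampleNow)
		fmt.Println(in, "->", t.Format(time.RFC3339), err)
	}
}
```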
What gets logged
Every finding the alert pipeline produces, after deduplication but before:
- the per-account rate limiter (so audit signal is not lost when email and webhook are throttled);
- the “blocked IP suppression” filter (so SIEM correlation sees events that operators were spared);
- the per-sink disabled-checks list (audit log is not subject to email’s disabled_checks).
This means audit-log volume is generally higher than the email or webhook stream. Plan SIEM retention accordingly.
What does not get logged
The audit log is not a replacement for csm.history (the bbolt
history bucket). Only findings that pass through alert.Dispatch()
are emitted. Internal state changes – daemon startup, reload events,
config changes – live in journald via csm.service and are not
mirrored here.
Building & Testing
Build
# Standard build (no YARA-X)
go build ./cmd/csm/
# Build with YARA-X support (requires libyara_x_capi)
CGO_LDFLAGS="$(pkg-config --libs --static yara_x_capi)" go build -tags yara ./cmd/csm/
Test
go test ./... -count=1 # all tests
go test -race -short ./... # CI mode (race detector, skip slow tests)
Fuzz
CSM has a dozen parsers that read attacker-controlled input: Exim mainlog lines, Dovecot maillog lines, Apache Combined Log Format, /proc/net/tcp rows, wp-config.php bodies, /etc/shadow, auditd comm fields, and finding messages coming back from the WebUI.
Each parser has a Go fuzz target (files named fuzz_parsers_test.go under internal/checks/ and internal/daemon/). Fuzz targets do two things:
- Their seed corpus runs as part of the normal test suite. go test ./... executes every seed, so a known-bad input stays a regression test forever.
- The actual fuzzer runs with -fuzz=FuzzFoo.
Run a target for a fixed time while investigating:
go test ./internal/checks/... -run=^$ -fuzz=^FuzzExtractPHPDefine$ -fuzztime=30s
Run only the seeds:
go test -run=Fuzz ./internal/checks/... ./internal/daemon/...
If the fuzzer finds a crasher it writes the failing input to testdata/fuzz/FuzzFoo/<hash>. Commit that file alongside the fix and the input becomes a permanent seed.
Adding a fuzz target:
func FuzzMyParser(f *testing.F) {
	// Seeds: real-world valid shape, empty, malformed.
	f.Add("valid input")
	f.Add("")
	f.Add("corrupt/truncated")
	f.Fuzz(func(t *testing.T, s string) {
		_ = myParser(s) // must not panic on any input
	})
}
Keep the target tight: call one function, assert it returns. Output verification belongs in a regular test.
Lint
make lint # must pass before push
gofmt -l . # must produce no output
make lint uses repo-local cache directories under .cache/ so the command behaves consistently in local shells, sandboxes, and CI runners.
Linter config in .golangci.yml: errcheck, govet, staticcheck, unused, ineffassign, gocritic, misspell, bodyclose, nilerr.
CI/CD
GitLab CI (.gitlab-ci.yml) is the internal build pipeline. It runs lint/test/package jobs, publishes internal packages, mirrors to GitHub, and creates the public GitHub release artifacts.
| Stage | What it does |
|---|---|
| lint | golangci-lint, gofmt, gosec (blocking), govulncheck |
| test | go test -v -race -timeout=300s -covermode=atomic -coverprofile -coverpkg=./internal/... ./... |
| build-image | Build CSM builder Docker image with YARA-X (manual trigger) |
| build | Two architectures: amd64 with YARA-X CGO, arm64 pure Go |
| integration | Spin up AlmaLinux + Ubuntu cloud servers via phctl, install CSM from the public mirror, run the integration test binary on both hosts, collect coverage. Only runs on main |
| package | RPM + DEB via nFPM |
| sign | Detached signatures on release artifacts |
| publish | Internal GitLab Generic Package Registry (versioned + latest) |
| repo | Publish RPM/DEB to the public mirrors.pidginhost.com apt/dnf repos |
| pages | Docs + coverage HTML (GitLab Pages preview) |
| cleanup | Remove old package versions |
| release | GitLab release on tags matching v* |
| github | Mirror to GitHub + upload release artifacts (auto on tag push) |
Public Releases
To cut a release:
- Move the [Unreleased] heading in CHANGELOG.md to the new version (e.g. [2.4.2] - YYYY-MM-DD), commit as release: cut X.Y.Z.
- Tag and push:
  git tag vX.Y.Z
  git push origin main vX.Y.Z
- Wait. The tag pipeline runs integration, publishes packages to the mirror, creates the GitHub release, and uploads every artifact including the fresh merged-coverage.out. No manual pipeline clicks needed.
The coverage badge rebuilds automatically once the GitHub release exists, because the Pages workflow fetches merged-coverage.out from the latest release that carries one (it walks back through releases if the newest is missing the asset).
Installs and upgrades on end-user servers come from the GitHub release artifacts or the apt/dnf mirror. The internal GitLab package registry is operational tooling only.
Code Conventions
- Imports: stdlib, blank line, third-party, blank line, internal. Use goimports -local github.com/pidginhost/csm
- Errors: Return up the call stack. Wrap with fmt.Errorf("context: %w", err)
- Store: store.Global() singleton bbolt DB. Always nil-check.
- State: state.Store handles finding dedup, alert throttling, baseline tracking, latest findings persistence. Passed to subsystems at init
- Web UI: Vanilla JS, no framework, no build step. Tabler CSS framework. Use CSM.get() / CSM.post() / CSM.delete() for API calls. Escape string-built markup with CSM.esc(); prefer DOM APIs for attacker-controlled values.
- Logging: New code should use internal/log (wraps log/slog). Legacy fmt.Fprintf(os.Stderr, "[%s] ...", ts()) call sites remain valid until migrated.
Structured Logging (slog)
CSM’s daemon emits ~190 log lines via fmt.Fprintf(os.Stderr, "[%s] ...", ts()). The internal/log package provides a drop-in slog wrapper so operators can opt into JSON output for log-shipping pipelines (Loki, ELK, Datadog) without a big bang migration.
Operator controls
Two environment variables, read once at daemon startup:
| Variable | Values | Default | Effect |
|---|---|---|---|
| CSM_LOG_FORMAT | text, json | text | Output handler |
| CSM_LOG_LEVEL | debug, info, warn, error | info | Minimum log level |
Set via systemd drop-in:
# /etc/systemd/system/csm.service.d/logging.conf
[Service]
Environment="CSM_LOG_FORMAT=json"
Environment="CSM_LOG_LEVEL=info"
Then systemctl daemon-reload && systemctl restart csm.
Writing new logging code
import csmlog "github.com/pidginhost/csm/internal/log"
csmlog.Info("scan complete", "findings", len(f), "duration_ms", d.Milliseconds())
csmlog.Warn("log not found, will retry", "path", path, "retry_in", "60s")
csmlog.Error("alert dispatch failed", "err", err, "channel", "email")
Keys should be snake_case. Values should be machine-parseable (numbers, strings, booleans) – avoid formatted strings when you can pass the raw value.
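For readers unfamiliar with log/slog, the sketch below shows how a wrapper could map the two environment variables onto a handler and a minimum level. It uses only the standard library; newLogger and sample are hypothetical names, and the real internal/log package may be structured differently.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log/slog"
	"strings"
)

// newLogger builds a slog.Logger from the values of CSM_LOG_FORMAT and
// CSM_LOG_LEVEL. Unknown values fall back to the defaults (text, info).
func newLogger(w io.Writer, format, level string) *slog.Logger {
	var lvl slog.Level
	switch strings.ToLower(level) {
	case "debug":
		lvl = slog.LevelDebug
	case "warn":
		lvl = slog.LevelWarn
	case "error":
		lvl = slog.LevelError
	default:
		lvl = slog.LevelInfo
	}
	opts := &slog.HandlerOptions{Level: lvl}
	if strings.ToLower(format) == "json" {
		return slog.New(slog.NewJSONHandler(w, opts))
	}
	return slog.New(slog.NewTextHandler(w, opts))
}

// sample renders one debug and one info call into a string so the effect
// of each setting is visible.
func sample(format, level string) string {
	var buf bytes.Buffer
	logger := newLogger(&buf, format, level)
	logger.Debug("cache warm", "entries", 12)
	logger.Info("scan complete", "findings", 3)
	return buf.String()
}

func main() {
	fmt.Print(sample("json", "info"))  // one JSON line; debug is filtered
	fmt.Print(sample("text", "debug")) // two key=value lines
}
```

In a daemon the two arguments would come from os.Getenv at startup, and the writer would be os.Stderr so journald keeps capturing the output.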
Migrating legacy call sites
Migration is incremental and optional. The legacy format stays valid. Start with the hottest subsystems (alert dispatch, firewall operations, WAF handlers) where structured fields provide the most value, then work outward. Do not batch-convert – each subsystem should get a dedicated commit with before/after log samples in the PR description.
Keep the [TIMESTAMP] prefix of journalctl lines readable by humans: slog’s text handler uses time=... level=... msg=... which is also human-parseable, so journalctl viewers still work.
YARA-X Worker Process
CSM runs YARA-X in a supervised child process by default (since the
2026-04-23 default-flip). The goal is blast-radius control: a cgo
crash inside yara_x_capi (the 2026-04-16 production incident) stays
contained to the child and the daemon keeps its fanotify watchers,
log watchers, and firewall engine alive. See ROADMAP.md (Related
work already landed → “YARA-X process isolation”) for the decision
record.
The knob is a tri-state *bool: omit it (or set true) for the
default-on child process; set false to fall back to the in-process
scanner.
signatures:
# yara_worker_enabled: true # default; omit for default-on
# yara_worker_enabled: false # explicit opt-out → in-process
When on, daemon startup:
- Does not call yara.Init() in the daemon process.
- Builds a yaraworker.Supervisor and calls Start(ctx).
- The supervisor runs exec.Command(/opt/csm/csm, "yara-worker", "--socket", "/var/run/csm/yara-worker.sock", "--rules-dir", <rulesDir>).
- Supervisor waits for the worker’s first Ping before returning.
- Installs itself as yara.SetActive(...) so the existing yara.Active() callers (fanotify, rule reload) route transparently through the IPC.
Operator view:
- ps axf shows the daemon with one csm yara-worker child.
- New socket: /var/run/csm/yara-worker.sock (0600, root-only).
- Crashes produce a Critical yara_worker_crashed finding (rate-limited to one per minute) and restart with exponential backoff (1 s, 2 s, 4 s, capped at 60 s). Restarts reset to 1 s after the worker stays up for 30 s.
- A csm update-rules run that completes triggers the supervisor’s in-process Reload (the worker recompiles). Escalate to a full worker restart from Go code via Supervisor.RestartWorker().
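The restart schedule described above can be expressed as a small pure function. This is a sketch of the documented numbers (1 s start, doubling, 60 s cap, reset after 30 s of uptime), not the supervisor's actual code; nextBackoff and the constant names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

const (
	initialBackoff = 1 * time.Second
	maxBackoff     = 60 * time.Second
	stableUptime   = 30 * time.Second
)

// nextBackoff returns the delay before the next restart, given the delay
// used before the run that just ended and how long that run stayed up.
func nextBackoff(prev, uptime time.Duration) time.Duration {
	if prev <= 0 || uptime >= stableUptime {
		return initialBackoff // first crash, or the worker was stable
	}
	next := prev * 2
	if next > maxBackoff {
		return maxBackoff
	}
	return next
}

func main() {
	var delay time.Duration
	for i := 0; i < 8; i++ {
		delay = nextBackoff(delay, 0) // worker dies immediately each time
		fmt.Print(delay, " ")
	}
	fmt.Println()
	// After a run that lasted 45 s the schedule starts over.
	fmt.Println(nextBackoff(delay, 45*time.Second))
}
```

A crash loop walks 1s, 2s, 4s, 8s, 16s, 32s and then stays at one minute, while a single healthy run of 30 s or more brings the next delay back to 1s.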
Emailav under worker mode: the IPC wire format carries string-valued
rule metadata on every match (yaraipc.Match.Meta /
yara.Match.Meta). The emailav adapter consumes
Meta["severity"] via yara.Active(), so both in-process and worker
backends produce the same verdict shape. Non-string metadata (ints,
floats, bytes) is deliberately dropped at the worker boundary; add a
typed value struct here only if a future consumer actually needs one.
Testing:
- Unit-level: internal/yaraipc (protocol framing + round-trip) and internal/yaraworker (handler adapter, Run, supervisor). The supervisor tests re-invoke the test binary as a mock worker via the standard TestMain + env-var helper-process pattern, including a real SIGKILL-driven signal-death test that exercises the syscall.WaitStatus.Signaled() branch.
- Integration: staged in the GitLab pipeline’s integration stage against AlmaLinux + Ubuntu cloud servers.
Building the Documentation
cd docs
mdbook build # generates docs/book/
mdbook serve # local preview at http://localhost:3000
Release Signing
CSM has two separate signing paths:
- Package repository signing for the normal APT/DNF install path.
- Detached Ed25519 artifact signatures for raw binaries, tarballs, and package files downloaded outside the package manager.
Do not reuse keys between these paths. The package repositories use GPG because APT and DNF verify repository metadata that way. Detached release signatures use Ed25519 because the standalone install and deploy scripts verify raw artifact bytes with OpenSSL.
Status
| Surface | Key type | CI variable | Notes |
|---|---|---|---|
| APT repository metadata | GPG | CSM_GPG_SIGNING_KEY | Published by repo:publish; operators install with signed-by=/etc/apt/keyrings/csm.gpg. |
| RPM packages and repository metadata | GPG | CSM_GPG_SIGNING_KEY | Published by repo:publish; operators use gpgcheck=1 and repo_gpgcheck=1. |
| Raw binaries, tarballs, .deb, .rpm siblings | Ed25519 | CSM_SIGNING_KEY | Detached .sig files for direct downloads and standalone scripts. |
| YARA Forge rule ZIPs | Ed25519 | CSM_SIGNING_KEY | Signed by the yara-forge-mirror job; clients verify via signatures.signing_key. |
The preferred operator path is the signed APT/DNF repository documented in Installation. Standalone scripts also verify detached signatures: scripts/install.sh embeds the Ed25519 public key in EMBEDDED_SIGNING_KEY. Override it at runtime with CSM_SIGNING_KEY_PEM. Without any key, the scripts warn and continue unless CSM_REQUIRE_SIGNATURES=1 is set.
Public Key
The same Ed25519 key signs release artifacts and YARA Forge rule ZIPs.
Hex form, for signatures.signing_key in CSM config:
2d1472b2a1d9728c2717b75111487145a7863f7ce731c1b44181f7a68bb908f7
PEM form, for standalone script verification (EMBEDDED_SIGNING_KEY / CSM_SIGNING_KEY_PEM):
-----BEGIN PUBLIC KEY-----
MCowBQYDK2VwAyEALRRysqHZcownF7dREUhxRaeGP3znMcG0QYH3pou5CPc=
-----END PUBLIC KEY-----
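The hex and PEM forms encode the same 32-byte key: the PEM body is a PKIX wrapper around the raw key bytes. A short sketch using the Go standard library shows the conversion; pemToHex is a name chosen for this example.

```go
package main

import (
	"crypto/ed25519"
	"crypto/x509"
	"encoding/hex"
	"encoding/pem"
	"errors"
	"fmt"
)

const publicKeyPEM = `-----BEGIN PUBLIC KEY-----
MCowBQYDK2VwAyEALRRysqHZcownF7dREUhxRaeGP3znMcG0QYH3pou5CPc=
-----END PUBLIC KEY-----
`

// pemToHex extracts the raw Ed25519 key from a PKIX PEM block and
// returns it hex-encoded.
func pemToHex(pemText string) (string, error) {
	block, _ := pem.Decode([]byte(pemText))
	if block == nil {
		return "", errors.New("no PEM block found")
	}
	parsed, err := x509.ParsePKIXPublicKey(block.Bytes)
	if err != nil {
		return "", fmt.Errorf("parse public key: %w", err)
	}
	key, ok := parsed.(ed25519.PublicKey)
	if !ok {
		return "", errors.New("not an Ed25519 public key")
	}
	return hex.EncodeToString(key), nil
}

func main() {
	h, err := pemToHex(publicKeyPEM)
	fmt.Println(h, err)
}
```

Running the conversion on the PEM block above should reproduce the hex string listed for signatures.signing_key, which is a quick way to confirm both values were copied correctly.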
Package Repository Signing
repo:publish runs on version tag pipelines and rebuilds the public package repositories from the current tag plus the retained historical releases.
Required protected CI variables:
| Variable | Type | Purpose |
|---|---|---|
| CSM_GPG_SIGNING_KEY | File | GPG private key used to sign APT metadata, RPM packages, and RPM repo metadata. |
| CSM_MIRROR_SSH_KEY | File | SSH key used to publish the mirror output. |
| CSM_MIRROR_KNOWN_HOSTS | Variable | SSH host keys for the mirror host. |
The job exports the public key as csm-signing.gpg and publishes it at the mirror root so install docs can reference:
https://mirrors.pidginhost.com/csm/csm-signing.gpg
APT verifies signed repository metadata through the signed-by= keyring. DNF verifies both RPM package signatures and repository metadata via gpgcheck=1 and repo_gpgcheck=1.
Detached Artifact Signatures
sign:artifacts signs release files with the Ed25519 private key in CSM_SIGNING_KEY when that variable is present. Each signed file gets a .sig sibling uploaded with the artifact.
Examples:
csm-linux-amd64
csm-linux-amd64.sig
csm_3.0.0_amd64.deb
csm_3.0.0_amd64.deb.sig
csm-3.0.0-1.x86_64.rpm
csm-3.0.0-1.x86_64.rpm.sig
The signature covers the raw artifact bytes with no hashing wrapper. Verification uses:
openssl pkeyutl -verify -pubin -inkey csm-signing.pub -rawin \
-sigfile csm-linux-amd64.sig -in csm-linux-amd64
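The same check can be done without OpenSSL. The sketch below assumes the .sig file holds the raw 64-byte Ed25519 signature over the unhashed artifact bytes, as the description above implies; verifyDetached and selfCheck are hypothetical names, and the demonstration uses a throwaway key pair rather than the release key.

```go
package main

import (
	"crypto/ed25519"
	"fmt"
)

// verifyDetached reports whether sig is a valid Ed25519 signature over
// the raw artifact bytes. Length checks come first because
// ed25519.Verify panics on a public key of the wrong size.
func verifyDetached(pub ed25519.PublicKey, artifact, sig []byte) bool {
	if len(pub) != ed25519.PublicKeySize || len(sig) != ed25519.SignatureSize {
		return false
	}
	return ed25519.Verify(pub, artifact, sig)
}

// selfCheck signs a sample payload with a throwaway key, then verifies
// both the original and a tampered copy.
func selfCheck() (original, tampered bool) {
	pub, priv, err := ed25519.GenerateKey(nil) // nil selects crypto/rand
	if err != nil {
		return false, false
	}
	artifact := []byte("pretend these are the bytes of csm-linux-amd64")
	sig := ed25519.Sign(priv, artifact)

	modified := append([]byte(nil), artifact...)
	modified[0] ^= 0xff // flip bits in the first byte
	return verifyDetached(pub, artifact, sig), verifyDetached(pub, modified, sig)
}

func main() {
	original, tampered := selfCheck()
	fmt.Println("original verifies:", original)
	fmt.Println("tampered verifies:", tampered)
}
```

In real use the artifact and signature would be read with os.ReadFile and the public key parsed from the published PEM.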
Detached Signature Setup
On a trusted workstation:
openssl genpkey -algorithm ed25519 -out csm-signing.key
openssl pkey -in csm-signing.key -pubout -out csm-signing.pub
Store the private key in GitLab as a protected CSM_SIGNING_KEY variable. Keep the private key in an offline password manager and a second secure backup location. Do not commit it.
For standalone script verification, either:
- Embed the public key PEM in EMBEDDED_SIGNING_KEY in scripts/install.sh, scripts/deploy.sh, and scripts/deploy-gitlab.sh.
- Or pass the public key at runtime with CSM_SIGNING_KEY_PEM.
To make missing signatures or missing public keys fatal:
CSM_REQUIRE_SIGNATURES=1 curl -sSL https://raw.githubusercontent.com/pidginhost/csm/main/scripts/install.sh | bash
If a .sig file exists but verification fails, the installer aborts regardless of CSM_REQUIRE_SIGNATURES.
Key Rotation
Package repository GPG key rotation:
- Generate a new GPG signing key.
- Replace CSM_GPG_SIGNING_KEY in protected CI variables.
- Publish a tag pipeline so repo:publish exports the new public key to the mirror.
- Update install docs or automation if the key URL changes.
Detached Ed25519 key rotation:
- Generate a new Ed25519 key pair.
- Replace CSM_SIGNING_KEY in protected CI variables.
- Update the embedded public key in standalone scripts, or rotate the CSM_SIGNING_KEY_PEM value used by automation.
- Tag a new release.
Old detached signatures remain verifiable only with the old public key. Archive old public keys alongside release metadata so historical releases can still be checked.
Manual Detached Verification
curl -LO https://github.com/pidginhost/csm/releases/download/v3.0.0/csm-linux-amd64
curl -LO https://github.com/pidginhost/csm/releases/download/v3.0.0/csm-linux-amd64.sig
openssl pkeyutl -verify -pubin -inkey csm-signing.pub -rawin \
-sigfile csm-linux-amd64.sig -in csm-linux-amd64
If verification fails, treat the artifact as untrusted. Do not install it.