# System

# Console Firewall

_Pro Edition feature._ Maps to **System > Console Firewall** (`view_console_firewall.cfm`, `inc/firewall_action.cfm`, `inc/generate_nginx_configuration.cfm`).

Console Firewall is a **static IP allowlist** for the two admin surfaces of the gateway: the Hermes admin console (`/admin/` and `/admin/2/`) and the Ciphermail web admin (`/ciphermail/`). When enabled, nginx returns `403 Forbidden` to any request for those paths from a source IP not on the list. This is enforced at the nginx layer before Authelia ever sees the request — it's a perimeter filter, not an authentication filter.

## How it differs from IPS

Both pages live under System and both touch nginx and ban traffic, so admins routinely confuse them. The distinction is preventative (Console Firewall) vs. reactive (IPS):

| | **Console Firewall** | **[IPS](https://docs.deeztek.com/books/administrator-guide/page/ips)** |
|---|---|---|
| Model | Static allowlist (default-deny) | Dynamic blocklist (default-allow) |
| Layer | nginx `allow`/`deny` directives | iptables drop rules via fail2ban |
| Scope | `/admin/`, `/admin/2/`, `/ciphermail/` only | All exposed surfaces: SMTP/IMAP, Authelia SSO |
| Trigger | Admin adds an IP to the list | Failed-auth threshold tripped in a log |
| Audience | Internal admins / known office IPs | Anyone on the public internet |
| Storage | `firewall` table + `parameters2.firewall_status` | `intrusion_prevention_jails` + `fail2ban_ips` |
| Apply | Auto: regen nginx + preload restart on every save | Manual: admin clicks Apply Settings after edits |

Both layers stack. A request to `/admin/` from a non-allowlisted IP is rejected by Console Firewall (nginx 403) before fail2ban ever sees an Authelia auth event. A request from an allowlisted IP that then fails login five times still gets the IPS ban from the `authelia` jail.

## What's behind the page

```
Browser request to https://<console>/admin/
        │
        ▼
   hermes_nginx  (sites-enabled/<console>_hermes-ssl.conf)
        │
        ├─►  location /admin/ {
        │       allow 10.0.0.5;       ◄── from `firewall` table where hermesadmin='yes'
        │       allow 192.168.1.20;
        │       deny all;
        │       ...auth_request /authelia...
        │       proxy_pass http://hermes_commandbox:8888/admin/;
        │    }
        ▼
   Authelia (if allowed)
        ▼
   hermes_commandbox
```

The firewall is **purely an nginx allow/deny block** rendered into the per-console-host vhost. When `firewall_status = enabled`, the rules are present. When `disabled`, the placeholder is rendered as an empty string and nginx falls back to its default allow-all behavior for that location.

## Database schema

| Table / Column | Role |
|---|---|
| `firewall.ip` | Single IP address (no CIDR — see the validation note below) |
| `firewall.hermesadmin` | `'yes'` / `'no'` — include this IP in the `/admin/` allow list |
| `firewall.ciphermailadmin` | `'yes'` / `'no'` — include this IP in the `/ciphermail/` allow list |
| `firewall.note` | Free-text annotation surfaced in the table |
| `firewall.datetime` | Last-modified timestamp |
| `parameters2` row where `parameter='firewall_status' AND module='firewall'` | Master switch — `enabled` or `disabled` |

The schema (`hermes_install.sql` line 812) defines `ip` as `varchar(50)`, but the validator at `inc/validate_ip_address.cfm` is a single-address IPv4 regex, so this page supports neither CIDR notation nor IPv6. Allowing a /24 range takes one row per address, up to 256 rows. For larger ranges, use an upstream firewall instead.
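
To gauge how many rows a range would cost, the following Python sketch mimics a single-address IPv4 check and expands a CIDR block into one address per row. It uses the standard `ipaddress` module rather than the regex in `inc/validate_ip_address.cfm`, so `is_single_ipv4` and `rows_for_range` are hypothetical helpers, not the page's actual validator.

```python
import ipaddress
from typing import List


def is_single_ipv4(value: str) -> bool:
    """Return True only for a bare IPv4 address (no CIDR suffix, no IPv6)."""
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False


def rows_for_range(cidr: str) -> List[str]:
    """Expand a CIDR block into the individual addresses the page would need."""
    network = ipaddress.IPv4Network(cidr, strict=True)
    # Iterating a network yields every address, including network and broadcast.
    return [str(addr) for addr in network]
```

Calling `len(rows_for_range("192.168.1.0/24"))` shows the cost of a /24 before anyone starts adding rows by hand.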

## The auto-apply flow

Every action handler in `inc/firewall_action.cfm` (`addip`, `editip`, `deleteip`, `setfirewall`) ends the same way:

1. Update the `firewall` table (or `parameters2.firewall_status` for the master switch).
2. Set a numeric `session.m` alert code (1–7 for errors, 33–37 for success).
3. **Always** include `generate_nginx_configuration.cfm` at the bottom of the file — re-render every per-console vhost from `/opt/hermes/templates/hermes-ssl.conf` with current firewall rules baked in.
4. `cflocation` to `/admin/2/preload_restart_nginx.cfm?returnUrl=/admin/2/view_console_firewall.cfm`.

There is **no "Apply Settings" button** on this page. The Save & Apply button on the master-status card and the row-level edit/delete buttons are themselves the apply — every individual change triggers a full nginx regen and a restart. This is the opposite of the [IPS](https://docs.deeztek.com/books/administrator-guide/page/ips) page's batched pending-changes model.

> **Operational consequence.** A burst of edits (adding ten allowed IPs one at a time) triggers ten back-to-back nginx regens, each ending in a restart. The `preload_restart_nginx.cfm` pattern bridges this — the page renders a static "please wait" before the restart fires, then polls until nginx is back, so the admin's own session doesn't `ERR_CONNECTION_REFUSED` mid-redirect. There is no batch-add path; bulk imports are an `INSERT INTO firewall ...` SQL job followed by one manual Save & Apply on the status card.
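
For the bulk-import path, the SQL can be generated rather than typed by hand. The Python sketch below builds one multi-row `INSERT` using the column names from the recovery example on this page. `bulk_insert_sql` is a hypothetical helper, and it marks every address as allowed on both admin surfaces, which may not be what you want.

```python
import ipaddress


def bulk_insert_sql(ips, note="Bulk import"):
    """Build a single multi-row INSERT for the `firewall` table."""
    if not ips:
        raise ValueError("no addresses given")
    safe_note = note.replace("'", "''")  # escape single quotes for SQL
    values = []
    for ip in ips:
        # Raises ValueError for CIDR blocks, IPv6, or anything malformed.
        addr = ipaddress.IPv4Address(ip)
        values.append(f"('{addr}', 'yes', 'yes', '{safe_note}')")
    return (
        "INSERT INTO firewall (ip, hermesadmin, ciphermailadmin, note) VALUES\n  "
        + ",\n  ".join(values)
        + ";"
    )
```

Validating each address before it reaches the statement keeps the import consistent with the page's single-address rule.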

## Template placeholders

`generate_nginx_configuration.cfm` queries `firewall` twice and renders two placeholder substitutions into the per-vhost rendered file:

| Template token | Substituted with | Used in |
|---|---|---|
| `hermes_fw_hermes` | `allow <ip>;` lines for every `firewall` row where `hermesadmin='yes'`, terminated by `deny all;` | `location /admin/ { ... }` block (template line 157) |
| `hermes_fw_ciphermail` | `allow <ip>;` lines for every `firewall` row where `ciphermailadmin='yes'`, terminated by `deny all;` | `location /ciphermail/ { ... }` block (template line 287) |

When the firewall is disabled, both placeholders are blanked out — the `location` blocks render without any `allow`/`deny` and nginx falls back to its default allow-all. When the firewall is enabled but **no row** has the relevant flag set to `yes`, the recordcount-zero branch in the generator also blanks the placeholder. There is no "deny everyone" mode that locks the page from itself; see the safety checks below.
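
The substitution rules above can be sketched in Python. This is a minimal illustration of the described behaviour, not the CFML in `generate_nginx_configuration.cfm`; the function name `render_fw_block` and the dict-shaped rows are assumptions made for the example.

```python
def render_fw_block(rows, flag, firewall_status):
    """Build the text substituted for one hermes_fw_* token.

    rows: list of dicts shaped like `firewall` table rows.
    flag: 'hermesadmin' or 'ciphermailadmin'.
    """
    if firewall_status != "enabled":
        return ""  # firewall off: placeholder is blanked
    allowed = [row["ip"] for row in rows if row.get(flag) == "yes"]
    if not allowed:
        return ""  # recordcount-zero branch: blank, never a bare "deny all;"
    lines = [f"allow {ip};" for ip in allowed]
    lines.append("deny all;")
    return "\n".join(lines)
```

Returning an empty string in both fallback cases is what keeps an enabled firewall with no flagged rows from locking everyone out.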

The `/users/`, `/nc/`, `/main/`, `/plugins/`, and `/web/` locations are **not** firewalled by this page — they have no `hermes_fw_*` placeholder. Mailbox users, Nextcloud users, and Ciphermail end-user portal users hit Authelia directly with no IP allowlist. This is deliberate: those are end-user surfaces, not admin surfaces.

## Safety checks — the four guardrails

Without protection, an admin could trivially lock themselves out of the gateway by deleting their own IP, editing it to something wrong, or enabling the firewall before adding their own IP. `inc/firewall_action.cfm` carries four guard rules (each tied to its own alert code):

| Guard | When it fires | Alert |
|---|---|---|
| Can't delete own IP while firewall enabled | `getip.ip = ClientIP AND firewall_status = enabled` on `deleteip` | `m=3` |
| Can't edit own IP while firewall enabled (unless the new IP is also the client's IP) | Same condition on `editip` with a different new IP | `m=4` |
| Can't enable firewall unless current IP is in the list with `hermesadmin='yes'` | `setfirewall` to `enabled` with no matching `firewall` row for `ClientIP` | `m=5` |
| Duplicate IP rejected on add/edit | Unique-IP check by query | `m=2`, `m=6` |

`ClientIP` is set in `Application.cfc` from the `X-Forwarded-For` header (nginx sets it from `$remote_addr`). When testing behind a load balancer or VPN, what the page considers "your IP" may not match what your laptop reports — verify with the per-row table what nginx is actually seeing before clicking the master enable.
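
The first three guards in the table can be condensed into a small decision function. The sketch below is Python rather than the CFML in `inc/firewall_action.cfm`; `check_guard` and its parameter names are hypothetical, and the duplicate-IP guard is left out because it depends on a database query.

```python
def check_guard(action, client_ip, firewall_enabled, rows=(),
                target_ip=None, new_ip=None, new_status=None):
    """Return the alert code a guard would raise, or None if the action may proceed."""
    if action == "deleteip":
        # Guard m=3: deleting your own IP while the firewall is on.
        if firewall_enabled and target_ip == client_ip:
            return 3
    elif action == "editip":
        # Guard m=4: changing your own IP to a different address while the firewall is on.
        if firewall_enabled and target_ip == client_ip and new_ip != client_ip:
            return 4
    elif action == "setfirewall" and new_status == "enabled":
        # Guard m=5: enabling requires the caller's IP to be listed with hermesadmin='yes'.
        listed = any(
            row["ip"] == client_ip and row["hermesadmin"] == "yes" for row in rows
        )
        if not listed:
            return 5
    return None
```

Every guard compares against `client_ip`, which is why a mismatch between the address nginx sees and the address you expect matters so much.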

## The recovery path when locked out

If a misconfiguration locks the admin out anyway (the new office IP was never added, the master switch was flipped before the row was saved, or the browser is using an unexpected egress IP), the recovery sequence is shell-level on the Docker host:

```bash
# Disable the firewall directly in the DB
docker exec hermes_db_server mariadb -u root hermes -e \
    "UPDATE parameters2 SET value2='disabled' \
     WHERE parameter='firewall_status' AND module='firewall'"

# Add the new admin IP
docker exec hermes_db_server mariadb -u root hermes -e \
    "INSERT INTO firewall (ip, hermesadmin, ciphermailadmin, note) \
     VALUES ('<your-ip>', 'yes', 'yes', 'Recovery add')"

# Trigger a manual nginx regen by hitting the page from inside the CommandBox container
docker exec hermes_commandbox curl -s http://localhost:8888/admin/2/inc/generate_nginx_configuration.cfm

# Reload nginx
docker exec hermes_nginx nginx -s reload
```

The MariaDB call uses unix-socket auth (root via the container) — no password, by design. Once back in, re-enable the firewall from the UI so the lockout-guard alerts are restored.

A planned Hermes CLI Management Console (`scripts/hermes-cli.sh`) will wrap this recovery into a menu option. Until it ships, the docker-exec sequence above is the supported recovery path.

## Interaction with Console Settings

The console hostname change (`edit_console_settings.cfm`) regenerates the same per-console nginx vhost from the same template — meaning a hostname change automatically picks up the current Console Firewall state. The Firewall rules carry over to the new vhost transparently; the admin does not need to revisit this page after a hostname change.

The reverse is not true: editing the Firewall does not change the hostname. But because `firewall_action.cfm` always calls `generate_nginx_configuration.cfm`, which always renders every active console vhost, a stale-vhost scenario (where an old hostname's vhost still exists alongside the new one) gets both vhosts re-rendered on a Firewall save. This is fine in practice; it's been the established behavior since the AdminLTE 4 refactor (`a348e73f`).

## License gating

The page is wrapped in the standard Pro check:

```cfml
<cfif NOT isDefined("session.edition") OR session.edition NEQ "Pro">
    <cfset proFeatureName = "Admin Console Firewall">
    <cfinclude template="./inc/license_pro_required.cfm">
    <cfabort>
</cfif>
```

Community installs see the gating panel. The `firewall` table and `parameters2.firewall_status` row exist anyway (they're seeded); pre-existing rules continue to render into the nginx vhost as long as `firewall_status='enabled'`. Switching from Pro to Community does **not** auto-disable the firewall — if it was on when the license downgraded, it stays on. To turn it off, an admin needs to either reactivate Pro or use the recovery path above.

## Files and containers touched

| Path | Owner | Role |
|---|---|---|
| `config/hermes/var/www/html/admin/2/view_console_firewall.cfm` | `hermes_commandbox` | Main page + modals |
| `config/hermes/var/www/html/admin/2/inc/firewall_action.cfm` | `hermes_commandbox` | All add/edit/delete/status handlers; auto-applies via the nginx regen include |
| `config/hermes/var/www/html/admin/2/inc/generate_nginx_configuration.cfm` | `hermes_commandbox` | Renders `hermes_fw_hermes` and `hermes_fw_ciphermail` placeholders |
| `config/hermes/var/www/html/admin/2/inc/validate_ip_address.cfm` | `hermes_commandbox` | IPv4 single-address regex (no CIDR, no IPv6 on this page) |
| `config/hermes/var/www/html/admin/2/preload_restart_nginx.cfm` | `hermes_commandbox` | Pre-restart splash + polling rejoin so the admin's session survives the reload |
| `config/hermes/opt/hermes/templates/hermes-ssl.conf` | `hermes_commandbox` | nginx vhost template with the `hermes_fw_*` tokens |
| `config/nginx/etc/nginx/sites-available/<token>_hermes-ssl.conf` | `hermes_nginx` (mounted) | Live rendered vhost — what nginx actually serves |

## Related

- [IPS](https://docs.deeztek.com/books/administrator-guide/page/ips) — the reactive blocklist that complements this preventative allowlist
- [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) — hostname changes regenerate the same vhost and pick up Firewall state automatically
- [Authentication Settings](https://docs.deeztek.com/books/administrator-guide/page/authentication-settings) — Authelia runs after Console Firewall passes; both layers stack
- [LDAP RemoteAuth](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth) — RemoteAuth admins still hit Console Firewall first; the upstream LDAP bind only matters once the request reaches Authelia

# Authentication Settings

Admin path: **System > Authentication Settings** (`view_authentication_settings.cfm`,
`inc/get_authelia_settings.cfm`, `inc/edit_authelia_settings.cfm`,
`inc/auth_generate_secret.cfm`, `inc/generate_authelia_configuration.cfm`,
`inc/restart_authelia.cfm`).

This page configures **Authelia** — the identity-aware proxy that gates
every Hermes web surface (`/admin`, `/users`, `/nc`). It is global
gateway plumbing: secrets, session timing, login-failure regulation,
SMTP notifier credentials, Duo Push integration, and the OIDC client
that Nextcloud uses for SSO. Per-user MFA enforcement, app passwords,
and the local-vs-remote credential model are documented in the
[Credential Model](https://docs.deeztek.com/books/administrator-guide/page/credential-model) chapter
and on the [LDAP RemoteAuth](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth) page; this page is
strictly the gateway configuration.

## Where Authelia sits

```
Browser ──► nginx (hermes_nginx) ──► auth_request /authelia
                                          │
                                          ▼
                              hermes_authelia (port 9091)
                                          │
                                          ▼
                            ┌─────────────┴─────────────┐
                            │                           │
                hermes_ldap (cn=admins,                  hermes_db_server
                cn=mailboxes, cn=relays,                 (MariaDB)
                cn=one_factor, cn=two_factor)            database: authelia
                                                         (TOTP, WebAuthn,
                                                          encryption,
                                                          identity_verification,
                                                          authentication_logs)
```

Every protected request triggers nginx's `auth_request /authelia` which
proxies to `hermes_authelia:9091/api/verify`. Authelia checks its
session cookie (stored in Redis via `hermes_authelia_redis`), and if
needed redirects the user through a one-factor (password) or two-factor
(password + MFA) login flow against the LDAP directory. The nginx
snippets that wire this up are
`config/nginx/etc/nginx/snippets/auth.conf` and `snippets/authelia.conf`.

The Authelia container reads its config from `/config/configuration.yml`
inside the container (host path
`config/authelia/configuration.yml`). The config file is
**regenerated from a template** every time this page is saved — the
template at `/opt/hermes/templates/configuration.yml` (host path
`config/authelia/configuration.HERMES`) is read, the `hermes_*`
placeholders are substituted with values from `parameters2` where
`module = 'authelia'`, the result is written, and the container is
restarted. Direct edits to `configuration.yml` are overwritten on the
next save.
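
The regeneration step amounts to a token substitution pass. The Python sketch below shows one way to do it in a single pass so that a substituted value is never rescanned. The token names `hermes_session_name` and `hermes_session_expiration` are invented for the example, and the real generator is CFML, so treat this as an illustration of the idea only.

```python
import re

# Any identifier starting with "hermes_" is treated as a candidate token.
TOKEN = re.compile(r"hermes_[A-Za-z0-9_]+")


def render_template(template_text, params):
    """Replace each known hermes_* token with its stored value in one pass."""
    def substitute(match):
        token = match.group(0)
        # Unknown identifiers (for example container hostnames) stay untouched.
        return str(params[token]) if token in params else token

    return TOKEN.sub(substitute, template_text)
```

A single regex pass avoids the trap of chained `replace` calls, where a value such as `hermes_session` could itself be mistaken for a token.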

## Configuration storage and persistence

| Setting class | Lives in | Read by |
|---|---|---|
| Toggles, durations, hostnames, log level | `parameters2` table, `module = 'authelia'` | Form load via `get_authelia_settings.cfm`; template substitution at regen |
| High-entropy secrets | Files under `/opt/hermes/keys/` (Docker secret mounts) | Authelia reads via `{{ secret "..." }}` directives in `configuration.yml` |
| Sessions (cookie state) | Redis (`hermes_authelia_redis`) | Authelia at runtime |
| MFA registrations | MariaDB `authelia` database | Authelia at runtime; encrypted at rest with the Storage Encryption Key |
| Identity verification tokens | MariaDB `authelia` database | Reset-password and add-device flows |

Secrets are never round-tripped through the form. Read-only fields on
the page show a masked tail (last 4 chars) of the file contents so the
admin can verify the secret exists and roughly recognise it. The
regenerate button next to each field writes a fresh random value
directly to disk (`auth_generate_secret.cfm`), regenerates
`configuration.yml`, and restarts Authelia.
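
The masked-tail display and the regenerate action are simple to picture in code. The Python sketch below is an assumption-laden illustration: `masked_tail` and `new_secret` are hypothetical names, and the alphabet and default length are guesses, since this page does not document what `auth_generate_secret.cfm` actually uses.

```python
import secrets
import string


def masked_tail(secret_value, visible=4):
    """Mask everything except the last `visible` characters."""
    if len(secret_value) <= visible:
        return "*" * len(secret_value)
    return "*" * (len(secret_value) - visible) + secret_value[-visible:]


def new_secret(length=64):
    """Return a random alphanumeric secret from a cryptographic RNG."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

Only the tail ever reaches the browser, so the form can confirm a secret exists without exposing it.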

## Storage backend — MySQL, not SQLite

Authelia stores MFA registrations, identity-verification tokens, and
audit logs in the `authelia` MariaDB database on `hermes_db_server`.
This is intentionally different from Authelia's upstream SQLite default:

- **Survives container recreation.** Docker `down`/`up` cycles discard
  the container's writable layer, so an SQLite file is lost unless the
  operator has mounted its storage path on a volume or bind mount.
  MariaDB lives on the Data tier and is backed up by the standard
  system backup.
- **Tolerates concurrent writes.** SQLite serialises writes; with
  hundreds of mailboxes hitting `/users` and `/nc` simultaneously this
  becomes a contention point.
- **Single backup surface.** The Hermes system backup already includes
  all MariaDB databases. The Authelia DB is included automatically; no
  separate path to remember.

The credential to the `authelia` database is stored as a Docker secret
file at `/opt/hermes/keys/authelia_db_password` and referenced from
the Authelia config via `{{ secret "/keys/authelia_db_password" | msquote }}`.

## Cards on the page

### General Settings

| Field | What it controls | Stored as |
|---|---|---|
| **Password Reset JWT Secret** | Signs the time-limited token in password-reset email links. Rotating invalidates every outstanding reset link. | File `/opt/hermes/keys/authelia_identity_validation_reset_password_jwt_secret_file` |
| **Reset Password Function** | Enable/disable the "Forgot password?" link on the login page. Disable when password is owned by remote AD/LDAP. | `parameters2.authentication_backend.disable_reset_password` |
| **Storage Encryption Key** | AES key Authelia uses to encrypt TOTP secrets and WebAuthn credentials at rest inside the `authelia` database. **Rotating this key invalidates every TOTP and WebAuthn registration** — every MFA-enrolled user must re-enrol on next login. | File `/opt/hermes/keys/authelia_storage_encryption_key_file` |

> **Do not rotate the Storage Encryption Key casually.** The red
> callout on the page exists for a reason. Rotation is correct after
> a confirmed compromise; in every other case it locks every MFA user
> out of their tokens. Duo Push survives because Duo enrollment lives
> in Duo's cloud, not the Authelia DB — see the Duo section below.

### Session Settings

| Field | Default | Notes |
|---|---|---|
| **Session Name** | `hermes_session` | Cookie name. Changing forces every active session to log in again. |
| **Session Secret** | random | Signs the session cookie. Rotating invalidates all sessions immediately. |
| **Session Provider Password (Redis)** | random | Auth between Authelia and `hermes_authelia_redis`. Rotating requires the Redis container to pick up the new secret on Authelia restart. |
| **Session Expiration** | `43200` (12h) | Absolute lifetime from login. NIST SP 800-63B AAL2 ceiling. |
| **Session Inactivity** | `3600` (1h) | Idle timeout. NIST 800-63B recommends 1800s (30 min) for AAL2 / 900s (15 min) for AAL3. |
| **Remember Me Duration** | `43200` (12h) | When ticked at login, replaces Session Expiration **and bypasses Session Inactivity entirely**. Set to `-1` to remove the checkbox from the login form. |

The "Remember Me" interaction is the gotcha. Authelia 4.39 source
(`internal/handlers/handler_authz_authn.go`) confirms that a remembered
session is exempt from inactivity checks: its lifetime is the
Remember Me Duration, full stop. If your compliance posture requires
inactivity enforcement on **every** session, set Remember Me Duration
to `-1`; otherwise users who tick the box are governed only by the
Remember Me Duration, with no idle timeout.
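
The interaction between the three timers can be written down as a small predicate. The Python sketch below models the behaviour described in the table using the listed defaults; `session_is_valid` is a hypothetical function, not Authelia's code, and all times are plain seconds.

```python
def session_is_valid(now, login_at, last_activity, remembered,
                     expiration=43200, inactivity=3600, remember_me=43200):
    """Decide whether a session is still valid under the three timers."""
    age = now - login_at
    if remembered:
        # Remembered sessions skip the inactivity check entirely.
        return age < remember_me
    idle = now - last_activity
    return age < expiration and idle < inactivity
```

With the defaults, a session idle for two hours is rejected unless the user ticked Remember Me, in which case it stays valid until the 12-hour mark.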

Saving this card also pushes matching values into Nextcloud via `occ
config:system:set` (`session_lifetime`, `session_keepalive`,
`remember_login_cookie_lifetime`). NC sessions are kept in lockstep
with Authelia to prevent stale NC sessions from triggering the OIDC
auto-redirect URL mangle.

### SMTP Notification Settings

The address and subject Authelia uses when sending password-reset,
identity-verification, and new-device-registration emails. Hermes
points Authelia at its own internal Postfix re-injection port
(`hermes_postfix_dkim:10026`) so notification mail goes through the
gateway's outbound pipeline like any other Hermes-originated message.
SMTP host/port are not exposed on this page — they are hard-coded in
the template because there's no real reason to change them on a
self-contained install.

### Login Regulation

Authelia's built-in brute-force throttle.

| Field | Effect |
|---|---|
| **Login Failures Before Ban** | Number of consecutive failures from one source before that source is banned (default 5) |
| **Time Between Failed Logins** | Sliding window over which the failure count is measured, in seconds (default 120) |
| **Banned Time** | How long the ban lasts, in seconds (default 300) |

This is the inner brake. The outer brake is the
[Intrusion Prevention](https://docs.deeztek.com/books/administrator-guide/page/ips) `authelia` jail
(`hermes_fail2ban`) which scans `/remotelogs/authelia/authelia.log`
and applies host-level iptables bans for longer durations. The two
layers are complementary: Authelia regulates per-account in the
application; Fail2ban regulates per-source-IP at the firewall.
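
How the three Login Regulation fields combine is easiest to see in code. The Python sketch below is a simplified model using the defaults from the table. `is_banned` and its parameter names are chosen for the example, and it considers failures only, so it does not model a successful login resetting the count.

```python
def is_banned(failure_times, now, max_retries=5, find_time=120, ban_time=300):
    """Return True if `now` falls inside a ban triggered by the failures.

    failure_times: ascending timestamps (seconds) of failed logins.
    """
    for start in range(len(failure_times) - max_retries + 1):
        window = failure_times[start:start + max_retries]
        # A ban triggers when max_retries failures fit inside find_time.
        if window[-1] - window[0] <= find_time:
            if now < window[-1] + ban_time:
                return True
    return False
```

Five failures spread over more than 120 seconds never trip the ban, which is why slow credential-stuffing is left to the Fail2ban layer.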

### Logging

Authelia log level (`trace`, `debug`, `info`, `warn`, `error`),
format (`json` or `text`), and retention in days. The retention
dropdown applies to the **rotated** Authelia log files
(`config/authelia/log/authelia.log.*`) — the live file is rotated
by the host logrotate config and old rotations are pruned to the
retention window. Default 30 days; legal/compliance reviewers may
need 90 or 180.

### Duo Security

Optional second factor via Duo Push (mobile-app one-tap approval).
Disabled by default. When enabled, the following fields are required:

| Field | Source |
|---|---|
| **Duo Hostname** | Duo Admin Panel → Applications → Auth API → "API hostname" (`api-XXXXXXXX.duosecurity.com`) |
| **Duo Integration Key** | Same panel, "Integration key" |
| **Duo Secret Key** | Same panel, "Secret key" |
| **Duo Self Enrollment** | If enabled, users who don't yet have a Duo account can self-enrol from the Authelia MFA page |

Integration and Secret keys are stored as Docker secret files at
`/opt/hermes/keys/authelia_duo_api_integration_key_file` and
`authelia_duo_api_secret_key_file`. The form blanks the input on
display and only writes when a non-empty value is submitted (the
masked tail under the box shows the current value's last 4 chars).
This lets the admin save other fields without re-entering Duo
credentials every time.

> **Duo survives storage-key rotation and SQLite-to-MySQL migrations.**
> Duo enrollment lives on Duo's servers, not in Authelia's database;
> Hermes only stores the API credentials. The TOTP and WebAuthn tables
> in the `authelia` MariaDB database are wiped when the storage key
> rotates or the SQLite-to-MySQL migration runs; Duo Push keeps working.

### Webmail OIDC (Nextcloud)

Authelia acts as the OpenID Connect provider for Nextcloud's
`user_oidc` app — this is what makes "Sign in with Hermes" work on
`/nc` and (transparently) auto-login users who already have a valid
Authelia session.

| Field | Role | Stored as |
|---|---|---|
| **OIDC HMAC Secret** | Signs Authelia-issued OIDC tokens | `/opt/hermes/keys/authelia_identity_providers_oidc_hmac_secret_file` |
| **OIDC Client Secret** | Shared secret between Authelia (the OpenID Provider) and Nextcloud (the relying-party client). Hashed with PBKDF2 inside Authelia. | Plain: `authelia_identity_providers_oidc_clients_client_secret_plain_file`; digest: `authelia_identity_providers_oidc_clients_client_secret_digest_file` |
| **OIDC Key** | RSA 2048 private key (JWKS) Authelia uses to sign ID tokens | `/opt/hermes/keys/authelia_identity_providers_oidc_jwks_file` (generated with `openssl genrsa`) |

The OIDC client is registered as `Hermes_SEG_Webmail`, redirect URI
`https://<console>/nc/apps/oidc_login/oidc`, scopes `openid profile
email groups`. The `groups` scope is what gives Nextcloud the LDAP
group claims it needs to apply NC's own group ACLs.

Rotating the OIDC Client Secret triggers a follow-up `occ
user_oidc:provider Hermes_SEG --clientsecret=...` execution against
the `hermes_nextcloud` container so both sides stay in sync. Rotating
the HMAC Secret or OIDC Key on Authelia's side will invalidate all
in-flight OIDC sessions — users will see a fresh login challenge on
next request.

## What this page does NOT configure

| Setting | Lives on |
|---|---|
| **Console hostname** (Authelia `session.cookies[].domain` + `authelia_url`) | [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) — regenerating console settings re-templates Authelia and restarts it |
| **LDAP backend address / bind DN / filters** | Hard-coded in the template to point at `hermes_ldap`. The Hermes LDAP container's structure is provisioned at install time and not exposed as a runtime knob. |
| **Upstream AD / LDAP authentication for specific mailboxes or relay recipients** | [LDAP RemoteAuth](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth) — Authelia still binds locally; the local LDAP entry has a `seeAlso` overlay pointing at the upstream directory |
| **Per-user MFA enforcement** | The admin's mailbox/relay-recipient detail pages — `recipients.enforce_mfa` is a TINYINT(3) admin-policy flag (see [Credential Model](https://docs.deeztek.com/books/administrator-guide/page/credential-model) and #225 below) |
| **Password reset flow UI** | [Password Resets](https://docs.deeztek.com/books/administrator-guide/page/password-resets) — the reset page itself, CAPTCHA, rate limiting |
| **System users / admins list** | [System Users](https://docs.deeztek.com/books/administrator-guide/page/system-users) — managing accounts in `cn=admins`,`ou=users` |
| **Fail2ban brute-force protection** | [Intrusion Prevention](https://docs.deeztek.com/books/administrator-guide/page/ips) — the host-firewall layer in front of Authelia |
| **Nextcloud OIDC auto-redirect toggle** | [Email Server Settings](https://docs.deeztek.com/books/administrator-guide/page/settings) — moved off this page; controls whether `/nc` silently SSOs the already-authenticated user |

## MFA enforcement is decoupled from the `cn=two_factor` LDAP group (#225)

This is the single most-often-confused part of Hermes authentication.

**The LDAP group `cn=two_factor` is a *capability* marker, not an
*enforcement* marker.** Membership in `cn=two_factor` tells Authelia
"this user has at least one MFA method registered and should be
prompted for it." Membership in `cn=one_factor` tells Authelia
"password only." A user moves from `one_factor` to `two_factor` by
**enrolling an MFA method themselves** through the user portal's
Account Settings page — admins do not force-flip the group.

**Admin policy lives in `recipients.enforce_mfa` (and
`system_users.enforce_mfa` for system users) — a TINYINT(3) column,
not a group.** When the admin sets this to 1, the user-portal pages
that depend on it consult
`config/hermes/var/www/html/users/2/inc/check_enforce_mfa_restriction.cfm`
on each request. If the user is in `cn=mailboxes` or `cn=relays`, has
`enforce_mfa = 1`, and is **not** yet in `cn=two_factor`, the page
renders a restricted-access panel pointing them at Account Settings
to enable 2FA. Once they enrol, the group flips and the restriction
clears on the next page load.

### Why this two-layer model

The chicken-and-egg problem without it: enrolling TOTP or WebAuthn requires
the user to receive an identity-verification email from Authelia.
A brand-new mailbox user has no working mail client yet, so they need to
get into the portal first to set up an app password, configure their
phone, and read the email. Hard-locking them out of the portal until
they enrol means they can never enrol.

The bootstrap surfaces (Account Settings, My App Passwords, Set Up
Your Devices, Webmail) remain accessible under the restriction; the
rest of the portal does not. Once the user enables 2FA, the
restriction lifts automatically.
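
The restriction rule can be summarised as one predicate. The Python sketch below restates the conditions described in this section; it is not the CFML in `check_enforce_mfa_restriction.cfm`, and `is_restricted`, the group names as plain strings, and the page labels are assumptions made for the example.

```python
# Bootstrap surfaces that stay reachable while the restriction is active.
BOOTSTRAP_PAGES = {
    "Account Settings",
    "My App Passwords",
    "Set Up Your Devices",
    "Webmail",
}


def is_restricted(groups, enforce_mfa, page):
    """Return True if the page should render the restricted-access panel."""
    in_scope = "mailboxes" in groups or "relays" in groups
    needs_enrolment = in_scope and enforce_mfa == 1 and "two_factor" not in groups
    return needs_enrolment and page not in BOOTSTRAP_PAGES
```

Because the check reads group membership on each request, the restriction clears as soon as the user's groups include `two_factor`.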

### Operational consequence — log out, don't just refresh

Authelia caches LDAP group membership in the session for the
refresh interval (default 5 minutes). When a user enables 2FA,
their LDAP group flips to `cn=two_factor` immediately, but
Authelia's session still says `cn=one_factor` until the cache
expires. Hermes works around this by redirecting through
`/logout` after the enable-2FA flow, which forces a fresh
Authelia session and picks up the new group membership on the
next request. If a user reports "I enabled 2FA but the portal
still says I haven't," the answer is always: log out and back in.

## Save flow

Save & Apply Settings runs `edit_authelia_settings.cfm`, which:

1. Validates every form field (whitelist regex per field, length minimums for secrets).
2. Updates the matching `parameters2` rows with `applied = '2'`.
3. After all field updates succeed, flips every `module = 'authelia'`
   row to `applied = '1'`.
4. Calls `generate_nextcloud_configuration.cfm`, pushes session
   parameters into Nextcloud via `occ config:system:set`, and restarts
   `hermes_nextcloud`.
5. Calls `generate_authelia_configuration.cfm` which re-templates
   `configuration.yml` from `/opt/hermes/templates/configuration.yml`.
6. Calls `restart_authelia.cfm` (which uses the canonical preload
   pattern, not a hard restart, to avoid `ERR_CONNECTION_REFUSED` on
   the redirect back).
7. Sleeps 10 seconds to let Authelia come back up before the redirect
   lands.

If validation fails at any step the form short-circuits via `cflocation
url="#cgi.http_referer#"` with a `session.m` alert code; no partial
state is committed because each cascade step gates on `step = N`.

## Failure semantics

| What breaks | What happens |
|---|---|
| Authelia container down | nginx's `auth_request` subrequest fails, so nginx answers every protected page with an error ("500 Internal Server Error" or similar). Mail flow is unaffected: Postfix, Dovecot, and Amavis don't depend on Authelia. |
| MariaDB `authelia` database unreachable | Authelia starts but cannot authenticate; same symptom as above. |
| Redis (`hermes_authelia_redis`) down | Authelia starts but cannot store sessions; users are bounced to the login page on every request. |
| Storage Encryption Key file missing | Authelia refuses to start. Check `docker logs hermes_authelia` for the missing-secret error. |
| `configuration.yml` syntax-broken after a bad save | Authelia refuses to start. Restore from the on-disk backup `configuration.BACKUP`, fix in the form, save again. |
| LDAP container down | Authelia starts but every login attempt fails. Same recovery as MariaDB-down — fix LDAP first, no Authelia restart needed. |

The Save & Apply Settings button does not have a pre-save dry-run; if
Authelia refuses to start after a save, the previous `configuration.yml`
has already been overwritten, and the `configuration.BACKUP` copy noted
in the table above is the fallback. The `restart_authelia.cfm` step will
surface the container start failure in the admin UI's restart-output
area; the admin should not navigate away until the success banner appears.

## Related documentation

- [Credential Model](https://docs.deeztek.com/books/administrator-guide/page/credential-model) — the four-credential architecture (web login, app passwords, NC internal password, Hermes System app password) that this page's session settings gate
- [LDAP RemoteAuth](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth) — when web login is bound against an external AD/LDAP instead of Hermes's internal directory
- [Password Resets](https://docs.deeztek.com/books/administrator-guide/page/password-resets) — the forgot-password page that consumes the JWT Secret on this page
- [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) — the console hostname change that triggers an Authelia template re-render
- [Intrusion Prevention](https://docs.deeztek.com/books/administrator-guide/page/ips) — the Fail2ban `authelia` jail that protects this surface at the firewall
- [System Users](https://docs.deeztek.com/books/administrator-guide/page/system-users) — admin accounts that live in `cn=admins`
- [Email Server Settings](https://docs.deeztek.com/books/administrator-guide/page/settings) — the Nextcloud OIDC auto-redirect toggle that complements the OIDC client configured here

# Backup/Restore

Admin path: **System > Backup/Restore** (`view_system_backup.cfm`).

> **CLI-only by design.** Backup and restore run from the Docker host's shell, not from the admin console. The admin console's *Backup/Restore* page is a read-only info surface (CLI examples + a list of backups detected on disk + a link back to this doc). There are no buttons. Long-running operations + web UIs is a known footgun (page reload kills progress, browser timeouts, race conditions); the CLI is the canonical interface.

## What ships in this release

Two scripts under [`scripts/`](../../../scripts/):

| Script | Purpose |
|---|---|
| [`system_backup.sh`](https://github.com/deeztek/Hermes-Secure-Email-Gateway/blob/main/scripts/system_backup.sh) | **Hot mode by default — zero application downtime.** Uses application-native hot-backup primitives: `mariadb-dump --single-transaction`, `slapcat`, and live tar of mail tiers (Dovecot, Amavis, Postfix all use atomic-rename writes safe for live tar). Toggles `occ maintenance:mode --on` briefly during Nextcloud file tar to pause NC user writes (mail flow unaffected). `--cold` flag stops the full stack for legal-hold / forensic snapshots that need absolute byte-level consistency. |
| [`system_restore.sh`](https://github.com/deeztek/Hermes-Secure-Email-Gateway/blob/main/scripts/system_restore.sh) | **Always cold on the restore side** (we're overwriting tier contents — concurrent reads/writes would corrupt). Verifies the manifest + per-archive SHA256 BEFORE any destructive action, **auto-remaps** tiers to this host's paths (refuses only on a build-version mismatch unless `FORCE_VERSION_MISMATCH=1`), restores DBs via socket auth, restores OpenLDAP via `slapadd`, stream-extracts in-scope tiers directly to their mount paths, reconciles the Nextcloud DB user, restarts the stack, and on a cross-host restore offers to run `system_rehost.sh`. |

## Backup scopes

The `-B` flag chooses what to back up. Pick the scope that matches your need — there's no reason to back up 500 GB of vmail every night if only the DBs and configs are churning.

| Scope | Includes | Typical cadence | Hot-mode duration |
|---|---|---|---|
| `system` | Config tier + Data tier + 6 DB dumps + LDAP slapcat | Nightly | seconds to a few minutes (dominated by `/mnt/data` tar size; DB+LDAP dumps are fast) |
| `archive` | Archive tier (Amavis quarantine) | Weekly or per retention policy | proportional to archive size; mail intake continues uninterrupted |
| `vmail` | Vmail tier (Dovecot mailboxes) | Weekly | proportional to mailbox size; mail flow continues uninterrupted |
| `nextcloud` | Nextcloud tier (NC files) | Weekly | proportional to NC file size; NC web UI shows "under maintenance" during the tar; mail unaffected |
| `all` | Everything above | Periodic full-DR snapshot | sum of all of the above |

## Hot-mode safety per component

Why we don't need downtime:

| Component | Hot-backup technique | Why it's safe |
|---|---|---|
| **MariaDB** | `mariadb-dump --single-transaction --routines --triggers --events --databases <db>` | InnoDB MVCC gives a consistent point-in-time snapshot. No table locks. Stored procedures, triggers, and scheduled events captured. |
| **OpenLDAP** | `slapcat -b dc=hermes,dc=local` inside `hermes_ldap` | Standard hot LDIF export. |
| **Dovecot (vmail)** | tar `/mnt/vmail` live | maildir/sdbox writes are atomic-rename (write to temp filename, atomic `mv` to final name). No torn files. Worst case: messages arriving during the tar window may land after the tar's snapshot — they're durable upstream (postfix queue, sender's MX retries) and captured by the next backup. |
| **Amavis (archive)** | tar `/mnt/archive` live | Amavis quarantine writes are atomic-rename. Same as Dovecot. |
| **Nextcloud (files)** | tar `/mnt/files` live, with `occ maintenance:mode --on` toggled around the tar | NC writes are atomic, but the filesystem ↔ `oc_filecache` DB table can drift if a user uploads mid-tar. Maintenance mode pauses NC user writes — the NC web UI shows "under maintenance" briefly, but mail flow is unaffected. Use `--no-nc-maintenance` to skip the toggle if needed. |
| **Postfix (data tier)** | tar `/mnt/data/postfix` live | Postfix queue files are atomic-rename. |
| **Service logs (data tier)** | tar live | Append-only. A torn last line is cosmetic, not data loss. |
| **MariaDB / LDAP / ClamAV raw files** | **Excluded** from the data tier tar | DB dumps + LDAP slapcat are the authoritative restore sources, so the on-disk InnoDB tablespace files and slapd data files are redundant. ClamAV signatures are regenerable, not worth the backup space. |

Hot mode is the daily backup. **Cold mode (`--cold`) is the escape hatch for use cases where absolute byte-level consistency matters more than uptime** — legal hold, forensic snapshots, regulatory archive. Cold mode does `docker compose stop` for the full duration.

## Backup

### Backup quick start

```bash
sudo /opt/hermes-seg-docker-gl/scripts/system_backup.sh -P /mnt/backups -B system --yes
```

The script creates a **backup directory** at `/mnt/backups/hermes-backup-<scope>-<build_no>-<UTC-timestamp>/` (e.g. `hermes-backup-all-v260609-20260609T183616Z/`). It is written under a `.staging-…` name and **atomic-renamed** into place only on success. **There is no outer tarball** — the per-tier archives sit directly in the directory, so the restore verifies and stream-extracts each one in place without unpacking a wrapper first (no ~2× scratch space). Read `manifest.json` directly to inspect a backup before restoring.
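
The staging-then-rename step can be sketched as follows. This is a minimal illustration of the pattern, not the script's actual code: the function name `publish_backup`, its arguments, and the exact staging name are assumptions made for the example. A rename within one filesystem is atomic, so anything reading the output path sees either no backup or a complete one.

```bash
#!/usr/bin/env bash
# Hypothetical helper: fill a directory under a staging name, then rename it
# into place only if the fill step succeeded.
publish_backup() {
  local root=$1 name=$2
  shift 2
  local staging="$root/.staging-$name"
  [ -e "$root/$name" ] && return 1   # never overwrite an existing backup
  mkdir -p "$staging" || return 1
  # "$@" is the command that writes the archives; it receives the staging path.
  if "$@" "$staging"; then
    mv "$staging" "$root/$name"      # same filesystem, so the rename is atomic
  else
    rm -rf "$staging"                # leave no half-written backup behind
    return 1
  fi
}
```

A failed run therefore leaves nothing under the final name, which is why a directory without the `.staging-` prefix can be treated as complete.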

### Output layout

Inside the backup directory (only the archives relevant to the chosen scope are present):

```text
manifest.json                  ← scope, mode (hot/cold), topology, source hostname,
                                  build_no, SHA256 per archive
backup.log                     ← the backup run's own log
databases.tar.gz               ← 6 .sql files; system / all scopes only
ldap.ldif.gz                   ← slapcat output; system / all scopes only
config.tar.gz                  ← Config tier USER-DATA subdirs only (keys, .gnupg,
                                  ssl, templates, sa-bayes, sa-learn, dkim, arc,
                                  conf_files) — NOT .env / secrets / compose / scripts
                                  (those are host-specific and excluded by design);
                                  system / all scopes only
data.tar.gz                    ← Data tier user-data only (excludes mysql/ ldap/
                                  clamav/ — captured by dumps / slapcat / regenerable);
                                  system / all scopes only
archive.tar.gz                 ← Archive tier; archive / all scopes only
vmail.tar.gz                   ← Vmail tier; vmail / all scopes only
nextcloud.tar.gz               ← Nextcloud tier; nextcloud / all scopes only
```
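
As a quick sanity check before copying a backup off-site, the scope-to-archive mapping in the layout above can be written down as a small helper. This is a sketch under the assumption that the file names are exactly those listed; `archives_for_scope` and `missing_archives` are hypothetical names, not part of `system_backup.sh`.

```bash
#!/usr/bin/env bash
# Print the archive files a backup of the given scope should contain,
# following the layout listed above. Returns 1 for an unknown scope.
archives_for_scope() {
  case $1 in
    system)    printf '%s\n' databases.tar.gz ldap.ldif.gz config.tar.gz data.tar.gz ;;
    archive)   printf '%s\n' archive.tar.gz ;;
    vmail)     printf '%s\n' vmail.tar.gz ;;
    nextcloud) printf '%s\n' nextcloud.tar.gz ;;
    all)       printf '%s\n' databases.tar.gz ldap.ldif.gz config.tar.gz data.tar.gz \
                             archive.tar.gz vmail.tar.gz nextcloud.tar.gz ;;
    *)         return 1 ;;
  esac
}

# Report every expected archive that is absent from a backup directory.
# Returns 0 when all are present, 1 when any is missing, 2 for a bad scope.
missing_archives() {
  local dir=$1 scope=$2 list f rc=0
  list=$(archives_for_scope "$scope") || return 2
  for f in $list; do
    [ -f "$dir/$f" ] || { echo "missing: $f"; rc=1; }
  done
  return "$rc"
}
```

This only checks presence; integrity is still the job of the SHA256 values recorded in `manifest.json`.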

### Backup flags

| Flag | Purpose |
|---|---|
| `-P <path>` | **Required.** Output directory. Must exist and be writable. |
| `-B <scope>` | **Required.** One of: `system`, `archive`, `vmail`, `nextcloud`, `all`. |
| `--cold` | Stop the full stack for the duration of the backup. Use for legal-hold / forensic snapshots. Default is HOT mode (zero application downtime). |
| `--no-nc-maintenance` | Skip the brief `occ maintenance:mode --on` that hot-mode nextcloud / all backups use to pause NC user writes during the file tar. With the toggle skipped, file uploads happening mid-tar may be missed by the backup. |
| `--yes` (or `-y`) | Skip the interactive confirmation prompt. Use for cron / Ofelia. |
| `--dry-run` (or `-n`) | Print what would happen without changing anything. |
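| `--notify-email=ADDR` | Email `ADDR` when the backup finishes. Emails on failure only unless `--notify-on-success` is also given. |
| `--notify-on-success` | With `--notify-email`, also send an email when the backup succeeds. |
| `--test-notify` | With `--notify-email`, send one `[TEST] [SUCCESS]` and one `[TEST] [FAILURE]` sample, then exit without running a backup. |
| `--help` (or `-h`) | Show usage. |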

### Scheduling

For nightly automated backups, use **host cron** on the Docker host. `system_backup.sh` is a host-level script (it runs `docker compose stop`, reads `.env` from the host, writes to `/mnt/backups` on the host) — host cron is the natural fit. Example `/etc/cron.d/hermes-backup`:

```cron
# m h dom mon dow user  command
0 3 * * *      root  /opt/hermes-seg-docker-gl/scripts/system_backup.sh -P /mnt/backups -B system    --yes >> /var/log/hermes-backup.log 2>&1
0 4 * * 0      root  /opt/hermes-seg-docker-gl/scripts/system_backup.sh -P /mnt/backups -B vmail     --yes >> /var/log/hermes-backup.log 2>&1
0 5 1 * *      root  /opt/hermes-seg-docker-gl/scripts/system_backup.sh -P /mnt/backups -B all       --yes >> /var/log/hermes-backup.log 2>&1
```

A typical cadence:

| Cadence | Scope | Why |
|---|---|---|
| Nightly | `system` | Small + fast. Captures DBs, LDAP, configs, install-root state. Run with hot mode = zero downtime. |
| Weekly | `vmail` (or `archive` or `nextcloud`, rotated) | Larger but slower-changing. |
| Monthly | `all` | Full disaster-recovery snapshot. |

The script's exit code reflects success (0) or failure (non-zero). For built-in email alerting, use the `--notify-email=ADDR` flag (see below). For "Hermes is so dead it can't even tell you" cases, see [External monitoring](#external-monitoring-strongly-recommended).

> **Why host cron and not Ofelia?** Ofelia runs as a container (`hermes_ofelia`). Its job model (`job-exec` into a named container, `job-local` on the Ofelia container itself) doesn't fit `system_backup.sh` cleanly, because the script needs host-level `docker compose` access, root, and write access to `/mnt/backups`. Ofelia's image lacks the `docker compose` plugin and root host access. Native Ofelia integration is deliberately NOT on the roadmap; the existing **System > Scheduled Tasks** admin page lists Ofelia jobs but does NOT support adding new ones from the UI today.

### Failure / success email alerting

Use `--notify-email=ADDR` to receive an email on backup completion. By default the script emails on **failure only** (the "noisy on failure, silent on success" pattern most operators want). Add `--notify-on-success` to also email on success, which is useful for "daily I-am-alive confirmation" use cases.

```bash
# Email on failure only (typical)
sudo /opt/hermes-seg-docker-gl/scripts/system_backup.sh -P /mnt/backups -B system --yes \
  --notify-email=admin@example.com

# Email on both failure AND success
sudo /opt/hermes-seg-docker-gl/scripts/system_backup.sh -P /mnt/backups -B all --yes \
  --notify-email=admin@example.com --notify-on-success
```

Subject lines are bracketed for easy scanning in a mail client:

- Success: `[SUCCESS] Hermes backup on <hostname> (scope=<scope>)`
- Failure: `[FAILURE] Hermes backup on <hostname> (scope=<scope>)`

Failure bodies include the timestamp, scope, mode, reason, log file path, and the last 50 lines of the log. Success bodies include the timestamp, scope, mode, output filename, file size, and run duration.

**How it works**: the script shells out to `docker exec -i hermes_postfix_dkim sendmail -t` and pipes the message into the Postfix container's `sendmail` binary. Postfix queues and delivers it like any other outbound mail from Hermes. No host MTA configuration is needed — Hermes's own Postfix does the work.
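
To make the mechanism concrete, here is a minimal sketch of composing such a message and handing it to `sendmail -t`, which takes the recipient from the `To:` header. The helper name `build_notify_message` and the body text are assumptions for illustration; the script's real message contains more fields, as described above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a minimal message with To: and Subject: headers,
# a blank separator line, and a body. The subject follows the bracketed
# format documented above.
build_notify_message() {
  local status=$1 host=$2 scope=$3 addr=$4 body=$5
  printf 'To: %s\n' "$addr"
  printf 'Subject: [%s] Hermes backup on %s (scope=%s)\n' "$status" "$host" "$scope"
  printf '\n%s\n' "$body"
}

# Sending needs the running Postfix container, so it is shown as a comment:
#   build_notify_message FAILURE "$(hostname)" system admin@example.com "reason: ..." \
#     | docker exec -i hermes_postfix_dkim sendmail -t
```

The `-i` on `docker exec` matters: it keeps stdin open so the piped message reaches `sendmail` inside the container.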

**Verify the path before wiring into cron** — `--test-notify` sends one `[TEST] [SUCCESS]` sample and one `[TEST] [FAILURE]` sample to the address you give, then exits without running a backup:

```bash
sudo /opt/hermes-seg-docker-gl/scripts/system_backup.sh --test-notify \
  --notify-email=admin@example.com
```

Both test messages have a `[TEST]` prefix in the subject so any ops-alert filters watching for `[FAILURE]` are not tripped. If both arrive, your notification path is good. If neither arrives, check `hermes_postfix_dkim` is running and look at the log file the script prints for sendmail errors.

**Caveat — needs Hermes to be at least partially healthy**: if the failure cause is "the Postfix container is down" or "the Docker daemon is down", `docker exec` has nothing to talk to and the email won't go out. The script logs the failure-to-notify as a warning and exits with the original non-zero status, but you won't get the email. This is the gap external monitoring fills — see below.

### External monitoring (strongly recommended)

Built-in email alerting covers the "backup ran but something went wrong" case (the 99% case). It does NOT cover "Hermes itself is so broken it can't send any email at all" — Docker daemon crashed, host out of disk, container restart loop, network partition, etc. For that, you need an external monitoring tool that lives off the Hermes host and tells YOU when Hermes goes dark.

**Strongly recommended for every production install.** Common choices:

| Tool | Pattern | Best for |
|---|---|---|
| **[Zabbix](https://www.zabbix.com/)** | Agent on the Hermes host reports up/down, disk, container health, custom metrics | Self-hosted, comprehensive; common in business / mid-size deployments |
| **[Nagios / Icinga](https://www.nagios.org/)** | NRPE plugin or similar | Self-hosted, classic; many existing operator setups already have it |
| **[healthchecks.io](https://healthchecks.io/)** | Cron pings a URL on success; if the ping doesn't arrive on schedule, healthchecks alerts you | Dead simple; free tier; cron-native pattern |
| **[Uptime Kuma](https://github.com/louislam/uptime-kuma)** | Self-hosted ping monitor with web UI | Free, self-hosted alternative to healthchecks.io |
| **PRTG / Datadog / New Relic / etc.** | Commercial monitoring | If you already have one, integrate Hermes alongside your other infrastructure |

The healthchecks.io pattern works nicely alongside cron-based backups:

```cron
# Pings healthchecks.io only when the backup succeeds (curl runs after the backup via &&).
# A crontab entry must stay on one line: cron does not honor backslash line continuations.
0 3 * * *  root  /opt/.../system_backup.sh -P /mnt/backups -B system --yes --notify-email=admin@example.com && curl -fsS --retry 3 https://hc-ping.com/<your-uuid> >/dev/null
```

If the backup fails, `--notify-email` sends the failure email (assuming Postfix is up). If the backup succeeds, healthchecks.io gets the ping. If the WHOLE HOST is down (no ping, no email), healthchecks.io alerts you once the expected ping is overdue. Three-layer coverage with minimal moving parts.
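
If the cron line grows unwieldy, the same pattern can live in a small wrapper script. The sketch below is an assumption-laden example, not something shipped with Hermes: `HC_URL`, `ping_hc`, and `run_with_ping` are hypothetical names. It also uses the `/fail` suffix that healthchecks.io accepts on a ping URL, so a failed backup is reported immediately rather than only after the ping goes overdue.

```bash
#!/usr/bin/env bash
# Placeholder check URL; substitute the URL of your own check.
HC_URL=${HC_URL:-"https://hc-ping.com/<your-uuid>"}

# Send one ping; output is discarded, failures are retried by curl.
ping_hc() { curl -fsS --retry 3 "$1" >/dev/null; }

# Run the given command, ping success or failure, and pass the
# command's own exit status through to the caller (and to cron).
run_with_ping() {
  local rc=0
  "$@" || rc=$?
  if [ "$rc" -eq 0 ]; then
    ping_hc "$HC_URL"
  else
    ping_hc "$HC_URL/fail"
  fi
  return "$rc"
}

# Example call from the wrapper's last line:
#   run_with_ping /opt/.../system_backup.sh -P /mnt/backups -B system --yes
```

Returning the backup's exit status rather than curl's keeps the wrapper honest: a successful backup with an unreachable healthchecks.io still exits 0, and the missing ping raises the alert on its own.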

### Off-site copy

`system_backup.sh` writes to the local `-P` path only. Off-site copy is left to your existing tooling — `rclone`, `rsync` to remote storage, `aws s3 cp`, `restic`, whatever you already use. Typical pattern:

```bash
sudo /opt/hermes-seg-docker-gl/scripts/system_backup.sh -P /mnt/backups -B system --yes \
  && rclone sync /mnt/backups remote:hermes-backups/
```

## Restore

### Restore quick start

```bash
sudo /opt/hermes-seg-docker-gl/scripts/system_restore.sh -F /mnt/backups/hermes-backup-system-v260609-20260601T103000Z
```

`-F` takes the backup **directory** (not a tarball).

**The restore replaces the data in the backup's scope and leaves other scopes alone.** Restoring a `system` backup overwrites the install root + Data tier + DBs + LDAP; the Vmail / Archive / Nextcloud tiers are untouched. Restoring a `vmail` backup overwrites only `/mnt/vmail`. The stack is stopped for the duration of the restore (always — even hot-mode backups are restored cold).

### Safety: SHA256 + version gates, topology auto-remap

Two gates fire BEFORE any destructive action, plus automatic topology handling:

1. **Manifest SHA256 verification.** Every archive's SHA256 is checked against `manifest.json` (verified in place — no unpacking). If any byte of the backup is corrupt or tampered with, the restore aborts BEFORE stopping the stack or touching any data.
2. **Hermes build-version match.** The backup's `build_no` (captured at backup time from `system_settings.build_no`) is compared against the current host's `build_no`. If they differ, restore refuses unless `FORCE_VERSION_MISMATCH=1` is set. Schema migrations between Hermes builds make cross-version restore unsafe — restoring an older DB dump onto a newer host leaves the schema in a state the running code does not expect, which breaks silently when something hits a missing or renamed column. **The correct procedure is to install Hermes at the matching build first (`git checkout <build>`), restore, then upgrade forward via `scripts/system_update_docker.sh`.**
3. **Storage-topology auto-remap.** If the backup's recorded mount paths (`/mnt/data`, `/mnt/vmail`, etc.) differ from this host's current mount paths in `.env` — typical when restoring onto different hardware — the restore **automatically retargets each tier to this host's paths** and prints a `REMAP` line per tier. No flag is needed; the old `FORCE_REMAP=1` gate was retired as needless friction for new-hardware DR.
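
The two gates reduce to simple comparisons. The sketch below shows the logic only, under the assumption that the expected digest and both build numbers have already been read out; `verify_sha256` and `check_build` are hypothetical names, and how `system_restore.sh` parses `manifest.json` is not shown here.

```bash
#!/usr/bin/env bash
# Gate 1: compare a file's SHA256 with the expected hex digest.
verify_sha256() {
  local file=$1 expected=$2 actual
  actual=$(sha256sum "$file" | awk '{print $1}') || return 1
  [ "$actual" = "$expected" ]
}

# Gate 2: accept matching builds; on a mismatch, accept only when
# FORCE_VERSION_MISMATCH=1 is set in the environment.
check_build() {
  local backup_build=$1 host_build=$2
  [ "$backup_build" = "$host_build" ] && return 0
  [ "${FORCE_VERSION_MISMATCH:-0}" = "1" ]
}
```

Both checks are read-only, which is what allows them to run before the stack is stopped or any tier is touched.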

### Disaster-recovery flow (different host)

1. Install Hermes fresh on the new host using [`install_hermes_docker.sh`](https://github.com/deeztek/Hermes-Secure-Email-Gateway/blob/main/scripts/install_hermes_docker.sh). The install root + `.env` need to exist before restore can succeed.
2. Make the backup directory reachable on the new host — **either** mount the backup storage (off-site / NAS share) on the new host (recommended: the restore stream-extracts in place, so there's no need to copy the whole backup), **or** `scp -r` the backup directory across to local disk.
3. Run `system_restore.sh -F /path/to/hermes-backup-<scope>-<build>-<ts>`. Storage-topology differences are **auto-remapped** to this host's paths; a build-version difference still requires `FORCE_VERSION_MISMATCH=1` (better: install the matching build first).
4. When the restore detects a **cross-host** restore (backup hostname ≠ this host), it **offers to run `system_rehost.sh` for you** — accept it to rewire host identity (`.env`, DB rows, all rendered configs, and the Nextcloud DB user).
5. Verify the admin console loads and a test message flows end-to-end.

> **A cross-host restore needs more than the restore itself.** The restored data
> carries the *source* host's identity and credentials, so several things must be
> reconciled by hand — run `system_rehost.sh`, re-activate the Pro license, and
> re-save the Content Checks pages to re-apply the milter chain. Follow the full
> checklist: **[Post-Restore Steps](https://docs.deeztek.com/books/installation-reference/page/post-restore-steps)**.

### Restore flags

| Flag | Purpose |
|---|---|
| `-F <path>` | **Required.** Path to the backup **directory** produced by `system_backup.sh`. |
| `--yes` (or `-y`) | Skip the interactive confirmation prompt (and auto-accept the rehost offer on a cross-host restore). |
| `--dry-run` (or `-n`) | Show what would happen without changing anything. |
| `--only=<scope>` | Restore only one scope out of an `all` backup (e.g. `--only=vmail`). |
| `--help` (or `-h`) | Show usage. |
| `FORCE_VERSION_MISMATCH=1` (env) | Override the build-version refusal. Topology differences auto-remap — no flag needed. |

## When to use hypervisor snapshots instead

The cold-mode escape hatch (`--cold`) covers the byte-level-consistency use cases that the Hermes scripts can satisfy. For two other cases, **hypervisor snapshots** are the right tool, not the Hermes scripts:

1. **Pre-upgrade safety net.** Always take a hypervisor snapshot before running `system_update_docker.sh` — that gives you a working rollback if the upgrade fails mid-flight. The methodology doc codifies this.
2. **Zero-downtime full-host snapshot.** If you want a single consistent point-in-time image of the entire Hermes host (every storage tier, the Docker daemon state, the host OS), a hypervisor snapshot is the only tool that captures all of that atomically.

Per-hypervisor snapshot mechanisms:

| Platform | Mechanism |
|---|---|
| Proxmox VE | Datacenter > Backup, or Snapshot from the VM's right-click menu |
| VMware vSphere / ESXi | VM > Snapshots > Take Snapshot |
| KVM / libvirt | `virsh snapshot-create-as <domain> <name> --disk-only --atomic` |
| AWS EC2 | EBS volume snapshot (or AMI for full image) |
| Azure VMs | Disk snapshot, or Recovery Services Vault |
| Google Compute Engine | Disk snapshot |
| Hyper-V | Checkpoint |

## What you should NOT do

### Do NOT run the legacy bare-metal scripts on a Docker host

The pre-Docker `config/hermes/opt/hermes/scripts/system_backup.sh` and `system_restore.sh` are kept in the repo for reference and for the legacy-to-Docker migration path. **Do not run them on a Docker install.** The legacy `system_restore.sh` does `cd / && tar -xvzf <backup-file>` — extracts the backup tarball relative to the host filesystem root and will overwrite host directories with files from a layout that does not match the Docker host's reality. Hermes services fail to start, host OS may become unbootable.

### Do NOT tar a running storage tier with `tar` directly

If for some reason you reach for `tar` directly instead of `system_backup.sh`, do NOT tar `/mnt/data`, `/mnt/vmail`, `/mnt/files`, or `/mnt/archive` while the stack is running **without using the hot-backup primitives the script uses**. Specifically:

- `/mnt/data` contains MariaDB's tablespace files — tar'ing them while `hermes_db_server` is running produces a backup MariaDB will reject as inconsistent on restore. Use `system_backup.sh` (which excludes `mysql/` from the data tar and captures DBs via `mariadb-dump`) instead.
- Without `slapcat`, raw tar of `/mnt/data/ldap` mid-write captures inconsistent slapd database files.

The Hermes scripts handle all of this correctly. Use them.

### Do NOT trust an untested restore procedure

Whatever backup strategy you adopt, **practice the restore at least once on a non-production system before you rely on it.** Take a backup of your live Hermes host, spin up a second VM, run the restore, verify you can log into the admin console and send a test message. A backup procedure that has never been restored from is not a backup procedure — it is wishful thinking.

## What's coming in Phase B

The Phase A scripts cover the common cases (hot daily system backup, scoped tier backups, cold-mode forensic snapshot, scope-aware restore). The Phase B refactor (post-Link-Guard) will add:

- **Retention pruning** (`--retain-last=N` deletes older backups beyond N)
- **Per-tier `--remap-tiers <old>:<new>`** to override individual tiers (today's default is whole-backup auto-remap to this host's paths)
- **Selective container restart** instead of full `compose down` on the restore side (faster restart, smaller blast radius)
- **Filesystem-snapshot integration** (LVM / ZFS / btrfs detection): if a tier lives on a snapshot-capable filesystem, take a filesystem snapshot and tar the snapshot rather than the live mount, for use cases where "best-effort hot tar" isn't good enough but `--cold` is too disruptive
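
Until `--retain-last=N` exists, pruning has to be done by hand. The following is a rough sketch of how candidates could be listed, assuming the `hermes-backup-<scope>-<build_no>-<UTC-timestamp>` naming described earlier and a backup root whose path contains no spaces. `list_prunable` is a hypothetical helper, and it deliberately prints paths without deleting anything.

```bash
#!/usr/bin/env bash
# Print the backup directories of one scope that fall outside the newest
# KEEP, ordered newest first. Sorting uses the trailing UTC timestamp, so
# backups taken under different build numbers still order correctly.
list_prunable() {
  local root=$1 scope=$2 keep=$3 d
  for d in "$root"/hermes-backup-"$scope"-*/; do
    [ -d "$d" ] || continue              # the glob matched nothing
    d=${d%/}
    printf '%s %s\n' "${d##*-}" "$d"     # "<timestamp> <path>"
  done | sort -r | awk -v keep="$keep" 'NR > keep { print $2 }'
}

# Example: review what would go before removing anything.
#   list_prunable /mnt/backups system 7
```

Because in-progress backups sit under a `.staging-` name, the glob never matches a backup that is still being written.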

**Not on the Phase B roadmap** (deliberately dropped):

- **Native Ofelia integration**. Cron is the right tool. Ofelia's job model (`job-exec` into a named container, `job-local` on the Ofelia container) doesn't fit a host-level script cleanly. Forcing it would mean a custom Ofelia image with `docker compose` plugin + Docker socket + root access, plus admin-page UI work to add jobs — all to honor a pattern that doesn't fit. Host cron is the answer.
- **Admin-UI launch button**. Long-running operations + web UIs is a footgun; the admin who runs a backup is already in SSH. The Backup/Restore admin page stays read-only / informational, by design.

Failure / success notification is not a Phase B item: built-in alerting already ships as `--notify-email` (see Failure / success email alerting above). Cron's `MAILTO=` or piping the exit code into existing alerting still works alongside it.

Tracking: [#219](https://github.com/deeztek/Hermes-Secure-Email-Gateway/issues/219) for the backup-side enhancements, [#220](https://github.com/deeztek/Hermes-Secure-Email-Gateway/issues/220) for the restore-side.

## Migrating from a legacy bare-metal install

A separate tool exists at [`scripts/migrate_legacy_to_docker.sh`](https://github.com/deeztek/Hermes-Secure-Email-Gateway/blob/main/scripts/migrate_legacy_to_docker.sh) for operators moving from a legacy bare-metal install to the Docker install. It consumes a backup produced by the **legacy** `system_backup.sh` (which is correct in the bare-metal context where it ran) and restores it into the Docker layout via a translation step — NOT the same as running the legacy restore script directly. See the migration section of the [v260119 release notes](https://github.com/deeztek/Hermes-Secure-Email-Gateway/releases/tag/v260119) for current scope.

## Cross-references

- [Storage Topology](https://docs.deeztek.com/books/installation-reference/page/storage-topology-5-tiers) — the five-tier layout the backup operates on
- [Release & Update Methodology](https://docs.deeztek.com/books/installation-reference/page/release-and-update-methodology) — recommends taking a hypervisor snapshot before running `system_update_docker.sh`
- [scripts/migrate_legacy_to_docker.sh](https://github.com/deeztek/Hermes-Secure-Email-Gateway/blob/main/scripts/migrate_legacy_to_docker.sh) — separate from backup/restore; for one-time bare-metal-to-Docker migration only

# Console Settings

Admin path: **System > Console Settings** (`view_console_settings.cfm`,
`inc/get_console_settings.cfm`, `inc/edit_console_settings.cfm`,
`inc/generate_auth_nginx_configuration.cfm`,
`inc/generate_nginx_configuration.cfm`,
`inc/generate_authelia_configuration.cfm`,
`inc/generate_nextcloud_configuration.cfm`,
`inc/edit_ciphermail_settings.cfm`, `preload_restart_nginx.cfm`).

This page configures **how the outside world reaches the Hermes web
console** — the FQDN or IP that nginx terminates TLS on, the
certificate it presents, and three HTTPS hardening toggles (HSTS, OCSP
stapling, OCSP stapling verify). It is the single source of truth for
the console hostname; every other component that needs to know "where
do I live" (Authelia session cookie, Nextcloud trusted domain and
theming URL, the User Console link in Nextcloud's top bar, Ciphermail
portal redirect URL, OIDC discovery URI) is regenerated from this page
when the Console Address changes.

Pairs with [Server Setup](https://docs.deeztek.com/books/administrator-guide/page/server-setup), which configures the
gateway's mail-side identity (Postfix `myorigin` / `myhostname` and the
host IP). The two pages together define every name Hermes presents to
the world.

## Where the console host fits

```
Browser  ──►  hermes_nginx (443)
                  │ server_name = <Console Address>
                  │ ssl_certificate = <Console Certificate>
                  ▼
              auth_request /authelia
                  │
                  ▼
              hermes_authelia
                  │ session.cookies[].domain = <Console Address>
                  │ authelia_url             = https://<Console Address>/authelia
                  ▼
              hermes_commandbox (admin + user portal)
              hermes_nextcloud  (NC trusted_domain + theming URL +
                                 user_oidc discovery URI +
                                 External Sites "User Console" link)
              hermes_ciphermail (portal URL = Console Address)
```

Every one of those downstream consumers is rewritten from the value
saved on this page. Direct edits to `auth.conf`, `hermes-ssl.conf`,
`configuration.yml`, Nextcloud's `config.php`, the Ciphermail portal
URL, or OIDC discovery are **overwritten on the next save**.

## Configuration storage

The Console Address, the certificate binding, the three hardening
toggles, and the inert DH-parameters row all live in the `parameters2`
table with `module = 'console'`. The page is wired strictly against that
table: there are no file-backed secrets here, only DB values.

| Setting | `parameters2.parameter` | Default |
|---|---|---|
| Console Address (IP or FQDN) | `console.host` | `smtp.domain.tld` (seed) |
| Console Certificate (FK into `system_certificates.id`) | `console.certificate` | `29` (seed snakeoil) |
| DH parameters | `console.dhparam` | `enable` |
| HSTS | `console.hsts` | `enable` |
| OCSP Stapling | `console.ssl_stapling` | `enable` |
| OCSP Stapling Verify | `console.ssl_stapling_verify` | `enable` |

> **DH parameters note.** The `console.dhparam` row is still in the
> schema and still set by the form handler when a DH file exists, but
> commit `2dbc2bd3` ("ECDHE-only ciphers, remove DH parameters
> feature") moved the active TLS cipher suite to ECDHE-only — DH is
> no longer offered. The setting is therefore inert; leave it at the
> default.

## Fields on the page

### Console Address (IP or FQDN)

The hostname or IP nginx terminates TLS on for `/admin`, `/users`,
`/nc`, `/portal` (Ciphermail), and every other console-served path.
Accepts:

- **IPv4** — validated against the standard dotted-quad regex
- **IPv6** — validated against the bracketed/colon form
- **FQDN** — validated by the email-trick (`IsValid("email",
  "bob@<host>")`)

`edit_console_settings.cfm` trims whitespace and strips any trailing
zone-file dots (`mail.example.com.` becomes `mail.example.com`) before
saving. That stripping happens at the input boundary so every
downstream consumer — `autoconfig.cfm`, `autodiscover.cfm`, nginx vhost
generation, the NC theming URL, the OIDC discovery URI — sees an
identical canonical string. Outlook for Mac is one of several MUAs that
break on the trailing dot, hence the strip.
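
The normalization itself is small. The real handler is CFML; the shell function below is only an equivalent sketch of the trim-then-strip behavior described above, and the name `canonical_host` is made up for the example.

```bash
#!/usr/bin/env bash
# Trim surrounding whitespace, then strip every trailing dot.
canonical_host() {
  local h=$1
  h=${h#"${h%%[![:space:]]*}"}   # drop leading whitespace
  h=${h%"${h##*[![:space:]]}"}   # drop trailing whitespace
  while [ "${h%.}" != "$h" ]; do
    h=${h%.}                     # remove one trailing dot per pass
  done
  printf '%s\n' "$h"
}

# Example:
#   canonical_host ' mail.example.com. '
```

Trimming before stripping matters: a value pasted as `mail.example.com. ` (dot, then space) would otherwise keep its dot.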

> **If you set Console Address to an IP** and then the server's IP
> changes, you must update **both** Console Address (this page) **and**
> Host IP Address (Server Setup) — they are stored in separate
> parameters and neither cascades to the other. The page surfaces this
> in a warning callout.

### Console Certificate

Free-text autocomplete that searches `system_certificates` via
`getcertificates.cfm` (an ajax endpoint). Selecting a row populates a
hidden `certificateno_1` field with the certificate's row ID, plus
five read-only display fields (subject, issuer, serial, type, friendly
name). The handler validates the ID exists in `system_certificates`
before saving — an empty or unknown ID falls through to the next step
with `step = 3`, which means the existing `console.certificate` value
is preserved.

The selected cert becomes `nginx`'s `ssl_certificate` /
`ssl_certificate_key` for every console-facing vhost. Certificate
upload, renewal, and Let's Encrypt are managed on
[System Certificates](https://docs.deeztek.com/books/administrator-guide/page/system-certificates); this page is the
**binding** of one of those certificates to the console hostname.

### HSTS, OCSP Stapling, OCSP Stapling Verify

Three boolean (`enable` / `disable`) selects. Each is substituted into
`/opt/hermes/templates/hermes-ssl.conf` at regen time:

| Toggle | Effect on the generated `hermes-ssl.conf` |
|---|---|
| HSTS | `add_header Strict-Transport-Security "max-age=31536000; preload"` (enabled) vs. the same line commented out (disabled) |
| OCSP Stapling | `ssl_stapling on;` (enabled) vs. `#ssl_stapling on;` (disabled) |
| OCSP Stapling Verify | `ssl_stapling_verify on;` (enabled) vs. `#ssl_stapling_verify on;` (disabled) |

Defaults are all `enable` and should stay that way for any
publicly-reachable console. Disable only if you have a specific reason
(e.g., HSTS preload conflict during a hostname migration window).
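
Each toggle boils down to emitting a directive either live or commented out. A minimal shell sketch of that substitution follows, assuming the `enable` / `disable` values from the table; `render_toggle` is a hypothetical name, and the actual regeneration is done by the CFML generator against the template.

```bash
#!/usr/bin/env bash
# Print an nginx directive as-is when the setting is "enable",
# and prefixed with "#" for any other value.
render_toggle() {
  local setting=$1 directive=$2
  if [ "$setting" = "enable" ]; then
    printf '%s\n' "$directive"
  else
    printf '#%s\n' "$directive"
  fi
}

# Example:
#   render_toggle disable 'ssl_stapling on;'
```

Treating anything other than `enable` as disabled is a choice made for this sketch; it fails closed on an unexpected value.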

## Save flow — the cascade

Clicking **Save & Apply Settings** posts `action=edit`, which runs
`edit_console_settings.cfm` as a strict 7-step sequence. Each step gates
on the previous step's success (`<cfif step is "N">`) — any
validation failure short-circuits with `cflocation
url="#cgi.http_referer#"` and `session.m` set to the matching alert
code; no partial state lands.

```
step 1  Validate + write console.host
step 2  Validate + write console.certificate
step 3  (DH param — inert)              ──► step 4
step 4  Write console.hsts
step 5  Write console.ssl_stapling
step 6  Write console.ssl_stapling_verify
step 7  Regen + restart cascade  ──┐
                                    │
        generate_auth_nginx_configuration.cfm   (rewrites snippets/auth.conf)
        generate_nginx_configuration.cfm        (rewrites snippets/hermes-ssl.conf)
        generate_authelia_configuration.cfm     (rewrites authelia/configuration.yml)
        generate_nextcloud_configuration.cfm    (rewrites nc/config.php trusted_domains)
        occ user_oidc:provider Hermes_SEG       (discovery URI + end-session URI)
        occ config:app:set external sites       (NC top-bar "User Console" link JSON)
        occ theming:config url                  (NC theming URL)
        edit_ciphermail_settings.cfm            (Ciphermail portal URL)
        restart_authelia.cfm                    (preload-style restart)
        restart_ciphermail.cfm                  (preload-style restart)
        preload_restart_nginx.cfm               (last — see below)
```
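
The step gating described before the diagram can be pictured as a loop that only advances on success. The Python sketch below is a hypothetical analogue, not the CFML handler: `run_cascade`, the `Step` tuple, and the alert codes passed in are illustrative names.

```python
from typing import Callable, NamedTuple, Optional


class Step(NamedTuple):
    name: str
    action: Callable[[], bool]  # returns True on success
    alert_code: int             # value that would land in session.m on failure


def run_cascade(steps: list[Step]) -> tuple[list[str], Optional[int]]:
    """Run steps in order and stop at the first failure.

    Returns (names of completed steps, alert code or None if all succeeded).
    """
    completed: list[str] = []
    for step in steps:
        if not step.action():
            # Analogue of the cflocation redirect: nothing after this runs.
            return completed, step.alert_code
        completed.append(step.name)
    return completed, None
```

Returning the list of completed steps alongside the alert code makes it obvious which writes happened before the short-circuit.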

`preload_restart_nginx.cfm` is the canonical Hermes pattern for
restarting the proxy from inside a request that is **served by the
proxy**. A plain `docker container restart hermes_nginx` would close
the request's own connection and the browser would see
`ERR_CONNECTION_REFUSED` on the redirect back. The preload page returns
a full HTML response that includes a `fetch()` to a separate
`restart_nginx_post.cfm` endpoint and a poll-loop that waits for nginx
to come back before redirecting to `view_console_settings.cfm`. Always
use this pattern from any handler that ends in an nginx restart.
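
The poll-loop half of the pattern amounts to "probe until healthy or give up". The real page does this in the browser with `fetch()`; the Python sketch below only illustrates the control flow. `wait_until_up`, the probe callable, and the timing defaults are assumptions, not values taken from `preload_restart_nginx.cfm`.

```python
import time
from typing import Callable


def wait_until_up(probe: Callable[[], int],
                  attempts: int = 30,
                  delay_s: float = 2.0,
                  sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call probe() until it returns HTTP 200 or the attempts run out.

    probe() should return a status code, or raise OSError while the proxy
    is still down (connection refused, reset, and so on).
    """
    for attempt in range(attempts):
        try:
            if probe() == 200:
                return True
        except OSError:
            pass  # proxy is not accepting connections yet
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

Injecting `sleep` keeps the loop testable without real delays; a caller would redirect only when the function returns `True`.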

The three Nextcloud `occ` calls in step 7 (`user_oidc:provider`,
`config:app:set external sites`, and `theming:config url`) are all
wrapped in `<cftry>...<cfcatch type="any"></cfcatch></cftry>` and marked
**non-fatal** in the comments. A Nextcloud container that is down or
slow at the moment of save leaves the NC-side values stale; they catch
up on the next save (or a manual `occ` invocation).

> **By design.** The cascade is destructive — there is no dry-run, no
> diff preview, no "stage changes." Saving rewrites all four config
> files and restarts three containers. Plan saves outside business
> hours if the deployment is busy.

## Operational consequence — changing the Console Address mid-flight

A live Console Address change is the single most disruptive operation
on this page. While the cascade runs (typically 30–60 seconds end to
end including container restarts):

| Surface | Behavior during the change |
|---|---|
| The admin page that initiated the change | Held by `preload_restart_nginx.cfm` until nginx returns 200 on `/index.cfm`, then redirected back |
| Other open admin sessions | Will see `502 Bad Gateway` for the nginx restart window; their session cookie is also now scoped to the **old** hostname and they will be re-prompted to log in after they reload at the new address |
| User portal / Nextcloud / Webmail sessions | Same — all session cookies are domain-scoped; users at the old hostname must navigate to the new one and re-authenticate |
| Mail flow (SMTP/IMAP/Submission) | **Unaffected.** Postfix and Dovecot do not depend on the console nginx vhost. |
| Outbound DKIM signing | Unaffected. |
| Webmail OIDC | The discovery URI is rewritten by the `occ user_oidc:provider` call in step 7. The change takes effect once Nextcloud picks up the new `occ user_oidc` settings, which in practice is instant because `occ` writes synchronously |

If the new Console Address has no DNS record yet, the change still
saves (Hermes does not DNS-resolve the value) but every external
client request will fail until DNS catches up.

## Bypassing this page — risks

There are two other paths that can change the console hostname or
hostname-derived values **without** going through this cascade, plus a
planned third that does not exist yet. Each of the two existing paths
leaves Hermes in an inconsistent state. Do not use them unless you are
recovering from a broken cascade and you know what you are doing.

1. **Direct edit of `parameters2`** — sets `console.host` but does not
   regenerate `auth.conf`, `hermes-ssl.conf`, `configuration.yml`,
   `config.php`, theming, External Sites, OIDC, or Ciphermail.
2. **Direct edit of `config/nginx/.../snippets/*.conf` or
   `config/authelia/configuration.yml`** — the next save on this page
   overwrites your hand-edits.
3. **A future Hermes CLI Management Console** (`scripts/hermes-cli.sh`)
   is planned but not yet built. It will expose Change Console Host as
   a menu option so admins have a recovery path when a bad Console
   Address change has locked them out of the web UI. Until it ships,
   the only recovery is direct SQL + manual regen-script invocations
   against the `hermes_commandbox` container.

## Per-domain nginx vhosts are NOT regenerated by this page

This page rewrites `snippets/auth.conf` and `snippets/hermes-ssl.conf`
— the global console snippets. **Per-domain vhosts** generated for
mailbox domains, autodiscover, autoconfig, and any other
domain-scoped surface live in separate templates and are rendered on
their own pages (Mailboxes > Domains, mostly).

If you edit one of those per-domain templates by hand and expect
already-generated vhosts to pick it up, they will not. Either re-render
each affected domain from its own UI, or run the appropriate
domain-regen include directly. The same rule applies in reverse: a
console hostname change does **not** rewrite per-domain server blocks
that were generated before the change. Most installs do not need that
rewrite, because per-domain vhosts use the domain hostname, not the
console hostname. If a per-domain vhost was wired to the console
hostname through manual customisation, re-render it.

## Failure semantics

| What breaks | What happens |
|---|---|
| Console Address validation fails (invalid IPv4/IPv6/FQDN) | `session.m = 3`, redirect, no DB write |
| Console Certificate ID not found in `system_certificates` | `session.m = 2`, redirect, no DB write |
| Nginx config syntax error after template substitution | `nginx -t` fails inside `restart_nginx_post.cfm`; the previous live config stays loaded (nginx never gets the `reload`), but the on-disk file is the broken one. Recovery: fix the template, re-save. |
| Authelia container fails to start after `configuration.yml` regen | See [Authentication Settings § Failure semantics](https://docs.deeztek.com/books/administrator-guide/page/authentication-settings#failure-semantics). The `restart_authelia.cfm` output is logged but not surfaced in the success banner. |
| Nextcloud `occ` calls error out | Logged silently (cftry wrapping); next save retries. |
| Ciphermail not running | The portal URL stays out of sync; next save catches up after the container is back. |
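
The first row of the table hinges on deciding whether a Console Address is a valid IPv4 address, IPv6 address, or FQDN. A minimal Python sketch of such a check follows; `classify_console_address` and its exact rules (at least two labels, non-numeric final label) are assumptions and may be stricter or looser than the CFML validator.

```python
import ipaddress
import re

# One DNS label: 1-63 chars, no leading or trailing hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def classify_console_address(value: str) -> str:
    """Return 'ipv4', 'ipv6', or 'fqdn'; raise ValueError otherwise."""
    value = value.strip()
    try:
        return "ipv4" if ipaddress.ip_address(value).version == 4 else "ipv6"
    except ValueError:
        pass  # not an IP literal, fall through to the hostname check
    labels = value.rstrip(".").split(".")
    # Two or more labels and a non-numeric last label, so '999.1.1.1' is rejected.
    if (len(value) <= 253 and len(labels) >= 2
            and all(_LABEL.match(label) for label in labels)
            and not labels[-1].isdigit()):
        return "fqdn"
    raise ValueError(f"not a valid IPv4, IPv6, or FQDN: {value!r}")
```

Trying the IP parse first matters: a malformed dotted quad would otherwise pass a naive hostname pattern.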

## Files and containers touched

| Path | Owner | Role |
|---|---|---|
| `config/hermes/var/www/html/admin/2/view_console_settings.cfm` | `hermes_commandbox` | Page |
| `config/hermes/var/www/html/admin/2/inc/edit_console_settings.cfm` | `hermes_commandbox` | Save handler (7-step cascade) |
| `config/hermes/var/www/html/admin/2/inc/get_console_settings.cfm` | `hermes_commandbox` | Load handler |
| `config/hermes/var/www/html/admin/2/preload_restart_nginx.cfm` | `hermes_commandbox` | Restart-and-redirect overlay |
| `config/hermes/opt/hermes/templates/hermes-ssl.conf` | `hermes_commandbox` | Console nginx server-block template |
| `config/hermes/opt/hermes/templates/auth.conf` | `hermes_commandbox` | Console auth_request snippet template |
| `config/hermes/opt/hermes/templates/configuration.yml` | `hermes_commandbox` | Authelia config template |
| `/etc/nginx/snippets/hermes-ssl.conf` | `hermes_nginx` | Live console TLS / hardening snippet (regen target) |
| `/etc/nginx/snippets/auth.conf` | `hermes_nginx` | Live console auth_request snippet (regen target) |
| `/config/configuration.yml` | `hermes_authelia` | Live Authelia config (regen target) |
| `/var/www/html/config/config.php` | `hermes_nextcloud` | Live NC config — `trusted_domains` updated (regen target) |
| `oc_appconfig` (appid `external`, configkey `sites`) | `hermes_nextcloud` MariaDB | Top-bar User Console link JSON blob |
| `oc_appconfig` (appid `theming`, configkey `url`) | `hermes_nextcloud` MariaDB | NC theming URL |
| `user_oidc` provider `Hermes_SEG` | `hermes_nextcloud` | OIDC discovery + end-session URIs |

Every cross-container call uses `docker exec` per the standard Hermes
pattern. The temp-shell-script convention (`/opt/hermes/tmp/<token>_*.sh`)
is used for the External Sites `occ` call because the JSON value has
quoting/escaping that `cfexecute`'s `arguments` parsing handles
unreliably; writing a small shell script and executing it instead of
passing the JSON inline avoids that whole class of bug.
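
The temp-script idea can be sketched in Python as follows. This is an illustration of the quoting technique only: `write_occ_script`, the `php occ` invocation, and the file suffix are assumptions, and the real handler is CFML writing under `/opt/hermes/tmp/`.

```python
import json
import os
import shlex
import tempfile


def write_occ_script(sites: dict, directory: str) -> str:
    """Write a one-shot shell script that passes JSON to occ as one argument."""
    payload = json.dumps(sites, separators=(",", ":"))
    command = " ".join([
        # Container name is from this page; the `php occ` form is assumed.
        "docker", "exec", "hermes_nextcloud", "php", "occ",
        "config:app:set", "external", "sites",
        "--value=" + shlex.quote(payload),  # quoting handled once, by shlex
    ])
    fd, path = tempfile.mkstemp(suffix="_occ.sh", dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write("#!/bin/sh\n" + command + "\n")
    os.chmod(path, 0o700)  # executable by the owner only
    return path
```

Because the JSON is quoted once and then read back by the shell from a file, embedded double quotes and apostrophes survive unchanged, which is the class of bug the convention is meant to avoid.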

## Related

- [Server Setup](https://docs.deeztek.com/books/administrator-guide/page/server-setup) — the mail-side server identity (Postfix `myorigin` / `myhostname`, Host IP). Companion to this page; the two together define every name Hermes presents.
- [System Certificates](https://docs.deeztek.com/books/administrator-guide/page/system-certificates) — uploading, renewing, and managing the certificates this page selects from
- [Authentication Settings](https://docs.deeztek.com/books/administrator-guide/page/authentication-settings) — Authelia configuration; this page rewrites its config file as part of every save
- [SMTP TLS Settings](https://docs.deeztek.com/books/administrator-guide/page/smtp-tls-settings) — the mail-side TLS certificate binding, the analogue of "Console Certificate" for SMTP banners
- [DNS Resolver](https://docs.deeztek.com/books/administrator-guide/page/dns-resolver) — if the Console Address is an internal-only FQDN, this page's resolver settings determine whether other Hermes containers can reach it

# DNS Resolver

Admin path: **System > DNS Resolver** (`view_dns_resolver.cfm`,
`inc/dns_resolver_action.cfm`, `inc/generate_unbound_forward_conf.cfm`,
`inc/generate_unbound_local_conf.cfm`).

Hermes ships its own **recursive caching DNS resolver**: a
`hermes_unbound` container running stock Unbound, fronted by an admin UI that lets an operator
toggle recursive vs. forwarding mode, manage upstream forwarders, add
local-zone overrides for split-horizon hostnames, inspect cache
statistics, and run ad-hoc lookups. Every other Hermes container points
its `dns:` at `hermes_unbound` (`${IPV4SUBNET}.117`) rather than the
host's resolver, so RBL/DNSBL lookups, MX resolution, ARC verification,
Postfix recipient validation, and OIDC discovery all flow through this
single resolver.

## Why Hermes runs its own resolver

A mail gateway has DNS requirements that a stock host resolver does not
meet:

| Requirement | Why a shared resolver fails | What Unbound gives Hermes |
|---|---|---|
| RBL/DNSBL queries from a low-volume IP | Public resolvers (Cloudflare, Google, Quad9) issue thousands of queries per second on behalf of many tenants. RBL providers throttle or refuse responses to those shared IPs. | Recursive mode queries the authoritative servers directly from the gateway's own IP — well under any per-source rate limit. |
| Deterministic resolution path for DKIM / DMARC / ARC | A flaky host resolver causes intermittent `TEMPFAIL` on DNS-dependent auth | Unbound's cache survives container restarts of the consumers, and its TTLs are tuned for mail traffic |
| Split-horizon DNS (internal AD hostnames) | The host's `/etc/resolv.conf` typically points at public DNS — internal-only names fail | The Local DNS Overrides table writes `local-data` entries that Unbound returns for any container that asks |
| DNSSEC validation across the stack | Trust depends on every container running its own validator (rarely the case) | Unbound validates once; consumers get verified answers automatically |

The container itself is custom-built (Hermes-published image at
`ghcr.io/deeztek/hermes-unbound`) but the configuration is plain
Unbound — there is no Hermes-specific patching at the daemon level.

## How DNS flows through the stack

```
+-------------------+   +-------------------+   +-------------------+
| hermes_postfix    |   | hermes_mail_filter|   | hermes_ldap       |
| (RBL, MX lookups) |   | (SpamAssassin)    |   | (RemoteAuth bind) |
+---------+---------+   +---------+---------+   +---------+---------+
          |                       |                       |
          | dns: 172.16.32.117    |                       |
          v                       v                       v
+-----------------------------------------------------------------+
|  hermes_unbound  (.117 on hermes_net_ext, port 53/udp + 53/tcp) |
|                                                                  |
|   /etc/unbound/unbound.conf       <-- baseline (read-only mount) |
|   /etc/unbound/conf.d/forward.conf <-- generated from DB         |
|   /etc/unbound/conf.d/local.conf   <-- generated from DB         |
|                                                                  |
|   Forwarding mode? ----yes----> upstream forwarders (1.1.1.1 ...)|
|                  ----no -----> root hints, full recursion        |
+-----------------------------------------------------------------+
                              |
                              v
                   Authoritative DNS / Forwarders
```

Every container declares `dns: ${IPV4SUBNET}.117` in
[`docker-compose.yml`](https://github.com/deeztek/Hermes-Secure-Email-Gateway/blob/main/docker-compose.yml) so its `/etc/resolv.conf`
points at the Unbound container regardless of the host's resolver
configuration. The host itself is unaffected.

## Configuration storage

Forwarding mode and the forwarders/local-records lists live in three
places:

| Setting | Storage | Notes |
|---|---|---|
| Forwarding mode | `parameters2.module = 'unbound', parameter = 'forwarding.enabled'` | `yes` or `no` |
| Upstream forwarders | `dns_forwarders` table | `server`, `port`, `tls`, `enabled`, `sort_order`; seeded with Cloudflare (1.1.1.1 / 1.0.0.1) + Google (8.8.8.8 / 8.8.4.4) |
| Local DNS overrides | `dns_local_records` table | `hostname`, `record_type` (A/AAAA/CNAME/MX/TXT/PTR), `value`, `enabled`, `description`; UNIQUE on `(hostname, record_type)` |

The baseline `unbound.conf` (cache sizes, DNSSEC trust anchor, num-threads,
access-control for the Docker subnets) ships as a read-only mount and is
not editable from this page. To change those, edit
`config/unbound/unbound.conf` directly and restart the container.

## Recursive vs. forwarding mode

The default is **Recursive** and the in-page callout pushes hard against
flipping it. The reasoning is operational, not philosophical:

> **Forwarding through public resolvers will cause RBL/DNSBL lookup
> failures.** When queries are forwarded through Cloudflare / Google /
> Quad9, your blocklist lookups originate from their shared IP
> addresses. RBL providers throttle or block these IPs because thousands
> of other customers are making the same queries from the same
> resolvers. With recursive resolution, queries come from your server's
> own IP, keeping you well under per-source rate limits.

Forwarding is still useful in a few specific cases:

- Egress-restricted networks where outbound port 53 to arbitrary
  authoritative servers is blocked but a known forwarder is allowed
- Compliance requirements forcing all DNS through a logged corporate
  resolver
- DNS-over-TLS to a specific provider (set `tls = yes` and `port = 853`
  on each forwarder)

In any of those cases, configure forwarders that you control or that
have a known per-customer SLA. Public flat-rate resolvers cause RBL
breakage that surfaces days later as inflated spam scores.

## The four cards on the page

### 1. DNS Resolver Status

Shows container state (`running`, `exited`, or `error`) via
`docker inspect --format='{{.State.Status}}|{{.State.StartedAt}}'`,
computes the uptime in days/hours/minutes (the StartedAt timestamp is
UTC; the page converts before diffing — earlier versions had a tz-drift
bug, see commit `644d56b1`), and exposes a **Restart Unbound** button.
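
The uptime arithmetic is simple once both timestamps are in UTC. The sketch below assumes the `status|StartedAt` output shape produced by the `docker inspect` format string above; `parse_inspect` is a hypothetical helper, not the page's CFML.

```python
import re
from datetime import datetime, timezone


def parse_inspect(line: str, now: datetime) -> tuple[str, str]:
    """Split 'status|StartedAt' and return (status, 'Xd Yh Zm' uptime)."""
    status, started_raw = line.strip().strip("'").split("|", 1)
    # Docker prints nanoseconds; keep whole seconds only and treat them as UTC.
    match = re.match(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})", started_raw)
    if match is None:
        raise ValueError(f"unexpected StartedAt value: {started_raw!r}")
    started = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    started = started.replace(tzinfo=timezone.utc)
    total_minutes = max(0, int((now - started).total_seconds()) // 60)
    days, rem = divmod(total_minutes, 1440)
    hours, minutes = divmod(rem, 60)
    return status, f"{days}d {hours}h {minutes}m"
```

Passing a timezone-aware `now` (for example `datetime.now(timezone.utc)`) is what prevents the kind of drift a local-time comparison would introduce.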

> **Restarts are mail-safe.** Restarting `hermes_unbound` typically takes
> 1–3 seconds. During that window, consumer containers retry their
> queries; Postfix, Amavis, and Dovecot all tolerate a brief DNS outage
> without losing mail. Plan restarts freely; you do not need an outage
> window.

### 2. DNS Forwarding

Two sub-controls. The **DNS Resolution Mode** select (recursive vs.
forwarding) writes `parameters2.unbound.forwarding.enabled` and
regenerates `forward.conf`. The **Upstream Forwarders** table is the
working set used when forwarding is enabled — fields are Server IP, Port
(default `853` for DoT, `53` for plain), TLS (yes/no), and per-row
enable/disable + delete.

The two-step "edit then Apply" model is deliberate: adding, deleting, or
toggling a forwarder marks the change pending (the page banner shifts to
amber) but does **not** restart Unbound. Click **Apply & Restart
Unbound** to regenerate `forward.conf` and bounce the container in one
shot. This lets an admin batch a multi-row change without triggering
multiple restarts.
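
To make the regeneration step concrete, here is a hedged Python sketch of how rows shaped like the `dns_forwarders` table could become an Unbound `forward-zone` stanza. `Forwarder` and `render_forward_conf` are hypothetical names, and the rule for when to emit `forward-tls-upstream` is an assumption; the actual output of `generate_unbound_forward_conf.cfm` is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Forwarder:
    server: str
    port: int
    tls: bool
    enabled: bool
    sort_order: int


def render_forward_conf(forwarding_enabled: bool, rows: list[Forwarder]) -> str:
    """Render a catch-all forward-zone, or a comment-only file in recursive mode."""
    active = sorted((r for r in rows if r.enabled), key=lambda r: r.sort_order)
    if not forwarding_enabled or not active:
        return "# recursive mode: no forwarders configured\n"
    # forward-tls-upstream is a per-zone switch in Unbound, so this sketch
    # enables it only when every active forwarder is marked TLS (assumption).
    use_tls = all(r.tls for r in active)
    lines = ["forward-zone:", '    name: "."']
    if use_tls:
        lines.append("    forward-tls-upstream: yes")
    lines += [f"    forward-addr: {r.server}@{r.port}" for r in active]
    return "\n".join(lines) + "\n"
```

Rendering the whole file from the table on every Apply, instead of patching it, is what lets several pending row edits collapse into one restart.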

### 3. Local DNS Overrides

A static-entries table that becomes `local-data` lines in
`/etc/unbound/conf.d/local.conf`. The same two-step
edit-then-Apply model applies.

This is the single most operationally important card on the page. Two
canonical use cases:

| Scenario | What to add |
|---|---|
| LDAP RemoteAuth against an internal AD DC (`dc01.corp.example.com`) that is not publicly resolvable | `dc01.corp.example.com` → `10.0.0.10` (A record). See [LDAP RemoteAuth § DNS resolution prerequisite](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth#dns-resolution-prerequisite). |
| Split-horizon: the Console Address resolves externally but you want internal containers to skip the public lookup | `console.example.com` → `192.168.1.10` (A record) |

The generator groups records by their second-level zone and emits a
single `local-zone: "<zone>." transparent` declaration before the
`local-data` lines. `transparent` means Unbound answers the configured
hostnames locally but **resolves every other name in the same zone**
upstream as normal. This is the right choice for split-horizon: an
override for `dc01.example.com` does not break public lookups for
`www.example.com` against the same zone. One caveat: a query for an
overridden hostname with a record type that has no override (for
example AAAA when only an A record is configured) returns an empty
NODATA answer instead of going upstream.
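
The zone grouping can be sketched in Python as below. The helper names are hypothetical, the "last two labels" rule is a naive stand-in for whatever the generator really does, and the enclosing `server:` clause is omitted because the include point is not shown on this page. Only simple record values such as A records are handled.

```python
from collections import defaultdict


def second_level_zone(hostname: str) -> str:
    """Naive zone guess: the last two labels (ignores suffixes like co.uk)."""
    labels = hostname.rstrip(".").lower().split(".")
    return ".".join(labels[-2:]) + "."


def render_local_conf(records: list[tuple[str, str, str]]) -> str:
    """records: (hostname, record_type, value) tuples for enabled rows."""
    by_zone: dict[str, list[str]] = defaultdict(list)
    for hostname, rtype, value in records:
        fqdn = hostname.rstrip(".").lower() + "."
        by_zone[second_level_zone(hostname)].append(
            f'local-data: "{fqdn} IN {rtype.upper()} {value}"')
    lines: list[str] = []
    for zone in sorted(by_zone):
        # One transparent zone declaration, then every override inside it.
        lines.append(f'local-zone: "{zone}" transparent')
        lines.extend(by_zone[zone])
    return "\n".join(lines) + "\n"
```

With the two rows from the scenario table, both overrides land under a single `example.com.` zone declaration.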

> **Operational consequence.** A misconfigured override can shadow a
> public hostname. Hermes resolves what you write — if you point
> `mail.example.com` at the wrong internal IP, every container that asks
> for that name will get the wrong answer. Test with the **DNS Lookup
> Test** card (below) before relying on the entry in production.

### 4. DNSSEC, Cache Statistics, DNS Lookup Test

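Three utility cards. None of them changes stored configuration; **Flush Cache** is the only action here that alters resolver state.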

| Card | What it shows / does |
|---|---|
| **DNSSEC** | Parses the live `unbound.conf` inside the container; reports Enabled / Disabled based on `auto-trust-anchor-file` / `trust-anchor-file` / `module-config: validator` presence. **Test DNSSEC** runs `drill -D example.com` and dumps the response. DNSSEC is enabled in the shipped baseline; this card is informational. |
| **Cache Statistics** | Runs `unbound-control stats_noreset` and parses `total.num.queries`, `cachehits`, `cachemiss`, `prefetch`, plus RRset/message cache counts and average recursion time. Useful for diagnosing cold-cache latency after a restart. **Flush Cache** clears the entire cache (`unbound-control flush_zone .`), typically after an upstream DNS record change when you do not want to wait for the TTL to expire. |
| **DNS Lookup Test** | Runs `drill @127.0.0.1 <TYPE> <name>` inside the container. Supports A / AAAA / MX / TXT / NS / SOA / PTR. Input is validated to `[a-zA-Z0-9.\-]+` before being passed to the shell. This is the right tool to verify a local override actually took effect. |
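
The input validation for the lookup card can be sketched as follows. The character class and the list of record types come from the table above; `build_drill_argv`, the length limit, and the leading-hyphen check are extra precautions assumed for this sketch, not documented behavior.

```python
import re

ALLOWED_TYPES = {"A", "AAAA", "MX", "TXT", "NS", "SOA", "PTR"}
NAME_RE = re.compile(r"[a-zA-Z0-9.\-]+")


def build_drill_argv(record_type: str, name: str) -> list[str]:
    """Return the argv for `drill @127.0.0.1 <TYPE> <name>` or raise ValueError."""
    rtype = record_type.strip().upper()
    if rtype not in ALLOWED_TYPES:
        raise ValueError(f"unsupported record type: {record_type!r}")
    # A leading hyphen would look like a command-line option to drill.
    if name.startswith("-") or len(name) > 253 or not NAME_RE.fullmatch(name):
        raise ValueError(f"invalid lookup name: {name!r}")
    # A list argv is never parsed by a shell, so metacharacters stay inert
    # even if the pattern above were loosened later.
    return ["drill", "@127.0.0.1", rtype, name]
```

Using `fullmatch` is the important detail: a plain `match` or `search` would accept a valid prefix followed by shell metacharacters.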

## Apply flow

A single Save / Apply click runs roughly this:

```
1. Validate input (IP octets in range, port 1-65535, hostname charset, ...)
2. UPDATE or INSERT INTO parameters2 / dns_forwarders / dns_local_records
3. cfinclude generate_unbound_forward_conf.cfm   (or _local_conf.cfm)
        - Read the table back
        - Render the conf into chr(10)-newline plain text
        - fileWrite("/etc/unbound/conf.d/forward.conf", ..., "utf-8")
4. cfexecute /usr/local/bin/docker container restart hermes_unbound
        (30s timeout; typically returns in 1-3s)
5. cflocation back to view_dns_resolver.cfm with session.m set
```

The generated `conf.d/*.conf` files are written via Lucee `fileWrite`
into the `hermes_commandbox`-side bind-mount of `config/unbound/conf.d/`
— the same directory `hermes_unbound` reads on restart. There is no
`docker cp` step; both containers see the same files because they share
the bind mount (commit `06acd4e1` switched away from the legacy `docker
cp` pattern).

## Cache TTL behavior

The baseline `unbound.conf` sets:

| Knob | Value | Why |
|---|---|---|
| `cache-min-ttl: 300` | 5 minutes | Floor — protects against authoritative servers that publish ultra-short TTLs |
| `cache-max-ttl: 86400` | 24 hours | Ceiling |
| `cache-max-negative-ttl: 900` | 15 minutes | Ceiling on negative (NXDOMAIN) caching. Important for DNSBL lookups, where NXDOMAIN is the normal "not listed" answer and a stale one delays seeing a new listing |
| `prefetch: yes` | — | Refreshes hot records before TTL expiry so cache misses are rare |
| `qname-minimisation: yes` | — | Privacy + reduces authoritative-server query volume |

After a record change you depend on (e.g., updating an MX record at the
registrar), use **Flush Cache** to skip the wait.
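
The effect of the floor and ceilings on a published TTL can be approximated with a small clamp. This Python sketch uses the three values from the table; `effective_ttl` is a hypothetical helper, and treating negative answers as capped only from above is a simplifying assumption, not a statement about Unbound internals.

```python
CACHE_MIN_TTL = 300           # cache-min-ttl
CACHE_MAX_TTL = 86400         # cache-max-ttl
CACHE_MAX_NEGATIVE_TTL = 900  # cache-max-negative-ttl


def effective_ttl(published_ttl: int, negative: bool = False) -> int:
    """Approximate how long an answer stays cached under the baseline settings."""
    if negative:
        # Simplification: only the negative ceiling is applied here.
        return min(published_ttl, CACHE_MAX_NEGATIVE_TTL)
    return max(CACHE_MIN_TTL, min(published_ttl, CACHE_MAX_TTL))
```

A record published with a 30-second TTL is therefore held for five minutes, which is why a flush is the quickest way to see a change you have just made.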

## Failure semantics

| What breaks | What happens |
|---|---|
| `dns_forwarders.server` validation fails (non-IPv4, octet > 255, port out of range) | `session.m = 11`, redirect, no DB write. Error text in the alert. |
| `dns_local_records.hostname` empty or invalid record type | Same — `session.m = 11` with specific error text. |
| `fileWrite` to `conf.d/` fails | `session.m = 10`, error surfaces. The container is **not** restarted; the previous `.conf` stays live. |
| Container restart times out (30s) | `session.m = 10`. The restart was issued but did not complete in band; check `docker ps` and `docker logs hermes_unbound` manually. |
| `unbound-control` not available | Cache Statistics card shows "not available" message; the daemon itself is unaffected. |
| `drill` returns SERVFAIL for a DNSSEC test | Surfaced in the test output pane; usually means the test domain has misconfigured DNSSEC, not that Unbound is broken. |
| Local override shadows a public name | No error — Unbound returns the override. Use the DNS Lookup Test card to verify what consumers will actually see. |
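
The forwarder checks in the first row (IPv4 only, octets up to 255, port in range) can be sketched in Python like this. `validate_forwarder` and its error strings are hypothetical; the real handler reports its own text alongside `session.m = 11`.

```python
def validate_forwarder(server: str, port: int) -> list[str]:
    """Return a list of error strings; an empty list means the row is acceptable."""
    errors: list[str] = []
    octets = server.split(".")
    well_formed = len(octets) == 4 and all(
        o.isascii() and o.isdigit() and len(o) <= 3 for o in octets)
    if not well_formed:
        errors.append("server must be a dotted-quad IPv4 address")
    elif any(int(o) > 255 for o in octets):
        errors.append("each IPv4 octet must be 0-255")
    if not 1 <= port <= 65535:
        errors.append("port must be 1-65535")
    return errors
```

Collecting every error before returning lets the alert list all problems at once, so the admin does not have to fix them one redirect at a time.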

## Files and containers touched

| Path | Owner | Role |
|---|---|---|
| `config/hermes/var/www/html/admin/2/view_dns_resolver.cfm` | `hermes_commandbox` | The page |
| `config/hermes/var/www/html/admin/2/inc/dns_resolver_action.cfm` | `hermes_commandbox` | Save handlers (`save_forwarding`, `add_forwarder`, `add_local_record`, `restart_unbound`, `flush_cache`, etc.) |
| `config/hermes/var/www/html/admin/2/inc/generate_unbound_forward_conf.cfm` | `hermes_commandbox` | Renders `forward.conf` from DB and restarts the container |
| `config/hermes/var/www/html/admin/2/inc/generate_unbound_local_conf.cfm` | `hermes_commandbox` | Renders `local.conf` from DB and restarts the container |
| `config/unbound/unbound.conf` | `hermes_unbound` (read-only mount) | Baseline daemon config — cache sizes, DNSSEC, access-control |
| `config/unbound/conf.d/forward.conf` | `hermes_unbound` (read-write mount, regen target) | Generated forwarders |
| `config/unbound/conf.d/local.conf` | `hermes_unbound` (read-write mount, regen target) | Generated local overrides |
| `dns_forwarders`, `dns_local_records` tables | `hermes_db_server` (`hermes` DB) | Source of truth for the regen |
| `parameters2.unbound.forwarding.enabled` | `hermes_db_server` (`hermes` DB) | Recursive vs. forwarding mode |
| `${IPV4SUBNET}.117` | Docker network `hermes_net_ext` | Fixed Unbound IP that every other container's `dns:` declaration points at |

## Related

- [LDAP RemoteAuth § DNS resolution prerequisite](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth#dns-resolution-prerequisite) — the canonical case for adding a Local DNS Override (internal AD DC hostname)
- [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) — if the Console Address is an internal-only FQDN, this page's overrides decide whether other containers can reach it
- [Server Setup](https://docs.deeztek.com/books/administrator-guide/page/server-setup) — the mail-side hostname; RBL accuracy depends on the resolver's egress IP, which is the host's egress IP regardless of where Unbound is running
- [Scheduled Tasks](https://docs.deeztek.com/books/administrator-guide/page/scheduled-tasks) — the Ofelia jobs (RBL refresh, DMARC report fetch, fangfrisch malware-feed sync) that depend on this resolver
- [Storage Topology](https://docs.deeztek.com/books/installation-reference/page/storage-topology-5-tiers) — `hermes_unbound` is stateless; its mounts live in the Config tier (`config/unbound/`)

# IPS

_Pro Edition feature._ Maps to **System > IPS** (`view_intrusion_prevention.cfm`, `inc/intrusion_prevention_generate_config.cfm`, `inc/intrusion_prevention_get_status.cfm`, `inc/intrusion_prevention_manual_ban.cfm`, `inc/intrusion_prevention_manual_unban.cfm`).

IPS (Intrusion Prevention System) is Hermes's brute-force defense layer. It binds two operational pieces together: the **`hermes_fail2ban`** container that scans authentication logs and inserts iptables drop rules, and a Hermes database/UI layer that lets an admin tune jail thresholds, manage a never-ban whitelist, manually ban or unban IPs, and see live ban counts. The page also doubles as a troubleshooting reference (the Info card lists every `docker exec` command an admin would need to chase a ban from the shell).

## Pipeline placement — where IPS sits in the stack

```
Attacker on the public internet
        │
        ▼
   Host network stack  (hermes_fail2ban runs network_mode: host)
        │
        ├─►  iptables DOCKER-USER chain
        │       └─►  f2b-dovecot, f2b-authelia chains  ◄── ban rules inserted here
        │
        ▼
   nginx / Docker bridge
        │
        ▼
   hermes_nginx ──► hermes_commandbox / hermes_authelia / hermes_dovecot
                              │
                              ▼  (auth attempt logged)
                    /remotelogs/<service>/<file>.log
                              ▲
                              │
   hermes_fail2ban  ─tails─►  same logs (bind-mounted into the container)
        │
        ├─►  match filter regex N times within findtime
        ▼
   hermes-iptables-<jail> action
        │
        ├─►  iptables -I f2b-<jail> -s <ip> -j DROP
        └─►  hermes-api-notify.sh BAN <ip> <SOURCE>
                  │
                  ▼
            POST http://<commandbox>:8888/hermes-api/
                  │
                  ▼
            INSERT INTO fail2ban_ips (...)
```

Two facts are worth pinning down before anything else:

| Fact | Consequence |
|---|---|
| `hermes_fail2ban` runs in **host network mode** | iptables rules apply to the Docker host directly, not to a bridge namespace. The DOCKER-USER chain is the entry point because Docker honors it before its own auto-inserted rules. |
| Docker DNS is **unavailable** inside the container | The notify script reads container IPs from `/opt/hermes/tmp/container_ips.env`, regenerated on every page load by `inc/generate_container_ips.cfm`. If that file is stale or missing, ban events still iptables-block correctly but fail to log to the database. |

## The container always runs — Pro gating is behavioral

`hermes_fail2ban` starts on every install regardless of edition. The Pro license check happens in CFML at page load, not at the container level. What changes on Community is:

- The configuration UI is replaced by the standard "Pro feature required" panel.
- Jail toggles in `intrusion_prevention_jails.enabled` and the master `intrusion_prevention_settings.enabled` switch default to disabled on a fresh install.
- The jail.local on disk reflects whatever the seed gave you; nothing rewrites it without an admin clicking through the page.

> **Operational consequence.** Stopping `hermes_fail2ban` to "turn off IPS on Community" is the wrong move. The container is needed for the schema, the include scripts, and the manual-unban API path. Leave it running; disable IPS through the UI when the page becomes accessible, or leave the seeded jails disabled.

## The two seeded jails

| Jail name | Display name | Log scanned | Filter | Action | Default thresholds |
|---|---|---|---|---|---|
| `dovecot` | Mail Server (Dovecot) | `/remotelogs/dovecot/dovecot-info.log` | `dovecot` (upstream Fail2ban filter) | `hermes-iptables-dovecot` | maxretry 5 / findtime 86400 (1 day) / bantime 1800 (30 min) |
| `authelia` | SSO Portal (Authelia) | `/remotelogs/authelia/authelia.log` | `authelia-auth` (Hermes-shipped) | `hermes-iptables-authelia` | maxretry 5 / findtime 86400 / bantime 1800 |

Both rows are seeded into `intrusion_prevention_jails` on install (see [hermes_install.sql](https://github.com/deeztek/Hermes-Secure-Email-Gateway/blob/main/config/database/hermes_install.sql) lines 845-846). Adding a third jail means inserting a schema row plus a filter and an action by hand; there is no UI for it. The two-jail set covers the two real attack surfaces in Hermes: SMTP/IMAP login brute force and the web-console SSO login. Postfix's own brute-force protection (smtpd anvil rate limits) is the first line of defense for SMTP submission; the `dovecot` jail catches what gets past anvil.

The dovecot jail covers the `dovecot-info.log` line for failed authentication, not the Postfix auth log. SMTP-AUTH attempts terminate against Dovecot SASL — Postfix proxies SASL through Dovecot — so the dovecot filter sees both IMAP/POP and SMTP-AUTH failures from the same surface.

## Database schema

Three tables in the `hermes` database carry IPS state. A fourth (`fail2ban_ips`) is shared with the manual ban/unban flow and the API notify script.

| Table | Role | Notes |
|---|---|---|
| `intrusion_prevention_settings` | Two key/value rows: `enabled` (master switch), `config_synced` (pending-changes flag) | INSERT IGNORE on install, so an admin's local tuning survives upgrades |
| `intrusion_prevention_jails` | One row per jail with display metadata + maxretry/findtime/bantime/enabled/config_synced | Includes the filter and action names that get baked into `jail.local` |
| `intrusion_prevention_whitelist` | One row per IP/CIDR to ignore — three protected entries (`127.0.0.1/8`, `::1`, `172.16.0.0/12`) cannot be deleted | Whitelist rows render into the `ignoreip` directive of `[DEFAULT]` in `jail.local` |
| `fail2ban_ips` | Live ban ledger — one row per (IP, jail) pair currently or recently banned | Written by `hermes-api-notify.sh` (automatic bans) or the CFML manual-ban handler (admin bans) |

The `config_synced` flag works the same way as on other pages: every write handler flips it to `0` and renders a yellow "Pending Changes" badge; **Apply Settings** runs the regen-and-reload sequence and flips it back to `1`. There is no incremental sync — every Apply rewrites the whole `jail.local` from scratch.

## Apply Settings — the regen sequence

`inc/intrusion_prevention_generate_config.cfm` runs five hard-sequenced steps:

1. **Read** `intrusion_prevention_whitelist` (excluding the three protected IPs to avoid double-listing them in `ignoreip`).
2. **Read** `intrusion_prevention_jails` ordered by `jail_name`.
3. **Render** `jail.local` content into a `<cfsavecontent>` block: `[DEFAULT]` with `ignoreip = 127.0.0.1/8 ::1 172.16.0.0/12 <user-whitelist>`, then a `[<jail_name>]` stanza per row.
4. **Write** the rendered config to `/opt/hermes/tmp/jail.local.tmp` (a shared host path mounted into both containers), then `docker exec hermes_fail2ban cp` it into `/config/fail2ban/jail.local` inside the fail2ban container. The two-step copy is required because the `hermes_commandbox` container can't write directly to fail2ban's `/config` mount.
5. **Reload** with `docker exec hermes_fail2ban fail2ban-client reload`, then flip both `intrusion_prevention_settings.config_synced` and every row's `intrusion_prevention_jails.config_synced` to `1`.

If any step fails, `ipSyncSuccess` stays `false`, the sync flags are **not** flipped, and the page surfaces the error banner from `cfcatch.message`. The next attempt retries from scratch — there is no half-applied state to clean up.
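
The render step (step 3) can be sketched outside CFML. The Python below is a minimal illustration, not the Hermes implementation: the dictionary keys (`jail_name`, `filter`, `action`, and so on) are assumed to mirror the columns listed in the schema table, and the exact stanza layout Hermes emits may differ.

```python
# Hypothetical sketch of the jail.local render step; key names are assumptions.
PROTECTED = ["127.0.0.1/8", "::1", "172.16.0.0/12"]

def render_jail_local(user_whitelist, jails):
    """Return jail.local text: one [DEFAULT] stanza, then one stanza per jail."""
    # Protected entries are always emitted first and never listed twice.
    extra = [ip for ip in user_whitelist if ip not in PROTECTED]
    lines = ["[DEFAULT]", "ignoreip = " + " ".join(PROTECTED + extra), ""]
    # Jails are emitted in jail_name order, matching step 2.
    for jail in sorted(jails, key=lambda j: j["jail_name"]):
        lines += [
            f"[{jail['jail_name']}]",
            f"enabled = {'true' if jail['enabled'] else 'false'}",
            f"filter = {jail['filter']}",
            f"action = {jail['action']}",
            f"maxretry = {jail['maxretry']}",
            f"findtime = {jail['findtime']}",
            f"bantime = {jail['bantime']}",
            "",
        ]
    return "\n".join(lines)
```

Because the whole file is produced from the two tables in one pass, there is nothing to merge with the previous `jail.local`, which is why a failed attempt can simply be retried.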

## What happens when IPS is disabled

The master `enabled = 0` toggle does two things synchronously, before the redirect:

1. Walks every enabled jail, runs `fail2ban-client status <jail>` to get the live banned IP list, then `fail2ban-client set <jail> unbanip <ip>` for each one. iptables drop rules are removed immediately.
2. Truncates `fail2ban_ips` so the DB ledger matches the now-empty iptables state.

After that, Apply Settings rewrites `jail.local` with `enabled = false` on every jail and reloads fail2ban, so **no new bans are created** and any attacker who was blocked regains access immediately. This is the right behavior for an emergency "I locked myself out" scenario, but the price is loss of the entire current ban list. Re-enabling does not restore prior bans.
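
The per-jail walk in step 1 amounts to reading the `Banned IP list` line from the status output and issuing one `unbanip` per address. The following Python sketch shows that parsing under the assumption that the status output uses fail2ban's usual tree layout; the function names are hypothetical and the commands are only built, not executed.

```python
import re

def banned_ips(status_output):
    """Extract the addresses on the 'Banned IP list' line of
    `fail2ban-client status <jail>` output."""
    # [ \t]* (not \s*) keeps an empty list from swallowing the next line.
    match = re.search(r"Banned IP list:[ \t]*(.*)", status_output)
    return match.group(1).split() if match else []

def unban_commands(jail, status_output):
    """Build one `fail2ban-client set <jail> unbanip <ip>` argv per banned IP."""
    return [
        ["docker", "exec", "hermes_fail2ban",
         "fail2ban-client", "set", jail, "unbanip", ip]
        for ip in banned_ips(status_output)
    ]
```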

## The IP Whitelist

Whitelist entries are static addresses or CIDR ranges that fail2ban's `ignoreip` directive exempts from banning. The page accepts:

| Format | Example | Validation |
|---|---|---|
| IPv4 single | `192.168.1.100` | `inc/validate_ip_address.cfm` regex |
| IPv4 CIDR | `10.0.0.0/8` | IPv4 regex + numeric prefix 0–32 |
| IPv6 single | `::1` | `inc/validate_ip_address_ipv6.cfm` regex |
| IPv6 CIDR | `fe80::/10` | IPv6 regex + numeric prefix 0–128 |

The three protected entries (localhost v4, localhost v6, the Docker `172.16.0.0/12` block) are seeded on install and the delete handler refuses to remove them. The `172.16.0.0/12` entry exists because internal container-to-container traffic shows up in dovecot/authelia logs as coming from the Docker bridge — without it, an Authelia auth_request loop or a Dovecot LMTP redelivery could end up self-banning the gateway. The lock icon on those rows in the table reflects this.
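
The four accepted formats can be checked offline before pasting a long list into the page. The sketch below uses Python's standard `ipaddress` module rather than the regexes in the `inc/validate_ip_address*.cfm` includes, so edge cases may be accepted or rejected differently from the page itself; `classify_entry` is a hypothetical helper name.

```python
import ipaddress

def classify_entry(entry):
    """Return the format label for a whitelist entry, or None if it is invalid."""
    text = entry.strip()
    try:
        if "/" in text:
            _, _, prefix = text.partition("/")
            if not prefix.isdigit():      # numeric prefix only, no netmask form
                return None
            # strict=False tolerates host bits, e.g. the seeded 127.0.0.1/8.
            net = ipaddress.ip_network(text, strict=False)
            return f"IPv{net.version} CIDR"
        addr = ipaddress.ip_address(text)
        return f"IPv{addr.version} single"
    except ValueError:
        return None
```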

## Manual Ban and Manual Unban

The Banned IPs card surfaces every row in `fail2ban_ips`, joined to `intrusion_prevention_jails` so the display picks up the friendly name and the bantime for the countdown column. Two admin actions sit on top of it:

### Manual Ban

`inc/intrusion_prevention_manual_ban.cfm` accepts an IP and a jail (or "ALL" to span every enabled jail). For each target jail:

1. Pre-check `fail2ban_ips` for an existing (IP, jail) row — skip if already banned in that jail.
2. Run `docker exec hermes_fail2ban fail2ban-client set <jail> banip <ip>`. Return value 1 (or "already banned" in the output) is treated as success.
3. Sleep 500 ms so the fail2ban action's `hermes-api-notify.sh` invocation has time to insert the row first.
4. `UPDATE fail2ban_ips SET ban_type='MANUAL', ban_source='ADMIN', note='Manually banned via Intrusion Prevention GUI' WHERE ip=... AND jail=...` — overwriting the AUTOMATIC row the notify script just inserted.

The 500 ms sleep is load-bearing: without it, the notify-script INSERT can race the manual UPDATE and the admin attribution is lost.
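
The ordering of the four steps is easier to see with the side effects abstracted away. In this Python sketch the three callables (`is_banned`, `run_banip`, `mark_manual`) are hypothetical stand-ins for the `fail2ban_ips` lookup, the `docker exec` call, and the `UPDATE`; only the control flow is meant to match the description above.

```python
import time

def manual_ban(ip, jails, is_banned, run_banip, mark_manual,
               settle_seconds=0.5, sleep=time.sleep):
    """Ban `ip` in each jail, then re-attribute the ledger row to the admin."""
    results = {}
    for jail in jails:
        if is_banned(ip, jail):            # step 1: already in the ledger
            results[jail] = "skipped"
            continue
        if not run_banip(jail, ip):        # step 2: fail2ban-client ... banip
            results[jail] = "failed"
            continue
        sleep(settle_seconds)              # step 3: let the notify script INSERT first
        mark_manual(ip, jail)              # step 4: UPDATE ban_type/ban_source/note
        results[jail] = "banned"
    return results
```

Injecting `sleep` keeps the sketch testable without waiting; in the real handler the pause exists precisely so that the `UPDATE` finds a row to change.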

### Manual Unban

`inc/intrusion_prevention_manual_unban.cfm` accepts pipe-delimited `<ip>|<jail>` pairs from the checkbox row selection, runs `fail2ban-client set <jail> unbanip <ip>` for each pair, and deletes the matching row from `fail2ban_ips`. Errors from individual unbans don't abort the batch — the script counts successes and reports failures separately.
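
A minimal Python sketch of the batch semantics, assuming each selected checkbox submits one `<ip>|<jail>` string. The helper names and the `run_unban` callable are hypothetical; the point is that a failing pair is recorded and the loop continues.

```python
def parse_unban_selection(values):
    """Turn '<ip>|<jail>' strings into (ip, jail) tuples, dropping malformed ones."""
    pairs = []
    for value in values:
        ip, sep, jail = value.strip().partition("|")
        if sep and ip and jail:
            pairs.append((ip, jail))
    return pairs

def unban_batch(pairs, run_unban):
    """Attempt every unban; return (succeeded, failed) without aborting early."""
    succeeded, failed = [], []
    for ip, jail in pairs:
        try:
            ok = run_unban(jail, ip)
        except Exception:
            ok = False                     # one bad unban must not stop the batch
        (succeeded if ok else failed).append((ip, jail))
    return succeeded, failed
```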

Manual bans are flagged as **Permanent** in the time-remaining column because they have no `bantime` from a jail — the absence of an automatic expiry is the whole point of a manual ban. The admin must explicitly unban them.

## The countdown timer

The Banned IPs DataTable renders a per-row countdown badge. CFML computes the expiry as `banned_at + bantime` and emits it in a `data-unban-timestamp` attribute; a 1 Hz JavaScript tick then recolors the badge as it counts down (yellow, then red, then expired). The countdown is purely cosmetic: the actual unban happens inside fail2ban's process based on the same arithmetic. If a row shows "Expired" but is still present, it just hasn't been reaped from `fail2ban_ips` yet; reload the page after a few seconds and it'll be gone.
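
The arithmetic behind the badge is small enough to show directly. In this Python sketch the `warn_seconds` threshold at which the badge turns red is an assumption (the page's real threshold is not documented here), and manual bans, which are shown as Permanent, are out of scope.

```python
from datetime import datetime, timedelta, timezone

def unban_timestamp(banned_at, bantime_seconds):
    """Expiry instant: banned_at + bantime."""
    return banned_at + timedelta(seconds=bantime_seconds)

def badge(banned_at, bantime_seconds, now, warn_seconds=300):
    """Return 'yellow', 'red', or 'expired' for an automatic ban."""
    remaining = (unban_timestamp(banned_at, bantime_seconds) - now).total_seconds()
    if remaining <= 0:
        return "expired"
    # warn_seconds is a hypothetical cut-over point for the red state.
    return "red" if remaining <= warn_seconds else "yellow"
```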

## Operational truths about iptables backends

Modern Ubuntu hosts ship two iptables binaries: `iptables-legacy` (xtables / kernel `xt_*` modules) and `iptables-nft` (nftables backend with iptables-compatible CLI). The fail2ban container ships both. The page surfaces both command variants in the Info card precisely because the right one depends on which backend the host (and Docker) negotiated at install time:

```
docker exec hermes_fail2ban update-alternatives --display iptables
```

Picking the wrong one isn't catastrophic — it just shows empty chains, which can be confusing during a "why isn't my ban working?" investigation. The `hermes-iptables-*` action templates inside fail2ban use the alternatives-resolved `iptables` binary, so the daemon itself always picks the correct backend.

## License gating

The page is wrapped in the standard Pro check:

```cfml
<cfif NOT isDefined("session.edition") OR session.edition NEQ "Pro">
    <cfset proFeatureName = "Intrusion Prevention">
    <cfinclude template="./inc/license_pro_required.cfm">
    <cfabort>
</cfif>
```

Community installs see the gating panel and cannot reach the UI. The `hermes_fail2ban` container continues to run, its seeded jails default to disabled, and `jail.local` on disk reflects whatever was last applied. There is no behind-the-scenes auto-disable on license-state change — switching from Pro to Community does not flip jails off.

## Files and containers touched

| Path | Owner | Role |
|---|---|---|
| `config/hermes/var/www/html/admin/2/view_intrusion_prevention.cfm` | `hermes_commandbox` | Main page (cards, modals, action handlers) |
| `config/hermes/var/www/html/admin/2/inc/intrusion_prevention_generate_config.cfm` | `hermes_commandbox` | Render `jail.local` + reload fail2ban |
| `config/hermes/var/www/html/admin/2/inc/intrusion_prevention_get_status.cfm` | `hermes_commandbox` | Live `fail2ban-client status` parsing for jail/ban counters |
| `config/hermes/var/www/html/admin/2/inc/intrusion_prevention_manual_ban.cfm` | `hermes_commandbox` | Multi-jail manual ban with API-notify race handling |
| `config/hermes/var/www/html/admin/2/inc/intrusion_prevention_manual_unban.cfm` | `hermes_commandbox` | Batch unban handler |
| `config/hermes/var/www/html/admin/2/inc/generate_container_ips.cfm` | `hermes_commandbox` | Writes `/opt/hermes/tmp/container_ips.env` for the notify script |
| `config/hermes/var/www/html/admin/2/inc/fail2ban_ban_unban.cfm` | `hermes_commandbox` | API endpoint hit by `hermes-api-notify.sh` (token-authed) |
| `config/fail2ban/config/fail2ban/jail.local` | `hermes_fail2ban` (mounted) | Live jail config — rewritten on every Apply |
| `config/fail2ban/scripts/hermes-api-notify.sh` | `hermes_fail2ban` | Posts ban/unban events back to Hermes API |
| `config/fail2ban/scripts/detect-iptables-backend.sh` | `hermes_fail2ban` | One-shot at container start to pick legacy vs nft |
| `/opt/hermes/tmp/jail.local.tmp` | both | Ephemeral rendered config; `docker exec cp`-ed into the fail2ban mount |
| `/opt/hermes/tmp/container_ips.env` | both | DB and Commandbox IPs for the API notify script (host networking has no DNS) |

## Related

- [Admin Console Firewall](https://docs.deeztek.com/books/administrator-guide/page/console-firewall) — the complementary static-allowlist layer; IPS is reactive, Console Firewall is preventative
- [Authentication Settings](https://docs.deeztek.com/books/administrator-guide/page/authentication-settings) — Authelia's own Login Regulation (per-account brake) — the inner brake that complements this page's per-source-IP brake
- [LDAP RemoteAuth](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth) — RemoteAuth-mode users also count against the authelia jail
- [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) — changing the console host triggers a full nginx regen but does not touch fail2ban

# LDAP RemoteAuth

_Pro Edition feature._ Maps to **System > LDAP RemoteAuth** (`view_remoteauth.cfm`, `edit_remoteauth_mapping.cfm`).

RemoteAuth lets Hermes authenticate selected users against an **upstream LDAP or Active Directory** server instead of storing their password in Hermes's own OpenLDAP. The page configures the upstream-to-domain mapping, global TLS settings, a one-shot bind test, and the apply-to-LDAP sync. Active Directory, OpenLDAP, 389 Directory Server, and FreeIPA are all supported through the same plumbing.

## What RemoteAuth is — and isn't

| Is | Isn't |
|---|---|
| A pass-through bind: at web login, Hermes binds against the upstream DN with the supplied password and accepts or rejects accordingly | A directory sync. Hermes does not import users, groups, photos, or attributes from upstream. |
| Per-user opt-in, via `auth_type = 'remote'` + `remoteauth_domain` on the recipient/system-user row | A whole-installation toggle. Local-auth and remote-auth users coexist in the same directory and the same UI. |
| Implemented as an **OpenLDAP `remoteauth` overlay** in Hermes's `hermes_ldap` container | A reinvented bind proxy. The heavy lifting is `slapo-remoteauth(5)` against a stub user with a `seeAlso` pointer. |
| The credential path for **web login only** — `/users`, `/nc`, `/admin` (via Authelia → LDAP bind) | The credential path for **IMAP/SMTP/CalDAV/CardDAV**. Those continue to authenticate against Hermes-issued app passwords; see [Credential Model](https://docs.deeztek.com/books/administrator-guide/page/credential-model) for the full picture. |

> **Operational consequence.** A remote-auth user's mail-client / DAV passwords still live in Hermes (`app_passwords` table, hashed). The upstream directory password is never exposed to Dovecot or Nextcloud DAV — only to the web gate. If the customer's IT team rotates the upstream password, the user's app passwords keep working until they are explicitly revoked. This is by design (see [Credential Model § Local-auth users vs. remote-auth users](https://docs.deeztek.com/books/administrator-guide/page/credential-model#local-auth-users-vs-remote-auth-users)).

## How it works under the hood

```
Web login (/admin, /users, /nc)
        │
        ▼
   Authelia
        │  LDAP bind to Hermes OpenLDAP
        ▼
hermes_ldap  (slapd)
        │
        │  user entry has seeAlso=<upstream DN>
        │  user entry has associatedDomain=<mapping key>
        │
        ▼
slapo-remoteauth overlay
        │  matches associatedDomain → upstream server URI
        │  rewrites the bind to the seeAlso DN
        ▼
External AD / LDAP server  (customer's DC)
        │
        ▼  bind result returned up the chain
   Authelia decision: PASS or FAIL
```

The overlay is configured in `cn=config` on the `mdb` database. Hermes's CFML never bind-checks the upstream itself at login time — that is the overlay's job. The CFML only **writes** the overlay configuration when an admin clicks **Apply Settings**.

## OpenLDAP remoteauth is a singleton overlay

This is the single most important constraint to understand when reasoning about why the page works the way it does.

| Constraint | Consequence in the UI |
|---|---|
| `slapo-remoteauth` allows **only one overlay instance** per database | All mappings live inside the same overlay |
| `olcRemoteAuthMapping` is multi-valued **but has no equality matching rule** | You cannot `ldapmodify add` a single mapping to an existing overlay. The entire overlay must be rebuilt. |
| `olcRemoteAuthTLS` is a single string applied **to all mappings inside the overlay** | TLS settings (STARTTLS, certificate verification, CA cert path, retry count) are **global**, not per-mapping |

`inc/ldap_remoteauth_sync_all.cfm` therefore implements **full replacement on every save**: delete the existing overlay, rebuild it from `remoteauth_mappings` + `remoteauth_settings`. There is no incremental update path. The page's pending-changes badge reflects this — every edit marks `ldap_synced = 0` on both tables, and **Apply Settings** flips it back to `1` only after the full rebuild succeeds.

### Multiple upstream servers with different CAs

Because TLS is global, an installation that binds to multiple upstream LDAP servers signed by different CAs must upload a **concatenated CA bundle**:

```
cat dc01-ca.pem dc02-ca.pem dc03-ca.pem > ca-bundle.pem
```

The page accepts the bundle as-is in the **CA Certificate** file picker. OpenLDAP walks the bundle when validating any of the configured upstream servers.

## Database schema

Two tables drive the page. Both are in the `hermes` database.

| Table | Role |
|---|---|
| `remoteauth_settings` | Six rows, key/value: `enabled`, `tls_starttls`, `tls_reqcert`, `ca_cert_file`, `retry_count`, `ldap_synced` |
| `remoteauth_mappings` | One row per upstream-LDAP-to-domain mapping (`domain_name` UNIQUE, `server_address`, `server_port`, `remote_dn_pattern`, `description`, `enabled`, `ldap_synced`) |

Two user-bearing tables carry RemoteAuth references:

| Table | Columns | Role |
|---|---|---|
| `recipients` | `auth_type ENUM('local','remote')`, `remoteauth_domain VARCHAR(255)` | Relay recipients can be RemoteAuth-mode |
| `system_users` | `auth_type ENUM('local','remote')`, `remoteauth_domain VARCHAR(255)` | Console admins / reader users can be RemoteAuth-mode |

The `mailboxes` table does **not** carry `auth_type` yet. RemoteAuth-for-mailboxes is planned but not yet wired (see [Future work](#future-work)).

## DN pattern placeholders

The `remote_dn_pattern` column stores the upstream DN with four substitutable tokens. Substitution happens in `inc/ldap_add_user_remoteauth.cfm` at user-create time, baked into the `seeAlso` attribute on the local stub entry.

| Token | Source | Notes |
|---|---|---|
| `{username}` | Local part of email (`jsmith@company.com` → `jsmith`) — uses `ListFirst(..., "@")`. For console admins where the username has no `@`, the whole string is used. | Matches `sAMAccountName`/`uid` patterns |
| `{firstname}` | `givenName` field on the add form | Required if the DN pattern uses it |
| `{lastname}` | `sn` field on the add form | Required if the DN pattern uses it |
| `{email}` | Full email address as entered | Useful for `mail=` patterns |

Common patterns the in-page help surfaces:

| Directory type | Pattern |
|---|---|
| AD (display name as CN) | `cn={firstname} {lastname},ou=Users,dc=example,dc=com` |
| AD (sAMAccountName as CN) | `cn={username},ou=Users,dc=example,dc=com` |
| OpenLDAP / FreeIPA | `uid={username},ou=People,dc=example,dc=com` |

The pattern must match the upstream's actual naming convention **exactly**. A wrong pattern produces `ldap_bind: Invalid DN syntax` or `Invalid credentials` at login time; use the **Test** button before saving to confirm.
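
To preview what a pattern expands to for a given user, the substitution can be reproduced in a few lines. This Python sketch follows the token table above (`{username}` is everything before the first `@`, or the whole string when there is none); it is an approximation of `inc/ldap_add_user_remoteauth.cfm`, not a port, and it does not escape DN special characters such as commas inside names.

```python
def expand_dn_pattern(pattern, email, firstname="", lastname=""):
    """Substitute the four documented tokens into a remote_dn_pattern."""
    values = {
        "{username}": email.split("@", 1)[0],   # local part, or whole string
        "{firstname}": firstname,
        "{lastname}": lastname,
        "{email}": email,
    }
    for token, value in values.items():
        if token in pattern and not value:
            raise ValueError(f"pattern uses {token} but no value was supplied")
        pattern = pattern.replace(token, value)
    return pattern
```

For example, `expand_dn_pattern("uid={username},ou=People,dc=example,dc=com", "jsmith@company.com")` yields `uid=jsmith,ou=People,dc=example,dc=com`.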

## The local stub entry

For each RemoteAuth user, Hermes creates a normal `inetOrgPerson + domainRelatedObject` entry in `ou=users,dc=hermes,dc=local` with **no `userPassword` attribute** and the two overlay-driving attributes set:

```
dn: cn=jsmith,ou=users,dc=hermes,dc=local
objectClass: inetOrgPerson
objectClass: domainRelatedObject
givenName: John
sn: Smith
displayName: John Smith
mail: jsmith@company.com
uid: jsmith
seeAlso: cn=John Smith,ou=Users,dc=company,dc=com    <-- expanded from {firstname}/{lastname}/etc.
associatedDomain: company                            <-- the mapping key
```

At bind time the overlay reads `associatedDomain`, looks up the matching `olcRemoteAuthMapping`, opens an LDAP connection to that upstream URI, and re-binds as `seeAlso` with the supplied password. The local entry has no password to validate against, so the overlay's decision is the only decision.

## Test Connection button

The Test modal does **not** consult the saved settings end-to-end — it does its own `ldapwhoami` against the mapping's `server_address:server_port`, applying the same DN pattern substitution the overlay would and honoring the global STARTTLS setting. The credentials entered in the modal are used for one bind attempt:

```
docker exec hermes_ldap ldapwhoami -x -H ldap://<server>:<port> \
    -D "<DN expanded from pattern>" -w "<password>"  [-ZZ if STARTTLS]
```

Success is detected by `dn:` or `u:` in the response. Failure surfaces the raw stderr from `ldapwhoami`. The bind credentials are never stored — they live only for the duration of the request, then disappear.

This is intentionally **separate from the overlay flow**: it lets an admin verify the DN pattern and network path before clicking Apply Settings (which would rebuild the overlay and potentially break live logins).
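
The same one-shot bind can be scripted when testing many mappings. The Python sketch below only builds the argument list shown above and checks the response; it assumes the `dn:` or `u:` marker appears at the start of an output line, and the function names are hypothetical.

```python
def ldapwhoami_argv(server, port, bind_dn, password, starttls=False):
    """Build the docker exec argv for a single simple-bind test."""
    argv = ["docker", "exec", "hermes_ldap", "ldapwhoami", "-x",
            "-H", f"ldap://{server}:{port}",
            "-D", bind_dn, "-w", password]
    if starttls:
        argv.append("-ZZ")                 # require a successful STARTTLS
    return argv

def bind_succeeded(output):
    """True when ldapwhoami reported an authorization identity."""
    return any(line.startswith(("dn:", "u:")) for line in output.splitlines())
```

Passing the arguments as a list (for example to `subprocess.run`) avoids shell quoting problems with DNs that contain spaces. Note that `-w` places the password on the command line, where it is briefly visible in the process list.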

## DNS resolution prerequisite

The `hermes_ldap` container resolves hostnames through Hermes's own Unbound resolver — by default, public recursive DNS. **Internal-only AD/LDAP hostnames** (typical: `dc01.corp.example.com` on a split-horizon zone) will not resolve, and bind attempts fail with `remoteauth_bind operations error`.

Fix before creating a mapping: add a **DNS Local Record** at **System > [DNS Resolver](https://docs.deeztek.com/books/administrator-guide/page/dns-resolver)** pointing the upstream FQDN to its actual IP. Verify from inside the container:

```
docker exec hermes_ldap getent hosts <ad-hostname>
```

Publicly-resolvable hostnames don't need this step.

## TLS settings reference

| Setting | Values | Notes |
|---|---|---|
| **Use STARTTLS** | `yes` / `no` | Upgrades the connection on the standard `389` port. Mutually exclusive with LDAPS on `636` (use one or the other). |
| **TLS Certificate Requirement** | `never`, `allow`, `try`, `demand` | Maps directly to `TLS_REQCERT` in the libldap conf. `never` is the only mode that does **not** require a CA cert; the others all expect a valid `ca_cert_file` to compare against. |
| **CA Certificate** | PEM file (`.pem`, `.crt`, `.cer`) | Stored at `/opt/hermes/certs/remoteauth/global_remoteauth_ca.pem` (single canonical filename — uploading replaces). For multi-server installs, concatenate all CAs into a bundle. |
| **Retry Count** | `1`–`10` (default `3`) | Number of bind retries before reporting failure |

The CA field hides itself when `tls_reqcert = never` (purely a UX hint — the file still exists on disk if previously uploaded).

## Apply Settings — the sync flow

Every save handler (`add_mapping`, `update_mapping`, `delete_mappings`, `update_tls_settings`, `set_remoteauth_status`) sets `ldap_synced = 0` on the touched rows AND on `remoteauth_settings`. The page banner switches from green **Synced** to amber **Pending Changes**. Nothing has actually changed in LDAP yet.

**Apply Settings** runs `inc/ldap_remoteauth_sync_all.cfm`, which is a hard three-step sequence:

1. **Delete** the existing overlay (`ldap_remoteauth_delete_overlay.cfm`) — succeeds whether or not one exists.
2. If `enabled = 1` and at least one mapping has `enabled = 1`: **fetch the next overlay index** and the MDB database index (`ldap_remoteauth_get_overlay.cfm`), then **create** the new overlay with all enabled mappings baked in (`ldap_remoteauth_add_overlay.cfm`). The LDIF template is `/opt/hermes/templates/ldap_remoteauth_add_overlay.ldif`, populated via `REReplace` against `THE_OVERLAY_INDEX`, `THE_MDB_INDEX`, `THE_DEFAULT_DOMAIN`, `THE_MAPPING_LINES`, `THE_STARTTLS`, `THE_TLS_REQCERT`, `THE_TLS_CACERT`, `THE_RETRY_COUNT`.
3. **Flip `ldap_synced = 1`** on both tables.

If step 1 or 2 fails, the database `ldap_synced` flags are **not** flipped — the page stays amber, and the next attempt will retry from scratch. There is no half-applied state to clean up because the overlay is rebuilt from zero each time.
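
The control flow of the three steps, including the "flags stay at 0 on failure" rule, can be sketched as follows. The three callables are hypothetical stand-ins for the CFML includes named above; the real orchestrator also resolves overlay and MDB indexes, which this illustration omits.

```python
def apply_settings(enabled, mappings, delete_overlay, add_overlay, mark_synced):
    """Full-replacement sync: delete, optionally rebuild, then flip ldap_synced."""
    active = [m for m in mappings if m["enabled"]]
    try:
        delete_overlay()                   # step 1: always runs
        if enabled and active:
            add_overlay(active)            # step 2: only enabled mappings
        mark_synced()                      # step 3: reached only on success
    except Exception as err:
        return False, str(err)             # flags untouched, page stays amber
    return True, ""
```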

> **Failure semantics.** While the overlay is being rebuilt (typically subsecond), live remote-auth web logins will fail with `Operations error` until step 2 completes. Plan Apply Settings during low-login windows. Local-auth users are unaffected.

## Deletion validation

A domain mapping cannot be deleted if any user references it. The check runs against **two** tables at delete time:

```sql
SELECT remoteauth_domain, COUNT(*) FROM system_users
 WHERE auth_type = 'remote' AND remoteauth_domain IN (...)
 GROUP BY remoteauth_domain;

SELECT remoteauth_domain, COUNT(*) FROM recipients
 WHERE auth_type = 'remote' AND remoteauth_domain IN (...)
 GROUP BY remoteauth_domain;
```

If either returns rows, the delete is rejected with a list of the blocked domains. The admin must either reassign those users to a different mapping or delete the users first.
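
The two-table check can be exercised in isolation. This Python sketch uses the standard `sqlite3` module purely so that it is self-contained (Hermes itself uses the `hermes` database, not SQLite), and `blocked_domains` is a hypothetical helper that merges both result sets into a single per-domain count.

```python
import sqlite3

def blocked_domains(conn, domains):
    """Return {domain: user_count} for mappings still referenced by remote-auth users."""
    if not domains:
        return {}
    marks = ",".join("?" for _ in domains)
    blocked = {}
    for table in ("system_users", "recipients"):   # fixed names, never user input
        sql = (f"SELECT remoteauth_domain, COUNT(*) FROM {table} "
               f"WHERE auth_type = 'remote' AND remoteauth_domain IN ({marks}) "
               f"GROUP BY remoteauth_domain")
        for domain, count in conn.execute(sql, list(domains)):
            blocked[domain] = blocked.get(domain, 0) + count
    return blocked
```

An empty result means every selected mapping is safe to delete; a non-empty one gives the list of blocked domains to show the admin.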

> **Known gap (#102 and the mailbox/relay TODO).** When RemoteAuth is extended to **mailboxes** (a planned feature), this validation must add a third query against the `mailboxes` table. Both `view_remoteauth.cfm` (bulk delete, line ~330) and `edit_remoteauth_mapping.cfm` (single delete, line ~129) need to be updated together — they implement the check independently.

## Adding RemoteAuth users in bulk — CSV format

`add_internal_recipients.cfm` (Relay Recipients > Add) supports a RemoteAuth dropdown when the page detects an enabled mapping. When the selected mapping's DN pattern uses `{firstname}` or `{lastname}`, the textarea switches to **CSV mode** because email-only input doesn't carry enough data to expand the pattern.

| DN pattern tokens used | Textarea format |
|---|---|
| `{username}` and/or `{email}` only | One email address per line |
| Includes `{firstname}` or `{lastname}` | `First,Last,Email` per line — one recipient per row |

Header rows (`"GivenName","Surname","Mail"`) are auto-detected and skipped. Unknown columns are ignored, so common export formats work as-is:

- **PowerShell**: `Get-ADUser -Filter * -Properties GivenName,Surname,Mail | Select GivenName,Surname,Mail | Export-Csv users.csv -NoTypeInformation`
- **CSVDE** (Windows Server built-in): `csvde -f users.csv -l "givenName,sn,mail"`
- **Excel / manual**: three columns saved as CSV
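
A quick way to sanity-check an export before pasting it into the textarea is to parse it the way CSV mode is described above. This Python sketch makes two assumptions that may not match the page exactly: the first three columns are first name, last name, and email (any further columns are ignored), and a header row is recognized by the absence of `@` in the third column.

```python
import csv
import io

def parse_recipient_rows(text):
    """Parse 'First,Last,Email' lines into dicts, skipping headers and blanks."""
    recipients = []
    for row in csv.reader(io.StringIO(text)):
        fields = [f.strip() for f in row[:3]]      # extra columns are ignored
        if len(fields) < 3 or not all(fields):
            continue                               # blank or incomplete line
        first, last, email = fields
        if "@" not in email:
            continue                               # header row or malformed address
        recipients.append({"firstname": first, "lastname": last, "email": email})
    return recipients
```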

Each row is inserted with `auth_type = 'remote'` and `remoteauth_domain = <mapping key>`. The local LDAP stub is created via `ldap_add_user_relay_remoteauth.cfm`, which calls the same template/placeholder machinery described above. A welcome email is sent via `send_recipient_welcome_email_remoteauth.cfm` — the message tells the user to sign in with their **organization (AD/LDAP) password**, not a Hermes-issued one.

## Status, enable, disable

The **RemoteAuth Status** dropdown (`enabled = 0/1`) is the master switch. Disabling does **not** delete the overlay's mappings — it just causes the next Apply Settings cycle to skip step 2 entirely, leaving the overlay absent. Re-enabling and re-applying rebuilds it from the same `remoteauth_mappings` rows. This is useful for emergency cutover back to a local-only state without losing the mapping configuration.

The **LDAP Overlay** badge on the page reads the live state from `cn=config` (via `ldapsearch -Y EXTERNAL` against `(objectClass=olcRemoteAuthCfg)`) and reports **Active** or **Not configured**. This is independent of the DB-side `enabled` flag — if the two disagree (e.g., DB says enabled but the badge says Not configured), the next Apply Settings will reconcile.

## License gating

The page is wrapped in the standard Pro-only guard:

```cfml
<cfif NOT isDefined("session.edition") OR session.edition NEQ "Pro">
    <cfinclude template="./inc/license_pro_required.cfm">
    <cfabort>
</cfif>
```

Community-edition installs see the standard "Pro feature required" panel and cannot reach the configuration UI. Pre-existing RemoteAuth-mode users continue to authenticate (the overlay itself is in `cn=config` and not license-checked), but no new mappings can be added or edited until a Pro license is activated.

## Files and containers touched

| Path | Owner | Role |
|---|---|---|
| `config/hermes/var/www/html/admin/2/view_remoteauth.cfm` | `hermes_commandbox` | Main page |
| `config/hermes/var/www/html/admin/2/edit_remoteauth_mapping.cfm` | `hermes_commandbox` | Edit single mapping |
| `config/hermes/var/www/html/admin/2/inc/ldap_remoteauth_sync_all.cfm` | `hermes_commandbox` | Apply Settings orchestrator |
| `config/hermes/var/www/html/admin/2/inc/ldap_remoteauth_add_overlay.cfm` | `hermes_commandbox` | LDIF render + `ldapadd` |
| `config/hermes/var/www/html/admin/2/inc/ldap_remoteauth_delete_overlay.cfm` | `hermes_commandbox` | `ldapdelete` of existing overlay |
| `config/hermes/var/www/html/admin/2/inc/ldap_add_user_remoteauth.cfm` | `hermes_commandbox` | Create local stub entry with `seeAlso`/`associatedDomain` |
| `config/hermes/opt/hermes/templates/ldap_remoteauth_add_overlay.ldif` | `hermes_commandbox` | Overlay LDIF template (placeholder-substituted) |
| `config/hermes/opt/hermes/templates/ldap_adduser_remoteauth.ldif` | `hermes_commandbox` | Stub-user LDIF template |
| `/opt/hermes/certs/remoteauth/global_remoteauth_ca.pem` | `hermes_ldap` (mounted) | CA / CA-bundle for upstream TLS |
| `/opt/hermes/tmp/<token>_remoteauth_add_overlay.ldif` | `hermes_commandbox`, `hermes_ldap` | Ephemeral rendered LDIF; deleted after `ldapadd` |
| `cn=config` (in `hermes_ldap`) | `hermes_ldap` | Live overlay configuration |

Every shell-out uses `docker exec hermes_ldap …` per the standard Hermes Docker pattern.

## Future work

- **#102** — when RemoteAuth is wired to mailboxes (currently relay-recipients and console users only), deletion validation in `view_remoteauth.cfm` and `edit_remoteauth_mapping.cfm` must add a third query against `mailboxes`.
- **Position-2 mapping unique index hardening** — `remoteauth_mappings.domain_name` is `UNIQUE` but the upstream `server_address` is not; an admin can accidentally create two mappings to the same DC under different domain keys. Not a bug, but worth surfacing in a validation hint.
- **Group-based authorization** — current model is "if the upstream bind passes, the user is in." There's no upstream-group filter (e.g., "only members of `cn=hermes-users` may log in"). For installs that need this today, restrict at the upstream side with a dedicated OU.

## Related

- [Credential Model](https://docs.deeztek.com/books/administrator-guide/page/credential-model) — full picture of how RemoteAuth slots into the four-credential architecture (web vs. mail vs. DAV)
- [System Users](https://docs.deeztek.com/books/administrator-guide/page/system-users) — creating console admins/readers with RemoteAuth mode
- [DNS Resolver](https://docs.deeztek.com/books/administrator-guide/page/dns-resolver) — required prerequisite for internal-only AD hostnames

# Mail Queue

Admin path: **System > Mail Queue** (`view_mail_queue.cfm`,
`inc/get_mail_queue_settings.cfm`, `inc/mail_queue_get_queue.cfm`,
`inc/mail_queue_action.cfm`, `inc/mail_queue_flush_mailqueue.cfm`,
`inc/mail_queue_set_queue_settings.cfm`, `view_mail_queue_message.cfm`,
`inc/mail_queue_view_message.cfm`).

This page is the operator's window into **Postfix's on-disk queue inside
`hermes_postfix_dkim`** — the messages Postfix has accepted but not yet
finally delivered or bounced. It does two unrelated jobs that share one
page:

1. **Queue Settings** — two Postfix tunables (`bounce_queue_lifetime`
   and `maximal_queue_lifetime`) stored in the `parameters` table and
   pushed into `main.cf` via the generic Postfix config regen path.
2. **Queue Viewer / Actions** — a live read of `mailq` plus per-message
   Hold / Unhold / Re-queue / Delete operations and a queue-wide Flush.

The viewer is read-only against `mailq`; everything that mutates the
queue goes through `postqueue` or `postsuper` inside the container.
Hermes never edits `/var/spool/postfix/*` directly, so admin actions
respect Postfix's own queue locking and are safe to run while mail is
flowing.

## The queue this page shows — and the ones it doesn't

```
  ┌────────────────────────────────────────────────────────────┐
  │ hermes_postfix_dkim   (the queue this page reads)          │
  │   /var/spool/postfix/{maildrop, incoming, active,          │
  │                       deferred, hold, corrupt}             │
  └─────────┬──────────────────────────────────────────────────┘
            │ (content filter loop)
            ▼
  ┌────────────────────────────────────────────────────────────┐
  │ hermes_mail_filter    (Amavis + ClamAV + SpamAssassin)     │
  │   transient per-message work, not a persistent queue       │
  └─────────┬──────────────────────────────────────────────────┘
            │
            ▼
  ┌────────────────────────────────────────────────────────────┐
  │ hermes_dovecot        (LMTP delivery to mailboxes)         │
  │   no Postfix queue here; failures bounce back to the       │
  │   postfix queue above                                      │
  └────────────────────────────────────────────────────────────┘
```

Postfix is the only component that maintains a persistent on-disk
spool. A message you see in this viewer is a message Postfix is still
holding — it has not been handed off to the next hop (LMTP to Dovecot,
remote MX, satellite Amavis), or it was handed off and bounced back
into `deferred`, or an admin moved it into `hold`. Amavis's transient
work is not a "queue" in the Postfix sense and is not visible here; if
the content filter is stuck, messages pile up in `active` on the
gateway side, which this page does surface.

## Queue Settings

Two values, both saved into rows of the `parameters` table keyed by
`parameter = 'bounce_queue_lifetime'` / `'maximal_queue_lifetime'`
(`child = 2` parent rows, with the user-selected value stored in the
`child = 1` row). The dropdowns range 0–90 days.

| Setting | `main.cf` directive | Meaning |
|---|---|---|
| **Bounce Queue Lifetime** | `bounce_queue_lifetime` | How long Postfix retries a bounce message that cannot be delivered to its envelope sender before giving up. `0` means single-delivery attempt only — failing bounces are double-bounced to the postmaster immediately. |
| **Max Queue Lifetime** | `maximal_queue_lifetime` | How long Postfix retries a normal message before generating a permanent failure (bounce) to the sender. `0` means single-delivery attempt only. |

Both values are stored as integers in the dropdown but written into the
DB with the `d` suffix (e.g. `5d`) so they go straight into `main.cf`
unmodified. Hermes regenerates `main.cf` from the `parameters` table on
save and reloads Postfix; there is no incremental edit path. See the
[Server Setup](https://docs.deeztek.com/books/administrator-guide/page/server-setup) doc for the broader Postfix regen
pipeline.

> **Why `0` is a real choice.** `bounce_queue_lifetime = 0` is a
> reasonable setting for relays (Postfix's own built-in default is
> `5d`): a bounce that cannot be delivered is more likely a forged
> sender than a real recipient mailbox, and keeping it in the queue
> for days wastes attempts on joe-job traffic. Leave the seed value
> unless you have a specific reason to change it.

## Queue Viewer — how the table is built

`inc/mail_queue_get_queue.cfm` does the live read in three phases:

1. **Summary probe.** Runs
   `docker exec hermes_postfix_dkim /bin/bash -c '/usr/bin/mailq | /usr/bin/tail -1'`
   to read just the trailing `-- N Kbytes in M Requests.` line and
   parse `M` out as the total queue count. This is cheap — no full
   parse, no full transfer of the queue contents.
2. **Overload gate.** If the total exceeds **500** (`maxQueueLoad`),
   the viewer refuses to load the queue at all. The page renders a red
   callout with the count and shell hints (`postsuper -d ALL`,
   `postsuper -H ALL`) for the admin to recover from the command line.
   This is a self-protection step — parsing tens of thousands of
   `mailq` lines in CFML would hang the page and lock a CommandBox
   worker thread.
3. **Full parse.** If under 500, runs `docker exec hermes_postfix_dkim
   /usr/bin/mailq` and parses the multi-line output in CFML into a
   query object with `QueueID`, `Sender`, `Recipient`,
   `ConnectionStatus`, and `MsgStatus`. The display table is capped at
   **100 rows** (`maxQueueDisplay`); a yellow callout appears if the
   queue has between 101 and 500 entries.
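
The first two phases reduce to a regex over one line plus two threshold checks. A minimal Python sketch follows; the real logic lives in CFML, `queue_count` and `viewer_mode` are illustrative names, and only the 500 and 100 limits come from the description above.

```python
import re

MAX_QUEUE_LOAD = 500      # overload gate (maxQueueLoad in the text)
MAX_QUEUE_DISPLAY = 100   # display cap (maxQueueDisplay in the text)

# Trailing mailq line, e.g. "-- 7 Kbytes in 2 Requests."
SUMMARY_RE = re.compile(r"^-- (\d+) Kbytes in (\d+) Requests?\.$")


def queue_count(last_line: str) -> int:
    """Return the request count M from the last line of mailq output."""
    line = last_line.strip()
    if line == "Mail queue is empty":
        return 0
    match = SUMMARY_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognised mailq summary: {line!r}")
    return int(match.group(2))


def viewer_mode(total: int) -> str:
    """Map the queue count onto the three page states."""
    if total > MAX_QUEUE_LOAD:
        return "overload"    # red callout, no table
    if total > MAX_QUEUE_DISPLAY:
        return "truncated"   # yellow callout, first 100 rows
    return "full"
```

Because only the summary line is inspected, the gate costs the same whether the queue holds ten messages or ten thousand.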

The parser reads the per-entry queue-ID suffix to derive the status
column. Postfix's `mailq` marks active messages with `*` and held
messages with `!` after the queue ID; everything else is treated as
`deferred` (rendered as `N/A` in the badge). This is by design — the
viewer is a snapshot, not a queue-state diff.

| Suffix | `mailq` meaning | Rendered as |
|---|---|---|
| `*` | currently being delivered (in `active`) | green `ACTIVE` badge |
| `!` | admin-held (in `hold`) | yellow `ON-HOLD` badge |
| (none) | waiting for retry (in `deferred`) | grey `N/A` badge |

The `ConnectionStatus` column is whatever Postfix put in parentheses on
the line after the message header (typically the SMTP error from the
last delivery attempt — `Connection refused`, `Greylisted, please
try again`, etc.). For messages that have never been attempted it is
blank.
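
To make the parse concrete, here is a rough Python equivalent of phase 3. It is a sketch, not the CFML parser: the `QueueEntry` type and `parse_mailq` function are hypothetical, and `SAMPLE` is an invented two-message queue in the standard `mailq` layout.

```python
from dataclasses import dataclass, field
from typing import List, Optional

SUFFIX_STATUS = {"*": "active", "!": "hold"}


@dataclass
class QueueEntry:
    queue_id: str
    msg_status: str                 # "active", "hold", or "deferred"
    sender: str
    connection_status: str = ""     # text Postfix printed in parentheses
    recipients: List[str] = field(default_factory=list)


def parse_mailq(output: str) -> List[QueueEntry]:
    entries: List[QueueEntry] = []
    current: Optional[QueueEntry] = None
    for raw in output.splitlines():
        line = raw.strip()
        if not line:
            current = None          # a blank line ends the current entry
            continue
        if line.startswith(("-Queue ID-", "-- ")) or line == "Mail queue is empty":
            continue                # header, summary, or empty-queue notice
        if current is None:
            tokens = line.split()   # queue ID, size, 4 date tokens, sender
            head = tokens[0]
            suffix = head[-1] if head[-1] in SUFFIX_STATUS else ""
            queue_id = head[:-1] if suffix else head
            status = SUFFIX_STATUS.get(suffix, "deferred")
            current = QueueEntry(queue_id, status, tokens[-1])
            entries.append(current)
        elif line.startswith("(") and line.endswith(")"):
            current.connection_status = line[1:-1]
        else:
            current.recipients.append(line)
    return entries


SAMPLE = """-Queue ID-  --Size-- ----Arrival Time---- -Sender/Recipient-------
A1B2C3D4E5*    1234 Mon Jan  1 12:00:00  alice@example.com
                                         bob@example.org

F6A7B8C9D0     5678 Mon Jan  1 12:01:00  alice@example.com
(connect to mx.example.org[192.0.2.1]:25: Connection refused)
                                         carol@example.org

-- 7 Kbytes in 2 Requests.
"""
```

Calling `parse_mailq(SAMPLE)` yields one `active` entry with an empty `connection_status` and one `deferred` entry carrying the `Connection refused` text.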

### View Message (`view_mail_queue_message.cfm`)

Clicking the magnifying glass on a row opens a full dump of the queued
message — headers and body — via `docker exec hermes_postfix_dkim
/usr/sbin/postcat -q <queueid>`. The output is rendered into a plain
textarea with a print button. No edit, no resend; if you need the
message to go out, use Re-queue from the main viewer.

## Per-message actions

All four mutation actions converge on `inc/mail_queue_action.cfm`,
which validates the queue ID against `^[A-Fa-f0-9]+$` (defence against
shell injection) and shells out to `postsuper` with the right flag:

| Action | Postsuper flag | What it does | Typical use |
|---|---|---|---|
| **Hold** | `-h` | Moves the message into `hold/`. Postfix will not touch it again until unheld. | Pause a stuck loop, freeze a message for forensic copy, hold while debugging upstream issues |
| **Unhold** | `-H` | Moves the message back into `deferred/` so retries resume | Recover a held message after the underlying issue is fixed |
| **Re-queue** | `-r` | Re-injects the message through the cleanup daemon, re-applying milter chain (OpenDKIM, OpenDMARC, body milter), header_checks, etc. | Force a fresh content-filter pass — useful after fixing a milter, updating a header_check rule, or changing a relay map |
| **Delete** | `-d` | Removes the message from the queue **permanently**. No undo. | Drop spam, drop a stuck message you don't want re-delivered, drop a confirmed mail loop |

The action handler loops the selected queue IDs and invokes `postsuper`
once per ID via a generated temp script under `/opt/hermes/tmp/` —
`postsuper` writes its result to stderr, and the temp-script pattern
(with `2>&1`) is the only reliable way to capture it from `cfexecute`.
Per-ID success or failure is counted independently; the result alert
shows both the count and the queue IDs in each bucket.
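
The validate-then-dispatch step can be sketched in a few lines of Python. This is an illustration under assumptions, not the handler itself: `postsuper_argv` is a hypothetical helper, and it builds an argument list rather than the temp script the real handler generates.

```python
import re

# Same character class as the handler's ^[A-Fa-f0-9]+$ check.
QUEUE_ID_RE = re.compile(r"[A-Fa-f0-9]+")

POSTSUPER_FLAGS = {
    "hold": "-h",
    "unhold": "-H",
    "requeue": "-r",
    "delete": "-d",
}


def postsuper_argv(action: str, queue_id: str) -> list:
    """Build the docker exec argument list for one queue ID."""
    if action not in POSTSUPER_FLAGS:
        raise ValueError(f"unknown action: {action!r}")
    # fullmatch rejects anything outside the hex class, including a
    # trailing newline that a plain match against "$" would let through.
    if not QUEUE_ID_RE.fullmatch(queue_id):
        raise ValueError(f"invalid queue ID: {queue_id!r}")
    return ["docker", "exec", "hermes_postfix_dkim",
            "postsuper", POSTSUPER_FLAGS[action], queue_id]
```

Looping this over the selected IDs and counting successes and failures separately gives the two buckets shown in the result alert.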

> **Re-queue is not the same as Flush.** Re-queue re-injects through
> the milter / content-filter chain (so a fresh OpenDKIM signature is
> generated, the disclaimer milter runs again, etc.). Flush just nudges
> Postfix to retry delivery on what is already in `deferred`. If a
> message is broken because of a milter failure during the original
> intake, Re-queue can fix it; Flush will not.

## Flush Queue

The Flush button runs
`docker exec hermes_postfix_dkim /usr/sbin/postqueue -f`. This is a
queue-wide "retry now" — it scans the `deferred` queue and moves
eligible messages into `active` for an immediate delivery attempt.
Held messages are not touched.

A success result means `postqueue` exited cleanly, not that delivery
succeeded. If a deferred message's destination is still unreachable, it
goes right back into `deferred` after the attempt. Use the System Logs
page (or `/remotelogs/postfix/mail.log` for live tail) to see the
actual delivery outcomes.

## Overload mode — the bulk-recovery path

When the queue exceeds 500 messages the page deliberately refuses to
render the table. Both shell-hint commands in the callout are full
queue-wide operations that bypass the per-message UI:

```bash
# Delete everything in the queue (no exceptions, no confirmation)
docker exec hermes_postfix_dkim postsuper -d ALL

# Move every held message back to deferred
docker exec hermes_postfix_dkim postsuper -H ALL
```

These are the standard Postfix mass-action commands. `postsuper` has no
built-in filter for "delete only spam bounces" or similar; if you need
granular cleanup of a large queue, filter `mailq` output with a custom
shell pipeline first, then feed the resulting queue IDs to
`postsuper -d -`, which reads them from standard input.
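
As one possible shape for such a filter, the Python sketch below picks out the queue IDs for a single envelope sender and formats them for `postsuper -d -`. The function names are hypothetical, and the matching relies only on the standard `mailq` entry layout (queue ID at column 0, sender as the last token).

```python
import re

# Entry head line: queue ID, optional */! flag, size, date, sender.
HEAD_RE = re.compile(r"^([0-9A-Fa-f]+)[*!]?\s+\d+\s+.*\s(\S+)\s*$")


def queue_ids_for_sender(mailq_output: str, sender: str) -> list:
    """Return queue IDs whose envelope sender equals `sender`."""
    ids = []
    for line in mailq_output.splitlines():
        match = HEAD_RE.match(line)
        if match and match.group(2) == sender:
            ids.append(match.group(1))
    return ids


def postsuper_stdin(ids: list) -> str:
    """One queue ID per line, the format `postsuper -d -` reads."""
    return "".join(f"{qid}\n" for qid in ids)


# Feeding the list to Postfix (not executed here):
#   subprocess.run(
#       ["docker", "exec", "-i", "hermes_postfix_dkim", "postsuper", "-d", "-"],
#       input=postsuper_stdin(ids), text=True, check=True)
```

Review the ID list before piping it in; as with `-d ALL`, deletion is permanent.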

> **Why a hard cap and not pagination.** `mailq` has no offset or limit
> option, so serving any single page would still mean fetching and
> parsing the full output, which is the expensive operation the gate
> exists to avoid. The hard cap forces the admin into the command
> line, where the right tools for bulk queue work live.

## Concurrent safety

Every action goes through `postqueue` or `postsuper`, which acquire
Postfix's own queue locks before touching files. Multiple admins
hitting the page in parallel cannot corrupt the queue — at worst, two
Delete clicks on the same queue ID will have one succeed and the other
return "no such queue file", which is rendered as a failure row in the
result alert. The viewer itself is read-only and the `mailq` snapshot
can race with mutations (a message you tick may have already been
delivered by the time you click the action), which is also fine — the
mutation just no-ops with the same "no such queue file" message.

## Related pages

- [Server Setup](https://docs.deeztek.com/books/administrator-guide/page/server-setup) — Postfix `myhostname`, `myorigin`,
  and the `parameters` → `main.cf` regen path that this page's Queue
  Settings hooks into.
- [System Logs](https://docs.deeztek.com/books/administrator-guide/page/system-logs) — where delivery outcomes for queued
  messages actually surface (Postfix logs to mail.* → rsyslog →
  `SystemEvents` → the System Logs viewer).
- [Intrusion Prevention](https://docs.deeztek.com/books/administrator-guide/page/ips) — IP-level bans for
  brute-force SMTP-AUTH that show up in Postfix's connection logs.

# Password Resets

Admin path: **System > Password Resets** (`view_password_reset_requests.cfm`,
`inc/process_admin_password_reset.cfm`,
`inc/cancel_password_reset_requests.cfm`,
`inc/check_hibp.cfm`).

This is the admin-side **queue** for password-reset requests that
users have submitted from the public **Forgot Password** page
(`/user-auth/forgot_password.cfm`). Most requests resolve themselves
via email or Pushover and never need admin attention — the requests
that land on this page are the ones that **couldn't** be self-served.

The page is also where an admin can **manually reset** any user's
password (mailbox or relay) regardless of how the request arrived — it
is the single tool for forcing a password change.

## Where a request comes from

```
End user opens /user-auth/forgot_password.cfm
        │   (link from the /users portal login page; same page
        │    serves admin and user portals at the public URL)
        ▼
fills in email + CAPTCHA
        │
        ▼
process_password_reset_request.cfm runs:
  1. honeypot check (hidden field "fax_number_ext" must be empty)
  2. CAPTCHA validation (built-in math OR reCAPTCHA OR
     hCaptcha OR Turnstile — configured globally)
  3. 15-minute rate limit: refuse if a pending request for this
     email is less than 15 minutes old
  4. LDAP lookup: find the user, determine type from group membership
        │
        ▼
route by user type
   ┌──────────────────┬──────────────────┬──────────────────┐
   ▼                  ▼                  ▼                  ▼
 RELAY            MAILBOX            ADMIN              REMOTE-AUTH
 (cn=relays)      (cn=mailboxes)     (cn=admins)        (any group)
   │                  │                  │                  │
   ▼                  ▼                  ▼                  ▼
 email token   secondary email     REFUSED              REFUSED
 to relay      verified?           (admins must         (password is
 user's        ├ YES → email       use peer-admin       upstream;
 external      │       to that     reset path on        Hermes never
 email         │       address     this page)           saw it)
               └ NO  → admin                            shown the same
                       queue (this                      generic "if an
                       page)                            account exists"
                                                        success page for
                                                        security
```

The route the request takes determines whether it ever shows up on this
page:

| Request shape | Lands here? |
|---|---|
| Relay user with valid email | **No** — email is sent automatically with a 15-minute reset link |
| Mailbox user with a verified secondary email | **No** — email is sent automatically to the secondary address |
| Mailbox user with no verified secondary email | **Yes** — admin must reset manually |
| Mailbox user with Pushover enabled | **No** — Pushover notification sent automatically |
| Admin self-service | **Never accepted** — admins must be reset by another admin from this page |
| RemoteAuth user (`auth_type = 'remote'`) | **Never accepted** — Hermes does not own the password (see below) |
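
The routing table can be read as a small decision function. The Python sketch below is an illustration only: the function and parameter names are hypothetical, and the precedence of a verified secondary email over Pushover is an assumption, since the table does not say which wins when both are set.

```python
def route_reset_request(user_type: str,
                        auth_type: str = "local",
                        secondary_email_verified: bool = False,
                        pushover_enabled: bool = False) -> str:
    """Return "email", "pushover", "admin", or "refused"."""
    if auth_type == "remote":
        return "refused"            # password lives upstream
    if user_type == "admin":
        return "refused"            # admins are reset by another admin
    if user_type == "relay":
        return "email"              # token goes to the external address
    if user_type == "mailbox":
        if secondary_email_verified:
            return "email"          # assumed to take priority
        if pushover_enabled:
            return "pushover"
        return "admin"              # lands in the queue on this page
    return "refused"                # unknown user: same generic page
```

Every `"refused"` outcome renders the same generic success page, so the caller cannot tell the branches apart.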

> **By design.** Admin self-service password reset is blocked because
> a compromised admin email is an easy lateral-movement vector and the
> blast radius is the whole console. The forgot-password page shows
> the same generic "if an account exists, instructions have been sent"
> message for blocked admins as for blocked RemoteAuth users and for
> unknown emails — bots probing for admin usernames learn nothing.

## RemoteAuth requests are never accepted

For users with `recipients.auth_type = 'remote'` (or, in the future,
`mailboxes.auth_type = 'remote'`), the request flow short-circuits at
step 4 with the same generic success message as for unknown emails.
Hermes does **not** store, hash, or have any way to update the user's
password — it lives in the customer's upstream AD/LDAP.

These users must use their organization's own password-reset workflow
(self-service portal, helpdesk ticket, etc.). See
[LDAP RemoteAuth](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth) and
[Credential Model § Local-auth users vs. remote-auth users](https://docs.deeztek.com/books/administrator-guide/page/credential-model#local-auth-users-vs-remote-auth-users).

## Database schema — `password_reset_requests`

| Column | Purpose |
|---|---|
| `id` | PK |
| `email` | The address the user typed into the form |
| `ldap_username` | The `cn` resolved from LDAP at submission time |
| `user_type` | `relay`, `mailbox`, or `admin` (admin rows shouldn't exist in practice — the flow blocks them at submit) |
| `token` | 64-char random — the secret in the reset link emailed to the user |
| `notification_method` | `email`, `pushover`, or `admin` — how the user was notified |
| `status` | `pending`, `completed`, `expired`, `cancelled`; also `failed` when the notification email could not be sent (see Failure semantics) |
| `requested_at` | When the user submitted the form |
| `expires_at` | NOW + 15 min for `email`/`pushover` methods; NULL for `admin` method (no link to expire) |
| `completed_at` | When the admin (or self-service flow) resolved it |
| `completed_by` | The admin username, or the system user that auto-resolved |

### Auto-cleanup runs on every page load

The page does **not** rely on a scheduled job for housekeeping. Two
DELETE queries run at the top of every request:

```sql
-- Cull expired pending requests (the reset link is dead anyway)
DELETE FROM password_reset_requests
 WHERE status = 'pending'
   AND expires_at < NOW();

-- Cull completed requests older than 30 days (audit window)
DELETE FROM password_reset_requests
 WHERE status = 'completed'
   AND completed_at < DATE_SUB(NOW(), INTERVAL 30 DAY);
```

This keeps the table bounded with no admin intervention. The 30-day
audit window is hardcoded — if you need longer retention for
compliance, that's a code change, not a configuration knob.
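
One consequence of the first query is easy to miss: a comparison against `NULL` is never true in SQL, so admin-method rows (which have no `expires_at`) are never culled while pending. The Python sketch below mirrors both predicates to make that visible; `is_culled` is a hypothetical helper, with rows modeled as plain dicts.

```python
from datetime import datetime, timedelta

AUDIT_WINDOW = timedelta(days=30)   # hardcoded window from the text


def is_culled(row: dict, now: datetime) -> bool:
    """True if either cleanup DELETE would remove this row."""
    if row["status"] == "pending":
        expires_at = row.get("expires_at")
        # NULL < NOW() is not true in SQL, so None never matches.
        return expires_at is not None and expires_at < now
    if row["status"] == "completed":
        completed_at = row.get("completed_at")
        return completed_at is not None and completed_at < now - AUDIT_WINDOW
    return False
```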

## The page surface

| Column | Notes |
|---|---|
| (checkbox) | Only renders for `pending` rows |
| Email | The user's submitted address |
| User Type | Badge: relay (info-blue), mailbox (primary), admin (warning) |
| Method | Icon + label: email envelope, Pushover bell, admin shield |
| Requested | Submission timestamp |
| Expires | NULL for admin-method rows; for time-bound rows, shows the timestamp + an "Expired" red badge if past and still pending |
| Status | pending (yellow), completed (green), expired (gray), cancelled (red) |
| Completed By | Admin username + timestamp once resolved |

Two action buttons sit above the table:

- **Reset Password** — opens the reset modal for the single selected
  pending row (alerts if zero or more than one is selected)
- **Cancel Request(s)** — opens a confirmation modal that hard-deletes
  every selected pending row

### Why notify-user is shown only for relay rows

The reset modal shows a **Notify user via email** checkbox **only**
when the selected row is a relay user. For mailbox and admin users the
primary email is the mailbox address itself, so the notice would sit
behind the same auth chain whose login credential the admin is about
to change, and it would not usefully reach the user. Relay users hold
an external email address, so a "your password was reset" notification
sent to that external address works.

## Admin reset flow

When the admin clicks **Reset Password** and confirms the modal,
`process_admin_password_reset.cfm` runs:

```
1. Form validation: passwords match, length >= 8, request_id present
2. (optional) HIBP check via api.pwnedpasswords.com — k-anonymity
   prefix lookup; reject on match
3. Lookup the row — must still be status='pending'
4. docker exec hermes_ldap slappasswd \
        -o module-load=argon2.la -h {ARGON2} \
        -s <new_password>
        --> returns {ARGON2}$argon2id$...
5. Render /opt/hermes/templates/ldap_modifyuserpassword.ldif
   (THE_USERNAME, THE_OU=users, THE_PASSWORD placeholders)
   to /opt/hermes/tmp/<token>_modifyuserpassword.ldif
6. docker exec hermes_ldap ldapmodify -Y EXTERNAL \
        -H ldapi:///... -f /opt/hermes/tmp/<token>_modifyuserpassword.ldif
7. Delete the temp LDIF
8. If the user has a Nextcloud account (mailboxes.nextcloud_enabled=1):
        docker exec -e OC_PASS=<new> -u www-data hermes_nextcloud \
          php /var/www/html/occ user:resetpassword \
          --password-from-env <email>
   (sync NC's local password column — see Credential Model for why
    NC keeps a local password that no human knows)
9. UPDATE password_reset_requests
        SET status='completed', completed_at=NOW(), completed_by=<admin>
10. UPDATE password_reset_requests SET status='expired'
        WHERE email=<email> AND status='pending' AND id != <this one>
    (clears stale pending duplicates the user may have submitted)
11. If notify_user checked (relay rows only):
        cfmail via hermes_postfix_dkim:10026 — generic "your password
        was reset by an administrator" template with the console URL
```

Two non-obvious bits:

- **Two hashing tools, one outcome.** This page uses `slappasswd` with
  the OpenLDAP argon2 module loaded; [System Users](https://docs.deeztek.com/books/administrator-guide/page/system-users)
  uses the Authelia CLI image. Both produce `{ARGON2}$argon2id$...`
  hashes that the same OpenLDAP overlay validates. They are
  interchangeable; the difference is historical (this page predates
  the Authelia-image hashing pattern). Either is correct.

- **Nextcloud password sync via temp shell script.** Step 8 writes a
  shell script to `/opt/hermes/tmp/` and runs it instead of
  `cfexecute`ing `docker exec` directly. The script wrapper exists
  because Lucee's `cfexecute` mishandles stderr, quoting, and `OC_PASS`
  env-var injection on commands of this shape, and the temp-script
  pattern is the established Hermes workaround.
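
Step 2 of the flow, the k-anonymity lookup, works by sending only the first five hex characters of the password's SHA-1 digest and comparing the returned suffixes locally. A minimal Python sketch of that technique follows; it is not the code in `inc/check_hibp.cfm`, the function names are illustrative, and the HTTP call itself is left out so the example runs offline.

```python
import hashlib


def hibp_prefix_suffix(password: str):
    """Split the uppercase SHA-1 hex digest into (5-char prefix, rest)."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def is_pwned(password: str, range_body: str) -> bool:
    """Check a range-endpoint response body for this password.

    `range_body` is the text returned for the prefix: one
    "SUFFIX:COUNT" pair per line.
    """
    _, suffix = hibp_prefix_suffix(password)
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        # Zero-count entries are treated as non-matches.
        if candidate.upper() == suffix and count.strip() not in ("", "0"):
            return True
    return False
```

The full password and the full hash never leave the server; only the prefix (for example `5BAA6`) is sent to `api.pwnedpasswords.com`.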

## Cancel flow

`cancel_password_reset_requests.cfm` performs a hard `DELETE` against
every selected `pending` row. There is no soft-delete — the row is
gone, the user must submit a new request if they still need help. This
is the right shape because the request never carried valuable data;
it's just a "please help me" signal.

The admin username doing the cancel is **not** recorded; only
completions record `completed_by`. An audit trail for cancellations is
a planned schema extension.

## CAPTCHA — the public side

The forgot-password page picks a CAPTCHA provider from `system_settings`
at runtime. Four providers are supported today:

| `captcha_provider` | What appears on the page |
|---|---|
| `builtin` (default) | Math word-problem ("What is three plus seven?") — no third-party JS, no cookie, no API key required. ~225 unique combinations across addition (1-10), subtraction (1-10, positive result), and small multiplication (1-5). |
| `recaptcha` | Google reCAPTCHA v2 — site key + secret key required |
| `hcaptcha` | hCaptcha — site key + secret key required |
| `turnstile` | Cloudflare Turnstile — site key + secret key required |

All four use the same flow: client-side widget posts a token with the
form, server-side `process_password_reset_request.cfm` validates the
token (for external providers, via HTTPS POST to the provider's
`siteverify` endpoint). Failed validation always redirects back with
reason code `9` ("invalid CAPTCHA"). For external providers, if the
provider's API is unreachable from Hermes, the page treats the request
as invalid — failing closed is the right call on a brute-force
defense surface.
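
Failing closed comes down to treating every error path as a failed CAPTCHA. The Python sketch below shows the pattern under assumptions: `captcha_ok` and the injected `verify` callable are hypothetical stand-ins for the server-side `siteverify` request, whose JSON response carries a boolean `success` field.

```python
from typing import Callable


def captcha_ok(token: str, verify: Callable[[str], dict]) -> bool:
    """Return True only on an explicit success from the provider."""
    if not token:
        return False
    try:
        result = verify(token)      # HTTPS POST to siteverify in practice
    except Exception:
        return False                # unreachable provider: fail closed
    return result.get("success") is True
```

Anything other than a literal `True` (a timeout, a malformed body, a missing field) ends in the same rejection as a wrong answer.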

A **honeypot** field (named `fax_number_ext`, hidden via CSS) runs
**before** the CAPTCHA check. Real users never see or fill it; bots
that submit the entire form are silently rejected with the same
generic success page so they can't tell their submission was
discarded.

## Rate limiting — the 15-minute window

`process_password_reset_request.cfm` queries for any `pending` row
with the same email submitted in the last 15 minutes; if one exists,
the new submission is refused with reason `8`. The window is per-email,
not per-IP — a malicious actor enumerating addresses can still hit
many emails in parallel, but cannot spam any single one.

The window is hardcoded; if you need longer cool-down for a
high-noise environment, that's a code change.
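
The check itself is a single lookup keyed on the email address. A Python sketch of the same rule, with rows modeled as dicts and `is_rate_limited` as a hypothetical name (the real check is a query against `password_reset_requests`):

```python
from datetime import datetime, timedelta

RATE_LIMIT_WINDOW = timedelta(minutes=15)   # hardcoded in the text


def is_rate_limited(email: str, rows: list, now: datetime) -> bool:
    """True if a pending request for this email is under 15 minutes old."""
    return any(
        row["email"] == email
        and row["status"] == "pending"
        and now - row["requested_at"] < RATE_LIMIT_WINDOW
        for row in rows
    )
```

Nothing in the predicate refers to the client IP, which is why parallel probing across different addresses is not slowed down by this check.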

## Token security

For email and Pushover methods, the user receives a link of shape:

```
https://<console>/user-auth/reset_password.cfm?token=<64-char-random>
```

- The token is 64 hex chars from `inc/generate_customtrans.cfm`,
  generated to be cryptographically strong.
- It expires after **15 minutes** (`expires_at` column).
- It is **single-use**: when the user successfully completes the
  reset, the row's `status` flips to `completed`, and the
  `reset_password.cfm` endpoint rejects further use.
- Completing a reset invalidates any other pending request for the
  same email (step 10 of the admin reset above; the user-side reset
  endpoint does the equivalent).

For the `admin` method (the rows that show up on this page), the
token still exists in the row but the **expires_at is NULL** — there
is no email link to expire because no email was sent. The admin
resolves the request when they get to it; the queue serves as the
notification channel.
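
For a sense of scale, 64 hex characters encode 32 random bytes. The Python sketch below mints a token of the same shape and applies the expiry rule per notification method. It is an analogue only: the actual generator is the CFML in `inc/generate_customtrans.cfm`, and `mint_reset_token` is a hypothetical name.

```python
import secrets
from datetime import datetime, timedelta

TOKEN_TTL = timedelta(minutes=15)


def mint_reset_token(method: str, now: datetime):
    """Return (token, expires_at) for a new request row."""
    token = secrets.token_hex(32)   # 32 random bytes -> 64 hex characters
    # Admin-method rows carry no link, so there is nothing to expire.
    expires_at = None if method == "admin" else now + TOKEN_TTL
    return token, expires_at
```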

## What this page does NOT do

| Concern | Lives on |
|---|---|
| Admin's own password change | They sign in to `/admin/`, go to **My Settings** (or have another admin reset it from [System Users](https://docs.deeztek.com/books/administrator-guide/page/system-users)'s edit modal) |
| Configuring CAPTCHA provider + keys | Configured via `system_settings` rows; admin UI for this is planned. Defaults to `builtin` math CAPTCHA. |
| Configuring the rate-limit window | Hardcoded 15 minutes — code change required |
| Configuring the token TTL | Hardcoded 15 minutes — code change required |
| Pushover credentials per-user | Set on the user portal's **Account Settings** page; this page just consumes them |
| The reset email template / branding | Hardcoded in `process_password_reset_request.cfm` and `process_admin_password_reset.cfm`; uses `hermes_logo_new_orange2.png` as a CID attachment |
| 2FA device deletion | [System Users](https://docs.deeztek.com/books/administrator-guide/page/system-users)'s **Delete 2FA Devices** button — runs `authelia storage user totp delete` |

## Failure semantics

| What breaks | What happens |
|---|---|
| `hermes_ldap` down during admin reset | The `slappasswd` and `ldapmodify` calls fail; the admin sees the raw error, the request row stays `pending`, no password change. Retry after LDAP recovers. |
| `hermes_postfix_dkim` down during user-initiated email request | The cfmail throws; `process_password_reset_request.cfm` catches, flips the request row to `status='failed'`, and shows reason `6` ("Unable to send password reset"). |
| HIBP API unreachable | Server-side check silently passes (the JavaScript on the modal already warned the user; defense-in-depth pattern). The reset still completes. |
| Token guessed / brute-forced | Computationally infeasible at 64 hex chars (256 bits of entropy). |
| `hermes_nextcloud` down during admin reset step 8 | LDAP password is already updated; the NC sync step fails silently (caught in a non-fatal cftry). The user can log in to `/users` immediately; webmail and DAV will work as soon as NC is back. |

## Files and containers touched

| Path | Owner | Role |
|---|---|---|
| `config/hermes/var/www/html/admin/2/view_password_reset_requests.cfm` | `hermes_commandbox` | Page (table + 2 modals + auto-cleanup queries) |
| `config/hermes/var/www/html/admin/2/inc/process_admin_password_reset.cfm` | `hermes_commandbox` | Admin reset handler (LDAP + NC sync + audit + optional notify) |
| `config/hermes/var/www/html/admin/2/inc/cancel_password_reset_requests.cfm` | `hermes_commandbox` | Hard-deletes selected pending rows |
| `config/hermes/var/www/html/user-auth/forgot_password.cfm` | `hermes_commandbox` | Public-facing request entry point (CAPTCHA + honeypot + LDAP lookup) |
| `config/hermes/var/www/html/user-auth/inc/process_password_reset_request.cfm` | `hermes_commandbox` | Rate-limit check + token mint + INSERT + route to email/Pushover/admin |
| `config/hermes/var/www/html/user-auth/inc/ldap_get_user_groups.cfm` | `hermes_commandbox` | Determines user type from LDAP group membership |
| `config/hermes/var/www/html/user-auth/reset_password.cfm` | `hermes_commandbox` | Token-consuming endpoint that actually changes the password (user side) |
| `/opt/hermes/templates/ldap_modifyuserpassword.ldif` | `hermes_commandbox` | LDIF template for the password-replace operation |
| `/opt/hermes/tmp/<token>_modifyuserpassword.ldif` | `hermes_commandbox`, `hermes_ldap` | Ephemeral rendered LDIF; deleted after `ldapmodify` |
| `/opt/hermes/tmp/<token>_nc_pwd_update.sh` | `hermes_commandbox` | Ephemeral shell script for the NC `occ user:resetpassword` step |
| `password_reset_requests` table | `hermes_db_server` (`hermes` DB) | The queue itself |

Every shell-out uses `docker exec hermes_ldap …`, `docker exec hermes_nextcloud …`, or the standard `hermes_postfix_dkim:10026` re-injection port per the canonical Hermes pattern.

## Related documentation

- [System Users](https://docs.deeztek.com/books/administrator-guide/page/system-users) — admin-account CRUD; password changes for admins happen there, not on this page
- [Credential Model](https://docs.deeztek.com/books/administrator-guide/page/credential-model) — why mailbox users carry both a web-login password (reset here) and separate per-device app passwords (reset elsewhere)
- [LDAP RemoteAuth](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth) — why remote-auth users cannot be reset through this page
- [Authentication Settings](https://docs.deeztek.com/books/administrator-guide/page/authentication-settings) — the Authelia JWT secret used for the reset-link signature on the user-side reset endpoint
- [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) — the console hostname embedded in the reset-link emails
- [Intrusion Prevention](https://docs.deeztek.com/books/administrator-guide/page/ips) — Fail2ban `authelia` jail; layered defense against brute-force on the login surface this page protects

# Scheduled Tasks

Admin path: **System > Scheduled Tasks** (`view_scheduled_tasks.cfm`,
`inc/ofelia_generate_config.cfm`, `inc/run_scheduled_task_action.cfm`,
`inc/toggle_ofelia_job_action.cfm`, `inc/restart_ofelia.cfm`).

This page is the admin surface over **Ofelia**, Hermes's cron runner.
Ofelia (`mcuadros/ofelia:latest`) sits next to the application
containers, mounts the Docker socket, and on a schedule does `docker
exec <container> <command>` for each configured job. The page lists
every job in the `ofelia_jobs` table, displays its humanized schedule
and last manual-run timestamp, and exposes per-row **Enable/Disable**
and **Run Now** controls.

Hermes does not use the host's crond. Every recurring task — certificate
renewal, the daily update check, quarantine notifications,
mail-queue health checks, DMARC report processing, malware-feed refresh,
log rotation — runs through this single Ofelia container and is
manageable from this page.

## Why Ofelia and not host cron

A traditional host crontab does not fit Hermes's deployment model:

| Requirement | Host cron problem | Ofelia behavior |
|---|---|---|
| Run a command inside `hermes_commandbox` or `hermes_dmarc` on a schedule | Host cron has to `docker exec` from outside; failure modes (missing container, wrong user) surface in syslog, not in the admin UI | Ofelia speaks Docker natively; jobs are `job-exec` blocks against a named container |
| Notify the admin when a job fails | Cron emails the local UNIX user; meaningless inside a container deployment | Ofelia has a built-in SMTP notifier that emails `admin_email` via `hermes_postfix_dkim:10026` (the auto-DKIM-signing re-injection port) when `mail-only-on-error = true` |
| Survive a host reboot the same way every other Hermes service does | Cron units have to be packaged separately | `hermes_ofelia` is just another container in [`docker-compose.yml`](https://github.com/deeztek/Hermes-Secure-Email-Gateway/blob/main/docker-compose.yml); `restart: unless-stopped` covers it |
| Be inspectable and runnable on demand from the web UI | Out-of-band; admin would need shell access | This page reads the same table Ofelia reads and can re-fire any job synchronously |

The trade-off is that `config.ini` is regenerated from the database — so
direct hand-edits to `/etc/ofelia/config.ini` are **overwritten on every
save**. The DB is the source of truth.

## How a scheduled job flows through the stack

```
+-----------------------+
| ofelia_jobs (MariaDB) |    <-- canonical source of truth
+-----------+-----------+
            |
            | Save / Toggle / install --apply-schema
            v
+-----------------------------------------------------+
| inc/ofelia_generate_config.cfm                      |
|   1. SELECT * FROM ofelia_jobs WHERE active = '1'   |
|   2. Render /opt/hermes/tmp/<tok>_ofelia_jobs       |
|   3. dos2unix (CRLF safety)                         |
|   4. Read /opt/hermes/conf_files/ofelia_config.ini  |
|        (template with POSTMASTER_EMAIL,             |
|         ADMIN_EMAIL, OFELIA_JOBS_GO_HERE markers)   |
|   5. REReplace each marker with live values         |
|   6. Move final file to /etc/ofelia/config.ini      |
|   7. cfinclude restart_ofelia.cfm                   |
+-----------------------------+-----------------------+
                              |
                              v
+-----------------------------------------------------+
| hermes_ofelia container                             |
|   reads /etc/ofelia/config.ini on start             |
|   fires `docker exec <container> <command>` on      |
|   each job's schedule, capturing stdout/stderr      |
|   on failure: emails admin_email via 10026          |
+-----------------------------------------------------+
```

## Configuration storage

| Table | Role |
|---|---|
| `ofelia_jobs` | One row per scheduled job |
| `scheduled_job_runs` | Append-only history of **manual** Run Now invocations from this page; Ofelia's own scheduled executions are not recorded here |

`ofelia_jobs` schema (relevant columns):

| Column | Type | Notes |
|---|---|---|
| `job_name` | `varchar(255)` | The **full bracketed header** as Ofelia consumes it, e.g. `[job-exec "hermes-quarantine-notify"]`. The display-friendly name shown in the table is the text between the quotes (the page extracts it with a regex). |
| `schedule` | `varchar(255)` | Ofelia format — either 6-field cron (`sec min hr dom mon dow`), 5-field cron, or `@every <duration>` (e.g. `@every 60s`, `@every 10m`, `@every 1h`) |
| `command` | `varchar(255)` | The shell command Ofelia runs inside the container |
| `container` | `varchar(255)` | Target container — `hermes_commandbox` for most jobs, `hermes_dmarc` for DMARC report processing, `hermes_mail_filter` for fangfrisch |
| `active` | `int(11)` | **`1` = enabled, `2` = disabled.** Disabled jobs stay in the DB but are filtered out of the generated `config.ini`. |
| `no_overlap` | `tinyint(3)` | When `1`, Ofelia emits `no-overlap = true` so a still-running invocation prevents the next tick from firing. Used for short-interval jobs (`@every 60s` cert-queue, quarantine-notify). |
| `type` | `varchar(255)` | Category tag for grouping (`certbot`, `hermes`, `dmarc`, `pushover`, `malware_feeds`, `system`) |
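
Given those columns, rendering the job section of `config.ini` is a filter plus a template. The Python sketch below approximates what `inc/ofelia_generate_config.cfm` is described as doing; `render_jobs` is a hypothetical name, the key order is an assumption, and the sample command in the usage note is invented.

```python
def render_jobs(rows: list) -> str:
    """Render enabled ofelia_jobs rows as Ofelia job-exec blocks."""
    blocks = []
    for row in rows:
        if int(row["active"]) != 1:     # 2 = disabled, left out of the file
            continue
        lines = [
            row["job_name"],            # already the full bracketed header
            f"schedule = {row['schedule']}",
            f"container = {row['container']}",
            f"command = {row['command']}",
        ]
        if int(row.get("no_overlap", 0)) == 1:
            lines.append("no-overlap = true")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + ("\n" if blocks else "")
```

A row with `job_name` set to `[job-exec "demo-job"]`, `schedule` set to `@every 60s`, and `no_overlap = 1` renders as a five-line block ending in `no-overlap = true`; a row with `active = 2` renders nothing.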

## The seeded job set

A fresh install (`hermes_install.sql`) seeds these jobs. All start
enabled.

| Job | Schedule | Container | What it does |
|---|---|---|---|
| `renew-acme-certificate` | Daily 12:05 | `hermes_commandbox` | Runs certbot renew across all ACME-issued certs; reloads dependent services on success |
| `hermes-message-cleanup` | Daily 01:30 | `hermes_commandbox` | Enforces `msgs` retention policy (Pro: per-policy; Community: global) |
| `hermes-update-check` | Daily 04:30 | `hermes_commandbox` | Polls GitHub Releases; writes the cache file the dashboard reads. See [System Update § Daily update check](https://docs.deeztek.com/books/administrator-guide/page/system-update#daily-update-check). |
| `acme-validate-ip` | Every 30 min | `hermes_commandbox` | Refreshes mailbox-domain SAN cert state when the gateway's public IP changes |
| `hermes-health-check-mailqueue` | Every 15 min | `hermes_commandbox` | Pushover alert when `mailq` count exceeds the threshold |
| `hermes-dmarc-report` | Daily 02:30 | `hermes_dmarc` | Fetches DMARC RUA reports, parses them into the `opendmarc` DB |
| `hermes-authelia-log-rotate` | Daily 02:00 | `hermes_commandbox` | Rotates Authelia's access logs |
| `hermes-quarantine-notify` | Every 60s, `no-overlap` | `hermes_commandbox` | Issues quarantine-release emails to recipients with pending messages |
| `hermes-process-cert-queue` | Every 60s, `no-overlap` | `hermes_commandbox` | Drains the encryption cert lookup queue for outbound S/MIME / PGP recipients |
| `hermes-fangfrisch-refresh` | Every 10 min | `hermes_mail_filter` | Refreshes third-party ClamAV signature feeds (SecuriteInfo, Sanesecurity, etc.) |

New jobs added by later features (signature-map regen for the body
milter, the post-upgrade hook caller, etc.) appear here automatically as
they are seeded into `ofelia_jobs`. The page renders whatever is in the
table — there is no hardcoded job list in the CFML.

## The page columns

The DataTable renders one row per `ofelia_jobs` row.

| Column | What it shows |
|---|---|
| **Name** | The display-friendly name (text between the quotes in `job_name`) |
| **Type** | The `type` category tag |
| **Schedule** | Humanized form — `@every 60s` becomes "Every 60 seconds", `0 30 04 * * *` becomes "Daily at 04:30", `0 0 02 * * *` becomes "Daily at 02:00", and so on. Hover for the raw cron expression (commit `8e954d1d`). Anything the humanizer can't cleanly parse falls through to the raw string. |
| **Container** | Target container (`hermes_commandbox`, `hermes_dmarc`, `hermes_mail_filter`, ...) |
| **Command** | The literal command Ofelia runs |
| **Status** | Bootstrap-switch toggle (Enabled / Disabled), AJAX-driven |
| **Last Run (manual)** | Most recent **Run Now** click from this page; Ofelia's own scheduled fires do not write here |
| **Actions** | The **Run Now** button |
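
The Schedule column's humanizing rule can be sketched in a few lines. This is an illustration only, not the page's actual CFML: the function name `humanize_schedule` is hypothetical, and it covers just the two shapes named in the table (`@every` intervals and six-field daily cron with a leading seconds column), passing anything else through unchanged.

```python
import re


def humanize_schedule(expr: str) -> str:
    """Return a friendly label for a schedule, or the raw string if unsure."""
    expr = expr.strip()

    # Interval form, e.g. "@every 60s", "@every 10m", "@every 1h"
    match = re.fullmatch(r"@every (\d+)([smh])", expr)
    if match:
        count, unit = int(match.group(1)), match.group(2)
        name = {"s": "second", "m": "minute", "h": "hour"}[unit]
        return f"Every {count} {name}{'' if count == 1 else 's'}"

    # Six-field cron: "sec min hour day-of-month month day-of-week"
    fields = expr.split()
    if len(fields) == 6:
        sec, minute, hour, dom, month, dow = fields
        if (sec == "0" and minute.isdecimal() and hour.isdecimal()
                and (dom, month, dow) == ("*", "*", "*")
                and int(minute) < 60 and int(hour) < 24):
            return f"Daily at {int(hour):02d}:{int(minute):02d}"

    # Anything the sketch cannot cleanly parse falls through to the raw string
    return expr
```

Returning the raw expression on any doubt mirrors the fallback described in the Schedule row: a wrong friendly label is worse than an unfriendly correct one.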

## Enable / Disable toggle

The switch posts to `inc/toggle_ofelia_job_action.cfm` with the
`job_name` and `new_state` (`1` or `2`). The handler:

1. Looks up the row; rejects if not found.
2. `UPDATE ofelia_jobs SET active = ?`.
3. Re-runs `ofelia_generate_config.cfm`, which writes a fresh
   `config.ini` containing only the enabled rows.
4. Restarts `hermes_ofelia` via `restart_ofelia.cfm`.
5. On any failure during step 3 or 4, **rolls the `active` flag back**
   and returns the error in JSON. The UI reverts the switch and
   surfaces the error.

The transactional behavior matters — a half-applied state where the DB
says "disabled" but Ofelia is still running the job is exactly the
confusing situation an admin would not be able to diagnose from this
page.
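
The update-then-roll-back shape of the handler can be sketched as follows. This is a minimal model under stated assumptions, not the CFML handler: the `jobs` dict stands in for the `ofelia_jobs` table, and `regenerate` / `restart` are hypothetical callables standing in for `ofelia_generate_config.cfm` and `restart_ofelia.cfm`.

```python
from typing import Callable, Dict


def toggle_job(
    jobs: Dict[str, int],
    job_name: str,
    new_state: int,
    regenerate: Callable[[], None],
    restart: Callable[[], None],
) -> dict:
    """Flip a job's active flag, regenerate, restart; roll back on failure."""
    # Step 1: look up the row, reject if not found
    if job_name not in jobs:
        return {"success": False, "error": "job not found"}

    previous = jobs[job_name]
    jobs[job_name] = new_state          # step 2: the UPDATE
    try:
        regenerate()                    # step 3: rewrite config.ini
        restart()                       # step 4: restart the scheduler
    except Exception as exc:
        jobs[job_name] = previous       # step 5: restore the old flag
        return {"success": False, "error": str(exc)}
    return {"success": True}
```

Note that the sketch only restores the flag. It does not undo a config file that step 3 already wrote, which is why a restart failure can still leave a new `config.ini` on disk.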

The JS layer surfaces a confirm prompt before disabling jobs on a
**critical list** (`renew-acme-certificate`, `hermes-update-check`,
`hermes-process-cert-queue`, `hermes-quarantine-notify`). The backend
trusts the request — admins with web access already have the means to
disable everything via direct SQL if they want to. The prompt is a
guard against an accidental click, not an authorization gate.

## Run Now

The button posts to `inc/run_scheduled_task_action.cfm`, which executes
the job's `command` synchronously and returns JSON with status,
duration, exit code, and output (capped at 2048 bytes for the DB
history, full body in the response). The result is displayed in a modal
with a spinner-then-summary view.

Three execution strategies, picked from the command shape:

| Command shape | Strategy |
|---|---|
| `/usr/bin/curl --silent http://localhost:8888/schedule/<name>.cfm` | Routed via `cfhttp` for clean body capture. This is the majority of Hermes jobs — the actual work is implemented as a CFML schedule script and Ofelia is just a trigger. |
| `container != hermes_commandbox` | Proxied via `cfexecute docker exec <container> <command>`. Used for `hermes-dmarc-report` (targets `hermes_dmarc`) and `hermes-fangfrisch-refresh` (targets `hermes_mail_filter`). |
| Anything else inside `hermes_commandbox` | `cfexecute` directly — the page itself runs inside `hermes_commandbox`, so this is equivalent to what Ofelia would do. |

Hard cap on the manual-trigger path is **300 seconds**. Ofelia's own
scheduled runs have no such cap; if a job legitimately needs to run
longer, scheduled execution is fine but Run Now will time out.
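
The strategy choice can be sketched as a small dispatch function. The return labels, the function name `pick_strategy`, and the assumption that the table rows are checked top to bottom are all illustrative; the real selection logic lives in `run_scheduled_task_action.cfm`.

```python
import re

# Shape of the "curl a CFML schedule script" commands from the first table row
CURL_SCHEDULE = re.compile(
    r"/usr/bin/curl --silent http://localhost:8888/schedule/[\w.-]+\.cfm"
)

# Manual-trigger timeout from the text above, in seconds
RUN_NOW_TIMEOUT = 300


def pick_strategy(container: str, command: str) -> str:
    """Map a job's container and command to one of three execution paths."""
    if CURL_SCHEDULE.fullmatch(command.strip()):
        return "cfhttp"          # fetch the schedule script over HTTP
    if container != "hermes_commandbox":
        return "docker-exec"     # proxy into the other container
    return "cfexecute"           # run directly inside hermes_commandbox
```

For example, a job targeting `hermes_dmarc` resolves to `"docker-exec"` regardless of its command, unless the command matches the curl shape first.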

Every Run Now invocation appends a row to `scheduled_job_runs` —
including failures, including runs of disabled jobs (the page allows
firing a disabled job on demand without re-enabling it). The Last Run
column reads from this table.

> **By design.** Run Now and the schedule run independently. Firing a
> job manually does **not** reset Ofelia's next-scheduled-fire clock.
> If you Run Now a job that is also scheduled to fire in 30 seconds, it
> will fire again 30 seconds later — for the `no-overlap` jobs, Ofelia
> will skip the scheduled fire if the manual run is still in progress;
> for the others, both runs will happen.

## The config.ini template

`config/hermes/opt/hermes/conf_files/ofelia_config.ini` is a small
placeholder file:

```ini
[global]
smtp-host = hermes_postfix_dkim
smtp-port = 10026
email-to = ADMIN_EMAIL
email-from = POSTMASTER_EMAIL
mail-only-on-error = true

OFELIA_JOBS_GO_HERE
```

`ofelia_generate_config.cfm` does three `REReplace` passes against this
template — `ADMIN_EMAIL` and `POSTMASTER_EMAIL` from `system_settings`,
`OFELIA_JOBS_GO_HERE` from the rendered `[job-exec ...]` blocks — and
writes the result to `/etc/ofelia/config.ini`. The intermediate work
happens under `/opt/hermes/tmp/<customtrans3>_*` with a final atomic
`move` into place, which is also why a partial regen does not leave the
live file half-written.
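
A rough Python model of the three substitution passes and the atomic write is shown below. The job dictionary keys (`name`, `active`, and so on) and the reading of `active == 1` as "enabled" are assumptions for illustration; the `[job-exec "..."]` block layout follows Ofelia's INI format.

```python
import os
import tempfile

TEMPLATE = """[global]
smtp-host = hermes_postfix_dkim
smtp-port = 10026
email-to = ADMIN_EMAIL
email-from = POSTMASTER_EMAIL
mail-only-on-error = true

OFELIA_JOBS_GO_HERE
"""


def render_job(job: dict) -> str:
    """Render one enabled row as an Ofelia job-exec block."""
    lines = [
        f'[job-exec "{job["name"]}"]',
        f'schedule = {job["schedule"]}',
        f'container = {job["container"]}',
        f'command = {job["command"]}',
    ]
    if job.get("no_overlap") == 1:
        lines.append("no-overlap = true")
    return "\n".join(lines)


def render_config(template: str, admin: str, postmaster: str, jobs: list) -> str:
    """Apply the three placeholder substitutions, enabled jobs only."""
    blocks = "\n\n".join(render_job(j) for j in jobs if j["active"] == 1)
    out = template.replace("ADMIN_EMAIL", admin)
    out = out.replace("POSTMASTER_EMAIL", postmaster)
    return out.replace("OFELIA_JOBS_GO_HERE", blocks)


def write_atomically(path: str, content: str) -> None:
    """Write to a temp file in the target directory, then rename into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    os.replace(tmp_path, path)  # atomic when both paths share a filesystem
```

The sketch stages its temp file next to the target because a rename is only atomic within one filesystem; the real regenerator stages under `/opt/hermes/tmp/` as described above.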

## When direct edits to config.ini are appropriate

There are exactly two situations where editing `/etc/ofelia/config.ini`
directly makes sense:

1. **Debugging Ofelia itself** — flipping `mail-only-on-error` to
   `false` so every successful run notifies, or adding `verbose = true`
   to the global block to flood `docker logs hermes_ofelia` with detail.
2. **Adding a one-shot job that you don't want in the DB** — e.g., a
   migration script that should run once at the next scheduled time.

In both cases, the change survives only until the next config regen
(for example, any Enable / Disable toggle on this page) or the next
install-script run. If you need a persistent custom job, add
it to `ofelia_jobs` directly via SQL and the regen will pick it up.

## Failure semantics

| What breaks | What happens |
|---|---|
| `ofelia_jobs` is empty | The page shows a warning callout; Ofelia generates no jobs and idles. Re-run `install_hermes_docker.sh --apply-schema` to re-seed. |
| Toggle handler fails mid-regen | `active` flag rolled back to its previous value; switch reverts in the UI; error surfaced. Live `config.ini` is unchanged. |
| `restart_ofelia.cfm` fails (container missing, Docker socket gone) | Toggle response carries the error message; live `config.ini` is the new one but Ofelia hasn't reread it yet. Manual `docker compose restart hermes_ofelia` recovers. |
| Run Now times out (>300s) | `cfexecute` raises; the JSON response is `success: false` with an exception; `scheduled_job_runs` still gets the failure row. |
| Run Now command exits non-zero | Modal shows the stderr in the output pane; the row still inserts into `scheduled_job_runs` with `exit_code` set to whatever the process returned. |
| Ofelia's own scheduled run fails | Ofelia emails `admin_email` via 10026 (auto-DKIM-signed). **Not** reflected in the Last Run column on this page — that column is manual-only. |
| `dos2unix` not installed inside `hermes_commandbox` | Regen aborts with `error.cfm` traceback. The shipped image has it; only relevant for custom builds. |

## Files and containers touched

| Path | Owner | Role |
|---|---|---|
| `config/hermes/var/www/html/admin/2/view_scheduled_tasks.cfm` | `hermes_commandbox` | The page (renders the table, hosts the toggle + Run Now JS) |
| `config/hermes/var/www/html/admin/2/inc/run_scheduled_task_action.cfm` | `hermes_commandbox` | Run Now AJAX endpoint |
| `config/hermes/var/www/html/admin/2/inc/toggle_ofelia_job_action.cfm` | `hermes_commandbox` | Enable/Disable AJAX endpoint |
| `config/hermes/var/www/html/admin/2/inc/ofelia_generate_config.cfm` | `hermes_commandbox` | Config regenerator — reads `ofelia_jobs`, writes `config.ini` |
| `config/hermes/var/www/html/admin/2/inc/restart_ofelia.cfm` | `hermes_commandbox` | `docker container restart hermes_ofelia` wrapper |
| `config/hermes/opt/hermes/conf_files/ofelia_config.ini` | `hermes_commandbox` | Template with `ADMIN_EMAIL` / `POSTMASTER_EMAIL` / `OFELIA_JOBS_GO_HERE` markers |
| `config/ofelia/config.ini` | `hermes_ofelia` (live) | Regen target |
| `ofelia_jobs` table | `hermes_db_server` (`hermes` DB) | Canonical job list |
| `scheduled_job_runs` table | `hermes_db_server` (`hermes` DB) | Manual-run history |
| `/var/run/docker.sock` (host mount → `hermes_ofelia`) | host filesystem | How Ofelia issues `docker exec` against other containers |

## Future work

- **Inline schedule editing** — today, schedule + command edits happen
  on feature-specific pages (e.g., the Malware Feeds settings page edits
  `hermes-fangfrisch-refresh`'s schedule). A "create new job" and inline
  edit on this page is planned for a later release.
- **External job triggers via API** — issues #222 (Hermes Internal API)
  and #223 (API tokens) will eventually let external systems POST to
  `/api/scheduled-tasks/<name>/run` with a token, replacing the
  web-UI-only Run Now flow. Not yet built.
- **Surface Ofelia's scheduled-run history** — `scheduled_job_runs`
  records manual runs only because that is what the page writes.
  Ofelia's own per-run history sits in `docker logs hermes_ofelia` and
  is not currently tabled. A future enhancement could parse Ofelia's
  stdout into a similar history table.

## Related

- [System Update](https://docs.deeztek.com/books/administrator-guide/page/system-update) — the `hermes-update-check` job is the daily GitHub Releases poll that drives the dashboard's update-available cell
- [DNS Resolver](https://docs.deeztek.com/books/administrator-guide/page/dns-resolver) — most scheduled jobs depend on outbound DNS resolution flowing through `hermes_unbound`
- [System Certificates](https://docs.deeztek.com/books/administrator-guide/page/system-certificates) — the `renew-acme-certificate` job is what actually keeps Let's Encrypt certs current; the page only registers and binds them
- [System Settings](https://docs.deeztek.com/books/administrator-guide/page/system-settings) — `admin_email` (Ofelia failure notification target) and `postmaster` (sender) are both read from here at config regen
- [System Status](https://docs.deeztek.com/books/administrator-guide/page/system-status) — dashboard cells reflect outputs that several of these jobs produce (mail queue, update status)
- [Storage Topology](https://docs.deeztek.com/books/installation-reference/page/storage-topology-5-tiers) — `hermes_ofelia` is stateless; its config lives in the Config tier (`config/ofelia/`)

# Server Setup

Admin path: **System > Server Setup** (`view_server_setup.cfm`,
`inc/save_server_identity.cfm`, `inc/generate_postfix_configuration.cfm`,
`inc/generate_nextcloud_configuration.cfm`).

This page configures **how Hermes identifies itself to other mail
servers** — the Postfix `myorigin` domain, the `myhostname` FQDN used
in SMTP banners and HELO/EHLO greetings, and the host IPv4 address
used by Nextcloud's `trusted_domains`. These are foundational, mostly
install-time values; changing them in production has visible downstream
effects on outbound mail acceptance and on email-client configuration.

Pairs with [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings), which configures
the web-side identity (Console Address and certificate). The two pages
together define every name Hermes presents to the world: the mail side
on this page, the web side on Console Settings.

## What this page does NOT configure

| Concern | Lives on |
|---|---|
| The hostname/IP that nginx terminates HTTPS on for `/admin`, `/users`, `/nc` | [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) — Console Address |
| The TLS certificate presented to mail clients on `:25`, `:465`, `:587` | [SMTP TLS Settings](https://docs.deeztek.com/books/administrator-guide/page/smtp-tls-settings) — separate cert binding from the console cert |
| The TLS certificate presented to the web console | [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) — Console Certificate |
| Per-domain mail routing, accepted-domain lists, relay maps | Email Relay > Domains and Email Server > Domains |
| The Docker subnet (`IPV4SUBNET` in `.env`) | Currently hardcoded in 15+ config files. See [Known limitation](#known-limitation--docker-subnet-is-hardcoded) below. |
| Initial install — admin password, LDAP base, secrets generation | `scripts/install_hermes_docker.sh` (see [Release engineering and updates](https://docs.deeztek.com/books/installation-reference/page/release-and-update-methodology)) |

## Configuration storage — the `parameters` / `parameters2` split

This page is one of the cleanest examples of the **dual-role
`parameters` table** in Hermes. Two of the three fields live there
(under their Postfix directive names), and the third lives in
`parameters2`.

### `myorigin` and `myhostname` — `parameters` table

In the `parameters` table, the same directive is stored as **two
rows**:

| Row | Role | Linked by |
|---|---|---|
| `child = 2` row | The directive **name** (the Postfix keyword), e.g. `parameter = 'myorigin'` | `parent_name` on the value row points back to this row's `parameter` |
| `child = 1` row | The directive **value** (the actual domain/hostname), e.g. `parameter = 'example.com'`, `parent_name = 'myorigin'` | — |

The page reads from the `child = 1` row (the value) and writes back to
the same `child = 1` row when an admin saves. The `child = 2` row's
`enabled` flag is set to `1` on every save to guarantee the directive
is included when Postfix `main.cf` is regenerated.

```sql
-- The name row (directive)
parameter = 'myorigin', child = '2', enabled = '1', conf_file = 'main.cf', module = 'postfix'

-- The value row (the actual domain)
parameter = '<your-domain>', parent_name = 'myorigin', child = '1',
    module = 'postfix', conf_file = 'main.cf'
```

The same shape applies to `myhostname`. Seeded defaults are
`domain.tld` and `hermes.domain.tld` respectively.

> **Why the split.** The dual-row pattern lets Hermes treat any Postfix
> directive uniformly: the parent (`child = 2`) carries metadata —
> display name, help text, default, enable flag — and one or more value
> rows (`child = 1`) carry the actual configuration. Multi-value
> directives (`mynetworks`, `smtpd_recipient_restrictions`, etc.) just
> have more `child = 1` rows under the same `parent_name`. Single-value
> directives like `myhostname` have exactly one.

### Host IP Address — `parameters2` table

Host IP lives in `parameters2` because it is not a Postfix directive
— it is a free-floating piece of installation state consumed by
Nextcloud's `trusted_domains` config.

```sql
parameter = 'server_ip', value2 = '<ip>', module = 'network'
```

Read by `generate_nextcloud_configuration.cfm` and substituted into
`config.php` as `NEXTCLOUD_TRUSTED_DOMAIN_IP`. The same value is also
used by the install script and any other code that needs the
operator-confirmed host IP without parsing it out of `ip addr`.

## Fields on the page

### Mail Server Domain (Postfix `myorigin`)

The origin domain Postfix appends to unqualified sender addresses on
outbound mail. If a local process submits a message from
`root@localhost`, Postfix rewrites it to `root@<myorigin>` before
sending. For internal-only setups this can stay at the install default;
for any system that sends external mail, set it to the operator's
canonical domain.

Validated by the email-trick: `IsValid("email", "test@<value>")` must
return true. Empty input is rejected with `session.m = 2`; invalid
format with `session.m = 4`.

### Mail Server Hostname (Postfix `myhostname`)

The fully-qualified hostname Hermes announces in its SMTP banner and
HELO/EHLO greeting. This is the value other mail servers see when they
connect to Hermes (and that Hermes presents when it connects to them).
Three downstream consequences:

| Consumer | What goes wrong if this doesn't match DNS |
|---|---|
| Receiving MTAs' reverse-DNS checks (PTR lookup → A lookup → match) | Recipient servers reject outbound mail with `450/550 helo not match` errors |
| TLS certificate Common Name / SAN match on SMTP | Strict STARTTLS verifiers refuse to deliver to Hermes |
| Authoritative SPF / DKIM / DMARC alignment for `mailfrom` | Indirect — bounces may align poorly if MAIL FROM uses an unmatched domain |

> **Do not change this in production without planning.** The page
> wraps the field in a red warning callout that enumerates the
> user-visible breakages:
>
> - All external email clients (Thunderbird, Outlook, iOS Mail, etc.)
>   need their IMAP/SMTP server hostname reconfigured
> - CalDAV/CardDAV clients need new server URLs
> - Nextcloud Mail profiles for **remote-auth** mailboxes (auto-discovered
>   via the external FQDN) re-prompt for the user's AD password and
>   auto-update on the next login
> - Nextcloud Mail profiles for **local-auth** users are unaffected —
>   those profiles use internal Docker hostnames (`hermes_postfix_dkim`,
>   `hermes_dovecot`), not the external FQDN
>
> Plan the change for a maintenance window, notify users, and have new
> client setup instructions ready.

Validation: email-trick again (`IsValid("email", "test@<value>")`).
Empty → `session.m = 3`; invalid → `session.m = 5`.

After a successful save, also ensure a matching TLS certificate is
bound for SMTP on [SMTP TLS Settings](https://docs.deeztek.com/books/administrator-guide/page/smtp-tls-settings). The
hostname change does not automatically rebind the cert; both must
match for STARTTLS handshakes to verify.

### Host IP Address

The operator-confirmed IPv4 address of the Docker host. Used to
populate Nextcloud's `trusted_domains` so NC accepts requests addressed
to the literal IP (some autoconfig and CalDAV/CardDAV clients
hit the IP before they have the FQDN).

Validation: `^(\d{1,3}\.){3}\d{1,3}$` — basic IPv4 dotted-quad. Empty
is allowed (skips the regen of that field). Invalid → `session.m = 6`.
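
The documented pattern checks shape only, so it accepts octets above 255. The sketch below applies the same regular expression in Python and contrasts it with a range-checked parse from the standard library. Both function names are hypothetical, and the stricter check is an illustration, not something the page is documented to do.

```python
import ipaddress
import re

# The pattern quoted above, unchanged
DOTTED_QUAD = re.compile(r"^(\d{1,3}\.){3}\d{1,3}$")


def matches_documented_pattern(value: str) -> bool:
    """True when the value has the dotted-quad shape the page validates."""
    return DOTTED_QUAD.match(value) is not None


def is_real_ipv4(value: str) -> bool:
    """True only when every octet is also within 0-255."""
    try:
        ipaddress.IPv4Address(value)
    except ValueError:
        return False
    return True
```

A value such as `999.999.999.999` passes the first check and fails the second, so it is worth double-checking the address by eye before saving.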

**The Host IP and the Console Address are independent.** If the
Console Address on [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) is set to an
**IP** (rather than an FQDN) and the host IP changes, you must update
both pages — neither cascades into the other. If Console Address is an
FQDN, only this page needs the IP update.

## Save flow

Clicking **Save & Apply Settings** posts `action=save_settings`, which
runs `save_server_identity.cfm`:

```
1. Validate the fields (presence for domain + hostname; format for all three)
2. UPDATE parameters2.value2 WHERE parameter = 'server_ip'
3. UPDATE parameters.enabled = '1' WHERE parameter IN ('myorigin','myhostname')
   AND child = '2' AND module = 'postfix'         (re-arm both directives)
4. UPDATE parameters.parameter = <domain>
   WHERE parent_name = 'myorigin'  AND child = '1' AND module = 'postfix'
5. UPDATE parameters.parameter = <hostname>
   WHERE parent_name = 'myhostname' AND child = '1' AND module = 'postfix'
6. INCLUDE generate_postfix_configuration.cfm   (rewrites main.cf + reload)
7. INCLUDE generate_nextcloud_configuration.cfm (rewrites NC config.php)
8. cflocation back to view_server_setup.cfm with session.m = 1 (success)
```

There is no nginx restart in this cascade — only **Postfix** and
**Nextcloud** are touched. That is deliberate: nothing in the
nginx-served path consumes `myorigin`, `myhostname`, or the network
`server_ip` (the nginx vhosts use the **Console** Address, configured
separately). The save flow is therefore much lighter than Console
Settings: typically 5–10 seconds, no overlay spinner, no preload-style
restart.

`generate_postfix_configuration.cfm` re-templates
`config/postfix-dkim/etc/postfix/main.cf` from the live `parameters`
rows (walking every `child = 2` row that has `enabled = 1`, emitting
each as `<keyword> = <value>` with values pulled from the matching
`parent_name`-linked `child = 1` rows), copies the result into the
`hermes_postfix_dkim` container, and runs `postfix reload`. The reload
is a SIGHUP — it does **not** drop in-flight SMTP connections; mail
flow continuity is preserved across the save.
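
The walk over name rows and value rows can be modelled like this. It is a sketch under assumptions: rows are plain dictionaries using the column names shown earlier, the `module` / `conf_file` filters are omitted for brevity, and joining multiple values with `", "` is a guess at the output format, not confirmed behavior of `generate_postfix_configuration.cfm`.

```python
def render_main_cf(rows: list) -> str:
    """Emit '<keyword> = <value>' for every enabled name row."""
    lines = []
    for name_row in rows:
        # Only enabled directive-name rows (child = 2) start an output line
        if name_row["child"] != "2" or name_row["enabled"] != "1":
            continue
        keyword = name_row["parameter"]
        # Collect the value rows (child = 1) linked back via parent_name
        values = [
            row["parameter"]
            for row in rows
            if row["child"] == "1" and row.get("parent_name") == keyword
        ]
        if values:
            lines.append(f"{keyword} = {', '.join(values)}")
    return "\n".join(lines) + "\n"
```

With one enabled `myorigin` name row and a single value row of `example.com`, the output is the one line `myorigin = example.com`. A disabled name row contributes nothing, which is why the save flow re-arms `enabled` on every save.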

`generate_nextcloud_configuration.cfm` rewrites the entire
`config.php` from its template (`/opt/hermes/templates/config.php`),
substituting the host IP into `trusted_domains` along with all the
other NC settings the regenerator owns. Existing
installation-specific values (`passwordsalt`, `secret`, `instanceid`,
`version`) are read back from the live file first and preserved. The
regenerator never generates new values for these; if it did, NC would
conclude it needs to re-install.

## Failure semantics

| What breaks | What happens |
|---|---|
| Validation fails on any field | `session.m = 2..6`, `cflocation` back to the page, no DB write |
| `parameters` UPDATE succeeds but `generate_postfix_configuration.cfm` fails to write | DB is ahead of the live config. Next save (or any other Postfix-config save) regenerates `main.cf` from the same DB rows and catches up. |
| `postfix reload` fails inside the container | DB and on-disk config are in sync but the running Postfix is still on the old config. Symptom: outbound mail still uses the old `myhostname`. Recovery: `docker exec hermes_postfix_dkim postfix reload` manually, or re-save. |
| `generate_nextcloud_configuration.cfm` fails (e.g., NC container down) | Postfix change is committed; NC is stale. Recovery: bring NC up and re-save, or re-run the regen include directly. |
| Hostname change breaks reverse DNS at the recipient | Hermes accepts the change cleanly; the visible failure is **deferred** — outbound mail starts getting rejected by other MTAs minutes to hours later. Always verify PTR + matching A record **before** changing `myhostname`. |

The save flow has no rollback. The previous `main.cf` lives at
`config/postfix-dkim/etc/postfix/main.cf.HERMES` (the CFML write-time
backup convention) and can be restored manually if a regen produces
broken syntax — but the DB has already advanced.
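
The PTR and A record check recommended in the table above can be scripted before touching `myhostname`. The sketch below is one way to do it with the Python standard library; the function names are hypothetical, and the lookups are injectable so the logic can be exercised without network access. Pass the public IP that outbound mail actually leaves from, which is not necessarily the Host IP field on this page.

```python
import socket
from typing import Callable, List


def default_ptr(ip: str) -> str:
    """Reverse lookup: IP address to its primary PTR name."""
    return socket.gethostbyaddr(ip)[0]


def default_a(hostname: str) -> List[str]:
    """Forward lookup: hostname to its IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def hostname_matches_dns(
    ip: str,
    hostname: str,
    ptr_lookup: Callable[[str], str] = default_ptr,
    a_lookup: Callable[[str], List[str]] = default_a,
) -> bool:
    """True when PTR(ip) names the hostname and the hostname resolves to ip."""
    try:
        ptr_name = ptr_lookup(ip).rstrip(".").lower()
        addresses = a_lookup(hostname)
    except OSError:
        # Covers socket.herror and socket.gaierror (no such record)
        return False
    return ptr_name == hostname.rstrip(".").lower() and ip in addresses
```

A `False` result means at least one direction does not line up, and changing the hostname anyway risks the deferred rejections described in the table.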

## Known limitation — Docker subnet is hardcoded

The Docker subnet that Postfix and Amavis trust (`IPV4SUBNET=172.16.32`
in `.env`) is **not** managed on this page. It is currently hardcoded
into 15+ config files spanning Postfix (`mynetworks`, `master.cf`),
Amavis (`@inet_acl`), Dovecot (`login_trusted_networks`), Ciphermail
(`authorizedAddresses`), OpenDKIM/OpenDMARC (`TrustedHosts`), and
several CFML queries.

If you need to change the subnet for IP-conflict reasons, **all 15+
files must be updated coherently** or mail flow will break in
subtle ways (Amavis rejecting messages from Hermes itself, OpenDKIM
not signing outbound, etc.). This is a tracked tech-debt item — when
templating is added, the subnet will move into `system_settings` and
get its own admin page rather than living on this one.

## Files and containers touched

| Path | Owner | Role |
|---|---|---|
| `config/hermes/var/www/html/admin/2/view_server_setup.cfm` | `hermes_commandbox` | Page |
| `config/hermes/var/www/html/admin/2/inc/save_server_identity.cfm` | `hermes_commandbox` | Save handler |
| `config/hermes/var/www/html/admin/2/inc/generate_postfix_configuration.cfm` | `hermes_commandbox` | `main.cf` regen + `postfix reload` |
| `config/hermes/var/www/html/admin/2/inc/generate_nextcloud_configuration.cfm` | `hermes_commandbox` | NC `config.php` regen (trusted_domains) |
| `config/postfix-dkim/etc/postfix/main.cf` | `hermes_postfix_dkim` (mounted) | Live Postfix config — regen target |
| `config/postfix-dkim/etc/postfix/main.cf.HERMES` | `hermes_postfix_dkim` (mounted) | Write-time backup of the previous live config |
| `/var/www/html/config/config.php` inside `hermes_nextcloud` | `hermes_nextcloud` | Live Nextcloud config — regen target |
| `parameters` rows where `module = 'postfix'`, `parent_name IN ('myorigin','myhostname')` | `hermes_db_server` (`hermes` DB) | The directive values |
| `parameters2` row where `parameter = 'server_ip'` | `hermes_db_server` (`hermes` DB) | Host IP |

The Postfix reload uses the standard
`docker exec hermes_postfix_dkim /usr/sbin/postfix reload` pattern.
The Nextcloud regen rewrites the bind-mounted `config.php` directly,
no `occ` calls — NC picks up the change on the next request because
`config.php` is read per-request.

## Related

- [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) — the web-side identity (Console Address, Console Certificate). Companion to this page.
- [SMTP TLS Settings](https://docs.deeztek.com/books/administrator-guide/page/smtp-tls-settings) — bind a TLS certificate to the Mail Server Hostname so STARTTLS handshakes verify
- [System Certificates](https://docs.deeztek.com/books/administrator-guide/page/system-certificates) — issue / renew the cert that SMTP TLS Settings binds
- [System Settings](https://docs.deeztek.com/books/administrator-guide/page/system-settings) — other globals (timezone, language) not part of server identity
- [Release engineering and updates](https://docs.deeztek.com/books/installation-reference/page/release-and-update-methodology) — initial install flow that populates these values for the first time

# SMTP TLS Settings

Admin path: **System > SMTP TLS Settings** (`view_smtp_tls_settings.cfm`,
`inc/get_smtp_tls_settings.cfm`, `inc/get_smtp_tls_policies.cfm`,
`inc/edit_smtp_tls_settings.cfm`, `inc/smtp_tls_save_settings.cfm`,
`inc/smtp_tls_add_domain.cfm`, `inc/smtp_tls_edit_domain.cfm`,
`inc/smtp_tls_delete_domain.cfm`, `inc/generate_tls_policy.cfm`,
`inc/generate_postfix_configuration.cfm`).

This page configures **Postfix TLS** end to end: the global
inbound/outbound TLS mode (Disabled / Opportunistic / Mandatory), the
certificate Postfix presents on `:25`/`:587`, and per-destination-domain
TLS policy overrides for outbound delivery.

Pairs with [System Certificates](https://docs.deeztek.com/books/administrator-guide/page/system-certificates), which owns
the certificate **store**; this page is the **binding** of one of those
certs to the Postfix `smtpd_tls_*` / `smtp_tls_*` directives. Pairs
also with [Server Setup](https://docs.deeztek.com/books/administrator-guide/page/server-setup), which owns the SMTP banner
hostname (`myhostname`) — the cert's Common Name or SAN must match that
hostname for strict STARTTLS verifiers to accept the handshake.

## TLS modes

```
+----------------------------------------------------------------+
|                    Postfix smtpd_tls_security_level             |
|                    +  smtp_tls_security_level                   |
+----------------------------------------------------------------+
   |                       |                          |
   |  '' (Disabled)        |  'may' (Opportunistic)   |  'encrypt' (Mandatory)
   v                       v                          v
 no STARTTLS         STARTTLS offered; clear-     STARTTLS required;
 advertised          text fallback if peer        peer must support it
 (cleartext only)    can't negotiate              or delivery fails
```

| Mode (`tlsmode` form value) | Postfix value | Use when |
|---|---|---|
| **Disabled** (`""`) | (directive value cleared) | Cleartext-only environments (test, isolated networks); production Internet exposure not recommended |
| **Opportunistic TLS** (`may`) — **Recommended** | `may` | Standard public-Internet config. STARTTLS is advertised; peers that support it use it, peers that don't fall back to cleartext |
| **Mandatory TLS** (`encrypt`) — **NOT recommended for Internet-facing servers** | `encrypt` | Closed networks where every peer is known to support TLS. On the open Internet this **drops mail** from any sender that can't negotiate STARTTLS, which is a long tail of misconfigured small senders |

The mode applies symmetrically to inbound (`smtpd_*`) and outbound
(`smtp_*`). Both directive rows are written on save.

## Selecting a certificate

The **SMTP TLS Certificate** field is a free-text autocomplete that
searches `system_certificates` via the `getcertificates.cfm` AJAX
endpoint (the same endpoint used by Console Settings). Picking a row
populates a hidden `certificateno_1` field with the row ID plus four
read-only display fields (Subject, Issuer, Serial, Type).

The certificate picker is **hidden when TLS mode is Disabled**
(`#tlscertificate` div toggled by `#tlsmode` change handler). Switching
back to Opportunistic or Mandatory slides it back into view.

### The system-cert refusal

If an admin tries to save with the system-managed (bootstrap snakeoil)
cert selected, the handler refuses with **error 3**:

> You cannot select the system-self-signed Certificate for SMTP TLS.

This is intentional. A self-signed cert on `:25` would defeat the
purpose — strict STARTTLS verifiers on the receiving side reject the
handshake, and Hermes would silently lose all outbound mail to those
recipients. The refusal forces the admin to import a real cert
(commercial CA, internal PKI, or Let's Encrypt) before flipping TLS on.

The check behind this message is dated. The comparison is against
`certificateno_1 = 1` in `edit_smtp_tls_settings.cfm`, which works on
Docker fresh installs (where the bootstrap row is `id = 1`) but does
**not** work on installs where the system cert was assigned a different
ID (notably DEV's `ssl-cert-snakeoil` row at `id = 29`). The
[System Certificates](https://docs.deeztek.com/books/administrator-guide/page/system-certificates#the-system-column-and-the-system-badge)
runtime helper resolves this for the deletion guard; the SMTP-TLS save
handler still uses the hardcoded `id = 1` check. Practical impact is
small because in either case the admin should not be selecting the
system row, but if you migrate from a legacy install with a non-`id=1`
system row, the SMTP page won't refuse the snakeoil even though the
System Certificates page will block its deletion.
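
The gap between the two guards can be shown with a toy comparison. Everything here is illustrative: the `system` key is a hypothetical stand-in for however the System Certificates helper marks the bootstrap row, and neither function is the actual CFML.

```python
def refused_by_hardcoded_id(cert: dict) -> bool:
    """Model of the SMTP save handler: refuse only the row with id 1."""
    return cert["id"] == 1


def refused_by_system_flag(cert: dict) -> bool:
    """Model of a flag-based guard: refuse whichever row is marked system."""
    return bool(cert.get("system"))


# A migrated install where the snakeoil row did not land on id 1
legacy_snakeoil = {"id": 29, "system": True}
fresh_snakeoil = {"id": 1, "system": True}
```

On the fresh install both guards agree. On the migrated row the hardcoded check lets the snakeoil cert through while the flag-based check still refuses it.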

## How directive values are stored

This page is the canonical example of the dual-row `parameters` table
pattern documented in
[Server Setup § Configuration storage](https://docs.deeztek.com/books/administrator-guide/page/server-setup#configuration-storage--the-parameters--parameters2-split).
Each Postfix directive has two rows:

| Row | `parameter` | `child` | `parent_name` | Role |
|---|---|---|---|---|
| Name row | `smtpd_tls_security_level` | `2` | — | Directive **name** |
| Value row | `may` / `encrypt` / `""` | `1` | `smtpd_tls_security_level` | Directive **value** |

Save handler `edit_smtp_tls_settings.cfm` writes to the value row only:

```sql
UPDATE parameters
   SET parameter = '<tls_mode>'
 WHERE parent_name = 'smtpd_tls_security_level'
   AND child = '1'
   AND enabled = '1';

-- same for smtp_tls_security_level (outbound)
-- and smtpd_tls_cert_file, smtpd_tls_key_file, smtpd_tls_CAfile
--   (paths resolved from system_certificates.file_name + type)
```

The selected cert's on-disk paths are derived from
`system_certificates.type` + `file_name`:

| `type` | `smtpd_tls_cert_file` | `smtpd_tls_key_file` | `smtpd_tls_CAfile` |
|---|---|---|---|
| `Imported` | `/opt/hermes/ssl/<file_name>_hermes.pem` | `/opt/hermes/ssl/<file_name>_hermes.key` | `/opt/hermes/ssl/<file_name>_hermes.chain.pem` |
| `Acme` | `/etc/letsencrypt/live/<file_name>/cert.pem` | `/etc/letsencrypt/live/<file_name>/privkey.pem` | `/etc/letsencrypt/live/<file_name>/chain.pem` |

The same path-derivation logic is implemented globally in
`inc/get_active_cert_paths.cfm` for the console binding; the SMTP save
handler open-codes it here (technical debt — the path arithmetic should
be moved to the helper so there's only one place that knows the layout).
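
A single helper for that path arithmetic might look like the sketch below. It simply encodes the table above; the function name `smtp_cert_paths` is hypothetical and is not the signature of `get_active_cert_paths.cfm`.

```python
def smtp_cert_paths(cert_type: str, file_name: str) -> dict:
    """Derive the three Postfix TLS file paths from a certificate row."""
    if cert_type == "Imported":
        base = f"/opt/hermes/ssl/{file_name}_hermes"
        return {
            "smtpd_tls_cert_file": f"{base}.pem",
            "smtpd_tls_key_file": f"{base}.key",
            "smtpd_tls_CAfile": f"{base}.chain.pem",
        }
    if cert_type == "Acme":
        live = f"/etc/letsencrypt/live/{file_name}"
        return {
            "smtpd_tls_cert_file": f"{live}/cert.pem",
            "smtpd_tls_key_file": f"{live}/privkey.pem",
            "smtpd_tls_CAfile": f"{live}/chain.pem",
        }
    # Fail loudly instead of guessing a layout for an unknown type
    raise ValueError(f"unknown certificate type: {cert_type!r}")
```

Raising on an unknown `type` keeps a new certificate kind from silently producing paths that do not exist on disk.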

The new directive values land in the `parameters` table, then
`generate_postfix_configuration.cfm` regenerates `main.cf` from the
live rows and runs `postfix reload`. Mode changes therefore take effect
on the next SMTP connection without dropping in-flight sessions
(`postfix reload` is a SIGHUP, not a restart).

## What this page does NOT configure

Hermes' TLS surface is opinionated by design. The page deliberately
omits several knobs that Postfix exposes:

| Concern | Status |
|---|---|
| Cipher suite (`smtpd_tls_ciphers`, `smtpd_tls_mandatory_ciphers`) | Hardcoded in `main.cf` baseline; no UI |
| Protocol versions (`smtpd_tls_protocols`, `smtpd_tls_mandatory_protocols`) | Hardcoded in `main.cf` baseline; no UI |
| DH parameters (`smtpd_tls_dh1024_param_file`) | Same ECDHE-only decision as Console Settings — DH is not offered |
| TLS session cache | Hardcoded defaults |
| EECDH curve | Hardcoded defaults |
| Per-mailbox-domain certs (autoconfig/autodiscover) | Lives on [SAN Management](https://docs.deeztek.com/books/administrator-guide/page/san-management); this page binds the **single** cert Postfix presents on the public SMTP banner |
| Dovecot IMAP/POP cert | Email Server > Settings (separate `mail.certificate` binding) |
| Console (nginx) cert | [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) |

The cipher / protocol decisions are baked into the Postfix baseline
config because they have global security implications, and changing them
safely needs more than a dropdown. There is no curated "modern /
intermediate / legacy" preset UI yet. The right defaults for an SEG track
[Mozilla's modern profile](https://wiki.mozilla.org/Security/Server_Side_TLS),
which changes too rarely to warrant an operator-tunable UI.

## TLS Policy Domains — per-destination outbound overrides

Below the global card is the **TLS Policy Domains** table. Each row
forces a stricter-than-global TLS policy for outbound mail to a specific
recipient domain.

| Field | Meaning |
|---|---|
| **Domain** | Recipient domain (`example.com`) or domain-and-subdomains pattern (`.example.com` — leading dot matches all subdomains) |
| **Encryption Mode** | Currently always **Mandatory** (`encrypt`) for manually-added rows. Per-row mode tunables are tracked but not exposed. |
| **Note** | Free-text description shown in the row |

Adding a row generates `/etc/postfix/tls_policy` (via
`generate_tls_policy.cfm`), runs `postmap` to compile it into a hash
map, and reloads Postfix:

```
docker exec hermes_postfix_dkim /usr/sbin/postmap /etc/postfix/tls_policy
```

The Postfix daemon then consults the map for every outbound SMTP
connection — entries matching the destination domain override
`smtp_tls_security_level` for that specific destination.
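The override behaviour can be modelled in a few lines. This is a simplified sketch, not Postfix's actual code. It assumes the lookup order implied by the Domain field above: exact domain first, then leading-dot parent patterns, then the global level. The function name `effective_tls_level` is hypothetical.

```python
def effective_tls_level(destination: str, policy: dict, global_level: str) -> str:
    """Return the TLS level that applies to one outbound destination domain."""
    domain = destination.lower()
    # 1. An exact entry ("example.com") wins.
    if domain in policy:
        return policy[domain]
    # 2. Otherwise try leading-dot parents: "mx.example.com" -> ".example.com" -> ".com".
    idx = domain.find(".")
    while idx != -1:
        parent = domain[idx:]
        if parent in policy:
            return policy[parent]
        idx = domain.find(".", idx + 1)
    # 3. No row matched: fall back to the global smtp_tls_security_level.
    return global_level
```

In this model a `.example.org` row covers `mx.example.org` but not `example.org` itself, so a domain that needs both gets two rows.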

> **Operational consequence.** Adding an `encrypt` policy for a recipient
> domain whose MX **doesn't actually support STARTTLS** silently breaks
> outbound mail to that domain: Postfix defers the messages and eventually
> bounces them. Verify that the recipient MX advertises STARTTLS before
> adding a Mandatory entry. The warning callout on the page itself spells
> this out.

### Auto-added rows (managed by Domains)

When a domain on Email Server > Domains or Email Relay > Domains is
configured to require SASL authentication, Hermes auto-inserts a TLS
policy row to enforce encryption for that destination. These rows are
marked by `description = 'Auto-added: domain requires authentication'`
and rendered with a special **Managed by Domains** badge:

- The row's **checkbox** is suppressed (cannot be bulk-deleted from
  here)
- The row's **Edit** button is suppressed (must be edited on the
  managing page)
- The Note column links to **view_domains.cfm** so the admin lands on
  the right page

This is the same pattern used elsewhere in Hermes for system-owned
rows that would otherwise look user-editable — surface that the row is
managed somewhere else and link to the managing page.

## Save flows

### Save SMTP TLS Settings (`save_settings`)

```
1. Validate form.tlsmode in ("", "may", "encrypt")
2. UPDATE parameters value rows for smtpd_tls_security_level + smtp_tls_security_level
3. If tlsmode is not "" :
     a. Validate certificateno_1 exists in system_certificates
     b. Refuse if certificateno_1 = 1 (legacy bootstrap-id check)
     c. UPDATE parameters2 smtp.certificate
     d. Derive cert/key/CA paths from type + file_name
     e. UPDATE parameters value rows for smtpd_tls_cert_file / smtpd_tls_key_file / smtpd_tls_CAfile
4. generate_postfix_configuration.cfm  (regenerate main.cf + postfix reload)
5. session.m = 35 ("settings saved successfully. Postfix reloaded.")
6. cflocation back to view_smtp_tls_settings.cfm
```

### Add / Edit / Delete TLS Policy Domain (`add_domain` / `edit_domain` / `delete_domain`)

```
1. Validate domain (email-trick: IsValid("email", "bob@<domain>"))
   - Leading "." accepted; validator prepends "subdomain"
2. INSERT / UPDATE / DELETE in tls_policies
3. generate_tls_policy.cfm   (rewrite /etc/postfix/tls_policy + postmap)
4. generate_postfix_configuration.cfm  (postfix reload)
5. session.m = 37 / 39 / 34 (per action)
6. cflocation back to view_smtp_tls_settings.cfm
```

Both save flows end in a `postfix reload`, which is a SIGHUP — no
in-flight SMTP connections are dropped, and queued mail continues
delivering normally.
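Step 1 of the Add / Edit / Delete flow (accept a leading dot by validating `subdomain<domain>` instead) can be sketched as follows. The real handler relies on Lucee's `IsValid("email", ...)`; this sketch swaps in a plain hostname regex as a stand-in, so its accept/reject boundary is an assumption and may differ from Lucee's in edge cases.

```python
import re

# One DNS label: alphanumeric at both ends, hyphens allowed inside, at most 63 characters.
_LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
_DOMAIN_RE = re.compile(rf"^(?:{_LABEL}\.)+[A-Za-z]{{2,63}}$")


def is_valid_policy_domain(value: str) -> bool:
    """Validate a TLS Policy Domain entry, allowing the leading-dot subdomain form."""
    candidate = value.strip()
    if candidate.startswith("."):
        # Mirror the documented trick: ".example.com" is checked as "subdomain.example.com".
        candidate = "subdomain" + candidate
    return bool(_DOMAIN_RE.match(candidate))
```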

## Failure semantics

| What breaks | What happens |
|---|---|
| Mode = Opportunistic/Mandatory + Certificate empty | `m = 1`, "SMTP TLS Certificate cannot be blank when TLS Mode is set to Opportunistic or Mandatory" |
| Certificate ID does not exist in `system_certificates` | `m = 2`, "The SMTP TLS Certificate you entered is not valid" |
| Certificate ID is `1` (legacy bootstrap check) | `m = 3`, "You cannot select the system-self-signed Certificate for SMTP TLS" |
| Domain validation fails on add/edit | `m = 4` |
| Duplicate domain on add | `m = 5` |
| Duplicate domain on edit | `m = 6` |
| Missing required form field | `m = 20` |
| `generate_tls_policy.cfm` fails (cp / mv / postmap) | DB is ahead of the live `tls_policy.db`. Next save re-renders cleanly. The previous live map is preserved as `/etc/postfix/tls_policy.HERMES.BACKUP`. |
| `postfix reload` fails inside the container | DB and on-disk config in sync; running daemon stale. Recovery: `docker exec hermes_postfix_dkim postfix reload` manually. |
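The first three rows of the table follow the order of checks in the save flow. A minimal sketch of that ordering, assuming a hypothetical `validate_tls_settings` function that returns the `m` code (or `None` when the input is acceptable):

```python
from typing import Optional, Set


def validate_tls_settings(tlsmode: str, cert_id: Optional[int],
                          known_cert_ids: Set[int]) -> Optional[int]:
    """Return the failure code m for a save_settings request, or None if it passes."""
    if tlsmode not in ("", "may", "encrypt"):
        # The document does not list a code for an invalid mode; raising is an assumption.
        raise ValueError(f"unexpected TLS mode: {tlsmode!r}")
    if tlsmode == "":
        return None  # TLS disabled: no certificate is required
    if cert_id is None:
        return 1     # certificate cannot be blank
    if cert_id not in known_cert_ids:
        return 2     # certificate ID not present in system_certificates
    if cert_id == 1:
        return 3     # legacy bootstrap-id check
    return None
```

The checks run in the same order as the save flow, so a request with several problems reports only the first one.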

## Files and containers touched

| Path | Owner | Role |
|---|---|---|
| `config/hermes/var/www/html/admin/2/view_smtp_tls_settings.cfm` | `hermes_commandbox` | Page |
| `config/hermes/var/www/html/admin/2/inc/edit_smtp_tls_settings.cfm` | `hermes_commandbox` | Save handler (mode + cert binding) |
| `config/hermes/var/www/html/admin/2/inc/smtp_tls_save_settings.cfm` | `hermes_commandbox` | Action handler wrapper around `edit_smtp_tls_settings.cfm` |
| `config/hermes/var/www/html/admin/2/inc/smtp_tls_add_domain.cfm` | `hermes_commandbox` | TLS Policy add |
| `config/hermes/var/www/html/admin/2/inc/smtp_tls_edit_domain.cfm` | `hermes_commandbox` | TLS Policy edit |
| `config/hermes/var/www/html/admin/2/inc/smtp_tls_delete_domain.cfm` | `hermes_commandbox` | TLS Policy delete |
| `config/hermes/var/www/html/admin/2/inc/generate_tls_policy.cfm` | `hermes_commandbox` | Render `/etc/postfix/tls_policy` + `postmap` |
| `config/hermes/var/www/html/admin/2/inc/generate_postfix_configuration.cfm` | `hermes_commandbox` | `main.cf` regen + `postfix reload` |
| `/etc/postfix/main.cf` | `hermes_postfix_dkim` (mounted) | Live Postfix config — regen target |
| `/etc/postfix/tls_policy` + `tls_policy.db` | `hermes_postfix_dkim` (mounted) | Live TLS-policy map (text + postmap-compiled) |
| `/etc/postfix/tls_policy.HERMES.BACKUP` | `hermes_postfix_dkim` (mounted) | Write-time backup of the previous live map |
| `parameters` rows for `smtpd_tls_*` and `smtp_tls_*` | `hermes_db_server` (`hermes` DB) | Directive values |
| `parameters2.smtp.certificate` | `hermes_db_server` (`hermes` DB) | Active SMTP cert binding (FK into `system_certificates.id`) |
| `tls_policies` table | `hermes_db_server` (`hermes` DB) | Per-destination overrides |

Every shell-out uses `docker exec hermes_postfix_dkim ...` per the
standard Hermes Docker pattern. `postmap` is the one operation that
absolutely **must** run inside the container — the `tls_policy.db`
hash format is libdb-version-sensitive, and running it on the host
produces a file Postfix inside the container can't read.

## Related

- [System Certificates](https://docs.deeztek.com/books/administrator-guide/page/system-certificates) — the certificate store this page selects from; system-managed certs cannot be bound for SMTP
- [Server Setup](https://docs.deeztek.com/books/administrator-guide/page/server-setup) — Mail Server Hostname (`myhostname`); the SMTP cert's Subject CN or SAN should match this for strict STARTTLS verifiers
- [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) — the console-side analogue of this page (binds a System Certificate to nginx)
- [Authentication Settings](https://docs.deeztek.com/books/administrator-guide/page/authentication-settings) — Authelia / SASL; per-domain SASL requirements auto-insert `tls_policies` rows here
- [LDAP RemoteAuth](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth) — upstream LDAP TLS settings; separate CA store at `/opt/hermes/certs/remoteauth/`, not part of System Certificates
- [SAN Management](https://docs.deeztek.com/books/administrator-guide/page/san-management) — per-mailbox-domain certs for autodiscover/autoconfig; orthogonal to the single SMTP cert this page binds
- [Intrusion Prevention](https://docs.deeztek.com/books/administrator-guide/page/ips) — Fail2ban; not TLS-related but relevant for hardening the SMTP service this page configures
- [Admin Console Firewall](https://docs.deeztek.com/books/administrator-guide/page/console-firewall) — IP allowlist for the console (not SMTP); SMTP is open to the Internet for inbound mail

# System Certificates

Admin path: **System > System Certificates** (`view_system_certificates.cfm`,
`inc/cert_action.cfm`, `inc/import_certificate.cfm`,
`inc/acme_request_certificate.cfm`, `inc/acme_request_san_certificate.cfm`,
`inc/delete_system_certificate.cfm`, `inc/parse_certificate_details.cfm`,
`inc/get_system_cert_ids.cfm`, `inc/get_active_cert_paths.cfm`).

This is the **canonical certificate store** for Hermes. Every X.509
certificate the gateway presents to the outside world is registered as a
row in `system_certificates` and selected by ID from one of the binding
pages: [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) (web console),
[SMTP TLS Settings](https://docs.deeztek.com/books/administrator-guide/page/smtp-tls-settings) (Postfix SMTP banner), Email
Server > Settings (Dovecot IMAP/POP/Submission), and
[SAN Management](https://docs.deeztek.com/books/administrator-guide/page/san-management) (per-mailbox-domain
autodiscover/autoconfig).

The page itself is purely a CRUD store plus a CSR generator and the
Let's Encrypt (ACME) integration. It does **not** bind certs to services
— that happens on the consuming pages, each of which writes to its own
row in `parameters2`.

## Where certificate files live

The store has two on-disk layouts (`Imported` and `Acme`) plus a
system-managed placeholder. Each lays down files in a different
directory tree.

| `type` | On-disk pattern | Source |
|---|---|---|
| `Imported` | `/opt/hermes/ssl/<file_name>_hermes.pem` (leaf), `.key`, `.chain.pem`, `.bundle.pem` (leaf + chain concatenated) | **Import Certificate** modal or **Generate CSR** → external CA → import |
| `Acme` | `/etc/letsencrypt/live/<file_name>/{fullchain,cert,privkey,chain}.pem` | **Request ACME Certificate** modal; renewals via Ofelia-scheduled certbot runs |
| `Imported` (system) | `/opt/hermes/ssl/bootstrap_hermes.{bundle.pem,key,...}` (Docker fresh installs); `/etc/ssl/{certs,private}/ssl-cert-snakeoil.{pem,key}` (legacy non-Docker) | Installer (`install_hermes_docker.sh`) or Ubuntu `ssl-cert` package |

The `bootstrap` cert is a self-signed snakeoil that ships with every
fresh Docker install — Hermes needs **something** to bind to before the
admin imports a real cert. It is reserved as a placeholder for newly-added
mailbox domains; consumers that actually need a publicly-trusted cert
(SMTP TLS, the console) refuse to bind to it (see
[SMTP TLS Settings § Selecting a certificate](https://docs.deeztek.com/books/administrator-guide/page/smtp-tls-settings#selecting-a-certificate)).

## The `system` column and the SYSTEM badge

The `system_certificates.system` column (added by issue #252) is a
boolean flag marking install-generated rows. The UI surfaces this two
ways:

| Surface | Behavior when `system = 1` |
|---|---|
| **SYSTEM badge** next to the friendly name | Rendered as a gray pill in the Name column |
| **Delete** button | Disabled with a tooltip ("System-managed certificate — cannot be deleted. Used as a placeholder when binding mailbox domains before a real cert is imported.") |

The delete-protection gate lives in `cert_action.cfm` and re-checks
`system = 1` server-side so a crafted POST cannot bypass the disabled
button.

> **Legacy vs Docker file_name.** Fresh Docker installs have
> `file_name = 'bootstrap'`. Legacy non-Docker installs that survived a
> migration have `file_name = 'ssl-cert-snakeoil'` (from the Ubuntu
> `ssl-cert` package). Both are flagged `system = 1` on installs where
> the column exists. The `inc/get_system_cert_ids.cfm` helper resolves
> the row IDs at runtime — code that needs to know "is this a system
> cert" reads from the helper, never from a hardcoded `id = 1`. This is
> the only correct gating signal; `version_no = 'Docker'` does **not**
> tell you which file_name pattern applies because both DEV (Docker,
> legacy install vintage) and Test (Docker, fresh install) report the
> same version string.

## Cert path resolver — `get_active_cert_paths.cfm`

Most consumers don't want the row ID — they want the actual on-disk
paths to pass to `nginx ssl_certificate`, `openssl cms -sign`, Postfix's
`smtpd_tls_cert_file`, etc. The path layout differs between Imported
(`/opt/hermes/ssl/...`) and ACME (`/etc/letsencrypt/live/.../...`), and
the same logical name maps to different files for different consumers
(`fullchain.pem` for nginx vs `cert.pem` for openssl signer).

`inc/get_active_cert_paths.cfm` is the single place that knows this
layout. It reads the active console certificate from `parameters2`,
joins to `system_certificates`, and writes six caller-visible variables:

| Variable | Purpose |
|---|---|
| `hermesCertType` | `"Imported"`, `"Acme"`, or `"Snakeoil"` |
| `hermesCertIsSnakeoil` | `true` when no real cert is bound (signing callers must skip) |
| `hermesCertNginxPath` | Cert for `nginx ssl_certificate` (bundle for Imported, fullchain for Acme) |
| `hermesCertKeyPath` | Private key |
| `hermesCertSignerPath` | Leaf cert only — for `openssl cms -sign` |
| `hermesCertChainPath` | Intermediates only — for `openssl cms -sign -certfile` |

Any new code that touches certificate files should `cfinclude` this
helper rather than reinventing the path arithmetic. The legacy hardcoded
fallback (`/etc/ssl/certs/ssl-cert-snakeoil.pem`) was removed in #251
because the minimal Docker container doesn't have the `ssl-cert` package
and nginx crashed with `BIO_new_file` errors on the missing file.
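To make the per-consumer mapping concrete, here is a sketch of a resolver that returns the six values from the table. It is illustrative only. The class and function names are hypothetical, the real helper is a CFML include that sets variables, and how it detects the placeholder cert is not shown in this document, so the sketch simply takes an `is_system` flag and returns no paths in that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActiveCertPaths:
    cert_type: str               # hermesCertType
    is_snakeoil: bool            # hermesCertIsSnakeoil
    nginx_path: Optional[str]    # hermesCertNginxPath
    key_path: Optional[str]      # hermesCertKeyPath
    signer_path: Optional[str]   # hermesCertSignerPath
    chain_path: Optional[str]    # hermesCertChainPath


def resolve_active_cert_paths(cert_type: str, file_name: str,
                              is_system: bool = False) -> ActiveCertPaths:
    """Resolve on-disk paths for the active console cert (sketch under stated assumptions)."""
    if is_system:
        # Assumption: signing callers skip when no real cert is bound, so no paths are returned.
        return ActiveCertPaths("Snakeoil", True, None, None, None, None)
    if cert_type == "Imported":
        stem = f"/opt/hermes/ssl/{file_name}_hermes"
        return ActiveCertPaths("Imported", False, f"{stem}.bundle.pem",
                               f"{stem}.key", f"{stem}.pem", f"{stem}.chain.pem")
    if cert_type == "Acme":
        live = f"/etc/letsencrypt/live/{file_name}"
        return ActiveCertPaths("Acme", False, f"{live}/fullchain.pem",
                               f"{live}/privkey.pem", f"{live}/cert.pem", f"{live}/chain.pem")
    raise ValueError(f"unsupported certificate type: {cert_type!r}")
```

Note how the nginx path and the signer path differ for the same row: nginx gets leaf plus chain, while the signer gets the leaf alone.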

## Three ingest paths

### 1. Request ACME Certificate (Pro feature)

The **Request ACME Certificate** button issues a Let's Encrypt cert via
an ephemeral certbot container. Disabled when no Pro license is active.

```
Admin clicks Request -> view_system_certificates.cfm action=requestacme
   -> inc/acme_request_certificate.cfm
       docker run --rm --name hermes_certbot --network host \
         -v <repo>/config/hermes/var/www/html:/var/www/certbot \
         -v <repo>/config/certbot/conf:/etc/letsencrypt \
         -v <repo>/config/certbot/logs:/var/log \
         certbot/certbot:latest \
         certonly --webroot --webroot-path /var/www/certbot \
         --email <admin> --agree-tos --no-eff-email \
         [--dry-run]   # staging mode
         -d <domain>
```

- **Staging** mode adds `--dry-run` and never lands a real cert. Always
  test with Staging first to confirm DNS + ports 80/443 work; repeated
  broken HTTP-01 challenges can exhaust Let's Encrypt's production rate
  limits and lock the domain out for up to a week.
- The webroot is mounted to `/var/www/certbot` so certbot can write the
  challenge file where the live nginx vhost expects it.
- Certs land in `config/certbot/conf/live/<domain>/` (bind-mounted to
  `/etc/letsencrypt/live/<domain>/` in the commandbox container).
- Renewals are driven by Ofelia (Scheduled Tasks). Each renewal runs the
  same ephemeral certbot container with `renew`; if the renewal
  succeeds, dependent services (nginx, Postfix, Dovecot) reload to pick
  up the new files.
- ACME certs cannot be renewed manually from this page — the row exists
  for binding and deletion only; renewals are scheduled and silent.
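The command shown above can be assembled as an argument list, which makes the staging toggle explicit. This is a sketch only: `certbot_argv` and its parameters are hypothetical names, and the flags are copied from the command in this section, not from the handler's source.

```python
from typing import List


def certbot_argv(repo: str, domain: str, email: str, staging: bool) -> List[str]:
    """Build the docker-run argument list for a single-domain ACME request."""
    argv = [
        "docker", "run", "--rm", "--name", "hermes_certbot", "--network", "host",
        "-v", f"{repo}/config/hermes/var/www/html:/var/www/certbot",
        "-v", f"{repo}/config/certbot/conf:/etc/letsencrypt",
        "-v", f"{repo}/config/certbot/logs:/var/log",
        "certbot/certbot:latest",
        "certonly", "--webroot", "--webroot-path", "/var/www/certbot",
        "--email", email, "--agree-tos", "--no-eff-email",
    ]
    if staging:
        argv.append("--dry-run")  # Staging mode: exercise the challenge without issuing
    argv += ["-d", domain]
    return argv
```

Passing a list rather than a single shell string avoids quoting problems if the e-mail address or domain contains unexpected characters.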

Per-mailbox-domain ACME SAN certs (autoconfig + autodiscover + custom
prefixes) use a separate code path
(`inc/acme_request_san_certificate.cfm`) wired to
[SAN Management](https://docs.deeztek.com/books/administrator-guide/page/san-management). Both paths land
rows in the same `system_certificates` table.

### 2. Import Certificate

For certs issued by any CA other than Let's Encrypt (commercial CA,
internal PKI, etc.). The admin pastes three PEM blobs in the **Import
Certificate** modal:

| Field | Contents |
|---|---|
| Certificate (PEM) | Leaf cert between `-----BEGIN CERTIFICATE-----` and `-----END CERTIFICATE-----` |
| Unencrypted Key (PEM) | Private key — must be **unencrypted** (no passphrase). Encrypted keys are rejected because nginx / Postfix cannot prompt for a passphrase at startup. |
| Root & Intermediate CA Certificates (PEM) | Chain — root + intermediates concatenated, leaf-omitted |

On save, Hermes writes four files under `/opt/hermes/ssl/`:

```
<file_name>_hermes.pem            (leaf only)
<file_name>_hermes.key            (private key)
<file_name>_hermes.chain.pem      (CA chain, no leaf)
<file_name>_hermes.bundle.pem     (leaf + chain — for nginx ssl_certificate)
```

`<file_name>` is derived from the friendly name with special characters
sanitized. The row is inserted with `type = 'Imported'` and the
extracted Subject/Issuer/Serial/Fingerprint cached in the table for the
expandable row preview.
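The four-file layout can be sketched as below. This is a hedged illustration, not the import handler. The function name `write_imported_cert` is hypothetical, the restrictive mode on the key file is an assumption of the sketch, and the key/cert match check described under Failure semantics is omitted.

```python
from pathlib import Path
from typing import Dict


def write_imported_cert(ssl_dir: Path, file_name: str,
                        leaf_pem: str, key_pem: str, chain_pem: str) -> Dict[str, Path]:
    """Write the leaf, key, chain and bundle files for one imported certificate."""
    def norm(pem: str) -> str:
        # Exactly one trailing newline so concatenated PEM blocks stay well-formed.
        return pem.strip() + "\n"

    files = {
        f"{file_name}_hermes.pem": norm(leaf_pem),
        f"{file_name}_hermes.key": norm(key_pem),
        f"{file_name}_hermes.chain.pem": norm(chain_pem),
        # Bundle = leaf followed by chain, the order nginx ssl_certificate expects.
        f"{file_name}_hermes.bundle.pem": norm(leaf_pem) + norm(chain_pem),
    }
    written = {}
    for name, body in files.items():
        path = ssl_dir / name
        path.write_text(body, encoding="ascii")
        if name.endswith(".key"):
            path.chmod(0o600)  # assumption: keep the private key owner-readable only
        written[name] = path
    return written
```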

### 3. Generate CSR

For admins who want to use their own CA but don't have a key+CSR yet.
The modal collects DN fields (Country, State, Locality, Organization,
Department) plus a **Certificate purpose** radio toggle that drives the
rest of the form:

| Purpose | CN source | SANs |
|---|---|---|
| **Server certificate** (single-name DV, ~$10/yr) | Admin enters Common Name field directly | Admin-entered FQDNs only |
| **Mailbox certificate** (SAN / UCC, $50–$200/yr) | Auto-derived as `<first-prefix>.<mailbox_domain>` matching Pro ACME's first-`-d`-flag behavior | Mandatory: `autoconfig.<domain>`, `autodiscover.<domain>`, plus every prefix from `additional_sans`. Additional admin entries auto-expand bare prefixes against the mailbox domain. |

Smart default: if `mailbox_domains` has any rows, the modal defaults to
**Mailbox**; otherwise it defaults to **Server**. The page-level
"Choosing the Right Certificate Type" card above the table walks the
admin through the cost difference and the "a basic DV cert will not
work for mailboxes" trap.
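The Mailbox row of the purpose table can be read as a small name-building rule. The sketch below is one interpretation under stated assumptions: `prefixes` stands for the ordered prefixes configured in SAN Management, the first entry supplies the CN, and an admin entry without a dot counts as a bare prefix. The function name `mailbox_csr_names` is hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


def mailbox_csr_names(mailbox_domain: str, prefixes: Sequence[str],
                      extra_entries: Iterable[str] = ()) -> Tuple[str, List[str]]:
    """Return (common_name, subject_alt_names) for a mailbox-certificate CSR."""
    if not prefixes:
        # Mirrors the documented refusal when no SAN prefixes are configured.
        raise ValueError("No SAN prefixes configured in SAN Management")
    sans: List[str] = []

    def add(name: str) -> None:
        name = name.strip().lower()
        if name and name not in sans:
            sans.append(name)

    # Mandatory names first, then every configured prefix.
    for prefix in ("autoconfig", "autodiscover", *prefixes):
        add(f"{prefix}.{mailbox_domain}")
    # Admin entries: bare prefixes are expanded against the mailbox domain.
    for entry in extra_entries:
        add(entry if "." in entry else f"{entry}.{mailbox_domain}")
    common_name = f"{prefixes[0].strip().lower()}.{mailbox_domain}"
    return common_name, sans
```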

On submit, Hermes generates a 2048- or 4096-bit RSA key + matching CSR
and bundles them into a `.rar` archive at
`/opt/hermes/tmp/<token>_csr_key.rar`. The CSR-pending state is then
surfaced as a **persistent callout** at the top of the page (added by
#249) — the **Download CSR** button stays visible across page reloads
until the admin clicks **Discard**. Submit the CSR to the chosen CA,
receive the signed cert + chain, then come back and use **Import
Certificate** (section 2 above) to register it. The private key in the
`.rar` is what you'll need for the import.

> **The CSR private key never leaves Hermes until the admin downloads
> the bundle.** If the admin clicks Discard without downloading,
> the key is gone — there is no recovery. The Discard button warns about
> this; the persistent callout pattern (#249) was introduced because the
> one-shot download button that used to live inside the success alert
> was easy to miss on a page reload.

## Service binding cross-reference

The certificate-store rows are referenced from four service-binding
locations. Each location keeps its **own** copy of the cert ID — there
is no cascading delete, so the deletion guard (next section) walks all
four before allowing a row to be removed.

| Service | Where the binding lives | Set on page |
|---|---|---|
| Console (admin, user portal, NC, Ciphermail) | `parameters2.parameter = 'console.certificate', module = 'console'` | [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) |
| SMTP (Postfix `smtpd_tls_cert_file`) | `parameters2.parameter = 'smtp.certificate', module = 'certificates'` | [SMTP TLS Settings](https://docs.deeztek.com/books/administrator-guide/page/smtp-tls-settings) |
| Webmail (Dovecot IMAP/POP) | `parameters2.parameter = 'mail.certificate', module = 'certificates'` | Email Server > Settings |
| Mailbox SAN (per-domain autodiscover/autoconfig) | `mailbox_domains.mailbox_certificate` (multiple rows possible) | Email Server > Domains, [SAN Management](https://docs.deeztek.com/books/administrator-guide/page/san-management) |

The page renders four YES/NO columns (Console / SMTP / Webmail /
Mailbox SAN) so an admin can see at a glance which services a given
cert is in use by.

## Deletion guard

`inc/delete_system_certificate.cfm` walks every consumer before allowing
a delete:

```
1. system column flag         -> system-managed, refuse
2. parameters2 console.certificate    -> assigned to Web Service, refuse
3. parameters2 smtp.certificate       -> assigned to SMTP Service, refuse
4. parameters2 mail.certificate       -> assigned to Mail Service, refuse
5. (mailbox_domains.mailbox_certificate check is in cert_action.cfm)
6. -> DELETE FROM system_certificates WHERE id = ?
7.    plus filesystem cleanup:
        Imported: rm /opt/hermes/ssl/<file_name>_hermes.{pem,key,chain.pem,bundle.pem}
        Acme:     docker run --rm certbot/certbot:latest delete --cert-name <file_name>
                  + DELETE FROM mailbox_domains_sans WHERE acme_certificate = ?
```

The guard is **stop-on-first-match** with a specific error message per
case so the admin knows which binding is blocking the delete and where
to go to unbind. There is no "force delete" — the only way past the
guard is to unbind on the consuming page first.
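The stop-on-first-match behaviour can be sketched as an ordered list of checks. This is illustrative only: `deletion_blocker` is a hypothetical name, the row and binding shapes are assumed, and the mailbox-domain check that lives in `cert_action.cfm` is left out.

```python
from typing import Dict, Optional


def deletion_blocker(cert: Dict, bindings: Dict[str, int]) -> Optional[str]:
    """Return the reason a certificate cannot be deleted, or None if deletion may proceed."""
    if cert.get("system"):
        return "system-managed"
    # Order matters: the first binding that matches is the one reported to the admin.
    checks = [
        ("console.certificate", "Web"),
        ("smtp.certificate", "SMTP"),
        ("mail.certificate", "Mail"),
    ]
    for parameter, service in checks:
        if bindings.get(parameter) == cert["id"]:
            return f"assigned to the {service} Service"
    return None
```

A cert bound to both the console and SMTP reports only the console binding; after unbinding there, the next delete attempt reports SMTP.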

## Certificate downloads (gated)

Each row has an expandable details panel with **Download Certificate**,
**Download Private Key**, and **Download CA Chain** buttons. By default
these are **disabled** for safety (downloading a private key over a web
page is a sensitive operation). To enable, set

```
ALLOW_CERT_DOWNLOAD=yes
```

in `/opt/hermes/config/security.conf` on the host filesystem. The page
reads this file on every load and caches the value for the duration of
that request. When the
toggle is off, the buttons render disabled with a tooltip telling the
admin where to set the flag.
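Reading the toggle amounts to a small key=value parse. The sketch below assumes `#` comment lines and a case-insensitive `yes`, neither of which is confirmed by this document. The function name `cert_download_allowed` is hypothetical.

```python
def cert_download_allowed(conf_text: str) -> bool:
    """Return True only when security.conf explicitly sets ALLOW_CERT_DOWNLOAD=yes."""
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and (assumed) comment lines
        key, sep, value = line.partition("=")
        if sep and key.strip() == "ALLOW_CERT_DOWNLOAD":
            return value.strip().lower() == "yes"
    # Missing file content or missing key: downloads stay disabled by default.
    return False
```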

Downloads are streamed via a hidden iframe + `class="no-preloader"`
pattern (standard Hermes binary-download convention) so the page's
spinner overlay doesn't get stuck.

## SAN validation sub-table (Pro feature)

When a row is bound to one or more entries in `mailbox_sans`
(autodiscover/autoconfig/custom subdomains for a mailbox domain), the
expanded details panel includes a **Mailbox SAN Validation** sub-table
showing IP-resolve and DNS-resolve status for each SAN. This is
populated by the
[SAN Management](https://docs.deeztek.com/books/administrator-guide/page/san-management) validator and is
read-only here — it answers "do all the SANs on this cert actually
resolve to this server?" at a glance.

## Failure semantics

| What breaks | What happens |
|---|---|
| CSR field validation (Country != 2 chars, bad CN chars, etc.) | `session.m` set with the specific error, `cflocation` back to the page, no file/DB writes |
| Mailbox CSR with empty `additional_sans` table | Refused with "No SAN prefixes configured in SAN Management. Cannot generate a mailbox certificate without at least autoconfig + autodiscover." |
| ACME staging dry-run fails (DNS, port 80, rate limit) | Raw certbot stderr surfaced in the error alert; no DB row added |
| ACME production fails | Same as staging — error alert with raw stderr |
| Import with mismatched key + cert | The import script's openssl-modulus check fails; error alert with detail |
| Delete blocked by binding | "The Certificate you are attempting to delete is assigned to the X Service" — admin must unbind first on the consuming page |
| `certbot delete` fails on ACME row | DB row kept, error surfaced; manual cleanup of the `/etc/letsencrypt/live/<name>/` tree may be needed |

## Files and containers touched

| Path | Owner | Role |
|---|---|---|
| `config/hermes/var/www/html/admin/2/view_system_certificates.cfm` | `hermes_commandbox` | Page |
| `config/hermes/var/www/html/admin/2/inc/cert_action.cfm` | `hermes_commandbox` | Action router (CSR, import, ACME, delete, discard) |
| `config/hermes/var/www/html/admin/2/inc/acme_request_certificate.cfm` | `hermes_commandbox` | Single-domain ACME via certbot container |
| `config/hermes/var/www/html/admin/2/inc/acme_request_san_certificate.cfm` | `hermes_commandbox` | Multi-SAN ACME (mailbox certs) |
| `config/hermes/var/www/html/admin/2/inc/import_certificate.cfm` | `hermes_commandbox` | PEM paste-in handler |
| `config/hermes/var/www/html/admin/2/inc/delete_system_certificate.cfm` | `hermes_commandbox` | Deletion guard + filesystem cleanup |
| `config/hermes/var/www/html/admin/2/inc/parse_certificate_details.cfm` | `hermes_commandbox` | Single `openssl x509` parse for subject/issuer/SAN/etc. |
| `config/hermes/var/www/html/admin/2/inc/get_system_cert_ids.cfm` | `hermes_commandbox` | Resolver — which rows are system-managed |
| `config/hermes/var/www/html/admin/2/inc/get_active_cert_paths.cfm` | `hermes_commandbox` | Resolver — on-disk paths for the active console cert |
| `/opt/hermes/ssl/` | `hermes_commandbox` (bind-mounted) | Imported cert files |
| `/etc/letsencrypt/live/<domain>/` | `hermes_commandbox` (bind-mounted from `config/certbot/conf/`) | ACME cert files |
| `/opt/hermes/tmp/<token>_csr_key.rar` | `hermes_commandbox` | Pending CSR bundle |
| `/opt/hermes/config/security.conf` | host filesystem | `ALLOW_CERT_DOWNLOAD` toggle |
| `system_certificates` table | `hermes_db_server` (`hermes` DB) | The canonical store |
| `certbot/certbot:latest` image | docker.io | Pulled on demand; ephemeral per request |

Every certbot invocation is `docker run --rm` against the public
`certbot/certbot:latest` image — Hermes never runs certbot directly on
the host. The container shares the host network (`--network host`) so
Let's Encrypt's HTTP-01 challenge can reach port 80 on the public IP.

## Related

- [SMTP TLS Settings](https://docs.deeztek.com/books/administrator-guide/page/smtp-tls-settings) — bind a System Certificate to Postfix SMTP TLS
- [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) — bind a System Certificate to the web console (nginx) and its hardening toggles
- [Server Setup](https://docs.deeztek.com/books/administrator-guide/page/server-setup) — Mail Server Hostname; should match the CN/SAN on the SMTP cert for STARTTLS verification
- [Authentication Settings](https://docs.deeztek.com/books/administrator-guide/page/authentication-settings) — Authelia; uses the console cert via its nginx-fronted vhost
- [LDAP RemoteAuth](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth) — separate CA store at `/opt/hermes/certs/remoteauth/` for upstream LDAP; not a System Certificate
- [SAN Management](https://docs.deeztek.com/books/administrator-guide/page/san-management) — per-mailbox-domain SAN prefixes that drive mailbox-cert CSR + ACME SAN issuance
- [Intrusion Prevention](https://docs.deeztek.com/books/administrator-guide/page/ips) — Fail2ban; not cert-related but documents the same nginx-restart cascade pattern this page avoids by not regenerating any nginx config
- [Admin Console Firewall](https://docs.deeztek.com/books/administrator-guide/page/console-firewall) — IP allowlist for the console; layered above the TLS termination this page's certs drive

# System Logs

Admin path: **System > System Logs** (`view_system_logs.cfm`,
`schedule/message_cleanup.cfm`).

This page is a SQL-backed log viewer over **rsyslog's `SystemEvents`
table** in the `Syslog` database. Every mail-side container in the
stack ships its `mail.*` syslog stream to MariaDB via the `ommysql`
rsyslog output module; this page reads from that table with a date
range, optional facility filter, and a row limit, and renders the
result in a sortable DataTable.

Pairs with [Mail Queue](https://docs.deeztek.com/books/administrator-guide/page/mail-queue): the Mail Queue viewer shows
what Postfix is currently holding; this page shows the historical log
trail — connection negotiation, milter results, content-filter
verdicts, delivery outcomes, bounce generation — that explains *why*
a message did or did not make it through.

## The log pipeline — container `mail.*` to `SystemEvents`

```
  ┌──────────────────────────────┐
  │ hermes_postfix_dkim          │  mail.*
  ├──────────────────────────────┤
  │ hermes_mail_filter (Amavis)  │  mail.*
  ├──────────────────────────────┤
  │ hermes_opendmarc             │  mail.*
  ├──────────────────────────────┤
  │ hermes_openldap (slapd)      │  mail.*  (via slapd.conf rsyslog rule)
  ├──────────────────────────────┤
  │ hermes_openarc (optional)    │  mail.*
  └──────────────┬───────────────┘
                 │ each container runs its own rsyslogd with
                 │   /etc/rsyslog.d/mysql.conf:
                 │     $ModLoad ommysql
                 │     mail.* :ommysql:hermes_db_server,Syslog,USER,PASS
                 ▼
       ┌────────────────────────────┐
       │ hermes_db_server (MariaDB) │
       │   Syslog.SystemEvents      │
       │   Syslog.SystemEventsProperties (unused by viewer)
       └─────────────┬──────────────┘
                     │ SELECT ... ORDER BY ReceivedAt DESC LIMIT ?
                     ▼
       ┌────────────────────────────┐
       │ view_system_logs.cfm       │
       └────────────────────────────┘
```

The MySQL output config that wires each container into the pipeline is
templated at install time. Each service gets a per-container template
under `config/<service>/etc/rsyslog.d/mysql.conf.template` with
`__SYSLOG_USER__` / `__SYSLOG_PASS__` placeholders; the install script
substitutes the generated credentials and bind-mounts the rendered file
into the container at `/etc/rsyslog.d/mysql.conf`. There is no
container-side aggregator — every container talks to MariaDB directly.
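The install-time substitution is plain placeholder replacement. A minimal sketch, assuming the template body matches the two lines shown in the pipeline diagram. The function name `render_rsyslog_mysql_conf` is hypothetical and the real install script is a shell script.

```python
def render_rsyslog_mysql_conf(template: str, user: str, password: str) -> str:
    """Fill the __SYSLOG_USER__ / __SYSLOG_PASS__ placeholders in a mysql.conf template."""
    rendered = (template
                .replace("__SYSLOG_USER__", user)
                .replace("__SYSLOG_PASS__", password))
    # Guard against a half-rendered file being bind-mounted into a container.
    for placeholder in ("__SYSLOG_USER__", "__SYSLOG_PASS__"):
        if placeholder in rendered:
            raise ValueError(f"placeholder left unrendered: {placeholder}")
    return rendered


# Assumed template content, based on the diagram above.
TEMPLATE = (
    "$ModLoad ommysql\n"
    "mail.* :ommysql:hermes_db_server,Syslog,__SYSLOG_USER__,__SYSLOG_PASS__\n"
)
```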

> **Operational consequence.** If MariaDB is down or unreachable from a
> container, that container's `mail.*` log entries are buffered by
> rsyslog and then dropped when the buffer fills. Log gaps during a
> database outage are expected and are not a bug in the viewer.

## What lands in `SystemEvents` — and what doesn't

The `mail.*` selector covers everything that uses syslog facility 2
(`mail`) on the source containers. That is:

- All Postfix smtpd / cleanup / qmgr / smtp / lmtp / bounce / pickup
  output (connection logs, milter verdicts,
  `status=sent|deferred|bounced`, queue lifecycle events)
- All Amavis content-filter output (verdict, score, virus name,
  per-policy bank decisions)
- All OpenDMARC verdict lines (`policy=`, `disposition=`)
- All OpenDKIM signing and verifying output
- slapd's syslog output (only because `slapd.conf` is explicitly
  configured to use the `mail` facility — see [LDAP &
  RemoteAuth](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth))
- OpenARC output if the optional service is enabled

What is **not** here:

- **nginx access / error logs** — not configured to ship to syslog;
  read them with `docker exec hermes_nginx tail -f /var/log/nginx/...`
  or via [Admin Console Firewall](https://docs.deeztek.com/books/administrator-guide/page/console-firewall) /
  [Intrusion Prevention](https://docs.deeztek.com/books/administrator-guide/page/ips) for the security
  view.
- **Authelia auth logs** — written to
  `/remotelogs/authelia/authelia.log` for fail2ban consumption; see
  [Authentication Settings](https://docs.deeztek.com/books/administrator-guide/page/authentication-settings) and
  [Intrusion Prevention](https://docs.deeztek.com/books/administrator-guide/page/ips).
- **Dovecot login / IMAP logs** — written to
  `/remotelogs/dovecot/dovecot-info.log` for fail2ban; the LMTP
  delivery side that Postfix talks to is visible here because Postfix
  logs the LMTP handoff result.
- **CommandBox / Lucee application logs** — Lucee internal logs live
  under the Lucee server home on the data tier, not in `SystemEvents`.
- **Container stdout/stderr** — `docker logs <name>` only.

This page is the operator's one-stop view for *mail-flow* questions.
Auth and HTTP-side concerns have their own log surfaces.

## The `SystemEvents` schema

`config/database/syslog_schema.sql` defines the table (MyISAM,
`latin1_swedish_ci` — rsyslog's canonical schema, kept verbatim for
compatibility). Four columns matter here; the viewer reads the first three:

| Column | Type | Used for |
|---|---|---|
| `ReceivedAt` | `datetime` | Date-range filter, sort order, displayed timestamp |
| `Message` | `text` | The log line body |
| `SysLogTag` | `varchar(60)` | Facility filter; rendered as a badge per row. Format is typically `<program>[<pid>]:` (e.g. `postfix/smtpd[12345]:`) |
| `Facility` | `smallint` | Present but not read by the viewer |

Two indexes ship in the baseline schema:

```sql
KEY `idx_systemevents_receivedat` (`ReceivedAt`),
KEY `idx_systemevents_tag_receivedat` (`SysLogTag`, `ReceivedAt`)
```

Between them they cover both query shapes: the single-column index
serves the bare date-range query, and the composite serves the
facility-filtered one (which uses `SysLogTag LIKE '<facility>%'` and
an `ORDER BY ReceivedAt DESC`). Issue #184, which
tracked the missing indexes, was closed when these were added to
`syslog_schema.sql`; existing installs pick them up via the
`schema_updates.sql` path.

The Facility dropdown is populated by a separate query that pulls
distinct values of `SUBSTRING_INDEX(SysLogTag, '[', 1)` over the
current date range — so the available facilities reflect what actually
logged during the window, not a static enum.
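
As a rough illustration of what that expression yields, the Python sketch below mirrors `SUBSTRING_INDEX(SysLogTag, '[', 1)` on a handful of tags. The function names `facility_of` and `distinct_facilities` are hypothetical and not part of Hermes, and the sorting is an illustrative choice rather than something the SQL guarantees.

```python
def facility_of(syslog_tag: str) -> str:
    """Return the text before the first '[', like SUBSTRING_INDEX(tag, '[', 1)."""
    # str.partition returns the whole string as the head when '[' is absent,
    # which matches SUBSTRING_INDEX's behaviour for a missing delimiter.
    head, _, _ = syslog_tag.partition("[")
    return head


def distinct_facilities(tags):
    """Sorted, de-duplicated facility names for a batch of SysLogTag values."""
    return sorted({facility_of(tag) for tag in tags})
```

Because the PID is stripped before de-duplication, thousands of `postfix/smtpd[<pid>]:` rows collapse into a single `postfix/smtpd` entry.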

## Fields on the page

### Log Retention

Stored in `parameters2` with `parameter = 'system_log_retention'` and
`module = 'systemlog'`. The dropdown offers 7 / 15 / 30 / 60 / 90 / 120
/ 180 days; the seed value is 30. Saving the form just updates the row
— it does not run the cleanup immediately.

The actual deletion runs in `schedule/message_cleanup.cfm`, scheduled
by Ofelia (the in-stack cron container) once per night. The cleanup
job reads the retention value, computes `today - N days`, and runs:

```sql
DELETE FROM SystemEvents WHERE ReceivedAt < '<cutoff-date>'
```

This is the same cleanup job that prunes the Amavis quarantine on the
data tier — log retention and quarantine retention share the schedule
but have independent thresholds. See [Scheduled
Tasks](https://docs.deeztek.com/books/administrator-guide/page/scheduled-tasks) for the full Ofelia job list and how to run
the cleanup on demand.
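
The cutoff arithmetic can be sketched in a few lines of Python. This is not the code in `schedule/message_cleanup.cfm`: the function names are hypothetical, the rejection of unsupported values is an assumption, and the `?` placeholder stands in for whatever bound-parameter style the real job uses.

```python
from datetime import date, timedelta

ALLOWED_RETENTION_DAYS = (7, 15, 30, 60, 90, 120, 180)


def retention_cutoff(today: date, retention_days: int) -> date:
    """Rows with ReceivedAt earlier than the returned date are eligible for deletion."""
    # Assumption: values outside the dropdown's list are rejected.
    if retention_days not in ALLOWED_RETENTION_DAYS:
        raise ValueError(f"unsupported retention value: {retention_days}")
    return today - timedelta(days=retention_days)


def cleanup_statement(today: date, retention_days: int):
    """Return (sql, params) with the cutoff passed as a bound parameter."""
    cutoff = retention_cutoff(today, retention_days)
    return "DELETE FROM SystemEvents WHERE ReceivedAt < ?", (cutoff.isoformat(),)
```

With a 30-day retention and a run on 31 January, everything received before 1 January is removed.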

> **Operational consequence.** Changing the retention value does not
> shrink the table until the next nightly cleanup runs. To force an
> immediate prune after dialing the value down, trigger the cleanup
> job from the Scheduled Tasks page.

### Start Date / Time and End Date / Time

Tempus Dominus datetime pickers with second-level resolution. Defaults
are the last 24 hours (midnight-to-midnight on today rounded back).
Both go into the query as `cf_sql_timestamp` parameters via
`cfqueryparam` — there is no string concatenation in the SQL.

### Facility

A Tom Select multi-select. Empty (no chips) means "all facilities".
Selecting one or more populates a `SysLogTag LIKE '<facility>%'`
clause per chip, OR'd together. The facility list is recomputed every
time the page loads against the current date range — there is no
cached enum.
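
A minimal sketch of how such a clause could be assembled with bound parameters, assuming a `?` placeholder style. The helper name `facility_filter` is hypothetical; the viewer itself is CFML and uses `cfqueryparam`.

```python
def facility_filter(selected):
    """Build a parameterised WHERE fragment for the selected facility chips.

    Returns (sql_fragment, params); an empty selection means "all facilities"
    and yields no fragment at all.
    """
    chips = [name for name in selected if name]
    if not chips:
        return "", []
    clause = " OR ".join("SysLogTag LIKE ?" for _ in chips)
    # Each chip becomes a prefix pattern, so "postfix/smtpd" also matches
    # "postfix/smtpd[12345]:".
    return f"({clause})", [f"{name}%" for name in chips]
```

Wrapping the OR'd terms in parentheses keeps them from interfering with the date-range conditions they are combined with.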

### Limit

One of `1000 / 1500 / 2500 / 5000 / 10000 / 15000`. The viewer
validates against this exact list and falls back to `1000` if any
other value is passed. A yellow callout appears when the
selected limit is 10000 or higher.

> **Why the cap and not unlimited.** The DataTable widget needs to
> render every row into the DOM up front (it does not use
> server-side pagination). A 10,000-row table is already heavy in the
> browser; an unbounded fetch on a multi-month-deep `SystemEvents`
> table would lock the page.
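
The allowlist-with-fallback behaviour described above can be sketched as follows. The names `normalise_limit` and `shows_warning` are hypothetical, and treating non-numeric input the same as an unlisted number is an assumption.

```python
ALLOWED_LIMITS = (1000, 1500, 2500, 5000, 10000, 15000)
DEFAULT_LIMIT = 1000
WARN_AT = 10000


def normalise_limit(raw) -> int:
    """Return raw as an int if it is on the allowlist, else the default."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # Assumption: unparseable input falls back like an unlisted number.
        return DEFAULT_LIMIT
    return value if value in ALLOWED_LIMITS else DEFAULT_LIMIT


def shows_warning(limit: int) -> bool:
    """True when the yellow large-result callout would apply."""
    return limit >= WARN_AT
```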

## Reading the badges

The Facility badge contains the raw `SysLogTag` value, which Postfix
and friends format as `<program>[<pid>]:`. A few high-frequency tags
worth recognising:

| Tag (prefix match) | Meaning |
|---|---|
| `postfix/smtpd` | Inbound SMTP — connection, EHLO/HELO, RCPT, milter results |
| `postfix/cleanup` | Header normalisation, header_checks, milter signing |
| `postfix/qmgr` | Queue manager — message scheduling, expiry |
| `postfix/smtp` | Outbound delivery to remote MX |
| `postfix/lmtp` | Local delivery to Dovecot |
| `postfix/bounce` | Bounce message generation |
| `amavis` | Content-filter verdicts (`Passed CLEAN`, `Blocked SPAM`, virus names) |
| `opendkim` | DKIM signing on outbound, verifying on inbound |
| `opendmarc` | DMARC alignment verdicts |
| `openarc` | ARC seal verdicts (if enabled) |
| `slapd` | LDAP — bind / search / modify operations |

A row's badge is exact-match for sort but prefix-match for the
filter — selecting `postfix/smtpd` in the Facility dropdown matches
`postfix/smtpd[12345]`, `postfix/smtpd[12346]`, and so on.

## Performance notes

With the two baseline indexes the common query shapes are O(log n) on
`ReceivedAt`:

- Last-24-hour, all facilities, limit 1000 — fast on any table size.
- Last-24-hour, one facility, limit 1000 — covered by the composite
  index, also fast.
- Multi-month window, all facilities, limit 10000 — slow on large
  tables; the index narrows the range but 10000 rows of `text` data is
  the bottleneck. Pull a tighter window.
- `SELECT DISTINCT SUBSTRING_INDEX(SysLogTag, '[', 1)` for the
  facility dropdown — fast on a 24-hour window, noticeably slower on
  weeks-deep windows because the index does not help with the
  expression.

If the table has grown into the tens of millions of rows because
retention was left at 180 days on a high-traffic gateway, dial
retention down and let the next nightly cleanup prune, or run the
cleanup job manually from [Scheduled Tasks](https://docs.deeztek.com/books/administrator-guide/page/scheduled-tasks).

## Related pages

- [Mail Queue](https://docs.deeztek.com/books/administrator-guide/page/mail-queue) — live view of what Postfix is holding;
  pair with this page to trace a stuck message from queue to log.
- [Scheduled Tasks](https://docs.deeztek.com/books/administrator-guide/page/scheduled-tasks) — the Ofelia job that runs the
  retention cleanup.
- [LDAP & RemoteAuth](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth) — context on why slapd
  appears in `mail.*` (it is configured to use the `mail` facility).
- [Intrusion Prevention](https://docs.deeztek.com/books/administrator-guide/page/ips) and
  [Admin Console Firewall](https://docs.deeztek.com/books/administrator-guide/page/console-firewall) — auth-side and
  HTTP-side log surfaces that do not land in `SystemEvents`.
- [Authentication Settings](https://docs.deeztek.com/books/administrator-guide/page/authentication-settings) — Authelia
  log location and what auth events look like on disk.

# System Notifications

Admin path: **System > System Notifications** (`view_system_notifications.cfm`,
`inc/ofelia_generate_config.cfm`, `schedule/health_check_mailqueue.cfm`).

This page configures **how the gateway tells the operator that
something needs attention** when no admin is at the console. There are
two delivery channels (Pushover and e-mail) and a per-event toggle
list that decides which scheduled checks fire alerts on which channel.

The page itself is small — one settings card, one toggle list — but
its outputs land in three different places: rows in `system_settings`
(the Pushover toggle and credentials), `active` flags on rows in `ofelia_jobs` (which
container-side scheduled jobs run), and a regenerated Ofelia config
file (`config.ini` on `hermes_ofelia`).

## What this page is — and isn't

| Is | Isn't |
|---|---|
| The configuration page for **outbound** operator alerts: Pushover push notifications + e-mail to `admin_email` | The **on-screen** dashboard alerts under the navbar (those come from `inc/system_alerts.cfm` and render at every page load — they are not configurable here) |
| A toggle list of which scheduled health checks send Pushover alerts when they fire | A free-form "send me this event" rule builder. The set of supported events is fixed and lives in the `pushover_notifications` table. |
| The owner of the Pushover API token + user/group key for the whole install | A per-user setting. There is one Pushover endpoint per gateway; use a **Pushover Group Key** if you need to fan out to multiple admins. |

> **Dashboard alerts vs. notifications.** The yellow / red callout
> banners that appear under the top navbar (license expiring, mail
> queue backed up, certificate near expiry, etc.) are rendered by
> `inc/system_alerts.cfm` and are not configurable. They fire whenever
> their underlying condition is true, every page load, no matter
> who is logged in. This page is for **emailed / pushed** alerts when
> nobody is looking at the console. Both systems can fire on the same
> underlying event (a mail queue spike will show as a callout AND
> trigger a Pushover push) but they are independent code paths.

## Where the values live

| Setting | Table.column (or `system_settings` parameter) | Default |
|---|---|---|
| Pushover master toggle | `system_settings.pushover_enabled` | `0` |
| Pushover API token (Application Token) | `system_settings.pushover_api_token` | empty |
| Pushover user / group key | `system_settings.pushover_user_key` | empty |
| Per-notification enable flag | `pushover_notifications.enabled` | `2` (disabled) — `1` = enabled |
| Per-notification Ofelia binding | `pushover_notifications.ofelia_job_name` | seeded |
| Ofelia job active flag | `ofelia_jobs.active` | per-job |
| Admin destination address | `system_settings.admin_email` | `someone@otherdomain.tld` |
| Notification `From:` envelope | `system_settings.postmaster` | `postmaster@domain.tld` |

The last two rows live on the [System Settings](https://docs.deeztek.com/books/administrator-guide/page/system-settings)
page, not here. This page **reads** them but does not write them — set
those first, then come back here.

`pushover_notifications` is the canonical registry of every alert that
*can* be sent. Each row pairs a display name + description (shown in
the toggle list) with an Ofelia job name that drives the actual check.
The current seed has one row:

| `name` | `display_name` | `ofelia_job_name` | `category` |
|---|---|---|---|
| `mailqueue_check` | Mail Queue Health Check | `[job-exec "hermes-health-check-mailqueue"]` | `health` |

New notification types are added by inserting a row in this table
(plus the matching row in `ofelia_jobs`) — no code change to the page
itself is needed.

## Pushover Settings card

Sets the per-install Pushover endpoint. Three fields:

| Field | Validation in `save_pushover` |
|---|---|
| Pushover Notifications (Enabled / Disabled) | Must be `0` or `1` |
| API Token (Application Token) | Required when enabled; must match `^[a-zA-Z0-9]{30}$` |
| User / Group Key | Required when enabled; must match `^[a-zA-Z0-9]{30}$` |

Get the values from [pushover.net](https://pushover.net): create an
**Application** to mint the API Token, and either use your own User
Key or create a **Group** to fan out to multiple admins.
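
The format rules in the validation table can be checked offline before pasting values into the form. The sketch below is a Python approximation, not the `save_pushover` handler; the function name and the wording of the messages are invented for illustration.

```python
import re

PUSHOVER_KEY = re.compile(r"[a-zA-Z0-9]{30}")


def validate_pushover_form(enabled: str, api_token: str, user_key: str):
    """Return a list of problems; an empty list means the form would pass."""
    problems = []
    if enabled not in ("0", "1"):
        problems.append("enabled must be 0 or 1")
        return problems
    if enabled == "1":
        # fullmatch anchors both ends, playing the role of ^...$ in the table.
        if not PUSHOVER_KEY.fullmatch(api_token):
            problems.append("API token must be 30 alphanumeric characters")
        if not PUSHOVER_KEY.fullmatch(user_key):
            problems.append("user/group key must be 30 alphanumeric characters")
    return problems
```

Note that the token and key are only checked when the channel is being enabled, so a disabled form saves with empty credentials.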

After a successful save the form re-displays with a **Send Test
Notification** button that POSTs `action = test_pushover`. The test
sends a real Pushover message at priority `0` (default sound `pushover`)
and surfaces the HTTP status — anything non-200 reports the
`fileContent` as the error detail. Use this to confirm the token + key
pair is good before relying on the channel for real alerts.
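
For reference, a request of the same shape can be built in Python as sketched below. It only constructs the request and does not send it. The message text and the function name are placeholders, and the field names (`token`, `user`, `message`, `priority`, `sound`) are those of the public Pushover message API rather than something read from the Hermes handler.

```python
from urllib.parse import urlencode
from urllib.request import Request

PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def build_test_request(api_token: str, user_key: str) -> Request:
    """Build (but do not send) a priority-0 Pushover test message."""
    fields = {
        "token": api_token,
        "user": user_key,
        "message": "Hermes test notification",  # placeholder text
        "priority": "0",
        "sound": "pushover",
    }
    return Request(PUSHOVER_URL, data=urlencode(fields).encode("ascii"), method="POST")
```

Passing the result to `urllib.request.urlopen` would perform the send; a rejected token or key surfaces there as an HTTP error rather than a 200 response.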

## Save flow

```
POST action=save_pushover
   │
   ▼
 Validate pushover_enabled in {0,1}
 If enabled, validate token + key length + alphanumeric pattern
   │
   ▼
 UPDATE system_settings SET value=<x> WHERE parameter IN
   ('pushover_enabled','pushover_api_token','pushover_user_key')
   │
   ▼
 Sync ofelia_jobs.active per the rules below
   │
   ▼
 ofelia_generate_config.cfm  ──►  hermes_ofelia /config/config.ini
                                 (Ofelia re-reads on file change)
   │
   ▼
 cflocation back to view_system_notifications.cfm with session.m=1
```

The Ofelia sync rules are the moving part. The page wants two
conditions to BOTH be true before a notification job actually runs:

1. The **per-notification** toggle in the Available Notifications list
   is on (`pushover_notifications.enabled = 1`)
2. The **master** Pushover toggle is on (`system_settings.pushover_enabled = 1`)

| Master toggle | Per-notification toggle | `ofelia_jobs.active` becomes |
|---|---|---|
| `1` (on) | `1` (on) | `1` (job runs on schedule) |
| `1` (on) | `2` (off) | `2` (job dormant) |
| `0` (off) | any | `2` (all `type='pushover'` jobs dormant) |

So disabling the master Pushover toggle is a safe global kill switch
— every individual notification job stops scheduling. Re-enabling
restores only the per-notification rows that were previously on, not
all of them.
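
The truth table above reduces to a single condition, sketched here in Python. The function name is hypothetical; the numeric values follow the table (`0`/`1` for the master toggle, `1`/`2` for the per-notification and job flags).

```python
ON, OFF = 1, 2  # ofelia_jobs.active / pushover_notifications.enabled values


def ofelia_active(master_enabled: int, notification_enabled: int) -> int:
    """Return the ofelia_jobs.active value for one type='pushover' job.

    master_enabled uses 0/1 (system_settings.pushover_enabled);
    notification_enabled uses 1/2 (pushover_notifications.enabled).
    """
    if master_enabled == 1 and notification_enabled == ON:
        return ON
    return OFF
```

Because the per-notification flag is never overwritten by the master toggle, re-enabling the master switch recomputes the same result as before for each row.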

## Toggling a single notification

The Available Notifications card renders one row per
`pushover_notifications` entry, with a clickable toggle pill. Clicking
the pill POSTs `form_action = toggle_notification` with the
notification's row ID. The handler flips
`pushover_notifications.enabled` between `1` and `2`, applies the same
two-condition rule above to `ofelia_jobs.active`, regenerates the
Ofelia config, and redirects back with `session.m = 9`.

The card is **only rendered when the master Pushover toggle is on** —
if Pushover is off there is nothing to toggle per-event, so the list
is hidden.

## Ofelia is the scheduler

Hermes runs all of its recurring checks under [Ofelia](https://github.com/mcuadros/ofelia)
in the `hermes_ofelia` container. The `ofelia_jobs` table holds the
authoritative job definitions; `inc/ofelia_generate_config.cfm`
re-renders `config.ini` from the table and Ofelia hot-reloads. The
notification-side toggles on this page write `ofelia_jobs.active` for
rows where `type = 'pushover'`; other Ofelia jobs (DKIM cron,
certificate renewal, DMARC report processing, etc.) are managed by
their own pages or by [Scheduled Tasks](https://docs.deeztek.com/books/administrator-guide/page/scheduled-tasks).

The seeded mailqueue job runs every 15 minutes:

| Field | Value |
|---|---|
| `job_name` | `[job-exec "hermes-health-check-mailqueue"]` |
| `schedule` | `@every 15m` |
| `command` | `/usr/bin/curl --silent http://localhost:8888/schedule/health_check_mailqueue.cfm` |
| `container` | `hermes_commandbox` |
| `type` | `pushover` |
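
To make the table-to-file step concrete, the Python sketch below renders rows like the one above into Ofelia's INI layout. It is an approximation of what `inc/ofelia_generate_config.cfm` might emit, not a copy of it: the function name is hypothetical, and omitting dormant rows entirely is an assumption.

```python
def render_ofelia_config(jobs):
    """Render INI sections for rows whose active flag is 1."""
    sections = []
    for job in jobs:
        if job["active"] != 1:
            continue  # assumption: dormant jobs are simply left out
        sections.append(
            "\n".join(
                [
                    job["job_name"],  # already holds the [job-exec "..."] header
                    f"schedule = {job['schedule']}",
                    f"container = {job['container']}",
                    f"command = {job['command']}",
                ]
            )
        )
    return "\n\n".join(sections) + ("\n" if sections else "")
```

Since `job_name` already stores the full section header, no extra quoting or bracket handling is needed when writing the file.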

The CFM target (`schedule/health_check_mailqueue.cfm`) is the real
worker — Ofelia just curls it. The CFM reads
`system_settings.pushover_*`, runs `health_check_mailqueue.sh` to
count the Postfix queue, and on `count > 20` sends both a Pushover
warning AND an e-mail to `admin_email`. The Pushover path is wrapped
in `<cftry>` so a Pushover outage falls through to e-mail — both
channels fire for the same event by design, so the admin gets the
alert even if one channel is broken.
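
The control flow of the worker, as described, looks roughly like the Python sketch below. The sender callables are injected so the example stays self-contained; the function name, the message text, and the return value are illustrative assumptions.

```python
QUEUE_THRESHOLD = 20


def run_mailqueue_check(queue_count, send_push, send_mail):
    """Fire both channels when the queue exceeds the threshold.

    send_push and send_mail are caller-supplied callables taking a message.
    Returns the list of channels that were delivered to.
    """
    if queue_count <= QUEUE_THRESHOLD:
        return []
    message = f"Mail queue holds {queue_count} messages"
    delivered = []
    try:
        send_push(message)
        delivered.append("pushover")
    except Exception:
        # Mirrors the <cftry> described above: a Pushover failure must not
        # stop the e-mail from going out.
        pass
    send_mail(message)
    delivered.append("email")
    return delivered
```

The comparison is strict, so a queue of exactly 20 messages does not alert.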

## E-mail delivery path

Notification e-mails are sent via `<cfmail server="hermes_postfix_dkim"
port="10026">`. Port `10026` is Postfix's **post-Amavis re-injection**
listener, which means:

| Property | Behaviour |
|---|---|
| `From:` | `system_settings.postmaster` |
| `To:` | `system_settings.admin_email` |
| Content filtering | **Skipped.** 10026 is post-Amavis — these messages never go through SpamAssassin or ClamAV. |
| DKIM signing | Applied normally (OpenDKIM milter on the post-Amavis path) |
| Transport | Normal SMTP from `hermes_postfix_dkim` to the destination MX |

Skipping content filtering is by design — if Amavis itself is the
thing that's broken, the notification still has to reach the admin.
The trade-off is that a hostile actor with write access to the
gateway could in principle use this same path to inject mail; the
mitigation is that only the gateway's own CFML scheduled jobs target
this port (it is not exposed to the world).

## Adding a new notification type

The page is data-driven — adding a new alert requires no UI change.
Three artefacts need to land together in a schema-update script:

1. A new row in `pushover_notifications` (`name`, `display_name`,
   `description`, `ofelia_job_name`, `category = 'health' | 'security' | ...`)
2. A matching row in `ofelia_jobs` (`type = 'pushover'`,
   pointing at the worker URL)
3. The worker CFM under `config/hermes/var/www/html/schedule/` that
   does the actual check and `cfhttp`-POSTs Pushover + `cfmail`s the
   admin

The Available Notifications card will pick up the new row at the next
page load. The master/per-event toggle rules above apply automatically.

## Pro-vs-Community

System Notifications is a **Community-tier** page. The Pushover
integration, e-mail alerts, and toggle list all work on Community
installs. The Pro license check on the page header (the small comment
block in the include's CFML preamble) is part of the file-fingerprint
manifest — it doesn't gate functionality, only proves the file is
unmodified.

## Failure semantics

| What breaks | What happens |
|---|---|
| Pushover credentials wrong | Save succeeds (no live validation), but Test Notification returns non-200; `session.m = 8` surfaces the API response in the error banner |
| API Token / User Key format wrong (not 30 alphanumeric chars) | Save rejected (`session.m = 4` / `5`); no DB write |
| Master Pushover toggle off | All `type='pushover'` Ofelia jobs flipped to `active = 2`. Because Ofelia then stops curling `health_check_mailqueue.cfm`, the e-mail that worker sends stops as well; only the on-screen dashboard alerts remain |
| Ofelia config regen errors | The toggle save still commits to the DB; the `cftry` wrapper around `ofelia_generate_config.cfm` swallows the error. Re-save to retry. |
| `admin_email` empty | `cfmail` will accept an empty `to=` and produce an undeliverable message in the queue; set `admin_email` on [System Settings](https://docs.deeztek.com/books/administrator-guide/page/system-settings) first |
| `pushover.net` unreachable | `health_check_mailqueue.cfm` falls through to e-mail; admin still gets the alert |
| `hermes_postfix_dkim:10026` listener down | E-mail path fails too. The on-screen dashboard alerts (from `inc/system_alerts.cfm`) are the last line of defence — they need no transport. |

## Files and tables touched

| Path / table | Role |
|---|---|
| `system_settings` (rows `pushover_enabled`, `pushover_api_token`, `pushover_user_key`, `admin_email`, `postmaster`) | Channel config + addresses |
| `pushover_notifications` | Registry of every alert type the page can toggle |
| `ofelia_jobs` (`type = 'pushover'` rows) | Per-notification scheduler entries |
| `config/hermes/var/www/html/admin/2/view_system_notifications.cfm` | Page |
| `config/hermes/var/www/html/admin/2/inc/ofelia_generate_config.cfm` | Re-renders `hermes_ofelia /config/config.ini` from `ofelia_jobs` |
| `config/hermes/var/www/html/schedule/health_check_mailqueue.cfm` | The mail queue worker; reads Pushover creds, sends push + e-mail |
| `https://api.pushover.net/1/messages.json` | Outbound HTTPS endpoint for every Pushover send (Test + live alerts) |

## Related

- [System Settings](https://docs.deeztek.com/books/administrator-guide/page/system-settings) — sets `admin_email` and `postmaster` (the addresses this page delivers to / from)
- [System Status](https://docs.deeztek.com/books/administrator-guide/page/system-status) — the dashboard-callout side of the alert system (`inc/system_alerts.cfm`)
- [Scheduled Tasks](https://docs.deeztek.com/books/administrator-guide/page/scheduled-tasks) — admin view of the full `ofelia_jobs` table; the same Ofelia container drives every recurring task on the gateway
- [Mail Queue](https://docs.deeztek.com/books/administrator-guide/page/mail-queue) — the page the mail queue alert is asking you to look at when it fires

# System Settings

Admin path: **System > System Settings** (`view_system_settings.cfm`,
`inc/get_system_settings.cfm`, `inc/edit_system_settings.cfm`,
`inc/add_serial_number.cfm`, `inc/update_system_email_addresses.cfm`,
`inc/update_system_timezone.cfm`, `inc/update_system_update_check.cfm`,
`inc/update_telemetry.cfm`, `inc/invalidate_user_sessions.cfm`).

This is the **catch-all configuration page** for the gateway's global
identity. Three cards live here:

1. **General Settings** — postmaster + admin e-mail addresses, server
   timezone, daily update check, telemetry, and the Pro Edition serial
   number.
2. **Bot Protection (CAPTCHA)** — chooses the CAPTCHA provider used on
   public-facing forms (Forgot Password, etc.) and stores the per-provider
   keys.
3. **Session Management** — the "Force Logout All Users" red button.

Pairs with [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) (web-facing host /
TLS cert) and [Server Setup](https://docs.deeztek.com/books/administrator-guide/page/server-setup) (mail-side host identity)
— those define **where** Hermes lives; this page defines **who** runs
it and which administrative addresses receive its automated traffic.
The [System Notifications](https://docs.deeztek.com/books/administrator-guide/page/system-notifications) page reads
`admin_email` and `postmaster` from this page when it sends e-mail alerts.

## Configuration storage

Every setting on this page lives in the `system_settings` table
(`parameter` UNIQUE key, `value` VARCHAR(1024)). There are no
`parameters2` rows in scope — that table is reserved for module-scoped
config (`console`, `smtp`, etc.). The `parameter`/`value` shape is
deliberately flat key/value; the seed in `config/database/hermes_install.sql`
sets the defaults at install time and every edit on this page is a
straight `UPDATE … WHERE parameter = '<key>'`.

| Card | `system_settings.parameter` | Default | Notes |
|---|---|---|---|
| General | `postmaster` | `postmaster@domain.tld` | Must be a valid e-mail at a **domain that already exists in `domains`** |
| General | `admin_email` | `someone@otherdomain.tld` | Valid e-mail; no domain check |
| General | `timezone` | `America/New_York` | Validated against the `timezones` table |
| General | `serial` | empty | Pro Edition serial; set via the **Add Serial Number** modal |
| General | `users` | `9999` | Set to `9999` automatically when a serial activates (legacy seat-cap field, no longer enforced) |
| General | `daily_update_check` | `2` (Disable) | `1` = enable, `2` = disable; controls the auto-update poll |
| General | `telemetry` | `1` (Enable) | `1` = enable, `2` = disable; anonymised usage data |
| General | `accepted` | `1` | Legacy AGPL acceptance flag; not surfaced in the UI |
| Release stamp | `version_no` | `Docker` | Sentinel that marks this as a Docker install |
| Release stamp | `build_no` | `v260119` | Current release tag |
| CAPTCHA | `captcha_provider` | `builtin` | One of `builtin`, `recaptcha`, `hcaptcha`, `turnstile` |
| CAPTCHA | `recaptcha_site_key` / `recaptcha_secret_key` | empty | reCAPTCHA v2 |
| CAPTCHA | `hcaptcha_site_key` / `hcaptcha_secret_key` | empty | hCaptcha |
| CAPTCHA | `turnstile_site_key` / `turnstile_secret_key` | empty | Cloudflare Turnstile |

The release-stamp rows (`version_no = 'Docker'`, `build_no = v<YYMMDD>`)
are the canonical signal that this install is a Docker install rather
than a legacy non-Docker one. They are surfaced read-only in the
sidebar footer and in INSTALL_SUMMARY output; the schema-update
orchestrator and several upgrade-path code paths gate on them.

## General Settings — fields

### Postmaster E-mail Address (required)

Where bounce notifications, postmaster-class mail, and several internal
alerts originate from. `edit_system_settings.cfm` enforces three rules
in sequence:

1. Must not be empty (`session.m = 2`)
2. Must validate as a real e-mail string (`session.m = 3`)
3. The **domain part must already exist in the `domains` table**
   (`session.m = 4`)

The third rule is the one that surprises people. A bare
`postmaster@example.com` will not save unless `example.com` is already
a recognised mailbox or relay domain on this gateway. If you are
setting this up on a fresh install, add the domain first
(Mailboxes > Domains or Email Relay > Relay Domains) and come back.

The postmaster address is also the `From:` on every notification e-mail
the gateway sends (see [System Notifications § E-mail delivery path](https://docs.deeztek.com/books/administrator-guide/page/system-notifications#email-delivery-path)),
so it must be a deliverable address from the gateway's perspective —
which is exactly what the domain-existence check guarantees.
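
The three-rule sequence can be sketched in Python as below. This is not `edit_system_settings.cfm`: the e-mail pattern is deliberately loose, the function name is hypothetical, and returning `0` for success is a convention of this sketch only.

```python
import re

# Deliberately loose pattern for illustration; the real validator may differ.
EMAIL = re.compile(r"[^@\s]+@([^@\s]+\.[^@\s]+)")


def postmaster_alert_code(address: str, known_domains) -> int:
    """Return 0 when the address passes, else the session.m code of the first failed rule."""
    address = address.strip()
    if not address:
        return 2  # rule 1: must not be empty
    match = EMAIL.fullmatch(address)
    if match is None:
        return 3  # rule 2: must look like an e-mail address
    if match.group(1).lower() not in {d.lower() for d in known_domains}:
        return 4  # rule 3: domain part must already be known
    return 0
```

Because the checks run in order and stop at the first failure, a well-formed address at an unknown domain always reports code `4`, never `3`.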

### Admin E-mail Address (required)

The destination address for every automated alert and notification
e-mail. Validates as a normal e-mail string (`session.m = 5` empty,
`session.m = 6` malformed) but has no domain-existence check — it is
deliberately allowed to be an external address (your monitoring inbox,
a shared mailbox at a different provider) so the gateway can still
reach you when its own mail flow is broken.

The [System Notifications](https://docs.deeztek.com/books/administrator-guide/page/system-notifications) page reads this
value at every send.

### TimeZone (required)

Free-text autocomplete backed by `inc/gettimezones.cfm` against the
`timezones` table. The submitted value is checked back against the
table before save (`session.m = 7` empty, `session.m = 8` unknown).
Drives every timestamp that Lucee renders in the UI plus the schedule
times shown on Scheduled Tasks.

> **The Lucee server's own timezone is set elsewhere.** Changing this
> field rewrites the application's display timezone; it does **not**
> change the container's `TZ` env var or the OS clock. If the two
> diverge you will see UI timestamps in one zone and log files in
> another.

### Serial Number (read-only here)

Display-only on the General card. To set or change a serial, use the
**Add Serial Number** button at the top of the page — that opens a
modal that POSTs to `inc/add_serial_number.cfm`.

The activation flow (only triggered when a serial is entered, not on
every page load):

```
Modal POST  serial_number + tos
      │
      ▼
  add_serial_number.cfm
      │  validate non-empty / alphanumeric-only / TOS accepted
      │  generate per-request token (customtrans3)
      │  read host UUID via dmi_decode.cfm
      │  RSA-encrypt "<UUID>@<serial>" with /opt/hermes/ssl/public.pem
      ▼
  POST https://activate.hermesseg.io  (TCP/443, no SSL interception)
      │
      ▼
  Server returns "<hash>@<expires>" on success
                  or  INVALID / ALREADY_ACTIVATED / EXPIRED / REVOKED / ERROR
      │
      ▼
  On success: UPDATE system_settings SET value=<serial> WHERE parameter='serial'
              updateRetentionPolicy("VALID", expires, serial, hash)  (cache the result)
              session.license = "VALID"
```
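
The response handling at the end of that flow can be sketched as a small parser. The Python below is illustrative only: the function name is hypothetical, the format of `<expires>` is not specified here and is treated as an opaque string, and mapping unrecognised bodies to `ERROR` is an assumption.

```python
ERROR_TOKENS = {"INVALID", "ALREADY_ACTIVATED", "EXPIRED", "REVOKED", "ERROR"}


def parse_activation_response(body: str):
    """Return ("ok", hash, expires) on success or ("error", token, None)."""
    text = body.strip()
    if text in ERROR_TOKENS:
        return ("error", text, None)
    license_hash, sep, expires = text.partition("@")
    if not sep or not license_hash or not expires:
        # Anything else is treated as a generic failure in this sketch.
        return ("error", "ERROR", None)
    return ("ok", license_hash, expires)
```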

Every login after this point re-validates against
`https://validate.hermesseg.io` and falls back to the cached
`<hash>@<expires>` if the validation endpoint is unreachable (the
"offline mode" path). The page itself never re-runs validation — that
is the job of `inc/setsession.cfm` at login.

| `session.license` value visible after this page | Meaning |
|---|---|
| `VALID` + `session.edition = "Pro"` | Activation succeeded; Pro features available |
| `EXPIRED` | Cached license past expiry; renew at the vendor portal and re-login |
| `REVOKED` | Vendor revoked the serial; contact support |
| `INVALID` | Serial not recognised; double-check the value |
| `TAMPERED` | Pro template files don't match the signed fingerprint; reinstall the release |
| `PENDING_VALIDATION` | Cached license exists but no signed fingerprint baseline; reach the internet and re-login |
| `N/A` | No serial configured — Community Edition |

The two activation-server error paths (`session.m = 12` / `session.m = 13`)
both render the same root-cause hint: Hermes must reach
`activate.hermesseg.io` over HTTPS **without SSL interception**.
TLS-inspecting proxies break activation because they terminate the
TLS session and present their own re-signed certificate instead of the
activation server's.

> **By design.** Deleting the serial value from `system_settings`
> instantly demotes the install to Community Edition. The next login
> sees `session.license = N/A` and stops attempting remote validation.

### Daily Update Check / Telemetry

Two boolean (1 = enable, 2 = disable) selects. Daily Update Check is
the toggle for the auto-update poll that watches for new releases.
Telemetry is the anonymised usage-data feed; the in-card warning
callout links to the public privacy doc. Defaults are: Telemetry =
enabled, Daily Update Check = disabled.

### Save flow

**Save Settings** posts `action = edit`, which runs
`edit_system_settings.cfm` as a strict 5-step sequence
(postmaster → admin_email → timezone → update_check → telemetry).
Each validation failure short-circuits with `cflocation` back to
`view_system_settings.cfm` and `session.m` set to the matching alert
code — no partial state lands. On the final step, four small update
includes write to `system_settings` one parameter at a time
(`update_system_email_addresses.cfm`, `update_system_timezone.cfm`,
`update_system_update_check.cfm`, `update_telemetry.cfm`).

## Bot Protection (CAPTCHA)

CAPTCHA gates the public-facing forms that an unauthenticated visitor
can hit — primarily the Forgot Password flow on `/user-auth/` and
`/admin-auth/`. The provider is chosen here; the form templates check
the same `system_settings` keys at render and validation time. Four
providers are supported:

| Provider | What it needs |
|---|---|
| **Built-in (math)** | No keys. Renders a "what is 7 + 3?" style challenge. Default; works offline. |
| **Google reCAPTCHA v2** | Site key + secret key. Pick the *"I'm not a robot" Checkbox* flavour at the reCAPTCHA admin. |
| **hCaptcha** | Site key + secret key. Privacy-focused reCAPTCHA alternative. |
| **Cloudflare Turnstile** | Site key + secret key. Usually invisible — no user interaction in the happy path. |

`save_captcha` POSTs validate that the provider is one of the four
allowed values and that the matching pair of keys is non-empty when a
non-builtin provider is selected. All seven values are written on
every save regardless of which provider is active — this lets the
admin switch providers back and forth without re-entering keys.
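
A minimal Python sketch of those two rules, using the `system_settings` keys from the storage table. The function names and error strings are hypothetical and the real `save_captcha` handler may be structured differently.

```python
PROVIDERS = ("builtin", "recaptcha", "hcaptcha", "turnstile")


def validate_captcha(form: dict):
    """Return an error string, or None when the submission is acceptable."""
    provider = form.get("captcha_provider", "")
    if provider not in PROVIDERS:
        return "unknown provider"
    if provider != "builtin":
        # Only the selected provider's key pair has to be filled in.
        for suffix in ("site_key", "secret_key"):
            if not form.get(f"{provider}_{suffix}", "").strip():
                return f"{provider}_{suffix} is required"
    return None


def values_to_store(form: dict) -> dict:
    """All seven settings are written on every save, active provider or not."""
    keys = ["captcha_provider"] + [
        f"{p}_{s}" for p in PROVIDERS[1:] for s in ("site_key", "secret_key")
    ]
    return {key: form.get(key, "") for key in keys}
```

Storing the inactive providers' keys alongside the active one is what makes switching back later a one-field change.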

> **Failure mode.** A misconfigured external provider (bad keys,
> domain mismatch) breaks Forgot Password silently for the end user
> — the form renders, the CAPTCHA widget loads, but the server-side
> `siteverify` call fails and the request is rejected. Test the
> provider end-to-end on `/user-auth/forgot_password.cfm` after every
> change.

## Session Management — Force Logout All Users

The red button at the bottom of the page flushes the **entire Authelia
session store** in one call. Every user (admin, mailbox, relay
recipient, and the operator clicking the button) is redirected to the
login page on their next request. There is no per-user logout on this
page. Individual sessions are invalidated automatically when a user's
password is changed or their account is deactivated or deleted;
Authelia's session cookie is encrypted, so only Authelia can
invalidate one. The bulk-flush button is the only way to forcibly log
people out from the admin UI.

Use this when:

- A shared admin credential has been rotated and you want every
  inherited session gone
- You suspect a compromised session token
- You have just changed [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) and
  want every old hostname-scoped cookie cleared at once

The action runs `inc/invalidate_user_sessions.cfm` with
`targetSessionUser = "*"` and surfaces `session.m = 36` on return.

## Edition badge — Pro vs Community

Although this page **stores** the serial number, the Pro / Community
edition badge that appears in the sidebar header and in
[System Status](https://docs.deeztek.com/books/administrator-guide/page/system-status) is rendered from `session.edition` /
`session.license` — both of which are set during login by
`inc/setsession.cfm`. Changing the serial here updates the row in
`system_settings`; the badge updates on the **next** login. Use
**Force Logout All Users** above if you need the change to be visible
to other admins immediately.

## Files and tables touched

| Path / table | Role |
|---|---|
| `system_settings` | Every setting on this page (key/value rows) |
| `domains` | Read at postmaster save to validate the domain part |
| `timezones` | Read at timezone autocomplete and save |
| `config/hermes/var/www/html/admin/2/view_system_settings.cfm` | Page |
| `config/hermes/var/www/html/admin/2/inc/edit_system_settings.cfm` | General-card save handler |
| `config/hermes/var/www/html/admin/2/inc/add_serial_number.cfm` | Serial activation against `activate.hermesseg.io` |
| `config/hermes/var/www/html/admin/2/inc/invalidate_user_sessions.cfm` | Force-logout call into Authelia |
| `config/hermes/var/www/html/admin/2/inc/setsession.cfm` | Reads serial + edition at login; this page's read-only Pro Edition state comes from here |
| `https://activate.hermesseg.io` | One-time serial activation endpoint |
| `https://validate.hermesseg.io` | Per-login Pro Edition re-validation endpoint |

## Related

- [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) — web console host + TLS cert
- [Server Setup](https://docs.deeztek.com/books/administrator-guide/page/server-setup) — mail-side host identity (Postfix `myhostname` / `myorigin`)
- [System Notifications](https://docs.deeztek.com/books/administrator-guide/page/system-notifications) — consumes `admin_email` + `postmaster` from this page; also the home of Pushover settings
- [System Status](https://docs.deeztek.com/books/administrator-guide/page/system-status) — surfaces the same Pro / Community badge plus the dashboard-alert stream
- [System Update](https://docs.deeztek.com/books/administrator-guide/page/system-update) — when Daily Update Check is enabled, it is this page that drives the poll
- [Password Resets](https://docs.deeztek.com/books/administrator-guide/page/password-resets) — the public form that CAPTCHA actually protects

# System Status


Admin path: **System > System Status** (`index.cfm` —
the post-login landing page; the sidebar entry points here, not to a
separate `view_system_status.cfm`). Supporting includes:
`inc/system_alerts.cfm`, `inc/check_system_update.cfm`,
`inc/get_system_version_build.cfm`, `inc/get_system_uptime.cfm`,
`inc/get_system_reboot_required.cfm`, `inc/get_system_resources.cfm`,
`inc/get_system_cpu_usage.cfm`, `inc/get_system_memory_usage.cfm`,
`inc/get_system_{root,data,vmail,nextcloud}_filesystem_usage.cfm`,
`api/get_system_resources.cfm`, `api/get_message_stats.cfm`.

System Status is the operator's at-a-glance picture of the running
gateway. It is the **default page after login** — every Authelia
post-login redirect lands here — and the union of every "is anything
broken right now" signal Hermes computes: license state, update
availability, host OS reboot pending, fresh-install onboarding nudges,
container resource usage, and live mail-processing volume.

The page is **graceful-degradation by design**: every widget catches
its own errors, every external call has a fallback, and a single
failed query (a missing setting row, an unreachable container, a
malformed log file) does not blank the dashboard. If you log in to
Hermes and the page renders at all, the page is doing its job.

## Page layout

```
+--------------------------------------------------------------+
| Top navbar    [license / fresh-install / update badges]      |
+--------------------------------------------------------------+
| Alert callouts (priority <= 5)                               |  rendered by top_navbar.cfm
|   * Templates Modified  (priority 1)                         |  from request.systemAlerts
|   * License Revoked     (priority 2)                         |  (populated in
|   * Placeholder hostname (priority 2)                        |   inc/system_alerts.cfm)
|   * Invalid / Pending / Grace-period expired (priority 3)    |
|   * Self-signed cert    (priority 3)                         |
|   * License Expired     (priority 4)                         |
|   * Offline Mode        (priority 5)                         |
+--------------------------------------------------------------+
| Welcome <user>   Last login: <timestamp>                     |
+--------------------------------------------------------------+
| System Info card                                             |
|   Version | Build | Edition | Uptime | Console IP/FQDN      |
|   | License Status | OS Updates | Hermes Update              |
+--------------------------------------------------------------+
| Messages Processed card                                      |
|   Donut chart + counts (Clean/Spam/Virus/Banned/             |
|   Bad Header/Other) over 15m / 1h / 8h / 12h / 24h           |
+--------------------------------------------------------------+
| System Resources card                                        |
|   CPU | Memory | Root FS | Data FS | Archive FS | Vmail FS   |
|   | Nextcloud FS                                             |
|   (seven progress rings, auto-refresh every 10s)             |
+--------------------------------------------------------------+
```

The two cards that show live data (Messages Processed and System
Resources) poll their own JSON endpoints in the background; the rest
of the page is rendered server-side once per load.

## Self-healing on first load

`index.cfm` is the **bootstrap convergence point** for several
secrets that the rest of the app depends on. If any are missing on
first load (fresh install, after a credential rotation, after a key
file deletion), they are generated in-place before the dashboard
renders:

| Missing artifact | Auto-generated by |
|---|---|
| `/opt/hermes/keys/hermes.key` (AES-256 application key) | `inc/generate_hermes_key.cfm` |
| `encryption_settings.user.serverSecret` (Ciphermail) | `inc/generate_ciphermail_server_secret.cfm` |
| `encryption_settings.user.clientSecret` (Ciphermail) | `inc/generate_ciphermail_client_secret.cfm` |
| `encryption_settings.user.systemMailSecret` (Ciphermail) | `inc/generate_ciphermail_mail_secret.cfm` |
| `/opt/hermes/scripts/container_ips.txt` (Fail2ban) | `inc/generate_container_ips.cfm` |

Each generator is idempotent — it checks "is the file/row empty?"
before writing, so subsequent loads are no-ops. This is why the very
first dashboard render on a fresh install is slightly slower than
subsequent loads.
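
The "check, then write only if empty" pattern can be sketched like this. It is a Python illustration under stated assumptions, not the CFML generators: `ensure_key_file` is a hypothetical helper, and writing 32 raw random bytes is only a guess at what an AES-256 key file might contain.

```python
import os
import secrets
import tempfile


def ensure_key_file(path: str, size: int = 32) -> bool:
    """Create a random key file only if it is missing or empty.

    Returns True when a key was written, False when the call was a no-op.
    """
    if os.path.isfile(path) and os.path.getsize(path) > 0:
        return False  # already provisioned; leave it untouched
    with open(path, "wb") as handle:
        handle.write(secrets.token_bytes(size))  # 32 bytes = 256 bits
    return True


# Usage: the first call generates, the second is a no-op.
with tempfile.TemporaryDirectory() as tmp:
    key_path = os.path.join(tmp, "hermes.key")
    first = ensure_key_file(key_path)
    with open(key_path, "rb") as handle:
        original = handle.read()
    second = ensure_key_file(key_path)
    with open(key_path, "rb") as handle:
        unchanged = handle.read() == original
```

Because the second call returns without writing, an existing key is never rotated by accident on a later page load.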

## System Info card

The columns and what they mean:

| Column | Source | Notes |
|---|---|---|
| **Version** | `system_settings.version_no` | Always `Docker` in the Docker era. Hyphenated legacy values (e.g. `2024-08`) belong to bare-metal installs that have not yet been migrated. |
| **Build** | `system_settings.build_no` | Current release tag (`vYYMMDD`). This is the single value the update orchestrator compares against to decide what to apply. See [System Update](https://docs.deeztek.com/books/administrator-guide/page/system-update). |
| **Edition** | `session.edition` | `Community` or `Pro`. Community shows an `ENTER SERIAL` link to [System Settings](https://docs.deeztek.com/books/administrator-guide/page/system-settings); Pro shows the edition with state suffixes (`Pro (Templates Modified)`, `Pro (Validation Required)`) when the license is in a non-VALID state. |
| **Uptime** | `/opt/hermes/scripts/get_uptime.sh` via `cfexecute` | Host uptime in days, not container uptime. |
| **Console IP or FQDN** | `parameters2.console.host` | The value bound on [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings). |
| **License Status** | `session.license` + `session.licenseexpires` | One of `VALID`, `EXPIRED`, `REVOKED`, `INVALID`, `TAMPERED`, `PENDING_VALIDATION`, `VIOLATION`, `N/A`. Community shows `N/A`. The status text resolves through the same license-evaluation logic documented in `inc/setsession.cfm`. |
| **OS Updates** | `/var/run/reboot-required` (file exists?) | `REBOOT REQUIRED` when a kernel- or glibc-class update on the host needs a reboot. Hermes does not reboot the host; the admin does, via SSH. |
| **Hermes Update** | `inc/check_system_update.cfm` (reads `/opt/hermes/updates/check_system_update.txt`) | `UPDATE BUILD vYYMMDD FOUND` (clickable, opens GitHub release notes modal), `LATEST VERSION`, `UPDATE CHECK PENDING`, or `UPDATE CHECK UNAVAILABLE`. The cache file is written by the daily `schedule/check_for_update.cfm` job; see [System Update § Daily update check](https://docs.deeztek.com/books/administrator-guide/page/system-update#daily-update-check). |

The Hermes Update cell never makes a network call. It reads the cache
file written by the Ofelia-scheduled CFML job, so the page renders
fast and works offline.

> **By design.** The dashboard never calls the GitHub Releases API at
> page-render time. All network IO for "is there an update" happens
> in the once-a-day Ofelia job. If the cache file is missing (first
> load after install, before the first scheduled run) you see
> `UPDATE CHECK PENDING`, never a hang.

## Release-notes modal

Clicking `UPDATE BUILD vYYMMDD FOUND` opens a modal that fetches the
release body from the GitHub Releases API:

```
GET https://api.github.com/repos/deeztek/Hermes-Secure-Email-Gateway/releases/tags/<vYYMMDD>
```

The response's `body` field (Markdown) is converted client-side to
HTML and rendered in the modal. If the fetch fails (rate limit,
offline, release deleted) the modal degrades to a "View Release on
GitHub" button. The GitHub release page is canonical; the modal is a
convenience.

The tag passed to the URL is the build number **as-is**. Earlier
revisions of this code prepended `build-` to match the legacy
update-server file-name convention; that prefix was removed during
the [#218](https://github.com/deeztek/Hermes-Secure-Email-Gateway/issues/218)
release-engineering pivot because GitHub release tags do not carry it.

## Alert callouts

`inc/system_alerts.cfm` builds a priority-ordered array of alerts
on each page load. The array is sorted ascending by priority (lower
number = more urgent), then split:

| Priority | Surface |
|---|---|
| 1–5 | Full-width callout banner under the navbar, rendered by `top_navbar.cfm` |
| 6+  | Compact badge next to the user/edition pill in the navbar |

Every Hermes page that includes `top_navbar.cfm` participates — the
callouts are not exclusive to System Status — but System Status is
where an admin is most likely to be looking when they appear.
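
The sort-then-split step can be sketched as below. This is a Python illustration rather than the CFML in `inc/system_alerts.cfm`; the `Alert` type and `split_alerts` function are hypothetical names, and the sample priorities are taken from the alert tables on this page.

```python
from typing import NamedTuple


class Alert(NamedTuple):
    title: str
    priority: int


CALLOUT_MAX_PRIORITY = 5  # priorities 1-5 become banners, 6+ become badges


def split_alerts(alerts):
    """Sort ascending by priority, then split into (callouts, badges)."""
    ordered = sorted(alerts, key=lambda alert: alert.priority)
    callouts = [a for a in ordered if a.priority <= CALLOUT_MAX_PRIORITY]
    badges = [a for a in ordered if a.priority > CALLOUT_MAX_PRIORITY]
    return callouts, badges


callouts, badges = split_alerts([
    Alert("Expires in 12 days", 10),
    Alert("Self-signed cert", 3),
    Alert("Templates Modified", 1),
])
```

Here the two urgent alerts end up as callouts in priority order, and the expiry reminder is demoted to a navbar badge.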

### License-state alerts

| Alert | Priority | Trigger |
|---|---|---|
| Templates Modified | 1 | `session.license = TAMPERED` |
| License Revoked | 2 | `session.license = REVOKED` |
| Invalid License | 3 | `session.license = INVALID` |
| Validation Required | 3 | `session.license = PENDING_VALIDATION` (no offline baseline yet) |
| Grace Period Expired | 3 | `session.license = GRACE_PERIOD_EXPIRED` |
| License Expired | 4 | `session.license = EXPIRED` |
| Offline Mode | 5 | `VALID` + `validationMode = cached`; includes remaining grace-period day count |
| Expires in <N> days | 10 | `VALID` + `licensevaliddays <= 30` (badge only, never a callout) |

### Fresh-install onboarding nudges

Two universal nudges fire when the gateway is still using seed
defaults. Both apply to every install regardless of topology
(relay-only, mail-server-only, hybrid) and they live here precisely
**because** they are topology-agnostic.

| Nudge | Priority | Trigger | Fix link |
|---|---|---|---|
| Placeholder hostname | 2 | `parameters.myhostname = 'hermes.domain.tld'` (Postfix seed) OR `parameters2.console.host = 'smtp.domain.tld'` (console seed) | [Server Setup](https://docs.deeztek.com/books/administrator-guide/page/server-setup) |
| Self-signed cert | 3 | Every row in `system_certificates` is flagged `system = 1` (only bootstrap cert exists) | [System Certificates](https://docs.deeztek.com/books/administrator-guide/page/system-certificates) |

Earlier iterations of this list included three more topology-specific
nudges (no relay domains, no relay networks, no recipients-or-mailboxes).
They were removed because they fired noisily on installs that legitimately
don't have those things — a relay-only install has zero mailboxes and
that is the correct configuration. Topology-specific onboarding guidance
lives in [`docs/install/get-started-docker.md`](https://docs.deeztek.com/books/installation-reference/page/get-started-docker)
instead, where it is read deliberately rather than pushed at the
admin on every page load.

### Other alerts (placeholders)

`system_alerts.cfm` includes guarded blocks for **Reboot Required**
(when `session.rebootRequired = true`) and **Cert Expiring** (when
`session.certExpiringSoon = true`). Neither flag is currently
populated by any code path — they are reserved for future widgets
that compute the values and stash them in the session.

## Messages Processed card

Polls `api/get_message_stats.cfm` on initial load and every 60s. The
period selector reloads with the new window value but does not
otherwise change the polling cadence.

| Bucket | Color |
|---|---|
| Clean | Green (`#28a745`) |
| Spam | Yellow (`#ffc107`) |
| Virus | Red (`#dc3545`) |
| Banned | Gray (`#6c757d`) |
| Bad Header | Dark (`#343a40`) |
| Other | Cyan (`#17a2b8`) |

The endpoint reads from the `msgs` table (Amavis-fed; covered in more
detail under [System Logs](https://docs.deeztek.com/books/administrator-guide/page/system-logs)) filtered to the selected
window. A 10,000-row hard cap is applied to keep page-load fast on
busy installs; when the cap is hit, the total is suffixed with `+`
and a small "Showing most recent 10,000 messages" note appears under
the breakdown.
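
A sketch of the cap-and-suffix behaviour, in Python for illustration. It assumes the query is limited to the cap, so that a result of exactly 10,000 rows is treated as "cap hit"; how the real endpoint detects the cap is not documented here, and `format_total` is a hypothetical name.

```python
ROW_CAP = 10_000  # hard cap described above


def format_total(row_count: int, cap: int = ROW_CAP) -> str:
    """Render the total, suffixing '+' when the cap was reached."""
    if row_count >= cap:
        return "%d+" % cap
    return str(row_count)
```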

## System Resources card

Seven progress rings, auto-refreshing every 10s via
`api/get_system_resources.cfm`:

| Ring | Source |
|---|---|
| CPU Utilization % | `/opt/hermes/scripts/get_cpu_usage.sh` |
| Memory Utilization % | `/opt/hermes/scripts/get_memory_usage.sh` |
| Root FileSystem % | `df` on `/` (host root) |
| Data FileSystem % | `df` on the Data tier mount (see [Storage Topology](https://docs.deeztek.com/books/installation-reference/page/storage-topology-5-tiers)) |
| Archive FileSystem % | `df` on the Archive tier mount (#260; Amavis quarantine) |
| Vmail FileSystem % | `df` on the Vmail tier mount |
| Nextcloud FileSystem % | `df` on the Nextcloud tier mount |

Each ring color-codes by threshold (`get_system_*_usage.cfm` returns
a hex color alongside the value). The rings degrade independently —
a missing tier mount renders that ring at 0 rather than failing the
whole card.
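
The independent-degradation behaviour can be sketched as follows. This is a Python approximation, not the `get_system_*_usage.cfm` code: the 70% and 90% thresholds are hypothetical (the hex colors are borrowed from the Messages Processed palette), and `shutil.disk_usage` stands in for the `df` call.

```python
import shutil


def ring_color(percent: int) -> str:
    """Pick a color by threshold; the 70/90 cut-offs are illustrative only."""
    if percent >= 90:
        return "#dc3545"  # red
    if percent >= 70:
        return "#ffc107"  # yellow
    return "#28a745"  # green


def usage_ring(mount_point: str) -> dict:
    """Return percent used and a color for one ring; degrade to 0 on failure."""
    try:
        usage = shutil.disk_usage(mount_point)
        percent = round(usage.used / usage.total * 100) if usage.total else 0
    except OSError:
        percent = 0  # missing or unreadable mount: show an empty ring
    return {"percent": percent, "color": ring_color(percent)}
```

Each ring is computed by its own call, so one failing mount cannot take the other rings down with it.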

Tiers that share a host path (a smaller install where Archive, Vmail,
and Nextcloud are pinned to the same disk as Data) will show the same
percentage on multiple rings. That is the correct behavior; the
underlying `df` reading is the same.

## What is NOT on this page

System Status is intentionally a "snapshot" page, not an
investigation tool. It surfaces alerts and current resource state. It
does not surface:

| Want to see | Go here instead |
|---|---|
| Mail queue contents / deferred messages | [Mail Queue](https://docs.deeztek.com/books/administrator-guide/page/mail-queue) |
| Per-message processing history | [System Logs](https://docs.deeztek.com/books/administrator-guide/page/system-logs) |
| Detailed cert / SAN status | [System Certificates](https://docs.deeztek.com/books/administrator-guide/page/system-certificates), [SMTP TLS Settings](https://docs.deeztek.com/books/administrator-guide/page/smtp-tls-settings) |
| Container health (`docker ps` output, restart counts) | Host shell — Hermes does not surface raw Docker state in the web UI |
| Scheduled-job last-run / next-run | [Scheduled Tasks](https://docs.deeztek.com/books/administrator-guide/page/scheduled-tasks) |
| Fail2ban bans in effect | [Intrusion Prevention](https://docs.deeztek.com/books/administrator-guide/page/ips) |
| Past update history | The git log on the host (`git log --oneline -- updates/`) |

## Failure semantics

| What breaks | What happens |
|---|---|
| `/opt/hermes/updates/check_system_update.txt` does not exist | `hermesupdate = "UPDATE CHECK PENDING"`; cell renders cleanly |
| Ofelia job has been failing for days (cache stale or shows old build) | Page still renders; the **Hermes Update** cell reflects whatever the last successful run wrote |
| GitHub API rate-limited or unreachable when an admin clicks the release-notes link | Modal falls back to a "View Release on GitHub" button |
| `df` on a tier mount fails | That ring renders at 0 with default color; other rings render normally |
| `get_uptime.sh` exits non-zero | Page short-circuits to the error template; uptime is treated as critical because its absence usually means a broken `hermes_commandbox` container |
| `system_settings.build_no` / `version_no` row missing | Empty value in the matching cell; license cells will display `N/A` |
| `inc/generate_*` first-load generator fails | Logged; affected feature degrades downstream (Ciphermail mail crypto disabled, etc.) — the dashboard itself still renders |

## Files and containers touched

| Path | Owner | Role |
|---|---|---|
| `config/hermes/var/www/html/admin/2/index.cfm` | `hermes_commandbox` | Page |
| `config/hermes/var/www/html/admin/2/inc/system_alerts.cfm` | `hermes_commandbox` | Alert array builder (license + nudges + future widgets) |
| `config/hermes/var/www/html/admin/2/inc/check_system_update.cfm` | `hermes_commandbox` | Cache-file reader (Docker path) |
| `config/hermes/var/www/html/admin/2/inc/get_system_*.cfm` | `hermes_commandbox` | Per-widget data fetchers |
| `config/hermes/var/www/html/admin/2/api/get_system_resources.cfm` | `hermes_commandbox` | JSON endpoint for progress-ring auto-refresh |
| `config/hermes/var/www/html/admin/2/api/get_message_stats.cfm` | `hermes_commandbox` | JSON endpoint for message-stats card |
| `/opt/hermes/scripts/get_uptime.sh`, `get_cpu_usage.sh`, `get_memory_usage.sh` | `hermes_commandbox` | Shell helpers invoked via `cfexecute` |
| `/opt/hermes/updates/check_system_update.txt` | `hermes_commandbox` | Cache file written by `schedule/check_for_update.cfm`; read here |
| `/var/run/reboot-required` | host filesystem (mounted into `hermes_commandbox`) | Ubuntu's standard "kernel upgrade pending" sentinel |
| `/opt/hermes/keys/hermes.key` | `hermes_commandbox` | Created on first load if missing |
| `encryption_settings` table | `hermes_db_server` (`hermes` DB) | Ciphermail secrets populated on first load if empty |

## Related

- [System Update](https://docs.deeztek.com/books/administrator-guide/page/system-update) — the page System Status's
  "Hermes Update" cell links to; covers the daily update check and
  the orchestrator that consumes the cache file
- [System Settings](https://docs.deeztek.com/books/administrator-guide/page/system-settings) — where the Pro serial number
  is entered to lift Community to Pro
- [System Certificates](https://docs.deeztek.com/books/administrator-guide/page/system-certificates) — the page the
  "Self-signed cert" nudge links to
- [Server Setup](https://docs.deeztek.com/books/administrator-guide/page/server-setup) — the page the "Placeholder
  hostname" nudge links to
- [System Logs](https://docs.deeztek.com/books/administrator-guide/page/system-logs) — drill-down for the message-volume
  numbers shown in the Messages Processed card
- [Storage Topology](https://docs.deeztek.com/books/installation-reference/page/storage-topology-5-tiers) — explains
  the five tiers the resource-usage rings reflect
- [Release and Update Methodology](https://docs.deeztek.com/books/installation-reference/page/release-and-update-methodology)
  — methodology behind the version stamp this page surfaces

# System Update


Admin path: **System > System Update**
(`view_system_updates.cfm`). Update infrastructure:
`config/hermes/var/www/html/schedule/check_for_update.cfm` (daily
GitHub Releases poll), `config/hermes/var/www/html/admin/2/inc/check_system_update.cfm`
(dashboard cache-file reader), `scripts/system_update_docker.sh`
(the update orchestrator), `config/ofelia/config.ini` (the cron
schedule that triggers the daily check).

This page tells an admin **whether a new Hermes release is available
and how to apply it**. It is intentionally thin: every detail of how
upgrades actually work — the artifact taxonomy, the orchestrator's
five phases, the idempotency rules, the release-cut procedure — lives
in [Release and Update Methodology](https://docs.deeztek.com/books/installation-reference/page/release-and-update-methodology),
which is the canonical reference. This page documents the **admin
surface** that sits on top of that methodology.

> **Update is currently CLI-driven.** The page itself displays a
> notice that points at the docs and the [release-and-update
> methodology](https://docs.deeztek.com/books/installation-reference/page/release-and-update-methodology);
> the actual upgrade is run on the Docker host via SSH using
> [`scripts/system_update_docker.sh`](https://docs.deeztek.com/books/installation-reference/page/release-and-update-methodology#the-update-orchestrator-scriptssystem_update_dockersh).
> A future revision will move the launch button into the page itself;
> until it does, the CLI is the supported path.

## How an admin knows there is an update

Three independent surfaces converge on the same answer:

```
                           +-------------------------------+
                           | GitHub Releases API           |
                           | repos/deeztek/                |
                           |   Hermes-Secure-Email-Gateway |
                           |   /releases/latest            |
                           +-------------------------------+
                                        ^
                                        | daily 04:30 UTC
                                        |
              +-------------------------+--------------------------+
              |  schedule/check_for_update.cfm                      |
              |    - polls /releases/latest                         |
              |    - compares tag_name to system_settings.build_no  |
              |    - writes /opt/hermes/updates/                    |
              |        check_system_update.txt                      |
              |    - emails admin_email when UPDATEFOUND            |
              +-------------------------+--------------------------+
                                        |
            +---------------------------+---------------------------+
            |                           |                           |
            v                           v                           v
    +----------------+         +-----------------+         +----------------+
    | Dashboard cell |         | System Update   |         | Email to       |
    | (Hermes Update)|         | page            |         | admin_email    |
    | reads cache    |         | (today: docs    |         | (one-shot per  |
    | every load     |         |  notice; v2:    |         |  release-found |
    |                |         |  Run Update btn)|         |  detection)    |
    +----------------+         +-----------------+         +----------------+
```

All three are downstream of one cached value — the
`/opt/hermes/updates/check_system_update.txt` file. The dashboard
does not call GitHub on page load; the email is not sent on page
load; only the once-a-day Ofelia job actually hits the API.

## Daily update check

`config/ofelia/config.ini` schedules a single `job-exec` against the
`hermes_commandbox` container:

```
[job-exec "hermes-update-check"]
schedule =  0 30 04 * * *
container = hermes_commandbox
command = /opt/hermes/schedule/update_check.sh
```

The shell wrapper resolves to a `curl --silent
http://localhost:8888/schedule/check_for_update.cfm` against the
internal Lucee port — no auth dance, no X-Token header, same
convention as `hermes-message-cleanup`, `hermes-quarantine-notify`,
and every other Hermes scheduled job. The CFML target does the
actual work:

1. Read current `build_no` from `system_settings`.
2. `GET https://api.github.com/repos/deeztek/Hermes-Secure-Email-Gateway/releases/latest`
   with a 30s timeout.
3. On HTTP 200, parse `tag_name` and compare to the local build via
   simple string comparison (`vYYMMDD` sorts correctly as a string
   because the format is fixed-width calendar versioning — see
   [Release and Update Methodology § Calendar versioning](https://docs.deeztek.com/books/installation-reference/page/release-and-update-methodology#calendar-versioning)).
4. Write `/opt/hermes/updates/check_system_update.txt` regardless of
   outcome — the dashboard reader needs **something** to display.
5. On `UPDATEFOUND`, send one notification email to `admin_email`.
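
The comparison in step 3 can be illustrated in a few lines of Python. The format guard is an addition for the sketch (the source only describes a simple string comparison), and `update_available` is a hypothetical name.

```python
import re

TAG_PATTERN = re.compile(r"v\d{6}")  # vYYMMDD, fixed width


def update_available(local_build: str, latest_tag: str) -> bool:
    """Plain string comparison; valid only because both tags are fixed-width."""
    for tag in (local_build, latest_tag):
        if not TAG_PATTERN.fullmatch(tag):
            raise ValueError("not a vYYMMDD tag: %r" % tag)
    return latest_tag > local_build
```

With fixed-width tags, lexicographic order and chronological order coincide, which is why no date parsing is needed.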

### Cache file format

The file is a single `@`-delimited line. The format is preserved from
the pre-#218 legacy update server (`updates.deeztek.com`) for
backward-compat with the dashboard reader; for Docker installs,
several fields are unused.

| Position | Field | Docker meaning |
|---|---|---|
| 1 | status | `SUCCESS` (update available), `NOUPDATE`, or `UPDATE CHECK UNAVAILABLE` |
| 2 | build | The new tag (e.g. `v260601`) on `SUCCESS`, current tag on `NOUPDATE` |
| 3 | released | `yyyy-mm-dd` from `published_at` |
| 4 | filename | _empty_ (was tarball name on legacy server) |
| 5 | release_notes_url | GitHub `html_url` for the release |
| 6 | release_notes_file | _empty_ (was per-release HTML file on legacy server) |
| 7 | mysqlroot | _empty_ (was installer credential on legacy server) |
| 8 | dev | `daily_update_check` value from `system_settings` |
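
A sketch of how the eight positions could be read, in Python for illustration. The field names follow the table above; the sample line, including its URL and the `1` in the last field, is invented, and the padding behaviour for short lines is an assumption rather than documented reader behaviour.

```python
from typing import NamedTuple


class UpdateCache(NamedTuple):
    status: str
    build: str
    released: str
    filename: str
    release_notes_url: str
    release_notes_file: str
    mysqlroot: str
    dev: str


def parse_cache_line(line: str) -> UpdateCache:
    """Split one '@'-delimited cache line into its eight named fields."""
    parts = line.rstrip("\r\n").split("@")
    # Pad short lines and ignore extras so a malformed file cannot raise.
    parts = (parts + [""] * 8)[:8]
    return UpdateCache(*parts)


# Invented sample line; fields 4, 6 and 7 are empty as on Docker installs.
sample = "SUCCESS@v260601@2026-06-01@@https://example.invalid/releases/v260601@@@1"
cache = parse_cache_line(sample)
```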

### Email notification

The notification is **once per release**: re-runs of the check
against the same latest tag do not re-send. The job re-detects
`UPDATEFOUND` every day, but the email path is gated on the cached
comparison; if the dashboard cell already shows the update as found
(`UPDATE BUILD vYYMMDD FOUND`), the admin is already informed. The
email is sent through `hermes_postfix_dkim` on port 10026 (the
post-content-filter re-injection port that auto-DKIM-signs), so the
message is signed under the gateway's own DKIM key like any other
system mail.

The message includes a GitHub link and, when `console.host` is set,
a hint to open the admin console where the dashboard prompt is
waiting.

### Toggling the daily check

The `daily_update_check` row in `system_settings` is wired through
to the cache file (field 8 above), but the Ofelia schedule itself
is the actual on/off switch — to stop the daily check, remove or
comment the `[job-exec "hermes-update-check"]` block in
`config/ofelia/config.ini` and restart `hermes_ofelia`. The
`system_settings` toggle is a legacy UI surface from the
pre-Ofelia era; the modern path is the Ofelia config.

## Status values shown on the dashboard

The dashboard's **Hermes Update** cell (System Info card, last
column) is the operator-visible side of this whole pipeline. See
also [System Status § System Info card](https://docs.deeztek.com/books/administrator-guide/page/system-status#system-info-card).

| Cache status | Cell text | What it means |
|---|---|---|
| `SUCCESS` | `UPDATE BUILD vYYMMDD FOUND` (link → release-notes modal) | New release available. Click for GitHub release notes; act via the orchestrator below. |
| `NOUPDATE` | `LATEST VERSION` | Local `build_no` matches `tag_name` on GitHub. |
| `UPDATE CHECK UNAVAILABLE` | `UPDATE CHECK UNAVAILABLE` | GitHub API call failed (rate limit, offline, DNS). Check `hermes_update_check` log on `hermes_commandbox`. |
| _(cache file missing)_ | `UPDATE CHECK PENDING` | First-ever render before the 04:30 job has run. Wait one cycle or invoke manually (below). |
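
The mapping in this table can be sketched as a small reader that never raises. This is a Python illustration, not `inc/check_system_update.cfm`; treating an empty or unrecognized status as `UPDATE CHECK UNAVAILABLE` is an assumption, and the sample line is invented.

```python
import os
import tempfile


def hermes_update_cell(cache_path: str) -> str:
    """Map the cache file to the dashboard cell text, never raising."""
    try:
        with open(cache_path, encoding="utf-8") as handle:
            fields = handle.readline().strip().split("@")
    except OSError:
        return "UPDATE CHECK PENDING"  # cache file missing or unreadable
    status = fields[0]
    build = fields[1] if len(fields) > 1 else ""
    if status == "SUCCESS":
        return "UPDATE BUILD %s FOUND" % build
    if status == "NOUPDATE":
        return "LATEST VERSION"
    return "UPDATE CHECK UNAVAILABLE"  # assumed fallback for anything else


# Usage: a missing file first, then a populated one.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "check_system_update.txt")
    pending_text = hermes_update_cell(path)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("SUCCESS@v260601@2026-06-01@@@@@1\n")
    found_text = hermes_update_cell(path)
```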

## Running the update

### Today (CLI)

The page is currently a notice that delegates to the docs. To
actually apply an update, SSH to the Docker host and run the
orchestrator:

```
cd /opt/hermes-seg-docker-gl
./scripts/system_update_docker.sh                 # apply latest
./scripts/system_update_docker.sh v260601         # apply a specific tag
./scripts/system_update_docker.sh --dry-run       # show what would run, change nothing
./scripts/system_update_docker.sh --skip-git      # containers + artifacts only
./scripts/system_update_docker.sh --skip-compose  # git + artifacts only
./scripts/system_update_docker.sh -y              # don't prompt for confirmation
```

The orchestrator runs a preflight check and then walks five numbered
phases. For the full breakdown of each step (preflight, code pull,
container update, per-release artifact application, finalize, and
the persistent post-upgrade hook), see
[Release and Update Methodology § The update orchestrator](https://docs.deeztek.com/books/installation-reference/page/release-and-update-methodology#the-update-orchestrator-scriptssystem_update_dockersh).
For the categories of artifact the orchestrator applies (baseline vs
per-release vs persistent hook), see [§ Artifact taxonomy](https://docs.deeztek.com/books/installation-reference/page/release-and-update-methodology#artifact-taxonomy--where-does-what-go).

A condensed version of what the orchestrator does:

| Phase | What it does | Idempotent? |
|---|---|---|
| Preflight | Refuses to run if working tree dirty, `hermes_db_server` down, or target older than current | Trivially |
| 1 — Pull new code | `git fetch --tags` + `git checkout <tag>` | Yes |
| 2 — Update containers | `docker compose pull` + `docker compose up -d` | Yes; only restarts services whose image or config changed |
| 3 — Apply per-release artifacts | Walks `updates/v*/` directories newer than current `build_no`, applies `sql/` → `cfml/` → `scripts/` in order; each release's `schema_updates.sql` advances `build_no` at its end | Yes (every artifact must be idempotent — see methodology doc) |
| 4 — Finalize | Restarts `hermes_commandbox`; logs reminders for `occ upgrade` (if `NCVERSION` bumped) and `*.HERMES` template re-render | Yes |
| 5 — Post-upgrade hook | `curl http://localhost:8888/schedule/post_upgrade.cfm` — runs any persistent migrations gated by the `migrations` table | Yes (per-block gated) |

Output is teed to a timestamped log under `install-logs/`:
`install-logs/hermes_update_YYYYMMDD_HHMMSS.log`. If anything fails,
the orchestrator aborts (`set -e`); inspect the log, fix the
underlying issue, and re-run. Idempotency makes mid-upgrade resume
safe — a failed Phase 3 picks up at the same release on the next
run and re-applies its full artifact set; `IF NOT EXISTS` and
`INSERT IGNORE` guards turn the second pass into a no-op.
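
Why a second pass is harmless can be demonstrated with a small, self-contained Python sketch. It uses SQLite as a stand-in, so `INSERT OR IGNORE` plays the role that MySQL-style `INSERT IGNORE` plays in the text above; the table and row names are illustrative and do not come from any real `schema_updates.sql`.

```python
import sqlite3

# SQLite stand-in for a guarded artifact. Table and row names are
# illustrative only.
ARTIFACT = """
CREATE TABLE IF NOT EXISTS demo_settings (
    name  TEXT PRIMARY KEY,
    value TEXT NOT NULL
);
INSERT OR IGNORE INTO demo_settings (name, value) VALUES ('new_feature', 'on');
"""


def apply_artifact(conn: sqlite3.Connection) -> int:
    """Apply the artifact and return the resulting row count."""
    conn.executescript(ARTIFACT)
    return conn.execute("SELECT COUNT(*) FROM demo_settings").fetchone()[0]


conn = sqlite3.connect(":memory:")
first_pass = apply_artifact(conn)
second_pass = apply_artifact(conn)  # re-run after a simulated failure
```

Both passes leave exactly one row behind, which is the property that makes re-running a failed Phase 3 safe.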

### Tomorrow (in-page button)

The page is positioned to grow a "Check Now" button (force-runs the
daily check ahead of schedule) and a "Run Update" button (invokes
the orchestrator via a CFML wrapper). Neither is wired today; the
infrastructure they would call is already in place.

Track this in [#221](https://github.com/deeztek/Hermes-Secure-Email-Gateway/issues/221).

## Forcing a manual check

If you cannot wait for the 04:30 UTC schedule (e.g., a release just
shipped and you want the dashboard to update now), invoke the same
endpoint Ofelia does:

```
docker exec hermes_commandbox curl --silent http://localhost:8888/schedule/check_for_update.cfm
```

The response is the literal string `OK` and the cache file is
rewritten in place. The dashboard picks it up on the next page load.

The same invocation is what Ofelia would have run at 04:30 — there
is no difference between manual and scheduled execution.

## The version stamp

What the orchestrator and the dashboard both compare against is
the **`build_no`** row in `system_settings`:

| Setting | Value | Set by |
|---|---|---|
| `version_no` | `Docker` | Baseline (`hermes_install.sql`) on fresh install; never changes in the Docker era |
| `build_no` | `vYYMMDD` | Baseline at install; advanced by each release's `updates/v<DATE>/sql/schema_updates.sql` at its very end |

A successful Phase 3 ends with `build_no` matching the target tag.
If after an orchestrator run those two disagree, something in Phase
3 silently no-op'd a stamp-advance — inspect the log. See [Release
and Update Methodology § The release-cut procedure](https://docs.deeztek.com/books/installation-reference/page/release-and-update-methodology#the-release-cut-procedure-developer-side)
for the exact `UPDATE system_settings ...` block every release's
`schema_updates.sql` ends with.

## Skipping releases

The orchestrator handles release-skipping natively. Upgrading from
`v260119` straight to `v260801` (skipping a hypothetical intermediate
`v260601`) walks **both** release directories in order during Phase
3 — `v260601/` first, then `v260801/`. `build_no` advances after each
release's `sql/` step, so the in-between cursor advancement is safe.
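
The release selection described above can be sketched in a few lines. The orchestrator itself is a shell script, so this Python version is only an illustration of the ordering rule; the function name `pending_releases` and the strict `vYYMMDD` pattern are assumptions, not code taken from `system_update_docker.sh`.

```python
import re

# Release directories are assumed to be named vYYMMDD, like build_no.
TAG_RE = re.compile(r"^v(\d{6})$")


def tag_key(tag):
    """Turn a vYYMMDD tag into an integer so tags sort chronologically."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a vYYMMDD tag: {tag!r}")
    return int(match.group(1))


def pending_releases(current_build, release_dirs, target):
    """Return releases newer than current_build, up to target, oldest first."""
    low, high = tag_key(current_build), tag_key(target)
    tags = [d for d in release_dirs if TAG_RE.match(d)]  # ignore stray dirs
    return sorted((t for t in tags if low < tag_key(t) <= high), key=tag_key)


# Upgrading from v260119 straight to v260801 walks both newer releases.
print(pending_releases("v260119", ["v260801", "v260119", "v260601"], "v260801"))
# → ['v260601', 'v260801']
```

Because the comparison is strictly "newer than the current stamp", re-running after a completed upgrade yields an empty list.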

> **Operational consequence.** Releases are designed to be applied
> in chronological order; skipping is supported (and tested) but is
> not the optimized path. If you upgrade rarely, expect Phase 3 to
> take proportionally longer the further behind you are.

## Failure semantics

| What breaks | What happens |
|---|---|
| GitHub Releases API unreachable | `UPDATE CHECK UNAVAILABLE` in dashboard cell; cached value is overwritten with the unavailable marker. Logged to `hermes_update_check`. |
| GitHub Releases API rate-limited (HTTP 403 or 429) | Same as unreachable — anonymous polling is subject to GitHub's 60 req/hr per-IP limit. The daily schedule keeps usage trivial; the only way to hit the limit is repeated manual invocations. |
| `/releases/latest` returns 404 (no qualifying release on the repo) | Treated as `NOUPDATE`, not an error — the repo simply hasn't shipped its first qualifying release yet. |
| `published_at` in API response fails `ParseDateTime` | Falls back to the raw ISO string in the cache file — non-fatal. |
| `cfmail` notification fails | Logged to `hermes_update_check`; cache file write proceeds (notification is best-effort). |
| Cache file cannot be written (`/opt/hermes/updates/` not writable) | Logged; the dashboard falls through to `UPDATE CHECK PENDING`. |
| Orchestrator Phase 1 fails (tag not pushed, dirty tree) | Aborts before touching containers or DB. Working tree is unchanged. |
| Orchestrator Phase 2 fails (image pull error, registry unreachable) | Aborts; previous containers keep running with their existing images. Re-run after fixing the registry / network issue. |
| Orchestrator Phase 3 fails on a SQL artifact | Aborts; `build_no` reflects whatever the last successful release's stamp set it to. Re-run picks up at the failed release; idempotency guards re-apply the partial work safely. |
| Orchestrator Phase 5 fails | Logged as a warning, **not** treated as fatal — the orchestrator exits 0. Run `post_upgrade.cfm` manually after fixing the underlying issue: `docker exec hermes_commandbox curl --silent http://localhost:8888/schedule/post_upgrade.cfm` |

## Files and containers touched

| Path | Owner | Role |
|---|---|---|
| `config/hermes/var/www/html/admin/2/view_system_updates.cfm` | `hermes_commandbox` | The admin page (notice + future Run Update wiring) |
| `config/hermes/var/www/html/admin/2/inc/check_system_update.cfm` | `hermes_commandbox` | Reads the cache file for the dashboard cell |
| `config/hermes/var/www/html/schedule/check_for_update.cfm` | `hermes_commandbox` | Daily poll target |
| `config/ofelia/config.ini` (`hermes-update-check` job) | `hermes_ofelia` | Schedules the daily poll |
| `scripts/system_update_docker.sh` | host shell | The update orchestrator |
| `scripts/install_hermes_docker.sh --apply-schema` | host shell | Legacy pre-orchestrator schema-apply path; superseded by the orchestrator but still functional for emergency manual use |
| `/opt/hermes/updates/check_system_update.txt` | `hermes_commandbox` | Cache file; format above |
| `install-logs/hermes_update_<timestamp>.log` | host filesystem | Orchestrator output, teed live |
| `system_settings.build_no` / `system_settings.version_no` | `hermes_db_server` (`hermes` DB) | The version stamp the orchestrator and the dashboard both read |
| `migrations` table | `hermes_db_server` (`hermes` DB) | Tracks which Phase 5 migration blocks have run; see [Methodology § Phase 5](https://docs.deeztek.com/books/installation-reference/page/release-and-update-methodology#phase-5--persistent-post-upgrade-hook) |
| `updates/v*/sql/schema_updates.sql` | repo working tree | Per-release SQL deltas; one of three artifact categories |
| `updates/v*/cfml/*.cfm` | repo working tree | Per-release CFML migrations (encryption / file IO / API calls) |
| `updates/v*/scripts/*.sh` | repo working tree | Per-release host-shell one-shots |
| `config/hermes/var/www/html/schedule/post_upgrade.cfm` | `hermes_commandbox` | The persistent cross-release migration hook |

## Related

- [Release and Update Methodology](https://docs.deeztek.com/books/installation-reference/page/release-and-update-methodology)
  — **the canonical reference** for everything covered on this page.
  Read it before adding a schema change, a one-shot migration, a
  service config edit, or cutting a release tag.
- [System Status](https://docs.deeztek.com/books/administrator-guide/page/system-status) — the dashboard that surfaces
  the **Hermes Update** cell this page's daily check populates
- [System Settings](https://docs.deeztek.com/books/administrator-guide/page/system-settings) — `admin_email` (target of
  the update-found notification email), `postmaster` (sender), and
  the legacy `daily_update_check` toggle
- [Scheduled Tasks](https://docs.deeztek.com/books/administrator-guide/page/scheduled-tasks) — the admin surface over the
  Ofelia config that schedules the daily check
- [System Logs](https://docs.deeztek.com/books/administrator-guide/page/system-logs) — where `hermes_update_check` log
  entries surface for debugging failed polls
- [Storage Topology](https://docs.deeztek.com/books/installation-reference/page/storage-topology-5-tiers) — the four
  storage tiers an upgrade touches (Config tier is where `git
  checkout` runs; Data tier holds `/opt/hermes/updates/`)

# System Users

Admin path: **System > System Users** (`view_system_users.cfm`,
`inc/system_user_actions.cfm`, `inc/ldap_add_user.cfm`,
`inc/ldap_add_user_remoteauth.cfm`, `inc/ldap_add_user_groups.cfm`,
`inc/ldap_modify_user.cfm`, `inc/ldap_modify_user_password.cfm`,
`inc/ldap_change_user_access_control.cfm`,
`inc/ldap_delete_user.cfm`, `inc/delete_system_user.cfm`,
`inc/delete_system_user_devices.cfm`, `inc/generate_ldap_password.cfm`,
`inc/check_hibp.cfm`).

This page manages **admin console operators** — the accounts that can
sign in at `/admin/`. Mailbox users (Email Server) and relay recipients
(Email Relay) are not managed here even though they share the same
underlying LDAP tree; they have their own admin pages.

Each row written by this page lands in **two** stores: the `system_users`
table (Hermes DB — UI metadata, auth-type flag, `applied`/`ldap_synced`
status), and an LDAP entry under `ou=users,dc=hermes,dc=local` whose
group memberships in `ou=groups` give the user actual access. Authelia
binds against LDAP for every console login; the DB row exists so the
admin UI has something to display and edit.

## What this page creates — and what it doesn't

| Creates | Doesn't create |
|---|---|
| Console admin accounts (`cn=admins` group membership) | Mailbox accounts (those go through **Email Server > Mailboxes**, populate `mailboxes` + `cn=mailboxes`) |
| LDAP entry + DB row in lockstep | Relay-recipient accounts (those go through **Email Relay > Recipients**, populate `recipients` + `cn=relays`) |
| Local-auth (password lives in Hermes LDAP) **or** RemoteAuth (password lives in upstream AD/LDAP) admins | Authelia-side rows (Authelia is stateless against LDAP — no per-user provisioning needed) |
| `cn=one_factor` or `cn=two_factor` group membership at create time | The MFA enrolment itself — the user still has to enrol TOTP/WebAuthn/Duo from the user portal's Account Settings page once they sign in |

> **Operational consequence.** Every account this page creates is an
> admin. There is no "create a reader-only admin" or "create an
> auditor" path today. Granular role assignment is a planned extension;
> the current model is binary — either you're an admin (full console
> access) or you're not. The `access_control` column gates one-factor
> vs. two-factor at the **login gate**, not at the privilege level.

## How LDAP membership is structured

System users live under the same OU as every other identity in Hermes,
and the user's role is determined by **which groups contain their DN**
in the `member` attribute (see [Credential Model](https://docs.deeztek.com/books/administrator-guide/page/credential-model) for the full architecture).

```
dc=hermes,dc=local
├── ou=groups
│   ├── cn=admins              <-- every System User is added here
│   ├── cn=mailboxes           <-- mailbox users (not this page)
│   ├── cn=relays              <-- relay recipients (not this page)
│   ├── cn=one_factor          <-- access_control = one_factor
│   └── cn=two_factor          <-- access_control = two_factor
└── ou=users
    ├── cn=admin               <-- the install-time built-in admin
    ├── cn=jsmith              <-- example local-auth System User
    └── cn=corp_user           <-- example remote-auth stub entry
```

`inc/ldap_add_user_groups.cfm` adds the new System User's DN to **both**
`cn=admins` and the chosen access-control group in a single LDIF
operation. The LDIF template `/opt/hermes/templates/ldap_addusergroup.ldif`
contains two `changetype: modify` blocks that both reference the same
`THE_USERNAME` placeholder.
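
To make the two-block shape concrete, here is a minimal sketch of what rendering such a template might look like. The template text below is a hypothetical stand-in, not the contents of `ldap_addusergroup.ldif`: only the `THE_USERNAME` placeholder comes from this page, while `THE_ACCESS_GROUP`, the function name, and the username allowlist are assumptions made for illustration.

```python
import re

# Hypothetical stand-in for /opt/hermes/templates/ldap_addusergroup.ldif.
# THE_USERNAME is documented; THE_ACCESS_GROUP is an invented placeholder.
TEMPLATE = """\
dn: cn=admins,ou=groups,dc=hermes,dc=local
changetype: modify
add: member
member: cn=THE_USERNAME,ou=users,dc=hermes,dc=local

dn: cn=THE_ACCESS_GROUP,ou=groups,dc=hermes,dc=local
changetype: modify
add: member
member: cn=THE_USERNAME,ou=users,dc=hermes,dc=local
"""


def render_group_ldif(username, access_control):
    """Fill both modify blocks for one user and one access-control group."""
    if access_control not in ("one_factor", "two_factor"):
        raise ValueError(f"unknown access_control: {access_control!r}")
    # Illustrative allowlist only; the page's real username regex is not shown.
    if not re.fullmatch(r"[A-Za-z0-9_.@-]+", username):
        raise ValueError(f"unsafe username for LDIF: {username!r}")
    # Substitute the group first so a username can never alter that placeholder.
    ldif = TEMPLATE.replace("THE_ACCESS_GROUP", access_control)
    return ldif.replace("THE_USERNAME", username)


print(render_group_ldif("jsmith", "two_factor"))
```

Validating the username before substitution matters because a newline in the value could otherwise inject extra LDIF lines.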

## Database schema — `system_users`

| Column | Purpose |
|---|---|
| `id` | PK |
| `username` | LDAP `cn` / `uid`. Immutable after create (the edit modal renders this field read-only). |
| `email` | `mail` LDAP attribute; also where forgotten-password notifications would go (but admin self-service reset is **disabled** for security; see [Password Resets](https://docs.deeztek.com/books/administrator-guide/page/password-resets)) |
| `first_name`, `last_name` | `givenName`, `sn` |
| `password` | Argon2id hash with the `{ARGON2}` prefix that OpenLDAP's argon2 password module expects. Empty string for RemoteAuth users (their password is upstream). |
| `access_control` | `one_factor` or `two_factor` — drives Authelia's access-control policy at login |
| `auth_type` | `local` or `remote` — drives the entire create/edit flow |
| `remoteauth_domain` | For `auth_type = 'remote'`, the `domain_name` key into `remoteauth_mappings`. NULL for local-auth. |
| `system` | `1` = install-time built-in admin (delete-protected). `2` = admin-created. |
| `applied` | `1` = current state synced to LDAP. `2` = pending sync (transient during a save). |
| `ldap_synced` | `1` = LDAP entry exists. `0` = DB row exists but LDAP entry doesn't (a half-sync state the edit handler explicitly detects and tries to repair). |
| `pushover_user_key`, `pushover_enabled` | Optional Pushover notifications for admin alerts |

## Local-auth user create flow

```
Admin clicks Create System User
        │
        ▼
form validation: username regex, email format,
first/last name regex, password length 8-64
        │
        ▼ (optional)
HIBP check: SHA-1 prefix sent to api.pwnedpasswords.com
        │   reject if hash suffix matches a known breach
        ▼
generate_ldap_password.cfm
        │   docker run --rm authelia/authelia:VERSION \
        │     authelia crypto hash generate argon2 \
        │     --password <plaintext>
        │   returns: {ARGON2}$argon2id$v=19$m=...$...$...
        ▼
INSERT INTO system_users (..., password='{ARGON2}...')
        │
        ▼
ldap_add_user.cfm  -- builds adduser LDIF from template,
        │             docker exec hermes_ldap ldapadd
        │             writes entry to ou=users with userPassword
        ▼
ldap_add_user_groups.cfm  -- adds DN to cn=admins
        │                    + cn=<one_factor|two_factor>
        ▼
UPDATE system_users SET ldap_synced = 1
        ▼
session.m = 20  ("System User was created successfully")
```

The Authelia hash generator runs as a **one-shot `docker run --rm`**
against the same Authelia image the platform already runs — zero host
dependency, format guaranteed to match what Authelia validates at
login. The hashing happens in `inc/generate_ldap_password.cfm`.
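
As a rough illustration of the last step, the sketch below turns the hash generator's output into the `{ARGON2}`-prefixed value stored in `userPassword`. It assumes the Authelia CLI prints a line of the form `Digest: $argon2id$...`; that output shape and the helper name `to_ldap_userpassword` are assumptions, and the actual logic in `inc/generate_ldap_password.cfm` is CFML and may differ.

```python
def to_ldap_userpassword(cli_output):
    """Extract the Argon2id digest from CLI output and add the LDAP scheme prefix."""
    marker = "Digest: "  # assumed output format of the hash generator
    for line in cli_output.splitlines():
        line = line.strip()
        if line.startswith(marker):
            digest = line[len(marker):]
            break
    else:
        raise ValueError("no 'Digest:' line found in hash generator output")
    if not digest.startswith("$argon2id$"):
        raise ValueError("digest is not an Argon2id hash")
    return "{ARGON2}" + digest


# Sample value with a made-up salt and hash, for shape only.
sample = "Digest: $argon2id$v=19$m=65536,t=3,p=4$c2FsdHNhbHQ$aGFzaGhhc2g"
print(to_ldap_userpassword(sample))
```

Rejecting anything that is not an Argon2id digest keeps an error message or an empty string from ever being written as a password value.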

## RemoteAuth user create flow

When the **Authentication Type** dropdown is set to `Remote`, the form
shape changes: the password fields disappear and a **RemoteAuth
Domain** dropdown becomes required (populated from
`remoteauth_mappings` where `enabled = 1`). This option only appears
when (a) the install has a Pro license, (b) `remoteauth_settings.enabled = 1`,
and (c) at least one enabled mapping exists.

```
INSERT INTO system_users (..., password='', auth_type='remote',
                          remoteauth_domain='<key>')
        │
        ▼
ldap_add_user_remoteauth.cfm  -- writes a stub entry with NO password,
        │                        with seeAlso pointing at the upstream
        │                        DN (expanded from the mapping's
        │                        remote_dn_pattern) and associatedDomain
        │                        set to the mapping key
        ▼
ldap_add_user_groups.cfm  -- adds DN to cn=admins
                             + cn=<one_factor|two_factor>
```

At login, Authelia binds locally against the stub. Hermes's
`slapo-remoteauth` overlay sees the `associatedDomain`, finds the
matching upstream URI, and rebinds as the `seeAlso` DN. The local entry
has no `userPassword` to validate against — the upstream bind is the
only decision. See [LDAP RemoteAuth](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth) for the
overlay mechanics.

> **Username uniqueness is global.** The `system_users.username` column
> is checked for collision across **both** auth types. If your upstream
> AD already has a user named `dedwards` and Hermes already has a
> local-auth admin named `dedwards`, the second account cannot be
> created with the same username. The form's error message suggests
> `username@domain` or `username.domain` as a workaround.

## Edit flow — what can and cannot change

Two fields are **immutable** after create and rendered read-only in
the edit modal:

| Field | Why immutable |
|---|---|
| **Username** | It's the LDAP RDN (`cn=`). Renaming would require a `modrdn` plus updating every group's `member` attribute that references the old DN. The "delete and recreate" path is simpler and safer. |
| **Authentication Type** | Switching local-to-remote or remote-to-local would change the LDAP entry's objectClass set (loses or gains a password attribute) and break the `seeAlso`/`associatedDomain` overlay reference. Recreate the user instead. |

Everything else is editable: email, first/last name, access-control
policy (one/two factor), and — for local-auth users only — the password
(via the **Set User Password = YES** toggle which reveals the password
fields). The password edit re-runs the same HIBP check and Argon2 hash
flow as create.

The access-control change is non-trivial: switching `one_factor` to
`two_factor` (or vice versa) means removing the DN from the old group
and adding it to the new one. `inc/ldap_change_user_access_control.cfm`
handles both ops in sequence.
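
The group move can be pictured as two LDIF modify records, one per group. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the delete-then-add ordering is a guess, since this page only says the two operations run in sequence.

```python
BASE_DN = "dc=hermes,dc=local"
POLICIES = ("one_factor", "two_factor")


def access_control_move_ldif(username, new_policy):
    """Build LDIF that removes the user from the old group and adds the new one."""
    if new_policy not in POLICIES:
        raise ValueError(f"unknown policy: {new_policy!r}")
    # With exactly two policies, the old group is simply the other one.
    old_policy = POLICIES[1 - POLICIES.index(new_policy)]
    user_dn = f"cn={username},ou=users,{BASE_DN}"
    records = []
    for group, operation in ((old_policy, "delete"), (new_policy, "add")):
        records.append(
            f"dn: cn={group},ou=groups,{BASE_DN}\n"
            "changetype: modify\n"
            f"{operation}: member\n"
            f"member: {user_dn}\n"
        )
    # LDIF records are separated by a blank line.
    return "\n".join(records)


print(access_control_move_ldif("jsmith", "two_factor"))
```

Whichever order is used, a failure between the two operations leaves the user in both groups or in neither, so the handler has to treat the pair as one unit when reporting success.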

### Half-synced repair

If a previous save crashed between the DB INSERT and the LDAP write
(`ldap_synced = 0`, no LDAP entry exists), the edit handler refuses
to save the row while **Set User Password** is NO, because there is
no password to push into LDAP. Alert code `16` surfaces the explicit
instruction "set **Set User Password** to YES and enter a new
password" so the sync can complete on the next save attempt. The
user's stored password is **not** re-pushed because the DB column
holds an Argon2 hash, not plaintext.

## Built-in admin protection — the `system` column

The install script seeds a single built-in admin row (the username
chosen at install time) with `system = 1`. The page's UI rules:

- **Delete button** is hidden on the row.
- **Cannot delete self**: the row matching `session.userid` also hides
  its Delete button (a separate check).

Both gates are also enforced server-side in `system_user_actions.cfm`'s
`deleteuser` branch — the SQL lookup explicitly filters
`system <> '1' AND id <> <session.userid>` so a crafted POST cannot
bypass the hidden button.
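
The effect of that server-side filter can be demonstrated with a small, self-contained sketch. It uses an in-memory SQLite table in place of the real `system_users` table; the three-column schema, the sample rows, and the function name `find_deletable` are assumptions for illustration, and the production handler is CFML, not Python.

```python
import sqlite3


def find_deletable(conn, target_id, session_userid):
    """Return the row only if it is neither built-in nor the caller's own account."""
    cursor = conn.execute(
        'SELECT id, username FROM system_users '
        'WHERE id = ? AND "system" <> \'1\' AND id <> ?',
        (target_id, session_userid),
    )
    return cursor.fetchone()  # None means the delete must be refused


conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE system_users (id INTEGER PRIMARY KEY, username TEXT, "system" TEXT)'
)
conn.executemany(
    "INSERT INTO system_users VALUES (?, ?, ?)",
    [(1, "admin", "1"), (2, "jsmith", "2"), (3, "corp_user", "2")],
)

# jsmith (id 2) is signed in and submits delete requests, including crafted ones.
print(find_deletable(conn, 1, session_userid=2))  # built-in admin
print(find_deletable(conn, 2, session_userid=2))  # own account
print(find_deletable(conn, 3, session_userid=2))  # ordinary admin
```

Only the third lookup returns a row. Because the guard lives in the query, a request that skips the UI entirely gets the same refusal as one that never saw a Delete button.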

## Delete flow

Soft-delete is **not** the model — the row is physically removed.

```
1. DB lookup: refuse if system='1' or id=session.userid
2. ldap_delete_user.cfm:
     docker exec hermes_ldap ldapdelete \
       cn=<username>,ou=users,dc=hermes,dc=local
     (this auto-removes the DN from any group's member attribute via
      the OpenLDAP referential-integrity overlay)
3. delete_system_user.cfm:
     DELETE FROM system_users WHERE id = <id>
4. delete_system_user_devices.cfm:
     docker exec hermes_authelia authelia storage user totp delete \
       <username> --config /config/configuration.yml
     docker exec hermes_authelia authelia storage user webauthn delete \
       <username> --config /config/configuration.yml --all
5. session.m = 1  ("System User was deleted successfully")
```

> **Duo Push devices do NOT delete here.** Duo enrolment lives on
> Duo's cloud servers, not Authelia's database. If the deleted user
> was Duo-enrolled, the admin must also remove them from the Duo
> Admin Panel — both the delete and the 2FA-only modals say so
> explicitly. See [Authentication Settings § Duo Security](https://docs.deeztek.com/books/administrator-guide/page/authentication-settings#duo-security).

## Delete 2FA Devices — without deleting the user

The **yellow key** button on each row opens a dedicated **Delete 2FA
Devices** modal that runs only step 4 of the delete flow above. Use
this when:

- A user reports they've lost their phone / hardware key
- A user is stuck in a 2FA loop after a session expiry
- A user needs to re-enrol with a new TOTP app

After this runs, the user's next sign-in is one-factor, after which
they can re-enrol from their Account Settings page. The page waits
5 seconds before redirecting so that Authelia has time to flush its
credential cache by the time the success banner appears.

> **Note on Authelia config path.** The two `authelia storage` commands
> reference `--config /config/configuration.yml`. That is the path
> inside the `hermes_authelia` container, not a path on the host, and
> it is NOT `/etc/authelia/`.
> See [Authentication Settings § Storage backend — MySQL, not SQLite](https://docs.deeztek.com/books/administrator-guide/page/authentication-settings#storage-backend-mysql-not-sqlite)
> for why the MariaDB `authelia` database is what actually gets cleaned
> when these commands run.

## have-i-been-pwned (HIBP) check

The **Check Password Against haveibeenpwned.com** toggle (YES/NO,
default YES) sends only the first 5 hex chars of the password's
SHA-1 to `api.pwnedpasswords.com/range/<prefix>` (k-anonymity:
the full hash is never transmitted) and rejects the password if
the remaining 35 hex chars appear in the returned breach list.

If `api.pwnedpasswords.com` is unreachable (no outbound 443, DNS
broken, etc.) the create fails with alert `100` — the admin must
either restore outbound connectivity or disable the check explicitly
on the form. Silently skipping a security check on network failure
would be the wrong default.
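
The k-anonymity split is easy to see in code. The sketch below shows only the hashing and matching steps, assuming the range endpoint's documented response format of one `SUFFIX:COUNT` pair per line; the function names are hypothetical and the HTTP call is left out, so this is not the logic of `inc/check_hibp.cfm`.

```python
import hashlib


def split_sha1(password):
    """Return (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def is_breached(password, range_body):
    """Check the locally kept suffix against a /range/<prefix> response body."""
    _, suffix = split_sha1(password)
    for line in range_body.splitlines():
        candidate, _, _count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return True
    return False


prefix, suffix = split_sha1("password")
print(prefix)  # the only part that would be sent over the network
# A made-up response body containing one unrelated suffix and the real one.
body = f"0018A45C4D1DEF81644B54AB7F969B88D65:1\r\n{suffix}:42\r\n"
print(is_breached("password", body))
```

To match the fail-closed behaviour described above, a caller would treat any network error while fetching `range_body` as a refusal to save, not as a clean result.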

## What this page does NOT do

| Concern | Lives on |
|---|---|
| Mailbox creation | [Email Server > Mailboxes](https://docs.deeztek.com/books/administrator-guide/page/mailboxes) — separate table, separate LDAP group |
| Relay-recipient creation | [Email Relay > Relay Recipients](https://docs.deeztek.com/books/administrator-guide/page/relay-recipients) — separate table, separate LDAP group |
| Per-user MFA enforcement (admin-policy flag) | The mailbox / relay-recipient detail pages set `enforce_mfa` for those user classes. System Users use `access_control` instead; if you set it to `two_factor`, Authelia challenges every login. There is no separate "encourage but don't require" middle state for admins — see [Authentication Settings § MFA enforcement is decoupled from the cn=two_factor LDAP group](https://docs.deeztek.com/books/administrator-guide/page/authentication-settings#mfa-enforcement-is-decoupled-from-the-cntwo_factor-ldap-group-225). |
| Password reset queue (admin processes user-initiated requests) | [Password Resets](https://docs.deeztek.com/books/administrator-guide/page/password-resets) |
| Authelia session length, brute-force throttle, Duo / OIDC | [Authentication Settings](https://docs.deeztek.com/books/administrator-guide/page/authentication-settings) |
| Upstream AD/LDAP mapping for RemoteAuth admins | [LDAP RemoteAuth](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth) — must exist + be enabled before this page's Remote dropdown appears |
| Pushover token (per-admin alert notifications) | Set on the per-admin notification configuration page; the `pushover_user_key` column on `system_users` is populated there, not here |

## Failure semantics

| What breaks | What happens |
|---|---|
| `hermes_ldap` container down | Create + Edit fail at the LDAP step. The DB INSERT has already run, so the row exists with `ldap_synced = 0`. Recovery: restart LDAP, edit the user with **Set User Password = YES** to retry the sync (alert `16` will prompt for this on first reload). |
| `hermes_authelia` container down | Create + Edit + Delete still succeed at the DB + LDAP level; the user can't actually log in until Authelia is back. Delete 2FA Devices fails silently (caught and swallowed in the cftry block) — the next attempt after Authelia recovers will succeed. |
| HIBP API unreachable with HIBP check ON | Create + password-change Edit refuse to save (alert `100`). The admin must either fix outbound connectivity or set HIBP to NO. |
| RemoteAuth domain dropdown empty / RemoteAuth disabled | The Remote option doesn't appear in the dropdown at all. To restore: enable a mapping on [LDAP RemoteAuth](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth) and click Apply Settings. |
| Username collision | Alert `13` with the suggested `username@domain` or `username.domain` workaround. |

## Files and containers touched

| Path | Owner | Role |
|---|---|---|
| `config/hermes/var/www/html/admin/2/view_system_users.cfm` | `hermes_commandbox` | Page (table + 4 modals) |
| `config/hermes/var/www/html/admin/2/inc/system_user_actions.cfm` | `hermes_commandbox` | Action router (create / edit / delete / deletedevices) |
| `config/hermes/var/www/html/admin/2/inc/generate_ldap_password.cfm` | `hermes_commandbox` | `docker run --rm authelia/authelia ... crypto hash generate argon2` |
| `config/hermes/var/www/html/admin/2/inc/ldap_add_user.cfm` | `hermes_commandbox` | LDIF render + `ldapadd` for local-auth entries |
| `config/hermes/var/www/html/admin/2/inc/ldap_add_user_remoteauth.cfm` | `hermes_commandbox` | Stub-entry LDIF render + `ldapadd` for remote-auth entries |
| `config/hermes/var/www/html/admin/2/inc/ldap_add_user_groups.cfm` | `hermes_commandbox` | Adds DN to `cn=admins` + access-control group |
| `config/hermes/var/www/html/admin/2/inc/ldap_change_user_access_control.cfm` | `hermes_commandbox` | Moves DN between `cn=one_factor` and `cn=two_factor` |
| `config/hermes/var/www/html/admin/2/inc/ldap_delete_user.cfm` | `hermes_commandbox` | `ldapdelete` of the user entry |
| `config/hermes/var/www/html/admin/2/inc/delete_system_user_devices.cfm` | `hermes_commandbox` | `authelia storage user totp delete` + `webauthn delete --all` |
| `config/hermes/var/www/html/admin/2/inc/check_hibp.cfm` | `hermes_commandbox` | HTTPS GET to `api.pwnedpasswords.com` |
| `/opt/hermes/templates/ldap_adduser.ldif` | `hermes_commandbox` | Add-user LDIF (placeholder-substituted) |
| `/opt/hermes/templates/ldap_adduser_remoteauth.ldif` | `hermes_commandbox` | Stub-user LDIF |
| `/opt/hermes/templates/ldap_addusergroup.ldif` | `hermes_commandbox` | Two-block LDIF for `cn=admins` + access-control group add |
| `system_users` table | `hermes_db_server` (`hermes` DB) | Admin metadata + LDAP sync state |
| `cn=admins,ou=groups,dc=hermes,dc=local` | `hermes_ldap` | Source of truth for who can sign in at `/admin/` |

## Related documentation

- [Credential Model](https://docs.deeztek.com/books/administrator-guide/page/credential-model) — full four-credential architecture; this page's accounts use only the web-login credential
- [LDAP RemoteAuth](https://docs.deeztek.com/books/administrator-guide/page/ldap-remoteauth) — required prerequisite for creating remote-auth System Users; covers mappings, DN patterns, TLS settings
- [Authentication Settings](https://docs.deeztek.com/books/administrator-guide/page/authentication-settings) — Authelia's session lifetime, login regulation, MFA capability vs. enforcement model
- [Password Resets](https://docs.deeztek.com/books/administrator-guide/page/password-resets) — the admin queue for user-initiated reset requests; the page's note on why admin self-service reset is blocked
- [Console Settings](https://docs.deeztek.com/books/administrator-guide/page/console-settings) — `/admin/` hostname, cert, and the IP allowlist that layers above this page's access control