Link Guard

Pro Edition feature. Maps to Email Policies > Link Guard (view_linkguard.cfm, inc/linkguard_write_and_reload.cfm).

Protection travels with the link. It works in the inbox, after a forward to a colleague, or when the message is opened days later on a phone — because the verdict is computed on click, not on delivery.

Components

| Component | Role |
|---|---|
| hermes_body_milter | Rewrites inbound links at SMTP receive time (LinkGuardModifier) and restores original links on outbound replies/forwards (LinkGuardRestoreModifier). |
| hermes_linkguard | The verdict + redirect engine. Serves the public /lg/ click endpoint and a console-only management API. Holds the operational SQLite store (verdict cache, feeds, click log). |
| hermes_nginx | Reverse-proxies /lg/ from the public console host to the Link Guard container's public port. |
| Admin console | view_linkguard.cfm page; on save, inc/linkguard_write_and_reload.cfm pushes settings, scope, URL rules, and HMAC keys to the engine and reloads the milter maps. |

| Port | Surface | Exposure |
|---|---|---|
| 8894 (public) | GET /lg/?t=<token>, POST /lg/proceed, GET /healthz | nginx-proxied; reachable by recipients clicking links |
| 8895 (mgmt) | POST /api/config, POST /api/keys, POST /api/feed-refresh, GET /api/stats | console-only; never exposed publicly |

The public surface can never push config or read keys; the management surface is never reachable from the internet.

Pipeline placement

INBOUND (rewrite)                          OUTBOUND (restore)
External MTA ──► Postfix smtpd             User reply/forward ──► Postfix smtpd
        │                                          │
        ▼                                          ▼
   smtpd_milters chain:                       smtpd_milters chain:
     1. OpenDKIM                                1. OpenDKIM
     2. OpenDMARC                               2. OpenDMARC
     3. hermes_body_milter                      3. hermes_body_milter
          └─ LinkGuardModifier:                      └─ LinkGuardRestoreModifier:
             rewrite links ──► /lg/ token              unwrap /lg/ tokens ──► original URLs
        │                                          │
        ▼                                          ▼
   Amavis ──► Ciphermail ──► deliver           Amavis ──► deliver to external

Rewriting happens at smtpd time, before content filtering. Hermes' own DKIM signs at the Postfix :10026 re-injection (downstream of the milter), so the signature always covers the rewritten body the recipient receives. Inbound mail that arrives already DKIM-signed, S/MIME-signed, or PGP-sealed is skipped — the same envelope-detection logic the disclaimer feature uses — so Link Guard never breaks an existing signature.

The click flow

Recipient clicks rewritten link
        │
        ▼
GET https://<console-host>/lg/?t=<token>   (nginx ──► linkguard :8894)
        │
        ├─ token invalid / expired ──► block page
        ▼
   resolve original URL from token
        │
        ▼
   verdict pipeline (see below) ──► {clean | suspicious | malicious}
        │
        ▼
   admin action for that tier:
     clean      ──► 302 redirect to the real URL   (default)
     suspicious ──► warning interstitial (resolved host shown; user may proceed)
     malicious  ──► block page (hard block, or block_override allowing proceed)

Every click is logged (recipient domain, URL hash, resolved host, verdict, source, action taken, client IP) for the reporting dashboard.

Verdict pipeline

verdict.resolve(url, recipient_domain) evaluates layers in precedence order and returns the first match:

| # | Layer | Result | Notes |
|---|---|---|---|
| 1 | Admin blocklist | malicious | Operator-curated, console-managed |
| 2 | Admin allowlist | clean | Operator-curated; trumps feeds and heuristics |
| 3 | Verdict cache | cached result | Avoids re-running heuristics / re-hitting external APIs |
| 4 | Open-redirect extraction | escalate | Embedded target in an open-redirect param / nested URL is resolved and inherits its verdict (no fetch) |
| 5 | Local feeds | malicious | URLhaus / OpenPhish, stored in SQLite; only ever escalate |
| 6 | Heuristics | suspicious | Lookalike/punycode, IP-literal host, @ in authority, known shorteners, excessive subdomains, abused cloud-storage/redirector hosts |
| 7 | GSB / VirusTotal | malicious | Optional, admin-supplied keys; string-reputation lookups, cached |
| 8 | Guarded redirect-follow | escalate | Optional (admin toggle): follow the 30x chain under SSRF guards, verdict the final destination |
| 9 | Default | clean | Nothing flagged it |

Steps 1–7 never fetch the target URL — every reputation check sends the URL only as a string (Google Safe Browsing, VirusTotal). The only layer that makes an outbound request is the optional guarded redirect-follow (step 8), and only when an admin enables it; every hop is SSRF-fenced (see Redirect detection below). Local feeds only ever escalate a link to malicious; they never auto-allow.

URL shorteners are flagged suspicious (warn), not blocked — a shortener hides its real destination, which is exactly what time-of-click protection exists to surface. The warning interstitial shows the resolved host so the user can make an informed choice. VirusTotal requires ≥2 vendors flagging a URL before it counts as malicious, to cut false positives.
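
The precedence order above can be pictured as a chain of layer functions, each returning a verdict or nothing. The following is a minimal sketch rather than the engine's code: the function names, the sample lists, and the reduced set of layers (blocklist, allowlist, heuristics, default) are all hypothetical, and the escalate-style layers are left out for brevity.

```python
from typing import Callable, Optional
from urllib.parse import urlsplit

# Hypothetical sample data; the real lists are console-managed.
BLOCKLIST = {"bad.example"}
ALLOWLIST = {"intranet.example"}
SHORTENERS = {"bit.ly", "t.co"}

Layer = Callable[[str], Optional[str]]


def host_of(url: str) -> str:
    """Lower-cased host of a URL, or an empty string if it has none."""
    return (urlsplit(url).hostname or "").lower()


def admin_blocklist(url: str) -> Optional[str]:
    return "malicious" if host_of(url) in BLOCKLIST else None


def admin_allowlist(url: str) -> Optional[str]:
    return "clean" if host_of(url) in ALLOWLIST else None


def heuristics(url: str) -> Optional[str]:
    # Only the "known shortener" heuristic is modeled here.
    return "suspicious" if host_of(url) in SHORTENERS else None


# Order matters: earlier layers take precedence over later ones.
LAYERS: list[tuple[str, Layer]] = [
    ("blocklist", admin_blocklist),
    ("allowlist", admin_allowlist),
    ("heuristics", heuristics),
]


def resolve(url: str) -> tuple[str, str]:
    """Return (verdict, source) from the first layer that matches."""
    for name, layer in LAYERS:
        verdict = layer(url)
        if verdict is not None:
            return verdict, name
    return "clean", "default"
```

Because the allowlist sits ahead of the heuristics in the list, an allowlisted host is never downgraded by a heuristic, which mirrors the "trumps feeds and heuristics" note in the table.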

Verdict tiers and actions

Each verdict tier maps to an admin-configurable action:

| Tier | Setting | Default action | Behavior |
|---|---|---|---|
| clean | action_clean | redirect | 302 straight to the destination |
| suspicious | action_suspicious | warn | Interstitial; user may proceed |
| malicious | action_malicious | block | Block page |

Available actions: redirect / allow (pass through), warn (interstitial with proceed), block (hard block, no proceed), block_override (block page that allows an explicit proceed).

A hard block can never be bypassed. POST /lg/proceed re-resolves and re-authorizes the verdict server-side — only a warn tier, or a block_override tier with the override flag, is allowed to continue. A user cannot escape a hard block by replaying the proceed request.
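
The re-authorization rule can be sketched as a pure function of the freshly resolved verdict and the configured actions. This is an illustration only: `authorize_proceed` and the `ACTIONS` mapping are hypothetical names, and the sketch assumes the verdict has already been recomputed server-side rather than taken from the request.

```python
# Hypothetical tier -> action mapping, mirroring the documented defaults.
ACTIONS = {"clean": "redirect", "suspicious": "warn", "malicious": "block"}


def authorize_proceed(verdict: str, override_flag: bool, actions=ACTIONS) -> bool:
    """Decide whether a proceed request may continue to the destination.

    `verdict` must come from a fresh server-side resolve, never from the client.
    """
    action = actions.get(verdict, "block")  # unknown tiers are treated as a hard block
    if action == "warn":
        return True
    if action == "block_override":
        return override_flag
    # "block" is a hard stop; other actions never reach the proceed path in this sketch.
    return False
```

Replaying a proceed request against a hard-blocked link returns `False` every time, because nothing the client sends can change the recomputed action.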

Redirect detection

Attackers increasingly chain through reputable hosts (storage.googleapis.com, firebasestorage.googleapis.com, *.web.app, Azure *.blob.core.windows.net, Cloudflare *.r2.dev / *.pages.dev, and classic open redirectors like google.com/url?q=…) because the host reputation is clean, so a string-only check passes the link through. Link Guard adds three layers to catch this "living off trusted services" pattern.

Open-redirect extraction (always on, no fetch)

The engine scans a link's query and fragment (raw, once-, and twice-URL-decoded) for an embedded http(s):// target on a different host than the redirector — the real destination of …/url?q=https://evil.example. That embedded target is re-run through the verdict pipeline (string-only); if it is suspicious or malicious, the original link inherits the verdict (source = redirect, detail open-redirect -> <host>). A benign embedded URL is a no-op. No outbound request is made.
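
A rough sketch of that scan, assuming a simple regular expression is enough to spot the embedded target; the function name and the pattern are hypothetical, and the real engine may parse parameters differently.

```python
import re
from typing import Optional
from urllib.parse import unquote, urlsplit

# Assumed pattern for an embedded absolute URL; stops at common delimiters.
_EMBEDDED = re.compile(r"https?://[^\s&#<>]+", re.IGNORECASE)


def extract_embedded_target(url: str) -> Optional[str]:
    """Return the first embedded http(s) URL on a different host, if any."""
    parts = urlsplit(url)
    outer_host = (parts.hostname or "").lower()
    raw = parts.query + "#" + parts.fragment
    once = unquote(raw)
    twice = unquote(once)
    # Check the raw, once-decoded, and twice-decoded forms in turn.
    for text in (raw, once, twice):
        for match in _EMBEDDED.finditer(text):
            target = match.group(0)
            try:
                inner_host = (urlsplit(target).hostname or "").lower()
            except ValueError:
                continue  # malformed candidate; ignore it
            if inner_host and inner_host != outer_host:
                return target
    return None
```

The returned target would then be passed back through the string-only layers; no request is made at any point in this function.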

Abused-host heuristic (always on, no fetch)

Cloud-storage and app-hosting hosts commonly abused to host or bounce to phishing are flagged suspicious (warn) in the heuristics layer, so the user gets the interstitial and the resolved host instead of a silent redirect. Gated by flag_cloud_storage (default on). The match is suffix-based (host == suffix or host ends with .suffix).
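
The suffix rule is small enough to show directly. In this sketch the function name and the three sample suffixes are illustrative; only the matching rule itself (exact host, or host ending in a dot plus the suffix) comes from the description above.

```python
# A few sample suffixes for illustration; not the shipped baseline.
SAMPLE_SUFFIXES = ("storage.googleapis.com", "web.app", "r2.dev")


def is_abused_host(host: str, suffixes=SAMPLE_SUFFIXES) -> bool:
    """True if host equals a suffix or is a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == s or host.endswith("." + s) for s in suffixes)
```

Requiring the leading dot keeps unrelated hosts such as `notweb.app` from matching `web.app`.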

The built-in list is a curated baseline — there is no authoritative machine-readable feed of "abused hosting platforms" (the actual-bad URLs are what the URLhaus / OpenPhish feeds cover). It ships on and grows with each release; the current set is drawn from public abuse research (Trustwave SpiderLabs, Proofpoint, Netskope, Phishing.Database) and covers object storage (storage.googleapis.com, firebasestorage.googleapis.com, *.firebaseapp.com, *.web.app, *.appspot.com, *.blob.core.windows.net, s3.amazonaws.com), edge/static-site hosting (*.r2.dev, *.pages.dev, *.workers.dev, github.io, *.netlify.app, *.vercel.app, *.herokuapp.com, *.onrender.com, surge.sh), free site builders (*.weebly.com, *.wixsite.com, 000webhostapp.com), and tunneling services (*.trycloudflare.com, *.ngrok.io, *.ngrok-free.app).

Two ways to tune it without a rebuild:

  • Add hosts via the extra_abused_hosts setting (a textarea in the UI; one host per line or comma-separated, *. prefix optional). The engine unions these with the baseline — react to a new abuse pattern immediately.
  • Suppress a host you trust by adding it to the admin URL allowlist — the allowlist wins over the heuristic, so e.g. a company that legitimately serves files from storage.googleapis.com/<your-bucket> can allow that host/path.
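
As a sketch of how the extra_abused_hosts textarea might be normalized before it is unioned with the baseline: the function name, the two-entry baseline, and the lower-casing step are assumptions, while the accepted separators and the optional `*.` prefix follow the setting's description.

```python
import re


def parse_extra_hosts(text: str) -> set[str]:
    """Split on newlines or commas, drop an optional '*.' prefix, skip blanks."""
    hosts = set()
    for item in re.split(r"[,\r\n]+", text):
        host = item.strip().lower()
        if host.startswith("*."):
            host = host[2:]
        if host:
            hosts.add(host)
    return hosts


# Hypothetical baseline; the shipped list is much longer.
BASELINE = {"web.app", "r2.dev"}
effective = BASELINE | parse_extra_hosts("*.Example-Tunnel.dev\nfiles.example, r2.dev")
```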

Guarded redirect-follow (optional, the one fetch layer)

When follow_redirects is enabled, the engine — at click time, after the string layers — follows the HTTP 30x redirect chain and verdicts the real destination, catching a trusted-host link that issues a server-side redirect to phishing (the storage.googleapis.com → phishing case). This is the only layer that makes an outbound request, so every hop is fenced (_safe_to_fetch):

  • http/https only, and only a standard web port (80/443).
  • Each hop's host must resolve to public IPs only — any answer in a loopback / RFC1918 / link-local / reserved / multicast range aborts the follow. This stops the follower being used as an SSRF pivot into the internal Docker network.
  • HEAD only, no body downloaded; bounded by follow_max_hops (default 5), a short per-hop timeout, and a loop guard.
  • No cookies/credentials sent; neutral User-Agent; Referrer-Policy: no-referrer.

Each followed hop is verdicted string-only (a follow never recurses into another follow). If the chain reaches a suspicious/malicious destination, the link inherits that verdict (source = redirect, detail redirect chain -> <final host>). A follow failure (timeout, guard stop, HEAD not allowed) aborts only the follow: the link keeps whatever verdict the string layers produced, and a click is never blocked by a follow error.
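
The per-hop guard can be sketched with the standard library alone. This is not the engine's `_safe_to_fetch`; it is a minimal illustration of the scheme, port, and public-IP checks listed above, with an injectable resolver (an assumption made here so the check can be exercised without network access).

```python
import ipaddress
import socket
from typing import Callable, Iterable
from urllib.parse import urlsplit


def system_resolve(host: str) -> list[str]:
    """Every address the system resolver returns for host."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return [info[4][0] for info in infos]


def is_public_ip(addr: str) -> bool:
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return False
    # is_global excludes loopback, RFC1918, link-local, and reserved ranges.
    return ip.is_global and not ip.is_multicast


def safe_to_fetch(url: str, resolve: Callable[[str], Iterable[str]] = system_resolve) -> bool:
    """Allow a hop only for http/https, ports 80/443, and all-public answers."""
    try:
        parts = urlsplit(url)
        port = parts.port
    except ValueError:
        return False
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    default_port = 443 if parts.scheme == "https" else 80
    if (port if port is not None else default_port) not in (80, 443):
        return False
    try:
        addrs = list(resolve(parts.hostname))
    except OSError:
        return False
    # A single non-public answer aborts the hop.
    return bool(addrs) and all(is_public_ip(a) for a in addrs)
```

Note that this check resolves the name separately from the later connection, so it shares the resolve-then-connect window the SSRF posture paragraph below describes.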

SSRF posture. Steps 1–7 never fetch the target. Enabling follow_redirects is a deliberate trade: it resolves redirect chains a string check cannot, at the cost of one guarded outbound request per uncached click (latency) and a controlled egress surface. The residual DNS-rebinding window (resolve-then-connect) is accepted for this release. Leave it off to preserve the zero-fetch guarantee; the two no-fetch layers above still run.

Out of scope for this release: following JavaScript or <meta> refresh redirects, which require fetching and parsing the page body — tracked as a later enhancement.

Tokens — stateful v2 (default) with stateless v1 fallback

v2 — stateful (default). The token is just 2.<128-bit opaque id>. The milter writes the mapping id → {original_url, recipient_domain, expiry} to a shared SQLite store (url_map.db) on the linkguard_data volume; the Link Guard container reads it. Because the token itself is tiny, there is no link-length limit — every link is protected regardless of how long the original URL is. This closes the v1 over-length fail-open gap (see below).
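
A minimal sketch of the stateful scheme, assuming a single-table SQLite layout. The table name, column names, and function names are hypothetical; only the `2.<128-bit opaque id>` token shape and the stored fields come from the description above.

```python
import secrets
import sqlite3
import time
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    db = sqlite3.connect(path)
    # Assumed schema; the real url_map.db layout is not documented here.
    db.execute(
        "CREATE TABLE IF NOT EXISTS url_map ("
        "id TEXT PRIMARY KEY, original_url TEXT, "
        "recipient_domain TEXT, expiry INTEGER)"
    )
    return db


def mint_v2(db: sqlite3.Connection, url: str, recipient_domain: str, ttl_days: int = 14) -> str:
    token_id = secrets.token_hex(16)  # 16 random bytes = 128 bits
    expiry = int(time.time()) + ttl_days * 86400
    db.execute("INSERT INTO url_map VALUES (?, ?, ?, ?)",
               (token_id, url, recipient_domain, expiry))
    db.commit()
    return "2." + token_id


def lookup_v2(db: sqlite3.Connection, token: str, now: Optional[int] = None) -> Optional[str]:
    version, _, token_id = token.partition(".")
    if version != "2":
        return None
    row = db.execute("SELECT original_url, expiry FROM url_map WHERE id = ?",
                     (token_id,)).fetchone()
    now = int(time.time()) if now is None else now
    if row is None or row[1] < now:
        return None  # unknown id or expired mapping
    return row[0]
```

The token length is fixed regardless of the original URL's length, which is the property that removes the over-length gap.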

v1 — stateless (fallback + in-flight). The token is a self-contained HMAC signature: 1.<recipient_domain>.<url>.<expiry>.<signature>. The milter mints a v1 token if the shared store is unavailable (e.g. off-box deployment, or transient DB contention), so mail flow never depends on the store. v1 tokens already in delivered mailboxes continue to verify until they age out via the token TTL.
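
For illustration, a self-contained token in the documented field order could be minted and verified as follows. Several details are assumptions rather than the behavior of lg_token.py: HMAC-SHA256 as the algorithm, base64url encoding of the domain and URL fields (so that dots inside a URL cannot collide with the field separator), and the function names.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(key: bytes, payload: str) -> str:
    return _b64(hmac.new(key, payload.encode("utf-8"), hashlib.sha256).digest())


def mint_v1(key: bytes, url: str, recipient_domain: str, expiry: int) -> str:
    payload = ".".join(["1", _b64(recipient_domain.encode()), _b64(url.encode()), str(expiry)])
    return payload + "." + _sign(key, payload)


def verify_v1(key: bytes, token: str, now: Optional[int] = None) -> Optional[str]:
    """Return the original URL if the signature and expiry check out."""
    parts = token.split(".")
    if len(parts) != 5 or parts[0] != "1":
        return None
    payload, sig = ".".join(parts[:4]), parts[4]
    # Constant-time comparison on bytes avoids timing leaks.
    if not hmac.compare_digest(_sign(key, payload).encode(), sig.encode("utf-8")):
        return None
    now = int(time.time()) if now is None else now
    if int(parts[3]) < now:
        return None
    return _unb64(parts[2]).decode("utf-8")
```

Because the URL travels inside the token, the token grows with the URL, which is why this path needs a length bound and the stateful path does not.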

The milter's mint/verify logic is a byte-for-byte mirror of the container's lg_token.py, so the container verifies exactly what the milter mints. The url_map.db store uses a rollback journal (not WAL), so the container can read it cross-container without a -shm file.

Why v2 exists. Under v1, a URL longer than the inline cap was left unprotected (the original link was passed through unrewritten). An attacker could pad a URL past the cap to dodge Link Guard entirely. v2's short opaque id removes the length dependency, so nothing is ever skipped. The max_inline_url setting is now a fallback-only bound for the v1 path.

restore_outbound (default on) unwraps Link Guard tokens back to the original URLs on outbound mail — when a recipient replies to or forwards a protected message, the quoted history shows the real links again, not /lg/?t=... redirects. This keeps conversations readable and prevents Hermes redirect URLs from leaking to external parties. (Microsoft 365 was verified not to strip the tokens on manual replies, so restoration is the correct default.)

HMAC key rotation

The signing key for v1 tokens is rotatable from the console. Rotation keeps a current + previous overlap: newly minted tokens use the current key, while tokens signed with the previous key still verify until they age out. The teardown on a Pro license lapse blanks only the dispatch maps and never the keys, so in-flight links keep resolving and a renew resumes minting with the same key.
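
The current + previous overlap can be sketched as a verifier that tries both keys. The helper names and the HMAC-SHA256 choice are assumptions; the point of the sketch is only the rotation behavior described above.

```python
import hashlib
import hmac
from typing import Optional


def sign(key: bytes, payload: bytes) -> str:
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_with_overlap(payload: bytes, signature: str,
                        current: bytes, previous: Optional[bytes]) -> bool:
    """Accept a signature made with either the current or the previous key."""
    for key in (current, previous):
        if key and hmac.compare_digest(sign(key, payload).encode(),
                                       signature.encode("utf-8")):
            return True
    return False


def rotate(current: bytes, new_key: bytes) -> tuple[bytes, bytes]:
    """Return (current, previous) after rotation: the old current becomes previous."""
    return new_key, current
```

After one rotation a token signed with the old key still verifies; after a second rotation that key has dropped out of the pair and the token no longer does.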

Settings reference

Settings live in the parameters2 table under module = 'linkguard' (not system_settings). On save they are pushed to the engine via POST /api/config.

| Setting | Default | Meaning |
|---|---|---|
| enabled | 0 | Master on/off for Link Guard |
| redirect_base_url | (console host) | Public base URL for /lg/ links |
| action_clean | redirect | Action for clean verdicts |
| action_suspicious | warn | Action for suspicious verdicts |
| action_malicious | block | Action for malicious verdicts |
| restore_outbound | 1 | Unwrap tokens on outbound replies/forwards |
| token_ttl_days | 14 | How long a rewritten link stays valid |
| max_inline_url | 4000 | Fallback-only length bound for v1 stateless tokens |
| rate_limit_per_min | 120 | Per-client-IP rate limit on /lg/ |
| flag_cloud_storage | 1 | Flag abused cloud-storage/redirector hosts as suspicious (warn) |
| extra_abused_hosts | (empty) | Operator additions to the abused-host baseline (one per line / comma-separated) |
| follow_redirects | 0 | Follow 30x redirect chains at click time (guarded outbound fetch) |
| follow_max_hops | 5 | Max hops to follow when follow_redirects is on |
| cache_ttl_clean_hours | 24 | Verdict cache lifetime (clean) |
| cache_ttl_suspicious_hours | 6 | Verdict cache lifetime (suspicious) |
| cache_ttl_malicious_hours | 168 | Verdict cache lifetime (malicious) |
| feed_urlhaus_enabled | 1 | Pull the URLhaus blocklist feed |
| feed_openphish_enabled | 1 | Pull the OpenPhish blocklist feed |
| feed_refresh_minutes | 60 | Feed refresh interval |
| gsb_enabled / gsb_api_key | 0 / (empty) | Google Safe Browsing lookups (optional key) |
| vt_enabled / vt_api_key | 0 / (empty) | VirusTotal lookups (optional key) |
| clicks_retention_days | 90 | Click-log retention for reporting |

Two additional console-managed lists drive the verdict pipeline:

  • Protected recipient domains (linkguard_domains) — which recipient domains have their inbound links rewritten. A _default catch-all entry protects all domains.
  • URL allow / block rules (linkguard_url_rules) — operator allow/block patterns that take precedence over feeds and heuristics (layers 1–2 above).

Reputation feeds and optional API lookups

  • URLhaus and OpenPhish are pulled on the feed_refresh_minutes interval into the container's SQLite store and matched as exact URL-hash lookups (a phishing URL on a shared host blocks only that URL, not the whole host).
  • Google Safe Browsing and VirusTotal are off by default; enable each and supply an API key to add a string-reputation layer. Results are cached per the cache-TTL settings to limit API calls.
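
The exact URL-hash match can be sketched with an in-memory SQLite table. The table layout, the SHA-256 choice, and the absence of URL normalization are assumptions made for illustration; the engine's actual hashing and schema are not described here.

```python
import hashlib
import sqlite3
from typing import Iterable, Optional


def url_hash(url: str) -> str:
    # Assumed: hash the URL string as-is (trimmed), with no further normalization.
    return hashlib.sha256(url.strip().encode("utf-8")).hexdigest()


def load_feed(db: sqlite3.Connection, source: str, urls: Iterable[str]) -> None:
    db.execute("CREATE TABLE IF NOT EXISTS feed (url_hash TEXT PRIMARY KEY, source TEXT)")
    db.executemany("INSERT OR REPLACE INTO feed VALUES (?, ?)",
                   [(url_hash(u), source) for u in urls])
    db.commit()


def feed_verdict(db: sqlite3.Connection, url: str) -> Optional[tuple[str, str]]:
    """Return ('malicious', source) on an exact hit; None means no opinion."""
    row = db.execute("SELECT source FROM feed WHERE url_hash = ?",
                     (url_hash(url),)).fetchone()
    return ("malicious", row[0]) if row else None
```

A miss returns `None` rather than "clean", matching the rule that feeds only ever escalate, and a different path on the same host is unaffected by a listed URL.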

Branded interstitials

The warning and block pages are served by the container (templates.py) and carry Hermes SEG branding — an inline logo, a "Hermes SEG Link Guard" header, and a footer link to hermesseg.io — rather than a generic browser error. The warning page shows the resolved host so a user can judge a shortened or suspicious link before proceeding.

Reporting and diagnostics

The admin page includes:

  • Check a URL — enter any URL to see the live verdict, which pipeline layer decided it, and the resolved host. This is side-effect-free (verdict.resolve(cache_write=False)) so it does not pollute the cache.
  • Recent activity — a table of recent clicks (domain, host, verdict, action) from the click log.
  • Troubleshooting commands — a collapsible card of docker exec one-liners for inspecting the scope map, store, and feeds.

Deployment — in-stack or separate host

Failure semantics

In every failure case the worst outcome is a missed rewrite or a fall-through verdict — never lost mail.

Files and data locations

| Path | Container | Contents |
|---|---|---|
| /etc/hermes/body_milter/linkguard/linkguard_by_recipient_domain | body_milter | Scope map: protected recipient domains (_default = all) |
| /var/lib/linkguard/url_map.db | body_milter (writer) / linkguard (reader) | v2 token id → original URL store, on the shared linkguard_data volume |
| /opt/linkguard/app/ | linkguard | Engine code (server, verdict, feeds, store, token, templates) |
| Operational SQLite store | linkguard | Verdict cache, feed entries, click log |

The scope map is mtime-watched by the milter and reloaded on the next message when it changes — no explicit milter reload step is needed after a console save.
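
An mtime-watched map of this kind can be sketched as below. The class name and the file format (one domain per line, first whitespace-separated field, `#` comments) are assumptions; the real map file may be laid out differently.

```python
import os


class ScopeMap:
    """Reloads the scope file lazily whenever its modification time changes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._mtime = None
        self._domains: set[str] = set()

    def _reload_if_changed(self) -> None:
        try:
            mtime = os.stat(self.path).st_mtime_ns
        except FileNotFoundError:
            # Missing map: nothing is in scope.
            self._mtime, self._domains = None, set()
            return
        if mtime != self._mtime:
            with open(self.path, encoding="utf-8") as fh:
                self._domains = {
                    line.split()[0].lower()
                    for line in fh
                    if line.strip() and not line.startswith("#")
                }
            self._mtime = mtime

    def is_protected(self, recipient_domain: str) -> bool:
        # Called per message, so a changed file is picked up on the next one.
        self._reload_if_changed()
        return "_default" in self._domains or recipient_domain.lower() in self._domains
```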

Security properties (summary)

  • SSRF-fenced — steps 1–7 never fetch the target; the optional redirect-follow (step 8) is the only fetch, and every hop is restricted to http/https + standard ports + hosts that resolve to public IPs only.
  • Hard blocks are unbypassable: /lg/proceed re-authorizes server-side.
  • Hardened port split — public click surface cannot push config or read keys.
  • Rate-limited public surface (rate_limit_per_min, per client IP).
  • Signature-safe — inbound S/MIME, PGP, and upstream-DKIM-signed mail is skipped, never re-bodied.
  • Mail-flow-safe — the container being down, off-box, or torn down on a license lapse never blocks delivery.