BehalfID Site Guard

MVP website-route access checks for AI agent and crawler signals. Check before access, deny by default, and log the decision.

What Site Guard is

Site Guard lets a website owner check a simple site rule before protected content or workflows are served. Permission passports answer whether an agent may act for a user; Site Guard answers whether an AI agent or crawler signal may access a website route where the site installed the check.

This MVP is a policy endpoint, not a reverse proxy. It relies on your middleware, worker, gateway, or route code to call BehalfID and honor denied decisions.

Site Guard is not a replacement for application authentication. It is a pre-access policy check for AI agents and crawlers. Your app still enforces user auth, authorization, sessions, permissions, and route access controls.

Site keys (recommended)

Create a site key (bhf_site_...) from the site detail page in your dashboard. Each site key is scoped to a single site and cannot be used to check any other site. Send the key in an Authorization: Bearer header and omit siteId from the request body.

request
POST /api/site-guard/check
Authorization: Bearer bhf_site_xxx
Content-Type: application/json

{
  "path": "/docs/api",
  "userAgent": "ExampleBot/1.0",
  "agentIdentifier": "crawler_example"
}

response
{
  "allowed": true,
  "reason": "Path allowed by an active Site Guard rule.",
  "requestId": "req_xxx",
  "matchedRuleId": "sgr_xxx",
  "siteId": "site_xxx"
}

Set SITE_GUARD_KEY=bhf_site_xxx in your environment. The raw key is shown only once at creation time. Store it in a secret manager or environment variable — it cannot be retrieved again.
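
The request and response shapes in the example above can be captured as TypeScript types. This is a minimal sketch, not a published schema: the field names come from the example, while the optionality of each field and the helper names buildCheckBody and isExplicitAllow are assumptions made for illustration.

```typescript
// Shapes inferred from the example request and response.
// Which fields are optional is an assumption, not a documented contract.
interface SiteGuardCheckRequest {
  path: string;
  userAgent: string;
  agentIdentifier?: string;
}

interface SiteGuardDecision {
  allowed: boolean;
  reason?: string;
  requestId?: string;
  matchedRuleId?: string;
  siteId?: string;
}

// Hypothetical helper: build the body for a site-key request.
// siteId is deliberately absent because the site key carries the site scope.
function buildCheckBody(
  path: string,
  userAgent: string | null | undefined,
  agentIdentifier?: string | null,
): SiteGuardCheckRequest {
  const body: SiteGuardCheckRequest = { path, userAgent: userAgent ?? "unknown" };
  if (agentIdentifier) body.agentIdentifier = agentIdentifier;
  return body;
}

// Hypothetical helper: true only when the payload is an object whose
// `allowed` field is exactly the boolean true. Everything else is a deny.
function isExplicitAllow(payload: unknown): payload is SiteGuardDecision {
  return (
    typeof payload === "object" &&
    payload !== null &&
    (payload as { allowed?: unknown }).allowed === true
  );
}
```

Checking for the literal value true, rather than any truthy value, keeps a malformed or partial response from being read as permission.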

Developer token (legacy)

Developer tokens (bhf_dev_...) sent in the x-developer-token header are still accepted for backwards compatibility, but their scope is broader than website middleware needs. With a developer token, the request body must name the site in siteId. Prefer site keys for new integrations.

request
POST /api/site-guard/check
x-developer-token: bhf_dev_xxx
Content-Type: application/json

{
  "siteId": "site_xxx",
  "path": "/docs/api",
  "userAgent": "ExampleBot/1.0",
  "agentIdentifier": "crawler_example"
}
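
During a migration from developer tokens to site keys, the two header conventions above could be selected from the credential prefix. This is a sketch only: siteGuardAuthHeaders is a hypothetical helper, and it assumes the bhf_site_ and bhf_dev_ prefixes shown in this document are the only two credential forms.

```typescript
// Hypothetical helper: pick the request header from the credential prefix.
// Site keys go in Authorization: Bearer; developer tokens in x-developer-token.
function siteGuardAuthHeaders(credential: string): Record<string, string> {
  if (credential.startsWith("bhf_site_")) {
    return { Authorization: `Bearer ${credential}` };
  }
  if (credential.startsWith("bhf_dev_")) {
    return { "x-developer-token": credential };
  }
  // Unknown prefix: refuse to guess, so the caller can fail closed.
  throw new Error("Unrecognized BehalfID credential prefix.");
}
```

Remember that a request authenticated with a developer token also needs siteId in the body, while a site-key request omits it.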

Rule behavior

  • Deny by default: a route stays denied until an active matching rule allows its path.
  • Match simple signals: rules match an exact agent identifier or a wildcard User-Agent pattern.
  • Block before allow: a matching blocked path overrides any matching allowed path.
  • Log decisions: existing-site checks record safe decision metadata and a request ID.

Path patterns support exact paths and * wildcards, such as /docs/api or /docs/*. User-Agent is a weak signal and is not proof of provider identity.
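
To make the ordering of these rules concrete, here is a local sketch of path evaluation. It is an illustration, not BehalfID's implementation: the SketchRule shape and function names are hypothetical, it covers path patterns only (not agent identifier or User-Agent matching), and it assumes * matches any run of characters, including slashes.

```typescript
// Illustrative rule shape; field names are assumptions, not the BehalfID schema.
interface SketchRule {
  active: boolean;
  effect: "allow" | "block";
  pathPattern: string; // an exact path, or a pattern containing * wildcards
}

// Turn "/docs/*" into an anchored RegExp. Literal segments are escaped and
// each * becomes ".*" (assumed semantics: any characters, slashes included).
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

function evaluatePath(rules: SketchRule[], path: string): boolean {
  // Only active rules whose pattern matches the path take part.
  const matching = rules.filter(
    (rule) => rule.active && patternToRegExp(rule.pathPattern).test(path),
  );
  // Block before allow: any matching block rule wins.
  if (matching.some((rule) => rule.effect === "block")) return false;
  // Deny by default: allowed only if some matching rule allows.
  return matching.some((rule) => rule.effect === "allow");
}
```

Under these assumptions, an allow rule for /docs/* combined with a block rule for /docs/internal/* allows /docs/api and denies /docs/internal/notes, and a path matched by no active rule is denied. Note that in this sketch /docs/* does not match the bare path /docs.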

Next.js middleware

Place middleware.ts at the project root (same level as app/). It runs server-side before any route handler. See examples/site-guard-nextjs/ for the full example.

middleware.ts
import { NextResponse, type NextRequest } from "next/server";

const GUARDED_PREFIXES = ["/docs", "/admin"];

export async function middleware(request: NextRequest) {
  const { pathname } = request.nextUrl;

  // Skip Next.js internals and static assets.
  if (pathname.startsWith("/_next/")) return NextResponse.next();
  if (!GUARDED_PREFIXES.some((p) => pathname.startsWith(p))) {
    return NextResponse.next();
  }

  let decision;
  try {
    const r = await fetch(
      `${process.env.BEHALFID_BASE_URL}/api/site-guard/check`,
      {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          // Site key — server-side only, never sent to the browser.
          Authorization: `Bearer ${process.env.SITE_GUARD_KEY}`,
        },
        body: JSON.stringify({
          path: pathname,
          userAgent: request.headers.get("user-agent") ?? "unknown",
          agentIdentifier: request.headers.get("behalfid-agent") ?? undefined,
          // no siteId — the site key already encodes the site
        }),
      },
    );
    // Fail closed on non-2xx.
    if (!r.ok) return new NextResponse("Site Guard unavailable.", { status: 403 });
    decision = await r.json();
  } catch {
    // Fail closed on network error.
    return new NextResponse("Site Guard unavailable.", { status: 403 });
  }

  // Serve only on an explicit allowed: true; a missing or malformed body is a deny.
  if (decision?.allowed !== true) {
    return new NextResponse(decision?.reason ?? "Denied by Site Guard.", { status: 403 });
  }
  return NextResponse.next();
}

export const config = { matcher: ["/docs/:path*", "/admin/:path*"] };

Express middleware

Call siteGuard() before the route handler. The middleware responds 403 on deny or error without calling next(). See examples/site-guard-express/ for the full example.

src/siteGuard.ts
import type { Request, Response, NextFunction } from "express";

export function siteGuard() {
  return async (req: Request, res: Response, next: NextFunction) => {
    const key = process.env.SITE_GUARD_KEY;
    // Fail closed — cannot verify without a key.
    if (!key) { res.status(403).send("SITE_GUARD_KEY not configured."); return; }

    let decision;
    try {
      const r = await fetch(
        `${process.env.BEHALFID_BASE_URL}/api/site-guard/check`,
        {
          method: "POST",
          headers: {
            "Content-Type": "application/json",
            Authorization: `Bearer ${key}`,
          },
          body: JSON.stringify({
            path: req.path,
            userAgent: req.headers["user-agent"] ?? "unknown",
            agentIdentifier: req.headers["behalfid-agent"],
            // no siteId — the site key already encodes the site
          }),
        },
      );
      if (!r.ok) { res.status(403).send("Site Guard error."); return; }
      decision = await r.json();
    } catch {
      res.status(403).send("Site Guard unavailable."); return;
    }

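    // Serve only on an explicit allowed: true; a missing or malformed body is a deny.
    if (decision?.allowed !== true) {
      res.status(403).send(decision?.reason ?? "Denied by Site Guard.");
      return;
    }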
    next(); // allowed — route handler runs
  };
}

// Usage:
// app.get("/docs/:slug", siteGuard(), docsHandler);
// app.get("/admin/:page", siteGuard(), adminHandler);

Fail-closed rules

Every integration point must fail closed. A route must not be served unless Site Guard explicitly returns allowed: true.

  • SITE_GUARD_KEY not set: respond 403 and do not serve the route.
  • Network error or timeout: respond 403 and do not serve the route.
  • BehalfID returns non-2xx: respond 403 and do not serve the route.
  • decision.allowed === false: respond 403 and do not serve the route.
  • decision.allowed === true: allow and let the route handler run.

SITE_GUARD_KEY is server-side only. Never import a module that reads it from a Client Component or from any other module that ends up in the browser bundle. When using a site key, omit siteId and domain from the request body: the key already encodes the site scope, and a body-provided value cannot override it.
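
The fail-closed cases above can be gathered into one function whose only path to serving a route is an explicit allowed: true. The following is a sketch under stated assumptions: checkFailClosed, Verdict, and FetchLike are hypothetical names, the 2000 ms timeout is an arbitrary placeholder, and the fetch implementation is passed in so the function can be exercised without a network. It relies on the standard AbortSignal.timeout() to turn a slow response into an error.

```typescript
type Verdict =
  | { serve: true }
  | { serve: false; status: 403; message: string };

type FetchLike = (url: string, init: RequestInit) => Promise<Response>;

async function checkFailClosed(
  fetchImpl: FetchLike,
  baseUrl: string,
  key: string | undefined,
  body: { path: string; userAgent: string; agentIdentifier?: string },
  timeoutMs = 2000, // assumed budget; tune for your deployment
): Promise<Verdict> {
  const deny = (message: string): Verdict => ({ serve: false, status: 403, message });

  // Missing key: cannot verify, so deny.
  if (!key) return deny("SITE_GUARD_KEY not configured.");

  try {
    const response = await fetchImpl(`${baseUrl}/api/site-guard/check`, {
      method: "POST",
      headers: { "Content-Type": "application/json", Authorization: `Bearer ${key}` },
      body: JSON.stringify(body),
      // Aborts the request after timeoutMs; the rejection lands in catch below.
      signal: AbortSignal.timeout(timeoutMs),
    });
    // Non-2xx: deny.
    if (!response.ok) return deny("Site Guard error.");

    const decision: unknown = await response.json();
    const allowed =
      typeof decision === "object" &&
      decision !== null &&
      (decision as { allowed?: unknown }).allowed === true;
    // Anything other than an explicit allowed: true is a deny.
    return allowed ? { serve: true } : deny("Denied by Site Guard.");
  } catch {
    // Network error, timeout, or unparseable body: deny.
    return deny("Site Guard unavailable.");
  }
}
```

A caller would pass the global fetch, process.env.BEHALFID_BASE_URL, and process.env.SITE_GUARD_KEY, then respond 403 unless the verdict has serve set to true.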

Test with curl

allowed path
curl https://behalfid.com/api/site-guard/check \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SITE_GUARD_KEY" \
  -d '{"path": "/docs/getting-started", "userAgent": "ExampleBot/1.0"}'

blocked path
curl https://behalfid.com/api/site-guard/check \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SITE_GUARD_KEY" \
  -d '{"path": "/admin/settings", "userAgent": "ExampleBot/1.0"}'

Middleware sketch (raw)

middleware.ts
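// Framework-neutral sketch using the Fetch API Request and Response types.
// siteGuardCheck is an illustrative name. It returns a 403 Response on deny,
// or undefined when the route may be served.
export async function siteGuardCheck(request: Request): Promise<Response | undefined> {
  const denied = new Response("Denied by Site Guard.", { status: 403 });
  try {
    const response = await fetch(`${process.env.BEHALFID_BASE_URL}/api/site-guard/check`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": `Bearer ${process.env.SITE_GUARD_KEY}`
      },
      body: JSON.stringify({
        path: new URL(request.url).pathname,
        userAgent: request.headers.get("user-agent") ?? "unknown",
        agentIdentifier: request.headers.get("behalfid-agent") ?? undefined
        // no siteId: the key already encodes the site
      })
    });
    // Fail closed on non-2xx or on anything other than an explicit allowed: true.
    if (!response.ok || (await response.json())?.allowed !== true) return denied;
  } catch {
    // Fail closed on network or parse errors.
    return denied;
  }
  return undefined;
}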

Logs and limits

  • Existing-site checks log the request ID, site, matched rule (if any), path, signals, result, reason, risk, and timestamp.
  • Logs do not store cookies, auth headers, tokens, query strings, page content, request bodies, or optional metadata.
  • The MVP has no crawler registry, provider-native identity, OAuth, billing, or advanced policy language.
  • Site Guard cannot block uninstrumented website traffic.