GhostGrid/ARCHITECTURE.md
Brückner 49cd0ae4f6 feat(caddy): optional root redirect per route
Add a redirect_path column to the caddy table and an optional 'root redirect'
field in the route form. When set, buildCaddyfile emits 'redir / <path>' so the
bare host (e.g. checkmk.domain.local/) redirects to a sub-path (e.g.
/monitoring/check_mk/) while every other path still passes through to the
backend — the safe pattern for apps like CheckMK that bake their site path into
absolute URLs. Defensive ALTER TABLE keeps existing databases working.
2026-06-10 10:22:39 +02:00


GhostGrid

Architecture Reference

Status: Living document — single source of truth for the codebase

Use this document as the starting context for any future task on GhostGrid. It describes the whole application: purpose, stack, file layout, data model, REST API, frontend structure, integrations, background jobs, security, and deployment.


1. Executive Summary

GhostGrid is an internal, offline-capable network-lab and device-inventory tool for managing hardware lab environments. Teams use it to keep a device inventory with live status, define lab templates (devices + topology), book labs for a time window, and automatically run Ansible playbooks at booking start/end. It pulls live device status from CheckMK and can manage Caddy reverse-proxy routes and Microsoft Entra ID SSO from its own UI.

Key Design Decisions

| Aspect | Choice | Rationale |
| --- | --- | --- |
| Scope | Single-tenant internal tool | Small team / lab operations, not multi-tenant SaaS |
| Process model | One Node.js process serving API + frontend | Simple to deploy, no orchestration needed |
| Backend | Express 4 + TypeScript | Minimal, well-understood, fast to iterate |
| Frontend | React 19 + Vite 6 | Modern SPA, no router dependency (tab state) |
| Database | SQLite (better-sqlite3, WAL) | Zero-ops, single file, synchronous, well suited to an air-gapped LAN |
| Styling | Tailwind CSS v4 | Utility-first, dark/light theme via class toggle |
| Auth | Local JWT + optional Microsoft Entra ID OAuth | Self-contained, SSO optional |
| Offline | Fonts bundled via @fontsource | No CDN / external runtime assets |
| Integrations | CheckMK, Ansible Semaphore, Caddy | All configured at runtime in the Settings UI (stored in DB) |
| Deployment | Proxmox LXC + systemd, two instances | Manual git pull && build && restart model |

Core constraint: runs fully offline. No external code, assets, or CDN resources are loaded at runtime.


2. System Architecture

2.1 High-Level Architecture

+-----------------------------------------------------------------------------+
|                              GHOSTGRID PLATFORM                              |
+-----------------------------------------------------------------------------+
|  +---------------------------------------------------------------------+    |
|  |                         PRESENTATION LAYER                          |    |
|  |  +-----------------------------+   +----------------------------+   |    |
|  |  |   React 19 SPA (Vite)       |   |  Browser localStorage      |   |    |
|  |  |   - Tab-based navigation    |   |  - ghostgrid_token (JWT)   |   |    |
|  |  |   - Tailwind dark/light     |   |  - ghostgrid_user          |   |    |
|  |  +-----------------------------+   +----------------------------+   |    |
|  +---------------------------------------------------------------------+    |
|                                     |  authFetch > Bearer <JWT>             |
|  +---------------------------------------------------------------------+    |
|  |                  APPLICATION LAYER (server.ts)                      |    |
|  |  Single Express process — serves API + frontend                    |    |
|  |  +-----------+ +-----------+ +-----------+ +-----------+ +--------+ |    |
|  |  |   Auth    | |  Devices  | |   Labs    | | Bookings  | |  Logs  | |    |
|  |  | (JWT/MSAL)| |   CRUD    | |   CRUD    | |   CRUD    | |        | |    |
|  |  +-----------+ +-----------+ +-----------+ +-----------+ +--------+ |    |
|  |  +-----------+ +-----------+ +-----------+ +-----------+           |    |
|  |  |   Users   | |   Links   | |  Settings | |   Caddy   |           |    |
|  |  +-----------+ +-----------+ +-----------+ +-----------+           |    |
|  |  +---------------------------------------------------------------+ |    |
|  |  |  Background jobs (self-rescheduling setTimeout loops)         | |    |
|  |  |   - CheckMK status sync (default 60s)                         | |    |
|  |  |   - Semaphore setup/teardown trigger (30s)                    | |    |
|  |  +---------------------------------------------------------------+ |    |
|  +---------------------------------------------------------------------+    |
|                                     |                                        |
|  +---------------------------------------------------------------------+    |
|  |                          DATA LAYER (server-db.ts)                  |    |
|  |  +-------------------------------------------------------------+   |    |
|  |  |          SQLite — ghostgrid.db  (better-sqlite3, WAL)        |   |    |
|  |  |  users · devices · labs · bookings · logs · links           |   |    |
|  |  |  settings · caddy                                           |   |    |
|  |  +-------------------------------------------------------------+   |    |
|  +---------------------------------------------------------------------+    |
+-----------------------------------------------------------------------------+
|                          EXTERNAL INTEGRATIONS                               |
|   +-------------+  +------------------+  +-----------+  +----------------+    |
|   |   CheckMK   |  | Ansible Semaphore|  |   Caddy   |  | Microsoft      |    |
|   |  REST API   |  |    REST API      |  | Admin API |  | Entra ID (MSAL)|    |
|   | (status)    |  | (playbook tasks) |  | (/load)   |  | (OAuth 2.0)    |    |
|   +-------------+  +------------------+  +-----------+  +----------------+    |
+-----------------------------------------------------------------------------+

2.2 Component Breakdown

2.2.1 Presentation Layer

| Component | Technology | Purpose |
| --- | --- | --- |
| Web UI | React 19 + TypeScript | Dashboard, booking calendar, inventory, topology, settings |
| Build/dev server | Vite 6 | Bundles the SPA; mounted as Express middleware in dev |
| Session store | Browser localStorage | Persists JWT + user between reloads |

2.2.2 Application Layer (server.ts — single process)

| Route group | Responsibility |
| --- | --- |
| Auth | Local register/login (JWT), Microsoft Entra ID OAuth, /me, public /config |
| Devices | Inventory CRUD; delete also scrubs the device from labs |
| Labs | Lab-template CRUD; deviceIds/topology stored as JSON |
| Bookings | Reservation CRUD; cancellation can trigger Semaphore teardown |
| Logs | Audit/maintenance journal (read + manual create) |
| Users | Team list, edit, delete (self/last-user guarded) |
| Links | Shared quick-links dashboard CRUD |
| Settings | Integration config (Azure, CheckMK, Semaphore, Caddy); secrets masked |
| CheckMK | Manual sync trigger |
| Semaphore | Template-list proxy, manual setup/teardown trigger |
| Caddy | Status, route CRUD, Caddyfile push |
| Background jobs | CheckMK sync loop + Semaphore trigger loop |
| Static serving | Vite middleware (dev) / static dist/ + SPA fallback (prod) |

2.2.3 Data Layer (server-db.ts)

| Component | Technology | Purpose |
| --- | --- | --- |
| Database | SQLite via better-sqlite3 | Single file ghostgrid.db, WAL journal mode, synchronous queries |
| Schema | Idempotent CREATE TABLE IF NOT EXISTS | 8 tables defined in full and created on boot (fresh-install model, no migrations) |
| Settings store | key/value settings table | Runtime config for all integrations, seeded with INSERT OR IGNORE |

3. Technology Stack

3.1 Backend Stack

Node.js 20 LTS  (TypeScript ~5.8, ES modules)
+-- Web Framework
|   +-- express 4.21          (HTTP server, routing, JSON middleware)
|   +-- vite 6 (createServer) (dev middleware mode, SPA)
+-- Auth & Security
|   +-- jsonwebtoken 9        (JWT sign/verify, 24h expiry)
|   +-- bcryptjs 2.4          (password hashing, cost 10)
|   +-- @azure/msal-node 5    (Entra ID OAuth 2.0 auth-code flow)
+-- Data Access
|   +-- better-sqlite3 12     (synchronous SQLite driver, WAL)
+-- Utilities
|   +-- dotenv 17             (.env loading)
|   +-- (global fetch)        (CheckMK / Semaphore / Caddy HTTP calls)
+-- Build / Run
    +-- tsx 4                 (dev: run server.ts directly)
    +-- esbuild 0.25          (bundle server > dist/server.cjs)

3.2 Frontend Stack

React 19 + TypeScript
+-- State Management
|   +-- React hooks only      (useState/useEffect in App.tsx — no Redux/Zustand)
|   +-- localStorage          (token + user persistence via src/lib/auth.ts)
+-- UI / Styling
|   +-- tailwindcss 4         (@tailwindcss/vite plugin)
|   +-- lucide-react 0.546    (icon set)
+-- Fonts (self-hosted, offline)
|   +-- @fontsource/inter
|   +-- @fontsource/jetbrains-mono
+-- Networking
|   +-- fetch + authFetch()   (thin wrapper injecting Authorization header)
+-- Build Tools
    +-- vite 6 + @vitejs/plugin-react

3.3 Data Layer

SQLite (single file: ghostgrid.db, WAL mode)
+-- users            (local + Azure-provisioned accounts)
+-- devices          (inventory + CheckMK-synced status)
+-- labs             (templates: deviceIds[] + topology[] as JSON)
+-- bookings         (reservations + Ansible trigger flags/jobs)
+-- logs             (audit/maintenance journal)
+-- links            (shared quick-links dashboard)
+-- settings         (key/value runtime config for integrations)
+-- caddy            (custom reverse-proxy routes)

3.4 Infrastructure

Deployment
+-- Proxmox LXC (Debian 12, unprivileged)
+-- systemd services (ghostgrid + ghostgrid-dev)
+-- Two parallel instances (main:3000, dev:3001), separate DBs
+-- deploy/proxmox-ghostgrid.sh   (one-shot installer)
+-- deploy/deploy.sh <branch>     (git pull + build + restart)

Networking (optional, managed in-app)
+-- Caddy reverse proxy (local_certs, tls internal)
+-- Caddyfile pushed to Caddy Admin API /load

4. Database Schema Design

The database is a single SQLite file, ghostgrid.db, located in process.cwd() and opened with journal_mode = WAL. Every table is defined in full and created idempotently on boot (CREATE TABLE IF NOT EXISTS). The app assumes a fresh install, so there is no migration layer.

4.1 Schema (as created in server-db.ts)

CREATE TABLE IF NOT EXISTS users (
    id            TEXT PRIMARY KEY,
    name          TEXT NOT NULL,
    role          TEXT NOT NULL DEFAULT 'User',
    email         TEXT NOT NULL UNIQUE,
    password_hash TEXT NOT NULL          -- bcrypt; '' for Azure-provisioned users
);

CREATE TABLE IF NOT EXISTS devices (
    id             TEXT PRIMARY KEY,
    hostname       TEXT NOT NULL,
    ip             TEXT NOT NULL,
    location       TEXT NOT NULL,
    notes          TEXT,
    type           TEXT NOT NULL,        -- Switch | Firewall | Access-Point | Controller | custom
    status         TEXT NOT NULL,        -- online | offline | unknown
    emergencySheet TEXT NOT NULL,        -- markdown
    lastCheckedAt  TEXT,
    checkMkUrl     TEXT NOT NULL DEFAULT '',
    cmkHostname    TEXT NOT NULL DEFAULT ''
);

CREATE TABLE IF NOT EXISTS labs (
    id                          TEXT PRIMARY KEY,
    name                        TEXT NOT NULL,
    description                 TEXT NOT NULL,
    contactPerson               TEXT NOT NULL,
    location                    TEXT NOT NULL,
    deviceIds                   TEXT NOT NULL,   -- JSON string: string[]
    topology                    TEXT NOT NULL,   -- JSON string: TopologyLink[]
    semaphoreSetupTemplateId    TEXT NOT NULL DEFAULT '',
    semaphoreTeardownTemplateId TEXT NOT NULL DEFAULT ''
);

CREATE TABLE IF NOT EXISTS bookings (
    id                       TEXT PRIMARY KEY,
    labId                    TEXT NOT NULL,
    userId                   TEXT NOT NULL,
    startDateTime            TEXT NOT NULL,
    endDateTime              TEXT NOT NULL,
    notes                    TEXT,
    status                   TEXT NOT NULL,   -- active | upcoming | completed | cancelled
    notified                 INTEGER NOT NULL DEFAULT 0,
    emailSent                INTEGER NOT NULL DEFAULT 0,
    ansibleSetupTriggered    INTEGER NOT NULL DEFAULT 0,
    ansibleTeardownTriggered INTEGER NOT NULL DEFAULT 0,
    ansibleSetupJobId        TEXT NOT NULL DEFAULT '',
    ansibleTeardownJobId     TEXT NOT NULL DEFAULT ''
);

CREATE TABLE IF NOT EXISTS logs (
    id        TEXT PRIMARY KEY,
    timestamp TEXT NOT NULL,
    type      TEXT NOT NULL,   -- maintenance | booking | status | system
    message   TEXT NOT NULL,
    deviceId  TEXT,
    userId    TEXT
);

CREATE TABLE IF NOT EXISTS links (
    id          TEXT PRIMARY KEY,
    title       TEXT NOT NULL,
    url         TEXT NOT NULL,
    description TEXT NOT NULL DEFAULT '',
    category    TEXT NOT NULL DEFAULT '',
    color       TEXT NOT NULL DEFAULT 'emerald',
    createdBy   TEXT,
    createdAt   TEXT NOT NULL
);

CREATE TABLE IF NOT EXISTS settings (
    key        TEXT PRIMARY KEY,
    value      TEXT NOT NULL,
    updated_at TEXT DEFAULT (datetime('now'))
);

CREATE TABLE IF NOT EXISTS caddy (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    hostname      TEXT NOT NULL,
    upstream      TEXT NOT NULL,
    tls           INTEGER NOT NULL DEFAULT 1,
    compress      INTEGER NOT NULL DEFAULT 1,
    redirect_path TEXT NOT NULL DEFAULT '',   -- optional 'redir / <path>' for the bare root
    created_at    TEXT DEFAULT (datetime('now'))
);

4.2 Data-Model Notes

| Topic | Detail |
| --- | --- |
| IDs | App-generated strings: `${prefix}-${Date.now()}-${rand}` (dev-…, lab-…, book-…, log-…, u-…, link-…); caddy uses an autoincrement integer |
| JSON columns | labs.deviceIds and labs.topology are JSON strings, parsed in the API layer |
| Booleans | Booking flags (notified, emailSent, ansible*Triggered) are INTEGER 0/1, mapped to JS booleans on read |
| Cascade | No FK cascades; referential cleanup is done in code (e.g. deleting a device scrubs it from every lab's deviceIds/topology) |
| Schema changes | Fresh-install model: edit the CREATE TABLE block in server-db.ts directly; there is no migration helper |
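
The conventions in this table can be illustrated with a small sketch. The helper names (`makeId`, `parseLabRow`, `scrubDevice`, `toBool`) and the `LabRow` shape are hypothetical and not taken from server.ts; the format of the random suffix is also an assumption, since the table only says `rand`.

```typescript
// Hypothetical row shape: deviceIds/topology arrive from SQLite as JSON strings.
interface LabRow { id: string; name: string; deviceIds: string; topology: string; }
interface TopologyLink { fromDevice: string; toDevice: string; type: string; }
interface Lab { id: string; name: string; deviceIds: string[]; topology: TopologyLink[]; }

// ID pattern from the table: `${prefix}-${Date.now()}-${rand}`.
// Assumption: rand is a short base-36 string (up to 6 characters).
function makeId(prefix: string): string {
  const rand = Math.random().toString(36).slice(2, 8);
  return `${prefix}-${Date.now()}-${rand}`;
}

// Parse the two JSON columns into real arrays in the API layer.
function parseLabRow(row: LabRow): Lab {
  return {
    id: row.id,
    name: row.name,
    deviceIds: JSON.parse(row.deviceIds) as string[],
    topology: JSON.parse(row.topology) as TopologyLink[],
  };
}

// Code-level "cascade": drop a deleted device from the lab's device list
// and from every topology link that references it.
function scrubDevice(lab: Lab, deviceId: string): Lab {
  return {
    ...lab,
    deviceIds: lab.deviceIds.filter((id) => id !== deviceId),
    topology: lab.topology.filter(
      (link) => link.fromDevice !== deviceId && link.toDevice !== deviceId,
    ),
  };
}

// SQLite INTEGER 0/1 flag mapped to a JS boolean on read.
const toBool = (flag: number): boolean => flag === 1;
```

Before writing a scrubbed lab back, the arrays would be serialized again with `JSON.stringify`.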

4.3 Settings (key/value config)

Seeded with INSERT OR IGNORE (defaults below). Secret keys are never returned raw — maskSettings() replaces them with the __SET__ sentinel.

| Group | Keys (default) |
| --- | --- |
| Azure | azure_enabled(false), azure_client_id, azure_tenant_id, azure_client_secret🔒, azure_redirect_uri, azure_allowed_group |
| CheckMK | checkmk_enabled(false), checkmk_api_url, checkmk_api_user(automation), checkmk_api_secret🔒, checkmk_sync_interval_ms(60000) |
| Semaphore | semaphore_enabled(false), semaphore_api_url, semaphore_api_token🔒, semaphore_project_id |
| Caddy | caddy_enabled(false), caddy_admin_url(http://localhost:2019) |

🔒 = in SECRET_KEYS, masked on read, only updated when a non-__SET__ value is sent.
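
A minimal sketch of the masking rule, assuming a plain key/value object. Only `maskSettings`, `SECRET_KEYS` and the `__SET__` sentinel are named in this document; `applyUpdate` is a hypothetical helper, and leaving empty secrets unmasked is an assumption about how the real code behaves.

```typescript
const SENTINEL = "__SET__";

// Keys marked with the lock symbol in the table above.
const SECRET_KEYS = new Set<string>([
  "azure_client_secret",
  "checkmk_api_secret",
  "semaphore_api_token",
]);

type Settings = Record<string, string>;

// Read path: never return a stored secret, only the sentinel.
// Assumption: an empty secret stays empty so the UI can show "not set".
function maskSettings(settings: Settings): Settings {
  const out: Settings = {};
  for (const [key, value] of Object.entries(settings)) {
    out[key] = SECRET_KEYS.has(key) && value !== "" ? SENTINEL : value;
  }
  return out;
}

// Write path (hypothetical helper): a secret is only overwritten when the
// client sends something other than the sentinel.
function applyUpdate(current: Settings, incoming: Settings): Settings {
  const next: Settings = { ...current };
  for (const [key, value] of Object.entries(incoming)) {
    if (SECRET_KEYS.has(key) && value === SENTINEL) continue; // keep stored secret
    next[key] = value;
  }
  return next;
}
```

With this shape the Settings form can round-trip the masked object unchanged without wiping any secret.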


5. API Design

All /api/* routes return JSON. Every route except the public auth and config endpoints (marked [public] below) is guarded by the requireAuth middleware, which expects a JWT bearer token.

5.1 REST API Structure

/api
+-- /auth
|   +-- POST   /register              # Create local account > { token, user }   [public]
|   +-- POST   /login                 # Authenticate > { token, user }            [public]
|   +-- GET    /me                    # Current user from token                   [auth]
|   +-- GET    /config                # { azureEnabled, effectiveRedirectUri,     [public]
|   |                                  #   checkmkEnabled, checkmkBaseUrl, caddyManaged }
|   +-- GET    /azure                  # Start Azure OAuth (redirect to Microsoft) [public]
|   +-- GET    /azure/callback         # OAuth callback > redirect /?token=…       [public]
|
+-- /settings
|   +-- GET    /                       # All settings (secrets masked as __SET__)  [auth]
|   +-- PUT    /                       # Update allow-listed keys; re-push Caddy    [auth]
|
+-- /users
|   +-- GET    /                       # List users                                [auth]
|   +-- PUT    /{id}                    # Update name/email (dupe-email guarded)     [auth]
|   +-- DELETE /{id}                    # Delete (not self, not last user)           [auth]
|
+-- /devices
|   +-- GET    /                       # List devices                              [auth]
|   +-- POST   /                       # Create device (+ maintenance log)          [auth]
|   +-- PUT    /{id}                    # Update device (+ maintenance log)          [auth]
|   +-- DELETE /{id}                    # Delete + scrub from all labs               [auth]
|
+-- /labs
|   +-- GET    /                       # List labs (parses deviceIds/topology JSON) [auth]
|   +-- POST   /                       # Create lab                                 [auth]
|   +-- PUT    /{id}                    # Update lab                                [auth]
|   +-- DELETE /{id}                    # Delete + cancel upcoming bookings          [auth]
|
+-- /bookings
|   +-- GET    /                       # List bookings (int flags > booleans)       [auth]
|   +-- POST   /                       # Create booking (+ log, alertGenerated)     [auth]
|   +-- PUT    /{id}                    # Update status; cancel>teardown trigger     [auth]
|   +-- DELETE /{id}                    # Delete booking                            [auth]
|
+-- /logs
|   +-- GET    /                       # All logs, newest first                     [auth]
|   +-- POST   /                       # Manual log entry                           [auth]
|
+-- /links
|   +-- GET    /                       # Quick links (ordered category, title)      [auth]
|   +-- POST   /                       # Create link                                [auth]
|   +-- PUT    /{id}                    # Update link                               [auth]
|   +-- DELETE /{id}                    # Delete link                               [auth]
|
+-- /checkmk
|   +-- POST   /sync                    # Trigger CheckMK status sync now            [auth]
|
+-- /semaphore
|   +-- GET    /templates               # Proxy Semaphore task-template list         [auth]
|   +-- POST   /trigger/{bookingId}     # Manual setup|teardown for a booking        [auth]
|
+-- /caddy
    +-- GET    /status                  # Caddy admin API reachable?                 [auth]
    +-- GET    /routes                  # Custom routes (plain array)                [auth]
    +-- POST   /routes                  # Add custom route + push config             [auth]
    +-- PUT    /routes/{id}             # Update custom route + push config          [auth]
    +-- DELETE /routes/{id}             # Remove custom route + push config          [auth]

5.2 Authentication & Authorization

Auth model
+-- Token: JWT (HS256, secret = JWT_SECRET), payload { userId, email }, expiry 24h
+-- Transport: Authorization: Bearer <jwt> (no cookies > no CSRF surface)
+-- Storage: browser localStorage (ghostgrid_token, ghostgrid_user)
+-- Middleware
|   +-- requireAuth   — verifies JWT, sets req.user; applied to all data routes
|   +-- requireAdmin  — checks users.role === 'admin'  ⚠ DEFINED BUT NOT WIRED
+-- Roles: role column defaults to 'User'; no route currently enforces admin

Local flow: register (bcrypt hash, role User) / login (bcrypt compare) > issue JWT > client stores token + user.
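
The bearer-token transport can be sketched as a thin fetch wrapper. This is an illustration only: `makeAuthFetch`, `TokenStore`, `RequestOptions` and `Fetcher` are hypothetical names, and the real authFetch() in src/lib/auth.ts may be structured differently. The storage and fetch implementations are injected so the sketch does not depend on browser typings.

```typescript
// Minimal structural types standing in for localStorage and fetch.
interface TokenStore { getItem(key: string): string | null; }
interface RequestOptions {
  method?: string;
  headers?: Record<string, string>;
  body?: string;
}
type Fetcher<R> = (url: string, options: RequestOptions) => Promise<R>;

const TOKEN_KEY = "ghostgrid_token"; // storage key named in the auth model above

// Returns a fetch-like function that adds "Authorization: Bearer <jwt>"
// whenever a token is present in the store.
function makeAuthFetch<R>(store: TokenStore, fetcher: Fetcher<R>) {
  return (url: string, options: RequestOptions = {}): Promise<R> => {
    const token = store.getItem(TOKEN_KEY);
    const headers: Record<string, string> = { ...(options.headers ?? {}) };
    if (token) headers["Authorization"] = `Bearer ${token}`;
    return fetcher(url, { ...options, headers });
  };
}
```

In the browser this would be wired up with the real `localStorage` and `fetch`. Because the token travels in a header rather than a cookie, it is never attached to a request automatically.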

Microsoft Entra ID (OAuth 2.0 auth-code flow):

1. GET /api/auth/config        > frontend learns azureEnabled, shows SSO button
2. GET /api/auth/azure         > MSAL getAuthCodeUrl > 302 to Microsoft
3. GET /api/auth/azure/callback> acquireTokenByCode
                                 > optional azure_allowed_group membership check
                                 > upsert user (auto-provision, empty password)
                                 > 302 /?token=<jwt>
4. App.tsx reads ?token= / ?auth_error=, verifies via /api/auth/me, persists

6. Integrations & Background Jobs

All three integrations are configured at runtime via the Settings UI (stored in the settings table). The background loops re-read settings each cycle, so CheckMK interval changes take effect without a restart.

6.1 CheckMK — Device Status Sync

Loop: scheduleSync() > syncCheckMkStatuses() > setTimeout(checkmk_sync_interval_ms)
      (default 60s; skipped entirely if checkmk_enabled !== 'true')

Auth header:  Authorization: Bearer <user> <secret>

Step 1  GET /domain-types/host_config/collections/all
        > build IP > hostname map (checks attributes + effective_attributes)
Step 2  for each device:
        - no CheckMK host for its IP  > status 'unknown'
        - GET /objects/host/{name}?columns=state…
              state 0      > online
              state 1 | 2  > offline
              else         > unknown
        - update devices.status, lastCheckedAt, cmkHostname
        - on change: write a 'status' log
Summary log per cycle: "<online> online, <offline> offline, <unknown> unknown"
HTTP hints: 401/403/404 mapped to actionable messages (checkmkHttpHint)
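
The two steps above can be sketched as pure functions. The function names and the `HostConfigEntry` shape are assumptions: the shape is a simplified guess at what the host_config collection returns, reduced to the fields the sync needs.

```typescript
type DeviceStatus = "online" | "offline" | "unknown";

// State mapping from Step 2: 0 is online, 1 or 2 is offline, anything else unknown.
function mapHostState(state: number | undefined): DeviceStatus {
  if (state === 0) return "online";
  if (state === 1 || state === 2) return "offline";
  return "unknown";
}

// Assumed, simplified shape of one entry in the host_config collection.
interface HostConfigEntry {
  id: string; // CheckMK host name
  extensions?: {
    attributes?: { ipaddress?: string };
    effective_attributes?: { ipaddress?: string };
  };
}

// Step 1: build the IP > hostname map, checking attributes first and
// falling back to effective_attributes.
function buildIpMap(hosts: HostConfigEntry[]): Map<string, string> {
  const map = new Map<string, string>();
  for (const host of hosts) {
    const ip =
      host.extensions?.attributes?.ipaddress ??
      host.extensions?.effective_attributes?.ipaddress;
    if (ip) map.set(ip, host.id);
  }
  return map;
}
```

A device whose IP is missing from the map keeps the status 'unknown', which matches the first branch of Step 2.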

6.2 Ansible Semaphore — Playbook Automation

Loop: scheduleSemaphoreCheck() > checkAndTriggerAnsibleTasks() > setTimeout(30s)
      (skipped if semaphore_enabled !== 'true')

Setup     bookings WHERE startDateTime <= now AND ansibleSetupTriggered=0
          AND status != 'cancelled'  > trigger lab.semaphoreSetupTemplateId
Teardown  bookings WHERE endDateTime  <= now AND ansibleTeardownTriggered=0
          AND status != 'cancelled'  > trigger lab.semaphoreTeardownTemplateId
          (also triggered immediately when a started booking is cancelled)

triggerSemaphoreTask(templateId, extraVars):
  POST {apiUrl}/api/project/{projectId}/tasks
       body { template_id, environment: JSON.stringify(extraVars) }
  extraVars = { booking_id, lab_name, user_id, start_time, end_time }
  > store returned job id on booking; log success/failure
  (a booking with no template id is marked triggered > not retried)

Manual:  POST /api/semaphore/trigger/{bookingId}  body { type: 'setup'|'teardown' }
         GET  /api/semaphore/templates  (proxy for UI dropdowns)
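
A sketch of the selection logic and the task payload described above, written against in-memory arrays rather than SQL. `findDue` and `buildTaskBody` are hypothetical names. Treating the timestamps as ISO 8601 strings and converting the stored template id to a number are both assumptions.

```typescript
interface Booking {
  id: string;
  labId: string;
  userId: string;
  startDateTime: string; // assumed ISO 8601
  endDateTime: string;   // assumed ISO 8601
  status: "active" | "upcoming" | "completed" | "cancelled";
  ansibleSetupTriggered: boolean;
  ansibleTeardownTriggered: boolean;
}

// Mirrors the two WHERE clauses: due, not yet triggered, not cancelled.
function findDue(bookings: Booking[], now: Date): { setup: Booking[]; teardown: Booking[] } {
  const t = now.getTime();
  const live = bookings.filter((b) => b.status !== "cancelled");
  return {
    setup: live.filter((b) => !b.ansibleSetupTriggered && Date.parse(b.startDateTime) <= t),
    teardown: live.filter((b) => !b.ansibleTeardownTriggered && Date.parse(b.endDateTime) <= t),
  };
}

// Request body for POST {apiUrl}/api/project/{projectId}/tasks.
// Assumption: the template id is stored as text and sent as a number.
function buildTaskBody(templateId: string, booking: Booking, labName: string) {
  const extraVars = {
    booking_id: booking.id,
    lab_name: labName,
    user_id: booking.userId,
    start_time: booking.startDateTime,
    end_time: booking.endDateTime,
  };
  return { template_id: Number(templateId), environment: JSON.stringify(extraVars) };
}
```

Because the trigger flags are part of the filter, setting a flag after each attempt is what keeps the 30-second loop from firing the same playbook twice.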

6.3 Caddy — Reverse Proxy

buildCaddyfile():
  { local_certs }                          # global block
  per custom route { [encode] [tls internal] [redir / <redirect_path>] reverse_proxy <upstream> { … } }
  redirect_path set → `redir / <path>` redirects only the bare root '/'
    (other paths pass through; e.g. CheckMK served at /<site>/check_mk/)
  every reverse_proxy block carries standard forwarding headers:
    header_up X-Forwarded-Proto {scheme}
    header_up X-Real-IP {remote_host}
    header_up Host {host}
  upstream prefixed with https:// → block also gets a
    transport http { tls_insecure_skip_verify } block
    (for self-signed TLS backends like Semaphore)

importCaddyfileRoutes():  reads /etc/caddy/Caddyfile on first Caddy enable
  parses hostname/upstream blocks → seeds caddy table as custom routes
  (no-op if caddy table already has entries or file not found)

pushCaddyConfig():  POST <caddy_admin_url>/load   (Content-Type: text/caddyfile)
  called on startup, after settings save, after route add/update/delete
  (failures logged as warnings, non-fatal; skipped if caddy_enabled !== 'true'
   or if this instance is not the Caddy manager)

Ownership — one Caddy serves the whole container (admin API on :2019); POST /load
replaces the ENTIRE config. Only the instance with env CADDY_MANAGER=true (production)
pushes, seeds routes, and accepts route edits (POST/PUT/DELETE → 403 otherwise). The
other instance shows the Caddy section read-only (/api/auth/config → caddyManaged:false)
and never pushes — otherwise its own (partial) config would clobber the owner's. The
owner's caddy table therefore holds ALL routes (both GhostGrid domains + every service).
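
To make the generated output concrete, here is a minimal sketch of a Caddyfile builder that follows the rules listed above. It is not the actual buildCaddyfile() from server-db.ts: the `CaddyRoute` field names are a hypothetical camelCase mirror of the caddy table, and `encode gzip` is an assumed encoding, since this document only says `encode`.

```typescript
interface CaddyRoute {
  hostname: string;
  upstream: string;
  tls: boolean;
  compress: boolean;
  redirectPath: string; // '' means no root redirect
}

function buildCaddyfile(routes: CaddyRoute[]): string {
  // Global options block.
  const lines: string[] = ["{", "\tlocal_certs", "}", ""];
  for (const r of routes) {
    lines.push(`${r.hostname} {`);
    if (r.compress) lines.push("\tencode gzip"); // assumption: gzip
    if (r.tls) lines.push("\ttls internal");
    // Matches only the bare root '/'; every other path reaches the backend.
    if (r.redirectPath) lines.push(`\tredir / ${r.redirectPath}`);
    lines.push(`\treverse_proxy ${r.upstream} {`);
    lines.push("\t\theader_up X-Forwarded-Proto {scheme}");
    lines.push("\t\theader_up X-Real-IP {remote_host}");
    lines.push("\t\theader_up Host {host}");
    if (r.upstream.startsWith("https://")) {
      // Self-signed TLS backend: proxy over TLS without verifying the certificate.
      lines.push("\t\ttransport http {", "\t\t\ttls_insecure_skip_verify", "\t\t}");
    }
    lines.push("\t}", "}", "");
  }
  return lines.join("\n");
}
```

The returned string is the kind of body that would be sent to the admin API's /load endpoint with Content-Type text/caddyfile.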

6.4 First-start Initialization

Runs in startServer() on every startup — each step is idempotent.

Default admin user (only on a blank database):
  if users table is empty:
    INSERT user (name='admin', role='admin', email='admin@ghostgrid.local', password=bcrypt('admin'))
    → log "[Init] Default admin user created"

Caddy route import (re-deploy safety net, Caddy manager only):
  if CADDY_MANAGER === 'true' AND caddy_enabled === 'true' AND caddy table is empty:
    importCaddyfileRoutes()  → seed routes from /etc/caddy/Caddyfile
  (also runs in PUT /api/settings on the disabled → enabled transition)

Default settings:
  INSERT OR IGNORE all DEFAULT_SETTINGS keys from server-db.ts
  → existing values in the settings table are never overwritten
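
The INSERT OR IGNORE behaviour that makes the settings step idempotent can be modelled with a Map standing in for the settings table. `seedDefaults` is a hypothetical helper, and the `DEFAULT_SETTINGS` object below is only a three-key excerpt of the defaults listed in §4.3.

```typescript
// Excerpt of the defaults from section 4.3 (not the full list).
const DEFAULT_SETTINGS: Record<string, string> = {
  checkmk_enabled: "false",
  checkmk_sync_interval_ms: "60000",
  caddy_admin_url: "http://localhost:2019",
};

// In-memory model of INSERT OR IGNORE: a key is written only if it is
// not present yet. Returns the number of keys that were inserted.
function seedDefaults(table: Map<string, string>, defaults: Record<string, string>): number {
  let inserted = 0;
  for (const [key, value] of Object.entries(defaults)) {
    if (!table.has(key)) {
      table.set(key, value);
      inserted += 1;
    }
  }
  return inserted;
}
```

Running the function a second time inserts nothing, which is why the step is safe to repeat on every startup.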

7. Frontend Architecture

7.1 Application Structure

src/
+-- main.tsx              # React entry: imports fonts + index.css, renders <App/>
+-- App.tsx               # Stateful root: auth gate, data loading, tab routing, all handlers
+-- index.css             # Tailwind + theme tokens
+-- types.ts              # Shared interfaces (Device, LabTemplate, Booking, …) — see §8
+-- vite-env.d.ts
+-- lib/
|   +-- auth.ts           # localStorage token/user, authFetch() wrapper, session helpers
+-- components/
    +-- Header.tsx            # Top bar; exports GhostGridLogo; notifications; theme/logout
    +-- Dashboard.tsx         # Active/upcoming bookings + quick-links widget
    +-- BookingCalendar.tsx   # Day-offset grid; create/cancel; conflict + online checks
    +-- BookingDetailsModal.tsx # Booking detail; manual Semaphore trigger; cancel/delete
    +-- DeviceInventory.tsx   # List/detail; CRUD; markdown emergency sheet; CheckMK link
    +-- LabTemplates.tsx      # Lab CRUD + topology editor; embeds TopologyPanel
    +-- TopologyPanel.tsx     # Pure SVG (800x400) node/link renderer
    +-- Logbook.tsx           # Sorted/filtered log list + manual entry
    +-- LinkDashboard.tsx     # Quick-link CRUD; 6 accent colors; category grouping
    +-- UserDirectory.tsx     # Team list; avatar colors; edit/delete modal
    +-- LoginPage.tsx         # Local login + Azure SSO button (if enabled)
    +-- RegisterPage.tsx      # Self-registration form
    +-- Settings.tsx          # Integration config cards (Azure, CheckMK, Semaphore, Caddy)

7.2 State & Data Flow

App.tsx is the single stateful root (no router, no global store):

+-- Auth state:  currentUser (from localStorage), authView (login|register), authChecked
+-- App data:    users, devices, labs, bookings, logs, links  (loaded in one Promise.all)
+-- UI state:    activeTab, navCollapsed*, theme* (dark|light), notifications,
|                selectedBookingForDetails, inventoryHighlightDevice, checkmk{Enabled,BaseUrl}
+-- Effects:
|   +-- Startup token verify + OAuth ?token=/?auth_error= handling
|   +-- Load data on login
|   +-- Poll GET /api/devices every 30s   (surface CheckMK-driven status changes)
|   +-- Booking reminder check every 60s  (fires once per upcoming booking ≤30min away)
+-- Handlers:    handleAdd/Update/Delete* for bookings, devices, labs, links, users +
                 handleAddLogManually — call API via authFetch, update local state,
                 most then refetch /api/logs

(* persisted to localStorage)
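
The booking reminder effect could be implemented along these lines. This is a sketch, not the code in App.tsx: `dueReminders` is a hypothetical name, and using the `notified` flag as the "fires once" guard is an assumption based on the column of the same name in the bookings table.

```typescript
interface ReminderCandidate {
  id: string;
  startDateTime: string; // assumed ISO 8601
  status: string;
  notified: boolean;
}

const REMINDER_WINDOW_MS = 30 * 60 * 1000; // 30 minutes

// Upcoming bookings that start within the next 30 minutes and have not
// produced a reminder yet.
function dueReminders(bookings: ReminderCandidate[], now: Date): ReminderCandidate[] {
  const t = now.getTime();
  return bookings.filter((b) => {
    const msUntilStart = Date.parse(b.startDateTime) - t;
    return (
      b.status === "upcoming" &&
      !b.notified &&
      msUntilStart > 0 &&
      msUntilStart <= REMINDER_WINDOW_MS
    );
  });
}
```

A 60-second interval would call this with the current time and mark each returned booking as notified after raising the notification.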

Navigation is a plain activeTab switch. Groups: Dashboard / Lab Management (Booking, Inventory, Topology) / Resources (Quick Links, Team) / Audit (Logbook) / System (Settings).

7.3 Key UI Components

Dashboard
+-- Active / upcoming booking cards
+-- Quick-links widget
+-- Navigation shortcuts (to calendar, devices, labs, links)

Booking Calendar
+-- Day-offset grid
+-- Create booking with conflict detection + device-online validation
+-- (devices in 'unknown' status are not bookable when CheckMK is enabled)

Device Inventory
+-- Searchable list + detail panel
+-- CRUD; class presets (Switch/Firewall/Access-Point/Controller) + free-form
+-- Markdown emergency sheet; optional CheckMK deep-link

Lab Templates + Topology
+-- Lab CRUD; Semaphore setup/teardown template selection
+-- Topology link editor (fromDevice > toDevice, link type)
+-- TopologyPanel: SVG layout by node count (1 / 2 / 3 / circular)

Settings
+-- Microsoft Entra ID  (OAuth SSO, redirect-URI helper, allowed group)
+-- CheckMK             (API URL/user/secret, sync interval, "Run sync now")
+-- Ansible Semaphore   (API URL/token/project, "Test connection")
+-- Caddy               (admin URL, custom route management;
                         auto-seeded from /etc/caddy/Caddyfile on first enable;
                         https:// upstream → TLS proxy, certificate not verified)
+-- Secret inputs use the __SET__ sentinel (blank = keep existing)
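
The conflict detection mentioned for the Booking Calendar comes down to an interval-overlap test. The sketch below is an assumption about how such a check might look, not the logic in BookingCalendar.tsx: it treats bookings as half-open intervals, compares only bookings for the same lab, and ignores cancelled ones.

```typescript
interface Slot {
  labId: string;
  startDateTime: string; // assumed ISO 8601
  endDateTime: string;   // assumed ISO 8601
  status: string;
}

// Two half-open intervals [start, end) overlap exactly when each one
// starts before the other ends.
function hasConflict(candidate: Slot, existing: Slot[]): boolean {
  const start = Date.parse(candidate.startDateTime);
  const end = Date.parse(candidate.endDateTime);
  return existing.some(
    (b) =>
      b.labId === candidate.labId &&
      b.status !== "cancelled" &&
      start < Date.parse(b.endDateTime) &&
      Date.parse(b.startDateTime) < end,
  );
}
```

With half-open intervals, a booking that starts at the exact minute another one ends is not counted as a conflict.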

8. Shared Types (src/types.ts)

The single contract between frontend and backend — imported by both server.ts and the React components.

| Type | Notes |
| --- | --- |
| DeviceType | 'Switch' \| 'Access-Point' \| 'Firewall' \| 'Controller' \| (string & {}); presets + free-form |
| Device | status: 'online' \| 'offline' \| 'unknown'; emergencySheet markdown; optional cmkHostname, lastCheckedAt |
| TopologyLink | { fromDevice, toDevice, type } (e.g. LACP-Trunk, Uplink, OOB-Management) |
| LabTemplate | deviceIds: string[], topology: TopologyLink[], optional Semaphore template IDs |
| Booking | status: 'active' \| 'upcoming' \| 'completed' \| 'cancelled'; ansible trigger flags + job IDs |
| LogEntry | type: 'maintenance' \| 'booking' \| 'status' \| 'system' |
| User | { id, name, role, email } (never password on the client) |
| QuickLink | { id, title, url, description, category, color, createdBy?, createdAt } |
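
For orientation, a few of these types might look roughly as follows in TypeScript. This is a partial sketch and not a copy of src/types.ts: field names beyond those in the table are borrowed from the schema in §4.1, the optional markers are assumptions, and `isPresetType` is a hypothetical helper.

```typescript
// Presets plus free-form strings; `string & {}` keeps editor autocomplete
// for the presets while still accepting any custom value.
type DeviceType = "Switch" | "Access-Point" | "Firewall" | "Controller" | (string & {});
type DeviceStatus = "online" | "offline" | "unknown";

interface Device {
  id: string;
  hostname: string;
  ip: string;
  location: string;
  notes?: string;
  type: DeviceType;
  status: DeviceStatus;
  emergencySheet: string; // markdown
  cmkHostname?: string;
  lastCheckedAt?: string;
}

interface TopologyLink {
  fromDevice: string;
  toDevice: string;
  type: string; // e.g. LACP-Trunk, Uplink, OOB-Management
}

interface LabTemplate {
  id: string;
  name: string;
  deviceIds: string[];
  topology: TopologyLink[];
  semaphoreSetupTemplateId?: string;
  semaphoreTeardownTemplateId?: string;
}

// Hypothetical helper: distinguishes the four presets from custom types.
const PRESET_TYPES: string[] = ["Switch", "Access-Point", "Firewall", "Controller"];
function isPresetType(type: DeviceType): boolean {
  return PRESET_TYPES.indexOf(type) !== -1;
}
```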

9. Deployment Architecture

9.1 Process & Build Model

Dev   :  npm run dev   > tsx server.ts
         Express + Vite middleware (HMR) on :3000

Build :  npm run build
         vite build                              > dist/ (frontend)
         esbuild server.ts --bundle --platform=node --format=cjs
                 --packages=external             > dist/server.cjs

Prod  :  NODE_ENV=production node dist/server.cjs
         Express serves static dist/ + SPA fallback (GET * > dist/index.html)

9.2 Proxmox LXC + systemd (two instances)

+----------------------- Proxmox LXC (Debian 12) -----------------------+
|                                                                       |
|  Production                          Staging                          |
|  +-------------------------+         +-----------------------------+  |
|  | branch  : main          |         | branch  : dev               |  |
|  | dir     : /opt/ghostgrid|         | dir : /opt/ghostgrid-dev    |  |
|  | port    : 3000          |         | port    : 3001              |  |
|  | service : ghostgrid     |         | service : ghostgrid-dev     |  |
|  | db      : ghostgrid.db  |         | db : ghostgrid.db (own)     |  |
|  | .env    : own JWT_SECRET|         | .env : own JWT_SECRET       |  |
|  +-------------------------+         +-----------------------------+  |
|                                                                       |
|  Both exposed directly on the LAN (no reverse proxy / TLS by default; |
|  the in-app Caddy feature can add this).                              |
+-----------------------------------------------------------------------+

Install :  deploy/proxmox-ghostgrid.sh   (creates LXC, Node 20, clones both
                                           branches, builds, configures services)
Update  :  deploy/deploy.sh <branch>     (git pull + npm run build + systemctl restart;
                                           defaults to main)
Backup  :  ghostgrid.db + ghostgrid.db-wal + ghostgrid.db-shm

9.3 Environment Variables

| Var | Default | Purpose |
| --- | --- | --- |
| JWT_SECRET | insecure fallback | Sign/verify JWTs; must be set in production |
| APP_URL | http://localhost:<PORT> | Base URL for deriving the Azure redirect URI |
| PORT | 3000 | HTTP listen port |
| NODE_ENV | | production switches to static dist/ serving |
| CADDY_MANAGER | unset | true makes this instance the sole Caddy owner (push/seed/edit). Set on production only (one Caddy per container) |
| CHECKMK_API_URL / CHECKMK_API_USER / CHECKMK_API_SECRET | | Fallbacks if not set in the Settings UI |
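
One way to resolve these variables is sketched below. `loadConfig` and `AppConfig` are hypothetical names, and the fallback secret string is a placeholder. The sketch deliberately differs from the behaviour described above in one respect: instead of falling back to an insecure secret in production, it throws.

```typescript
interface AppConfig {
  jwtSecret: string;
  port: number;
  appUrl: string;
  production: boolean;
  caddyManager: boolean;
}

// Takes the environment as a parameter so it can be exercised without
// touching the real process environment.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const port = Number(env.PORT ?? 3000);
  const production = env.NODE_ENV === "production";
  const jwtSecret = env.JWT_SECRET ?? "";
  // Hardening assumption, not current behaviour: refuse to start in
  // production without an explicit secret.
  if (production && jwtSecret === "") {
    throw new Error("JWT_SECRET must be set in production");
  }
  return {
    jwtSecret: jwtSecret || "dev-only-insecure-secret", // placeholder fallback
    port,
    appUrl: env.APP_URL ?? `http://localhost:${port}`,
    production,
    caddyManager: env.CADDY_MANAGER === "true",
  };
}
```

On the server this would be called once at startup as `loadConfig(process.env)`.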

10. Security Architecture

+-------------------------------------------------------------+
|                    SECURITY OVERVIEW                          |
+-------------------------------------------------------------+
|  Authentication                                              |
|  +-- Local users: bcrypt password hashing (cost 10)         |
|  +-- JWT (HS256), 24h expiry, no refresh tokens             |
|  +-- Token in Authorization header (not cookies)            |
|  +-- Optional Azure Entra ID SSO (MSAL), group restriction  |
+-------------------------------------------------------------+
|  Authorization                                               |
|  +-- requireAuth on all data routes                         |
|  +-- role column ('User'/'admin') exists                    |
|  +-- ⚠ requireAdmin defined but NOT applied — any           |
|      authenticated user can read/write settings + users     |
+-------------------------------------------------------------+
|  Secret Handling                                             |
|  +-- Integration secrets stored in settings table           |
|  +-- Masked as __SET__ on read (SECRET_KEYS)                |
|  +-- Only overwritten when a non-sentinel value is sent      |
+-------------------------------------------------------------+
|  Notable gaps / accepted risks                               |
|  +-- JWT_SECRET has an insecure fallback if unset           |
|  +-- POST /api/bookings trusts client-supplied userId       |
|      (does not force req.user.userId)                       |
|  +-- No rate limiting on auth endpoints                      |
|  +-- Secrets at rest are plaintext in SQLite (file perms    |
|      are the protection boundary)                           |
+-------------------------------------------------------------+
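
The secret-handling rules in the overview above (mask on read, overwrite only on a non-sentinel value) can be sketched as two pure functions. This is an illustration under stated assumptions: the key names in `SECRET_KEYS` and the function names are hypothetical, and leaving an empty secret unmasked is a choice made here, not confirmed behavior.

```typescript
// Sentinel returned to the client in place of a stored secret.
const SECRET_SENTINEL = "__SET__";

// Hypothetical key names; the real SECRET_KEYS list lives in the server code.
const SECRET_KEYS = new Set(["checkmk_api_secret", "azure_client_secret"]);

type Settings = Record<string, string>;

// Read path: a stored secret is never sent back, only the sentinel.
function maskSettings(stored: Settings): Settings {
  const out: Settings = {};
  for (const [key, value] of Object.entries(stored)) {
    // Assumption: an empty secret stays empty so the UI can show "not set".
    out[key] = SECRET_KEYS.has(key) && value !== "" ? SECRET_SENTINEL : value;
  }
  return out;
}

// Write path: a secret is only overwritten when a non-sentinel value arrives.
function applySettingsUpdate(stored: Settings, incoming: Settings): Settings {
  const next: Settings = { ...stored };
  for (const [key, value] of Object.entries(incoming)) {
    if (SECRET_KEYS.has(key) && value === SECRET_SENTINEL) continue; // keep stored secret
    next[key] = value;
  }
  return next;
}
```

With this shape, the Settings UI can post back exactly what it received without wiping a secret it never saw.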

11. Project Structure

GhostGrid/
+-- server.ts                 # Express app: all routes, auth, integrations, background jobs
+-- server-db.ts              # SQLite connection, full schema, settings/Caddy helpers
+-- index.html                # Vite HTML entry (#root > src/main.tsx); links /favicon.svg
+-- public/
|   +-- favicon.svg           # app favicon (GhostGrid logo; served at site root by Vite)
+-- vite.config.ts            # Vite + React + Tailwind; '@' alias > repo root
+-- tsconfig.json             # noEmit, react-jsx, bundler resolution
+-- package.json              # scripts + deps (package name "react-example" is vestigial)
+-- .env.example              # JWT_SECRET, APP_URL
+-- metadata.json             # app name/description metadata
+-- README.md                 # user-facing overview
+-- DEPLOY.md                 # Proxmox LXC / systemd deployment guide
+-- ARCHITECTURE.md           # ← this file
+-- deploy/
|   +-- deploy.sh             # git pull + build + systemctl restart (arg: branch)
|   +-- ghostgrid.service     # systemd unit — production (main, :3000)
|   +-- ghostgrid-dev.service # systemd unit — staging (dev, :3001)
|   +-- proxmox-ghostgrid.sh  # one-shot LXC installer (Proxmox VE helper-script style)
+-- src/
    +-- main.tsx              # React entry
    +-- App.tsx               # root component (state, routing, handlers)
    +-- index.css             # Tailwind + theme
    +-- types.ts              # shared TS interfaces
    +-- vite-env.d.ts
    +-- lib/auth.ts           # token storage + authFetch
    +-- components/           # 14 components (see §7.1)

# Runtime artifacts (gitignored):
#   ghostgrid.db, ghostgrid.db-wal, ghostgrid.db-shm, dist/, node_modules/

12. Key Technical Decisions Summary

| Decision | Choice | Rationale |
|---|---|---|
| Process model | Single Express process serves API + SPA | No orchestration; trivial LXC deployment |
| Database | SQLite (better-sqlite3, synchronous, WAL) | Zero-ops, single file, ideal for LAN/air-gap |
| Auth | JWT in localStorage + optional Azure MSAL | Self-contained; SSO optional, header-based (no CSRF) |
| Frontend state | React hooks only (no Redux/Zustand/router) | App is small; one stateful root is enough |
| Runtime config | Integration settings in DB, edited in Settings UI | No redeploy to change CheckMK/Semaphore/Caddy/Azure |
| Background jobs | Self-rescheduling `setTimeout` loops | Picks up settings changes each cycle; no cron/queue dep |
| Styling | Tailwind v4 + class-based dark/light | Utility-first, theme toggled on `<html>` |
| Offline | Fonts bundled via @fontsource | No CDN / external runtime fetches |
| Build | Vite (frontend) + esbuild (server bundle) | Fast, single `dist/server.cjs` output |
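
The "self-rescheduling `setTimeout` loop" pattern from the Background jobs row can be sketched as follows. `startLoop` and its parameters are hypothetical names, not the helpers used in `server.ts`; the scheduler is injectable only to make the sketch easy to test.

```typescript
type Schedule = (fn: () => void, ms: number) => unknown;

// Runs `task`, then schedules the next run only after the current one has
// finished. The interval is re-read every cycle, so a changed setting takes
// effect on the next run without a restart.
function startLoop(
  task: () => Promise<void>,
  getIntervalMs: () => number,
  schedule: Schedule = (fn, ms) => setTimeout(fn, ms),
): () => void {
  let stopped = false;
  const tick = async (): Promise<void> => {
    if (stopped) return;
    try {
      await task();
    } catch (err) {
      // Non-fatal: log and try again on the next cycle.
      console.warn("background cycle failed, retrying next cycle:", err);
    }
    if (!stopped) schedule(() => void tick(), getIntervalMs());
  };
  void tick();
  // The returned function stops the loop before its next cycle.
  return () => {
    stopped = true;
  };
}
```

Compared with `setInterval`, runs never overlap when a cycle takes longer than the interval, and no cron or queue dependency is required.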

13. Operational Notes & Risk Mitigation

| Risk / Concern | Mitigation / Status |
|---|---|
| Missing `JWT_SECRET` in prod | Documented in `.env.example`/`DEPLOY.md`; set per instance (installer generates a random one) |
| No admin RBAC enforced | `requireAdmin` exists; wire it onto `/api/settings` and `/api/users` if stricter control is needed |
| Client-supplied `userId` on booking create | Force `req.user.userId` server-side if spoofing is a concern |
| CheckMK/Semaphore outage | Integration loops catch errors, log them, and retry next cycle; non-fatal |
| Caddy admin API unreachable | `pushCaddyConfig()` failures are logged as warnings; routes apply when Caddy starts |
| Data loss | Back up `ghostgrid.db` + `-wal`/`-shm`; each instance has its own DB |
| Schema evolution | Edit the `CREATE TABLE` block in `server-db.ts` (fresh-install model, no migrations); new settings need seed + allow-list (+ `SECRET_KEYS` if secret) |
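
The two authorization mitigations above (enforcing the admin role and ignoring a client-supplied `userId`) could look roughly like this. It is a sketch with minimal structural types so it runs without Express. The existing `requireAdmin` may differ, `bookingInputFor` is a hypothetical helper, and a numeric `userId` is an assumption.

```typescript
// Minimal stand-ins for the Express request/response objects.
interface AuthUser { userId: number; role: string; }
interface Req { user?: AuthUser; body: Record<string, unknown>; }
interface Res { status(code: number): Res; json(payload: unknown): Res; }
type Next = () => void;

// Rejects non-admin callers; assumes requireAuth has already set req.user.
function requireAdmin(req: Req, res: Res, next: Next): void {
  if (req.user?.role !== "admin") {
    res.status(403).json({ error: "Admin role required" });
    return;
  }
  next();
}

// Discards any userId sent by the client and uses the authenticated one.
function bookingInputFor(req: Req): Record<string, unknown> {
  if (!req.user) throw new Error("requireAuth must run first");
  return { ...req.body, userId: req.user.userId };
}
```

In an Express app, the guard would sit after `requireAuth` on the `/api/settings` and `/api/users` routes, and the booking handler would insert `bookingInputFor(req)` instead of the raw body.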

14. Dependencies and Libraries

14.1 Backend (dependencies)

@azure/msal-node             ^5.2.2     # Entra ID OAuth
@fontsource/inter            ^5.2.8     # self-hosted font (used by frontend build)
@fontsource/jetbrains-mono   ^5.2.8
@tailwindcss/vite            ^4.1.14    # Tailwind v4 Vite plugin
@vitejs/plugin-react         ^5.0.4
bcryptjs                     ^2.4.3     # password hashing
better-sqlite3               ^12.10.0   # SQLite driver
dotenv                       ^17.2.3    # .env loading
express                      ^4.21.2    # HTTP server
jsonwebtoken                 ^9.0.2     # JWT
lucide-react                 ^0.546.0   # icons
react / react-dom            ^19.0.1
vite                         ^6.2.3

14.2 Dev / Build (devDependencies)

@types/bcryptjs, @types/better-sqlite3, @types/express,
@types/jsonwebtoken, @types/node     # type definitions
esbuild        ^0.25.0   # bundle server.ts > dist/server.cjs
tailwindcss    ^4.1.14
tsx            ^4.21.0   # run TS directly in dev
typescript     ~5.8.2

14.3 npm Scripts

| Script | Command | Purpose |
|---|---|---|
| `dev` | `tsx server.ts` | Dev server (Express + Vite middleware) on :3000 |
| `build` | `vite build && esbuild server.ts … --outfile=dist/server.cjs` | Build frontend + bundle server |
| `start` | `node dist/server.cjs` | Run production build (`NODE_ENV=production`) |
| `clean` | `rm -rf dist server.js` | Remove build artifacts |
| `lint` | `tsc --noEmit` | Type check |

15. Quick Mental Model (for future prompts)

Browser (React SPA, localStorage JWT)
        │  authFetch > Authorization: Bearer <jwt>
        ▼
Express (server.ts)  ──►  better-sqlite3 (ghostgrid.db, WAL)
   ├─ /api/auth/*        (JWT local + Azure MSAL)
   ├─ /api/{devices,labs,bookings,logs,links,users,settings}
   ├─ /api/checkmk/*     ── background: ~60s status sync
   ├─ /api/semaphore/*   ── background: 30s setup/teardown trigger
   └─ /api/caddy/*       ── pushes Caddyfile to Caddy admin API
        │
        └─ serves frontend: Vite middleware (dev) / static dist/ (prod)
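
The first hop in the diagram above (the SPA attaching the stored JWT as a bearer token) can be sketched as a small header builder. The storage key `ghostgrid_token` and the helper name `authHeaders` are assumptions; the real implementation in `src/lib/auth.ts` is not reproduced here.

```typescript
// Hypothetical storage key; the real one is defined in src/lib/auth.ts.
const TOKEN_KEY = "ghostgrid_token";

// The subset of the Web Storage API the helper needs (localStorage satisfies it).
interface TokenStore {
  getItem(key: string): string | null;
}

// Builds request headers, adding the bearer token when one is stored.
function authHeaders(
  store: TokenStore,
  extra: Record<string, string> = {},
): Record<string, string> {
  const token = store.getItem(TOKEN_KEY);
  return token ? { ...extra, Authorization: `Bearer ${token}` } : { ...extra };
}

// In the browser, a wrapper along these lines could sit on top of fetch:
//   fetch(url, { ...init, headers: authHeaders(localStorage, { "Content-Type": "application/json" }) })
```

Because the token travels in a header rather than a cookie, the browser never attaches it automatically, which is why the document treats CSRF as out of scope.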

Invariants to remember when editing:

  • Frontend and backend share src/types.ts — change both sides together.
  • labs.deviceIds / labs.topology are JSON strings in SQLite, parsed in the API.
  • Booking boolean flags are 0/1 integers in SQLite, mapped on read.
  • A new settings key must be: seeded in server-db.ts, allow-listed in PUT /api/settings, and (if secret) added to SECRET_KEYS.
  • Schema changes go straight into the CREATE TABLE block in server-db.ts — fresh-install model, no migration helper.
  • The SPA catch-all (app.get('*')) + static serving are registered last in startServer(), after every /api route — otherwise GET /api/* falls through to index.html. All /api responses carry Cache-Control: no-store.
  • One Caddy per container; POST /load replaces the whole config. Only the CADDY_MANAGER=true instance may push/seed/edit routes — never let the non-manager push.
  • All user-facing strings are in English.
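
The two storage invariants above (JSON strings for `labs.deviceIds` / `labs.topology`, and 0/1 integers for booking flags) amount to a mapping step at the API boundary. A minimal sketch follows. The row shapes, the flag names `setupDone` / `teardownDone`, and the numeric device ids are assumptions for illustration, not the actual schema.

```typescript
// Hypothetical row shapes as they would come back from SQLite.
interface LabRow { id: number; name: string; deviceIds: string; topology: string; }
interface BookingRow { id: number; setupDone: number; teardownDone: number; }

// Hypothetical API shapes as the frontend would consume them.
interface Lab { id: number; name: string; deviceIds: number[]; topology: unknown; }
interface Booking { id: number; setupDone: boolean; teardownDone: boolean; }

// Read path: parse the JSON columns.
function rowToLab(row: LabRow): Lab {
  return {
    id: row.id,
    name: row.name,
    deviceIds: JSON.parse(row.deviceIds) as number[],
    topology: JSON.parse(row.topology) as unknown,
  };
}

// Write path: serialize back to JSON strings before binding to a statement.
function labToRow(lab: Lab): LabRow {
  return {
    id: lab.id,
    name: lab.name,
    deviceIds: JSON.stringify(lab.deviceIds),
    topology: JSON.stringify(lab.topology),
  };
}

// Read path for bookings: 0/1 integers become booleans.
function rowToBooking(row: BookingRow): Booking {
  return { id: row.id, setupDone: row.setupDone === 1, teardownDone: row.teardownDone === 1 };
}
```

Keeping these conversions in one place makes it harder for a route to return a raw JSON string or a `0`/`1` flag to the frontend by accident.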

Generated from the GhostGrid codebase. Keep this document in sync when the data model, API surface, or integrations change.