Network Architecture

Statement

ClaimGuard's production network architecture today is a single VM on the GCP default VPC in europe-west1-b, with a public IP and inbound traffic terminating directly on the VM. There is no load balancer, no managed TLS in front of the application, and no separate DMZ or backend tier — the application, the Python analysis tools, and the Postgres database all live on the same host and communicate via 127.0.0.1.

This is the deliberate pre-launch shape. The next architecture step (plan A1.3) introduces a GCP HTTPS Load Balancer + Google-managed certificate in front of the VM, which is the gating prerequisite for flipping the Encryption in transit control to implemented and for closing the world-open application port. Until A1.3 lands, this control sits at partial.

Implementation

Topology today

                +-----------------------------+
   Public IP -->| claim-guard-app-1 (VM)      |
   34.76.180.225|   pm2 → Node API (3001)     |
                |   pm2 → Vite dev (5173 dev) |
                |   pm2 → Python tools        |
                |     master_tool (loopback)  |
                |     c2pa, layover, etc.     |
                |   Postgres on local disk    |
                +-----------------------------+
                          |
                          | (control plane)
                          v
                +-----------------------------+
                | GCP services                |
                |   Secret Manager            |
                |   Cloud Logging             |
                |   Cloud Storage (GCS)       |
                +-----------------------------+
  • One VPC: GCP default (auto-mode). The VM lives in europe-west1-b.
  • One VM: claim-guard-app-1, n1-standard-8 + T4 GPU, 300GB encrypted boot disk (deepfakebench3).
  • One public IP on the VM. No load balancer, no forwarding rule.
  • One Postgres, on the VM's boot disk. No Cloud SQL.
  • No subnets beyond VPC defaults. No separate "frontend" / "backend" / "data" tiers.
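
The three pm2-managed processes in the diagram map naturally onto a pm2 ecosystem file. The sketch below is hypothetical: app names, script paths, and the Python entry point are assumptions rather than the repository's actual configuration, but it shows the shape of the process list pm2 supervises on the VM.

  // ecosystem.config.js — hypothetical sketch of the pm2 process list shown
  // in the diagram above; names and script paths are assumptions.
  module.exports = {
    apps: [
      // Node API served on port 3001 (currently the world-open application port).
      { name: 'api', script: 'server/src/index.js' },
      // Vite dev server on 5173; dev-only, slated to be closed in A1.2.
      { name: 'web-dev', script: 'npm', args: 'run dev' },
      // Python analysis tools (master_tool and friends), loopback-only.
      { name: 'master-tool', script: 'tools/master_tool/main.py', interpreter: 'python3' },
    ],
  };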

Trust boundaries

  • Public ↔ VM is the only externally-reachable boundary. Today it exposes the application's API port (3001) directly. Plan A1.3 collapses that into "443 → LB → 3001 over the LB-internal hop", with the public-facing port becoming TLS-only.
  • VM ↔ GCP control plane (Secret Manager, Cloud Logging, GCS) is TLS to Google-issued certificates, validated against the system trust store.
  • Inside the VM: Node ↔ Python ↔ Postgres ↔ pm2 all over loopback (127.0.0.1). Documented as a deliberate design choice in docs/security/PROTOCOL.md §4 so cross-process calls don't need a VPC firewall rule.
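
To make the loopback-only convention concrete, the sketch below shows the binding pattern it implies: an internal service listens on 127.0.0.1 rather than 0.0.0.0, so other processes on the VM can reach it while nothing on the network can. The port number and handler are placeholders, not taken from the codebase.

  // Sketch of the loopback-only pattern from docs/security/PROTOCOL.md §4.
  // Port and response body are placeholders.
  const http = require('node:http');

  const internal = http.createServer((req, res) => {
    res.end('internal only\n');
  });

  // Binding to 127.0.0.1 (not 0.0.0.0) keeps this traffic inside the single
  // VM trust boundary, which is why no VPC firewall rule is needed for it.
  internal.listen(9999, '127.0.0.1');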

Outbound paths

  • GCP services as above.
  • Google Gemini API from tools/master_tool/ over TLS. See AI transparency.
  • Customer-supplied webhooks (SUPPORT_WEBHOOK_URL) wrapped by safeFetch (server/src/lib/safeFetch.js), which enforces TLS for https:// URLs, blocks private / link-local / CGNAT ranges, and refuses redirects by default. See SAST for the full invariant set.
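
As a rough illustration of those invariants, the following sketch mimics the checks a safeFetch-style wrapper performs before an outbound call. It is not the actual server/src/lib/safeFetch.js: the blocked-range list, function name, and error messages are assumptions, and a production version would also pin the resolved address (to close the lookup-vs-connect race) and handle IPv6.

  // Minimal sketch of the documented safeFetch invariants (https-only,
  // no private / link-local / CGNAT targets, no redirects by default).
  // Not the real implementation; assumes Node 18+ for global fetch.
  const dns = require('node:dns/promises');
  const net = require('node:net');

  // Loopback, RFC 1918, link-local and CGNAT (100.64.0.0/10) IPv4 ranges.
  const BLOCKED = [
    /^127\./, /^10\./, /^192\.168\./, /^169\.254\./,
    /^172\.(1[6-9]|2\d|3[01])\./,
    /^100\.(6[4-9]|[7-9]\d|1[01]\d|12[0-7])\./,
  ];

  async function safeFetchSketch(url, options = {}) {
    const target = new URL(url);
    if (target.protocol !== 'https:') {
      throw new Error('only https:// targets are allowed');
    }
    // Resolve the hostname and refuse private / internal addresses.
    const { address } = await dns.lookup(target.hostname, { family: 4 });
    if (net.isIPv4(address) && BLOCKED.some((re) => re.test(address))) {
      throw new Error(`refusing to call internal address ${address}`);
    }
    // Redirects are refused unless the caller explicitly opts in.
    return fetch(target, { redirect: 'error', ...options });
  }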

Firewall posture

The current VPC firewall rules are documented and being narrowed in plan steps A0.1 / A0.2 / A1.2:

  • A0.1 (done) — default-allow-rdp deleted; nothing legitimate used it.
  • A0.2 (done) — duplicated world-open SSH rules (allow-ssh, default-allow-ssh) deleted; SSH is now IAP-only on 35.235.240.0/20.
  • A1.2 (queued, gated on A1.3) — close the dev/internal ports currently world-open (3001, 5173, 9996/9997, 9998/9999, etc.) once the LB is in front and serves the production port path.
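
While A1.2 is pending, a small script can keep the inventory of world-open rules honest. The sketch below is a hypothetical helper, not part of the repository; it uses the googleapis Node client to list ingress rules that still allow 0.0.0.0/0. The project ID is a placeholder and Application Default Credentials are assumed.

  // check-world-open.js — hypothetical helper for tracking the rules A1.2
  // will narrow. Requires the `googleapis` package and ADC credentials.
  const { google } = require('googleapis');

  async function listWorldOpenRules(project) {
    const auth = new google.auth.GoogleAuth({
      scopes: ['https://www.googleapis.com/auth/compute.readonly'],
    });
    const compute = google.compute({ version: 'v1', auth });
    const { data } = await compute.firewalls.list({ project });
    return (data.items || []).filter(
      (rule) =>
        rule.direction === 'INGRESS' &&
        !rule.disabled &&
        (rule.sourceRanges || []).includes('0.0.0.0/0')
    );
  }

  // Project ID below is a placeholder, not the real GCP project.
  listWorldOpenRules('claim-guard-project').then((rules) => {
    for (const rule of rules) {
      const ports = (rule.allowed || [])
        .map((a) => `${a.IPProtocol}:${(a.ports || ['all']).join(',')}`)
        .join(' ');
      console.log(`${rule.name} -> ${ports}`);
    }
  });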

See Network security for the firewall record and rationale.

Why no separation today

A multi-tier network with a frontend load balancer, a private backend subnet, and Cloud SQL for the database is the textbook target shape. We are not there yet because:

  • Pre-launch, the additional GCP cost would be unjustified.
  • Internal traffic over loopback removes a category of cross-VM network bugs and a category of inter-tier authentication code.
  • The architecture is documented honestly so the trust portal does not over-claim what the topology delivers.

The cost of getting there incrementally is small, and the plan sequences it: A1.3 puts the LB in front, A2.3 (longer-term) moves Postgres to Cloud SQL.

Status

partial — verified 2026-04-29.

What's in place:

  • A documented, reproducible single-VM topology with a single trust boundary.
  • IAP-only SSH (no public SSH path).
  • TLS for every outbound path (GCP control plane, Gemini API, operator webhooks via safeFetch).
  • Loopback-only internal traffic — no inter-VM traffic to firewall.
  • An immutable cloud audit log of every network configuration change.

Known gaps

  • No load balancer in front of the application. Public traffic terminates on the VM directly. Plan A1.3.
  • No managed TLS for application traffic. Same gating step.
  • No multi-tier separation (frontend / backend / data). Single VM hosts everything.
  • Postgres on the VM (cross-listed on Cloud provider and Backups). VM compromise = full DB compromise until A2.3.
  • Single zone (europe-west1-b). Zonal outage is a service interruption.
  • Several legacy world-open firewall rules still exist on the VPC for sibling workloads / dev access (see Network security). Plan A1.2 narrows them after A1.3.

Roadmap

  • A1.3 — HTTPS Load Balancer + Google-managed certificate. Closes the public-TLS gap and unlocks A1.2.
  • A1.2 — close the dev/internal ports world-open today.
  • A2.3 — Postgres → Cloud SQL with point-in-time recovery, which also removes Postgres from the VM trust boundary.
  • Network architecture diagram to be added to this page once A1.3 is in production.
  • Multi-zone or multi-region availability — not on the immediate roadmap; revisit after Cloud SQL migration.