Data Classification

Statement

ClaimGuard's data-classification posture today is informal: we know what data classes the application stores, we apply the same security controls (encryption at rest, role-and-org access, audit logging) across all of them, and we do not yet hold a class of data significant enough to warrant separate handling rules. There is no formal three- or four-tier classification scheme (Public / Internal / Confidential / Restricted) recorded against each table or field.

The control sits at partial because the underlying inventory is real and current — and is documented on this page — but the formalization (per-field labels, handling rules per tier, training, sign-off) is queued.

Implementation

What ClaimGuard stores

Inventoried by table / category, with a candidate informal classification for each:

Category | Examples | Informal class | Notes
User identity | users.email, display_name, full_name | Internal / Personal Data | Standard PII for SaaS user accounts.
Authentication credentials | users.password_hash, users.password_plain, api_keys.key_hash | Restricted / Secret | The password_plain column is a known gap; see Authentication.
Org metadata | organizations.name, settings, configuration | Internal | Org-scoped admin data.
Claim records | claims.* (narrative, identifiers, status, timestamps) | Confidential / Customer Data | The core business object.
Evidence files | Uploaded images, video, documents in GCS | Confidential / Customer Data | Stored encrypted at rest by GCS default.
Analysis results | analysis_runs, layover_results (model outputs, verdicts, risk scores) | Confidential / Customer Data | Inherits the parent claim's class.
Review notes | review_notes (analyst free-text per claim) | Confidential / Customer Data | May contain operator opinions on claims.
Audit signals | last_login_at, failed_login_attempts, application logs | Internal | Operational; access-controlled.
Cloud audit logs (GCP) | IAM, secret access, VM lifecycle | Internal | 400-day immutable retention. See Audit logging (cloud).
Secrets | JWT_SECRET, DATABASE_URL, GOOGLE_API_KEY | Restricted / Secret | In Secret Manager. See Secrets management.

What we do not store

For audit clarity, the application schema has been inspected for classes that are not present today:

  • No SSN, national ID, or government identifier columns.
  • No payment card data. No PCI scope.
  • No bank-account or routing-number columns.
  • No medical / health record columns. No HIPAA scope at present.
  • No date-of-birth column in the user or claim schema.
  • No phone number, postal address, or geolocation fields in the user record.
  • No biometric data.

If any of those classes are added in a future feature, the classification of that table needs to graduate from this informal scheme to an explicit one.
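
The negative inventory could be re-checked mechanically whenever the schema changes. The following is a minimal sketch, assuming column names are available as (table, column) pairs, for example read from information_schema.columns. The pattern list and the function name find_sensitive_columns are hypothetical, and matching on column names alone can flag candidates for review but cannot prove that a data class is absent.

```python
import re

# Hypothetical name patterns for data classes this page lists as not stored.
# These are illustrative assumptions, not ClaimGuard's actual rules.
SENSITIVE_PATTERNS = {
    "government_id": re.compile(r"(^|_)(ssn|national_id|passport)(_|$)"),
    "payment_card": re.compile(r"(^|_)(card_number|pan|cvv)(_|$)"),
    "bank_account": re.compile(r"(^|_)(iban|routing_number|account_number)(_|$)"),
    "date_of_birth": re.compile(r"(^|_)(dob|date_of_birth|birth_date)(_|$)"),
    "contact_or_location": re.compile(
        r"(^|_)(phone|postal_address|latitude|longitude)(_|$)"
    ),
}


def find_sensitive_columns(columns):
    """Return (table, column, data_class) for each column name matching a pattern."""
    hits = []
    for table, column in columns:
        for data_class, pattern in SENSITIVE_PATTERNS.items():
            if pattern.search(column.lower()):
                hits.append((table, column, data_class))
    return hits


# Column names taken from the inventory table on this page.
current_schema = [
    ("users", "email"),
    ("users", "display_name"),
    ("users", "password_hash"),
    ("api_keys", "key_hash"),
]
print(find_sensitive_columns(current_schema))
```

A non-empty result would be a prompt to revisit this page before the new column ships, not an automatic classification.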

Controls applied uniformly across classes

Today, every data class gets the same control surface: encryption at rest, role-and-org access control, and audit logging.

The two divergences from "uniform controls" are:

  • Secrets are stored in GCP Secret Manager (separate API surface), not in Postgres.
  • Public user-facing TLS is not yet in place — pending plan step A1.3. Until A1.3 lands, all Confidential / Customer Data is in scope for that gap as well.

Status

partial — verified 2026-04-29.

What's in place:

  • A complete and current inventory of data classes the application stores.
  • A negative inventory of classes the application does not store.
  • Uniform security controls across classes, documented across the trust portal.

Known gaps

  • No formal classification scheme with per-field labels (e.g., confidential vs internal columns annotated in the schema or in a data-dictionary doc).
  • No per-class handling rules (who can see, copy, export which class — today this is encoded uniformly via role-and-org auth).
  • No data-handling training material keyed to classification.
  • No per-row tagging that distinguishes content originating from end-user PII from operator-entered narrative.
  • The password_plain column lives in the Restricted / Secret class and is a known security gap — see Authentication.

Roadmap

  • Pick a scheme (likely Public / Internal / Confidential / Restricted) and apply per-field labels in a single data-dictionary doc.
  • Publish per-class handling rules in the trust portal once the scheme is chosen.
  • Annotate sensitive columns in the SQL schema (a SQL comment is enough to start) so future engineers see the class without cross-referencing this page.
  • Re-classify once new data classes are added (e.g., if billing is added, payment data is a new tier; if HIPAA scope ever lands, the entire scheme is reconsidered).
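
The data-dictionary and schema-annotation items could start from one small mapping that also emits the SQL comments. This is a minimal sketch, assuming the four-tier scheme is the one adopted and that PostgreSQL's COMMENT ON COLUMN statement is used for annotation. The column names come from the inventory on this page; the tier assigned to each column, the DATA_DICTIONARY name, and the comment_statements helper are illustrative assumptions, not decisions that have been made.

```python
from enum import Enum


class DataClass(Enum):
    PUBLIC = "Public"
    INTERNAL = "Internal"
    CONFIDENTIAL = "Confidential"
    RESTRICTED = "Restricted"


# (table, column) -> tier. The mapping from today's informal classes to
# these tiers is an assumption made for illustration.
DATA_DICTIONARY = {
    ("users", "email"): DataClass.INTERNAL,
    ("users", "password_hash"): DataClass.RESTRICTED,
    ("api_keys", "key_hash"): DataClass.RESTRICTED,
    ("organizations", "name"): DataClass.INTERNAL,
}


def comment_statements(dictionary):
    """Build one COMMENT ON COLUMN statement per labelled column.

    Identifiers are interpolated directly, so the dictionary must contain
    only trusted, statically defined table and column names.
    """
    statements = []
    for (table, column), data_class in sorted(
        dictionary.items(), key=lambda item: item[0]
    ):
        statements.append(
            f"COMMENT ON COLUMN {table}.{column} IS 'class: {data_class.value}';"
        )
    return statements


for statement in comment_statements(DATA_DICTIONARY):
    print(statement)
```

Keeping the labels in one mapping means the data-dictionary doc and the schema comments are generated from the same source and cannot drift apart.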