Data Classification

Statement

ClaimGuard's data-classification posture today is informal: we know what data classes the application stores, we apply the same security controls (encryption at rest, role-and-org access, audit logging) across all of them, and we do not yet hold a class of data significant enough to warrant separate handling rules. There is no formal three- or four-tier classification scheme (Public / Internal / Confidential / Restricted) recorded against each table or field.

The data-classification control is operating: the underlying inventory is real, current, and documented on this page, with uniform encryption / access / audit controls applied across classes. Formalization items (per-field labels, handling rules per tier, training, sign-off) are queued.

Implementation

What ClaimGuard stores

Inventoried by table / category, with a candidate informal classification for each:

Category	Examples	Informal class	Notes
User identity	`users.email`, `display_name`, `full_name`	Internal / Personal Data	Standard PII for SaaS user accounts.
Authentication credentials	the password hash field, the admin-recovery field, the API-key hash field	Restricted / Secret	The admin-recovery field is a known gap — see Authentication.
Org metadata	`organizations.name`, settings, configuration	Internal	Org-scoped admin data.
Claim records	`claims.*` — narrative, identifiers, status, timestamps	Confidential / Customer Data	The core business object.
Evidence files	Uploaded images, video, documents in GCS	Confidential / Customer Data	Stored encrypted at rest by GCS default.
Analysis results	analysis records, `analysis_results` — model outputs, verdicts, risk scores	Confidential / Customer Data	Inherits the parent claim's class.
Review notes	review notes — analyst free-text per claim	Confidential / Customer Data	May contain operator opinions on claims.
Audit signals	`last_login_at`, `failed_login_attempts`, application logs	Internal	Operational; access-controlled.
Cloud audit logs (GCP)	IAM, secret access, VM lifecycle	Internal	400-day immutable retention. See Audit logging (cloud).
Secrets	`JWT_SECRET`, `DATABASE_URL`, `API_KEY`	Restricted / Secret	In Secret Manager. See Secrets management.
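As an illustration only, the inventory above could be kept in a machine-readable form so that it can be grouped or diffed over time. The sketch below transcribes the category names, informal classes, and the example identifiers that appear in the table; the `Category` type and the `categories_by_class` helper are hypothetical names, not part of the ClaimGuard codebase.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    """One row of the inventory table (hypothetical structure)."""
    name: str
    informal_class: str
    examples: tuple[str, ...]


# Rows transcribed from the inventory table; examples are left empty where
# the table names a field only descriptively.
INVENTORY = (
    Category("User identity", "Internal / Personal Data",
             ("users.email", "display_name", "full_name")),
    Category("Authentication credentials", "Restricted / Secret", ()),
    Category("Org metadata", "Internal", ("organizations.name",)),
    Category("Claim records", "Confidential / Customer Data", ("claims.*",)),
    Category("Evidence files", "Confidential / Customer Data", ()),
    Category("Analysis results", "Confidential / Customer Data",
             ("analysis_results",)),
    Category("Review notes", "Confidential / Customer Data", ()),
    Category("Audit signals", "Internal",
             ("last_login_at", "failed_login_attempts")),
    Category("Cloud audit logs (GCP)", "Internal", ()),
    Category("Secrets", "Restricted / Secret",
             ("JWT_SECRET", "DATABASE_URL", "API_KEY")),
)


def categories_by_class(inventory):
    """Group category names under their informal class, in table order."""
    grouped = defaultdict(list)
    for category in inventory:
        grouped[category.informal_class].append(category.name)
    return dict(grouped)


if __name__ == "__main__":
    for informal_class, names in categories_by_class(INVENTORY).items():
        print(f"{informal_class}: {', '.join(names)}")
```

Grouping by class makes it easy to see which categories would be affected if a future scheme attaches handling rules to a particular tier.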

What we do not store

For audit clarity, the application schema has been inspected to confirm that the following classes are not present today:

No SSN, national ID, or government identifier columns.
No payment card data. No PCI scope.
No bank-account or routing-number columns.
No medical / health record columns. No HIPAA scope at present.
No date-of-birth column in the user or claim schema.
No phone number, postal address, or geolocation fields in the user record.
No biometric data.

If any of those classes are added in a future feature, the classification of that table needs to graduate from this informal scheme to an explicit one.
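One way to notice when such a class is introduced is a name-based scan of schema columns. The following is a minimal sketch under stated assumptions: the regular expressions and the `flag_columns` helper are hypothetical, column names may not follow these conventions, and a clean result does not prove a class is absent. Phone, address, and geolocation are omitted because the statement above scopes them to the user record only.

```python
import re

# Hypothetical name patterns for classes listed as "not stored". A match is
# a prompt for human review, not a classification decision.
PROHIBITED_PATTERNS = {
    "government identifier": re.compile(r"(^|_)(ssn|national_id|passport)(_|$)"),
    "payment card": re.compile(r"(^|_)(card_number|pan|cvv)(_|$)"),
    "bank account": re.compile(r"(^|_)(iban|routing_number|account_number)(_|$)"),
    "health record": re.compile(r"(^|_)(diagnosis|medical|health)(_|$)"),
    "date of birth": re.compile(r"(^|_)(dob|date_of_birth|birth_date)(_|$)"),
    "biometric": re.compile(r"(^|_)(fingerprint|face_template|biometric)(_|$)"),
}


def flag_columns(columns):
    """Return (column, data_class) pairs whose name matches a pattern.

    Each entry may be "table.column" or a bare column name; only the
    column part is matched, case-insensitively.
    """
    hits = []
    for column in columns:
        name = column.rsplit(".", 1)[-1].lower()
        for data_class, pattern in PROHIBITED_PATTERNS.items():
            if pattern.search(name):
                hits.append((column, data_class))
    return hits


if __name__ == "__main__":
    # Column names taken from the inventory on this page.
    current = ["users.email", "display_name", "full_name",
               "last_login_at", "failed_login_attempts"]
    print(flag_columns(current))
```

A check like this could run in CI against migration files, with any hit routed to whoever owns this page for a classification decision.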

Controls applied uniformly across classes

Today, every data class gets the same control surface:

Encryption at rest under GMEK (see Encryption at rest).
Role + org access at the application layer (see Authorization).
Outbound TLS for any cloud-control-plane traffic (see Cryptography).
Audit-logged privileged access at the cloud layer (Audit logging (cloud)).

The two divergences from "uniform controls" are:

Secrets are stored in GCP Secret Manager (separate API surface), not in Postgres.
Public user-facing TLS is not yet in place; it is pending the LB rollout step. Until that step lands, all Confidential / Customer Data is in scope for that gap as well.

Status

implemented (verified 2026-05-06). The data inventory is current, uniform controls are operating across classes, and this page is the live single source of truth. The known gaps below are formalization items.

What's in place:

A complete and current inventory of data classes the application stores.
A negative inventory of classes the application does not store.
Uniform security controls across classes, documented across the trust portal.

Known gaps

No formal classification scheme with per-field labels (e.g., confidential vs internal columns annotated in the schema or in a data-dictionary doc).
No per-class handling rules (who can see, copy, or export each class; today this is encoded uniformly via role-and-org auth).
No data-handling training material keyed to classification.
No per-row tagging that distinguishes fields originating from end-user PII from operator-entered narrative.
The admin-recovery field on the user record is in the Restricted / Secret class and is a known security gap; see Authentication.

Roadmap

Pick a scheme (likely Public / Internal / Confidential / Restricted) and apply per-field labels in a single data-dictionary doc.
Write per-class handling rules in the trust portal once the scheme is chosen.
Annotate sensitive columns in the SQL schema (a SQL comment is enough to start) so future engineers see the class without cross-referencing this page.
Re-classify once new data classes are added (e.g., if billing is added, payment data is a new tier; if HIPAA scope ever lands, the entire scheme is reconsidered).
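The schema-annotation item could start from a small label map that renders PostgreSQL `COMMENT ON COLUMN` statements. This is a sketch, not the chosen approach: `FIELD_LABELS`, the `classification:` prefix, and the helper names are hypothetical; the three fields and their classes come from the inventory on this page, and the string quoting assumes PostgreSQL's default `standard_conforming_strings` setting.

```python
# Hypothetical per-field labels. The final wording would follow whichever
# classification scheme is eventually chosen.
FIELD_LABELS = {
    ("users", "email"): "Internal / Personal Data",
    ("users", "full_name"): "Internal / Personal Data",
    ("organizations", "name"): "Internal",
}


def quote_ident(identifier):
    """Double-quote a SQL identifier, escaping embedded double quotes."""
    return '"' + identifier.replace('"', '""') + '"'


def quote_literal(text):
    """Single-quote a SQL string literal, escaping embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def comment_statements(labels):
    """Render one COMMENT ON COLUMN statement per labeled field, sorted."""
    return [
        f"COMMENT ON COLUMN {quote_ident(table)}.{quote_ident(column)} "
        f"IS {quote_literal('classification: ' + label)};"
        for (table, column), label in sorted(labels.items())
    ]


if __name__ == "__main__":
    for statement in comment_statements(FIELD_LABELS):
        print(statement)
```

Keeping the labels in one map means the data-dictionary doc and the schema comments can be generated from the same source rather than maintained separately.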