Data Classification
Statement
ClaimGuard's data-classification posture today is informal: we know what data classes the application stores, we apply the same security controls (encryption at rest, role-and-org access, audit logging) across all of them, and we do not yet hold a class of data significant enough to warrant separate handling rules. There is no formal three- or four-tier classification scheme (Public / Internal / Confidential / Restricted) recorded against each table or field.
The control sits at partial because the underlying inventory is real and current — and is documented on this page — but the formalization (per-field labels, handling rules per tier, training, sign-off) is queued.
Implementation
What ClaimGuard stores
Inventoried by table / category, with a candidate informal classification for each:
| Category | Examples | Informal class | Notes |
|---|---|---|---|
| User identity | users.email, display_name, full_name | Internal / Personal Data | Standard PII for SaaS user accounts. |
| Authentication credentials | users.password_hash, users.password_plain, api_keys.key_hash | Restricted / Secret | The password_plain column is a known gap; see Authentication. |
| Org metadata | organizations.name, settings, configuration | Internal | Org-scoped admin data. |
| Claim records | claims.* (narrative, identifiers, status, timestamps) | Confidential / Customer Data | The core business object. |
| Evidence files | Uploaded images, video, documents in GCS | Confidential / Customer Data | Stored encrypted at rest by GCS default. |
| Analysis results | analysis_runs, layover_results (model outputs, verdicts, risk scores) | Confidential / Customer Data | Inherits the parent claim's class. |
| Review notes | review_notes (analyst free-text per claim) | Confidential / Customer Data | May contain operator opinions on claims. |
| Audit signals | last_login_at, failed_login_attempts, application logs | Internal | Operational; access-controlled. |
| Cloud audit logs (GCP) | IAM, secret access, VM lifecycle | Internal | 400-day immutable retention. See Audit logging (cloud). |
| Secrets | JWT_SECRET, DATABASE_URL, GOOGLE_API_KEY | Restricted / Secret | In Secret Manager. See Secrets management. |
What we do not store
For audit clarity, the application schema has been inspected for classes that are not present today:
- No SSN, national ID, or government identifier columns.
- No payment card data. No PCI scope.
- No bank-account or routing-number columns.
- No medical / health record columns. No HIPAA scope at present.
- No date-of-birth column in the user or claim schema.
- No phone number, postal address, or geolocation fields in the user record.
- No biometric data.
If any of those classes are added in a future feature, the classification of that table needs to graduate from this informal scheme to an explicit one.
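The negative inventory above could be re-checked automatically whenever the schema changes. The following is a minimal sketch, not ClaimGuard's actual tooling: the pattern list, the function name `find_sensitive_columns`, and the `table.column` input format are all assumptions made for illustration, and a name-based match is only a first-pass heuristic that does not replace a manual review.

```python
import re

# Hypothetical name patterns for some of the classes listed above.
# These are assumptions; tune them to the real schema's naming conventions.
FORBIDDEN_PATTERNS = {
    "government identifier": r"(^|_)(ssn|national_id|passport)(_|$)",
    "payment card": r"(^|_)(card_number|pan|cvv)(_|$)",
    "bank account": r"(^|_)(iban|routing_number|account_number)(_|$)",
    "health record": r"(^|_)(diagnosis|medical|health)(_|$)",
    "date of birth": r"(^|_)(dob|date_of_birth|birth_date)(_|$)",
    "biometric": r"(^|_)(fingerprint|face_template|biometric)(_|$)",
}


def find_sensitive_columns(columns):
    """Return (column, data_class) pairs whose column name matches a pattern."""
    hits = []
    for column in columns:
        # Match on the column part only, e.g. "users.email" -> "email".
        name = column.rsplit(".", 1)[-1].lower()
        for data_class, pattern in FORBIDDEN_PATTERNS.items():
            if re.search(pattern, name):
                hits.append((column, data_class))
    return hits


if __name__ == "__main__":
    # Example input; in practice the list would come from the live schema.
    current = ["users.email", "users.display_name", "claims.status"]
    for column, data_class in find_sensitive_columns(current):
        print(f"{column}: looks like {data_class}, classification review needed")
```

A check like this could run in CI against the column list so that a new feature adding, say, a date-of-birth column triggers the classification review described above.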
Controls applied uniformly across classes
Today, every data class gets the same control surface:
- Encryption at rest under GMEK, i.e. Google-managed encryption keys (see Encryption at rest).
- Role + org access at the application layer (see Authorization).
- Outbound TLS for any cloud-control-plane traffic (see Cryptography).
- Audit-logged privileged access at the cloud layer (see Audit logging (cloud)).
The two divergences from "uniform controls" are:
- Secrets are stored in GCP Secret Manager (separate API surface), not in Postgres.
- Public user-facing TLS is not yet in place — pending plan step A1.3. Until A1.3 lands, all Confidential / Customer Data is in scope for that gap as well.
Status
partial — verified 2026-04-29.
What's in place:
- A complete and current inventory of data classes the application stores.
- A negative inventory of classes the application does not store.
- Uniform security controls across classes, documented across the trust portal.
Known gaps
- No formal classification scheme with per-field labels (e.g., confidential vs internal columns annotated in the schema or in a data-dictionary doc).
- No per-class handling rules (who can see, copy, or export which class; today this is encoded uniformly via role-and-org auth).
- No data-handling training material keyed to classification.
- No per-row tagging that distinguishes fields originating from end-user PII from operator-entered narrative.
- The password_plain column lives in the Restricted / Secret class and is a known security gap; see Authentication.
Roadmap
- Pick a scheme (likely Public / Internal / Confidential / Restricted) and apply per-field labels in a single data-dictionary doc.
- Per-class handling rules in the trust portal once the scheme is chosen.
- Sensitive-column annotation in the SQL schema (a SQL comment is enough to start) so future engineers see the class without cross-referencing this page.
- Re-classify once new data classes are added (e.g., if billing is added, payment data is a new tier; if HIPAA scope ever lands, the entire scheme is reconsidered).
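The data-dictionary and SQL-comment items above can be connected by generating the comments from the dictionary, so the two never drift apart. The following is a rough sketch under stated assumptions: the `DATA_DICTIONARY` structure, the `classification:` comment prefix, and the function name `comment_statements` are hypothetical, and the class labels are simply the informal ones from the inventory table. It emits PostgreSQL `COMMENT ON COLUMN` statements and assumes table and column names are plain identifiers that need no quoting.

```python
# Hypothetical data dictionary: (table, column) -> class label.
# The entries reuse the informal classes from the inventory table.
DATA_DICTIONARY = {
    ("users", "email"): "Internal / Personal Data",
    ("users", "password_hash"): "Restricted / Secret",
    ("api_keys", "key_hash"): "Restricted / Secret",
    ("organizations", "name"): "Internal",
}


def comment_statements(dictionary):
    """Render one COMMENT ON COLUMN statement per labelled column."""
    statements = []
    for (table, column), label in sorted(dictionary.items()):
        # Double any single quote so the label is a valid SQL string literal.
        literal = label.replace("'", "''")
        statements.append(
            f"COMMENT ON COLUMN {table}.{column} IS 'classification: {literal}';"
        )
    return statements


if __name__ == "__main__":
    # Print the statements so they can be reviewed before being applied.
    print("\n".join(comment_statements(DATA_DICTIONARY)))
```

Keeping the dictionary as the single source and treating the SQL comments as generated output means a re-classification only has to be edited in one place.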