
test(provisioner): multi-tenant deprovision-scoping regression (truehomie DROP-incident class)#46

Merged
mastermanas805 merged 1 commit into master from test/multi-tenant-deprovision-scoping
Jun 4, 2026


@mastermanas805


What

Adds a real-Postgres + real-Redis integration regression test for the
2026-06-03 truehomie-db DROP incident class: prove that deprovisioning
tenant A drops ONLY A's database+role (postgres) / ACL user+keys (redis) and
leaves a co-resident tenant B fully intact.

Why #45 didn't cover this

PR #45 (server_live_roundtrip_test.go) proved single-tenant
Provision→Regrade→Deprovision→re-Deprovision round-trips through the real gRPC
handlers for postgres + redis, including DROP-IF-EXISTS idempotency. It does
not prove scoping — that a deprovision is confined to its target tenant.
The truehomie incident was an unscoped/over-broad DROP taking out a neighbor;
that is precisely the assertion #45 lacked and this PR adds. No duplication of
the single-tenant idempotency path.
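
To make the idempotency-versus-scoping distinction concrete, here is a minimal sketch of what a tenant-scoped teardown looks like at the SQL level. It is not taken from the provisioner: the helper names (`quoteIdent`, `dropStatements`) and the idea that the `db_` / `usr_` names are derived from a single per-tenant token are assumptions made for illustration. `IF EXISTS` is what makes a repeat call a no-op; the exact, quoted identifier is what keeps the statement from touching a neighbor.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent quotes a Postgres identifier, doubling any embedded
// double quotes so the name cannot break out of the quoting.
func quoteIdent(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// dropStatements is a hypothetical helper: it returns the teardown
// statements for exactly one tenant. The token-to-name mapping
// (db_<token>, usr_<token>) is an assumption for this sketch.
func dropStatements(token string) []string {
	return []string{
		// IF EXISTS: a second deprovision finds nothing and succeeds.
		"DROP DATABASE IF EXISTS " + quoteIdent("db_"+token),
		"DROP USER IF EXISTS " + quoteIdent("usr_"+token),
	}
}

func main() {
	for _, stmt := range dropStatements("tenanta") {
		fmt.Println(stmt)
	}
}
```

Idempotency can be checked with one tenant by running the statements twice. Scoping can only be checked with a second tenant present, which is why a single-tenant round-trip cannot catch an over-broad DROP.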

What this test does

server_multitenant_scoping_test.go provisions TWO co-resident tenants A+B
through the genuine server.ProvisionResource gRPC handler against a real
backend, seeds data into each (via each tenant's OWN returned ConnectionUrl
credentials), then DeprovisionResource(A) and asserts:

  • Postgres (TestServer_Postgres_Deprovision_IsScopedToTargetTenant):
    A's db_/usr_ gone (DROP ran) AND B's database + role still exist, B's
    seeded row intact, B can still CONNECT with its own credentials.
  • Redis (TestServer_Redis_Deprovision_IsScopedToTargetTenant):
    A's ACL user + namespace key reaped AND B's ACL user + key + value survive.
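
The shape of these assertions can be sketched without a live backend. The example below uses an in-memory stand-in for the shared cluster; `fakeCluster`, `checkScoped`, and the tenant tokens are all hypothetical names for illustration, and the real test drives the gRPC handlers against real Postgres and Redis instead.

```go
package main

import "fmt"

// fakeCluster is a hypothetical in-memory stand-in for a shared cluster:
// database name -> seeded rows, plus the set of existing roles.
type fakeCluster struct {
	databases map[string][]string
	roles     map[string]bool
}

func newFakeCluster() *fakeCluster {
	return &fakeCluster{databases: map[string][]string{}, roles: map[string]bool{}}
}

// provision creates one tenant's database and role.
func (c *fakeCluster) provision(token string) {
	c.databases["db_"+token] = []string{}
	c.roles["usr_"+token] = true
}

// seed writes one row into a tenant's database.
func (c *fakeCluster) seed(token, row string) {
	c.databases["db_"+token] = append(c.databases["db_"+token], row)
}

// deprovision removes exactly the target tenant's artifacts by exact name.
func (c *fakeCluster) deprovision(token string) {
	delete(c.databases, "db_"+token)
	delete(c.roles, "usr_"+token)
}

// checkScoped verifies both halves of the scoping claim: tenant a is gone,
// and tenant b still has its database, role, and sentinel row.
func checkScoped(c *fakeCluster, a, b, sentinel string) error {
	if _, ok := c.databases["db_"+a]; ok {
		return fmt.Errorf("database of %s still exists", a)
	}
	if c.roles["usr_"+a] {
		return fmt.Errorf("role of %s still exists", a)
	}
	rows, ok := c.databases["db_"+b]
	if !ok || !c.roles["usr_"+b] {
		return fmt.Errorf("co-resident tenant %s lost its database or role", b)
	}
	if len(rows) != 1 || rows[0] != sentinel {
		return fmt.Errorf("co-resident tenant %s lost its seeded row", b)
	}
	return nil
}

func main() {
	c := newFakeCluster()
	c.provision("tenanta")
	c.provision("tenantb")
	c.seed("tenanta", "a-row")
	c.seed("tenantb", "b-sentinel")

	c.deprovision("tenanta")

	if err := checkScoped(c, "tenanta", "tenantb", "b-sentinel"); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("tenant B survived deprovision of tenant A")
}
```

The point of the two-sided check is that "A is gone" alone would also pass for an over-broad deprovision; only the survival assertions on B distinguish a scoped teardown from an unscoped one.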

How tenant B survival was proven

Ran locally against real Postgres 16 + Redis 7 — both PASS. The postgres test
reconnects with tenant B's own ConnectionUrl after A's deprovision and reads
back the sentinel row, so it asserts connectivity + data integrity, not just
catalog presence. Skips cleanly under -short or when no backend is available.

CI

Env-gated on TEST_POSTGRES_CUSTOMERS_URL + CUSTOMER_REDIS_URL (same env
resolution as #45). provisioner coverage.yml already provides pg+redis
services and runs go test ./... -p 1 without -short, so these tests
execute for real in the gating coverage job. No workflow changes needed.
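
As an illustration of the gating described above, the following is a minimal sketch of a skip decision, assuming the gate is simply "not -short and the URL variable is non-empty". The helper name `skipReason` is hypothetical and is not the repository's actual env-resolution code.

```go
package main

import (
	"fmt"
	"os"
)

// skipReason returns a non-empty reason when a live-backend test should be
// skipped, and "" when it should run. getenv is injected so the decision
// can be exercised without touching the process environment.
func skipReason(short bool, envKey string, getenv func(string) string) string {
	if short {
		return "live-backend test skipped under -short"
	}
	if getenv(envKey) == "" {
		return envKey + " is not set; no backend to test against"
	}
	return ""
}

func main() {
	// In a real test this would read roughly:
	//   if r := skipReason(testing.Short(), key, os.Getenv); r != "" { t.Skip(r) }
	for _, key := range []string{"TEST_POSTGRES_CUSTOMERS_URL", "CUSTOMER_REDIS_URL"} {
		if r := skipReason(false, key, os.Getenv); r != "" {
			fmt.Println("skip:", r)
			continue
		}
		fmt.Println("run: backend configured via", key)
	}
}
```

Because the coverage job runs without -short and provides both services, a gate of this shape would let the tests execute there while still skipping on machines with no backend.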

Coverage block

Symptom:        unscoped DROP DATABASE/DROP USER (or ACL DELUSER/SCAN+DEL) on
                deprovision takes out a co-resident tenant (truehomie 2026-06-03)
Enumeration:    gRPC DeprovisionResource handler path, postgres+redis LocalBackend
Sites found:    2 (postgres deprovision, redis deprovision)
Sites touched:  2 (both have a co-resident-survival regression test)
Coverage test:  TestServer_{Postgres,Redis}_Deprovision_IsScopedToTargetTenant
Live verified:  PASS vs local Postgres 16 + Redis 7 (real backends); B survives A

Integration-coverage delta (§1.4 Mechanism C): server-package statement % is
unchanged (19.9%) because the new tests exercise the same Deprovision handler
lines #45 already covers. The new value is a flow-completeness / behavioral
isolation guarantee (Decision 1A), not new lines.

🤖 Generated with Claude Code

…omie DROP-incident class)

PR #45 proved single-tenant Provision/Regrade/Deprovision/idempotency round-trips
for postgres + redis through the real gRPC handlers. It did NOT prove SCOPING:
that a deprovision is confined to the target tenant. That is the gap the
2026-06-03 truehomie-db DROP incident exposed (an active Pro customer's db+role
dropped while a co-resident tenant shared the cluster).

Adds server_multitenant_scoping_test.go: provisions TWO co-resident tenants A+B
through the genuine gRPC ProvisionResource handler against a real Postgres / real
Redis, seeds data into each, deprovisions ONLY A, and asserts B fully survives.

- Postgres: after Deprovision(A), A's db_/usr_ are gone (DROP ran) AND B's
  database + role still exist, B's seeded row is intact, and B can still CONNECT
  with its own ConnectionUrl credentials.
- Redis: after Deprovision(A), A's ACL user + namespace key are reaped AND B's
  ACL user + namespace key + value survive.

Env-gated identically to server_live_roundtrip_test.go (skips clean under
-short / no backend; runs for real in coverage.yml's pg+redis services and local
dev backends). Verified PASS locally against Postgres 16 + Redis 7.

Coverage block:
  Symptom:        unscoped DROP DATABASE/DROP USER (or ACL DELUSER/SCAN+DEL) on
                  deprovision takes out a co-resident tenant (truehomie 2026-06-03)
  Enumeration:    gRPC DeprovisionResource handler path, postgres+redis LocalBackend
  Sites found:    2 (postgres deprovision, redis deprovision)
  Sites touched:  2 (both have a co-resident-survival regression test)
  Coverage test:  TestServer_{Postgres,Redis}_Deprovision_IsScopedToTargetTenant
  Live verified:  PASS vs local Postgres 16 + Redis 7 (real backends); B survives A

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mastermanas805 mastermanas805 enabled auto-merge (squash) June 4, 2026 16:13
@mastermanas805 mastermanas805 merged commit bb1935b into master Jun 4, 2026
12 checks passed
mastermanas805 added a commit that referenced this pull request Jun 4, 2026
…age (Wave 4) (#47)

* test(provisioner): real-backend gRPC round-trips for Mongo/Queue/Storage (Wave 4)

Closes the remaining cells of the provisioner gRPC × backend matrix
(INTEGRATION-COVERAGE-PLAN §2.3 / Wave 4). Postgres + Redis server-layer
round-trips already shipped (#45, #46); this adds the real-backend
Provision → assert artifact → Deprovision → assert-gone lifecycle for the
three remaining backends, all driven through the genuine gRPC handlers
(breaker wrapping, tier routing, mapError, response shaping):

- Mongo: ProvisionResource creates usr_/db_ on a real MongoDB, GetStorageBytes
  reads real dbStats (>0 after seeding), DeprovisionResource runs the real
  dropUser/dropDatabase (truehomie DROP-incident class), second Deprovision is
  a clean idempotent no-op, and Regrade(mongo) asserts the documented skip path.
- Queue (NATS): ProvisionResource passes the real NATS monitor health check and
  returns nats:// URL + subject prefix, GetStorageBytes(queue)=0 (message-metered),
  Deprovision is the shared-backend no-op, idempotent.
- Storage (MinIO/S3): GetStorageBytes object-walk — empty prefix=0, after a real
  PutObject=exact byte count, after delete=0. (Storage Provision/Deprovision are
  API-side; provisioner only meters.)

All tests env-gated (skip cleanly under `go test -short`, the deploy gate; run
for real when the backend env is present). CI: added NATS service container +
MinIO docker-run step + the TEST_NATS_HOST / TEST_MINIO_* / CUSTOMER_MONGO_AUTH_URL
env wiring to coverage.yml so they execute (mongo was already provided).

Verified locally against real mongo/nats/minio containers: all 8 server
round-trip tests PASS; integration-only coverage for internal/server = 99.2%
(provisionMongo/provisionQueue/GetStorageBytes/DeprovisionResource/RegradeResource
all 100%). No bug found in the destroy/regrade paths. Added
INTEGRATION-COVERAGE-EXCLUSIONS.md documenting the ≥80 floor method + the only
genuinely-unreachable lines (k8s dedicated-backend boot wiring, cmd/ entrypoints).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* ci(provisioner): start NATS via docker run, not a services container

The minimal nats:2 image has no wget/curl/nc, so the service-container
--health-cmd ('wget ... :8222/healthz') could never pass — GitHub Actions
marked the container unhealthy and aborted the coverage job before any test
ran (NATS logged 'Server is ready'). Mirror the MinIO pattern: docker run
nats:2 -js -m 8222 + a runner-side curl wait on /healthz. Unblocks #47.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Manas Srivastava <[email protected]>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>