test(provisioner): multi-tenant deprovision-scoping regression (truehomie DROP-incident class) #46
Merged
…omie DROP-incident class)

PR #45 proved single-tenant Provision/Regrade/Deprovision/idempotency round-trips for postgres + redis through the real gRPC handlers. It did NOT prove SCOPING: that a deprovision is confined to the target tenant. That is the gap the 2026-06-03 truehomie-db DROP incident exposed (an active Pro customer's db+role dropped while a co-resident tenant shared the cluster).

Adds server_multitenant_scoping_test.go: provisions TWO co-resident tenants A+B through the genuine gRPC ProvisionResource handler against a real Postgres / real Redis, seeds data into each, deprovisions ONLY A, and asserts B fully survives.

- Postgres: after Deprovision(A), A's db_/usr_ are gone (DROP ran) AND B's database + role still exist, B's seeded row is intact, and B can still CONNECT with its own ConnectionUrl credentials.
- Redis: after Deprovision(A), A's ACL user + namespace key are reaped AND B's ACL user + namespace key + value survive.

Env-gated identically to server_live_roundtrip_test.go (skips clean under -short / no backend; runs for real in coverage.yml's pg+redis services and local dev backends). Verified PASS locally against Postgres 16 + Redis 7.

Coverage block:
Symptom: unscoped DROP DATABASE/DROP USER (or ACL DELUSER/SCAN+DEL) on deprovision takes out a co-resident tenant (truehomie 2026-06-03)
Enumeration: gRPC DeprovisionResource handler path, postgres+redis LocalBackend
Sites found: 2 (postgres deprovision, redis deprovision)
Sites touched: 2 (both have a co-resident-survival regression test)
Coverage test: TestServer_{Postgres,Redis}_Deprovision_IsScopedToTargetTenant
Live verified: PASS vs local Postgres 16 + Redis 7 (real backends); B survives A

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
mastermanas805 added a commit that referenced this pull request Jun 4, 2026
…age (Wave 4) (#47)

* test(provisioner): real-backend gRPC round-trips for Mongo/Queue/Storage (Wave 4)

Closes the remaining cells of the provisioner gRPC × backend matrix (INTEGRATION-COVERAGE-PLAN §2.3 / Wave 4). Postgres + Redis server-layer round-trips already shipped (#45, #46); this adds the real-backend Provision → assert artifact → Deprovision → assert-gone lifecycle for the three remaining backends, all driven through the genuine gRPC handlers (breaker wrapping, tier routing, mapError, response shaping):

- Mongo: ProvisionResource creates usr_/db_ on a real MongoDB, GetStorageBytes reads real dbStats (>0 after seeding), DeprovisionResource runs the real dropUser/dropDatabase (truehomie DROP-incident class), second Deprovision is a clean idempotent no-op, and Regrade(mongo) asserts the documented skip path.
- Queue (NATS): ProvisionResource passes the real NATS monitor health check and returns nats:// URL + subject prefix, GetStorageBytes(queue)=0 (message-metered), Deprovision is the shared-backend no-op, idempotent.
- Storage (MinIO/S3): GetStorageBytes object-walk: empty prefix=0, after a real PutObject=exact byte count, after delete=0. (Storage Provision/Deprovision are API-side; provisioner only meters.)

All tests env-gated (skip cleanly under `go test -short`, the deploy gate; run for real when the backend env is present).

CI: added NATS service container + MinIO docker-run step + the TEST_NATS_HOST / TEST_MINIO_* / CUSTOMER_MONGO_AUTH_URL env wiring to coverage.yml so they execute (mongo was already provided).

Verified locally against real mongo/nats/minio containers: all 8 server round-trip tests PASS; integration-only coverage for internal/server = 99.2% (provisionMongo/provisionQueue/GetStorageBytes/DeprovisionResource/RegradeResource all 100%). No bug found in the destroy/regrade paths.

Added INTEGRATION-COVERAGE-EXCLUSIONS.md documenting the ≥80 floor method + the only genuinely-unreachable lines (k8s dedicated-backend boot wiring, cmd/ entrypoints).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* ci(provisioner): start NATS via docker run, not a services container

The minimal nats:2 image has no wget/curl/nc, so the service-container --health-cmd ('wget ... :8222/healthz') could never pass: GitHub Actions marked the container unhealthy and aborted the coverage job before any test ran (NATS logged 'Server is ready'). Mirror the MinIO pattern: docker run nats:2 -js -m 8222 + a runner-side curl wait on /healthz. Unblocks #47.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Manas Srivastava <[email protected]>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What
Adds a real-Postgres + real-Redis integration regression test for the
2026-06-03 truehomie-db DROP incident class: prove that deprovisioning
tenant A drops ONLY A's database+role (postgres) / ACL user+keys (redis) and
leaves a co-resident tenant B fully intact.
Why #45 didn't cover this
PR #45 (server_live_roundtrip_test.go) proved single-tenant
Provision→Regrade→Deprovision→re-Deprovision round-trips through the real gRPC
handlers for postgres + redis, including DROP-IF-EXISTS idempotency. It does
not prove scoping: that a deprovision is confined to its target tenant.
The truehomie incident was an unscoped/over-broad DROP taking out a neighbor;
that is precisely the assertion #45 lacked and this PR adds. No duplication of
the single-tenant idempotency path.
What this test does
server_multitenant_scoping_test.go provisions TWO co-resident tenants A+B
through the genuine server.ProvisionResource gRPC handler against a real
backend, seeds data into each (via each tenant's OWN returned ConnectionUrl
credentials), then calls DeprovisionResource(A) and asserts:
- Postgres (TestServer_Postgres_Deprovision_IsScopedToTargetTenant): A's db_/usr_ gone (DROP ran) AND B's database + role still exist, B's seeded row intact, B can still CONNECT with its own credentials.
- Redis (TestServer_Redis_Deprovision_IsScopedToTargetTenant): A's ACL user + namespace key reaped AND B's ACL user + key + value survive.
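The shape of the two-tenant assertion can be sketched without a live backend. The following is a minimal in-memory illustration only, not the code in server_multitenant_scoping_test.go: fakeCluster, deprovisionScoped, deprovisionUnscoped, and checkScoped are hypothetical names, and the map stands in for the real Postgres catalog. Only the db_/usr_ prefixes come from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// fakeCluster is a hypothetical stand-in for a shared cluster: it records
// which tenant objects (db_<token>, usr_<token>) currently exist.
type fakeCluster struct {
	objects map[string]bool
}

func newFakeCluster() *fakeCluster {
	return &fakeCluster{objects: map[string]bool{}}
}

// provision creates the database and role for one tenant token.
func (c *fakeCluster) provision(token string) {
	c.objects["db_"+token] = true
	c.objects["usr_"+token] = true
}

// deprovisionScoped drops only the objects named exactly for this token.
func (c *fakeCluster) deprovisionScoped(token string) {
	delete(c.objects, "db_"+token)
	delete(c.objects, "usr_"+token)
}

// deprovisionUnscoped models the over-broad bug class (an assumption about
// its shape): it matches on the shared prefixes and hits every tenant.
func (c *fakeCluster) deprovisionUnscoped() {
	for name := range c.objects {
		if strings.HasPrefix(name, "db_") || strings.HasPrefix(name, "usr_") {
			delete(c.objects, name)
		}
	}
}

// checkScoped provisions co-resident tenants "a" and "b", runs the supplied
// deprovision for "a", then checks that A is gone and B is untouched.
func checkScoped(deprovisionA func(c *fakeCluster, token string)) error {
	c := newFakeCluster()
	c.provision("a")
	c.provision("b")
	deprovisionA(c, "a")
	if c.objects["db_a"] || c.objects["usr_a"] {
		return fmt.Errorf("tenant A objects still exist after deprovision")
	}
	if !c.objects["db_b"] || !c.objects["usr_b"] {
		return fmt.Errorf("tenant B objects were dropped by A's deprovision")
	}
	return nil
}

func main() {
	fmt.Println("scoped:", checkScoped((*fakeCluster).deprovisionScoped))
	fmt.Println("unscoped:", checkScoped(func(c *fakeCluster, _ string) {
		c.deprovisionUnscoped()
	}))
}
```

A single-tenant round-trip would pass against both variants, since A is gone either way; only the second tenant makes the over-broad variant fail.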
How tenant B survival was proven
Ran locally against real Postgres 16 + Redis 7 — both PASS. The postgres test
reconnects with tenant B's own ConnectionUrl after A's deprovision and reads
back the sentinel row, so it asserts connectivity + data integrity, not just
catalog presence. Skips clean under -short / no backend.
CI
Env-gated on TEST_POSTGRES_CUSTOMERS_URL + CUSTOMER_REDIS_URL (same env
resolution as #45). provisioner coverage.yml already provides pg+redis
services and runs go test ./... -p 1 without -short, so these tests
execute for real in the gating coverage job. No workflow changes needed.
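As a rough illustration of the env-gating described here, the skip decision can be factored into a small pure function. This is a sketch under assumptions: skipReason is a hypothetical helper, not the resolution logic shared with #45, and only the two environment variable names are taken from the text.

```go
package main

import (
	"fmt"
	"os"
)

// skipReason returns a non-empty reason when a live-backend test should be
// skipped, or "" when it should run. Hypothetical helper for illustration.
func skipReason(short bool, getenv func(string) string, envVar string) string {
	if short {
		return "skipping live-backend test under -short"
	}
	if getenv(envVar) == "" {
		return envVar + " not set; no backend available"
	}
	return ""
}

func main() {
	vars := []string{"TEST_POSTGRES_CUSTOMERS_URL", "CUSTOMER_REDIS_URL"}
	for _, v := range vars {
		if r := skipReason(false, os.Getenv, v); r != "" {
			fmt.Println("skip:", r)
		} else {
			fmt.Println("run:", v, "is set")
		}
	}
}
```

Inside a real test the same check would typically be wired as `if r := skipReason(testing.Short(), os.Getenv, "CUSTOMER_REDIS_URL"); r != "" { t.Skip(r) }`, so the test is reported as skipped rather than failed when no backend is present.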
Coverage block
Integration-coverage delta (§1.4 Mechanism C): server-package statement % is
unchanged (19.9%) because the new tests exercise the same Deprovision handler
lines #45 already covers — the new value is a flow-completeness / behavioral
isolation guarantee (Decision 1A), not new lines.
🤖 Generated with Claude Code