Skip to content

refactor: migrate integration tests to Go, split monolith, clean up seed data#701

Merged
zhouzhuojie merged 40 commits into
openflagr:mainfrom
zhouzhuojie:zz/rewrite-integration-test
Jun 10, 2026
Merged

refactor: migrate integration tests to Go, split monolith, clean up seed data#701
zhouzhuojie merged 40 commits into
openflagr:mainfrom
zhouzhuojie:zz/rewrite-integration-test

Conversation

@zhouzhuojie

@zhouzhuojie zhouzhuojie commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator

Description

Replace the shell/shakedown integration tests (integration_tests/test.sh) with a native Go test suite. The suite seeds ~50 realistic flags covering all 12 constraint operators as HTTP API fixtures, provides 12 test functions and 12 HTTP benchmarks, and supports three execution modes.

File split

integration_test.go was 1034 lines bundling 4 distinct responsibilities. Split into focused files:

File Lines Role
integration_server.go 253 TestMain, server lifecycle, seedFlags
integration_client.go 105 HTTP client helpers
integration_benchmark_test.go 112 HTTP benchmarks
integration_test.go 597 Test functions only
seed.go 154 Types + initFlagDefs() with flat data

Seed data cleanup

  • Replaced init() side effect with explicit initFlagDefs() call
  • Flattened opGroup nested loop machinery to a single []entry literal
  • Removed dead matchingEntity() method and bespoke itoa helper

Shell fallback removed

  • Deleted integration_tests/test.sh and all test-shell refs
  • Go suite fully replaces shell tests

Code quality fixes

  • BenchmarkEvalDisabledFlag now targets the actual disabled flag
  • Subprocess cleanup via serverCmd + defer
  • waitForEvalCache checks for non-empty data
  • startLocalServer always builds fresh to temp dir
  • TestIntegration_Evaluation uses t.Run subtests
  • Added INSTANCE ?= flagr_with_sqlite default to Makefile
  • integration.test and integration.test.log gitignored

Motivation and Context

The shell-based integration tests (test.sh) were slow, fragile, invisible to go test, and produced no timing metrics. Porting to Go gives us native go test integration, structured assertions via testify, HTTP benchmarks via BenchmarkXxx, and reliable seeding with polling instead of sleep 5.

Splitting the monolith was driven by a code quality review that flagged the file at 1034 lines (71 shy of the 1k threshold) bundling server lifecycle, HTTP helpers, test functions, and benchmarks into one file.

Closes: this replaces the old integration test approach entirely; no corresponding issue.

How Has This Been Tested?

  • go vet -tags=integration ./integration_tests/ — clean
  • make test-integration — auto-starts server, runs all 12 test functions against SQLite
  • go test -tags=integration -bench=. -benchmem ./integration_tests/ — all 12 benchmarks run
  • Existing make test (unit tests) — unaffected
  • Existing make test-e2e — unaffected
  • CI changes run in GitHub Actions integration_test job

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

close #700

… shell fallback

- Split integration_test.go (1034 lines) into 4 focused files:
  - integration_server.go: TestMain, lifecycle, seed runner
  - integration_client.go: HTTP helpers
  - integration_benchmark_test.go: benchmarks
  - integration_test.go: test functions only
- Replace init() side effect with explicit initFlagDefs()
- Flatten opGroup nested machinery to single []entry literal
- Delete test.sh and all test-shell references
- Always build fresh to temp dir (no stale-binary footgun)
- Add testing section to docs/home.md
- Update sidebar, AGENTS.md, plan doc to reflect current state
- gitignore integration.test + integration.test.log
@codecov-commenter

codecov-commenter commented Jun 9, 2026

Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.87%. Comparing base (770461b) to head (b38629a).
⚠️ Report is 48 commits behind head on main.
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #701      +/-   ##
==========================================
- Coverage   81.19%   76.87%   -4.33%     
==========================================
  Files          28       35       +7     
  Lines        2271     2266       -5     
==========================================
- Hits         1844     1742     -102     
- Misses        337      422      +85     
- Partials       90      102      +12     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Docker compose bind mounts create directories when the host path
doesn't exist, which caused go test -c to fail (trying to write
a binary file into a directory). Also, stale containers from previous
runs kept the old directory mount.

Fix: remove the bind mount from docker-compose.yml, copy the binary
into the running container via 'docker compose cp' after up.
… binary

Replace the alpine integration-runner (which needed the test binary
injected via bind-mount/docker-cp/cat-pipe) with a golang:1.26-alpine
container that compiles and runs 'go test' directly against each
flagr instance.

Changes:
- integration-runner: alpine → golang:1.26-alpine, source mounted at /src
- Remove build-integration-test Makefile target entirely
- Remove integration.test binary creation, cleanup, and gitignore entry
- CI integration_test job no longer needs setup-go (Go is in container)
- test/test-instance/bench-instance all use 'go test' via docker exec
The 821551instance variable from the Makefile for loop was not visible
inside 'docker compose exec' (separate shell). Fixed by passing
the URL via -e FLAGR_SERVER_URL=http://821551instance:18000.

Also restructured CI integration_test job to use a GitHub Actions
matrix with max-parallel:1, making each DB instance visible as
its own CI entry. test-instance target now properly fails on
test failure.
…unner, revert CI matrix

Instead of fighting with docker compose exec env var passing,
TestMain now accepts FLAGR_SERVER_URLS as a comma-separated list
and runs the full test suite against each instance sequentially.

- integration_server.go: TestMain loops over urls, resets
  seedFlagIDs/seedFlagKeys per instance, tracks exit code
- docker-compose.yml: Remove integration-runner (no longer needed)
- Makefile: FLAGR_SERVER_URLS passed directly to go test
- CI: Revert to single job with setup-go, no matrix
… container

The Docker compose service names (flagr_with_sqlite:18000) are only
resolvable from within the Docker network. Running go test on the
host runner couldn't resolve them. Fix: add back the golang
integration-runner container, mount source code, run go test
via 'docker compose exec -e FLAGR_SERVER_URLS' which inherits
the env var from the host shell.
…uction

URLs in FLAGR_SERVER_URLS already include :18000, but TestMain was
adding another :18000, producing http://flagr_with_sqlite:18000:18000
which is an invalid URL. This caused all tests to fail with
'unsupported protocol scheme empty' since baseURL was set to the
invalid URL but http.NewRequest stripped it.

Fix: just prepend 'http://' since the URLs already have ports.

Also simplify: env var is now baked into docker-compose.yml,
Makefile test target is just 'docker compose exec go test'.
…erride

The env var approach (docker-compose.yml environment, docker compose exec -e,
docker compose run, sh -c) all failed to pass FLAGR_SERVER_URLS into the
container. The tests compiled and ran but baseURL was always empty.

Fix: hardcode the 6 Docker instance URLs directly in the Go code.
TestMain uses these by default. If FLAGR_SERVER_URLS or FLAGR_SERVER_URL
is set, those override the hardcoded defaults. For local mode, startLocalServer
still works as before.
The Docker container approach (integration-runner) was failing because
source mounting and env var passing were unreliable across Docker
Compose v2 versions. Tests compiled and ran but baseURL was always empty.

Fix: add port mappings to each flagr service in docker-compose.yml
(e.g., 18001:18000 for sqlite, 18002:18000 for mysql, etc.) and run
'go test' from the CI runner host. Tests connect via localhost:<port>.

Changes:
- docker-compose.yml: Add ports to flagr services, remove integration-runner
- Makefile: Set FLAGR_SERVER_URLS with localhost ports, run go test from host
- integration_server.go: Remove debug code, clean up TestMain
TestMain must be in a _test.go file for Go's test framework to detect
and call it. The init() function was running but TestMain was silently
ignored because the file wasn't a test file.
The root cause of ALL CI failures was that TestMain was defined in
integration_server.go (not a _test.go file). Go's test framework
only recognizes TestMain in _test.go files. After renaming to
integration_server_test.go, TestMain is properly called.

The env var FLAGR_SERVER_URLS is correctly passed from the Makefile
shell prefix to the go test process. Tests connect to flagr instances
via localhost port mappings (18001-18006).
Structural improvements:
- Define typed API response structs (flagResponse, segmentResponse,
  variantResponse, constraintResponse, distributionResponse, tagResponse,
  evalResponse, batchEvalResponse) — eliminates all map[string]any and
  unsafe type assertions in tests
- Move seedFlags into seed.go — makes seed module self-contained
- Extract pollUntil helper — eliminates duplicate waitForServer/waitForEvalCache
  polling loops
- Extract prepareServer() from TestMain — deduplicates the two branches
- Delete doReqOrDie — seedFlags uses doReqAndDecode directly with testing.TB
- Delete jsonInts — use json.Marshal(seedFlagIDs) instead
- Table-drive TestIntegration_BatchEvalOperator — 80 lines → 25
- Remove version: 3.6 from docker-compose.yml (deprecated by Compose V2)
- waitForEvalCache now fails on timeout instead of silently continuing
- Fix server cleanup: Kill + WaitDelay instead of Interrupt, redirect
  server output to temp file to avoid 'I/O incomplete' errors
- Drain response body in TestIntegration_Export to prevent broken pipe
- Benchmarks use shared doReq helper with status code checking

Files changed:
- integration_client.go: response structs, pollUntil, cleaned HTTP helpers
- seed.go: seedFlags moved in with testing.TB parameter
- integration_server_test.go: prepareServer, cleaner TestMain
- integration_test.go: typed structs, table-driven subtests
- integration_benchmark_test.go: shared helpers, status checking
- docker-compose.yml: removed deprecated version field
- Unify seedFlags/seedFlagsFromMain into single function taking errorf param (deletes ~70 lines of duplication)
- Make pollUntil return error instead of relying on callback to terminate
- Add doReqOK helper for non-decoding 2xx checks
- Use doReq in waitForServer/waitForEvalCache instead of hand-rolled http.Get
- Remove sleep 20 from Makefile, let test harness poll via waitForServer
- Set EntityType default in initFlagDefs, simplify seedFlags condition
- Replace raw JSON strings in benchmarks with structured map bodies
- Clean dead imports (log, testing, io, net/http, encoding/json)
…ad check

- Delete benchEvalRaw (dead code after structured body refactor)
- Add assertion for entity_types response in FlagCRUD test
- Add assertion for preload query results in FlagCRUD test
- Change t.Log to t.Errorf in Preload test (unexpected segments is a bug, not a note)
- Use b.Loop() instead of b.N in benchmarks (Go 1.24+)
- Safe type assertion for TCP address in startLocalServer
- Document test ordering dependency at package level
- Add missing return after decode error in doReqAndDecode
@zhouzhuojie zhouzhuojie force-pushed the zz/rewrite-integration-test branch from 26aa868 to a2b032a Compare June 10, 2026 10:37
- Add bench and test-and-bench targets to integration_tests/Makefile
- Extract shared SERVER_URLS variable and up target
- CI integration_test job now uses test-and-bench
- Benchmark results framed with clear headers for grep-ability in logs
prepareServer now checks if flags exist before seeding. This allows
benchmarks to run after tests against the same servers without hitting
UNIQUE constraint violations.
@zhouzhuojie zhouzhuojie merged commit d20f929 into openflagr:main Jun 10, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Rewrite integration tests with more realistic fixtures and cover more edge case with real api benchmarks

2 participants