
feat: legacy SSE MCP transport (GET /sse + POST /messages) #975

Draft
bradgessler wants to merge 2 commits into garrytan:master from bradgessler:feat/legacy-sse-mcp-transport

Conversation

@bradgessler

Context — what we're trying to do

Hi Garry — putting this up as a draft for your guidance rather than a merge ask. Want to make sure the direction lines up with where you want gbrain to go before we go deeper. Happy to close this entirely if it's not the right shape.

Over at gbrain.io we're building a managed-hosting layer for gbrain — teams sign up, we provision a Postgres-backed gbrain instance per workgroup (Fly Machines under the hood), and members point their agents at https://gbrain.io/brains/<id>/mcp. The hosted instance is upstream gbrain unchanged — same gbrain serve --http, same MCP surface, same admin dashboard — with a thin Phoenix proxy on the front that maps a per-org bearer token to the right brain's upstream credentials.

The goal is "make gbrain frictionless for a 5-person workgroup that doesn't want to run a database." We're trying not to fork; everything that's a real product feature should land upstream so other hosters / DIY users get it too.

This is the first PR from that work. There will be more — most of them small. Treat this one as both a feature ask and a "is this the right alignment model" check.

The specific gap

The current gbrain serve --http exposes MCP via POST /mcp (Streamable HTTP transport — the modern MCP spec). That works great for Claude Desktop, ChatGPT, Cursor, and anything built on StreamableHTTPClientTransport.

But there's a real population of MCP clients still pinned to SSEClientTransport:

  • openclaw (the agent runtime we're shipping in hosted Harnesses)
  • The older MCP TypeScript SDK
  • Custom integrations that haven't migrated since the spec deprecated SSE

When one of those connects to gbrain serve --http, it hits GET /sse, gets a 404, fails with `SSE error: Non-200 status code (404)`, and the user sees an agent silently lose all its MCP tools. The failure mode is hard to diagnose — the agent doesn't crash, it just becomes useless.
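For concreteness, a minimal repro against current upstream (a sketch, not from the PR; the URL is a placeholder and the imports are the current MCP TypeScript SDK's):

```ts
// Repro sketch: point a legacy SSEClientTransport at upstream gbrain.
// Placeholder URL; auth is irrelevant here because the route doesn't exist.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { SSEClientTransport } from "@modelcontextprotocol/sdk/client/sse.js";

const transport = new SSEClientTransport(new URL("http://localhost:3000/sse"));
const client = new Client({ name: "legacy-repro", version: "0.0.0" });

// Against `gbrain serve --http` today this rejects with
// "SSE error: Non-200 status code (404)" — there is no GET /sse route.
await client.connect(transport);
```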

Three options we considered:

  1. Update the clients. Not in our control for third-party clients; openclaw is moving but not there yet.
  2. Run a side-car translator. We tried — it's a lot of code for a thin transport bridge, and it duplicates auth, scope, and audit logging.
  3. Add the legacy transport to upstream serve --http. What this PR does. The MCP spec still documents SSE as a (deprecated) transport, so it's not unprecedented for a server to support both.

What's in this PR

One commit, two files changed (one src + one test), ~230 lines:

  • src/commands/serve-http.ts: factor out buildMcpServerForAuth(authInfo) so POST /mcp and the new GET /sse share the same Server, tool registration, scope checks, and audit-log path. Then add GET /sse (opens an SSEServerTransport, stores it keyed by SDK-generated sessionId) and POST /messages?sessionId=… (looks up the session, enforces clientId match, routes via transport.handlePostMessage).
  • test/e2e/serve-http-oauth.test.ts: tools/list round-trip over SSE + the three auth / sessionId / unknown-session failure cases.

Streamable HTTP behavior on POST /mcp is unchanged. Modern clients keep working; legacy clients gain a path that didn't exist before.
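For orientation, a sketch of that wiring (Express-style handlers for brevity; `buildMcpServerForAuth` and `requireAuth` are the PR's names, while the map's value shape and the 403 on a clientId mismatch are assumptions — the PR only says the request is rejected):

```ts
import express, { type RequestHandler } from "express";
import { SSEServerTransport } from "@modelcontextprotocol/sdk/server/sse.js";
import type { Server } from "@modelcontextprotocol/sdk/server/index.js";

// From the PR: the shared Server factory and the existing bearer-auth chain.
declare function buildMcpServerForAuth(authInfo: { clientId: string }): Server;
declare const requireAuth: RequestHandler;

const app = express();
const sessions = new Map<
  string,
  { transport: SSEServerTransport; server: Server; clientId: string }
>();

app.get("/sse", requireAuth, async (req, res) => {
  const authInfo = (req as any).authInfo as { clientId: string };
  const transport = new SSEServerTransport("/messages", res); // SDK generates sessionId
  const server = buildMcpServerForAuth(authInfo);             // same Server as POST /mcp
  sessions.set(transport.sessionId, { transport, server, clientId: authInfo.clientId });
  req.on("close", () => {
    sessions.delete(transport.sessionId);
    void server.close();
  });
  await server.connect(transport); // emits the `endpoint` event carrying the sessionId
});

app.post("/messages", requireAuth, async (req, res) => {
  const sessionId = String(req.query.sessionId ?? "");
  if (!sessionId) return void res.status(400).end();
  const entry = sessions.get(sessionId);
  if (!entry) return void res.status(404).end();
  // Ownership check: a valid bearer from another client can't drive this session.
  if (entry.clientId !== (req as any).authInfo?.clientId) return void res.status(403).end();
  await entry.transport.handlePostMessage(req, res); // replies 202; result flows over SSE
});
```

The 401 / 400 / 404 codes in the failure branches match the test cases listed under Tests below.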

The full commit message has the implementation detail:
bradgessler@bb015e4

What's NOT in this PR

  • No new auth surface. Reuses the existing Bearer + requireAuth(serverInstance) chain. Same scopes, same tokens, same admin dashboard view.
  • No persistence. Sessions live in-memory; a serve restart means clients reconnect via GET /sse and get a fresh sessionId. Matches the MCP SSE spec's stateless-server posture.
  • No docs/integration updates. docs/mcp/*.md all point at /mcp and that stays correct. Happy to add an SSE section if you want a documented migration path for legacy clients, but didn't want to make this PR larger before knowing if you wanted it at all.
  • No openclaw-specific code anywhere in gbrain. This is just the legacy MCP spec; openclaw is one consumer, not the contract.

Open questions for you

  1. Does carrying both transports fit gbrain's posture? Streamable HTTP is the spec direction; SSE is deprecated upstream. Keeping the old one around is "polite to legacy clients" but it's also dead weight on the maintainer (you). If the answer is "no, push the clients to upgrade and don't pollute the server," we drop this and find another path.
  2. If yes, should the SSE transport be opt-in? E.g., behind a --legacy-sse flag on gbrain serve --http, off by default. Trades one knob for explicit consent.
  3. Session storage shape. In-memory is fine for our hosted model (each tenant gets its own process) but breaks if anyone runs gbrain serve behind a load balancer. Want to flag now: if multi-instance support is on your roadmap, this PR isn't compatible without a Redis/Postgres-backed session map. Easy to add later; harder to retrofit after merge. (A hypothetical store shape is sketched after this list.)
  4. The hosted layer broadly. Anything you'd want us to not do upstream because it's hosted-specific? Or shapes you'd want us to follow so other hosters can reuse the same components?
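On question 3, for concreteness — nothing like this exists in the PR and every name below is invented. Because the open SSE stream is a live HTTP response that can't be serialized into Redis or Postgres, a shared store would more likely hold routing metadata (which instance owns the stream, so POST /messages can be relayed there) than the transport itself:

```ts
// Hypothetical only — invented for discussion, not part of the PR.
// The open SSE response must stay on the instance that accepted GET /sse;
// a shared store maps sessionId → owner so POST /messages can be relayed.
interface SseSessionRecord {
  sessionId: string;
  clientId: string;   // for the ownership check on POST /messages
  instanceId: string; // which serve-http process holds the open stream
  createdAt: number;
}

interface SseSessionStore {
  put(record: SseSessionRecord, ttlSeconds: number): Promise<void>;
  get(sessionId: string): Promise<SseSessionRecord | null>;
  drop(sessionId: string): Promise<void>;
}
```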

Where this connects to gbrain.io

  • Source for the hosted side: https://github.com/overtonxyz/gbrain (Phoenix mothership).
  • Live: https://gbrain.io — signed-in workgroup gets one gbrain instance and one OpenClaw harness per plan.
  • This PR is what unblocks the "third-party MCP client → hosted gbrain over the legacy SSE wire" path. The mothership already speaks the SSE half (MemoryProxy.sse/2 + MemoryProxy.messages/2) and proxies it to the tenant brain; the tenant's gbrain serve --http is the only thing that doesn't yet listen on GET /sse. This PR closes that.

Tests

bun test test/e2e/serve-http-oauth.test.ts against a real Postgres-backed gbrain serve covers:

  • GET /sse without bearer → 401
  • GET /sse with bearer → SSE stream + endpoint event with sessionId
  • POST /messages?sessionId=<known> JSON-RPC tools/list round-trip
  • POST /messages without sessionId → 400
  • POST /messages with unknown sessionId → 404

Existing POST /mcp tests untouched and passing.
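For reference, the wire flow those tests exercise looks like this (a sketch with plain fetch, not the PR's test code; BASE and TOKEN are placeholders, and the endpoint-event framing is the SDK's SSEServerTransport format):

```ts
// Handshake sketch: open the stream, read the `endpoint` event, POST to it.
const BASE = "http://localhost:3000"; // placeholder
const TOKEN = "<bearer>";             // placeholder

const sse = await fetch(`${BASE}/sse`, {
  headers: { Authorization: `Bearer ${TOKEN}`, Accept: "text/event-stream" },
});
const reader = sse.body!.getReader();
const first = new TextDecoder().decode((await reader.read()).value);
// first ≈ "event: endpoint\ndata: /messages?sessionId=<id>\n\n"
const endpoint = first.split("data: ")[1].split("\n")[0].trim();

const post = await fetch(`${BASE}${endpoint}`, {
  method: "POST",
  headers: { Authorization: `Bearer ${TOKEN}`, "Content-Type": "application/json" },
  body: JSON.stringify({ jsonrpc: "2.0", id: 1, method: "tools/list" }),
});
// post.status is 202; the tools/list result arrives as a `message`
// event on the still-open SSE stream, not in this response body.
```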


Posting as draft. If the direction is wrong, say so and I'll close it. If it's directionally right but needs a different shape (feature flag, session-store interface, etc.), happy to rework. If it's good as-is and you'd want a docs/integration follow-up, I can add docs/mcp/SSE.md and a section in docs/mcp/DEPLOY.md in a separate commit.

Adds the legacy SSE MCP transport (deprecated in the MCP spec but
still in active use by openclaw and other clients pinned to the
older SSEClientTransport) alongside the existing Streamable HTTP
transport on POST /mcp.

Without this, an SSEClientTransport-based client connecting to
gbrain serve --http sees `SSE error: Non-200 status code (404)` on
GET /sse and never establishes a session — the agent silently has
zero MCP tools and the symptom is hard to recognize without reading
the client's gateway log directly.

Implementation:

- New helper `buildMcpServerForAuth(authInfo)` factors out the
  Server + handler setup from the existing POST /mcp handler so both
  transports share one source of truth for tool listing, dispatch,
  scope enforcement, audit logging, and the activity-feed broadcast.
  Each handler captures its own `startTime` so SSE-path latency
  measures per-JSON-RPC-call duration (not per-session).

- `GET /sse` — bearer-auth, builds a fresh Server, opens an
  SSEServerTransport, stores it in an in-memory session map keyed by
  the SDK-generated sessionId. Cleanup on `req.on('close')` closes
  the transport + server and deletes the map entry.

- `POST /messages?sessionId=…` — bearer-auth, looks up the session,
  enforces clientId match (another client's valid bearer cannot
  drive a session it doesn't own), then routes the parsed body into
  `transport.handlePostMessage(...)`.

- Sessions live only in memory. If the server restarts mid-session,
  the client reconnects via GET /sse and gets a fresh sessionId. No
  cross-process state, no persistence.

Tests (in test/e2e/serve-http-oauth.test.ts, runs against the real
Postgres-backed serve-http when DATABASE_URL is set):

  - tools/list round-trips: open SSE → read endpoint event → POST
    JSON-RPC to it → receive result on the SSE stream
  - GET /sse without bearer → 401
  - POST /messages without sessionId → 400
  - POST /messages with unknown sessionId → 404

POST /mcp behavior is unchanged. Streamable-HTTP clients (Claude
Desktop, the modern MCP SDK using StreamableHTTPClientTransport)
keep using POST /mcp; legacy-SSE clients now use GET /sse +
POST /messages. Both paths share auth, scope rules, and audit log.
@bradgessler (Author)

Heads-up — also opened #976 as the higher-level design RFC behind this PR.

This PR is the small, low-controversy plumbing bit (legacy SSE transport so SSE-pinned clients aren't broken). #976 is the bigger question of whether memory should be a named subset of operations.ts with its own domain tag — and proposes intake(blob, opts) as the missing memory-domain operation behind what we're trying to do at gbrain.io.

This PR is mergeable on its own terms (or not — your call). #976 is where the actual product-direction conversation lives.

Flips runFactsBackstop from 'queue' (fire-and-forget) to 'inline'
(await full pipeline) in the put_page handler so the response
carries real {inserted, duplicate, superseded, fact_ids} counts.

Motivation — our hosted gbrain shape:
  - The mothership is the only writer; every put_page is a single
    HTTP request the mothership's MemoryProxy holds open anyway.
  - Per-page truthful metrics matter: the mothership UI wants to
    show 'this page contributed 12 new facts' instead of 'queued'.
  - The in-process FactsQueue's failure surface (best-effort warn,
    drop-oldest on overflow, no caller visibility) is wrong for
    a centralized writer.

Trade-off documented in the comment: put_page latency grows by the
fact-extraction LLM call (~3-5s on a medium meeting page). Fine
for mothership-driven crawl/ingest, would be wrong on a per-keystroke
surface — that's not a shape we ship.

Response shape change:
  - was: { facts_backstop: { queued: true } }  on enqueue success
  - now: { facts_backstop: { inserted, duplicate, superseded, fact_ids } }
  - 'skipped' branch unchanged (same bare-reason strings).
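Roughly, the flip looks like this (a sketch — only `runFactsBackstop` and the response shapes come from this message; the mode option and handler locals are assumed for illustration):

```ts
// Sketch only — `runFactsBackstop` and the response shapes are real;
// the mode option, `page`, and `result` locals are assumed.
declare function runFactsBackstop(
  page: unknown,
  opts: { mode: "queue" | "inline" },
): Promise<{ inserted: number; duplicate: number; superseded: number; fact_ids: string[] }>;
declare const page: unknown;
const result: Record<string, unknown> = {};

// before: fire-and-forget enqueue, caller only learns "queued"
//   void runFactsBackstop(page, { mode: "queue" });
//   result.facts_backstop = { queued: true };

// after: await the full pipeline and surface truthful per-page counts
const { inserted, duplicate, superseded, fact_ids } =
  await runFactsBackstop(page, { mode: "inline" });
result.facts_backstop = { inserted, duplicate, superseded, fact_ids };
```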

No upstream tests touched — this is a fork-local opinion change
that wouldn't fit a PR into garrytan/gbrain (where queue mode is
the right default).