Feat/pack mind#2298
Shared semantic memory for a Go2 team. Conductor (roster + append-only blackboard + deterministic mission state machine + movement lock) talks to each dog over MCP JSON-RPC; cross-dog handoff is by zone name only, no coordinates. Browser dashboard renders the causal chain. --mock runs the full Alpha->Bravo story with no hardware. dimos/experimental/pack_mind/
Load big_office_simple_occupancy.png into a fog-of-war search arena: remap PNG values (15=free/105=wall/0=exterior), downsample, morphological close to bridge speckle, keep largest connected floor. spread_starts() picks deterministic spread deployment. build_explore_building() + server PACK_MIND_MAP=building switch. On the 27x37m office floor the shared-vs-independent gap widens (shared clears 100%, independent stalls ~39%).
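A rough sketch of two of the map-loading steps named above (value remap and "keep largest connected floor"), written in plain Python so it stands alone. The pixel values come from the commit message; the function names `remap` and `largest_floor` are hypothetical, and downsampling and the morphological close are left out.

```python
from collections import deque

# Pixel values quoted in the commit message.
FREE, WALL, EXTERIOR = 15, 105, 0


def remap(pixels):
    """Map raw PNG values to True (free floor) or False (wall/exterior)."""
    return [[value == FREE for value in row] for row in pixels]


def largest_floor(free):
    """Keep only the largest 4-connected region of free cells."""
    rows, cols = len(free), len(free[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for r in range(rows):
        for c in range(cols):
            if not free[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill of one connected region.
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and free[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    keep = set(best)
    return [[(r, c) in keep for c in range(cols)] for r in range(rows)]
```

Dropping the smaller regions removes isolated free pockets that no dog could reach from the main floor, so they do not count against coverage.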
Fixed Fog(60,140) blacked out the larger building floor (camera ~274u from target, beyond fog far=140). Scale fog near/far by span; read cell res from state instead of assuming 0.1m so dogs land on the right cells.
When joined to the robot's own WiFi hotspot the gateway is always 192.168.12.1 and the peer expects the LocalAP handshake (empty SDP id). The previous hardcoded LocalSTA method sends id="STA_localNetwork", which some firmware rejects in AP mode, causing the WebRTC offer to hang with no answer. Auto-select LocalAP for the AP gateway IP, keep LocalSTA otherwise.
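The selection rule described above can be sketched in a few lines. The gateway address and the two method names are taken from the commit message; the function name `select_connection_method` and the use of plain strings for the methods are assumptions made for illustration, not the driver's actual API.

```python
# Gateway address the robot uses when it hosts its own WiFi hotspot
# (value quoted in the commit message).
AP_GATEWAY_IP = "192.168.12.1"


def select_connection_method(robot_ip: str) -> str:
    """Pick the WebRTC handshake flavour from the robot's IP address."""
    # On the robot's own hotspot the peer expects the LocalAP handshake;
    # any other address is treated as a router/STA connection.
    return "LocalAP" if robot_ip.strip() == AP_GATEWAY_IP else "LocalSTA"
```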
The dimos.msgs packages declare __all__ = [], so the previous getattr(module, "__all__", dir(module)) resolved to an empty list (present but falsy) and injected nothing into the eval context — every expression failed with "name 'Twist' is not defined". Walk same-named submodules (geometry_msgs.Twist.Twist, etc.) when __all__ is empty so message classes are available to the expression.
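The failure mode and the fallback can be illustrated with stand-in modules. This is a minimal sketch, assuming the submodules are already imported and attached to the package; `exported_names` is a hypothetical helper, not the function in the PR.

```python
import types


def exported_names(module):
    """Collect names for an eval context, falling back when __all__ is empty."""
    names = getattr(module, "__all__", None)
    if names:  # a non-empty __all__ is authoritative
        return {name: getattr(module, name) for name in names}
    found = {}
    for name in dir(module):
        if name.startswith("_"):
            continue
        value = getattr(module, name)
        # Layout like geometry_msgs.Twist.Twist: a submodule that defines a
        # class with the same name as the submodule itself.
        if isinstance(value, types.ModuleType) and hasattr(value, name):
            found[name] = getattr(value, name)
    return found
```

The original bug is visible in the default argument: `getattr(module, "__all__", dir(module))` only uses `dir(module)` when the attribute is missing, so an `__all__` that exists but is empty yields an empty list.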
Keyboard teleop needs a pygame window (x11/cocoa), unavailable over SSH or headless. demo_drive_go2.py publishes Twist on /cmd_vel at a fixed rate so a running relay blueprint's ControlCoordinator forwards motion to the robot over WebRTC — no GUI required. Useful for live-hardware bring-up and demos.
- explore_sim: shared-map frontier de-confliction so the pack fans out instead of re-walking each other's ground; plant a search target at the farthest free cell, detect-on-sight, and converge-on-found
- view_explore_rerun: DimOS Viewer (Rerun) view of the maze/office search with shared-vs-independent A/B and kill-a-dog resilience
- README: document the viewer commands
Laptop-side brain that shares MEANING (which zones are searched, what was found) over HTTP so two dogs on two laptops never re-search the same area, and the mission survives a dog going offline. Zone NAMES only, never coordinates, so two independent SLAM frames work together without map merging.
- pack_coordinator: zone ledger (provably no double-assign), report_finding (stop-on-find + claimed-zone fallback), release_dog (survivor inherits a downed dog's unfinished zones; findings persist)
- pack_coordinator_server: JSON/HTTP API + projector dashboard at / + fan-out prefs
- pack_dashboard.html: live view (zones, finding, offline/inheritance, causal chain)
- pack_search_skills: dog agent tools (start_search/next_zone/report_*/where_is)
- pack_search_runner: RobotDriver-protocol search loop + MockDriver
- live.py: unitree-go2-pack blueprint (one dog per laptop, env-configured) + prompt
- mock_dog + demo_pack_scene: hardware-free test/rehearsal of handoff + inheritance
- tests: 24 passing (coordinator, HTTP server, runner)
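A minimal sketch of the ledger idea, assuming a single lock around a name-to-state dictionary. The class and method bodies here are illustrative only; the real pack_coordinator may structure its state differently. It shows why two dogs cannot be handed the same zone and how a released dog's unfinished zones become assignable again.

```python
import threading


class ZoneLedger:
    """Zone names only; no coordinates are ever stored."""

    def __init__(self, zones):
        self._lock = threading.Lock()
        # zone name -> (state, dog); state is "free", "claimed" or "cleared"
        self._zones = {name: ("free", None) for name in zones}

    def assign_zone(self, dog):
        """Claim the first free zone for ``dog``; None when nothing is left."""
        with self._lock:  # check-and-claim is atomic, so no double-assign
            for name, (state, _owner) in self._zones.items():
                if state == "free":
                    self._zones[name] = ("claimed", dog)
                    return name
            return None

    def release_dog(self, dog):
        """Free the zones ``dog`` claimed but never cleared (inheritance)."""
        with self._lock:
            unfinished = [
                name for name, (state, owner) in self._zones.items()
                if state == "claimed" and owner == dog
            ]
            for name in unfinished:
                self._zones[name] = ("free", None)
            return unfinished
```

Because the released zones return to the free pool, a surviving dog picks them up through the ordinary `assign_zone` call with no special inheritance path.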
- demo_pack_live: one command starts the coordinator + dashboard and plays the v4 inheritance climax with pauses, so the projector animates the full story (two dogs fan out with no overlap, a dog goes offline, the survivor inherits its unfinished zone and finds the object there)
- mock_dog: add --target-zone (deterministic find in one place, like reality), --reset (call start_search first so re-runs reset the ledger), and --dwell (slow per-zone pacing for the dashboard)
Render the fog floor via OccupancyGrid.to_rerun (matching the live SLAM map), extrude explored walls into Boxes3D for real volume, and aim an orbital camera at the arena centre so the scene auto-frames instead of landing edge-on.
The --3d mode rendered the occupancy floor + extruded walls but never wired a working playback timeline in rerun 0.32 (recording showed no scrubbable tick axis), so it was a dead end for the demo. Real 3D map visuals come from LiDAR replay (`dimos run unitree-go2`), not the 2D coverage sim. Keep the working 2D fog A/B (--independent for the shared-vs-private baseline); that is the deliverable that proves the shared-memory thesis.
…dules)
EdgeTAM segmentation hard-requires a CUDA GPU and reaches the spatial stack two ways, both fatal on a CPU/CoreML ground station (e.g. a Mac):
- unitree_go2_spatial -> SecurityModule.__init__ -> EdgeTAMProcessor() (deploy crash)
- PersonFollowSkillContainer.follow_person -> EdgeTAMProcessor() (mid-demo crash)
The pack demo uses neither, so disable both via .disabled_modules(). The blueprint then deploys clean on CPU with no --disable flag and no call-time landmine. Verified: active_blueprints = 19 modules, both excluded, PackSearchSkills + MCP present.
So a second laptop+dog can collaborate from a clean checkout:
- prefetch_live_models.py: one command caches the runtime models the live stack needs (moondream2 for look_out_for, faster-whisper-base for STT) so the demo runs fully offline (HF_HUB_OFFLINE=1) on a dog's internet-less AP / flaky venue WiFi
- README: 2-dog/2-laptop bring-up: per-laptop env + run, router/STA vs AP, dashboard, and the hard-won gotchas (no --robot-ip flag, EdgeTAM already disabled, offline mode, moondream first-call latency, bring-up gate order)
…code modules + starmie-v1 tokenizer)
moondream VLM on a CPU ground station is slow (tens of seconds, async) and fragile. The demo target is defined by COLOUR, so detect it with a colour filter:
- red_detector.py: RedObjectDetector watches the camera; @Skill look_for_red does a per-frame RGB ratio test (milliseconds, deterministic) and, on a hit, reports the finding to the coordinator (blank zone → coordinator fills the claimed zone)
- wired into unitree-go2-pack; PACK prompt now uses look_for_red instead of look_out_for
- 5 unit tests on the detection logic (hardware-free)
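One way a per-frame RGB ratio test could look, written in plain Python over an iterable of (r, g, b) tuples so it runs without any dependencies. The thresholds (`ratio`, `min_red`, `min_fraction`) and both function names are assumptions for illustration; the values in red_detector.py are not stated in this PR text.

```python
def is_red(r, g, b, ratio=1.8, min_red=120):
    """A pixel counts as red when R is bright and dominates both G and B."""
    return r >= min_red and r > ratio * g and r > ratio * b


def frame_has_red(pixels, min_fraction=0.02):
    """True when at least ``min_fraction`` of the pixels pass the ratio test."""
    pixels = list(pixels)
    if not pixels:
        return False
    hits = sum(1 for r, g, b in pixels if is_red(r, g, b))
    return hits / len(pixels) >= min_fraction
```

Comparing channel ratios rather than absolute values keeps the test cheap and makes bright white or grey surfaces, where all three channels are high, fail the check.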
…aks threads in-process)
…m is the slow fallback
… on stop + auto speak)
…ky is_goal_reached LCM RPC
…hang)
relative_move runs the A* planner, then waits on ReplanningAStarPlanner/is_goal_reached over LCM. That RPC is flaky on macOS and hangs for 120s even though the dog reached the goal. New VelocityTeleop publishes Twist straight to MovementManager's tele_cmd_vel lane: the dog moves instantly, with no planner and no confirmation RPC. demo_drive now uses `drive` (velocity bursts) instead of relative_move; W/A/S/D map to forward/turn m/s·rad/s.
- remove the old "backpack handoff" demo (conductor, dashboard, sim_harness, venue_go2, RUNBOOK, test_conductor), superseded by the exploration A/B + live pack
- recognize .disabled_modules() in the blueprint AST scanner so unitree-go2-pack (which disables the CUDA-only EdgeTAM modules) registers; regenerate all_blueprints
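To illustrate the scanner change, here is a sketch of how an AST pass might look through a trailing method call such as `.disabled_modules(...)` to find the call that actually builds the blueprint. The factory name `autoconnect` is a placeholder and both helper functions are hypothetical; the real scanner's rules are not shown in this PR text.

```python
import ast


def root_call_name(node):
    """Follow a chain like f(...).a().b() back to the name of f."""
    while isinstance(node, ast.Call):
        func = node.func
        if isinstance(func, ast.Name):
            return func.id
        if isinstance(func, ast.Attribute):
            node = func.value  # step past .method(...) to its receiver
        else:
            return None
    return None


def find_blueprints(source, factory="autoconnect"):
    """Names of module-level variables assigned from a ``factory`` call chain."""
    found = []
    for stmt in ast.parse(source).body:
        if isinstance(stmt, ast.Assign) and root_call_name(stmt.value) == factory:
            found += [t.id for t in stmt.targets if isinstance(t, ast.Name)]
    return found
```

A scanner that only matched a bare `factory(...)` call on the right-hand side would miss any blueprint with a chained modifier, which matches the registration failure described above.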
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ec6b00693b
| @rpc | ||
| def next_zone(self) -> str: |
Expose next_zone as a pack skill
In the live unitree-go2-pack flow, the system prompt tells the MCP agent to call next_zone and should_stop during every mission loop, but MCP only lists methods returned by Module.get_skills(), which requires the @skill marker rather than plain @rpc. With these methods left as RPC-only, tools/list will not include them, so after start_search the agent cannot obtain assignments or observe the pack stop condition and the advertised autonomous search loop stalls.
| requests.post( | ||
| f"{self._url}/report_finding", | ||
| json={"dog": self._dog, "object": "red object", "zone": ""}, | ||
| timeout=_REQUEST_TIMEOUT, | ||
| ) |
Raise on failed finding reports
When a red object is detected but the coordinator responds with a non-2xx HTTP status (for example a bad PACK_COORDINATOR_URL pointing at a server that returns 404/500), requests.post does not raise, so this skill still tells the agent the sighting was “reported to the pack” even though the coordinator did not record it. That leaves the finder stopping while teammates continue searching; mirror the other coordinator calls by checking raise_for_status() before returning success.
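A sketch of the suggested fix, assuming the skill is given a `requests.post`-compatible callable. The function name `report_finding`, the injected `post` parameter, the default timeout, and the return strings are illustrative choices; only the endpoint, the JSON payload, and the `raise_for_status()` call come from the review.

```python
def report_finding(post, url, dog, timeout=2.0):
    """Report a sighting; claim success only if the coordinator accepted it."""
    try:
        resp = post(
            f"{url}/report_finding",
            json={"dog": dog, "object": "red object", "zone": ""},
            timeout=timeout,
        )
        # Without this call a 404/500 response is silently treated as success.
        resp.raise_for_status()
    except Exception as exc:  # with requests this would be RequestException
        return f"Saw a red object but could not report it: {exc}"
    return "Red object reported to the pack."
```

Passing the HTTP function in as a parameter is only a convenience for exercising the error path without a network; in the skill itself the call would presumably remain `requests.post`.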
Greptile Summary
PACK MIND adds shared operational memory for a multi-robot pack: a zone-name ledger + target blackboard served over HTTP, so dogs coordinate search coverage without exchanging coordinates or merging SLAM maps. The PR includes a pure-numpy fog-of-war sim (A/B proof), a live Go2 blueprint, 58 tests, and small improvements to the blueprint scanner and
Confidence Score: 4/5
Safe to merge as experimental code; two defects in the search runner and red detector deserve a fix before a reliability demo.
Two concrete defects in the changed code: in pack_search_runner.py, zones the robot fails to physically reach remain permanently claimed and are never searched or released, so coverage is silently incomplete. In red_detector.py, HTTP-level errors from the coordinator are not detected (no raise_for_status()), so the agent is told the finding was broadcast to the pack when the server may have rejected or dropped the request. Both issues affect the core demo flow.
dimos/experimental/pack_mind/pack_search_runner.py and dimos/experimental/pack_mind/red_detector.py
Important Files Changed
Sequence Diagram
sequenceDiagram
participant DA as Dog Alpha (LLM agent)
participant DB as Dog Bravo (LLM agent)
participant PC as PackCoordinator (HTTP)
Note over DA,PC: Mission start
DA->>PC: POST /start_search
PC-->>DA: Searching for red kit. 4 zones.
Note over DA,DB: Zone assignment loop (no overlap)
DA->>PC: "POST /assign_zone dog=alpha"
PC-->>DA: "zone=north"
DB->>PC: "POST /assign_zone dog=bravo"
PC-->>DB: "zone=east"
DA->>DA: navigate + look_for_red(north)
DB->>DB: navigate + look_for_red(east)
DA->>PC: "POST /report_cleared dog=alpha zone=north"
PC-->>DA: north cleared.
Note over DB: Red object found!
DB->>PC: "POST /report_finding dog=bravo object=red object zone=east"
PC-->>DB: finding recorded, pack stop
DA->>PC: "GET /should_stop?dog=alpha"
PC-->>DA: "stop=true"
DA->>PC: GET /where_is
PC-->>DA: "found=true zone=east by=bravo"
Note over DA: Act on teammate memory
DA->>DA: navigate_to(east)
Reviews (3): Last reviewed commit: "feat: PACK MIND live 2-dog runbook + pre..."
| @skill | ||
| def where_is(self, object: str) -> str: | ||
| """Ask the pack's shared memory where an object was found. | ||
|
|
||
| Use this to act on a TEAMMATE's discovery — you may never have seen the | ||
| object yourself. Returns the zone a packmate reported it in, or that it | ||
| hasn't been found yet. | ||
|
|
||
| Args: | ||
| object: What you're asking about, e.g. "red backpack". | ||
| """ | ||
| data = self._get("/where_is", {}) | ||
| if data is None: | ||
| return "Could not reach the pack memory." | ||
| if not data.get("found"): | ||
| return f"No packmate has found the {object} yet." | ||
| return ( | ||
| f"{data.get('by')} found the {data.get('object')} in {data.get('zone')}. " | ||
| f"I can take you there." | ||
| ) |
where_is ignores the object argument on the server query
The skill accepts object: str but self._get("/where_is", {}) passes an empty params dict — the queried object name is never sent to the server. The /where_is endpoint always returns the single current finding regardless of what object is being asked about. If the pack has found "red object" but the agent calls where_is("blue ball"), the coordinator still returns found=True and the response becomes "alpha found the red object in zone X. I can take you there." — the LLM receives a finding about a different object than it asked about. Consider either passing object as a query parameter and filtering server-side, or renaming the skill to current_finding() to reflect its true semantics.
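The first option (filter server-side) could be sketched as a pure function over the stored finding. `lookup_finding` is a hypothetical name, and the exact, case-insensitive match is an assumption; a real coordinator might prefer a looser match so that "red backpack" and "red object" can be reconciled.

```python
def lookup_finding(finding, query):
    """Return the stored finding only if it matches the queried object."""
    if finding is None:
        return {"found": False}
    wanted = query.strip().lower()
    # An empty query keeps the old "whatever was found" behaviour.
    if wanted and wanted != finding["object"].lower():
        return {"found": False}
    return {"found": True, **finding}
```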
| def _read_json(self) -> dict[str, Any] | None: | ||
| """Read a JSON object body. Returns None on malformed/non-object input.""" | ||
| length = int(self.headers.get("Content-Length", "0")) |
int() on a malformed Content-Length header raises unhandled ValueError
If a client sends a request with a non-integer Content-Length header (e.g. an empty string or "abc"), int(self.headers.get("Content-Length", "0")) raises ValueError. Since _read_json is called directly from do_POST with no surrounding try-except, the exception propagates up to ThreadingMixIn.process_request_thread(), which closes the connection without sending any response. The server stays up but the client receives a silent connection reset.
| length = int(self.headers.get("Content-Length", "0")) | |
| try: | |
| length = int(self.headers.get("Content-Length", "0")) | |
| except ValueError: | |
| length = 0 |
| def report_cleared(self, dog: str, zone: str) -> str: | ||
| """Mark ``zone`` fully searched by ``dog`` (object not here).""" | ||
| with self._lock: | ||
| if zone in self._zones: | ||
| self._zones[zone] = Zone(zone, "cleared", dog) | ||
| self._log.append(f"{dog} cleared {zone}") | ||
| return f"{zone} cleared." |
report_cleared has no zone-ownership check
Any dog can call report_cleared on a zone it does not own. If Dog A is currently searching zone "east" (claimed by Dog A) and Dog B mistakenly calls report_cleared("east"), the zone's state is overwritten to cleared with by=B. Dog A will still be searching a zone the ledger now marks as cleared by its teammate. In a coordinator that offers external zone-release via release_dog, this is also an easy-to-trigger consistency gap since the by field is used to identify which zones to reclaim on release.
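A possible shape for the ownership check, shown as a standalone function over a plain dictionary. The `Zone` dataclass here is a stand-in with the same positional fields as the call in the snippet above, the rejection messages are invented, and the coordinator's lock is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Zone:
    name: str
    state: str  # "free", "claimed" or "cleared"
    by: Optional[str] = None


def report_cleared(zones, dog, zone):
    """Mark ``zone`` cleared only if ``dog`` currently holds the claim."""
    current = zones.get(zone)
    if current is None:
        return f"Unknown zone {zone}."
    if current.state != "claimed" or current.by != dog:
        # Leave the ledger untouched so release_dog can still reclaim it.
        return f"{zone} is not claimed by {dog}; ignoring."
    zones[zone] = Zone(zone, "cleared", dog)
    return f"{zone} cleared."
```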
| """ | ||
| duration = max(0.0, min(duration, _MAX_DURATION)) | ||
| self._publish(forward, turn) | ||
| time.sleep(duration) | ||
| self._publish(0.0, 0.0) # stop |
drive() blocks the agent thread for up to 3 seconds
time.sleep(duration) runs on the LLM agent's calling thread, blocking it for the entire drive burst (up to _MAX_DURATION = 3.0 s). During this window the agent cannot process incoming messages, poll should_stop, or react to a packmate's finding. For short single-burst moves this is acceptable, but longer drives or rapid sequences will stall the agent loop.
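One non-blocking alternative is to publish the motion command immediately and schedule the stop on a timer thread. This is a sketch under assumptions: `BurstDriver` and its `publish` callback are hypothetical stand-ins for the teleop module, and the generation counter is one way to keep a stale timer from stopping a newer burst.

```python
import threading


class BurstDriver:
    def __init__(self, publish, max_duration=3.0):
        self._publish = publish  # callable taking (forward, turn)
        self._max = max_duration
        self._generation = 0
        self._lock = threading.Lock()

    def drive(self, forward, turn, duration):
        """Start a velocity burst and return immediately."""
        duration = max(0.0, min(duration, self._max))
        with self._lock:
            self._generation += 1
            generation = self._generation
            self._publish(forward, turn)
        timer = threading.Timer(duration, self._stop_if_current, args=(generation,))
        timer.daemon = True
        timer.start()

    def _stop_if_current(self, generation):
        with self._lock:
            # Skip the stop if a newer drive() or stop() superseded this burst.
            if generation == self._generation:
                self._publish(0.0, 0.0)

    def stop(self):
        """Stop now and invalidate any pending timer."""
        with self._lock:
            self._generation += 1
            self._publish(0.0, 0.0)
```

With this shape the agent thread is free to poll should_stop during the burst and can call `stop()` early when a packmate reports a finding.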
| def report_cleared(self, dog: str, zone: str) -> str: | ||
| """Mark ``zone`` fully searched by ``dog`` (object not here).""" | ||
| with self._lock: | ||
| if zone in self._zones: | ||
| self._zones[zone] = Zone(zone, "cleared", dog) | ||
| self._log.append(f"{dog} cleared {zone}") | ||
| return f"{zone} cleared." |
report_cleared silently accepts zones not owned by the reporting dog
Any dog can call report_cleared on a zone owned by another dog and the zone state will be overwritten. If Dog A holds "east" (claimed) and Dog B calls report_cleared("east"), the coordinator marks it cleared by B while A is still searching it. Additionally, release_dog relies on zone.by to reclaim zones from the released dog — a spurious clear from another dog prevents that reclaim, leaving a zone permanently cleared even though neither dog finished it.
| def _read_json(self) -> dict[str, Any] | None: | ||
| """Read a JSON object body. Returns None on malformed/non-object input.""" | ||
| length = int(self.headers.get("Content-Length", "0")) |
Malformed Content-Length header causes unhandled ValueError
int(self.headers.get("Content-Length", "0")) raises ValueError when the header value is non-numeric (e.g. "" or "abc"). Since _read_json is called directly from do_POST with no surrounding try-except, the exception propagates to ThreadingMixIn's request thread handler, which closes the connection without sending any HTTP response. The server stays alive but the client gets a silent connection reset.
AP-per-dog + Tailscale topology for the live demo no existing doc covers (README assumes shared-router/STA; live.py's LAN_IP only works on one LAN).
- LIVE_RUNBOOK.md: on-site page with night-before checklist, per-laptop env, bring-up order, stable/showcase demo paths, failure->fix table.
- preflight.py: per-laptop chain checker (dog/route/internet/tailscale/coordinator) with exact remediation per FAIL; pure stdlib.
- scripted_pack_run.py: can't-miss stable path; one process per laptop drives this dog's identity against the real coordinator (no-overlap + inheritance) and optionally moves the real dog via /cmd_vel teleop; no LLM/nav-RPC.
PACK MIND — shared operational memory for a robot pack (experimental)
One brain, many bodies, one memory that outlives any single dog.
Multi-robot search usually means merging maps — brittle, months of SLAM. PACK MIND
skips that: each dog keeps its own map and the pack shares only meaning — which
zones are searched and what was found, by name, never coordinates. So robots
running independent SLAM frames coordinate through one lightweight ledger. One dog
sees the target → the whole pack knows instantly. One dog drops → its discoveries
persist and a teammate inherits its unfinished ground. Share a mind, not a map.
What's in the PR (all under dimos/experimental/pack_mind/)
Sim (the A/B proof, pure numpy; no GPU/ROS/sim deps):
- explore_sim.py: fog-of-war frontier exploration with shared-vs-private discovered maps + frontier de-confliction + object search.
- view_explore_rerun.py: DimOS Viewer (Rerun) view; shared-vs-independent A/B + kill-a-dog.
- server.py + static/explore.html: web 3D side-by-side fog-of-war race (shared vs independent, kill/reset).
Live pack (one dog per laptop):
- pack_coordinator.py (+ _server.py, pack_dashboard.html): zone ledger (provably no double-assign), finding blackboard with stop-on-find, inheritance (release_dog), HTTP API + projector dashboard.
- pack_search_skills.py: agent tools (start_search/next_zone/report_*/where_is).
- red_detector.py: GPU-free "red object" find (HSV-free RGB ratio test); fast and deterministic where a VLM is too heavy.
- velocity_teleop.py + demo_drive.py: direct cmd_vel teleop (bypasses the planner's flaky is_goal_reached RPC) + keyboard driver with auto-detect-on-stop.
- live.py: unitree-go2-pack blueprint, CPU/macOS-deployable (disables CUDA-only EdgeTAM modules).
- mock_dog.py, demo_pack_live.py, demo_pack_scene.py, prefetch_live_models.py.
Tests: 58 passing (coordinator, HTTP server, search runner, red detector, sim engine).
Registry: taught the blueprint scanner to recognize .disabled_modules() so disabled-module blueprints register.
Demo
Honest scope / design notes
dogs = two SLAM frames — merging them live is the brittle thing we deliberately don't do).
chosen for reliability on a no-GPU ground station. Full autonomy is shown in sim.
TBD / follow-ups (not in this PR)
names + env-configured identity), but only validated at 2 dogs. Field work needed:
cross-laptop HTTP reachability (firewall/subnet), STA-mode dog connections,
zone-name consistency across frames (the physical meaning of "no overlap"),
LCM host-isolation, and a 3-dog test (no-overlap + one-of-three-offline inheritance).
Plan: validate cross-machine with mock_dogs first, then swap in real dogs.
out of this PR; they overlap with the Zenoh transport work in "Default to Zenoh transport on macOS and document replay workflow" (#2106) and should land there/separately.
gossip/replication.
Test plan