Redact embedded credentials before persisting remote_url (DFT-13)#1249
Conversation
Strip the user:password@ component from remote_url (and each dependencies[].remote_url) when Metadata.dump() writes .dfetch_data.yaml, so a manifest URL like https://user:TOKEN@host/repo.git no longer leaks its credential into a file typically committed to VCS. Addresses DFT-13 in the usage threat model. https://claude.ai/code/session_019iPNdP6S7SwMviFE1uEjEV
Flip the DFT-13 threat response in the usage model from Accept to Mitigate, add control C-035 (persisted-metadata credential redaction) pointing at Metadata.dump(), and refresh the A-17 / A-18 asset descriptions to record that the on-disk metadata file no longer carries embedded URL credentials. Regenerate doc/explanation/threat_model_usage.rst and add a changelog entry for 0.14.0. https://claude.ai/code/session_019iPNdP6S7SwMviFE1uEjEV
WalkthroughThis PR strips embedded URL userinfo (user:password@) from persisted ChangesCredential redaction in persisted metadata
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@dfetch/project/metadata.py`:
- Around line 15-32: _strip_userinfo currently rebuilds the netloc using
parsed.hostname which drops IPv6 brackets, producing invalid netlocs like
"::1:8080"; update _strip_userinfo to detect IPv6 hosts (parsed.hostname
containing ":"), wrap the hostname in square brackets when reconstructing
netloc, and then append the port if parsed.port is an int; keep using
parsed.hostname and parsed.port, then call urlunsplit as before (ensure you
still return the original URL when host and parsed.netloc are empty). This
change should be made inside the _strip_userinfo function and use
parsed.hostname, parsed.port, and urlunsplit to locate the fix.
In `@security/tm_usage.py`:
- Around line 997-1014: The Control with id "C-035" (Control(... id="C-035",
name="Persisted-metadata credential redaction")) collides with an existing gap
reference ("gap C-035" mentioned elsewhere); rename one of the entries to a new
unique ID (e.g., "C-036" or another unused tag) and update all references
consistently — update the Control's id field and any textual/back-reference like
"gap C-035" to the new ID so threat-model traceability remains unique and
consistent across Metadata.dump(), dependencies[].remote_url, and the gap
reference.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7b145147-6638-4f52-9ca6-2dd395f05277
📒 Files selected for processing (5)
CHANGELOG.rstdfetch/project/metadata.pydoc/explanation/threat_model_usage.rstsecurity/tm_usage.pytests/test_metadata.py
parsed.hostname drops the square brackets around IPv6 literals, so
rebuilding the netloc as f"{host}:{port}" produced an invalid form like
"::1:8080". Detect IPv6 hosts (":" in host) and wrap them in brackets
before reconstruction.
Also tightens Metadata.dump()'s dependencies redaction to construct
each entry via the Dependency TypedDict explicitly, restoring mypy
strict-type checking on the assignment.
https://claude.ai/code/session_019iPNdP6S7SwMviFE1uEjEV
The new control collided with the pre-existing ``gap C-035`` placeholder that points at the upstream-source-attestation gap (DF-19 description in tm_usage.py). Rename the implemented control to C-036, the next free slot, keeping ``gap C-035`` available for the still-unfilled upstream attestation control. Updates the Control id, all back-references in tm_usage.py (``mitigated by`` comments, A-17 description, DF-08 description) and the three corresponding cells in the generated threat_model_usage.rst. https://claude.ai/code/session_019iPNdP6S7SwMviFE1uEjEV
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
doc/explanation/threat_model_usage.rst (1)
1199-1203:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd the trailing newline and commit the regenerated file.
pre-commit end-of-file-fixerchanged this file in CI, so the current commit is missing the expected EOF state.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@doc/explanation/threat_model_usage.rst` around lines 1199 - 1203, The file is missing a trailing newline which broke the pre-commit EOF fixer; open the regenerated documentation file (the one describing Metadata.dump()/dfetch/project/metadata.py), add a single newline at the end of the file, save, and commit the change so CI’s pre-commit check no longer modifies the file.Source: Pipeline failures
tests/test_metadata.py (2)
106-128:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winCommit the Black rewrite for this file.
CI reports
blackmodifiedtests/test_metadata.py; please commit formatter output to clear the pre-commit gate.As per coding guidelines,
blackis required for formatting, and pipeline logs show this file is not in formatted state.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_metadata.py` around lines 106 - 128, The file tests/test_metadata.py is not formatted by Black; run the Black formatter on the repo (or on tests/test_metadata.py) to apply the code style changes that CI reported, then add and commit the formatted file so the pytest.mark.parametrize block and surrounding code (e.g., the URL test cases) match Black’s output and the pre-commit/CI gate passes.Sources: Coding guidelines, Pipeline failures
135-146:⚠️ Potential issue | 🟠 MajorFix
Metadata(...)dict literals intests/test_metadata.pyto include requiredOptionskeys (Pyright).
Optionsis aTypedDictwith required fields; the twoMetadata({...})calls omit them (first at ~138, second at ~194), causing Pyright to fail.Suggested fix
+import datetime import textwrap @@ def _dump_metadata(tmp_path, *, remote_url, dependencies=None): @@ meta = Metadata( { + "last_fetch": datetime.datetime(2000, 1, 1, 0, 0, 0), "remote_url": remote_url, "branch": "main", + "tag": "", "revision": "abc123", "hash": "deadbeef", + "patch": "", "destination": str(tmp_path), "dependencies": dependencies or [], } ) @@ meta = Metadata( { + "last_fetch": datetime.datetime(2000, 1, 1, 0, 0, 0), "remote_url": "https://carol:hunter2@example.com/org/repo.git", + "branch": "", + "tag": "", + "revision": "", + "hash": "", + "patch": "", + "dependencies": [], "destination": str(tmp_path), } )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_metadata.py` around lines 135 - 146, The Metadata constructor calls in _dump_metadata and the other test helper are passing dicts missing required keys from the Options TypedDict; update both dict literals passed to Metadata(...) to include the required Options fields (add all mandatory keys from Options with suitable test values), e.g. extend the dict in _dump_metadata and the second Metadata(...) call to include those missing keys so Pyright accepts the TypedDict usage; search for Metadata(...) in tests/test_metadata.py to find both locations.Sources: Coding guidelines, Pipeline failures
security/tm_usage.py (1)
1-1:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winRun and commit Black on this file.
The DevContainer pipeline reports
pre-commit blackreformattedsecurity/tm_usage.py; commit that change to unblock CI.As per coding guidelines,
blackformatting is required, and pipeline logs confirm this file is currently non-compliant.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@security/tm_usage.py` at line 1, Run the Black formatter on security/tm_usage.py and commit the resulting changes so the file matches the project's pre-commit formatting; e.g., run the project's pre-commit hook or `black security/tm_usage.py` (or `pre-commit run --files security/tm_usage.py`) and then stage and commit the modified file to unblock CI.Sources: Coding guidelines, Pipeline failures
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@doc/explanation/threat_model_usage.rst`:
- Around line 1199-1203: The file is missing a trailing newline which broke the
pre-commit EOF fixer; open the regenerated documentation file (the one
describing Metadata.dump()/dfetch/project/metadata.py), add a single newline at
the end of the file, save, and commit the change so CI’s pre-commit check no
longer modifies the file.
In `@security/tm_usage.py`:
- Line 1: Run the Black formatter on security/tm_usage.py and commit the
resulting changes so the file matches the project's pre-commit formatting; e.g.,
run the project's pre-commit hook or `black security/tm_usage.py` (or
`pre-commit run --files security/tm_usage.py`) and then stage and commit the
modified file to unblock CI.
In `@tests/test_metadata.py`:
- Around line 106-128: The file tests/test_metadata.py is not formatted by
Black; run the Black formatter on the repo (or on tests/test_metadata.py) to
apply the code style changes that CI reported, then add and commit the formatted
file so the pytest.mark.parametrize block and surrounding code (e.g., the URL
test cases) match Black’s output and the pre-commit/CI gate passes.
- Around line 135-146: The Metadata constructor calls in _dump_metadata and the
other test helper are passing dicts missing required keys from the Options
TypedDict; update both dict literals passed to Metadata(...) to include the
required Options fields (add all mandatory keys from Options with suitable test
values), e.g. extend the dict in _dump_metadata and the second Metadata(...)
call to include those missing keys so Pyright accepts the TypedDict usage;
search for Metadata(...) in tests/test_metadata.py to find both locations.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0b12e575-fe06-4c0d-a924-7f08b7d0a01e
📒 Files selected for processing (4)
dfetch/project/metadata.pydoc/explanation/threat_model_usage.rstsecurity/tm_usage.pytests/test_metadata.py
* end-of-file-fixer trimmed two stray trailing newlines from
threat_model_usage.rst
* black wrapped two over-long comment lines in tm_usage.py and
reformatted the parametrize block in test_metadata.py
* Both Metadata({...}) calls in test_metadata.py now include every
field required by the Options TypedDict (last_fetch, tag, patch
and, for the in-memory test, branch/revision/hash/dependencies),
silencing pyright's reportArgumentType warning without resorting
to suppression comments
https://claude.ai/code/session_019iPNdP6S7SwMviFE1uEjEV
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/test_metadata.py (1)
133-133: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winConvert new test docstrings to Google style format.
The newly added test docstrings are sentence-style, but this repo requires Google-style docstrings for test functions under
tests/**/*.py.As per coding guidelines, “Docstrings in test functions should follow Google style convention.”
Also applies to: 158-158, 172-172, 197-197
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_metadata.py` at line 133, Multiple test docstrings were written as sentence-style descriptions (e.g. "_strip_userinfo removes userinfo, preserves scheme/host/port/path/query.") but must follow the repo's Google-style test docstring convention; update each affected test function's docstring to a Google-style format (single-line summary followed by a blank line and optional explanatory text) for the tests in this file (replace the simple sentence docstrings at the locations noted), ensuring you keep the same meaning and punctuation and apply the same change to the other occurrences flagged (the similar docstrings at the other test functions).Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@tests/test_metadata.py`:
- Line 133: Multiple test docstrings were written as sentence-style descriptions
(e.g. "_strip_userinfo removes userinfo, preserves
scheme/host/port/path/query.") but must follow the repo's Google-style test
docstring convention; update each affected test function's docstring to a
Google-style format (single-line summary followed by a blank line and optional
explanatory text) for the tests in this file (replace the simple sentence
docstrings at the locations noted), ensuring you keep the same meaning and
punctuation and apply the same change to the other occurrences flagged (the
similar docstrings at the other test functions).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: fec07e20-185b-4d51-b5d9-acf172ad6304
📒 Files selected for processing (3)
doc/explanation/threat_model_usage.rstsecurity/tm_usage.pytests/test_metadata.py
💤 Files with no reviewable changes (1)
- doc/explanation/threat_model_usage.rst
|
@coderabbitai review |
✅ Action performedReview finished.
|
Summary
user:password@userinfo from everyremote_urlwritten to.dfetch_data.yaml— both the top-level field and eachdependencies[].remote_url(git-submodule / svn:external entries) — so a manifest URL with an embedded Personal Access Token no longer leaks the credential into the on-disk metadata file (and from there into any commit / clone / CI checkout of it).Metadata.dump()indfetch/project/metadata.py. The in-memoryMetadataobject held by the running command keeps the original URL, so in-flight authenticated fetches are unaffected — only the on-disk representation is sanitised.Why this is safe
.dfetch_data.yamlis write-only for recording purposes: no code path reads the persistedremote_urlback to drive a fetch.Metadata.__eq__is the only comparison that referencesremote_urland it is defined but never called;check,diff, andupdatealways rebuildMetadatafrom the liveProjectEntry(which still holds the credential), so redacting atdump()does not cause spurious "out of date" reports.Test plan
pytest tests/test_metadata.py— 21 passing, including new parametrised_strip_userinfocases (user:token@host,user@host, host-only, explicit port, query-string preserved,file://round-trip, empty / path-only inputs) and dump-then-from_fileround-trips for bothremote_urlanddependencies[].remote_urlplus an assertion that the in-memory URL is untouchedpytest tests/— full unit suite green except 6 pre-existing failures intests/test_sbom_reporter.pycaused by a missing optionalcyclonedx-python-lib[json-validation]install in this environment (unrelated to this change)behave features/ --tags=updateto confirm end-to-enddfetch updatestill writes.dfetch_data.yamlcorrectlyhttps://user:secret@example.com/repo.tar.gz, rundfetch update, confirm the persistedremote_url:contains neitheruser:secretnoruser@https://claude.ai/code/session_019iPNdP6S7SwMviFE1uEjEV
Generated by Claude Code
Summary by CodeRabbit
Bug Fixes
Documentation
Tests