fix(parser): comment before next binding bleeds into previous content_position by thedavidmeister · Pull Request #143 · rainlanguage/dotrain

thedavidmeister · 2026-06-29T18:09:50Z

Problem

Block comments immediately before a #binding declaration were attributed to the previous binding's content_position range. The previous binding's content string and content_position extended through the comment, which caused:

LSP semantic-token highlights (elided-binding marker) to cover the comment area
Composer output to include the stray comment text when composing the preceding binding

Reproducer:

---
#prev-binding
! elided

/* this comment belongs to next-binding, not prev */
#next-binding
! elided2

prev-binding.content was "! elided\n\n/* this comment belongs to next-binding, not prev */" instead of "! elided".

Fix

In process_binding, content_position[1] (trailing bound) now uses trailing-trim from the modified document (where block comments have been replaced with spaces by fill_in), so trailing comment-whitespace is stripped. content_position[0] (leading bound) continues to use the original text so block comments at the start of a binding's own content (e.g. a doc comment before the expression body) are preserved correctly.

All 25 existing tests pass. Added a mutation-validated test that fails when the fix is reverted.

Closes #114

Co-Authored-By: Claude noreply@anthropic.com

Summary by CodeRabbit

Bug Fixes
- Improved binding content extraction so leading whitespace and comments are preserved where expected, while comments before the next binding no longer get included in the previous binding’s content range.
- Corrected content position handling to better match the actual text used for parsing and error reporting.
- Added test coverage to ensure adjacent bindings are parsed with the right content boundaries.

…_position Use trailing trim from the modified document (where comments are spaces) for content_position[1] so a block comment immediately before the next #-binding is not included in the preceding binding's content or content_position. Leading trim still uses the original text so comments at the start of a binding's own content area (e.g. /* doc comment */ before the expression) are preserved correctly. Closes #114 Co-Authored-By: Claude <noreply@anthropic.com>

coderabbitai · 2026-06-29T18:10:05Z

Walkthrough

In RainDocument::process_binding, the computation of content_position and content is rewritten so that the start offset uses a trim of the original slice and the end offset uses the previously computed raw_trimmed.2, excluding leading comments of the next binding. A regression test verifies the fix.

Binding content range fix

Layer / File(s)	Summary
process_binding content-range rewrite and regression test `crates/dotrain/src/parser/raindocument/logic.rs`, `crates/dotrain/src/parser/raindocument/mod.rs`	`content_position` start is now derived from `trimmed_orig` and end from `raw_trimmed.2`; `content` is re-sliced from `self.text` using the new range. A new test asserts that a comment before `#b2` is excluded from `b1`'s `content` and `content_position`.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly matches the main fix: comment handling around the next binding's content position.
Linked Issues check	✅ Passed	The change addresses `#114` by preventing comments before a binding from being attributed to the previous binding, including the elided-binding case.
Out of Scope Changes check	✅ Passed	The diff stays focused on the parser fix and its regression test, with no unrelated code changes shown.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch 2026-06-29-issue-114-comment-position

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/dotrain/src/parser/raindocument/logic.rs`:
- Around line 798-808: The malformed range is being preserved in the
`content_position` value even though `self.text.get(...)` masks it locally.
Update the range calculation in `logic.rs` where `content_position` is derived
in this branch so it is clamped/normalized before assignment, ensuring `start <=
end` for comment-only bindings after whitespace trimming. Keep the fix scoped to
the `content_position` computation used just before `content` is read, and use
the existing `parsed_binding`, `trimmed_orig`, and `raw_trimmed` values to
adjust the bounds consistently.

In `@crates/dotrain/src/parser/raindocument/mod.rs`:
- Around line 953-993: The new regression test only covers exclusion of a
comment before the next binding, but the positional contract also needs coverage
for the complementary cases. Extend the tests around RainDocument::create and
binding parsing to verify that a comment at the start of a binding’s own content
is preserved in that binding’s content/content_position, and add an edge case
for a comment-only binding to confirm it is handled correctly per the logic in
the parser. Use the existing test in raindocument/mod.rs as the location to add
these cases and align expectations with the behavior introduced in the
binding/content extraction logic.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 37dc1c6e-c1e5-4411-a8fc-2f0ac9fdbbad

📥 Commits

Reviewing files that changed from the base of the PR and between eefd3b6 and 020fe70.

📒 Files selected for processing (2)

crates/dotrain/src/parser/raindocument/logic.rs
crates/dotrain/src/parser/raindocument/mod.rs

coderabbitai · 2026-06-29T18:18:30Z

            } else {
                [
-                    parsed_binding.1[0] + boundry_offset + 1 + trimmed_content.1,
-                    parsed_binding.1[1] - trimmed_content.2,
+                    parsed_binding.1[0] + boundry_offset + 1 + trimmed_orig.1,
+                    parsed_binding.1[1] - raw_trimmed.2,
                ]
            };
-            content = if trimmed_content.0.is_empty() {
-                slices.1.to_owned()
-            } else {
-                trimmed_content.0.to_owned()
-            };
+            content = self
+                .text
+                .get(content_position[0]..content_position[1])
+                .unwrap_or("")
+                .to_owned();


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Find all usages of content_position to assess underflow/slice-panic risk rg -nP --type=rust 'content_position' -C3 # Specifically subtraction or range slicing on content_position rg -nP --type=rust 'content_position\s*\[\s*1\s*\]\s*-\s*.*content_position\s*\[\s*0\s*\]' -C2 rg -nP --type=rust 'content_position\[0\]\s*\.\.\s*.*content_position\[1\]' -C2

Repository: rainlanguage/dotrain

Length of output: 403

🏁 Script executed:

#!/bin/bash set -euo pipefail # Locate the relevant code and all uses of `content_position` FILE="crates/dotrain/src/parser/raindocument/logic.rs" echo "== file outline ==" ast-grep outline "$FILE" --view expanded || true echo echo "== relevant lines ==" sed -n '780,825p' "$FILE" | cat -n echo echo "== all content_position references ==" rg -n --type=rust 'content_position' crates/dotrain/src -C 2 || true echo echo "== range/slice usage patterns around content_position ==" rg -n --type=rust 'content_position\s*\[[01]\]' crates/dotrain/src -C 2 || true echo echo "== nearby binding fields ==" rg -n --type=rust 'binding.*content_position|content_position.*binding' crates/dotrain/src -C 2 || true

Repository: rainlanguage/dotrain

Length of output: 24199

🏁 Script executed:

#!/bin/bash set -euo pipefail FILE="crates/dotrain/src/parser/raindocument/logic.rs" echo "== tracked_trim / related helpers ==" rg -n --type=rust 'fn tracked_trim|tracked_trim\(' "$FILE" -C 4 || true echo echo "== process_binding body around the range calculation ==" sed -n '758,860p' "$FILE" | cat -n echo echo "== AST for Binding and Offsets ==" sed -n '220,250p' crates/dotrain/src/types/ast.rs | cat -n echo echo "== any tests around comment-only or whitespace-only bindings ==" rg -n --type=rust 'comment-only|whitespace-only|content_position extends|comment before|binding.*comment' crates/dotrain/src/parser/raindocument/mod.rs -C 2 || true

Repository: rainlanguage/dotrain

Length of output: 8244

🏁 Script executed:

#!/bin/bash set -euo pipefail FILE="crates/dotrain/src/parser/raindocument/logic.rs" echo "== tracked_trim definition ==" ast-grep outline "$FILE" --view expanded | sed -n '1,120p' || true echo echo "== exact tracked_trim / trim helper block ==" rg -n --type=rust 'tracked_trim|fn .*trim' "$FILE" -C 6 || true echo echo "== comment-related test block ==" sed -n '940,990p' crates/dotrain/src/parser/raindocument/mod.rs | cat -n echo echo "== other content_position consumers in parser/raindocument ==" rg -n --type=rust 'content_position' crates/dotrain/src/parser/raindocument -C 2 || true

Repository: rainlanguage/dotrain

Length of output: 19722

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "== tracked_trim definition across repo ==" rg -n --type=rust 'fn tracked_trim|tracked_trim\(' crates/dotrain/src -C 4 || true echo echo "== bindings/tests that mention comment-only or empty binding ==" rg -n --type=rust 'InvalidEmptyBinding|comment-only|comment before|extends through the comment|content_position extends' crates/dotrain/src -C 3 || true echo echo "== any direct use of content_position as a start/end offset in composer/parser ==" rg -n --type=rust 'content_position\[[01]\]' crates/dotrain/src -C 2 || true

Repository: rainlanguage/dotrain

Length of output: 11446

Clamp content_position before storing it

Bindings whose content is only comments after leading whitespace can still end up with start > end: trimmed_orig preserves the comment text, while raw_trimmed.2 trims the comment-replaced slice back to the start of the content. The local .get(...).unwrap_or("") hides it here, but content_position is still stored as a malformed range.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/dotrain/src/parser/raindocument/logic.rs` around lines 798 - 808, The malformed range is being preserved in the `content_position` value even though `self.text.get(...)` masks it locally. Update the range calculation in `logic.rs` where `content_position` is derived in this branch so it is clamped/normalized before assignment, ensuring `start <= end` for comment-only bindings after whitespace trimming. Keep the fix scoped to the `content_position` computation used just before `content` is read, and use the existing `parsed_binding`, `trimmed_orig`, and `raw_trimmed` values to adjust the bounds consistently.

coderabbitai · 2026-06-29T18:18:30Z

+    #[test]
+    fn test_comment_before_binding_excluded_from_previous_content_position() {
+        let store = Store::new();
+        let meta_store = Arc::new(RwLock::new(store));
+
+        // Comment before #b2 must not appear in b1's content or content_position.
+        let text = "---\n#b1\n! elided\n\n/* comment for b2 */\n#b2\n! elided2\n";
+        let rain_document =
+            RainDocument::create(text.to_owned(), Some(meta_store.clone()), None, None);
+
+        let b1 = rain_document
+            .bindings()
+            .iter()
+            .find(|b| b.name == "b1")
+            .expect("binding b1 not found");
+
+        // b1 content must be the elision text only, not including the comment.
+        assert_eq!(b1.content, "! elided");
+
+        // b1 content_position must end at the last char of "! elided", not at the comment.
+        let content_end = b1.content_position[1];
+        let text_at_end = text.get(b1.content_position[0]..content_end).unwrap();
+        assert_eq!(text_at_end, "! elided");
+
+        // The comment text must not appear anywhere inside b1's content_position range.
+        let b1_content_range = text
+            .get(b1.content_position[0]..b1.content_position[1])
+            .unwrap();
+        assert!(
+            !b1_content_range.contains("/*"),
+            "comment leaked into b1 content_position: {b1_content_range:?}"
+        );
+
+        // b2 must also parse correctly.
+        let b2 = rain_document
+            .bindings()
+            .iter()
+            .find(|b| b.name == "b2")
+            .expect("binding b2 not found");
+        assert_eq!(b2.content, "! elided2");
+    }


📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Solid regression test for the exclusion path; consider covering the complementary cases.

This validates that a trailing comment before #b2 is excluded from b1. The PR also introduces an asymmetric rule where a comment at the start of a binding's own content is preserved (start bound uses original-text trim). That behavior, plus the comment-only binding edge case flagged in logic.rs, is currently untested. Adding cases like #b1\n/* doc */\n_: foo();\n#b2 (leading comment retained) and #b1\n\n/* note */\n#b2 (comment-only b1) would lock down the new positional contract.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/dotrain/src/parser/raindocument/mod.rs` around lines 953 - 993, The new regression test only covers exclusion of a comment before the next binding, but the positional contract also needs coverage for the complementary cases. Extend the tests around RainDocument::create and binding parsing to verify that a comment at the start of a binding’s own content is preserved in that binding’s content/content_position, and add an edge case for a comment-only binding to confirm it is handled correctly per the logic in the parser. Use the existing test in raindocument/mod.rs as the location to add these cases and align expectations with the behavior introduced in the binding/content extraction logic.

thedavidmeister self-assigned this Jun 29, 2026

coderabbitai Bot reviewed Jun 29, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(parser): comment before next binding bleeds into previous content_position#143

fix(parser): comment before next binding bleeds into previous content_position#143
thedavidmeister wants to merge 1 commit into
mainfrom
2026-06-29-issue-114-comment-position

thedavidmeister commented Jun 29, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 29, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 29, 2026

Uh oh!

coderabbitai Bot Jun 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

thedavidmeister commented Jun 29, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Fix

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Estimated code review effort

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 29, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 29, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

thedavidmeister commented Jun 29, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 29, 2026 •

edited

Loading