When scraping a local directory tree with `file://`, there is no way to honour `.gitignore`. The only filters are `includeHidden` and user-supplied `--exclude-pattern` globs.
For the common case of indexing a tree of real git repositories, this means every non-hidden ignored path (`node_modules/`, `dist/`, build output, and any non-hidden secret listed in `.gitignore`) gets crawled unless it is manually re-enumerated as an exclude pattern. That is both a noise problem and a safety problem: a deliberately gitignored file is silently indexed and becomes searchable.
Request: an opt-in flag or config option (e.g. `--respect-gitignore`, default off for backward compatibility) that skips gitignored paths during local directory traversal. Supporting the per-directory `.gitignore` cascade would cover the large majority of cases.
It should be opt-in because pure doc bundles often have no `.gitignore`, so defaulting to off preserves current behaviour.
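
To make the requested behaviour concrete, here is a minimal Python sketch of a traversal that applies the per-directory `.gitignore` cascade. It is an illustration only, not the project's code: the function names (`load_rules`, `is_ignored`, `walk_respecting_gitignore`) are hypothetical, and the matcher covers only a simplified subset of gitignore syntax (comments, `!` negation, trailing-`/` directory-only patterns, and patterns containing a slash being anchored to their `.gitignore`'s directory). It does not handle `**`, and `fnmatch`'s `*` can cross `/`, unlike git's. Hidden-file filtering is left to the existing `includeHidden` option and is not modelled here.

```python
import fnmatch
import os
from typing import Iterator, List, Tuple

# (base_dir, pattern, negated, dir_only, anchored) -- hypothetical rule shape
Rule = Tuple[str, str, bool, bool, bool]


def load_rules(directory: str) -> List[Rule]:
    """Parse <directory>/.gitignore into rules (simplified gitignore subset)."""
    rules: List[Rule] = []
    path = os.path.join(directory, ".gitignore")
    if not os.path.isfile(path):
        return rules
    with open(path, encoding="utf-8", errors="replace") as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#"):
                continue  # blank line or comment
            negated = line.startswith("!")
            if negated:
                line = line[1:]
            dir_only = line.endswith("/")
            line = line.rstrip("/")
            # A slash at the start or in the middle anchors the pattern
            # to the directory that holds this .gitignore.
            anchored = "/" in line
            line = line.lstrip("/")
            if line:
                rules.append((directory, line, negated, dir_only, anchored))
    return rules


def is_ignored(path: str, is_dir: bool, rules: List[Rule]) -> bool:
    """Apply rules in order; the last matching rule decides."""
    ignored = False
    for base, pattern, negated, dir_only, anchored in rules:
        if dir_only and not is_dir:
            continue
        rel = os.path.relpath(path, base).replace(os.sep, "/")
        target = rel if anchored else os.path.basename(path)
        if fnmatch.fnmatchcase(target, pattern):
            ignored = not negated
    return ignored


def walk_respecting_gitignore(root: str) -> Iterator[str]:
    """Yield file paths under root, pruning gitignored files and directories."""

    def visit(directory: str, inherited: List[Rule]) -> Iterator[str]:
        # Deeper .gitignore files are appended last, so they take precedence.
        rules = inherited + load_rules(directory)
        with os.scandir(directory) as it:
            entries = sorted(it, key=lambda e: e.name)
        for entry in entries:
            if entry.name == ".git":
                continue  # never descend into repository metadata
            is_dir = entry.is_dir(follow_symlinks=False)
            if is_ignored(entry.path, is_dir, rules):
                continue  # an ignored directory is pruned with all its children
            if is_dir:
                yield from visit(entry.path, rules)
            else:
                yield entry.path

    yield from visit(root, [])
```

Pruning an ignored directory before descending mirrors git's rule that a file cannot be re-included once its parent directory is excluded, and it also avoids walking large trees such as `node_modules/` at all. A real implementation would more likely delegate matching to an existing gitignore-compatible library than hand-roll it as above.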