@msukkari
Contributor

@msukkari msukkari commented Jan 23, 2026

Fixes a weird edge case where the parser would fail if a regex filter such as filename/reponame contained a parenthesis. This happens because the query language forbids parentheses in the value, since they're used to match against the ParenExpr symbol.

Ideally the agent would be smart enough to wrap the regexp in quotes, which would solve this problem. I've added instructions in the tool call telling it to do this, but it rarely does. As a result, I've added basic logic that preprocesses the regexp before it is added to the query string, wrapping it in quotes if it contains a parenthesis.
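
The preprocessing described above could look roughly like the following. This is a minimal sketch and not the code from this PR: the function name matches the `preprocessRegexp` utility mentioned in this thread, but the body, the "already quoted" check, and the decision to ignore embedded double quotes are all assumptions.

```typescript
// Hypothetical sketch of the preprocessing step; not the actual implementation.
function preprocessRegexp(value: string): string {
    // A value needs protecting only if it contains an opening or closing paren.
    const hasParen = value.includes("(") || value.includes(")");

    // Assumption: a value that already starts and ends with a double quote
    // was quoted by the caller (e.g. the agent) and should be left alone.
    const isQuoted =
        value.length >= 2 && value.startsWith('"') && value.endsWith('"');

    if (!hasParen || isQuoted) {
        return value;
    }

    // Wrap the whole value so the parser reads it as one quoted value.
    // Embedded double quotes are not handled in this sketch.
    return `"${value}"`;
}
```

With this sketch, a value such as `src/(app)/.*` comes back wrapped in double quotes, while `src/.*\.ts` and an already quoted value are returned unchanged.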

Fixes #771

Summary by CodeRabbit

  • Bug Fixes

    • Properly handle regex filters that include parentheses.
  • New Features

    • Added file path filtering capability to code search functionality.
  • Documentation

    • Updated guidance on using file filters with parentheses in regex expressions.


@coderabbitai

coderabbitai bot commented Jan 23, 2026

Walkthrough

PR addresses issue #771 by implementing proper handling of regex filters containing parentheses. Introduces new preprocessRegexp utility function that wraps values with parentheses in quotes, adds filterByFile parameter to the search_code tool, and applies consistent preprocessing across MCP and web client implementations to prevent query parsing failures.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Regex preprocessing utility: packages/shared/src/query.ts | New exported function preprocessRegexp(value: string) that detects parentheses in unquoted values and wraps them in double quotes to safely handle special characters in regex patterns. |
| Shared package exports: packages/shared/src/index.client.ts, packages/shared/src/index.server.ts | Added re-exports of preprocessRegexp from ./query.js to make the utility available across package boundaries. |
| MCP search_code tool enhancement: packages/mcp/src/index.ts | Extended search_code tool with new filterByFile parameter (array of strings). Imported preprocessRegexp and implemented filtering logic to transform file paths with preprocessing before appending to query as ( file:… ) clauses. Updated descriptions to document new functionality. |
| Web client tool documentation & preprocessing: packages/web/src/features/chat/tools.ts, packages/web/src/features/chat/utils.ts | Updated tool descriptions to document parenthesis handling requirements. Added preprocessing step in buildSearchQuery to transform file and repo filter regexes using preprocessRegexp before query composition. |
| Changelog: CHANGELOG.md | Added entry under Unreleased/Fixed: "Properly handle regex filters that include parenthesis." Appended reference tag to existing hotkey mapping item. |
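
To make the MCP change in the summary above more concrete, here is a rough sketch of how `filterByFile` values might be preprocessed and appended to a query. Everything in it is hypothetical: `appendFileFilters` and `quoteIfParenthesized` are illustrative names, and because the exact clause syntax is elided in the summary, joining the terms with `or` is purely an assumption.

```typescript
// Hypothetical stand-in for preprocessRegexp: quote values that contain
// a parenthesis unless they are already wrapped in double quotes.
function quoteIfParenthesized(value: string): string {
    const hasParen = /[()]/.test(value);
    const isQuoted =
        value.length >= 2 && value.startsWith('"') && value.endsWith('"');
    return hasParen && !isQuoted ? `"${value}"` : value;
}

// Hypothetical query builder: appends one parenthesized group of file
// filters to a base query. Assumption: terms in the group are joined with
// `or`; the real clause syntax is not shown in this thread.
function appendFileFilters(query: string, filterByFile: string[] = []): string {
    if (filterByFile.length === 0) {
        // No filter given: all files are searched, so the query is unchanged.
        return query;
    }
    const terms = filterByFile.map(
        (pattern) => `file:${quoteIfParenthesized(pattern)}`
    );
    return `${query} ( ${terms.join(" or ")} )`;
}
```

The point of the sketch is the ordering: each filter value is quoted before it is placed inside the grouping parentheses, so parentheses that belong to the regex cannot be confused with the ones that delimit the group.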

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested reviewers

  • brendan-kellam
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding proper handling of regex filters containing parentheses, which directly addresses the fix for issue #771. |
| Linked Issues check | ✅ Passed | The PR implements the required fix for issue #771 by adding preprocessRegexp logic to wrap regex filter values containing parentheses in quotes, preventing parser failures. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to fixing regex filter handling with parentheses. No out-of-scope modifications detected; changes focus on the preprocessRegexp function, its exports, and its integration into search query building. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
packages/web/src/features/chat/tools.ts (1)

142-173: Fix minor typos in the tool description text.

“expresion” → “expression”, “paranthesis” → “parenthesis”.

✏️ Proposed fix
-            .describe(`Filter results from filepaths that match the regex. When this option is not specified, all files are searched. If the regex expresion includes a paranthesis **YOU MUST** wrap this value in quotes when passing it in.`)
+            .describe(`Filter results from filepaths that match the regex. When this option is not specified, all files are searched. If the regex expression includes a parenthesis **YOU MUST** wrap this value in quotes when passing it in.`)
packages/mcp/src/index.ts (1)

24-49: Fix minor typos in schema/description text.

“expresion” → “expression”, “paranthesis” → “parenthesis”.

✏️ Proposed fix
-            .describe("Scope the search to results inside filepaths that match the provided regex expression. By default all files are searched, so **only use this filter if you need to filter on specific files**. **YOU MUST** ensure that this is a valid regex expression and any special characters are properly escaped. If the regex expresion includes a paranthesis **YOU MUST** wrap this value in quotes when passing it in.")
+            .describe("Scope the search to results inside filepaths that match the provided regex expression. By default all files are searched, so **only use this filter if you need to filter on specific files**. **YOU MUST** ensure that this is a valid regex expression and any special characters are properly escaped. If the regex expression includes a parenthesis **YOU MUST** wrap this value in quotes when passing it in.")
🤖 Fix all issues with AI agents
In `@CHANGELOG.md`:
- Line 12: Update the changelog entry text to use the plural form "parentheses"
instead of "parenthesis" for clarity; locate the line containing "Properly
handle regex filters that include parenthesis.
[`#786`](https://github.com/sourcebot-dev/sourcebot/pull/786)" and change it to
"Properly handle regex filters that include parentheses.
[`#786`](https://github.com/sourcebot-dev/sourcebot/pull/786)".

 ### Fixed
-- Properly map all hotkeys in UI based on the platform [#784](https://github.com/sourcebot-dev/sourcebot/pull/784)
+- Properly map all hotkeys in UI based on the platform. [#784](https://github.com/sourcebot-dev/sourcebot/pull/784)
+- Properly handle regex filters that include parenthesis. [#786](https://github.com/sourcebot-dev/sourcebot/pull/786)
⚠️ Potential issue | 🟡 Minor

Use “parentheses” (plural) for clarity.

Minor wording nit in the changelog entry.

✏️ Suggested tweak
-- Properly handle regex filters that include parenthesis. [`#786`](https://github.com/sourcebot-dev/sourcebot/pull/786)
+- Properly handle regex filters that include parentheses. [`#786`](https://github.com/sourcebot-dev/sourcebot/pull/786)

// Entry point for the MCP server
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { preprocessRegexp } from '@sourcebot/shared';
Contributor

nit: mcp cannot take a dependency on @sourcebot/shared since we do not publish it to npm

@brendan-kellam
Contributor

I wonder if we could improve our parsing to not throw an exception in the case of an unmatched paren?

@msukkari
Contributor Author

msukkari commented Jan 23, 2026

> I wonder if we could improve our parsing to not throw an exception in the case of an unmatched paren?

Closing in favor of #788

@msukkari msukkari closed this Jan 23, 2026

Development

Successfully merging this pull request may close these issues.

[bug] Ask cannot handle files with special characters

3 participants