Skip to content

Conversation

@msukkari
Copy link
Contributor

@msukkari msukkari commented Jan 23, 2026

The parser was previously not allowing words to include parenthesis. This is a problem because it caused queries like test( to fail when in reality this is valid. It also caused issues like #771 when parenthesis were present in filter expressions as well.

This PR fixes that by modifying the lexer/parser do allow words to include parenthesis. The reason this is really complicated is because parenthesis are picked up to detect groupings (i.e. in queries like (foo bar)). Lexers don't like this because it requires them to look ahead to determine what the token produced should be. For examples should (foo bar) be a ParenExpr or two words (foo and bar)?

Needless to say this took a lot of prompting with cursor and claude to arrive at a working solution. Thankfully we have an extensive test suite so we can be pretty sure that no existing functionality has broken

Also fixes #771

Summary by CodeRabbit

  • New Features

    • Improved support for parentheses and grouped expressions in queries
    • More context-aware tokenization for OR, negation, and word-like terms to reduce mis-parsing
  • Bug Fixes

    • Better distinction between parentheses used for grouping vs. part of terms, reducing incorrect ASTs
  • Tests

    • Added extensive parenthesis-focused test cases and updated operator expectations
  • Documentation

    • Changelog entry noting parentheses support in queries

✏️ Tip: You can customize this high-level summary in your review settings.

@github-actions

This comment has been minimized.

@coderabbitai
Copy link

coderabbitai bot commented Jan 23, 2026

Walkthrough

Externalized and refined tokenization for parentheses, words, OR, and negation; updated LR parser state/token tables and term name mappings; grammar now uses external tokens for ParenExpr; added extensive parenthesis parsing tests and adjusted OR-related expectations.

Changes

Cohort / File(s) Summary
Parser term constants & config
packages/queryLanguage/src/parser.terms.ts, packages/queryLanguage/src/parser.ts
Added exported term constants openParen, word, closeParen, or. Replaced serialized parser configuration (states, stateData, goto, tokenData), updated tokenizers to include parenToken, wordToken, closeParenToken, orToken, and revised termNames index mappings.
Grammar & tokenizer implementations
packages/queryLanguage/src/query.grammar, packages/queryLanguage/src/tokens.ts
query.grammar now imports external tokens (parenToken, wordToken, closeParenToken, orToken) and uses them in ParenExpr; inline word/or rules removed. tokens.ts introduces context-aware tokenizers: parenToken, closeParenToken, wordToken, orToken, and a refined negateToken, plus helpers for whitespace, prefix detection, and balanced-paren logic.
Tests & expectations
packages/queryLanguage/test/operators.txt, packages/queryLanguage/test/parenthesis.txt
Adjusted two operator expectations to produce AndExpr(Term,Term) instead of warning placeholder; added parenthesis.txt with comprehensive parenthesis/prefix parsing cases and expected AST outputs.
Changelog
CHANGELOG.md
Notes support for parentheses in query/filter terms (Unreleased).

Sequence Diagram(s)

sequenceDiagram
  participant Input as InputStream
  participant Tokenizers as Tokenizer chain (tokens.ts)
  participant Parser as LRParser (parser.ts)
  participant AST as AST builder / semantic actions

  Input->>Tokenizers: read characters
  Tokenizers->>Tokenizers: apply rules (balanced-paren, prefixes, OR/negation)
  Tokenizers-->>Parser: emit tokens (openParen, word, closeParen, or, negate, ...)
  Parser->>Parser: consult states / tokenData / goto
  Parser-->>AST: invoke reductions → build Program / Expr nodes
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: allowing parentheses in query and filter terms, which directly addresses the parser modifications shown across multiple files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

brendan-kellam
brendan-kellam previously approved these changes Jan 23, 2026
Copy link
Contributor

@brendan-kellam brendan-kellam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@msukkari msukkari merged commit 507fc8b into main Jan 23, 2026
9 checks passed
@msukkari msukkari deleted the michael/sou-259-bug-parenthesis-in-search-term-causes-parse-error branch January 23, 2026 05:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[bug] Ask cannot handle files with special characters

3 participants