feat: add query dsl
336
packages/server/docs/query-language.md
Normal file
@@ -0,0 +1,336 @@
# Query Language Specification

This document describes the SQL-like query language syntax for building database queries. The language supports filtering on both text and numeric fields, including nested JSON fields, with logical operators for complex queries.

## Overview

The query language provides a human-readable, SQL-like syntax that can be parsed into the internal JSON query format used by the system. It supports:

- Text field conditions (equality, pattern matching, membership)
- Numeric field conditions (comparison operators, membership)
- Nested JSON field access using dot notation
- Logical operators (AND, OR) with grouping
- NULL value checks

## Syntax

### Field References

Fields are referenced using dot notation for nested JSON paths:

```
field_name
metadata.foo
metadata.nested.deep.field
```

**Examples:**
- `content` - top-level field
- `metadata.author` - nested field in metadata object
- `metadata.tags.0` - array element (if needed; note that the grammar below requires each path segment to start with a letter or underscore, so numeric segments would need an extension)

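To illustrate how a dotted reference could map onto the path array used by the JSON format (for example `["metadata", "foo"]`), here is a minimal Python sketch. The function name `parse_field_path` and the `allow_index` flag are hypothetical; numeric segments are accepted only when that flag is set, since the grammar does not define them.

```python
import re

# Identifier rule taken from the grammar section: [a-zA-Z_][a-zA-Z0-9_]*
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Assumption: array indexes, if supported at all, are plain digit runs.
INDEX = re.compile(r"[0-9]+")


def parse_field_path(reference, allow_index=False):
    """Split a dotted field reference into its path segments."""
    segments = reference.split(".")
    for segment in segments:
        is_identifier = IDENTIFIER.fullmatch(segment) is not None
        is_index = allow_index and INDEX.fullmatch(segment) is not None
        if not (is_identifier or is_index):
            raise ValueError(f"invalid field segment {segment!r} in {reference!r}")
    return segments


print(parse_field_path("metadata.nested.deep.field"))
```

With the default `allow_index=False`, a reference such as `metadata.tags.0` is rejected, which matches the identifier rule in the grammar as written.
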
### Text Conditions

Text conditions operate on string values:

| Operator | Syntax | Description |
|----------|--------|-------------|
| Equality | `field = 'value'` | Exact match |
| Inequality | `field != 'value'` | Not equal |
| NULL check | `field IS NULL` | Field is null |
| NOT NULL | `field IS NOT NULL` | Field is not null |
| Pattern match | `field LIKE 'pattern'` | SQL LIKE pattern matching |
| Not like | `field NOT LIKE 'pattern'` | Negated pattern matching |
| In list | `field IN ('val1', 'val2', 'val3')` | Value in list |
| Not in list | `field NOT IN ('val1', 'val2')` | Value not in list |

**String Literals:**
- Single quotes: `'value'`
- Escaped quotes: `'O''Brien'` (double single quote)
- Empty string: `''`

**LIKE Patterns:**
- `%` matches any sequence of characters
- `_` matches any single character
- Examples: `'%cat%'`, `'test_%'`, `'exact'`

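The wildcard rules above can be sketched as a translation from a LIKE pattern to a regular expression. This is only an illustration in Python: the helper names `like_to_regex` and `like_matches` are hypothetical, matching is assumed to be case-sensitive (string values are described as case-sensitive in the implementation notes), and no escape character for a literal `%` or `_` is handled because the specification does not define one.

```python
import re


def like_to_regex(pattern):
    """Translate a LIKE pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # any sequence of characters, including none
        elif ch == "_":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else matches literally
    # DOTALL lets the wildcards also match newline characters.
    return re.compile("".join(parts), re.DOTALL)


def like_matches(value, pattern):
    """Return True when the whole value matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None


print(like_matches("my cat sat", "%cat%"))
print(like_matches("test", "test_%"))
```

Using `fullmatch` matters here: a pattern without wildcards, such as `'exact'`, behaves like an equality check instead of a substring search.
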
**Examples:**
```sql
content = 'hello world'
metadata.foo = 'bar'
type != 'draft'
source IS NULL
title LIKE '%cat%'
author NOT LIKE '%admin%'
status IN ('published', 'archived')
category NOT IN ('deleted', 'hidden')
```

### Numeric Conditions

Numeric conditions operate on number values:

| Operator | Syntax | Description |
|----------|--------|-------------|
| Equality | `field = 123` | Exact match |
| Inequality | `field != 123` | Not equal |
| NULL check | `field IS NULL` | Field is null |
| NOT NULL | `field IS NOT NULL` | Field is not null |
| Greater than | `field > 10` | Greater than |
| Greater or equal | `field >= 10` | Greater than or equal |
| Less than | `field < 10` | Less than |
| Less or equal | `field <= 10` | Less than or equal |
| In list | `field IN (1, 2, 3)` | Value in list |
| Not in list | `field NOT IN (1, 2, 3)` | Value not in list |

**Numeric Literals:**
- Integers: `123`, `-45`, `0`
- Decimals: `123.45`, `-0.5`, `3.14159`
- Scientific notation: `1e10`, `2.5e-3` (if supported)

**Examples:**
```sql
typeVersion = 1
score > 0.5
views >= 100
priority < 5
age <= 65
rating IN (1, 2, 3, 4, 5)
count NOT IN (0, -1)
```

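As a rough illustration of the literal forms listed above, the following Python sketch recognizes integers, decimals and scientific notation with a single regular expression. The name `parse_number` is hypothetical, and both the leading minus sign and the exponent part are assumptions drawn from the examples (the specification marks scientific notation as "if supported").

```python
import re

# Optional sign, digits, optional fraction, optional exponent.
NUMBER = re.compile(r"-?[0-9]+(\.[0-9]+)?([eE][+-]?[0-9]+)?")


def parse_number(text):
    """Convert a numeric literal to int or float, rejecting anything else."""
    match = NUMBER.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid numeric literal: {text!r}")
    # A fraction or an exponent makes the literal a float; otherwise keep an int.
    if match.group(1) or match.group(2):
        return float(text)
    return int(text)


for literal in ["123", "-45", "3.14159", "2.5e-3"]:
    print(literal, "->", parse_number(literal))
```

Forms such as `.5` or `1.` are rejected by this sketch because the pattern requires digits on both sides of the decimal point.
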
### Logical Operators

Combine conditions using `AND` and `OR` operators:

| Operator | Syntax | Description |
|----------|--------|-------------|
| AND | `condition1 AND condition2` | Both conditions must be true |
| OR | `condition1 OR condition2` | At least one condition must be true |

**Grouping:**
Use parentheses `()` to group conditions and control operator precedence:

```sql
(condition1 AND condition2) OR condition3
condition1 AND (condition2 OR condition3)
```

**Examples:**
```sql
type = 'article' AND status = 'published'
metadata.foo = 'bar' OR metadata.foo = 'baz'
(type = 'post' OR type = 'page') AND views > 100
```

### Operator Precedence

1. Parentheses `()` - highest precedence
2. `AND` - evaluated before OR
3. `OR` - lowest precedence

**Examples:**
```sql
-- Equivalent to: (A AND B) OR C
A AND B OR C

-- Parentheses override the default: B OR C is evaluated first
A AND (B OR C)

-- Explicit grouping
(A OR B) AND (C OR D)
```

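One common way to obtain this precedence is a recursive-descent parser with one function per level. The Python sketch below is an illustration only, not the project's parser: it treats bare names such as `A` and `B` as stand-ins for full conditions and returns nested tuples so the grouping is visible. The function name `parse` and the tuple shape are assumptions.

```python
import re

# Parentheses and bare words; words stand in for complete conditions.
# Assumption: any other character is simply skipped in this sketch.
TOKEN = re.compile(r"\(|\)|[A-Za-z_][A-Za-z0-9_]*")


def parse(text):
    """Parse placeholder conditions joined by AND/OR into nested tuples."""
    tokens = TOKEN.findall(text)
    pos = 0

    def peek():
        return tokens[pos].upper() if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():  # lowest precedence
        node = parse_and()
        while peek() == "OR":
            advance()
            node = ("or", node, parse_and())
        return node

    def parse_and():  # binds tighter than OR
        node = parse_primary()
        while peek() == "AND":
            advance()
            node = ("and", node, parse_primary())
        return node

    def parse_primary():  # a placeholder condition or a parenthesized group
        if peek() == "(":
            advance()
            node = parse_or()
            if peek() != ")":
                raise ValueError("expected ')'")
            advance()
            return node
        if peek() in (None, "AND", "OR", ")"):
            raise ValueError("expected a condition")
        return advance()

    tree = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return tree


print(parse("A AND B OR C"))
print(parse("A AND (B OR C)"))
```

Because `parse_or` calls `parse_and` for each operand, every run of `AND` is grouped before `OR` is applied, and a parenthesized group restarts at the lowest level.
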
## Complete Examples

### Simple Conditions

```sql
-- Text equality
metadata.author = 'John Doe'

-- Numeric comparison
views >= 1000

-- Pattern matching
title LIKE '%tutorial%'

-- NULL check
source IS NULL
```

### Multiple Conditions

```sql
-- AND operator
type = 'article' AND status = 'published' AND views > 100

-- OR operator
category = 'tech' OR category = 'science'

-- Mixed operators (the language defines only text and numeric literals, so no boolean literal is used)
(type = 'post' OR type = 'page') AND status = 'published'
```

### Complex Nested Queries

```sql
-- Nested AND within OR
(metadata.foo = 'bar' AND type = 'demo') OR metadata.foo = 'baz'

-- Multiple levels of nesting
((status = 'active' AND views > 100) OR (status = 'featured' AND views > 50)) AND category = 'news'

-- Complex query with multiple field types
type = 'article' AND (metadata.author = 'John' OR metadata.author = 'Jane') AND views >= 100 AND rating IN (4, 5)
```

### Array/List Operations

```sql
-- Text IN
status IN ('published', 'archived', 'draft')

-- Numeric IN
priority IN (1, 2, 3)

-- NOT IN
category NOT IN ('deleted', 'hidden')
```

## Type Inference

The parser will infer the condition type (text vs number) based on:

1. **Operator context**: Operators like `>`, `<`, `>=`, `<=` imply a numeric condition, while `LIKE` and `NOT LIKE` imply a text condition
2. **Value type**:
   - Quoted strings (`'value'`) → text condition
   - Unquoted numbers (`123`, `45.6`) → numeric condition
   - `NULL` → can be either (context-dependent)
3. **Field name**: If a field is known to be numeric, the condition is treated as numeric even when the value alone does not decide it (for example `IS NULL`)

**Examples:**
```sql
-- Text condition (quoted string)
author = 'John'

-- Numeric condition (unquoted number)
age = 30

-- Numeric comparison
score > 0.5

-- Text pattern
title LIKE '%test%'
```

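The three inference rules could be combined roughly as in the Python sketch below. Everything named here is an assumption for illustration: the function `infer_condition_type`, the `numeric_fields` set standing in for "fields known to be numeric", the result label `"number"` (only `"text"` appears in the JSON examples), and the choice to fall back to text when nothing else decides.

```python
import re

NUMBER = re.compile(r"-?[0-9]+(\.[0-9]+)?([eE][+-]?[0-9]+)?")
NUMERIC_ONLY_OPERATORS = {">", ">=", "<", "<="}
TEXT_ONLY_OPERATORS = {"LIKE", "NOT LIKE"}


def infer_condition_type(field, operator, literal=None, numeric_fields=frozenset()):
    """Return "text" or "number" for one condition.

    `literal` is the raw literal as written in the query (for IN lists, the
    first element), or None for IS NULL / IS NOT NULL.
    """
    op = operator.upper()
    # Rule 1: the operator alone can decide.
    if op in NUMERIC_ONLY_OPERATORS:
        return "number"
    if op in TEXT_ONLY_OPERATORS:
        return "text"
    # Rule 2: the shape of the literal decides.
    if literal is not None:
        if literal.startswith("'"):
            return "text"
        if NUMBER.fullmatch(literal):
            return "number"
        raise ValueError(f"cannot infer a type from literal {literal!r}")
    # Rule 3: NULL checks fall back to what is known about the field.
    # Assumption: unknown fields default to text.
    return "number" if field in numeric_fields else "text"


print(infer_condition_type("score", ">", "0.5"))
print(infer_condition_type("source", "IS NULL"))
```

The order of the checks is a design choice in this sketch: operator context is consulted first, so a comparison such as `score > 0.5` is numeric before the literal is even inspected.
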
## Escaping and Special Characters

### String Escaping

- Single quotes in strings: `'O''Brien'` → `O'Brien`
- Empty string: `''`

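The doubling rule can be sketched in both directions with a pair of small Python helpers. The names `unescape_string_literal` and `escape_string_literal` are hypothetical, and the sketch assumes that doubled single quotes are the only escape sequence.

```python
def unescape_string_literal(literal):
    """Turn a quoted literal such as 'O''Brien' into its value O'Brien."""
    if len(literal) < 2 or literal[0] != "'" or literal[-1] != "'":
        raise ValueError("string literal must be wrapped in single quotes")
    body = literal[1:-1]
    # Once every '' pair is removed, no lone single quote may remain.
    if "'" in body.replace("''", ""):
        raise ValueError("unescaped single quote inside string literal")
    return body.replace("''", "'")


def escape_string_literal(value):
    """Render a value as a quoted literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


print(unescape_string_literal("'O''Brien'"))
print(escape_string_literal("O'Brien"))
```

The two helpers are inverses of each other, so a value survives a round trip through `escape_string_literal` and `unescape_string_literal` unchanged.
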
### Field Name Escaping

If field names contain special characters or reserved words, they can be quoted (implementation-dependent):

```sql
-- Reserved words or special characters (if supported)
"order" = 'asc'
"metadata.field-name" = 'value'
```

## Error Handling

The parser should provide clear error messages for:

- Invalid syntax
- Mismatched parentheses
- Invalid operators for field types
- Missing values
- Invalid escape sequences

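As one example of what a clear error message could look like, the Python sketch below reports mismatched parentheses and unterminated string literals together with a character position. The exception class `QuerySyntaxError`, the function `check_parentheses`, and the message wording are all hypothetical; the specification does not prescribe an error format.

```python
class QuerySyntaxError(ValueError):
    """Syntax error that remembers where in the query it occurred."""

    def __init__(self, message, position):
        super().__init__(f"{message} at position {position}")
        self.position = position


def check_parentheses(query):
    """Raise QuerySyntaxError for unbalanced parentheses or an open string."""
    open_positions = []  # stack of indexes of '(' not yet closed
    in_string = False
    string_start = 0
    i = 0
    while i < len(query):
        ch = query[i]
        if in_string:
            if ch == "'":
                if i + 1 < len(query) and query[i + 1] == "'":
                    i += 1  # skip the second quote of an escaped '' pair
                else:
                    in_string = False
        elif ch == "'":
            in_string = True
            string_start = i
        elif ch == "(":
            open_positions.append(i)
        elif ch == ")":
            if not open_positions:
                raise QuerySyntaxError("unmatched ')'", i)
            open_positions.pop()
        i += 1
    if in_string:
        raise QuerySyntaxError("unterminated string literal", string_start)
    if open_positions:
        raise QuerySyntaxError("unclosed '('", open_positions[-1])


try:
    check_parentheses("(type = 'post' OR type = 'page' AND views > 100")
except QuerySyntaxError as error:
    print(error)
```

Parentheses inside string literals are deliberately ignored, so a value such as `'(draft'` does not trigger a false report.
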
## Grammar (BNF-like)

The `expression` and `and_expression` rules encode the precedence described above (`AND` binds tighter than `OR`), and `number` allows the leading minus sign used in the numeric literal examples.

```
query             ::= expression
expression        ::= and_expression ( 'OR' and_expression )*
and_expression    ::= primary ( 'AND' primary )*
primary           ::= condition | group
group             ::= '(' expression ')'
condition         ::= text_condition | numeric_condition
text_condition    ::= field ( '=' | '!=' | 'LIKE' | 'NOT LIKE' ) string_literal
                    | field 'IS' ( 'NULL' | 'NOT NULL' )
                    | field 'IN' '(' string_list ')'
                    | field 'NOT IN' '(' string_list ')'
numeric_condition ::= field ( '=' | '!=' | '>' | '>=' | '<' | '<=' ) number
                    | field 'IS' ( 'NULL' | 'NOT NULL' )
                    | field 'IN' '(' number_list ')'
                    | field 'NOT IN' '(' number_list ')'
field             ::= identifier ( '.' identifier )*
identifier        ::= [a-zA-Z_][a-zA-Z0-9_]*
string_literal    ::= "'" ( escaped_char | [^'] )* "'"
escaped_char      ::= "''"
string_list       ::= string_literal ( ',' string_literal )*
number            ::= '-'? [0-9]+ ( '.' [0-9]+ )? ( [eE] [+-]? [0-9]+ )?
number_list       ::= number ( ',' number )*
```

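The terminal symbols of this grammar can be recognized with a small regex-based tokenizer. The following Python sketch is one possible approach, not the project's implementation: the token kind names, the `tokenize` function, and the acceptance of a leading minus sign on numbers (taken from the numeric literal examples) are assumptions.

```python
import re

TOKEN_SPEC = [
    ("STRING", r"'(?:''|[^'])*'"),  # single-quoted, '' is an escaped quote
    ("NUMBER", r"-?[0-9]+(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?"),
    ("OP", r"!=|>=|<=|=|>|<"),      # two-character operators come first
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("COMMA", r","),
    ("WORD", r"[A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*"),
    ("SPACE", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))
KEYWORDS = {"AND", "OR", "NOT", "IN", "IS", "NULL", "LIKE"}


def tokenize(query):
    """Split a query into (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(query):
        match = MASTER.match(query, pos)
        if match is None:
            raise ValueError(f"unexpected character {query[pos]!r} at position {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SPACE":
            continue
        if kind == "WORD":
            if text.upper() in KEYWORDS:
                # Keywords are case-insensitive, so normalize them to upper case.
                kind, text = "KEYWORD", text.upper()
            else:
                kind = "FIELD"  # field names keep their original case
        tokens.append((kind, text))
    return tokens


for token in tokenize("metadata.foo = 'bar' and views >= 100"):
    print(token)
```

Multi-word operators such as `NOT IN` and `IS NOT NULL` come out as consecutive `KEYWORD` tokens in this sketch; combining them is left to the parser.
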
## Migration from JSON Format

The SQL-like syntax maps to the JSON format as follows:

**JSON:**
```json
{
  "type": "text",
  "field": ["metadata", "foo"],
  "conditions": {
    "equal": "bar"
  }
}
```

**SQL:**
```sql
metadata.foo = 'bar'
```

**JSON (with operator):**
```json
{
  "type": "operator",
  "operator": "and",
  "conditions": [
    {
      "type": "text",
      "field": ["metadata", "foo"],
      "conditions": {
        "equal": "bar"
      }
    },
    {
      "type": "text",
      "field": ["type"],
      "conditions": {
        "equal": "demo"
      }
    }
  ]
}
```

**SQL:**
```sql
metadata.foo = 'bar' AND type = 'demo'
```

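For existing JSON queries, the mapping above can be applied mechanically. The Python sketch below renders only the two shapes shown in this section (a text `equal` condition and an `and`/`or` operator node) and raises for anything else, because the remaining JSON condition keys are not documented here. The function names `to_sql` and `quote_text` are hypothetical, and support for an `"or"` operator value is assumed by analogy with `"and"`.

```python
def quote_text(value):
    """Render a string as a single-quoted literal, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def to_sql(node, parent_operator=None):
    """Render a JSON query node as SQL-like text (equality and AND/OR only)."""
    if node["type"] == "operator":
        operator = node["operator"].lower()
        if operator not in ("and", "or"):
            raise NotImplementedError(f"unsupported operator: {operator!r}")
        rendered = f" {operator.upper()} ".join(
            to_sql(child, operator) for child in node["conditions"]
        )
        # An OR nested inside an AND needs parentheses to keep its grouping,
        # because AND binds tighter than OR.
        if parent_operator == "and" and operator == "or":
            return f"({rendered})"
        return rendered
    if node["type"] == "text" and set(node["conditions"]) == {"equal"}:
        field = ".".join(node["field"])
        return f"{field} = {quote_text(node['conditions']['equal'])}"
    raise NotImplementedError(f"unsupported node: {node!r}")


query = {
    "type": "operator",
    "operator": "and",
    "conditions": [
        {"type": "text", "field": ["metadata", "foo"], "conditions": {"equal": "bar"}},
        {"type": "text", "field": ["type"], "conditions": {"equal": "demo"}},
    ],
}
print(to_sql(query))
```

Raising on unknown shapes, instead of guessing, keeps a migration script from silently producing a query that means something different.
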
## Implementation Notes

1. **Whitespace**: Whitespace between tokens is ignored; whitespace inside string literals is preserved
2. **Case sensitivity**:
   - Operators (`AND`, `OR`, `LIKE`, etc.) are case-insensitive
   - Field names and string values are case-sensitive
3. **Comments**: Not supported in the initial version (can be added later). The `--` lines in the examples in this document are annotations, not query syntax
4. **Table prefixes**: The parser may support optional table name prefixes (e.g., `documents.metadata.foo`) if needed