add DSL documentation
docs/query-dsl.md
@@ -0,0 +1,470 @@
# Query DSL Documentation

The Query DSL (Domain-Specific Language) provides a flexible way to query documents in the system. It lets you construct complex queries using a SQL-like syntax with support for filtering by URI, document type, and metadata fields.

## Table of Contents

- [Overview](#overview)
- [Syntax Overview](#syntax-overview)
- [Field Types](#field-types)
- [Operators](#operators)
- [Logical Operations](#logical-operations)
- [Data Types](#data-types)
- [Examples](#examples)
- [Error Handling](#error-handling)
- [API Reference](#api-reference)
- [Best Practices](#best-practices)
- [Limitations](#limitations)

## Overview

The Query DSL parses human-readable query strings and converts them into `DocumentSearchOptions` objects that can be used to search documents. It supports:

- **URI filtering**: Filter documents by their URIs
- **Type filtering**: Filter documents by their type
- **Metadata filtering**: Filter documents by their metadata fields with type-aware operations
- **Logical operations**: Combine conditions using AND/OR logic
- **Parenthetical grouping**: Group conditions for complex boolean logic
- **Multiple data types**: String, number, and boolean values with appropriate operators

## Syntax Overview

The basic syntax follows this pattern:

```
field operator value [logical_operator field operator value...]
```

### Basic Examples

```
uri = "doc-123"
type = "article"
meta.priority = 5
meta.title like "%search%"
```

### Complex Examples

```
uri in ["doc-1", "doc-2"] and meta.priority >= 5
(meta.published = true or meta.draft = false) and type = "article"
```

## Field Types

### 1. URI Fields

Filter documents by their URI (Uniform Resource Identifier).

**Syntax**: `uri`

**Supported Operations**:
- Equality: `uri = "document-id"`
- Array membership: `uri in ["doc-1", "doc-2", "doc-3"]`

### 2. Type Fields

Filter documents by their type.

**Syntax**: `type`

**Supported Operations**:
- Equality: `type = "article"`
- Array membership: `type in ["article", "blog", "news"]`

### 3. Metadata Fields

Filter documents by their metadata fields. Metadata fields are accessed using dot notation.

**Syntax**: `meta.fieldName`

**Examples**:
- `meta.title`
- `meta.priority`
- `meta.published`
- `meta.created_at`

## Operators

### Comparison Operators

| Operator | Description | Example |
|----------|-------------|---------|
| `=` | Equals | `meta.priority = 5` |
| `!=` | Not equals | `meta.status != "archived"` |
| `>` | Greater than | `meta.score > 8.5` |
| `>=` | Greater than or equal | `meta.priority >= 5` |
| `<` | Less than | `meta.age < 30` |
| `<=` | Less than or equal | `meta.count <= 100` |
| `like` | Pattern matching (SQL-like) | `meta.title like "%test%"` |
| `not like` | Negative pattern matching | `meta.title not like "%draft%"` |
| `in` | Array membership | `uri in ["a", "b", "c"]` |

### Logical Operators

| Operator | Syntax | Description | Example |
|----------|---------|-------------|---------|
| AND | `and` or `&` | Logical AND | `meta.a = 1 and meta.b = 2` |
| OR | `or` or `\|` | Logical OR | `meta.urgent = true or meta.priority > 8` |

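
To make the relationship between the comparison operators and the parsed output more concrete, here is a minimal sketch of how each operator could map onto the filter keys (`eq`, `neq`, `gt`, `gte`, `lt`, `lte`, `like`, `nlike`) listed in the API Reference. The names `NUMBER_KEYS`, `TEXT_KEYS`, `numberFilter`, and `textFilter` are hypothetical and are not part of the documented API; the real parser may be organized differently.

```typescript
// Hypothetical helper types: the operators that apply to each value type.
type NumberOp = '=' | '!=' | '>' | '>=' | '<' | '<=';
type TextOp = '=' | '!=' | 'like' | 'not like';

// Assumed mapping from DSL operator to filter key, following the
// MetaNumberFilter and MetaTextFilter shapes in the API Reference.
const NUMBER_KEYS: Record<NumberOp, string> = {
  '=': 'eq', '!=': 'neq', '>': 'gt', '>=': 'gte', '<': 'lt', '<=': 'lte',
};
const TEXT_KEYS: Record<TextOp, string> = {
  '=': 'eq', '!=': 'neq', 'like': 'like', 'not like': 'nlike',
};

// Build a number filter object for a metadata field.
function numberFilter(field: string, op: NumberOp, value: number) {
  return { type: 'number' as const, field, filter: { [NUMBER_KEYS[op]]: value } };
}

// Build a text filter object for a metadata field.
function textFilter(field: string, op: TextOp, value: string) {
  return { type: 'text' as const, field, filter: { [TEXT_KEYS[op]]: value } };
}

// `meta.priority >= 5` would correspond to:
// { type: 'number', field: 'priority', filter: { gte: 5 } }
const example = numberFilter('priority', '>=', 5);
```

Keeping separate operator sets per value type mirrors the documented behavior that some operators, such as `like`, apply only to text.
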
## Logical Operations

### AND Operations

Combine conditions where **both** must be true:

```
meta.published = true and meta.priority >= 5
meta.created >= 1234567890 & meta.updated < 1234567999
```

### OR Operations

Combine conditions where **either** can be true:

```
type = "article" or type = "blog"
meta.urgent = true | meta.priority > 9
```

### Parenthetical Grouping

Use parentheses to control the order of operations:

```
(meta.priority > 5 or meta.urgent = true) and type = "article"
((meta.a = 1 and meta.b = 2) or meta.c = 3) and uri = "test"
```

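
When queries are assembled from user input rather than written by hand, it can help to build the string programmatically so that quoting and grouping stay consistent. The sketch below is one possible approach; `quote`, `inList`, `allOf`, and `anyOf` are hypothetical helpers, and it assumes the parser accepts parentheses around a single condition, which the examples above suggest but do not state.

```typescript
// Quote a string value for the DSL, escaping backslashes and double quotes.
// Other characters (such as raw newlines) are passed through unchanged.
function quote(value: string): string {
  return '"' + value.replace(/\\/g, '\\\\').replace(/"/g, '\\"') + '"';
}

// Build an `in` condition, e.g. uri in ["doc-1", "doc-2"].
function inList(field: string, values: string[]): string {
  return `${field} in [${values.map(quote).join(', ')}]`;
}

// Join conditions with `and`, wrapping each part in parentheses so the
// result does not depend on operator precedence.
function allOf(...parts: string[]): string {
  return parts.map((p) => `(${p})`).join(' and ');
}

// Join conditions with `or`, wrapping each part in parentheses.
function anyOf(...parts: string[]): string {
  return parts.map((p) => `(${p})`).join(' or ');
}

// ((meta.priority > 5) or (meta.urgent = true)) and (type = "article")
const query = allOf(
  anyOf('meta.priority > 5', 'meta.urgent = true'),
  'type = ' + quote('article'),
);
```

Wrapping every part in parentheses is deliberately redundant: it avoids relying on how the parser ranks `and` against `or`.
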
## Data Types

The DSL automatically detects and handles different data types based on the value syntax:

### Strings

Enclosed in double quotes (`"`) or single quotes (`'`):

```
meta.title = "Article Title"
meta.status = 'published'
```

**Escape Sequences**:
- `\"` or `\'` - Quote characters
- `\\` - Backslash
- `\n` - Newline
- `\t` - Tab
- `\r` - Carriage return

Example:
```
meta.content = "String with \"quotes\" and \n newlines"
```

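
The escape sequences listed above can be decoded with a small loop. This is an illustrative sketch, not the parser's actual code: `unescapeLiteral` is a hypothetical name, it expects the string body without its surrounding quotes, and its handling of unknown escapes (keeping the escaped character) is an assumption.

```typescript
// Escape characters from the list above, mapped to the characters they produce.
const ESCAPES = new Map<string, string>([
  ['"', '"'],
  ["'", "'"],
  ['\\', '\\'],
  ['n', '\n'],
  ['t', '\t'],
  ['r', '\r'],
]);

// Decode the body of a string literal (quotes already removed).
function unescapeLiteral(body: string): string {
  let out = '';
  for (let i = 0; i < body.length; i++) {
    const ch = body.charAt(i);
    if (ch !== '\\') {
      out += ch;
      continue;
    }
    if (i + 1 >= body.length) {
      throw new Error(`Dangling backslash at position ${i}`);
    }
    const next = body.charAt(i + 1);
    const mapped = ESCAPES.get(next);
    // Assumption: an unrecognized escape keeps the escaped character as-is.
    out += mapped !== undefined ? mapped : next;
    i++; // Skip the character consumed by the escape.
  }
  return out;
}

// The example literal above decodes to text containing real quotes and a newline.
const decoded = unescapeLiteral('String with \\"quotes\\" and \\n newlines');
```
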
### Numbers

Integer or floating-point numbers, including negative values:

```
meta.priority = 5
meta.score = 8.75
meta.balance = -150.50
```

### Booleans

Case-insensitive `true` or `false`:

```
meta.published = true
meta.archived = false
meta.draft = True
```

### Arrays

Square bracket notation for array values (used with the `in` operator):

```
uri in ["doc-1", "doc-2", "doc-3"]
type in ["article", "blog", "news"]
```

Empty arrays are supported:
```
uri in []
```

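
As a rough illustration of type detection from value syntax, the following sketch classifies a single scalar literal as a string, number, or boolean. `classifyLiteral` and the `Literal` type are hypothetical; the sketch does not handle arrays or escape sequences, and the exact number grammar the parser accepts is an assumption.

```typescript
// Hypothetical result type for a classified scalar literal.
type Literal =
  | { kind: 'string'; value: string }
  | { kind: 'number'; value: number }
  | { kind: 'bool'; value: boolean };

function classifyLiteral(raw: string): Literal {
  const text = raw.trim();
  const first = text.charAt(0);
  const last = text.charAt(text.length - 1);

  // Strings: matching single or double quotes at both ends.
  if (text.length >= 2 && (first === '"' || first === "'") && last === first) {
    // Escape sequences are left untouched in this sketch.
    return { kind: 'string', value: text.slice(1, -1) };
  }
  // Numbers: optional minus sign, digits, optional fractional part.
  if (/^-?\d+(\.\d+)?$/.test(text)) {
    return { kind: 'number', value: Number(text) };
  }
  // Booleans: case-insensitive true/false.
  if (/^(true|false)$/i.test(text)) {
    return { kind: 'bool', value: text.toLowerCase() === 'true' };
  }
  throw new Error(`Unrecognized literal: ${text}`);
}

const balance = classifyLiteral('-150.50'); // classified as a number
const draft = classifyLiteral('True');      // classified as a boolean
```
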
## Examples

### Basic Filtering

**Filter by URI**:
```
uri = "my-document"
```

**Filter by multiple URIs**:
```
uri in ["doc-1", "doc-2", "doc-3"]
```

**Filter by document type**:
```
type = "article"
```

**Filter by metadata**:
```
meta.priority = 5
meta.title = "Important Document"
meta.published = true
```

### Comparison Operations

**Numeric comparisons**:
```
meta.priority > 5
meta.score >= 8.0
meta.count < 100
meta.rating <= 4.5
meta.value != 0
```

**Text operations**:
```
meta.title like "%report%"
meta.category != "archived"
meta.description not like "%draft%"
```

### Logical Combinations

**Simple AND**:
```
type = "article" and meta.published = true
```

**Simple OR**:
```
meta.urgent = true or meta.priority > 8
```

**Mixed field types**:
```
uri in ["doc-1", "doc-2"] and meta.priority >= 5
```

### Complex Queries

**Grouped conditions**:
```
(meta.priority > 5 or meta.urgent = true) and type = "article"
```

**Nested grouping**:
```
((meta.a = 1 and meta.b = 2) or meta.c = 3) and type = "test"
```

**Real-world example**:
```
uri in ["article-1", "article-2"] and (meta.created >= 1640995200 and meta.created < 1672531200 or (type = "urgent" and meta.status != "archived"))
```

### Date/Time Queries

Dates are typically stored as Unix timestamps (numbers), so date ranges are expressed as numeric comparisons:

```
meta.created >= 1640995200
meta.updated > 1672531200 and meta.updated < 1675209600
```

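
Because the DSL has no date functions, timestamps have to be computed before the query string is built. The sketch below assumes timestamps are stored in seconds, which matches the magnitude of the values above (1640995200 is 2022-01-01 00:00:00 UTC); `toUnixSeconds` and `between` are hypothetical helper names.

```typescript
// Convert a Date to a Unix timestamp in whole seconds.
function toUnixSeconds(date: Date): number {
  return Math.floor(date.getTime() / 1000);
}

// Build a half-open range condition [from, to) on a metadata field.
function between(field: string, from: Date, to: Date): string {
  return `meta.${field} >= ${toUnixSeconds(from)} and meta.${field} < ${toUnixSeconds(to)}`;
}

// All documents created during 2022 (UTC):
// meta.created >= 1640995200 and meta.created < 1672531200
const createdIn2022 = between(
  'created',
  new Date(Date.UTC(2022, 0, 1)),
  new Date(Date.UTC(2023, 0, 1)),
);
```

If the stored timestamps are in milliseconds instead, drop the division by 1000.
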
### String Pattern Matching

**Contains pattern**:
```
meta.title like "%search term%"
```

**Starts with pattern**:
```
meta.filename like "report_%"
```

**Ends with pattern**:
```
meta.extension like "%.pdf"
```

**Exclude pattern**:
```
meta.title not like "%draft%"
```

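
To reason about what a `like` pattern will match, it can be useful to translate it into a regular expression. The sketch below assumes standard SQL `LIKE` semantics, where `%` matches any run of characters and `_` matches exactly one character, and it assumes case-sensitive matching. Whether this DSL treats `_` as a wildcard is not stated here; if it does, `"report_%"` also matches names such as `reportXq1`. `likeToRegExp` is a hypothetical helper, not part of the DSL.

```typescript
// Translate a SQL-style LIKE pattern into an anchored regular expression.
function likeToRegExp(pattern: string): RegExp {
  const source = pattern
    .split('')
    .map((ch) => {
      if (ch === '%') return '[\\s\\S]*'; // any run of characters
      if (ch === '_') return '[\\s\\S]';  // exactly one character (SQL semantics)
      // Escape characters that are special in regular expressions.
      return ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    })
    .join('');
  return new RegExp('^' + source + '$');
}

const containsReport = likeToRegExp('%report%').test('annual report 2023'); // true
const endsWithPdf = likeToRegExp('%.pdf').test('summary.pdf');              // true
const dotIsLiteral = likeToRegExp('%.pdf').test('summarypdf');              // false
```
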
## Error Handling

The DSL parser provides detailed error messages for the following error conditions:

### Syntax Errors

**Unterminated string**:
```
meta.title = "unterminated string
// Error: Unterminated string at position X
```

**Missing parentheses**:
```
(meta.a = 1 and meta.b = 2
// Error: Expected RPAREN at position X
```

**Invalid characters**:
```
meta.field = test@invalid
// Error: Unexpected character '@' at position X
```

### Semantic Errors

**Unknown fields**:
```
unknown_field = "value"
// Error: Unknown field 'unknown_field' at position X
```

**Unsupported operators**:
```
uri like "pattern"
// Error: Unsupported operator 'like' for uri field
```

**Type mismatches**: Because the data type is detected from the value syntax, a quoted value is treated as text, and `>` is not a text operator:
```
meta.numeric_field > "string_value"
// Error: Unsupported operator '>' for text field
```

### Validation

- Field names are validated (only `uri`, `type`, and `meta.*` are allowed)
- Operators are validated based on field type and data type
- Syntax is strictly enforced

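
The field-name rule above can be sketched as a small classifier. This is an illustration only: `classifyField` and `FieldRef` are hypothetical names, and the set of characters allowed in a metadata field name (letters, digits, underscore) is an assumption. The check is case-sensitive and rejects nested paths such as `meta.nested.field`, in line with the Limitations section.

```typescript
// Hypothetical result type for a validated field reference.
type FieldRef =
  | { kind: 'uri' }
  | { kind: 'type' }
  | { kind: 'meta'; name: string };

function classifyField(field: string): FieldRef {
  if (field === 'uri') return { kind: 'uri' };
  if (field === 'type') return { kind: 'type' };

  // Assumed grammar: `meta.` followed by one flat identifier.
  const match = /^meta\.([A-Za-z_][A-Za-z0-9_]*)$/.exec(field);
  const name = match ? match[1] : undefined;
  if (name !== undefined) return { kind: 'meta', name };

  throw new Error(`Unknown field '${field}'`);
}

const priority = classifyField('meta.priority'); // a meta field named "priority"
```
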
## API Reference

### Function: `parseDSL(query: string): DocumentSearchOptions`

Parses a DSL query string into a `DocumentSearchOptions` object.

**Parameters**:
- `query` (string): The DSL query string to parse

**Returns**: `DocumentSearchOptions` object with the following structure:

```typescript
type DocumentSearchOptions = {
  uris?: string[]; // Array of URIs to filter by
  types?: string[]; // Array of document types to filter by
  meta?: MetaCondition; // Metadata filtering conditions
  limit?: number; // Result limit (not set by DSL)
  offset?: number; // Result offset (not set by DSL)
};
```

**Throws**: Error with descriptive message if parsing fails

### Types

#### MetaCondition

```typescript
type MetaCondition =
  | { type: 'and'; conditions: MetaCondition[] }
  | { type: 'or'; conditions: MetaCondition[] }
  | MetaFilter;
```

#### MetaFilter

```typescript
type MetaFilter =
  | MetaNumberFilter
  | MetaTextFilter
  | MetaBoolFilter;

type MetaNumberFilter = {
  type: 'number';
  field: string;
  filter: {
    gt?: number; // Greater than
    gte?: number; // Greater than or equal
    lt?: number; // Less than
    lte?: number; // Less than or equal
    eq?: number; // Equal
    neq?: number; // Not equal
    nill?: boolean; // Is null/undefined
  };
};

type MetaTextFilter = {
  type: 'text';
  field: string;
  filter: {
    eq?: string; // Equal
    neq?: string; // Not equal
    like?: string; // Pattern match
    nlike?: string; // Negative pattern match
    nill?: boolean; // Is null/undefined
  };
};

type MetaBoolFilter = {
  type: 'bool';
  field: string;
  filter: {
    eq: boolean; // Equal (required)
    nill?: boolean; // Is null/undefined
  };
};
```

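
To show how the recursive `MetaCondition` tree can be interpreted, here is a minimal in-memory evaluator. It is a sketch under several assumptions: the types are trimmed copies of those above (`like`, `nlike`, and `nill` are omitted), multiple keys within one filter are treated as combined with AND, and `matches` is a hypothetical function rather than how the search backend actually executes queries.

```typescript
// Trimmed, self-contained copies of the documented types.
type Filter =
  | { type: 'number'; field: string; filter: { gt?: number; gte?: number; lt?: number; lte?: number; eq?: number; neq?: number } }
  | { type: 'text'; field: string; filter: { eq?: string; neq?: string } }
  | { type: 'bool'; field: string; filter: { eq: boolean } };

type Condition =
  | { type: 'and'; conditions: Condition[] }
  | { type: 'or'; conditions: Condition[] }
  | Filter;

// Evaluate a condition tree against one document's metadata.
function matches(cond: Condition, meta: Record<string, unknown>): boolean {
  switch (cond.type) {
    case 'and':
      return cond.conditions.every((c) => matches(c, meta));
    case 'or':
      return cond.conditions.some((c) => matches(c, meta));
    case 'number': {
      const v = meta[cond.field];
      if (typeof v !== 'number') return false;
      const f = cond.filter;
      return (f.eq === undefined || v === f.eq)
        && (f.neq === undefined || v !== f.neq)
        && (f.gt === undefined || v > f.gt)
        && (f.gte === undefined || v >= f.gte)
        && (f.lt === undefined || v < f.lt)
        && (f.lte === undefined || v <= f.lte);
    }
    case 'text': {
      const v = meta[cond.field];
      if (typeof v !== 'string') return false;
      const f = cond.filter;
      return (f.eq === undefined || v === f.eq)
        && (f.neq === undefined || v !== f.neq);
    }
    case 'bool':
      return meta[cond.field] === cond.filter.eq;
  }
}

// Tree for: meta.priority > 5 or meta.urgent = true
const urgentOrHighPriority: Condition = {
  type: 'or',
  conditions: [
    { type: 'number', field: 'priority', filter: { gt: 5 } },
    { type: 'bool', field: 'urgent', filter: { eq: true } },
  ],
};

const hit = matches(urgentOrHighPriority, { priority: 3, urgent: true });   // true
const miss = matches(urgentOrHighPriority, { priority: 3, urgent: false }); // false
```
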
### Usage Example

```typescript
import { parseDSL } from './documents.dsl';

// Parse a query
const options = parseDSL('uri in ["doc-1", "doc-2"] and meta.priority >= 5');

// Result:
// {
//   uris: ['doc-1', 'doc-2'],
//   meta: {
//     type: 'number',
//     field: 'priority',
//     filter: { gte: 5 }
//   }
// }

// Use with document search (`searchDocuments` is the application's own
// search function; it is not exported by the DSL module)
const results = await searchDocuments(options);
```

## Best Practices

1. **Use quotes for strings**: Always wrap string values in quotes to avoid parsing issues
2. **Group complex conditions**: Use parentheses to make complex boolean logic clear
3. **Choose appropriate operators**: Use `like` for pattern matching, `in` for multiple values
4. **Consider performance**: Simpler queries with fewer conditions perform better
5. **Handle errors gracefully**: Wrap DSL parsing in try-catch blocks in production code

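
The last point above can be sketched as a small wrapper that turns a thrown parse error into a value the caller can inspect. `tryParse` and `ParseResult` are hypothetical names, and `fakeParse` is only a stand-in so the example is self-contained; in real code you would pass `parseDSL` instead.

```typescript
// Hypothetical result type: either the parsed value or an error message.
type ParseResult<T> =
  | { ok: true; value: T }
  | { ok: false; message: string };

// Run any parser function and capture a thrown error as a result value.
function tryParse<T>(parse: (query: string) => T, query: string): ParseResult<T> {
  try {
    return { ok: true, value: parse(query) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, message };
  }
}

// Stand-in parser for this example only: rejects an unclosed parenthesis.
function fakeParse(query: string): { query: string } {
  if (query.indexOf('(') !== -1 && query.indexOf(')') === -1) {
    throw new Error('Expected RPAREN');
  }
  return { query };
}

const good = tryParse(fakeParse, 'type = "article"');
const bad = tryParse(fakeParse, '(meta.a = 1 and meta.b = 2');
// good.ok is true; bad.ok is false with message "Expected RPAREN"
```

Returning a result object keeps the error path explicit at the call site, which is convenient when the query string comes from user input.
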
## Limitations

1. **Field restrictions**: Only `uri`, `type`, and `meta.*` fields are supported
2. **Operator compatibility**: Not all operators work with all data types
3. **No nested metadata**: Metadata fields must be flat (no `meta.nested.field`)
4. **Case sensitivity**: Field names and operators are case-sensitive
5. **No functions**: No support for functions like `date()`, `count()`, etc.