add DSL documentation
470
docs/query-dsl.md
Normal file
@@ -0,0 +1,470 @@
# Query DSL Documentation

The Query DSL (Domain Specific Language) provides a powerful and flexible way to query documents in the system. It allows you to construct complex queries using a SQL-like syntax with support for filtering by URI, document type, and metadata fields.

## Table of Contents

- [Overview](#overview)
- [Syntax Overview](#syntax-overview)
- [Field Types](#field-types)
- [Operators](#operators)
- [Logical Operations](#logical-operations)
- [Data Types](#data-types)
- [Examples](#examples)
- [Error Handling](#error-handling)
- [API Reference](#api-reference)
- [Best Practices](#best-practices)
- [Limitations](#limitations)

## Overview

The Query DSL parses human-readable query strings and converts them into `DocumentSearchOptions` objects that can be used to search documents. It supports:

- **URI filtering**: Filter documents by their URIs (Uniform Resource Identifiers)
- **Type filtering**: Filter documents by their type
- **Metadata filtering**: Filter documents by their metadata fields with type-aware operations
- **Logical operations**: Combine conditions using AND/OR logic
- **Parenthetical grouping**: Group conditions for complex boolean logic
- **Multiple data types**: String, number, and boolean values with appropriate operators

## Syntax Overview

The basic syntax follows this pattern:

```
field operator value [logical_operator field operator value...]
```

### Basic Examples

```
uri = "doc-123"
type = "article"
meta.priority = 5
meta.title like "%search%"
```

### Complex Examples

```
uri in ["doc-1", "doc-2"] and meta.priority >= 5
(meta.published = true or meta.draft = false) and type = "article"
```

## Field Types

### 1. URI Fields

Filter documents by their URI (Uniform Resource Identifier).

**Syntax**: `uri`

**Supported Operations**:
- Equality: `uri = "document-id"`
- Array membership: `uri in ["doc-1", "doc-2", "doc-3"]`

### 2. Type Fields

Filter documents by their type.

**Syntax**: `type`

**Supported Operations**:
- Equality: `type = "article"`
- Array membership: `type in ["article", "blog", "news"]`

### 3. Metadata Fields

Filter documents by their metadata fields. Metadata fields are accessed using dot notation.

**Syntax**: `meta.fieldName`

**Examples**:
- `meta.title`
- `meta.priority`
- `meta.published`
- `meta.created_at`

## Operators

### Comparison Operators

| Operator | Description | Example |
|----------|-------------|---------|
| `=` | Equals | `meta.priority = 5` |
| `!=` | Not equals | `meta.status != "archived"` |
| `>` | Greater than | `meta.score > 8.5` |
| `>=` | Greater than or equal | `meta.priority >= 5` |
| `<` | Less than | `meta.age < 30` |
| `<=` | Less than or equal | `meta.count <= 100` |
| `like` | Pattern matching (SQL-like) | `meta.title like "%test%"` |
| `not like` | Negative pattern matching | `meta.title not like "%draft%"` |
| `in` | Array membership | `uri in ["a", "b", "c"]` |

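
As a rough illustration, the comparison operators line up with the filter keys listed under API Reference (`eq`, `neq`, `gt`, and so on). The sketch below expresses that as a lookup table. `OPERATOR_TO_FILTER_KEY` and `filterKeyFor` are hypothetical names, not part of the documented API; only the `>=` to `gte` pairing is confirmed by the usage example, and the remaining pairs are inferred from the key names.

```typescript
// Hypothetical lookup from DSL comparison operators to MetaFilter keys.
// Only `>=` -> `gte` is confirmed by the usage example; the rest are inferred.
type FilterKey = 'eq' | 'neq' | 'gt' | 'gte' | 'lt' | 'lte' | 'like' | 'nlike';

const OPERATOR_TO_FILTER_KEY = new Map<string, FilterKey>([
  ['=', 'eq'],
  ['!=', 'neq'],
  ['>', 'gt'],
  ['>=', 'gte'],
  ['<', 'lt'],
  ['<=', 'lte'],
  ['like', 'like'],
  ['not like', 'nlike'],
]);

// Returns the filter key for an operator, or undefined for operators
// (such as `in`) that have no MetaFilter counterpart.
function filterKeyFor(operator: string): FilterKey | undefined {
  return OPERATOR_TO_FILTER_KEY.get(operator);
}
```

The `in` operator is deliberately absent from the table: in this document it appears only with `uri` and `type`, which populate the `uris` and `types` arrays rather than a metadata filter.
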

### Logical Operators

| Operator | Syntax | Description | Example |
|----------|---------|-------------|---------|
| AND | `and` or `&` | Logical AND | `meta.a = 1 and meta.b = 2` |
| OR | `or` or `\|` | Logical OR | `meta.urgent = true or meta.priority > 8` |

## Logical Operations

### AND Operations

Combine conditions where **both** must be true:

```
meta.published = true and meta.priority >= 5
meta.created >= 1234567890 & meta.updated < 1234567999
```

### OR Operations

Combine conditions where **either** can be true:

```
type = "article" or type = "blog"
meta.urgent = true | meta.priority > 9
```

### Parenthetical Grouping

Use parentheses to control the order of operations:

```
(meta.priority > 5 or meta.urgent = true) and type = "article"
((meta.a = 1 and meta.b = 2) or meta.c = 3) and uri = "test"
```

## Data Types

The DSL automatically detects and handles different data types based on the value syntax:

### Strings

Enclosed in double quotes (`"`) or single quotes (`'`):

```
meta.title = "Article Title"
meta.status = 'published'
```

**Escape Sequences**:
- `\"` or `\'` - Quote characters
- `\\` - Backslash
- `\n` - Newline
- `\t` - Tab
- `\r` - Carriage return

Example:
```
meta.content = "String with \"quotes\" and \n newlines"
```
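
To make the escape rules concrete, here is a minimal sketch of how the body of a string literal (the text between the quotes) could be decoded. `unescapeDslString` is a hypothetical helper, not the parser's actual code, and the handling of unlisted escapes (keeping the escaped character unchanged) is an assumption.

```typescript
// Hypothetical helper: decodes the escape sequences listed above.
// Assumption: an unlisted escape such as `\x` yields the character itself.
const NAMED_ESCAPES = new Map<string, string>([
  ['n', '\n'],
  ['t', '\t'],
  ['r', '\r'],
]);

function unescapeDslString(body: string): string {
  let out = '';
  for (let i = 0; i < body.length; i++) {
    const ch = body.charAt(i);
    if (ch === '\\' && i + 1 < body.length) {
      const next = body.charAt(i + 1);
      // \n, \t, \r map to control characters; \", \', \\ map to themselves.
      out += NAMED_ESCAPES.get(next) ?? next;
      i++; // skip the escaped character
    } else {
      out += ch;
    }
  }
  return out;
}
```
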

### Numbers

Integer or floating-point numbers, including negative values:

```
meta.priority = 5
meta.score = 8.75
meta.balance = -150.50
```

### Booleans

Case-insensitive `true` or `false`:

```
meta.published = true
meta.archived = false
meta.draft = True
```
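
The scalar literal forms covered so far (strings, numbers, booleans) can be told apart from their syntax alone. The sketch below shows one way to do that. `classifyLiteral` is a hypothetical name, and the number grammar (optional minus sign, digits, optional fraction, no exponent) is an assumption based only on the examples above; a real tokenizer would also need to handle escaped quotes inside strings.

```typescript
type LiteralKind = 'string' | 'number' | 'boolean' | 'unknown';

// Hypothetical classifier for a single value token.
function classifyLiteral(token: string): LiteralKind {
  const first = token.charAt(0);
  const isQuoted =
    token.length >= 2 &&
    (first === '"' || first === "'") &&
    token.endsWith(first); // opening and closing quote must match
  if (isQuoted) return 'string';
  if (/^(true|false)$/i.test(token)) return 'boolean'; // case-insensitive
  if (/^-?\d+(\.\d+)?$/.test(token)) return 'number'; // assumed grammar
  return 'unknown';
}
```
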

### Arrays

Square bracket notation for array values (used with `in` operator):

```
uri in ["doc-1", "doc-2", "doc-3"]
type in ["article", "blog", "news"]
```

Empty arrays are supported:
```
uri in []
```
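
When the list of values comes from application code, the `in` clause has to be assembled as a string. The following sketch shows one way to do that while escaping the characters covered under Strings. `quoteDslString` and `buildInClause` are hypothetical helpers and are not part of the DSL module.

```typescript
// Hypothetical helper: wraps a value in double quotes, escaping the
// characters that the Strings section lists as escape sequences.
function quoteDslString(value: string): string {
  const escaped = value
    .replace(/\\/g, '\\\\') // backslash first, so later escapes are not doubled
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n')
    .replace(/\t/g, '\\t')
    .replace(/\r/g, '\\r');
  return `"${escaped}"`;
}

// Hypothetical helper: builds `field in [...]` for the uri or type field.
function buildInClause(field: 'uri' | 'type', values: readonly string[]): string {
  const items = values.map((v) => quoteDslString(v)).join(', ');
  return `${field} in [${items}]`;
}
```

For example, `buildInClause('uri', ['doc-1', 'doc-2'])` produces `uri in ["doc-1", "doc-2"]`, and an empty list produces `uri in []`.
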

## Examples

### Basic Filtering

**Filter by URI**:
```
uri = "my-document"
```

**Filter by multiple URIs**:
```
uri in ["doc-1", "doc-2", "doc-3"]
```

**Filter by document type**:
```
type = "article"
```

**Filter by metadata**:
```
meta.priority = 5
meta.title = "Important Document"
meta.published = true
```

### Comparison Operations

**Numeric comparisons**:
```
meta.priority > 5
meta.score >= 8.0
meta.count < 100
meta.rating <= 4.5
meta.value != 0
```

**Text operations**:
```
meta.title like "%report%"
meta.category != "archived"
meta.description not like "%draft%"
```

### Logical Combinations

**Simple AND**:
```
type = "article" and meta.published = true
```

**Simple OR**:
```
meta.urgent = true or meta.priority > 8
```

**Mixed field types**:
```
uri in ["doc-1", "doc-2"] and meta.priority >= 5
```

### Complex Queries

**Grouped conditions**:
```
(meta.priority > 5 or meta.urgent = true) and type = "article"
```

**Nested grouping**:
```
((meta.a = 1 and meta.b = 2) or meta.c = 3) and type = "test"
```

**Real-world example**:
```
uri in ["article-1", "article-2"] and (meta.created >= 1640995200 and meta.created < 1672531200 or (type = "urgent" and meta.status != "archived"))
```

### Date/Time Queries

Since dates are typically stored as Unix timestamps (numbers):

```
meta.created >= 1640995200
meta.updated > 1672531200 and meta.updated < 1675209600
```
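
The timestamps in these examples correspond to whole seconds since the Unix epoch (1640995200 is 2022-01-01T00:00:00Z). Assuming your metadata uses the same unit, a range query can be built from `Date` objects as sketched below. `toUnixSeconds` and `timestampRange` are hypothetical helpers; if your timestamps are stored in milliseconds, drop the division.

```typescript
// Converts a Date to whole seconds since the Unix epoch.
function toUnixSeconds(date: Date): number {
  return Math.floor(date.getTime() / 1000);
}

// Builds a half-open range condition: from <= meta.<field> < to.
function timestampRange(field: string, from: Date, to: Date): string {
  return `meta.${field} >= ${toUnixSeconds(from)} and meta.${field} < ${toUnixSeconds(to)}`;
}

// All documents created during 2022 (UTC).
const createdIn2022 = timestampRange(
  'created',
  new Date('2022-01-01T00:00:00Z'),
  new Date('2023-01-01T00:00:00Z'),
);
// createdIn2022 === 'meta.created >= 1640995200 and meta.created < 1672531200'
```
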

### String Pattern Matching

**Contains pattern**:
```
meta.title like "%search term%"
```

**Starts with pattern**:
```
meta.filename like "report_%"
```

**Ends with pattern**:
```
meta.extension like "%.pdf"
```

**Exclude pattern**:
```
meta.title not like "%draft%"
```
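
For intuition about what these patterns match, the sketch below translates a pattern into a regular expression. It assumes standard SQL `LIKE` semantics (`%` matches any run of characters, `_` matches exactly one character) and case-sensitive matching; this document only describes the operator as "SQL-like", so the actual behaviour depends on the search backend. `likeToRegExp` is a hypothetical helper.

```typescript
// Hypothetical helper: converts a LIKE pattern to an anchored RegExp.
// Assumptions: `%` = any run of characters, `_` = exactly one character.
function likeToRegExp(pattern: string): RegExp {
  let source = '';
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i);
    if (ch === '%') {
      source += '[\\s\\S]*'; // any characters, including newlines
    } else if (ch === '_') {
      source += '[\\s\\S]'; // exactly one character
    } else {
      // Escape regex metacharacters so they match literally.
      source += ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    }
  }
  return new RegExp(`^${source}$`);
}

const endsWithPdf = likeToRegExp('%.pdf');
// endsWithPdf.test('report.pdf') is true; endsWithPdf.test('reportxpdf') is false
```

Under these assumptions, the `_` in `"report_%"` would also match a character other than an underscore (for example `reportX2023`). If the DSL treats `_` literally, remove that branch.
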

## Error Handling

The DSL parser provides detailed error messages for various error conditions:

### Syntax Errors

**Unterminated string**:
```
meta.title = "unterminated string
// Error: Unterminated string at position X
```

**Missing parentheses**:
```
(meta.a = 1 and meta.b = 2
// Error: Expected RPAREN at position X
```

**Invalid characters**:
```
meta.field = test@invalid
// Error: Unexpected character '@' at position X
```

### Semantic Errors

**Unknown fields**:
```
unknown_field = "value"
// Error: Unknown field 'unknown_field' at position X
```

**Unsupported operators**:
```
uri like "pattern"
// Error: Unsupported operator 'like' for uri field
```

**Type mismatches**:
```
meta.numeric_field > "string_value"
// Error: Unsupported operator '>' for text field
```

### Validation

- Field names are validated (only `uri`, `type`, and `meta.*` are allowed)
- Operators are validated based on field type and data type
- Syntax is strictly enforced

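
The field-name rule above can be sketched as a small classifier. This is an illustration only: `classifyField` and `FieldRef` are hypothetical names, the set of characters allowed in a metadata field name is an assumption, and the error text omits the position that the real parser reports.

```typescript
type FieldRef =
  | { kind: 'uri' }
  | { kind: 'type' }
  | { kind: 'meta'; name: string };

// Hypothetical validator: accepts `uri`, `type`, and flat `meta.<name>` fields.
function classifyField(field: string): FieldRef {
  if (field === 'uri') return { kind: 'uri' };
  if (field === 'type') return { kind: 'type' };
  // Assumed identifier grammar; nested names like meta.a.b are rejected.
  const name = /^meta\.([A-Za-z_][A-Za-z0-9_]*)$/.exec(field)?.[1];
  if (name !== undefined) return { kind: 'meta', name };
  throw new Error(`Unknown field '${field}'`);
}
```

With this sketch, `classifyField('meta.priority')` yields `{ kind: 'meta', name: 'priority' }`, while `unknown_field` and `meta.nested.field` both throw.
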

## API Reference

### Function: `parseDSL(query: string): DocumentSearchOptions`

Parses a DSL query string into a `DocumentSearchOptions` object.

**Parameters**:
- `query` (string): The DSL query string to parse

**Returns**: `DocumentSearchOptions` object with the following structure:

```typescript
type DocumentSearchOptions = {
  uris?: string[]; // Array of URIs to filter by
  types?: string[]; // Array of document types to filter by
  meta?: MetaCondition; // Metadata filtering conditions
  limit?: number; // Result limit (not set by DSL)
  offset?: number; // Result offset (not set by DSL)
};
```

**Throws**: Error with descriptive message if parsing fails

### Types

#### MetaCondition

```typescript
type MetaCondition =
  | { type: 'and'; conditions: MetaCondition[] }
  | { type: 'or'; conditions: MetaCondition[] }
  | MetaFilter;
```

#### MetaFilter

```typescript
type MetaFilter =
  | MetaNumberFilter
  | MetaTextFilter
  | MetaBoolFilter;

type MetaNumberFilter = {
  type: 'number';
  field: string;
  filter: {
    gt?: number; // Greater than
    gte?: number; // Greater than or equal
    lt?: number; // Less than
    lte?: number; // Less than or equal
    eq?: number; // Equal
    neq?: number; // Not equal
    nill?: boolean; // Is null/undefined
  };
};

type MetaTextFilter = {
  type: 'text';
  field: string;
  filter: {
    eq?: string; // Equal
    neq?: string; // Not equal
    like?: string; // Pattern match
    nlike?: string; // Negative pattern match
    nill?: boolean; // Is null/undefined
  };
};

type MetaBoolFilter = {
  type: 'bool';
  field: string;
  filter: {
    eq: boolean; // Equal (required)
    nill?: boolean; // Is null/undefined
  };
};
```
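
To show how a `MetaCondition` tree composes, here is a minimal in-memory evaluator over simplified copies of the types above. It is a sketch, not how the search backend works: `matches` is a hypothetical function, the `like`, `nlike`, and `nill` keys are omitted because their exact semantics are not specified here, and treating a missing or wrongly typed field as "no match" is an assumption. The tree is hand-built for `(meta.a = 1 and meta.b = 2) or meta.c = 3`; the parser's actual output for that query may be shaped differently.

```typescript
// Simplified copies of the documented types (no like/nlike/nill).
type NumFilter = { type: 'number'; field: string;
  filter: { gt?: number; gte?: number; lt?: number; lte?: number; eq?: number; neq?: number } };
type TextFilter = { type: 'text'; field: string; filter: { eq?: string; neq?: string } };
type BoolFilter = { type: 'bool'; field: string; filter: { eq: boolean } };
type Condition =
  | { type: 'and'; conditions: Condition[] }
  | { type: 'or'; conditions: Condition[] }
  | NumFilter | TextFilter | BoolFilter;

// Hypothetical evaluator: does this metadata record satisfy the condition?
function matches(condition: Condition, meta: Record<string, unknown>): boolean {
  switch (condition.type) {
    case 'and':
      return condition.conditions.every((c) => matches(c, meta));
    case 'or':
      return condition.conditions.some((c) => matches(c, meta));
    case 'number': {
      const value = meta[condition.field];
      if (typeof value !== 'number') return false; // assumption: no match
      const f = condition.filter;
      return (f.gt === undefined || value > f.gt)
        && (f.gte === undefined || value >= f.gte)
        && (f.lt === undefined || value < f.lt)
        && (f.lte === undefined || value <= f.lte)
        && (f.eq === undefined || value === f.eq)
        && (f.neq === undefined || value !== f.neq);
    }
    case 'text': {
      const value = meta[condition.field];
      if (typeof value !== 'string') return false; // assumption: no match
      const f = condition.filter;
      return (f.eq === undefined || value === f.eq)
        && (f.neq === undefined || value !== f.neq);
    }
    case 'bool':
      return meta[condition.field] === condition.filter.eq;
  }
}

// Hand-built tree for: (meta.a = 1 and meta.b = 2) or meta.c = 3
const tree: Condition = {
  type: 'or',
  conditions: [
    { type: 'and', conditions: [
      { type: 'number', field: 'a', filter: { eq: 1 } },
      { type: 'number', field: 'b', filter: { eq: 2 } },
    ] },
    { type: 'number', field: 'c', filter: { eq: 3 } },
  ],
};
```
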

### Usage Example

```typescript
import { parseDSL } from './documents.dsl';

// Parse a query
const options = parseDSL('uri in ["doc-1", "doc-2"] and meta.priority >= 5');

// Result:
// {
//   uris: ['doc-1', 'doc-2'],
//   meta: {
//     type: 'number',
//     field: 'priority',
//     filter: { gte: 5 }
//   }
// }

// Use with document search
// (searchDocuments is not defined in this snippet; it stands in for the application's search function)
const results = await searchDocuments(options);
```

## Best Practices

1. **Use quotes for strings**: Always wrap string values in quotes to avoid parsing issues
2. **Group complex conditions**: Use parentheses to make complex boolean logic clear
3. **Choose appropriate operators**: Use `like` for pattern matching, `in` for multiple values
4. **Consider performance**: Simpler queries with fewer conditions perform better
5. **Handle errors gracefully**: Wrap DSL parsing in try-catch blocks in production code

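
The last point can be sketched as a small wrapper that turns a thrown parse error into a result value. `tryParse` and `ParseResult` are hypothetical names. The parser is passed in as an argument so the sketch stays self-contained; in application code you would pass `parseDSL`.

```typescript
type ParseResult<T> =
  | { ok: true; value: T }
  | { ok: false; message: string };

// Hypothetical wrapper: runs a parser and captures any thrown error.
function tryParse<T>(parse: (query: string) => T, query: string): ParseResult<T> {
  try {
    return { ok: true, value: parse(query) };
  } catch (err) {
    // parseDSL is documented to throw an Error with a descriptive message.
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, message };
  }
}

// Intended usage (parseDSL imported as in the Usage Example):
//   const result = tryParse(parseDSL, userInput);
//   if (!result.ok) showError(result.message); // showError is hypothetical
```
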

## Limitations

1. **Field restrictions**: Only `uri`, `type`, and `meta.*` fields are supported
2. **Operator compatibility**: Not all operators work with all data types
3. **No nested metadata**: Metadata fields must be flat (no `meta.nested.field`)
4. **Case sensitivity**: Field names and operators are case-sensitive; boolean literals (`true`/`false`) are the exception
5. **No functions**: No support for functions like `date()`, `count()`, etc.