Query DSL Documentation
The Query DSL (Domain-Specific Language) lets you query documents in the system with a SQL-like syntax. Queries can filter by URI, document type, and metadata fields, and conditions can be combined into complex expressions.
Table of Contents
- Overview
- Syntax Overview
- Field Types
- Operators
- Logical Operations
- Data Types
- Examples
- Error Handling
- API Reference
Overview
The Query DSL parses human-readable query strings and converts them into DocumentSearchOptions objects that can be used to search documents. It supports:
- URI filtering: Filter documents by their URIs (uniform resource identifiers)
- Type filtering: Filter documents by their type
- Metadata filtering: Filter documents by their metadata fields with type-aware operations
- Logical operations: Combine conditions using AND/OR logic
- Parenthetical grouping: Group conditions for complex boolean logic
- Multiple data types: String, number, boolean values with appropriate operators
Syntax Overview
The basic syntax follows this pattern:
field operator value [logical_operator field operator value...]
Basic Examples
uri = "doc-123"
type = "article"
meta.priority = 5
meta.title like "%search%"
Complex Examples
uri in ["doc-1", "doc-2"] and meta.priority >= 5
(meta.published = true or meta.draft = false) and type = "article"
Field Types
1. URI Fields
Filter documents by their URI (Uniform Resource Identifier).
Syntax: uri
Supported Operations:
- Equality: uri = "document-id"
- Array membership: uri in ["doc-1", "doc-2", "doc-3"]
2. Type Fields
Filter documents by their type.
Syntax: type
Supported Operations:
- Equality: type = "article"
- Array membership: type in ["article", "blog", "news"]
3. Metadata Fields
Filter documents by their metadata fields. Metadata fields are accessed using dot notation.
Syntax: meta.fieldName
Examples:
- meta.title
- meta.priority
- meta.published
- meta.created_at
Operators
Comparison Operators
| Operator | Description | Example |
|---|---|---|
| = | Equals | meta.priority = 5 |
| != | Not equals | meta.status != "archived" |
| > | Greater than | meta.score > 8.5 |
| >= | Greater than or equal | meta.priority >= 5 |
| < | Less than | meta.age < 30 |
| <= | Less than or equal | meta.count <= 100 |
| like | Pattern matching (SQL-like) | meta.title like "%test%" |
| not like | Negative pattern matching | meta.title not like "%draft%" |
| in | Array membership | uri in ["a", "b", "c"] |
Logical Operators
| Operator | Syntax | Description | Example |
|---|---|---|---|
| AND | and, & | Logical AND | meta.a = 1 and meta.b = 2 |
| OR | or, \| | Logical OR | meta.urgent = true or meta.priority > 8 |
Logical Operations
AND Operations
Combine conditions where both must be true:
meta.published = true and meta.priority >= 5
meta.created >= 1234567890 & meta.updated < 1234567999
OR Operations
Combine conditions where either can be true:
type = "article" or type = "blog"
meta.urgent = true | meta.priority > 9
Parenthetical Grouping
Use parentheses to control the order of operations:
(meta.priority > 5 or meta.urgent = true) and type = "article"
((meta.a = 1 and meta.b = 2) or meta.c = 3) and uri = "test"
Data Types
The DSL automatically detects and handles different data types based on the value syntax:
Strings
Enclosed in double quotes (") or single quotes ('):
meta.title = "Article Title"
meta.status = 'published'
Escape Sequences:
- \" or \' - Quote characters
- \\ - Backslash
- \n - Newline
- \t - Tab
- \r - Carriage return
Example:
meta.content = "String with \"quotes\" and \n newlines"
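The escape rules above can be sketched as a pair of helpers: one that decodes the body of a quoted string, and one that builds a double-quoted literal from a raw value. Both function names are hypothetical and not part of the documented API, and rejecting unknown escapes is an assumption; the real parser may behave differently.

```typescript
// Hypothetical helpers, not part of the documented API.
// Maps the character after a backslash to the character it stands for.
const ESCAPES = new Map<string, string>([
  ['"', '"'],
  ["'", "'"],
  ['\\', '\\'],
  ['n', '\n'],
  ['t', '\t'],
  ['r', '\r'],
]);

// Decode the body of a quoted DSL string (without the surrounding quotes).
function unescapeDslString(body: string): string {
  let out = '';
  for (let i = 0; i < body.length; i++) {
    const ch = body[i];
    if (ch !== '\\') {
      out += ch;
      continue;
    }
    const replacement = ESCAPES.get(body.charAt(i + 1));
    if (replacement === undefined) {
      // Assumption: unknown or dangling escapes are rejected.
      throw new Error(`Invalid escape sequence at position ${i}`);
    }
    out += replacement;
    i++; // skip the escaped character
  }
  return out;
}

// Build a double-quoted DSL string literal from a raw value.
function quoteDslString(value: string): string {
  const escaped = value
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n')
    .replace(/\t/g, '\\t')
    .replace(/\r/g, '\\r');
  return `"${escaped}"`;
}
```

A helper like quoteDslString is useful when query strings are assembled from user input, since it keeps embedded quotes from terminating the literal early.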
Numbers
Integer or floating-point numbers, including negative values:
meta.priority = 5
meta.score = 8.75
meta.balance = -150.50
Booleans
Case-insensitive true or false:
meta.published = true
meta.archived = false
meta.draft = True
Arrays
Square bracket notation for array values (used with the in operator):
uri in ["doc-1", "doc-2", "doc-3"]
type in ["article", "blog", "news"]
Empty arrays are supported:
uri in []
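As a rough illustration of the type detection described in this section, the sketch below classifies a single scalar value token as a string, number, or boolean. The classifyLiteral function and the Literal type are hypothetical names, arrays and escape sequences are left out for brevity, and the actual parser may apply different rules.

```typescript
// Hypothetical result type for a classified scalar value.
type Literal =
  | { kind: 'string'; value: string }
  | { kind: 'number'; value: number }
  | { kind: 'boolean'; value: boolean };

// Classify one value token by its syntax alone.
function classifyLiteral(token: string): Literal {
  const first = token.charAt(0);
  const last = token.charAt(token.length - 1);
  if (token.length >= 2 && (first === '"' || first === "'") && last === first) {
    // Quoted text is a string; escape sequences are not decoded in this sketch.
    return { kind: 'string', value: token.slice(1, -1) };
  }
  if (/^-?\d+(\.\d+)?$/.test(token)) {
    // Integers and decimals, with an optional leading minus sign.
    return { kind: 'number', value: Number(token) };
  }
  const lower = token.toLowerCase();
  if (lower === 'true' || lower === 'false') {
    // Booleans are matched case-insensitively.
    return { kind: 'boolean', value: lower === 'true' };
  }
  throw new Error(`Unrecognized value: ${token}`);
}
```

Checking quotes first means that a value such as "5" stays a string, which matters because the operators available for a metadata field depend on the detected type.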
Examples
Basic Filtering
Filter by URI:
uri = "my-document"
Filter by multiple URIs:
uri in ["doc-1", "doc-2", "doc-3"]
Filter by document type:
type = "article"
Filter by metadata:
meta.priority = 5
meta.title = "Important Document"
meta.published = true
Comparison Operations
Numeric comparisons:
meta.priority > 5
meta.score >= 8.0
meta.count < 100
meta.rating <= 4.5
meta.value != 0
Text operations:
meta.title like "%report%"
meta.category != "archived"
meta.description not like "%draft%"
Logical Combinations
Simple AND:
type = "article" and meta.published = true
Simple OR:
meta.urgent = true or meta.priority > 8
Mixed field types:
uri in ["doc-1", "doc-2"] and meta.priority >= 5
Complex Queries
Grouped conditions:
(meta.priority > 5 or meta.urgent = true) and type = "article"
Nested grouping:
((meta.a = 1 and meta.b = 2) or meta.c = 3) and type = "test"
Real-world example:
uri in ["article-1", "article-2"] and (meta.created >= 1640995200 and meta.created < 1672531200 or (type = "urgent" and meta.status != "archived"))
Date/Time Queries
Dates are typically stored as Unix timestamps (numbers), so date filters are written as numeric comparisons:
meta.created >= 1640995200
meta.updated > 1672531200 and meta.updated < 1675209600
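Because the DSL has no date functions, timestamps have to be computed before the query string is built. The sketch below shows one way to do that; toUnixSeconds and rangeQuery are hypothetical helpers, and it assumes timestamps are stored in seconds, as the 10-digit values in the examples suggest.

```typescript
// Convert a Date to Unix time in whole seconds (assumed storage unit).
function toUnixSeconds(date: Date): number {
  return Math.floor(date.getTime() / 1000);
}

// Build a half-open range condition: start <= meta.<field> < end.
function rangeQuery(field: string, start: Date, end: Date): string {
  const from = toUnixSeconds(start);
  const to = toUnixSeconds(end);
  return `meta.${field} >= ${from} and meta.${field} < ${to}`;
}

// All of 2022, in UTC.
const query = rangeQuery(
  'created',
  new Date(Date.UTC(2022, 0, 1)),
  new Date(Date.UTC(2023, 0, 1)),
);
// query === 'meta.created >= 1640995200 and meta.created < 1672531200'
```

Using a half-open range (inclusive start, exclusive end) lets adjacent ranges be queried without counting a boundary timestamp twice.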
String Pattern Matching
Contains pattern:
meta.title like "%search term%"
Starts with pattern:
meta.filename like "report_%"
Ends with pattern:
meta.extension like "%.pdf"
Exclude pattern:
meta.title not like "%draft%"
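To make the matching behavior of these patterns concrete, the sketch below translates a pattern into an anchored regular expression in which % matches any run of characters. It is only an illustration of the semantics, not how the search backend evaluates like. The likeToRegExp name is hypothetical, and two points are assumptions: matching is case-sensitive, and _ is treated as a literal character (in SQL it matches any single character, and this document does not say which applies here).

```typescript
// Translate a like pattern into an anchored RegExp.
// Assumption: % is the only wildcard; every other character is literal.
function likeToRegExp(pattern: string): RegExp {
  const source = pattern
    .split('%')
    // Escape characters that have a special meaning in regular expressions.
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    // Each % becomes "any run of characters, including none".
    .join('[\\s\\S]*');
  return new RegExp(`^${source}$`);
}

const endsWithPdf = likeToRegExp('%.pdf');
// endsWithPdf.test('summary.pdf') === true
// endsWithPdf.test('summary.pdf.bak') === false
```

The anchors explain why a pattern without a leading % only matches values that start with the given text.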
Error Handling
The DSL parser provides detailed error messages for various error conditions:
Syntax Errors
Unterminated string:
meta.title = "unterminated string
// Error: Unterminated string at position X
Missing parentheses:
(meta.a = 1 and meta.b = 2
// Error: Expected RPAREN at position X
Invalid characters:
meta.field = test@invalid
// Error: Unexpected character '@' at position X
Semantic Errors
Unknown fields:
unknown_field = "value"
// Error: Unknown field 'unknown_field' at position X
Unsupported operators:
uri like "pattern"
// Error: Unsupported operator 'like' for uri field
Type mismatches:
meta.numeric_field > "string_value"
// Error: Unsupported operator '>' for text field
Validation
- Field names are validated (only uri, type, and meta.* are allowed)
- Operators are validated based on field type and data type
- Syntax is strictly enforced
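The validation rules above can be sketched as a lookup table keyed by field kind and value type. The operator sets below are inferred from the examples and filter types in this document rather than taken from the parser, and fieldKind, checkOperator, and the table itself are hypothetical names.

```typescript
type FieldKind = 'uri' | 'type' | 'meta';
type ValueKind = 'string' | 'number' | 'boolean' | 'array';

// Assumption: operator sets inferred from this document's examples.
const ALLOWED: Record<string, readonly string[]> = {
  'uri:string': ['='],
  'uri:array': ['in'],
  'type:string': ['='],
  'type:array': ['in'],
  'meta:number': ['=', '!=', '>', '>=', '<', '<='],
  'meta:string': ['=', '!=', 'like', 'not like'],
  'meta:boolean': ['='],
};

// Names used for metadata value types in error messages.
const META_LABEL: Record<ValueKind, string> = {
  string: 'text',
  number: 'number',
  boolean: 'bool',
  array: 'array',
};

// Only uri, type, and flat meta.<name> fields are accepted.
function fieldKind(name: string): FieldKind {
  if (name === 'uri' || name === 'type') return name;
  if (/^meta\.[A-Za-z_][A-Za-z0-9_]*$/.test(name)) return 'meta';
  throw new Error(`Unknown field '${name}'`);
}

// Throw if the operator is not allowed for this field and value type.
function checkOperator(field: string, operator: string, value: ValueKind): void {
  const kind = fieldKind(field);
  const allowed = ALLOWED[`${kind}:${value}`] ?? [];
  if (!allowed.includes(operator)) {
    const label = kind === 'meta' ? META_LABEL[value] : kind;
    throw new Error(`Unsupported operator '${operator}' for ${label} field`);
  }
}
```

With this table, checkOperator('uri', 'like', 'string') throws "Unsupported operator 'like' for uri field", and a metadata comparison such as > against a quoted value is rejected as a text-field error.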
API Reference
Function: parseDSL(query: string): DocumentSearchOptions
Parses a DSL query string into a DocumentSearchOptions object.
Parameters:
- query (string): The DSL query string to parse
Returns: DocumentSearchOptions object with the following structure:
type DocumentSearchOptions = {
  uris?: string[];      // Array of URIs to filter by
  types?: string[];     // Array of document types to filter by
  meta?: MetaCondition; // Metadata filtering conditions
  limit?: number;       // Result limit (not set by DSL)
  offset?: number;      // Result offset (not set by DSL)
};
Throws: Error with descriptive message if parsing fails
Types
MetaCondition
type MetaCondition =
  | { type: 'and'; conditions: MetaCondition[] }
  | { type: 'or'; conditions: MetaCondition[] }
  | MetaFilter;
MetaFilter
type MetaFilter =
  | MetaNumberFilter
  | MetaTextFilter
  | MetaBoolFilter;

type MetaNumberFilter = {
  type: 'number';
  field: string;
  filter: {
    gt?: number;    // Greater than
    gte?: number;   // Greater than or equal
    lt?: number;    // Less than
    lte?: number;   // Less than or equal
    eq?: number;    // Equal
    neq?: number;   // Not equal
    nill?: boolean; // Is null/undefined
  };
};

type MetaTextFilter = {
  type: 'text';
  field: string;
  filter: {
    eq?: string;    // Equal
    neq?: string;   // Not equal
    like?: string;  // Pattern match
    nlike?: string; // Negative pattern match
    nill?: boolean; // Is null/undefined
  };
};

type MetaBoolFilter = {
  type: 'bool';
  field: string;
  filter: {
    eq: boolean;    // Equal (required)
    nill?: boolean; // Is null/undefined
  };
};
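To show how these types compose, the sketch below evaluates a MetaCondition tree against a plain metadata record in memory. It is an illustration of the tree structure only, not how the search backend works. The matches function is a hypothetical name, like, nlike, and nill are left out because their exact semantics are not specified here, and treating a missing or wrongly typed field as a non-match is an assumption.

```typescript
type MetaNumberFilter = {
  type: 'number';
  field: string;
  filter: { gt?: number; gte?: number; lt?: number; lte?: number; eq?: number; neq?: number; nill?: boolean };
};
type MetaTextFilter = {
  type: 'text';
  field: string;
  filter: { eq?: string; neq?: string; like?: string; nlike?: string; nill?: boolean };
};
type MetaBoolFilter = { type: 'bool'; field: string; filter: { eq: boolean; nill?: boolean } };
type MetaFilter = MetaNumberFilter | MetaTextFilter | MetaBoolFilter;
type MetaCondition =
  | { type: 'and'; conditions: MetaCondition[] }
  | { type: 'or'; conditions: MetaCondition[] }
  | MetaFilter;

// Evaluate a condition tree against one document's metadata.
function matches(cond: MetaCondition, meta: Record<string, unknown>): boolean {
  switch (cond.type) {
    case 'and':
      return cond.conditions.every((c) => matches(c, meta));
    case 'or':
      return cond.conditions.some((c) => matches(c, meta));
    case 'number': {
      const v = meta[cond.field];
      if (typeof v !== 'number') return false; // assumption: wrong type never matches
      const f = cond.filter;
      return (
        (f.gt === undefined || v > f.gt) &&
        (f.gte === undefined || v >= f.gte) &&
        (f.lt === undefined || v < f.lt) &&
        (f.lte === undefined || v <= f.lte) &&
        (f.eq === undefined || v === f.eq) &&
        (f.neq === undefined || v !== f.neq)
      );
    }
    case 'text': {
      const v = meta[cond.field];
      if (typeof v !== 'string') return false;
      const f = cond.filter;
      return (f.eq === undefined || v === f.eq) && (f.neq === undefined || v !== f.neq);
    }
    case 'bool':
      return meta[cond.field] === cond.filter.eq;
  }
}

// A tree that could represent: meta.published = true and meta.priority >= 5
const example: MetaCondition = {
  type: 'and',
  conditions: [
    { type: 'bool', field: 'published', filter: { eq: true } },
    { type: 'number', field: 'priority', filter: { gte: 5 } },
  ],
};
```

Because every variant carries a literal type tag, the switch narrows each branch to the matching filter shape, and nested and/or groups are handled by plain recursion.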
Usage Example
import { parseDSL } from './documents.dsl';
// Parse a query
const options = parseDSL('uri in ["doc-1", "doc-2"] and meta.priority >= 5');
// Result:
// {
//   uris: ['doc-1', 'doc-2'],
//   meta: {
//     type: 'number',
//     field: 'priority',
//     filter: { gte: 5 }
//   }
// }
// Use with document search
const results = await searchDocuments(options);
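Since the DSL never sets limit or offset, paging has to be applied to the parsed options before searching. A minimal sketch follows, assuming offset counts the results to skip; withPaging is a hypothetical helper, and the meta field is left opaque so that the block stands alone.

```typescript
type DocumentSearchOptions = {
  uris?: string[];
  types?: string[];
  meta?: unknown; // MetaCondition in the real type; opaque in this sketch
  limit?: number;
  offset?: number;
};

// Return a copy of the parsed options with paging applied (page is zero-based).
function withPaging(
  options: DocumentSearchOptions,
  page: number,
  pageSize: number,
): DocumentSearchOptions {
  return { ...options, limit: pageSize, offset: page * pageSize };
}

// Third page of 20 results for an already parsed query.
const paged = withPaging({ uris: ['doc-1', 'doc-2'] }, 2, 20);
// paged.limit === 20, paged.offset === 40
```

Copying the options instead of mutating them allows one parsed query to be reused across several pages.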
Best Practices
- Use quotes for strings: Always wrap string values in quotes to avoid parsing issues
- Group complex conditions: Use parentheses to make complex boolean logic clear
- Choose appropriate operators: Use like for pattern matching, in for multiple values
- Consider performance: Simpler queries with fewer conditions perform better
- Handle errors gracefully: Wrap DSL parsing in try-catch blocks in production code
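The error-handling advice above can be sketched as a small wrapper that turns a thrown parse error into a result value. The tryParse function and the ParseResult type are hypothetical; the parser is passed in as a parameter so that the block stands alone, and in real code that parameter would be parseDSL.

```typescript
type ParseResult<T> =
  | { ok: true; options: T }
  | { ok: false; message: string };

// Run a parser and capture its error instead of letting it propagate.
// `parse` stands in for parseDSL.
function tryParse<T>(parse: (query: string) => T, query: string): ParseResult<T> {
  try {
    return { ok: true, options: parse(query) };
  } catch (err) {
    // The parser is documented to throw Error; anything else is stringified.
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, message };
  }
}
```

A caller can then branch on result.ok and show result.message to the user, for example as an inline validation hint next to a search box.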
Limitations
- Field restrictions: Only uri, type, and meta.* fields are supported
- Operator compatibility: Not all operators work with all data types
- No nested metadata: Metadata fields must be flat (no meta.nested.field)
- Case sensitivity: Field names and operators are case-sensitive
- No functions: No support for functions like date(), count(), etc.