fluxcurrent/docs/query-dsl.md
2025-09-09 18:48:55 +02:00

Query DSL Documentation

The Query DSL (Domain-Specific Language) lets you query documents in the system with a SQL-like syntax. Queries can filter by URI, document type, and metadata fields, and can combine these conditions with boolean logic.

Overview

The Query DSL parses human-readable query strings and converts them into DocumentSearchOptions objects that can be used to search documents. It supports:

  • URI filtering: Filter documents by their URIs
  • Type filtering: Filter documents by their type
  • Metadata filtering: Filter documents by their metadata fields with type-aware operations
  • Logical operations: Combine conditions using AND/OR logic
  • Parenthetical grouping: Group conditions for complex boolean logic
  • Multiple data types: String, number, and boolean values, each with its own set of supported operators

Syntax Overview

The basic syntax follows this pattern:

field operator value [logical_operator field operator value...]

Basic Examples

uri = "doc-123"
type = "article"
meta.priority = 5
meta.title like "%search%"

Complex Examples

uri in ["doc-1", "doc-2"] and meta.priority >= 5
(meta.published = true or meta.draft = false) and type = "article"

Field Types

1. URI Fields

Filter documents by their URI (Uniform Resource Identifier).

Syntax: uri

Supported Operations:

  • Equality: uri = "document-id"
  • Array membership: uri in ["doc-1", "doc-2", "doc-3"]

2. Type Fields

Filter documents by their type.

Syntax: type

Supported Operations:

  • Equality: type = "article"
  • Array membership: type in ["article", "blog", "news"]
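Both uri and type conditions end up in array-valued fields of DocumentSearchOptions (uris and types, see the API Reference). The sketch below illustrates one plausible mapping, assuming an equality condition is stored as a one-element array. The helper name fieldCondition and the FieldOptions type are hypothetical and are not part of the documented API.

```typescript
// Subset of DocumentSearchOptions relevant to uri/type conditions.
type FieldOptions = { uris?: string[]; types?: string[] };

// Hypothetical helper: turn a single uri/type condition into search options.
// Assumption: `uri = "x"` is treated the same as `uri in ["x"]`.
function fieldCondition(field: 'uri' | 'type', value: string | string[]): FieldOptions {
  const values = Array.isArray(value) ? value : [value];
  return field === 'uri' ? { uris: values } : { types: values };
}

// uri = "doc-1"
const byUri = fieldCondition('uri', 'doc-1');
// type in ["article", "blog"]
const byType = fieldCondition('type', ['article', 'blog']);
```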

3. Metadata Fields

Filter documents by their metadata fields. Metadata fields are accessed using dot notation.

Syntax: meta.fieldName

Examples:

  • meta.title
  • meta.priority
  • meta.published
  • meta.created_at

Operators

Comparison Operators

Operator   Description                   Example
=          Equals                        meta.priority = 5
!=         Not equals                    meta.status != "archived"
>          Greater than                  meta.score > 8.5
>=         Greater than or equal         meta.priority >= 5
<          Less than                     meta.age < 30
<=         Less than or equal            meta.count <= 100
like       Pattern matching (SQL-like)   meta.title like "%test%"
not like   Negative pattern matching     meta.title not like "%draft%"
in         Array membership              uri in ["a", "b", "c"]

Logical Operators

Operator   Syntax    Description   Example
AND        and, &    Logical AND   meta.a = 1 and meta.b = 2
OR         or, |     Logical OR    meta.urgent = true or meta.priority > 8
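The numeric comparison operators line up one-to-one with the filter keys of MetaNumberFilter in the API Reference (eq, neq, gt, gte, lt, lte). The following is a minimal sketch of that correspondence. The lookup table and the toNumberFilter helper are hypothetical names used for illustration, and the error text only imitates the style of the parser's messages.

```typescript
type NumberFilterKey = 'eq' | 'neq' | 'gt' | 'gte' | 'lt' | 'lte';

type NumberFilterSketch = {
  type: 'number';
  field: string;
  filter: Partial<Record<NumberFilterKey, number>>;
};

// Hypothetical lookup: DSL comparison operator -> filter key.
const NUMBER_OPERATOR_KEYS = new Map<string, NumberFilterKey>([
  ['=', 'eq'],
  ['!=', 'neq'],
  ['>', 'gt'],
  ['>=', 'gte'],
  ['<', 'lt'],
  ['<=', 'lte'],
]);

// Build the filter for a condition such as `meta.priority >= 5`.
function toNumberFilter(field: string, operator: string, value: number): NumberFilterSketch {
  const key = NUMBER_OPERATOR_KEYS.get(operator);
  if (key === undefined) {
    throw new Error(`Unsupported operator '${operator}' for number field`);
  }
  const filter: NumberFilterSketch['filter'] = {};
  filter[key] = value;
  return { type: 'number', field, filter };
}

const priorityFilter = toNumberFilter('priority', '>=', 5);
// priorityFilter: { type: 'number', field: 'priority', filter: { gte: 5 } }
```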

Logical Operations

AND Operations

Combine conditions where both must be true:

meta.published = true and meta.priority >= 5
meta.created >= 1234567890 & meta.updated < 1234567999

OR Operations

Combine conditions where either can be true:

type = "article" or type = "blog"
meta.urgent = true | meta.priority > 9

Parenthetical Grouping

Use parentheses to control the order of operations:

(meta.priority > 5 or meta.urgent = true) and type = "article"
((meta.a = 1 and meta.b = 2) or meta.c = 3) and uri = "test"
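Grouping decides how conditions nest in the resulting MetaCondition tree (see Types in the API Reference). The sketch below writes out by hand the trees one would expect for two different groupings of the same three metadata conditions. It uses metadata-only conditions and a trimmed copy of the types, and it is an assumption about the parser's output shape rather than captured output.

```typescript
// Trimmed-down copy of the MetaCondition shape, for illustration only.
type Leaf =
  | { type: 'number'; field: string; filter: { gt?: number; eq?: number } }
  | { type: 'bool'; field: string; filter: { eq: boolean } };
type Tree =
  | { type: 'and'; conditions: Tree[] }
  | { type: 'or'; conditions: Tree[] }
  | Leaf;

const priority: Leaf = { type: 'number', field: 'priority', filter: { gt: 5 } };
const urgent: Leaf = { type: 'bool', field: 'urgent', filter: { eq: true } };
const published: Leaf = { type: 'bool', field: 'published', filter: { eq: true } };

// (meta.priority > 5 or meta.urgent = true) and meta.published = true
const grouped: Tree = {
  type: 'and',
  conditions: [{ type: 'or', conditions: [priority, urgent] }, published],
};

// meta.priority > 5 or (meta.urgent = true and meta.published = true)
const regrouped: Tree = {
  type: 'or',
  conditions: [priority, { type: 'and', conditions: [urgent, published] }],
};
```

The two trees contain the same leaves, but the root node differs, which is why moving the parentheses changes which documents match.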

Data Types

The DSL infers the data type of each value from its syntax:

Strings

Enclosed in double quotes (") or single quotes ('):

meta.title = "Article Title"
meta.status = 'published'

Escape Sequences:

  • \" or \' - Quote characters
  • \\ - Backslash
  • \n - Newline
  • \t - Tab
  • \r - Carriage return

Example:

meta.content = "String with \"quotes\" and \n newlines"
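The escape sequences listed above can be decoded with a single pass over the text between the quotes. This is a minimal sketch, not the parser's actual code: unescapeString is a hypothetical helper, and rejecting unknown escapes is an assumption (the real parser may keep them verbatim).

```typescript
// Escape character -> decoded character, per the list above.
const ESCAPES = new Map<string, string>([
  ['"', '"'],
  ["'", "'"],
  ['\\', '\\'],
  ['n', '\n'],
  ['t', '\t'],
  ['r', '\r'],
]);

// Decode the body of a string literal (the text between the quotes).
function unescapeString(body: string): string {
  let out = '';
  for (let i = 0; i < body.length; i++) {
    const ch = body.charAt(i);
    if (ch !== '\\') {
      out += ch;
      continue;
    }
    // A backslash must be followed by one of the known escape characters.
    const mapped = i + 1 < body.length ? ESCAPES.get(body.charAt(i + 1)) : undefined;
    if (mapped === undefined) {
      throw new Error(`Invalid escape sequence at position ${i}`);
    }
    out += mapped;
    i++; // skip the escaped character
  }
  return out;
}
```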

Numbers

Integer or floating-point numbers, including negative values:

meta.priority = 5
meta.score = 8.75
meta.balance = -150.50

Booleans

Case-insensitive true or false:

meta.published = true
meta.archived = false
meta.draft = True

Arrays

Square bracket notation for array values (used with the in operator):

uri in ["doc-1", "doc-2", "doc-3"]
type in ["article", "blog", "news"]

Empty arrays are supported:

uri in []
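Type detection from value syntax, as described in this section, can be approximated with a few regular expressions. The sketch below classifies a single scalar literal into the filter type tags used in the API Reference ('text', 'number', 'bool'). classifyLiteral is a hypothetical helper, and the exact number grammar (no exponents, no leading plus sign) is an assumption.

```typescript
type ScalarKind = 'text' | 'number' | 'bool';

// Classify one literal token by its syntax alone.
function classifyLiteral(token: string): ScalarKind {
  // Booleans are case-insensitive: true, True, FALSE, ...
  if (/^(true|false)$/i.test(token)) return 'bool';
  // Integers and decimals, optionally negative.
  if (/^-?\d+(\.\d+)?$/.test(token)) return 'number';
  // Double- or single-quoted strings, allowing backslash escapes.
  if (/^"(?:[^"\\]|\\.)*"$/.test(token) || /^'(?:[^'\\]|\\.)*'$/.test(token)) {
    return 'text';
  }
  throw new Error(`Unexpected value '${token}'`);
}
```

With this sketch, an unquoted word such as test@invalid is rejected instead of being silently treated as a string, which matches the advice to always quote string values.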

Examples

Basic Filtering

Filter by URI:

uri = "my-document"

Filter by multiple URIs:

uri in ["doc-1", "doc-2", "doc-3"]

Filter by document type:

type = "article"

Filter by metadata:

meta.priority = 5
meta.title = "Important Document"
meta.published = true

Comparison Operations

Numeric comparisons:

meta.priority > 5
meta.score >= 8.0
meta.count < 100
meta.rating <= 4.5
meta.value != 0

Text operations:

meta.title like "%report%"
meta.category != "archived"
meta.description not like "%draft%"

Logical Combinations

Simple AND:

type = "article" and meta.published = true

Simple OR:

meta.urgent = true or meta.priority > 8

Mixed field types:

uri in ["doc-1", "doc-2"] and meta.priority >= 5

Complex Queries

Grouped conditions:

(meta.priority > 5 or meta.urgent = true) and type = "article"

Nested grouping:

((meta.a = 1 and meta.b = 2) or meta.c = 3) and type = "test"

Real-world example:

uri in ["article-1", "article-2"] and ((meta.created >= 1640995200 and meta.created < 1672531200) or (type = "urgent" and meta.status != "archived"))

Date/Time Queries

Dates are typically stored as Unix timestamps (numbers), so date ranges are expressed as numeric comparisons:

meta.created >= 1640995200
meta.updated > 1672531200 and meta.updated < 1675209600
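Because the DSL has no date functions, timestamps have to be computed before the query string is built. The sketch below assumes timestamps are stored in seconds (the magnitude of the values above suggests this) and in UTC. toUnixSeconds and createdBetween are hypothetical helpers.

```typescript
// Convert a UTC calendar date to a Unix timestamp in seconds.
function toUnixSeconds(year: number, month: number, day: number): number {
  // Date.UTC expects a zero-based month and returns milliseconds.
  return Math.floor(Date.UTC(year, month - 1, day) / 1000);
}

// Build a half-open range condition: from <= meta.<field> < to.
function createdBetween(field: string, from: number, to: number): string {
  return `meta.${field} >= ${from} and meta.${field} < ${to}`;
}

// All documents created during 2022 (UTC).
const rangeQuery = createdBetween(
  'created',
  toUnixSeconds(2022, 1, 1),
  toUnixSeconds(2023, 1, 1),
);
// rangeQuery === 'meta.created >= 1640995200 and meta.created < 1672531200'
```

A half-open range (>= start, < end) avoids double-counting documents that fall exactly on a boundary when adjacent ranges are queried.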

String Pattern Matching

Contains pattern:

meta.title like "%search term%"

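Starts with pattern (in SQL LIKE, _ is itself a single-character wildcard, so if the DSL follows that convention this pattern also matches names such as reportX2024):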

meta.filename like "report_%"

Ends with pattern:

meta.extension like "%.pdf"

Exclude pattern:

meta.title not like "%draft%"
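To make the pattern semantics concrete, the sketch below translates a like pattern into a JavaScript regular expression. It assumes SQL conventions: % matches any run of characters, _ matches exactly one character, and matching is case-sensitive. The documentation above only demonstrates %, so the handling of _ and of case is an assumption, and likeToRegExp is a hypothetical helper rather than part of the DSL.

```typescript
// Translate a SQL-style LIKE pattern into an anchored regular expression.
function likeToRegExp(pattern: string): RegExp {
  let source = '';
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i);
    if (ch === '%') {
      source += '[\\s\\S]*'; // any run of characters, including newlines
    } else if (ch === '_') {
      source += '[\\s\\S]'; // exactly one character
    } else {
      // Escape regex metacharacters so they match literally.
      source += ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    }
  }
  return new RegExp('^' + source + '$');
}

const isPdf = likeToRegExp('%.pdf');
// isPdf.test('summary.pdf') is true; isPdf.test('summary.pdfx') is false
```

A not like condition is simply the negation of the same test.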

Error Handling

The DSL parser reports invalid queries with descriptive error messages, most of which include the position at which the problem was found:

Syntax Errors

Unterminated string:

meta.title = "unterminated string
// Error: Unterminated string at position X

Missing parentheses:

(meta.a = 1 and meta.b = 2
// Error: Expected RPAREN at position X

Invalid characters:

meta.field = test@invalid
// Error: Unexpected character '@' at position X

Semantic Errors

Unknown fields:

unknown_field = "value"
// Error: Unknown field 'unknown_field' at position X

Unsupported operators:

uri like "pattern"
// Error: Unsupported operator 'like' for uri field

Type mismatches (the value is a string, so the condition is treated as a text filter, which does not support >):

meta.numeric_field > "string_value"
// Error: Unsupported operator '>' for text field

Validation

  • Field names are validated (only uri, type, and meta.* are allowed)
  • Operators are validated based on field type and data type
  • Syntax is strictly enforced
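When surfacing these errors to users, the position in the message can be turned into a visual marker under the query. The sketch below assumes the message contains the phrase "at position N", as in the examples above, and that N is a zero-based character offset. Both points are assumptions, and pointAtError is a hypothetical helper.

```typescript
// Render an error message followed by the query and a caret under the
// offending character. Falls back to the bare message if no position is found.
function pointAtError(query: string, message: string): string {
  const match = /at position (\d+)/.exec(message);
  if (match === null) {
    return message;
  }
  const position = Number(match[1]);
  return `${message}\n${query}\n${' '.repeat(position)}^`;
}

const report = pointAtError(
  'meta.field = test@invalid',
  "Unexpected character '@' at position 17",
);
// The third line of `report` has a caret directly under the '@'.
```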

API Reference

Function: parseDSL(query: string): DocumentSearchOptions

Parses a DSL query string into a DocumentSearchOptions object.

Parameters:

  • query (string): The DSL query string to parse

Returns: DocumentSearchOptions object with the following structure:

type DocumentSearchOptions = {
  uris?: string[];           // Array of URIs to filter by
  types?: string[];          // Array of document types to filter by
  meta?: MetaCondition;      // Metadata filtering conditions
  limit?: number;            // Result limit (not set by DSL)
  offset?: number;           // Result offset (not set by DSL)
};

Throws: Error with descriptive message if parsing fails

Types

MetaCondition

type MetaCondition = 
  | { type: 'and'; conditions: MetaCondition[] }
  | { type: 'or'; conditions: MetaCondition[] }
  | MetaFilter;

MetaFilter

type MetaFilter = 
  | MetaNumberFilter
  | MetaTextFilter  
  | MetaBoolFilter;

type MetaNumberFilter = {
  type: 'number';
  field: string;
  filter: {
    gt?: number;      // Greater than
    gte?: number;     // Greater than or equal
    lt?: number;      // Less than
    lte?: number;     // Less than or equal
    eq?: number;      // Equal
    neq?: number;     // Not equal
    nill?: boolean;   // Is null/undefined
  };
};

type MetaTextFilter = {
  type: 'text';
  field: string;
  filter: {
    eq?: string;      // Equal
    neq?: string;     // Not equal
    like?: string;    // Pattern match
    nlike?: string;   // Negative pattern match
    nill?: boolean;   // Is null/undefined
  };
};

type MetaBoolFilter = {
  type: 'bool';
  field: string;
  filter: {
    eq: boolean;      // Equal (required)
    nill?: boolean;   // Is null/undefined
  };
};
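To show how these types compose, the sketch below evaluates a condition tree against an in-memory metadata object. It is only an illustration of the semantics: the actual search may well translate conditions into a database query instead. The sketch covers number and bool filters without nill; text filters are omitted for brevity. The type names and the matches function are hypothetical.

```typescript
type NumberOps = {
  gt?: number; gte?: number; lt?: number; lte?: number; eq?: number; neq?: number;
};
type CondLeaf =
  | { type: 'number'; field: string; filter: NumberOps }
  | { type: 'bool'; field: string; filter: { eq: boolean } };
type Cond =
  | { type: 'and'; conditions: Cond[] }
  | { type: 'or'; conditions: Cond[] }
  | CondLeaf;

// Return true if the metadata object satisfies the condition tree.
function matches(cond: Cond, meta: Record<string, unknown>): boolean {
  switch (cond.type) {
    case 'and':
      return cond.conditions.every((c) => matches(c, meta));
    case 'or':
      return cond.conditions.some((c) => matches(c, meta));
    case 'bool':
      return meta[cond.field] === cond.filter.eq;
    case 'number': {
      const value = meta[cond.field];
      // Assumption: a missing or non-numeric value never matches.
      if (typeof value !== 'number') return false;
      const f = cond.filter;
      return (
        (f.eq === undefined || value === f.eq) &&
        (f.neq === undefined || value !== f.neq) &&
        (f.gt === undefined || value > f.gt) &&
        (f.gte === undefined || value >= f.gte) &&
        (f.lt === undefined || value < f.lt) &&
        (f.lte === undefined || value <= f.lte)
      );
    }
  }
}

// (meta.priority > 5 or meta.urgent = true) and meta.published = true
const example: Cond = {
  type: 'and',
  conditions: [
    {
      type: 'or',
      conditions: [
        { type: 'number', field: 'priority', filter: { gt: 5 } },
        { type: 'bool', field: 'urgent', filter: { eq: true } },
      ],
    },
    { type: 'bool', field: 'published', filter: { eq: true } },
  ],
};
```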

Usage Example

import { parseDSL } from './documents.dsl';

// Parse a query
const options = parseDSL('uri in ["doc-1", "doc-2"] and meta.priority >= 5');

// Result:
// {
//   uris: ['doc-1', 'doc-2'],
//   meta: {
//     type: 'number',
//     field: 'priority',
//     filter: { gte: 5 }
//   }
// }

// Use with document search
const results = await searchDocuments(options);

Best Practices

  1. Use quotes for strings: Always wrap string values in quotes to avoid parsing issues
  2. Group complex conditions: Use parentheses to make complex boolean logic clear
  3. Choose appropriate operators: Use like for pattern matching, in for multiple values
  4. Consider performance: Simpler queries with fewer conditions perform better
  5. Handle errors gracefully: Wrap DSL parsing in try-catch blocks in production code
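For the last point, a small wrapper can turn thrown parse errors into a value the caller must check. The sketch below takes the parser as a parameter so that it stays self-contained; in real code you would pass parseDSL. The tryParse function and the ParseResult type are hypothetical names, not part of the documented API.

```typescript
type ParseResult<T> =
  | { ok: true; value: T }
  | { ok: false; error: string };

// Run a throwing parser and capture failures as data instead of exceptions.
function tryParse<T>(parse: (query: string) => T, query: string): ParseResult<T> {
  try {
    return { ok: true, value: parse(query) };
  } catch (err) {
    // parseDSL is documented to throw an Error; other thrown values are
    // stringified defensively.
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message };
  }
}

// Usage with a stand-in parser (replace with parseDSL in real code).
const failing = (query: string): number => {
  throw new Error(`Unterminated string at position ${query.length}`);
};
const outcome = tryParse(failing, 'meta.title = "oops');
// outcome is { ok: false, error: 'Unterminated string at position 18' }
```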

Limitations

  1. Field restrictions: Only uri, type, and meta.* fields are supported
  2. Operator compatibility: Not all operators work with all data types
  3. No nested metadata: Metadata fields must be flat (no meta.nested.field)
  4. Case sensitivity: Field names and operators are case-sensitive (boolean literals are the exception; see Booleans)
  5. No functions: No support for functions like date(), count(), etc.