PromQL implementation

tsink ships a native PromQL parser and evaluator with no external query layer. The implementation lives entirely in src/promql/ and is exposed through the public tsink::promql module.

Architecture

The PromQL pipeline has three stages:

query string → Lexer → Parser → Evaluator → PromqlValue

Module	File	Responsibility
Lexer	`src/promql/lexer.rs`	Tokenise raw input into a flat `Vec<Token>`
Parser	`src/promql/parser.rs`	Turn tokens into an `Expr` AST (Pratt / precedence-climbing)
Evaluator	`src/promql/eval/`	Walk the AST and resolve values against the storage engine
Types	`src/promql/types.rs`	`PromqlValue`, `Sample`, `Series`, histogram helpers
Errors	`src/promql/error.rs`	`PromqlError` enum

Public API

Parsing

use tsink::promql::{parse, ast::Expr};

let expr: Expr = parse("rate(http_requests_total[5m])")?;

Query engine

use std::sync::Arc;
use tsink::promql::{Engine, PromqlValue};
use tsink::{Storage, TimestampPrecision};

// Build an engine from any Arc<dyn Storage>.
let engine = Engine::with_precision(storage, TimestampPrecision::Milliseconds);

// Instant query — evaluate at a single timestamp.
let result: PromqlValue = engine.instant_query("up", eval_time)?;

// Range query — evaluate over [start, end] at each step.
let result: PromqlValue = engine.range_query("rate(errors_total[1m])", start, end, step)?;

Engine::new defaults to TimestampPrecision::Nanoseconds. Both instant_query and range_query parse the expression internally before evaluating it.

Value types

PromqlValue mirrors the four PromQL expression result types.

Variant	Contents	When returned
`Scalar(f64, i64)`	A single float and its evaluation timestamp	Literals, `scalar()`, arithmetic on two scalars
`InstantVector(Vec<Sample>)`	Zero or more labelled samples at one timestamp	Vector selectors, most functions
`RangeVector(Vec<Series>)`	Labelled time series with multiple samples	Matrix selectors, range queries
`String(String, i64)`	A string value and its evaluation timestamp	String literals

`Sample`

pub struct Sample {
    pub metric: String,
    pub labels: Vec<Label>,
    pub timestamp: i64,
    pub value: f64,
    pub histogram: Option<Box<NativeHistogram>>,
}

`Series`

pub struct Series {
    pub metric: String,
    pub labels: Vec<Label>,
    pub samples: Vec<(i64, f64)>,       // (timestamp, float value)
    pub histograms: Vec<(i64, Box<NativeHistogram>)>,
}

Lexer

The lexer is a single-pass byte scanner. It produces all tokens in one call (Lexer::new(input).tokenize()), returning Vec<Token> or a PromqlError::Parse.

Comments

# starts a line comment; everything until the next newline is discarded.

Identifiers

Identifiers follow the usual [a-zA-Z_][a-zA-Z0-9_]* alphabet. Colons (:) are also accepted inside identifiers to support recording-rule naming conventions such as job:http_requests:rate5m.

Keywords (case-insensitive)

by, without, offset, bool, and, or, unless, on, ignoring, group_left, group_right, atan2, inf, nan

Duration literals

A duration is a sequence of one or more <integer><unit> segments.

Suffix	Unit
`ms`	milliseconds
`s`	seconds
`m`	minutes
`h`	hours
`d`	days (24 h)
`w`	weeks (7 d)
`y`	years (365 d)

Segments can be combined: 1h30m, 5m30s, 2d12h. Durations are stored internally as milliseconds (i64).
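As an illustration of the duration grammar above, here is a minimal standalone sketch that converts a duration literal into milliseconds. The function name `parse_duration_ms` and its `Option` return type are assumptions made for this example; the actual lexer in src/promql/lexer.rs may be structured differently.

```rust
/// Parse a PromQL-style duration such as "1h30m" into milliseconds.
/// Returns None on empty input, a missing number, an unknown unit, or overflow.
fn parse_duration_ms(input: &str) -> Option<i64> {
    let bytes = input.as_bytes();
    if bytes.is_empty() {
        return None;
    }
    let mut pos = 0;
    let mut total: i64 = 0;
    while pos < bytes.len() {
        // Integer part of the segment.
        let start = pos;
        while pos < bytes.len() && bytes[pos].is_ascii_digit() {
            pos += 1;
        }
        if pos == start {
            return None; // every segment must start with at least one digit
        }
        let n: i64 = input[start..pos].parse().ok()?;

        // Unit part; "ms" has to be tried before "m".
        let rest = &input[pos..];
        let (unit_ms, unit_len): (i64, usize) = if rest.starts_with("ms") {
            (1, 2)
        } else {
            match *rest.as_bytes().first()? {
                b's' => (1_000, 1),
                b'm' => (60_000, 1),
                b'h' => (3_600_000, 1),
                b'd' => (86_400_000, 1),
                b'w' => (7 * 86_400_000, 1),
                b'y' => (365 * 86_400_000, 1),
                _ => return None,
            }
        };
        pos += unit_len;
        total = total.checked_add(n.checked_mul(unit_ms)?)?;
    }
    Some(total)
}
```

With this sketch, `parse_duration_ms("1h30m")` yields `Some(5_400_000)` and an unknown unit such as `"5x"` yields `None`.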

String literals

Double-quoted strings with \", \\, \n, \r, \t escape sequences.

Number literals

Decimal integers and floats. The special identifiers inf and nan are recognised as numeric tokens equivalent to f64::INFINITY and f64::NAN.

Parser

The parser implements precedence-climbing (Pratt) parsing for expressions. The entry point is parser::parse(input).

Expression grammar (summary)

expr        := unary ( binary_op modifiers expr )*
unary       := ('+' | '-')? primary postfix*
primary     := number | string | inf | nan
             | '(' expr ')'
             | '{' matchers '}'
             | ident [ aggregation | call | vector_selector ]
postfix     := '[' duration (':' duration?)? ']'   -- matrix selector or subquery
             | 'offset' signed_duration
             | '@' (number | 'start()' | 'end()')

Operator precedence

Lower number = binds tighter.

Precedence	Operators
1 (highest)	`^` (right-associative)
2	`*`, `/`, `%`, `atan2`
3	`+`, `-`
4	`==`, `!=`, `<`, `>`, `<=`, `>=`
5	`and`, `unless`
6 (lowest)	`or`
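To make the table concrete, the following standalone sketch applies precedence climbing to whitespace-separated tokens and returns a fully parenthesised string that shows how operands group. The names `precedence`, `parse_expr`, and `group` are hypothetical and do not come from tsink's parser; the sketch also assumes well-formed input (operands and operators strictly alternating).

```rust
/// Precedence level from the table (1 binds tightest) and whether the
/// operator is right-associative. Returns None for anything else.
fn precedence(op: &str) -> Option<(u8, bool)> {
    match op {
        "^" => Some((1, true)),
        "*" | "/" | "%" | "atan2" => Some((2, false)),
        "+" | "-" => Some((3, false)),
        "==" | "!=" | "<" | ">" | "<=" | ">=" => Some((4, false)),
        "and" | "unless" => Some((5, false)),
        "or" => Some((6, false)),
        _ => None,
    }
}

/// Parse one expression starting at `pos`. `max_level` is the loosest
/// precedence level this call is allowed to consume.
fn parse_expr(tokens: &[&str], pos: &mut usize, max_level: u8) -> String {
    let mut lhs = tokens[*pos].to_string();
    *pos += 1;
    while *pos < tokens.len() {
        let op = tokens[*pos];
        let (level, right_assoc) = match precedence(op) {
            Some(p) if p.0 <= max_level => p,
            _ => break,
        };
        *pos += 1;
        // Left-associative: the right operand may only contain tighter operators.
        // Right-associative: it may also contain operators of the same level.
        let rhs_max = if right_assoc { level } else { level - 1 };
        let rhs = parse_expr(tokens, pos, rhs_max);
        lhs = format!("({lhs} {op} {rhs})");
    }
    lhs
}

/// Show the grouping of a whitespace-separated expression.
fn group(input: &str) -> String {
    let tokens: Vec<&str> = input.split_whitespace().collect();
    parse_expr(&tokens, &mut 0, 6)
}
```

For example, `group("a ^ b ^ c")` returns `(a ^ (b ^ c))` because `^` is right-associative, while `group("a - b - c")` returns `((a - b) - c)`.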

Label matchers

{label="value", label2!="value", label3=~"regex", label4!~"regex"}

Operator	Semantics
`=`	Exact equality
`!=`	Exact inequality
`=~`	Regex match (anchored)
`!~`	Regex non-match (anchored)

Regex matching compiles the pattern with the standard regex crate.

Vector selector

metric_name
metric_name{labels}
{labels}

Matrix selector

metric_name[5m]
metric_name{label="value"}[1h]

Subquery

expr[range:step]
expr[range:]          # omit step → use query step or default (1m)

@modifier and offset

metric @ 1700000000         # pin to Unix timestamp
metric @ start()            # pin to range query start
metric @ end()              # pin to range query end
metric offset 5m            # shift evaluation back 5 minutes
metric[10m] offset 1h       # combine range + offset

Aggregations

sum(expr)
sum by  (label1, label2) (expr)
sum without (label1)     (expr)
sum(expr) by (label1, label2)    # grouping clause after the argument list

The grouping clause can be placed either before or after the argument list.

Binary operator modifiers

a + on(job)             b       # match only on "job"
a + ignoring(instance)  b       # ignore "instance" when matching
a * on(job) group_left  b       # many-to-one: keep left-side labels
a * on(job) group_right b       # one-to-many: keep right-side labels
a * on(job) group_left(region) b  # also copy "region" from right
a == bool   b                   # return 0/1 instead of filtering

group_left and group_right cannot be combined with set operators (and, or, unless).
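The one-to-one case of `on(...)` matching can be pictured as a join on a label signature. The sketch below is a simplified standalone model, not tsink's binary.rs: the `Labels` alias and the `labels`, `signature`, and `add_on` helpers are hypothetical, a missing label is treated as an empty string, and duplicate signatures on the right-hand side are silently overwritten rather than reported as an error.

```rust
use std::collections::BTreeMap;

type Labels = BTreeMap<String, String>;

/// Convenience constructor for a label set.
fn labels(pairs: &[(&str, &str)]) -> Labels {
    pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect()
}

/// Signature of a series under `on(names)`: the values of just those labels.
fn signature(series: &Labels, names: &[&str]) -> Vec<String> {
    names
        .iter()
        .map(|n| series.get(*n).cloned().unwrap_or_default())
        .collect()
}

/// One-to-one `lhs + on(names) rhs`: pair series whose signatures are equal.
/// The result is keyed by the signature, i.e. only the `on` labels survive.
fn add_on(
    lhs: &[(Labels, f64)],
    rhs: &[(Labels, f64)],
    names: &[&str],
) -> Vec<(Vec<String>, f64)> {
    // Index the right-hand side by signature.
    let index: BTreeMap<Vec<String>, f64> = rhs
        .iter()
        .map(|(l, v)| (signature(l, names), *v))
        .collect();
    // Left-hand series without a partner are dropped.
    lhs.iter()
        .filter_map(|(l, v)| {
            let sig = signature(l, names);
            index.get(&sig).map(|r| (sig.clone(), v + r))
        })
        .collect()
}
```

Left-hand series with no matching right-hand signature produce no output, which is the filtering behaviour that `on` and `ignoring` share.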

Evaluator

The evaluator is in src/promql/eval/ and is split across several files:

File	Contents
`mod.rs`	`Engine`, instant and range query entry points, prefetch cache, `@` resolution
`selector.rs`	Instant vector and matrix selector evaluation
`functions.rs`	All built-in function implementations
`aggregation.rs`	Aggregation operator implementations
`binary.rs`	Binary operator evaluation and vector matching
`subquery.rs`	Subquery evaluation
`time.rs`	Duration/timestamp utilities (`duration_to_units`, `step_times`)

Default parameters

Parameter	Default
Lookback delta	5 minutes
Subquery step	1 minute

The lookback delta controls how far back an instant vector selector looks for the most recent sample.
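A minimal sketch of that lookup is shown below. The function name `latest_within_lookback` is hypothetical, timestamps and the lookback share one unit, and treating the lower bound of the window as inclusive is an assumption of this example rather than something the text above states. Stale markers are not modelled.

```rust
/// Return the most recent sample at or before `eval_time` that is no older
/// than `lookback`. `samples` must be sorted by timestamp, ascending.
fn latest_within_lookback(
    samples: &[(i64, f64)],
    eval_time: i64,
    lookback: i64,
) -> Option<(i64, f64)> {
    samples
        .iter()
        .rev()
        // Newest sample that is not in the future of the evaluation time.
        .find(|(ts, _)| *ts <= eval_time)
        // Reject it if it is older than the lookback window allows.
        .filter(|(ts, _)| *ts >= eval_time - lookback)
        .copied()
}
```

With a 5-minute lookback expressed in milliseconds (`300_000`), a series whose last sample is older than five minutes returns `None`, so it does not appear in the instant vector.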

Range query prefetch

For range queries, the engine checks whether any selector uses a dynamic @ modifier or is wrapped in a subquery. When neither is present, it prefetches all required metric data from storage in a single pass before iterating over the steps, which significantly reduces storage I/O for wide time ranges. Subqueries and @ modifiers disable the prefetch for accuracy.
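The idea can be sketched in two small functions: one that lists the evaluation timestamps of a range query, and one that computes the single storage window covering all of them. `step_times` is named in time.rs, but the signature used here is an assumption, and `prefetch_window` is a hypothetical helper for this example only.

```rust
/// Evaluation timestamps of a range query: start, start + step, ... up to
/// and including `end`. Returns an empty list for a non-positive step.
fn step_times(start: i64, end: i64, step: i64) -> Vec<i64> {
    if step <= 0 {
        return Vec::new();
    }
    (start..=end).step_by(step as usize).collect()
}

/// One storage window that covers every step for a selector whose widest
/// look-behind (matrix range or lookback delta) is `window`.
fn prefetch_window(start: i64, end: i64, window: i64) -> (i64, i64) {
    // The first step needs data back to `start - window`;
    // the last step never needs data after `end`.
    (start - window, end)
}
```

Fetching `prefetch_window(start, end, window)` once replaces one storage read per entry of `step_times(start, end, step)`, which is where the I/O saving comes from when the evaluation times are fixed by the step grid.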

Aggregation operators

All aggregation operators accept an optional by (labels) or without (labels) grouping clause.

Operator	Parameter	Description
`sum`	—	Sum of values
`avg`	—	Average of values
`min`	—	Minimum value
`max`	—	Maximum value
`count`	—	Number of series
`group`	—	1 for each group (existence aggregation)
`stddev`	—	Population standard deviation
`stdvar`	—	Population variance
`count_values`	`label` (string)	Count series per distinct value; adds a `label` dimension
`quantile`	`φ` (scalar)	φ-quantile across the group
`topk`	`k` (scalar)	Top k series by value
`bottomk`	`k` (scalar)	Bottom k series by value
`limitk`	`k` (scalar)	Deterministically select k series (hash-stable)
`limit_ratio`	`ratio` (scalar)	Deterministically select a ratio of series

sum, avg, min, max, count, group, stddev, and stdvar support native histograms for sum. count_values, quantile, topk, and bottomk require float samples.
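The grouping step that all of these operators share can be sketched as follows for `sum` over float samples. This is a simplified standalone model with hypothetical names (`Labels`, `labels`, `group_key`, `sum_grouped`); it does not handle the metric name or native histograms, and it is not taken from aggregation.rs.

```rust
use std::collections::BTreeMap;

type Labels = BTreeMap<String, String>;

/// Convenience constructor for a label set.
fn labels(pairs: &[(&str, &str)]) -> Labels {
    pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect()
}

/// Group key for one series: keep only `names` for `by`,
/// or drop `names` for `without`.
fn group_key(series: &Labels, names: &[&str], without: bool) -> Labels {
    series
        .iter()
        .filter(|(k, _)| names.contains(&k.as_str()) != without)
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}

/// `sum by (names)` when `without` is false, `sum without (names)` otherwise.
fn sum_grouped(
    input: &[(Labels, f64)],
    names: &[&str],
    without: bool,
) -> BTreeMap<Labels, f64> {
    let mut out = BTreeMap::new();
    for (series, value) in input {
        *out.entry(group_key(series, names, without)).or_insert(0.0) += value;
    }
    out
}
```

Other operators differ only in the accumulator: `count` would add 1 per series, `max` would keep the larger value, and so on.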

Functions

Counter and gauge range functions

Function	Input	Description
`rate(v[d])`	range vector	Per-second rate of counter increase (extrapolated to fit `d`)
`irate(v[d])`	range vector	Per-second instant rate using the last two samples
`increase(v[d])`	range vector	Total counter increase over `d` (extrapolated)
`delta(v[d])`	range vector	Value change over `d` (extrapolated, for gauges)
`idelta(v[d])`	range vector	Instant delta between the last two samples
`changes(v[d])`	range vector	Number of value changes within `d`
`resets(v[d])`	range vector	Number of counter resets within `d`

rate and increase support native histogram series and produce a histogram result. The other range functions require float samples.

Extrapolation: rate, increase, and delta use the same boundary extrapolation algorithm as Prometheus. The sampled interval is extended toward a range boundary when the gap between that boundary and the nearest sample is within 110% of the average sample interval.
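The following standalone sketch shows one way the extrapolation for `rate` on float counters can be written. It is modelled on the upstream Prometheus behaviour as understood here: the half-interval fallback for distant boundaries and the clamp that stops extrapolation at zero are assumptions carried over from Prometheus, not details confirmed for tsink. The function name `extrapolated_rate` is hypothetical and timestamps are in seconds.

```rust
/// Extrapolated per-second rate of a counter over [range_start, range_end].
/// `samples` holds (timestamp, value) pairs sorted by time, all inside the
/// window. Returns None when fewer than two samples are available.
fn extrapolated_rate(samples: &[(f64, f64)], range_start: f64, range_end: f64) -> Option<f64> {
    if samples.len() < 2 {
        return None;
    }
    let (first_t, first_v) = samples[0];
    let (last_t, _) = samples[samples.len() - 1];
    let sampled_interval = last_t - first_t;
    if sampled_interval <= 0.0 {
        return None;
    }

    // Raw increase. When the value drops, the counter was reset, so the
    // whole new value counts as increase.
    let mut increase = 0.0;
    for pair in samples.windows(2) {
        let (prev, cur) = (pair[0].1, pair[1].1);
        increase += if cur < prev { cur } else { cur - prev };
    }

    let avg_interval = sampled_interval / (samples.len() - 1) as f64;
    let threshold = avg_interval * 1.1;

    // Gaps between the samples and the window boundaries.
    let mut to_start = first_t - range_start;
    let mut to_end = range_end - last_t;
    // A boundary that is too far away is only approached by half an interval.
    if to_start >= threshold {
        to_start = avg_interval / 2.0;
    }
    if to_end >= threshold {
        to_end = avg_interval / 2.0;
    }
    // A counter cannot be negative, so never extrapolate back past zero.
    if increase > 0.0 && first_v >= 0.0 {
        let to_zero = sampled_interval * (first_v / increase);
        if to_zero < to_start {
            to_start = to_zero;
        }
    }

    let factor = (sampled_interval + to_start + to_end) / sampled_interval;
    Some(increase * factor / (range_end - range_start))
}
```

Dropping the final division by the window length would give the corresponding `increase` value.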

Over-time aggregations

All take a range vector and return an instant vector.

Function	Description
`avg_over_time(v[d])`	Average of samples in window
`sum_over_time(v[d])`	Sum
`min_over_time(v[d])`	Minimum
`max_over_time(v[d])`	Maximum
`count_over_time(v[d])`	Count of samples
`last_over_time(v[d])`	Most recent sample
`present_over_time(v[d])`	1 if any sample exists
`stddev_over_time(v[d])`	Standard deviation
`stdvar_over_time(v[d])`	Variance
`mad_over_time(v[d])`	Median absolute deviation
`quantile_over_time(φ, v[d])`	φ-quantile of samples

Histogram functions

Function	Description
`histogram_quantile(φ, v)`	φ-quantile from classic (bucket-based) or native histograms
`histogram_avg(v)`	Average from native histograms
`histogram_count(v)`	Observation count from native histograms
`histogram_sum(v)`	Sum of observations from native histograms
`histogram_stddev(v)`	Standard deviation from native histograms
`histogram_stdvar(v)`	Variance from native histograms
`histogram_fraction(lower, upper, v)`	Fraction of observations in `(lower, upper]` from native histograms
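For classic histograms, `histogram_quantile` is usually computed by linear interpolation inside the bucket that contains the target rank. The sketch below illustrates that technique on its own; the function name `bucket_quantile` is hypothetical, edge cases are simplified, and it is not taken from tsink's functions.rs.

```rust
/// Estimate the φ-quantile from classic cumulative buckets.
/// `buckets` holds (upper bound, cumulative count) pairs sorted by bound,
/// and the last bound must be +Inf.
fn bucket_quantile(phi: f64, buckets: &[(f64, f64)]) -> f64 {
    if phi.is_nan() || buckets.len() < 2 {
        return f64::NAN;
    }
    if phi < 0.0 {
        return f64::NEG_INFINITY;
    }
    if phi > 1.0 {
        return f64::INFINITY;
    }
    let last = buckets.len() - 1;
    let total = buckets[last].1;
    if buckets[last].0 != f64::INFINITY || total == 0.0 {
        return f64::NAN;
    }

    // Rank of the requested observation and the first bucket that reaches it.
    let rank = phi * total;
    let idx = buckets
        .iter()
        .position(|&(_, count)| count >= rank)
        .unwrap_or(last);
    if idx == last {
        // The rank falls into the +Inf bucket: report the highest finite bound.
        return buckets[last - 1].0;
    }

    let (upper, count) = buckets[idx];
    // The first bucket is assumed to start at 0 when its bound is positive.
    let (lower, below) = if idx == 0 { (0.0, 0.0) } else { buckets[idx - 1] };
    if idx == 0 && upper <= 0.0 {
        return upper;
    }
    if count == below {
        return lower; // empty bucket: nothing to interpolate (simplification)
    }
    // Linear interpolation between the bucket's bounds.
    lower + (upper - lower) * (rank - below) / (count - below)
}
```

Because the interpolation assumes observations are spread evenly inside each bucket, the estimate is only as precise as the bucket layout.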

Regression and prediction

Function	Description
`deriv(v[d])`	Estimated per-second derivative by linear regression
`predict_linear(v[d], t)`	Predicted value `t` seconds from now using linear regression
`double_exponential_smoothing(v[d], sf, tf)`	Double exponential smoothing; `sf` = smoothing factor, `tf` = trend factor; also callable as `holt_winters`
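Both `deriv` and `predict_linear` rest on an ordinary least-squares fit of value against time. A minimal standalone sketch is shown below; the helper name `linear_regression` and the exact signatures are assumptions for illustration, and times are in seconds.

```rust
/// Least-squares fit of value against time. Returns (slope, intercept),
/// where the intercept is the fitted value at time `origin`.
fn linear_regression(samples: &[(f64, f64)], origin: f64) -> Option<(f64, f64)> {
    if samples.len() < 2 {
        return None;
    }
    let n = samples.len() as f64;
    let (mut sx, mut sy, mut sxx, mut sxy) = (0.0, 0.0, 0.0, 0.0);
    for &(t, v) in samples {
        // Measuring time relative to `origin` keeps the sums small.
        let x = t - origin;
        sx += x;
        sy += v;
        sxx += x * x;
        sxy += x * v;
    }
    let denom = n * sxx - sx * sx;
    if denom == 0.0 {
        return None; // all samples share one timestamp
    }
    let slope = (n * sxy - sx * sy) / denom;
    let intercept = (sy - slope * sx) / n;
    Some((slope, intercept))
}

/// Per-second derivative: the slope of the fitted line.
fn deriv(samples: &[(f64, f64)]) -> Option<f64> {
    let origin = samples.first()?.0;
    linear_regression(samples, origin).map(|(slope, _)| slope)
}

/// Value predicted `t` seconds after `eval_time`.
fn predict_linear(samples: &[(f64, f64)], eval_time: f64, t: f64) -> Option<f64> {
    let (slope, intercept) = linear_regression(samples, eval_time)?;
    Some(intercept + slope * t)
}
```

For samples `(0, 1)`, `(10, 3)`, `(20, 5)` the fitted slope is 0.2 per second, so predicting 10 seconds past an evaluation time of 20 gives 7.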

Math functions

Function	Description
`abs(v)`	Absolute value
`ceil(v)`	Ceiling
`floor(v)`	Floor
`round(v)`	Round to nearest integer
`round(v, to_nearest)`	Round to nearest multiple of `to_nearest`
`sqrt(v)`	Square root
`exp(v)`	e^v
`ln(v)`	Natural logarithm
`log2(v)`	Base-2 logarithm
`log10(v)`	Base-10 logarithm
`sgn(v)`	Sign (−1, 0, or 1)
`clamp(v, min, max)`	Clamp value to `[min, max]`
`clamp_min(v, min)`	Lower-clamp
`clamp_max(v, max)`	Upper-clamp

Trigonometry

Function	Inverse
`cos(v)`	`acos(v)`
`cosh(v)`	`acosh(v)`
`sin(v)`	`asin(v)`
`sinh(v)`	`asinh(v)`
`tan(v)`	`atan(v)`
`tanh(v)`	`atanh(v)`

Function	Description
`deg(v)`	Convert radians to degrees
`rad(v)`	Convert degrees to radians
`pi()`	π as a scalar

Date and time

The calendar functions below take an optional instant vector, written `v?`. When called with no argument, they use the evaluation timestamp. When called with an instant vector, they use each sample’s timestamp. All results are in UTC.

Function	Description
`time()`	Current evaluation time in seconds since epoch (scalar)
`timestamp(v)`	Timestamp of each sample in seconds since epoch
`minute(v?)`	Minute of the hour (0–59)
`hour(v?)`	Hour of the day (0–23)
`day_of_week(v?)`	Day of the week (0=Sunday–6=Saturday)
`day_of_month(v?)`	Day of the month (1–31)
`day_of_year(v?)`	Day of the year (1–366)
`days_in_month(v?)`	Number of days in the month (28–31)
`month(v?)`	Month (1–12)
`year(v?)`	Year
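The simpler calendar fields follow directly from integer arithmetic on a Unix timestamp in seconds. The sketch below is a standalone illustration with hypothetical free functions; it is not the tsink implementation, and it leaves out the month and year functions, which need a full calendar conversion.

```rust
/// Minute of the hour (0 to 59) for a Unix timestamp in seconds, UTC.
fn minute(t: i64) -> i64 {
    t.rem_euclid(3_600) / 60
}

/// Hour of the day (0 to 23) for a Unix timestamp in seconds, UTC.
fn hour(t: i64) -> i64 {
    t.rem_euclid(86_400) / 3_600
}

/// Day of the week with 0 = Sunday and 6 = Saturday.
/// 1970-01-01 was a Thursday, hence the offset of 4.
fn day_of_week(t: i64) -> i64 {
    (t.div_euclid(86_400) + 4).rem_euclid(7)
}
```

Using `rem_euclid` and `div_euclid` keeps the results in range for timestamps before the epoch as well. For `1_700_000_000` (2023-11-14 22:13:20 UTC, a Tuesday) the three functions return 13, 22, and 2.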

Label manipulation

Function	Description
`label_replace(v, dst, repl, src, regex)`	Rewrite label `src` into `dst` using a capture-aware `regex` and `repl`
`label_join(v, dst, sep, src1, src2, ...)`	Concatenate source labels into `dst` with `sep` as separator
`drop_common_labels(v)`	Remove labels that are identical across all series in the vector

Type coercion

Function	Description
`scalar(v)`	Convert a single-element instant vector to a scalar; `NaN` if more than one element
`vector(s)`	Convert a scalar to a single-element instant vector with no labels

Sorting

Function	Description
`sort(v)`	Sort by value ascending
`sort_desc(v)`	Sort by value descending
`sort_by_label(v, l1, ...)`	Sort by the specified label names, ascending
`sort_by_label_desc(v, l1, ...)`	Sort by the specified label names, descending

Absence detection

Function	Description
`absent(v)`	Returns `{} 1` when the instant vector is empty; nothing otherwise
`absent_over_time(v[d])`	Returns `{} 1` when the range vector is empty; nothing otherwise

Miscellaneous

Function	Description
`info(v)`	Experimental: fetches info-metric labels and merges them into each series
`count_scalar(v)`	Returns the element count of a vector as a scalar

Supported features vs. standard PromQL

Feature	Supported
Instant and range queries	Yes
All arithmetic and set operators	Yes
`bool` comparison modifier	Yes
Vector matching (`on` / `ignoring` / `group_left` / `group_right`)	Yes
`offset` modifier	Yes
`@` modifier with literal timestamp, `start()`, `end()`	Yes
Subqueries `expr[range:step]`	Yes
Native histograms	Yes (float samples only for most functions; `rate` and `increase` supported)
Stale NaN markers (Prometheus compatibility)	Yes
`limitk` / `limit_ratio` aggregations (VictoriaMetrics extension)	Yes
`mad_over_time`	Yes
`double_exponential_smoothing` / `holt_winters`	Yes
`sort_by_label` / `sort_by_label_desc`	Yes
`info`	Yes (experimental)
UTF-8 / non-ASCII metric names	No — identifiers are ASCII only
Backtick string literals	No

Error types

pub enum PromqlError {
    Parse(String),                                  // invalid syntax
    UnexpectedToken { expected: String, found: String },
    UnknownFunction(String),                        // unrecognised function name
    ArgumentCount { func, expected, got },          // wrong arity
    Type(String),                                   // type mismatch at evaluation
    Eval(String),                                   // runtime evaluation error
    Regex(String),                                  // invalid regex in matcher
    Storage(TsinkError),                            // underlying storage error
}

PromqlError::Storage is constructed automatically from TsinkError via a From impl, so storage errors surface transparently through the query result.
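The effect of that From impl can be seen in a small standalone sketch. The types below are stand-ins that only mirror the names above so the example compiles on its own; `read_series` and `query` are hypothetical functions, not part of tsink.

```rust
// Stand-in for the real storage error type.
#[derive(Debug)]
struct TsinkError(String);

// Stand-in with just two of the variants listed above.
#[derive(Debug)]
enum PromqlError {
    Parse(String),
    Storage(TsinkError),
}

impl From<TsinkError> for PromqlError {
    fn from(err: TsinkError) -> Self {
        PromqlError::Storage(err)
    }
}

/// Hypothetical storage read that can fail.
fn read_series(fail: bool) -> Result<f64, TsinkError> {
    if fail {
        Err(TsinkError("disk unavailable".to_string()))
    } else {
        Ok(1.0)
    }
}

/// Hypothetical query function returning the PromQL error type.
fn query(expr: &str, fail: bool) -> Result<f64, PromqlError> {
    if expr.is_empty() {
        return Err(PromqlError::Parse("empty query".to_string()));
    }
    // `?` converts TsinkError into PromqlError::Storage through the From impl.
    let value = read_series(fail)?;
    Ok(value)
}
```

Callers can then match on a single error type and still distinguish storage failures from query problems.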

Examples

use std::sync::Arc;
use tsink::{StorageBuilder, TimestampPrecision};
use tsink::promql::Engine;

let storage = Arc::new(
    StorageBuilder::new()
        .with_timestamp_precision(TimestampPrecision::Seconds)
        .build()?
);
let engine = Engine::with_precision(Arc::clone(&storage), TimestampPrecision::Seconds);

// Timestamps and the step are expressed in the engine's precision (seconds here).
// The concrete values are illustrative.
let now = 1_700_000_000;
let (start, end, step) = (now - 3_600, now, 60);

// Instant queries
let v = engine.instant_query("up", 1_700_000_000)?;
let v = engine.instant_query(r#"http_requests_total{method="GET"}"#, now)?;
let v = engine.instant_query("rate(http_requests_total[5m])", now)?;
let v = engine.instant_query("sum by (job) (rate(errors_total[1m]))", now)?;

// Range query (returns PromqlValue::RangeVector)
let v = engine.range_query(
    "rate(http_requests_total[5m])",
    start,   // inclusive
    end,     // inclusive
    step,    // interval between evaluation points
)?;

// Parse only (no storage required)
let expr = tsink::promql::parse("histogram_quantile(0.99, rate(latency_seconds_bucket[5m]))")?;
