Architecture Overview

This document describes the high-level design of tsink, the component interactions, and the end-to-end data flow for writes and reads.

Table of Contents

  1. Deployment modes
  2. Repository layout
  3. Public API surface
  4. Storage engine overview
  5. Write path
  6. Read path
  7. Write-Ahead Log (WAL)
  8. Chunk encoding
  9. Segments and on-disk index
  10. Series registry
  11. Compaction
  12. Tiered storage and lifecycle
  13. Visibility and MVCC fence
  14. PromQL engine
  15. Rollups and downsampling
  16. Server mode
  17. Clustering and replication
  18. Security model
  19. Observability and background runtime
  20. Configuration reference summary

1. Deployment modes

tsink ships as three interconnected Rust crates in a single workspace.
  • tsink (root src/) — embeddable library: the complete storage engine with no async runtime dependency.
  • tsink-server (crates/tsink-server/) — standalone HTTP server binary with protocol ingest, clustering, and RBAC.
  • tsink-uniffi (crates/tsink-uniffi/) — UniFFI-generated Python bindings that wrap the library crate.
All three share the same engine code. The server and Python bindings call into the same Storage trait that application code uses directly.

2. Repository layout

src/
  lib.rs                 — public re-exports
  storage.rs             — Storage trait, StorageBuilder, public DTOs
  async.rs               — runtime-agnostic async façade (AsyncStorage)
  value.rs               — Value enum, NativeHistogram
  label.rs               — Label, canonical identity, stable hash
  wal.rs                 — WalSyncMode, WalReplayMode
  mmap.rs                — PlatformMmap (memmap2 wrapper)
  concurrency.rs         — Semaphore (write gate)
  cgroup.rs              — container-aware CPU/memory detection
  validation.rs          — metric/label boundary checks
  query_aggregation.rs   — Aggregation enum, downsampling
  query_matcher.rs       — CompiledSeriesMatcher (regex → postings)
  query_selection.rs     — PreparedSeriesSelection execution
  error.rs               — StorageError
  engine/                — all internal engine modules (see §4)
  promql/                — PromQL lexer, parser, AST, evaluator

crates/
  tsink-server/src/      — HTTP server, protocol adapters, cluster logic
  tsink-uniffi/src/      — UniFFI bindings

3. Public API surface

Core interface — Storage trait (src/storage.rs)

All interaction with the engine goes through the Storage trait. Key methods:
  • insert_rows(&[Row]) — synchronous batch write.
  • insert_rows_with_result — batch write returning per-row WriteResult.
  • select(metric, labels, start, end) — range read for a single series.
  • select_with_options — range read with QueryOptions (aggregation, downsampling, pagination).
  • select_all — bulk range read across all series matching a selector.
  • list_metrics — enumerate stored metric names.
  • select_series — enumerate series matching a label selector with time-range filtering.
  • scan_series_rows / scan_metric_rows — paginated scan for large result sets.
  • delete_series — tombstone-based deletion with optional time range.
  • snapshot — atomic directory snapshot.
  • add_rollup_policy / run_rollup_pass — manage persistent downsampling policies.
  • observability_snapshot — current internal counters and health state.

Key data types

  • Row — (metric: String, labels: Vec<Label>, data_point: DataPoint).
  • DataPoint — (timestamp: i64, value: Value).
  • Value — F64(f64) | I64(i64) | U64(u64) | Bool(bool) | Bytes(Vec<u8>) | Histogram(Box<NativeHistogram>).
  • Label — (name: String, value: String).
  • TimestampPrecision — Nanoseconds | Microseconds | Milliseconds | Seconds.

Async façade — AsyncStorage (src/async.rs)

AsyncStorage wraps Storage without requiring Tokio. It spawns OS threads internally:
  • One dedicated write worker receives WriteCommand messages over an async-channel.
  • A pool of read workers (sized by cgroup::default_workers_limit()) handles ReadCommand messages concurrently.
  • Uses parking_lot::Mutex and std::thread — suitable for embedding in any async runtime.
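The dedicated-write-worker pattern can be sketched in plain std Rust. This is a minimal illustration of the design, not tsink's actual code: the `WriteCommand` variants and the batch payload here are hypothetical stand-ins, and `std::sync::mpsc` stands in for the async-channel used by the real façade.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical command type standing in for tsink's WriteCommand.
enum WriteCommand {
    Insert(Vec<i64>), // stand-in for a batch of rows
    Shutdown,
}

// Spawn one dedicated write worker that drains commands from a channel,
// mirroring the single-write-worker pattern described above.
fn spawn_write_worker() -> (mpsc::Sender<WriteCommand>, thread::JoinHandle<usize>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        let mut total_points = 0usize;
        while let Ok(cmd) = rx.recv() {
            match cmd {
                WriteCommand::Insert(batch) => total_points += batch.len(),
                WriteCommand::Shutdown => break,
            }
        }
        total_points
    });
    (tx, handle)
}

fn main() {
    let (tx, handle) = spawn_write_worker();
    tx.send(WriteCommand::Insert(vec![1, 2, 3])).unwrap();
    tx.send(WriteCommand::Insert(vec![4])).unwrap();
    tx.send(WriteCommand::Shutdown).unwrap();
    assert_eq!(handle.join().unwrap(), 4);
}
```

Because the worker owns its state and communicates only via the channel, any async runtime can await the channel's completion without the engine depending on that runtime.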

4. Storage engine overview

The engine lives in src/engine/ and is structured in three layers.

Layer 1 — Foundation libraries

Pure data-structure and codec modules with no cross-cutting concerns:
  • chunk — in-memory chunk builder and sealed EncodedChunk.
  • encoder — timestamp and value codec selection and execution.
  • segment — on-disk segment format, reader, writer, index postings.
  • series — series identity, interned string dictionaries, Roaring Bitmap postings.
  • wal — WAL frame format, segmented log files, replay stream.
  • index — in-memory ChunkIndex and PostingsIndex.
  • query — core query primitives: QueryPlan, ChunkSeriesCursor, chunk decoding.
  • compactor — L0/L1/L2 compaction planning and execution.

Layer 2 — Shared shell

State containers and cross-cutting services that the owner modules operate on:
  • engine.rs — ChunkStorage, the top-level struct with five state buckets.
  • state — all internal state type definitions.
  • visibility — MVCC fence and per-series visibility summaries.
  • runtime — background thread management and supervision.
  • metrics / observability — lockless counters and snapshot generation.
  • shard_routing — maps series IDs to 1-of-64 shards; deadlock-safe lock ordering.
  • metadata_lookup — distributed-shard series resolution.
  • process_lock — exclusive-open file lock.

Layer 3 — Owner modules

Higher-level operations that drive the shell:
  • bootstrap — six-phase startup: discovery → recovery → WAL open → hydrate → replay → finalize.
  • ingest + ingest_pipeline — staged write pipeline (resolve → prepare → apply → commit).
  • write_buffer — active-head flush policies.
  • lifecycle — flush, persist, replay, and shutdown orchestration.
  • query_exec + query_read — high-level query execution and low-level read path.
  • deletion — tombstone-based series deletion.
  • rollups — persistent downsampling materialization.
  • maintenance + tiering + registry_catalog — tier lifecycle, catalog validation.

State buckets in ChunkStorage

To contain lock contention, ChunkStorage organises mutable state into five separate structs:
  • CatalogState — SeriesRegistry, 64 write_txn_shards (Mutex), pending series IDs, persistence lock.
  • ChunkBufferState — 64-shard active builders, 64-shard sealed chunks, persisted-chunk watermarks, monotone chunk sequence counter.
  • VisibilityState — tombstones, materialized series, per-series visibility summaries, flush visibility RwLock.
  • PersistedStorageState — PersistedIndexState, WAL handle, compactors, tiered-storage config, pending segment diff.
  • RuntimeConfigState — retention/partition/skew windows, memory budget, cardinality limit, WAL size limit, concurrency bounds.

5. Write path

A single insert_rows call traverses six steps before returning to the caller.
insert_rows(&[Row])


[1] Semaphore acquire          (max_writers permits, write_timeout deadline)


[2] Shard lock acquisition     (minimal set of write_txn_shards, ascending order)


[3] WriteResolver              — validate labels, resolve or mint SeriesId,
    │                            append WAL SeriesDefinitionFrame for new series

[4] WritePreparer              — normalise timestamps, reject future skew,
    │                            reject below retention floor,
    │                            admission control: memory budget, cardinality limit, WAL size

[5] WriteApplier               — append ChunkPoint to ActivePartitionHead (ChunkBuilder),
    │                            append WAL SamplesBatchFrame

[6] WriteCommitter             — seal full chunks, publish visibility,
                                 update tombstone/rollup cross-references,
                                 notify background flush/compaction workers
Each Row carries a metric, Vec<Label>, and a DataPoint. The 64-shard split means that independent series in the same batch can be admitted concurrently. Admission control in the prepare phase enforces:
  • memory_budget_bytes — total in-memory chunk bytes.
  • cardinality_limit — maximum unique series count.
  • WAL size limit — maximum outstanding un-flushed WAL bytes.
Admission failures surface as per-Row WriteResults when calling insert_rows_with_result; plain insert_rows drops rejected rows silently.
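The three admission checks in the prepare phase amount to a simple gate. The sketch below is illustrative only: the struct and field names (`AdmissionState`, `Limits`, `Admission`) are hypothetical, not tsink's internal types, and the check ordering is an assumption.

```rust
// Hypothetical state and limit types mirroring the three limits above.
struct AdmissionState { in_memory_bytes: u64, unique_series: usize, wal_bytes: u64 }
struct Limits { memory_budget_bytes: u64, cardinality_limit: usize, wal_size_limit: u64 }

#[derive(Debug, PartialEq)]
enum Admission { Admitted, OverMemoryBudget, OverCardinalityLimit, OverWalLimit }

// Check a candidate batch against the budget, cardinality, and WAL limits.
fn admit(state: &AdmissionState, limits: &Limits, batch_bytes: u64, new_series: usize) -> Admission {
    if state.in_memory_bytes + batch_bytes > limits.memory_budget_bytes {
        Admission::OverMemoryBudget
    } else if state.unique_series + new_series > limits.cardinality_limit {
        Admission::OverCardinalityLimit
    } else if state.wal_bytes + batch_bytes > limits.wal_size_limit {
        Admission::OverWalLimit
    } else {
        Admission::Admitted
    }
}

fn main() {
    let limits = Limits { memory_budget_bytes: 1024, cardinality_limit: 10, wal_size_limit: 4096 };
    let nearly_full = AdmissionState { in_memory_bytes: 1000, unique_series: 3, wal_bytes: 0 };
    assert_eq!(admit(&nearly_full, &limits, 100, 1), Admission::OverMemoryBudget);
    let at_cardinality = AdmissionState { in_memory_bytes: 0, unique_series: 10, wal_bytes: 0 };
    assert_eq!(admit(&at_cardinality, &limits, 100, 1), Admission::OverCardinalityLimit);
}
```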

6. Read path

select / select_all / select_with_options


[1] candidate_planner          — resolve metric + label matchers → Vec<SeriesId>
    │                            via PostingsIndex (postings bitmaps, regex → finite literal set)

[2] TieredQueryPlan            — compute which tiers to include (hot-only if within hot window,
    │                            else hot + warm + cold)

[3] QuerySnapshotContext       — take visibility-fenced snapshot:
    │                            acquire flush_visibility_lock (read)
    │                            snapshot active chunks, sealed chunks, persisted index
    │                            release lock

[4] Per-series read            — for each SeriesId:
    │    a. Scan active ChunkBuilders (in-memory, unsorted tail)
    │    b. Scan sealed EncodedChunks (flushed from active, pending segment write)
    │    c. Scan PersistedIndexState (mmap'd segment files on disk, hot tier)
    │    d. Optionally scan warm/cold tiers (object store or additional paths)

[5] Merge and decode           — ChunkSeriesCursor binary-searches each chunk's
    │                            TimestampSearchIndex, decodes only the needed range,
    │                            merge-sorts streams from all sources

[6] Aggregation / downsampling — optional bucket aggregation via query_aggregation


[7] Return SeriesPoints
The flush_visibility_lock is the consistency fence: writers hold a write lock during flush publication; readers hold a shared read lock for the duration of snapshot capture, guaranteeing a consistent view.
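The fence semantics can be shown with std's RwLock. This is a minimal sketch of the pattern, assuming a hypothetical `Published` struct; the point is that the two related pieces of state are always published and snapshotted together.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Two pieces of state that must change atomically from a reader's view.
struct Published {
    segments: Vec<u64>,
    visible_through: u64,
}

// Readers take a shared lock for the duration of snapshot capture (step [3]).
fn snapshot(fence: &RwLock<Published>) -> (usize, u64) {
    let guard = fence.read().unwrap();
    (guard.segments.len(), guard.visible_through)
}

// The flusher holds the exclusive lock for the whole publication, so no
// reader can ever see the segment list and watermark out of sync.
fn publish_flush(fence: &RwLock<Published>, new_segment: u64) {
    let mut guard = fence.write().unwrap();
    guard.segments.push(new_segment);
    guard.visible_through = new_segment;
}

fn main() {
    let fence = Arc::new(RwLock::new(Published { segments: vec![], visible_through: 0 }));
    let flusher = { let f = fence.clone(); thread::spawn(move || publish_flush(&f, 42)) };
    flusher.join().unwrap();
    assert_eq!(snapshot(&fence), (1, 42));
}
```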

7. Write-Ahead Log (WAL)

The WAL lives under <data_path>/wal/ and is managed by the FramedWal struct.

Frame wire format

Offset  Size  Field
     0     4  Magic = b"TSFR"
     4     1  Frame type (1 = SeriesDef, 2 = Samples)
     5     4  CRC-32 of payload
     9     4  Payload length (max 64 MiB)
    13    11  Sequence number (monotone u64 + padding)
    24     N  Payload
  • SeriesDefinitionFrame — written once per new series: (series_id, metric, labels).
  • SamplesBatchFrame — written per chunk flush: (series_id, lane, timestamp_codec_id, value_codec_id, point_count, base_timestamp, encoded_timestamps, encoded_values).
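The frame layout above can be encoded and decoded in a few lines. This sketch assumes little-endian field encoding and the standard IEEE CRC-32 polynomial, both of which are assumptions about tsink's actual wire format:

```rust
// Bitwise CRC-32 (IEEE polynomial); an assumption about tsink's variant.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &b in data {
        crc ^= b as u32;
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

fn encode_frame(frame_type: u8, seq: u64, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(24 + payload.len());
    out.extend_from_slice(b"TSFR");                               // offset 0: magic
    out.push(frame_type);                                         // offset 4: type
    out.extend_from_slice(&crc32(payload).to_le_bytes());         // offset 5: CRC-32
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes()); // offset 9: length
    out.extend_from_slice(&seq.to_le_bytes());                    // offset 13: sequence
    out.extend_from_slice(&[0u8; 3]);                             // padding to offset 24
    out.extend_from_slice(payload);
    out
}

// Returns (frame_type, sequence, payload) if the magic and CRC check out.
fn decode_frame(buf: &[u8]) -> Option<(u8, u64, &[u8])> {
    if buf.len() < 24 || &buf[0..4] != b"TSFR" { return None; }
    let crc = u32::from_le_bytes(buf[5..9].try_into().ok()?);
    let len = u32::from_le_bytes(buf[9..13].try_into().ok()?) as usize;
    let seq = u64::from_le_bytes(buf[13..21].try_into().ok()?);
    let payload = buf.get(24..24 + len)?;
    (crc32(payload) == crc).then(|| (buf[4], seq, payload))
}

fn main() {
    let frame = encode_frame(2, 7, b"samples");
    let (ty, seq, payload) = decode_frame(&frame).unwrap();
    assert_eq!((ty, seq, payload), (2, 7, b"samples".as_slice()));
}
```

A flipped payload bit fails the CRC check and `decode_frame` returns None, which is what lets replay detect torn or corrupted frames.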

Durability modes

  • WalSyncMode::PerAppend — fsync on every frame append. Crash-safe; throughput-limited.
  • WalSyncMode::Periodic(d) — background fsync every duration d. Higher throughput; up to d of data can be lost on a hard crash.

Replay modes

  • WalReplayMode::Strict — abort if any WAL segment is corrupted or unreadable.
  • WalReplayMode::Salvage — skip damaged segments and continue replay.

High-water mark

WalHighWatermark { segment: u64, frame: u64 } tracks how far replay has been committed. Each persisted segment stores the WAL high-water mark at the time it was written, so recovery knows which WAL frames to replay and which to skip. The published high-water file (wal.published) uses magic b"TSHW".

8. Chunk encoding

Chunk lifecycle

ChunkBuilder (active)
    │  accumulates ChunkPoints in 64-point frozen blocks + mutable tail
    │  sealed when point count reaches DEFAULT_CHUNK_POINTS (2048)

EncodedChunk (sealed)
    │  raw points cleared; only encoded payload survives
    │  held in sealed_chunks until background flush

SegmentWriter → persisted segment file

Timestamp codecs

  • FixedStepRle — constant-interval series (e.g. scrape-interval metrics).
  • DeltaVarint — variable-interval series with small deltas.
  • DeltaOfDeltaBitpack — Gorilla-style double-delta with bit-packing for irregular series.
The encoder auto-selects the best codec via choose_best_timestamp_codec after inspecting the actual delta distribution.
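A toy version of this selection makes the idea concrete. The thresholds and the codec-choice heuristic below are illustrative assumptions, not tsink's real rules; only the three codec names come from the table above.

```rust
#[derive(Debug, PartialEq)]
enum TsCodec { FixedStepRle, DeltaVarint, DeltaOfDeltaBitpack }

// Inspect the delta distribution and pick a codec (illustrative thresholds).
fn choose_timestamp_codec(timestamps: &[i64]) -> TsCodec {
    let deltas: Vec<i64> = timestamps.windows(2).map(|w| w[1] - w[0]).collect();
    if deltas.windows(2).all(|w| w[0] == w[1]) {
        TsCodec::FixedStepRle // constant interval
    } else if deltas.iter().all(|d| d.unsigned_abs() < (1 << 14)) {
        TsCodec::DeltaVarint // small, variable deltas
    } else {
        TsCodec::DeltaOfDeltaBitpack // irregular series
    }
}

// Double-delta stream: for near-regular series the second differences are
// mostly zero, which is why they bit-pack so well.
fn delta_of_deltas(timestamps: &[i64]) -> Vec<i64> {
    let deltas: Vec<i64> = timestamps.windows(2).map(|w| w[1] - w[0]).collect();
    deltas.windows(2).map(|w| w[1] - w[0]).collect()
}

fn main() {
    // A perfect scrape-interval series collapses to run-length encoding.
    assert_eq!(choose_timestamp_codec(&[0, 10, 20, 30]), TsCodec::FixedStepRle);
    // One jittered interval produces a single nonzero double-delta pair.
    assert_eq!(delta_of_deltas(&[0, 10, 20, 31, 41]), vec![0, 1, -1]);
}
```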

Value codecs

  • GorillaXorF64 — XOR floating-point compression (Gorilla paper).
  • ZigZagDeltaBitpackI64 — signed integer delta + bit-packing.
  • DeltaBitpackU64 — unsigned integer delta + bit-packing.
  • ConstantRle — run-length encoding for constant-value series.
  • BoolBitpack — 1-bit packing for boolean streams.
  • BytesDeltaBlock — block delta encoding for byte-string payloads.

Timestamp search index

Each sealed chunk builds a TimestampSearchIndex — an array of anchor entries every 64 points. Queries binary-search this index to seek directly into the encoded payload without full decompression, giving O(log N) range access.
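The anchor-index idea can be sketched as follows; the `Anchor` struct and seek logic here are illustrative, assuming one entry per 64 points as described above.

```rust
// One anchor per 64 points: the block's first timestamp and point offset.
struct Anchor { first_ts: i64, point_offset: usize }

fn build_index(timestamps: &[i64]) -> Vec<Anchor> {
    timestamps.iter().step_by(64).enumerate()
        .map(|(i, &ts)| Anchor { first_ts: ts, point_offset: i * 64 })
        .collect()
}

// Binary-search the anchors to find the point offset at which decoding
// should begin for a range query starting at start_ts.
fn seek(index: &[Anchor], start_ts: i64) -> usize {
    match index.binary_search_by_key(&start_ts, |a| a.first_ts) {
        Ok(i) => index[i].point_offset,
        Err(0) => 0,
        Err(i) => index[i - 1].point_offset, // start falls inside the previous block
    }
}

fn main() {
    let timestamps: Vec<i64> = (0..256).map(|i| i * 10).collect(); // 256 points
    let index = build_index(&timestamps);
    assert_eq!(index.len(), 4);        // one anchor per 64 points
    assert_eq!(seek(&index, 645), 64); // ts 645 lands in the second block
    assert_eq!(seek(&index, 0), 0);
}
```

Only the block containing the seek target (and those after it, up to the range end) need to be decompressed.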

9. Segments and on-disk index

Directory structure

<data_path>/
  lane_numeric/          — float/int/bool/histogram chunks
    l0/                  — freshly flushed segments
    l1/                  — first compaction level
    l2/                  — final compaction level
  lane_blob/             — bytes-value chunks (same level layout)
  wal/                   — WAL segment files
  series_index.bin       — binary RIDX v2 series registry snapshot
  series_index.delta.d/  — incremental series registry delta checkpoints
  series_index.catalog.json — fingerprint catalog for fast registry reload
  tombstones.json.store/ — sharded binary tombstone store (256 shards)
  .rollups/              — rollup state JSON (policies.json, state.json)
  .process.lock          — exclusive-open process lock

Segment manifest

Each segment directory contains:
  • Encoded chunk files (one per series × value lane).
  • A SegmentManifest (segment_id, level, chunk/point/series counts, min/max timestamps, wal_highwater).
  • A SegmentPostingsIndex for label/metric lookups within the segment.
  • Checksums; corrupted segments are quarantined automatically.

In-memory index

  • ChunkIndex — a flat sorted Vec<ChunkIndexEntry>, binary-searched for time-range queries. Each entry stores (series_id, min_ts, max_ts, chunk_offset, chunk_len, point_count, lane, codec IDs, level).
  • PostingsIndex — a BTreeMap<label_name → BTreeMap<label_value → BTreeSet<SeriesId>>> used to resolve label-matcher queries to candidate series lists.
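The postings lookup amounts to intersecting per-label sets. A minimal sketch of equality-matcher resolution against the nested-BTreeMap shape above (the `resolve` helper is hypothetical; real matchers also cover regex and negation):

```rust
use std::collections::{BTreeMap, BTreeSet};

type SeriesId = u64;
type PostingsIndex = BTreeMap<String, BTreeMap<String, BTreeSet<SeriesId>>>;

// Resolve a conjunction of label equality matchers by intersecting the
// postings set of each (name, value) pair.
fn resolve(index: &PostingsIndex, matchers: &[(&str, &str)]) -> BTreeSet<SeriesId> {
    let mut result: Option<BTreeSet<SeriesId>> = None;
    for (name, value) in matchers {
        let postings = index.get(*name)
            .and_then(|vals| vals.get(*value))
            .cloned()
            .unwrap_or_default();
        result = Some(match result {
            None => postings,
            Some(acc) => acc.intersection(&postings).copied().collect(),
        });
    }
    result.unwrap_or_default()
}

fn main() {
    let mut index = PostingsIndex::new();
    index.entry("job".into()).or_default().entry("api".into()).or_default().extend([1, 2, 3]);
    index.entry("env".into()).or_default().entry("prod".into()).or_default().extend([2, 3, 4]);
    let hits = resolve(&index, &[("job", "api"), ("env", "prod")]);
    assert_eq!(hits.into_iter().collect::<Vec<_>>(), vec![2, 3]);
}
```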

10. Series registry

The registry maps (metric, labels) pairs to stable numeric SeriesId values and maintains postings for fast label-based lookup.

Internals

  • 64-shard split — series_shards, metric_postings_shards, label_postings_shards, all_series_shards; each shard protected by an independent RwLock. Reads and writes on different series never contend.
  • Interned string dictionaries — metric_dict, label_name_dict, label_value_dict store unique strings once; DictionaryId (u32) is used internally. This reduces memory for high-cardinality label sets.
  • Roaring Bitmaps — postings lists are RoaringTreemaps: compact, with fast union/intersection for label-matcher resolution.
  • SeriesValueFamily — inferred at first write (F64 | I64 | U64 | Bool | Blob | Histogram); used to select the appropriate codec path for all subsequent writes.

Series identity

canonical_series_identity(metric, labels) produces a deterministic binary key: labels are sorted lexicographically, then length-prefixed and concatenated with the metric name. stable_series_identity_hash applies xxh64 to this key for O(1) shard dispatch.
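The key construction can be sketched directly from that description. Note the hedges: the exact length-prefix width and field order are assumptions, and `DefaultHasher` stands in for xxh64 (which lives in an external crate).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Sort labels, length-prefix every component, append the metric name.
// Prefix width (u32 little-endian) and field order are assumptions.
fn canonical_series_identity(metric: &str, labels: &[(String, String)]) -> Vec<u8> {
    let mut sorted: Vec<_> = labels.to_vec();
    sorted.sort();
    let mut key = Vec::new();
    for (name, value) in &sorted {
        key.extend_from_slice(&(name.len() as u32).to_le_bytes());
        key.extend_from_slice(name.as_bytes());
        key.extend_from_slice(&(value.len() as u32).to_le_bytes());
        key.extend_from_slice(value.as_bytes());
    }
    key.extend_from_slice(&(metric.len() as u32).to_le_bytes());
    key.extend_from_slice(metric.as_bytes());
    key
}

// Hash the canonical key for O(1) dispatch to 1 of 64 shards.
fn shard_for(metric: &str, labels: &[(String, String)]) -> u64 {
    let mut h = DefaultHasher::new(); // stand-in for xxh64
    h.write(&canonical_series_identity(metric, labels));
    h.finish() % 64
}

fn main() {
    let a = vec![("zone".to_string(), "a".to_string()), ("app".to_string(), "web".to_string())];
    let b = vec![("app".to_string(), "web".to_string()), ("zone".to_string(), "a".to_string())];
    // Label order in the input does not matter: same identity, same shard.
    assert_eq!(canonical_series_identity("http_requests", &a),
               canonical_series_identity("http_requests", &b));
    assert_eq!(shard_for("http_requests", &a), shard_for("http_requests", &b));
}
```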

Persistence

The registry is persisted as a binary RIDX v2 file (series_index.bin) with incremental delta checkpoints in series_index.delta.d/. On startup, registry_catalog.rs validates per-segment xxh64 fingerprints against the catalog; a mismatch triggers a full registry rebuild from segment postings files.

11. Compaction

tsink uses a three-level LSM-inspired compaction scheme.
Active chunks
    │  background flush (250 ms default)

L0 segments    ← trigger: 4 L0 segments → compact → L1

L1 segments    ← trigger: 4 L1 segments → compact → L2

L2 segments    (cold-compacted, largest, longest time span)
Each compaction pass:
  1. Planning (compactor/planning.rs) — selects a window of up to 8 source segments at the triggering level.
  2. Execution (compactor/execution.rs) — merge-sorts all chunks, deduplicates overlapping samples, re-encodes with optimal codecs, writes new target-level segment.
  3. Atomic replacement — the replacement is staged in .compaction-replacements/ and renamed into place. If the process crashes mid-replacement, finalize_pending_compaction_replacements cleans up the staging area at next startup.
Compaction is tombstone-aware: tombstoned time ranges are excluded from the output segment, physically removing deleted data from disk.
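The merge-and-dedupe step can be illustrated on two sorted sample streams. This is a simplified sketch with one assumption spelled out in the comments: on a timestamp collision, the later-written (newer) segment's sample wins.

```rust
// Merge-sort two sorted (timestamp, value) streams from overlapping
// segments; on duplicate timestamps the newer segment's sample is kept.
fn merge_dedupe(older: &[(i64, f64)], newer: &[(i64, f64)]) -> Vec<(i64, f64)> {
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::with_capacity(older.len() + newer.len());
    while i < older.len() && j < newer.len() {
        if older[i].0 < newer[j].0 {
            out.push(older[i]); i += 1;
        } else if older[i].0 > newer[j].0 {
            out.push(newer[j]); j += 1;
        } else {
            out.push(newer[j]); i += 1; j += 1; // duplicate: newer wins (assumption)
        }
    }
    out.extend_from_slice(&older[i..]);
    out.extend_from_slice(&newer[j..]);
    out
}

fn main() {
    let l0_a = [(10, 1.0), (20, 2.0)];
    let l0_b = [(20, 2.5), (30, 3.0)];
    assert_eq!(merge_dedupe(&l0_a, &l0_b), vec![(10, 1.0), (20, 2.5), (30, 3.0)]);
}
```

The real pass runs this merge per series across up to 8 source segments, drops tombstoned ranges, then re-encodes the merged stream.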

12. Tiered storage and lifecycle

Tiers

  • Hot — local filesystem (lane_numeric/, lane_blob/); direct mmap reads.
  • Warm — configurable local path or object store; loaded on demand.
  • Cold — object store; loaded on demand, with a full fetch before query.

Tier lifecycle

A background maintenance pass computes a PostFlushMaintenancePolicyPlan from RetentionTierPolicy:
  • expired_actions — delete segments that fall outside the retention window.
  • rewrite_actions — trim segment boundaries at time-split points.
  • move_actions — promote segments from Hot → Warm or Warm → Cold when they cross the configured retention cutoffs.

Query tier selection

TieredQueryPlan { start, end, include_warm, include_cold } is computed from the query time range vs. tier cutoffs. A query entirely within the hot window skips all remote tier I/O.
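The tier decision is a pair of range comparisons. The sketch below mirrors the TieredQueryPlan field names from the text, but the cutoff semantics (a tier is included when the query start reaches below that tier's boundary) are an assumption:

```rust
#[derive(Debug, PartialEq)]
struct TieredQueryPlan { start: i64, end: i64, include_warm: bool, include_cold: bool }

// Include a tier only when the query range reaches below its cutoff.
fn plan(start: i64, end: i64, warm_cutoff: i64, cold_cutoff: i64) -> TieredQueryPlan {
    TieredQueryPlan {
        start,
        end,
        include_warm: start < warm_cutoff, // range dips below the hot window
        include_cold: start < cold_cutoff, // range dips below the warm window
    }
}

fn main() {
    // Hot data begins at t=1000, warm at t=500.
    // A query entirely within the hot window touches no remote tier at all.
    let p = plan(1200, 1500, 1000, 500);
    assert_eq!((p.include_warm, p.include_cold), (false, false));
    // A query reaching back past both cutoffs fans out to every tier.
    let p = plan(300, 1500, 1000, 500);
    assert_eq!((p.include_warm, p.include_cold), (true, true));
}
```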

13. Visibility and MVCC fence

The problem

A flush publishes many new segment entries and visibility summaries at once. Without a fence, a reader could see a partially-published flush: some new segments visible, others not.

The solution

flush_visibility_lock is a RwLock<()>:
  • Writers acquire the exclusive write lock for the entire duration of flush publication.
  • Readers acquire a shared read lock during snapshot capture (step 3 of the read path).
This is a brief, bounded mutual-exclusion window — not a per-operation lock. Normal write ingest does not touch this lock.

Per-series visibility summaries

SeriesVisibilitySummary tracks up to SERIES_VISIBILITY_SUMMARY_MAX_RANGES = 32 disjoint time ranges per series. This handles series with gaps (e.g. intermittently reporting nodes) efficiently: readers can skip time ranges that are known to be empty without scanning segment files. Fields per summary:
  • latest_visible_timestamp — highest timestamp ingested for this series.
  • latest_bounded_visible_timestamp — highest timestamp whose containing chunk has been sealed and persisted.
  • exhaustive_floor_inclusive — data below this timestamp is complete (no more writes expected).
  • truncated_before_floor — all data before this point has been tombstoned.
  • ranges — list of (min, max) tuples defining continuous data windows.
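Maintaining the capped range list can be sketched as follows. The merge rule (fuse adjacent or overlapping windows) follows from the description above; the over-cap policy shown here (fuse the pair with the smallest gap, lossily widening the summary) is an assumption about how the 32-entry cap might be enforced.

```rust
const MAX_RANGES: usize = 32; // SERIES_VISIBILITY_SUMMARY_MAX_RANGES

// Insert a (min, max) window, merging adjacent/overlapping ranges and
// enforcing the cap by fusing the closest pair (a lossy widening).
fn add_range(ranges: &mut Vec<(i64, i64)>, min: i64, max: i64) {
    ranges.push((min, max));
    ranges.sort();
    let mut merged: Vec<(i64, i64)> = Vec::new();
    for r in ranges.drain(..) {
        match merged.last_mut() {
            Some(last) if r.0 <= last.1 + 1 => last.1 = last.1.max(r.1),
            _ => merged.push(r),
        }
    }
    while merged.len() > MAX_RANGES {
        let i = (0..merged.len() - 1)
            .min_by_key(|&i| merged[i + 1].0 - merged[i].1)
            .unwrap();
        merged[i].1 = merged[i + 1].1;
        merged.remove(i + 1);
    }
    *ranges = merged;
}

fn main() {
    let mut ranges = vec![(0, 10), (20, 30)];
    add_range(&mut ranges, 11, 15); // adjacent to (0, 10): merges into one window
    assert_eq!(ranges, vec![(0, 15), (20, 30)]);
}
```

A reader can then answer "is [t1, t2] known-empty?" with a binary search over the range list instead of touching any segment file.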

14. PromQL engine

tsink includes a full in-process PromQL evaluator in src/promql/.

Components

  • lexer.rs — hand-written lexer; produces Vec<Token>.
  • parser.rs — recursive-descent Pratt parser; produces ast::Expr.
  • ast.rs — complete AST: VectorSelector, MatrixSelector, SubqueryExpr, BinaryExpr, AggregationExpr, CallExpr, AtModifier, all operators.
  • eval/ — evaluator; implements all PromQL functions and aggregations.

Evaluator

Engine { storage, default_lookback_delta, timestamp_units_per_second }:
  • instant_query(query, time) — evaluates at a single timestamp.
  • range_query(query, start, end, step) — vectorized evaluation over a step range.
A PrefetchCache is keyed by metric name and populated via select_all before evaluation, amortising storage round-trips across all eval timestamps in a range query. Supported features: rate, irate, increase, delta, histogram_quantile, all standard aggregations (sum, avg, count, min, max, topk, bottomk, quantile), binary operators with on/ignoring/group_left/group_right, subqueries, @ modifier.
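As a flavor of what the evaluator computes, here is a heavily simplified rate(): increase between the first and last sample in the window, divided by elapsed seconds. Real PromQL rate() additionally handles counter resets and window-boundary extrapolation, which this sketch deliberately omits.

```rust
// Simplified rate() over one series' samples: (timestamp_seconds, value).
// Ignores counter resets and extrapolation for brevity.
fn simple_rate(samples: &[(i64, f64)]) -> Option<f64> {
    let (first, last) = (samples.first()?, samples.last()?);
    let elapsed = (last.0 - first.0) as f64;
    if elapsed <= 0.0 {
        return None; // need at least two samples spanning some time
    }
    Some((last.1 - first.1) / elapsed)
}

fn main() {
    // A counter going 100 → 160 over 30 s yields 2 requests/second.
    let samples = [(0, 100.0), (15, 130.0), (30, 160.0)];
    assert_eq!(simple_rate(&samples), Some(2.0));
    assert_eq!(simple_rate(&[(0, 1.0)]), None);
}
```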

15. Rollups and downsampling

Rollup policies define persistent, automated downsampling of raw data.

Policy structure

RollupPolicy:
  • policy_id — stable identifier.
  • Source selector — metric name + optional label matchers.
  • aggregation — Sum | Min | Max | Avg | Count | ....
  • resolution — output bucket width.
  • Retention rules — when to expire rolled-up data.

Materialization

Materialized series are stored under synthetic metric names of the form __tsink_rollup__:<policy_id>:<original_metric>. A rollup worker runs every 5 seconds, reads raw data for each pending policy/source pair, computes aggregated buckets, and writes the result back via a normal insert_rows call. Rollup state is checkpointed as JSON files in .rollups/ with per-policy per-source progress cursors. When a tombstone is written to a source series, any rolled-up data covering the deleted range is invalidated and scheduled for re-materialization.

16. Server mode

The tsink-server crate wraps the library engine in a full HTTP server.

Protocol support

  • Prometheus Remote Write — POST /api/v1/write (Snappy-framed protobuf).
  • Prometheus Remote Read — POST /api/v1/read.
  • Prometheus Text Exposition — POST /api/v1/import/prometheus.
  • InfluxDB Line Protocol v1/v2 — POST /write, POST /api/v2/write.
  • OTLP HTTP — POST /v1/metrics (protobuf; gauges, sums, histograms, summaries).
  • StatsD — UDP (counter, gauge, timer, set).
  • Graphite — TCP (plaintext).
  • PromQL instant query — GET /api/v1/query.
  • PromQL range query — GET /api/v1/query_range.

Connection handling

  • MAX_CONNECTIONS = 1024 simultaneous TCP connections.
  • KEEP_ALIVE_TIMEOUT = 30 s, HANDSHAKE_TIMEOUT = 10 s, SHUTDOWN_GRACE_PERIOD = 10 s.
  • TLS via tokio-rustls; HTTP/1.1 framing.

Multi-tenancy

Tenant identity is propagated via the x-tsink-tenant HTTP header and injected as a __tsink_tenant__ synthetic label on all writes and reads. Per-tenant admission semaphores govern ingest, query, metadata, and retention concurrency independently.

17. Clustering and replication

Clustering is an opt-in layer over the single-node server. All cluster logic lives in crates/tsink-server/src/cluster/.

Topology

Client
  │  writes / reads

Any cluster node (HTTP)

  ├── WriteRouter        — splits batch by consistent hash ring → local + remote shards
  │     │
  │     ├── local write  — directly into local ChunkStorage
  │     └── remote write — parallel RPC fanout to replica owners

  └── ReadFanoutExecutor — fans query to owner shards, merges responses

Consistent hash ring (cluster/ring.rs)

ShardRing { shard_count, replication_factor, virtual_nodes_per_node = 128 }:
  • Each node contributes 128 virtual tokens distributed by xxh64 hash around the ring.
  • Each shard is pre-assigned to replication_factor owner nodes.
  • stable_series_identity_hash(metric, labels) → deterministic shard selection — the same series always lands on the same shard regardless of which node receives the write.
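The ring mechanics can be sketched with a BTreeMap keyed by token. This is a minimal illustration, not the cluster code: `DefaultHasher` stands in for xxh64, 8 tokens per node stand in for 128, and ownership is resolved by walking clockwise to the next token (wrapping at the end of the ring).

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::Hasher;

fn token(input: &str) -> u64 {
    let mut h = DefaultHasher::new(); // stand-in for xxh64
    h.write(input.as_bytes());
    h.finish()
}

// Each node contributes `tokens_per_node` virtual tokens around the ring.
fn build_ring(nodes: &[&str], tokens_per_node: u32) -> BTreeMap<u64, String> {
    let mut ring = BTreeMap::new();
    for node in nodes {
        for i in 0..tokens_per_node {
            ring.insert(token(&format!("{node}#{i}")), node.to_string());
        }
    }
    ring
}

// Walk clockwise from the key's hash to the first token; wrap around.
fn owner<'a>(ring: &'a BTreeMap<u64, String>, key: &str) -> &'a str {
    let h = token(key);
    ring.range(h..)
        .next()
        .or_else(|| ring.iter().next())
        .map(|(_, node)| node.as_str())
        .unwrap()
}

fn main() {
    let ring = build_ring(&["node-a", "node-b", "node-c"], 8);
    // The same series key deterministically maps to the same owner,
    // regardless of which node computes the lookup.
    assert_eq!(owner(&ring, "http_requests{job=api}"),
               owner(&ring, "http_requests{job=api}"));
}
```

Virtual tokens spread each node's ownership into many small arcs, so adding or removing a node moves only a proportional slice of shards instead of reshuffling everything.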

Write routing (cluster/replication.rs)

WriteRouter decomposes an incoming batch into a WritePlan { local_rows, remote_batches }. Remote batches are sent in parallel via tokio::task::JoinSet (max 32 in-flight batches, max 1024 rows per batch). Failed remote writes are enqueued to the HintedHandoffOutbox for retry. Consistency levels — ClusterWriteConsistency { One | Quorum | All } — control how many replicas must acknowledge before the write is considered successful.

Control plane (cluster/consensus.rs)

A custom Raft-like log replication layer manages cluster membership and control state:
  • Tick interval: 2 s; max append entries: 64; snapshot every 128 entries.
  • Leader election with suspect timeout: 6 s; dead timeout: 20 s; leader lease: 6 s.
  • Control log persisted as NDJSON to tsink-control-log.

Anti-entropy repair (cluster/repair.rs)

DigestExchangeRuntime runs background anti-entropy:
  • Every 30 s, each node exchanges window digest hashes with its peers for up to 64 shards per tick.
  • A digest mismatch triggers an InternalRepairBackfillRequest to re-replicate missing data.
  • Rebalance moves shard ownership between nodes at 5 s intervals with configurable rows-per-tick limits.

Deduplication

DedupeWindowStore provides idempotency-key deduplication to prevent re-ingestion of retried writes during hinted handoff re-delivery.

18. Security model

TLS

All external HTTP and peer-to-peer traffic can be protected by TLS. Certificates are loaded via tokio-rustls and support hot rotation without server restart via SecurityManager. Internal cluster traffic uses mTLS with a dedicated cluster CA to authenticate peer nodes.

Authentication

Two methods are supported:
  1. Bearer tokens — loaded from a file or via an exec command; verified on every request.
  2. OIDC JWT — RS256, ES256, or HS256 tokens issued by a configured identity provider; JWKS fetched at startup with 60 s clock skew tolerance.

RBAC (src/rbac.rs)

RbacRegistry assigns roles to principals. Actions are Read | Write; resource kinds are Tenant | Admin | System. Service accounts carry 32-byte random tokens (base64url-encoded, HMAC-backed) with configurable rotation schedules. An in-memory ring buffer retains the last 256 RBAC decisions for audit review.

Secret rotation

SecurityManager supports runtime rotation of:
  • Public and admin auth tokens.
  • Cluster internal auth token.
  • TLS listener certificate/key.
  • Cluster mTLS certificate/key.
A 300 s overlap grace period (configurable) allows in-flight requests to complete with the old credential before it is invalidated.

19. Observability and background runtime

Self-instrumentation

StorageObservabilityCounters provides lockless AtomicU64 counters grouped by subsystem:
  • WalObservabilityCounters — replay runs, frames, points, errors, durations.
  • FlushObservabilityCounters — flush runs, backpressure events, active chunk stats.
  • CompactionObservabilityCounters — runs; source/output segment, chunk, and point counts.
  • QueryObservabilityCounters — hot/warm/cold tier plans, chunks read, fetch durations.
  • RollupObservabilityCounters — materialization runs, points produced.
  • RetentionObservabilityCounters — expired series, segments, points.
  • HealthObservabilityState — fail_fast_triggered, last error strings.
observability_snapshot() converts the raw counters and runtime state into the public StorageObservabilitySnapshot DTO. The server exposes /metrics in Prometheus exposition format, plus Kubernetes-compatible /healthz and /ready probes.

Background threads

The runtime (src/engine/runtime.rs) manages three background workers:
  • Flush (every 250 ms) — flushes BackgroundEligible or BackgroundBounded active chunks to the sealed queue.
  • Compaction (every 5 s, plus triggered runs) — runs Compactor::compact_once for the numeric and blob lanes.
  • Rollup (every 5 s) — executes pending rollup materializations.
Workers check three lifecycle states (STORAGE_OPEN / CLOSING / CLOSED) and stop themselves when the engine begins shutdown. If background_fail_fast = true (default), a background worker panic sets fail_fast_triggered and causes subsequent writes to return an error.

20. Configuration reference summary

Key engine knobs and their defaults:
  • timestamp_precision — Nanoseconds. Options: Ns | µs | ms | s.
  • retention_window — 14 days. Data older than this is eligible for expiry.
  • future_skew_window — 15 min. Writes with future timestamps beyond this are rejected.
  • partition_window — 1 hour. Time-bucket width for active partition heads.
  • max_active_partition_heads_per_series — 8. Maximum concurrent open partitions per series.
  • max_writers — cgroup CPU count. Write parallelism gate (Semaphore permits).
  • write_timeout — 30 s. Maximum wait time for a write permit.
  • memory_budget_bytes — u64::MAX (no limit). Total in-memory chunk budget.
  • cardinality_limit — usize::MAX (no limit). Maximum unique series count.
  • chunk_points — 2048. Points per sealed chunk.
  • compaction_interval — 5 s. Background compaction frequency.
  • flush_interval — 250 ms. Background flush frequency.
  • background_fail_fast — true. A worker panic triggers the engine failure mode.
Container-aware defaults: cgroup.rs reads /sys/fs/cgroup/cpu.max and /sys/fs/cgroup/memory.max to detect CPU and memory limits. The TSINK_MAX_CPUS environment variable overrides the detected CPU count.