Query to batch - Spiral Docs

Spiral’s long-term product model is query-to-batch: use SQL to define the rows you want, then let the engine plan when and how expensive values are materialized for previews, artifacts, or model-ready batches.

The current CLI exposes only the relation part of that model. Extension and batch delivery pages are marked "Experimental: Contact Us" until they are implemented in the public client.

Result forms

A Spiral workflow can produce three result forms:

Result	Meaning
`Relation`	Rows and typed values, including deferred references.
`Artifact`	A persisted feature set, index, transformed dataset, or optimized layout.
`MorselStream`	Ordered, bounded CPU- or device-resident morsels.

The distinguishing rule is that the destination is part of planning. A PyTorch batch stream is not a post-query export; it affects range reads, decode placement, memory bounds, layouts, and prefetch.
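
As a rough illustration of what "destination is part of planning" could mean, the sketch below picks different materialization settings depending on where the rows are going. Everything in it is hypothetical: `Destination`, `MaterializationChoices`, `plan_for`, and the specific numbers are invented for this example and are not part of the Spiral client.

```python
from dataclasses import dataclass
from enum import Enum


class Destination(Enum):
    """Hypothetical destinations, loosely mirroring the result forms above."""
    PREVIEW = "preview"
    ARTIFACT = "artifact"
    BATCH_STREAM = "batch_stream"


@dataclass(frozen=True)
class MaterializationChoices:
    """Illustrative knobs a planner might set; not a real Spiral type."""
    decode_on: str           # where payloads are decoded: "cpu" or "device"
    prefetch_depth: int      # how many batches to fetch ahead
    max_rows_in_flight: int  # crude stand-in for a memory bound


def plan_for(destination: Destination, batch_size: int = 32) -> MaterializationChoices:
    """Return made-up settings showing that the destination changes the plan."""
    if destination is Destination.PREVIEW:
        # A preview needs only a handful of rows and no read-ahead.
        return MaterializationChoices("cpu", 0, 10)
    if destination is Destination.ARTIFACT:
        # A persisted artifact is written once, so modest read-ahead is enough.
        return MaterializationChoices("cpu", 1, batch_size * 2)
    # A training stream decodes near the device and keeps batches queued.
    prefetch_depth = 2
    return MaterializationChoices("device", prefetch_depth, batch_size * (prefetch_depth + 1))


preview_plan = plan_for(Destination.PREVIEW)
stream_plan = plan_for(Destination.BATCH_STREAM, batch_size=32)
```

The same relation yields different choices for `preview_plan` and `stream_plan`, which is the point of the rule: the plan cannot be finalized until the destination is known.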

Planning layers


relation plan
  filtering, projection, joins, retrieval, alignment

materialization plan
  byte ranges, fetch, decrypt, decompress, decode, transform, cache

sample plan
  eligibility, split, balance, shuffle, epoch, assignment

batch plan
  shape policy, collation, device placement, allocation, prefetch
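
One way to read the four layers above is as an ordered list in which each concern belongs to exactly one layer. The sketch below encodes that reading as plain data. The `PlanLayer` type and the `layer_for` helper are hypothetical names used only for illustration; the layer names and concerns are copied from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanLayer:
    """Hypothetical record: one planning layer and the concerns it owns."""
    name: str
    concerns: tuple[str, ...]


# Order follows the listing above: relation first, batch last.
LAYERS = (
    PlanLayer("relation", ("filtering", "projection", "joins", "retrieval", "alignment")),
    PlanLayer("materialization", ("byte ranges", "fetch", "decrypt", "decompress",
                                  "decode", "transform", "cache")),
    PlanLayer("sample", ("eligibility", "split", "balance", "shuffle", "epoch", "assignment")),
    PlanLayer("batch", ("shape policy", "collation", "device placement",
                        "allocation", "prefetch")),
)


def layer_for(concern: str) -> str:
    """Return the name of the layer that owns a concern, or raise KeyError."""
    for layer in LAYERS:
        if concern in layer.concerns:
            return layer.name
    raise KeyError(concern)


decode_layer = layer_for("decode")
shuffle_layer = layer_for("shuffle")
```

Here `decode_layer` resolves to the materialization layer and `shuffle_layer` to the sample layer, matching the listing.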

Deferred values

Deferred values are logically present before their bytes are fetched or decoded:


ImageRef
VideoFrameRef
AudioWindowRef
TensorRef
DocumentPageRef
PointCloudRef

Metadata queries can manipulate millions of references without reading payload bytes. Materialization is explicit and visible to the optimizer.
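
The idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Spiral API: the `ImageRef` fields, the `PayloadStore` class, and its `materialize` method are all hypothetical. The sketch shows a metadata-only filter running over references without touching payload bytes, followed by an explicit materialization step that a planner could observe and account for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRef:
    """Hypothetical deferred reference: metadata is present, bytes are not."""
    uri: str
    width: int
    height: int
    byte_length: int


class PayloadStore:
    """Toy in-memory store that counts how many payload bytes were read."""

    def __init__(self, blobs: dict[str, bytes]) -> None:
        self._blobs = blobs
        self.bytes_read = 0

    def materialize(self, ref: ImageRef) -> bytes:
        # The only place where payload bytes are touched.
        data = self._blobs[ref.uri]
        self.bytes_read += len(data)
        return data


# Five fake payloads of 100, 200, ..., 500 bytes.
blobs = {f"img/{i}.bin": bytes(100 * (i + 1)) for i in range(5)}
store = PayloadStore(blobs)
refs = [
    ImageRef(uri=uri, width=64 * (i + 1), height=64, byte_length=len(blob))
    for i, (uri, blob) in enumerate(blobs.items())
]

# Metadata-only filter: no payload bytes are read here.
wide = [ref for ref in refs if ref.width >= 256]
bytes_read_after_filter = store.bytes_read

# Explicit materialization: only the surviving references are fetched.
payloads = [store.materialize(ref) for ref in wide]
```

In this toy run the filter keeps two of the five references, and only their payloads are ever read.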

Why this matters

For ordinary SQL, a query result is usually just rows. For multimodal work, a row can point at a large payload: an image, a video frame, an audio window, a document page, or a tensor slice. Pulling those bytes too early wastes IO, decode time, memory, and device bandwidth.

Spiral treats those payloads as planned values. Metadata can be filtered and joined first. Materialization can happen later, close to the destination that needs the bytes.