Anatomy of a Scan

Before we dive into the details of table scans, let’s recall a few design decisions behind the table format:

  • Column Groups are co-partitioned sets of columns. A partition is called a fragment.
  • Fragments are stored as Vortex files, sorted by primary key, but the keys themselves are stored separately.
  • Key Spaces store keys and are optimized for sort-merge joins (especially for dense runs of identical keys).
  • Manifests store fragment metadata, including the alignment between each fragment and a key space.

[Figure: Fragments]

These design decisions have the following implications for table scans:

  • Metadata that is frequently used for filtering is partitioned separately from data columns that are usually just projected (for example, it’s very rare to filter on audio bytes).
  • Fragments are partitioned by size (optimized for object storage). In practice, fragments that store metadata hold far more rows than fragments that store projected data.

A look at the table

Our table contains audio data with the following schema:

Schema({
  audio_length=f64?,
  silence_ratio=f64?,
  audio={
    bytes=binary?,
    meta={
      size=u64?,
      e_tag=utf8?,
    }?
  }?
})

The table stores audio bytes in a column group called audio, plus two “metadata” column groups: the root one, and audio.meta with some additional source-specific metadata. A look at the manifests shows the following (truncated):

Key Space manifest
131 fragments, total: 60.6MB, avg: 473.5KB, metadata: 431.7KB
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ ID         ┃ Size (Metadata) ┃ Format ┃ Key Span   ┃ Level ┃ Committed At                     ┃ Compacted At ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ m86rvtu9y6 │ 260.5KB (3.2KB) │ vortex │ 0..10000   │ L0    │ 2025-11-06 20:10:41.359348+00:00 │ N/A          │
│ 0qaq87gz87 │ 30.4MB (19.6KB) │ vortex │ 0..1294000 │ L0    │ 2025-11-06 18:20:00.224537+00:00 │ N/A          │
│ dve28ce6z8 │ 247.1KB (3.1KB) │ vortex │ 0..10000   │ L0    │ 2025-11-06 20:10:41.359348+00:00 │ N/A          │
│ hcjbwo31q5 │ 237.4KB (3.1KB) │ vortex │ 0..10000   │ L0    │ 2025-11-06 20:10:41.359348+00:00 │ N/A          │
│ tzs9ssfgd7 │ 238.0KB (3.2KB) │ vortex │ 0..10000   │ L0    │ 2025-11-06 20:10:41.359348+00:00 │ N/A          │

Column Group manifest for table_sl6o0u
6 fragments, total: 113.9MB, avg: 19.0MB, metadata: 111.5KB
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ ID         ┃ Size (Metadata) ┃ Format ┃ Key Span         ┃ Level ┃ Committed At                     ┃ Compacted At ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ pqjb420f8r │ 20.2MB (19.1KB) │ vortex │ 0..228261        │ L0    │ 2025-11-06 18:20:00.224537+00:00 │ N/A          │
│ vrv19qwg1i │ 20.2MB (19.1KB) │ vortex │ 228261..456522   │ L0    │ 2025-11-06 18:20:00.224537+00:00 │ N/A          │
│ 9lhiuacvoi │ 20.0MB (19.1KB) │ vortex │ 456522..684783   │ L0    │ 2025-11-06 18:20:00.224537+00:00 │ N/A          │
│ yq7dqeed9r │ 20.1MB (19.1KB) │ vortex │ 684783..913044   │ L0    │ 2025-11-06 18:20:00.224537+00:00 │ N/A          │
│ 2tbvh0v6td │ 20.1MB (19.1KB) │ vortex │ 913044..1141305  │ L0    │ 2025-11-06 18:20:00.224537+00:00 │ N/A          │

Column Group manifest for table_sl6o0u.audio
1165 fragments, total: 137.6GB, avg: 120.9MB, metadata: 2.7MB
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ ID         ┃ Size (Metadata) ┃ Format ┃ Key Span    ┃ Level ┃ Committed At                     ┃ Compacted At ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ xn6rq15db4 │ 129.5MB (2.4KB) │ vortex │ 0..1177     │ L0    │ 2025-11-06 20:10:41.359348+00:00 │ N/A          │
│ vn4d04pp2r │ 128.7MB (2.4KB) │ vortex │ 1177..2354  │ L0    │ 2025-11-06 20:10:41.359348+00:00 │ N/A          │
│ yqgn69c1rh │ 126.2MB (2.4KB) │ vortex │ 2354..3531  │ L0    │ 2025-11-06 20:10:41.359348+00:00 │ N/A          │
│ 9mmqc78utg │ 129.3MB (2.4KB) │ vortex │ 3531..4708  │ L0    │ 2025-11-06 20:10:41.359348+00:00 │ N/A          │
│ ffsgw7yf5m │ 126.6MB (2.4KB) │ vortex │ 4708..5885  │ L0    │ 2025-11-06 20:10:41.359348+00:00 │ N/A          │

Column Group manifest for table_sl6o0u.audio.meta
130 fragments, total: 63.8MB, avg: 502.5KB, metadata: 891.4KB
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ ID         ┃ Size (Metadata) ┃ Format ┃ Key Span ┃ Level ┃ Committed At                     ┃ Compacted At ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ lkbhpyhmhq │ 507.0KB (6.9KB) │ vortex │ 0..10000 │ L0    │ 2025-11-06 20:10:41.359348+00:00 │ N/A          │
│ y052mol59q │ 514.9KB (6.9KB) │ vortex │ 0..10000 │ L0    │ 2025-11-06 20:10:41.359348+00:00 │ N/A          │
│ w128y2udmu │ 504.6KB (6.9KB) │ vortex │ 0..10000 │ L0    │ 2025-11-06 20:10:41.359348+00:00 │ N/A          │
│ v36ealhiys │ 502.9KB (6.9KB) │ vortex │ 0..10000 │ L0    │ 2025-11-06 20:10:41.359348+00:00 │ N/A          │
│ a03enoo1j5 │ 502.7KB (6.9KB) │ vortex │ 0..10000 │ L0    │ 2025-11-06 20:10:41.359348+00:00 │ N/A          │
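To make the difference between metadata and data fragments concrete, here is a rough back-of-the-envelope calculation using one row from the root manifest and one from the audio manifest. It is only a sketch under two assumptions that the manifests do not state: that the length of a fragment's key span approximates its row count, and that 1 MB is 1,000,000 bytes. The helper name bytes_per_row is made up for illustration.

```python
# Figures copied from the truncated manifests above.
# Assumptions (not stated by the manifests): key span length ~ row count,
# and 1 MB = 1_000_000 bytes.
MB = 1_000_000

fragments = {
    "root (pqjb420f8r)": {"size_bytes": 20.2 * MB, "key_span": (0, 228_261)},
    "audio (xn6rq15db4)": {"size_bytes": 129.5 * MB, "key_span": (0, 1_177)},
}


def bytes_per_row(fragment):
    """Approximate stored bytes per row, treating the key span length as the row count."""
    start, end = fragment["key_span"]
    return fragment["size_bytes"] / (end - start)


for name, fragment in fragments.items():
    start, end = fragment["key_span"]
    print(f"{name}: ~{end - start:,} rows, ~{bytes_per_row(fragment):,.0f} bytes per row")
```

Under these assumptions the root fragment works out to a little under 90 bytes per row, while the audio fragment is on the order of 110 KB per row, which is why a metadata fragment of about 20 MB covers hundreds of thousands of rows and an audio fragment of about 130 MB covers only around a thousand.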

A typical scan

A typical scan over this table might look like this:

sp.scan(table["audio.bytes"], where=table["silence_ratio"] < 0.1)

When executing this scan, the Spiral client performs the following steps:

Load Fragment Manifest(s)

The client identifies that the scan involves two column groups:

  • root, for filtering on silence_ratio
  • audio, for projecting audio.bytes

The client scans the table’s manifests to identify the relevant fragments for both column groups.

The client determines that the silence_ratio filter can be pushed down into the root column group scan, and prunes the manifests using statistics metadata about the fragments.
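Statistics-based pruning can be illustrated with a small sketch. It assumes each manifest entry carries a per-column minimum and maximum, which is a common design but is not confirmed here; FragmentStats, prune, the fragment IDs and the statistic values are all hypothetical, and nulls are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FragmentStats:
    """Hypothetical per-fragment min/max statistics for a single column."""
    fragment_id: str
    min_value: float
    max_value: float


def may_contain_less_than(stats, threshold):
    """A fragment can hold a row with value < threshold only if its minimum is below it."""
    return stats.min_value < threshold


def prune(manifest, threshold):
    """Keep the IDs of fragments that cannot be ruled out by their statistics."""
    return [s.fragment_id for s in manifest if may_contain_less_than(s, threshold)]


# Made-up silence_ratio statistics for three made-up fragments.
manifest = [
    FragmentStats("frag-a", min_value=0.02, max_value=0.95),
    FragmentStats("frag-b", min_value=0.30, max_value=0.80),
    FragmentStats("frag-c", min_value=0.00, max_value=0.09),
]

selected = prune(manifest, threshold=0.1)
print(selected)  # ['frag-a', 'frag-c']
```

Pruning of this kind is conservative: frag-a is kept even though most of its rows may fail the filter, because the statistics alone cannot rule it out. The actual filtering happens in the next step.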

[Figure: Fragments]

Scan Filtered Column Group(s)

The client scans the fragments of the column group involved in filtering (root).

The client applies the filter silence_ratio < 0.1 and produces a row mask. It would also produce any columns projected from this column group, but in this case there are none. In practice, this is a columnar file scan over a large number of rows, which is very efficient.

The row mask indicates which rows satisfy the filter condition. Combined with the alignment metadata from the manifests, it lets the client determine which keys correspond to the filtered rows.
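As a toy illustration of this step, the sketch below builds a row mask from a filter and maps the selected rows to positions in a key space. The alignment model is an assumption made for the example: row i of a fragment is taken to correspond to key-space position key_span_start + i. The function names are hypothetical, and real scans operate on columnar arrays rather than Python lists.

```python
def filter_mask(values, threshold):
    """Row mask: True where the value is non-null and strictly below the threshold."""
    return [v is not None and v < threshold for v in values]


def mask_to_key_positions(mask, key_span_start):
    """Map selected rows to key-space positions.

    Assumption for illustration: row i of the fragment aligns with
    key-space position key_span_start + i.
    """
    return [key_span_start + i for i, keep in enumerate(mask) if keep]


# Made-up silence_ratio values for a five-row fragment (None models a null).
silence_ratio = [0.05, 0.40, None, 0.09, 0.10]

mask = filter_mask(silence_ratio, threshold=0.1)
positions = mask_to_key_positions(mask, key_span_start=100)

print(mask)       # [True, False, False, True, False]
print(positions)  # [100, 103]
```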

[Figure: Filtered]

Join Needed Key Spaces

The client identifies the key spaces needed to join the filtered column group with the projected column group.

The client loads those key spaces, applies the row mask produced by the filtering step, and joins them to produce a new row mask that indicates which rows of the projected column group are needed.
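The idea of joining sorted keys into a new row mask can be sketched with a plain two-pointer sort-merge join. This is a textbook technique chosen for illustration, not the client's actual join implementation; merge_join_mask and the sample keys are hypothetical.

```python
def merge_join_mask(selected_keys, fragment_keys):
    """Return a row mask over fragment_keys marking keys present in selected_keys.

    Both inputs must be sorted ascending. Only the fragment side advances on a
    match, so a run of identical keys in the fragment is marked in full.
    """
    mask = [False] * len(fragment_keys)
    i = j = 0
    while i < len(selected_keys) and j < len(fragment_keys):
        if selected_keys[i] == fragment_keys[j]:
            mask[j] = True
            j += 1
        elif selected_keys[i] < fragment_keys[j]:
            i += 1
        else:
            j += 1
    return mask


# Keys that survived the filter, and the keys stored for a projected fragment.
selected_keys = ["a", "c", "d"]
fragment_keys = ["a", "b", "c", "c", "e"]

print(merge_join_mask(selected_keys, fragment_keys))
# [True, False, True, True, False]
```

Because both sides are already sorted, the join is a single linear pass with no hash table to build, and runs of identical keys are handled without rescanning.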

[Figure: Keys]

Take Projected Column Group(s)

The client scans the fragments of the column group involved in projection, audio in this case.

The client applies the row mask obtained in the previous step to read only the needed rows from the projected column group.
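A masked read of this kind can be pictured as turning the row mask into contiguous row ranges and fetching only those ranges. The sketch below does this over an in-memory list; it is an illustration of the idea, not the Vortex read path, and mask_to_ranges and take are hypothetical helper names.

```python
def mask_to_ranges(mask):
    """Collapse a boolean row mask into half-open (start, end) ranges of selected rows."""
    ranges = []
    start = None
    for i, keep in enumerate(mask):
        if keep and start is None:
            start = i
        elif not keep and start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, len(mask)))
    return ranges


def take(rows, mask):
    """Read only the rows selected by the mask, one contiguous range at a time."""
    out = []
    for start, end in mask_to_ranges(mask):
        out.extend(rows[start:end])
    return out


# Made-up stand-ins for audio payloads.
audio_bytes = [b"a0", b"a1", b"a2", b"a3", b"a4"]
row_mask = [True, False, True, True, False]

print(mask_to_ranges(row_mask))     # [(0, 1), (2, 4)]
print(take(audio_bytes, row_mask))  # [b'a0', b'a2', b'a3']
```

Grouping selected rows into ranges matters more on object storage, where each range would typically become one read request rather than one request per row.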

[Figure: Take]

Reading only the needed rows is practical only because of the random access performance of Vortex files.

About Performance

Scans are optimized for high throughput.

  • Filters are pushed down into Vortex file scans over a large number of rows (10-20x faster than Parquet!).
  • Keys are joined efficiently using partitioned Merkle hash tries.
  • Projections are evaluated as random access Vortex file reads (100x faster than Parquet!).

This last point means that each batch of rows coming out of a scan is expressed only as a masked Vortex array, enabling zero-copy and zero-decompression transfer of data from storage all the way to the end user. And since Vortex arrays can be decompressed on the GPU, data can be transferred directly into GPU memory!
